  • rewriter.coffee

  • ¶

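    The CoffeeScript language has a good deal of optional syntax, implicit syntax, and shorthand syntax. This can greatly complicate a grammar and bloat the resulting parse table. Instead of making the parser handle it all, we take a series of passes over the token stream, using this Rewriter to convert shorthand into the unambiguous long form, add implicit indentation and parentheses, and generally clean things up.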

  • ¶

    Create a generated token: one that exists because implicit syntax was used.

    generate = (tag, value, origin) ->
      tok = [tag, value]
      tok.generated = yes
      tok.origin = origin if origin
      tok
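
    As a rough illustration only, the helper above can be exercised on its own. This is a standalone sketch: the `origin` token below is made-up sample data, whereas real origin tokens come from the lexer.

```coffeescript
# Standalone copy of the helper above, so the snippet runs on its own.
generate = (tag, value, origin) ->
  tok = [tag, value]
  tok.generated = yes
  tok.origin = origin if origin
  tok

# Hypothetical origin token; real ones are produced by the lexer.
origin = ['IDENTIFIER', 'f', {first_line: 0, first_column: 0}]
open   = generate 'CALL_START', '(', origin
close  = generate 'CALL_END', ')'

console.log open[0], open[1], open.generated   # CALL_START ( true
console.log close.origin?                      # false
```

    Because a token is a plain array, the `generated` and `origin` flags are stored as extra properties on it and do not disturb the `[tag, value]` positions.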
  • ¶

    The Rewriter class is used by the Lexer, directly against its internal array of tokens.

    exports.Rewriter = class Rewriter
  • ¶

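    Rewrite the token stream in multiple passes, one logical filter at a time. This could certainly be changed into a single pass through the stream with one big, efficient switch, but it is much nicer to work with like this. The order of these passes matters: indentation must be corrected before implicit parentheses can be wrapped around blocks of code.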

      rewrite: (@tokens) ->
  • ¶

    A helpful snippet for debugging: console.log (t[0] + '/' + t[1] for t in @tokens).join ' '

        @removeLeadingNewlines()
        @closeOpenCalls()
        @closeOpenIndexes()
        @normalizeLines()
        @tagPostfixConditionals()
        @addImplicitBracesAndParens()
        @addLocationDataToGeneratedTokens()
        @fixOutdentLocationData()
        @tokens
  • ¶

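    Rewrite the token stream, looking one token ahead and behind. The return value of the block tells us how many tokens to move forwards (or backwards) in the stream, so that we don't miss anything as tokens are inserted and removed and the stream changes length under our feet.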

      scanTokens: (block) ->
        {tokens} = this
        i = 0
        i += block.call this, token, i, tokens while token = tokens[i]
        true
    
      detectEnd: (i, condition, action) ->
        {tokens} = this
        levels = 0
        while token = tokens[i]
          return action.call this, token, i     if levels is 0 and condition.call this, token, i
          return action.call this, token, i - 1 if not token or levels < 0
          if token[0] in EXPRESSION_START
            levels += 1
          else if token[0] in EXPRESSION_END
            levels -= 1
          i += 1
        i - 1
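
    To make the return-value convention concrete, here is a minimal standalone sketch, not the compiler's own code. `scanTokens` is reimplemented as a plain function, and the `MARK` token and the sample stream are invented for illustration.

```coffeescript
# Minimal stand-in for scanTokens: the block returns how far to advance.
scanTokens = (tokens, block) ->
  i = 0
  i += block token, i, tokens while token = tokens[i]
  tokens

# Hypothetical pass: insert a ['MARK', '!'] token after every 'A' token.
tokens = [['A', 'a'], ['B', 'b'], ['A', 'a']]
scanTokens tokens, (token, i, tokens) ->
  return 1 unless token[0] is 'A'
  tokens.splice i + 1, 0, ['MARK', '!']
  2   # step over the token we just inserted

console.log (t[0] for t in tokens).join ' '   # A MARK B A MARK
```

    Returning `2` after the splice is what keeps the pass from revisiting the token it has just inserted.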
  • ¶

    Leading newlines would introduce an ambiguity in the grammar, so we strip them here.

      removeLeadingNewlines: ->
        break for [tag], i in @tokens when tag isnt 'TERMINATOR'
        @tokens.splice 0, i if i
  • ¶

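    The lexer has tagged the opening parenthesis of a method call. Match it with its paired close. This also covers the mis-nested outdent case, for calls that close on the same line, just before their outdent.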

      closeOpenCalls: ->
        condition = (token, i) ->
          token[0] in [')', 'CALL_END'] or
          token[0] is 'OUTDENT' and @tag(i - 1) is ')'
    
        action = (token, i) ->
          @tokens[if token[0] is 'OUTDENT' then i - 1 else i][0] = 'CALL_END'
    
        @scanTokens (token, i) ->
          @detectEnd i + 1, condition, action if token[0] is 'CALL_START'
          1
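
    The pairing logic can be sketched outside the class as follows. This is a simplified illustration under stated assumptions: `detectEnd` is reduced to its nesting-depth core, the pair lists are trimmed down, the `isClose` and `markClose` helpers are hypothetical names, and the token stream for `f((a))` is written by hand rather than produced by the lexer.

```coffeescript
# Trimmed-down pair lists; the real ones are built from BALANCED_PAIRS.
EXPRESSION_START = ['(', 'CALL_START']
EXPRESSION_END   = [')', 'CALL_END']

# Simplified detectEnd: walk forward from i, tracking nesting depth, and
# run `action` on the first token at depth 0 that satisfies `condition`.
detectEnd = (tokens, i, condition, action) ->
  levels = 0
  while token = tokens[i]
    return action token, i if levels is 0 and condition token, i
    if token[0] in EXPRESSION_START
      levels += 1
    else if token[0] in EXPRESSION_END
      levels -= 1
    i += 1
  i - 1

# Hand-written stream for `f((a))`: the call's opening paren is already
# tagged CALL_START, but every closing paren is still a plain ')'.
tokens = [
  ['IDENTIFIER', 'f'], ['CALL_START', '(']
  ['(', '('], ['IDENTIFIER', 'a'], [')', ')']
  [')', ')']
]

isClose   = (tok) -> tok[0] in [')', 'CALL_END']
markClose = (tok) -> tok[0] = 'CALL_END'

for token, i in tokens when token[0] is 'CALL_START'
  detectEnd tokens, i + 1, isClose, markClose

console.log (t[0] for t in tokens).join ' '
# IDENTIFIER CALL_START ( IDENTIFIER ) CALL_END
```

    Only the outer `)` is retagged, because the inner one is seen while the nesting depth is still 1.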
  • ¶

    The lexer has tagged the opening bracket of an indexing operation. Match it with its paired close.

      closeOpenIndexes: ->
        condition = (token, i) ->
          token[0] in [']', 'INDEX_END']
    
        action = (token, i) ->
          token[0] = 'INDEX_END'
    
        @scanTokens (token, i) ->
          @detectEnd i + 1, condition, action if token[0] is 'INDEX_START'
          1
  • ¶

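    Match tags in the token stream against pattern, starting at i and skipping any HERECOMMENT tokens. pattern may consist of strings (equality), arrays of strings (one of) or null (wildcard). Returns the index of the match, or -1 if there is no match.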

      indexOfTag: (i, pattern...) ->
        fuzz = 0
        for j in [0 ... pattern.length]
          fuzz += 2 while @tag(i + j + fuzz) is 'HERECOMMENT'
          continue if not pattern[j]?
          pattern[j] = [pattern[j]] if typeof pattern[j] is 'string'
          return -1 if @tag(i + j + fuzz) not in pattern[j]
        i + j + fuzz - 1
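
    The three pattern forms are easier to see on sample data. The sketch below is a standalone variant written for illustration: it takes a plain array of tag names (`tags`, an assumption made for brevity) instead of reading from `@tokens`, and it does not mutate `pattern`.

```coffeescript
# Standalone variant of the matcher operating on an array of tag names.
indexOfTag = (tags, i, pattern...) ->
  fuzz = 0
  for j in [0...pattern.length]
    # Skip a comment and the token that follows it.
    fuzz += 2 while tags[i + j + fuzz] is 'HERECOMMENT'
    continue unless pattern[j]?   # null acts as a wildcard
    alternatives = if typeof pattern[j] is 'string' then [pattern[j]] else pattern[j]
    return -1 if tags[i + j + fuzz] not in alternatives
  i + j + fuzz - 1

tags = ['@', 'PROPERTY', ':', 'NUMBER']
console.log indexOfTag(tags, 0, '@', null, ':')                  # 2
console.log indexOfTag(tags, 1, null, ':')                       # 2
console.log indexOfTag(tags, 0, ['IDENTIFIER', 'STRING'], ':')   # -1

commented = ['HERECOMMENT', 'TERMINATOR', 'IDENTIFIER', ':']
console.log indexOfTag(commented, 0, null, ':')                  # 3
```

    The returned index is that of the last token matched, which is why both successful calls on `tags` report the position of the `:`.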
  • ¶

    Returns yes if standing in front of something that looks like @<x>:, <x>: or <EXPRESSION_START><x>...<EXPRESSION_END>:, skipping over any HERECOMMENT tokens.

      looksObjectish: (j) ->
        return yes if @indexOfTag(j, '@', null, ':') > -1 or @indexOfTag(j, null, ':') > -1
        index = @indexOfTag(j, EXPRESSION_START)
        if index > -1
          end = null
          @detectEnd index + 1, ((token) -> token[0] in EXPRESSION_END), ((token, i) -> end = i)
          return yes if @tag(end + 1) is ':'
        no
  • ¶

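    Returns yes if the current line of tokens contains one of the given tags at the same expression level. The search stops at LINEBREAKS or at the explicit start of the enclosing balanced expression.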

      findTagsBackwards: (i, tags) ->
        backStack = []
        while i >= 0 and (backStack.length or
              @tag(i) not in tags and
              (@tag(i) not in EXPRESSION_START or @tokens[i].generated) and
              @tag(i) not in LINEBREAKS)
          backStack.push @tag(i) if @tag(i) in EXPRESSION_END
          backStack.pop() if @tag(i) in EXPRESSION_START and backStack.length
          i -= 1
        @tag(i) in tags
  • ¶

    Look for signs of implicit calls and objects in the token stream and add them.

      addImplicitBracesAndParens: ->
  • ¶

    Track the current balancing depth (both implicit and explicit) on a stack.

        stack = []
        start = null
    
        @scanTokens (token, i, tokens) ->
          [tag]     = token
          [prevTag] = prevToken = if i > 0 then tokens[i - 1] else []
          [nextTag] = if i < tokens.length - 1 then tokens[i + 1] else []
          stackTop  = -> stack[stack.length - 1]
          startIdx  = i
  • ¶

    Helper function that keeps track of the number of tokens consumed and spliced in, used when returning to fetch the next token.

          forward   = (n) -> i - startIdx + n
  • ¶

    Helper functions

          isImplicit        = (stackItem) -> stackItem?[2]?.ours
          isImplicitObject  = (stackItem) -> isImplicit(stackItem) and stackItem?[0] is '{'
          isImplicitCall    = (stackItem) -> isImplicit(stackItem) and stackItem?[0] is '('
          inImplicit        = -> isImplicit stackTop()
          inImplicitCall    = -> isImplicitCall stackTop()
          inImplicitObject  = -> isImplicitObject stackTop()
  • ¶

    Unclosed control statement inside implicit parentheses (such as a class declaration or an if conditional)

          inImplicitControl = -> inImplicit() and stackTop()?[0] is 'CONTROL'
    
          startImplicitCall = (j) ->
            idx = j ? i
            stack.push ['(', idx, ours: yes]
            tokens.splice idx, 0, generate 'CALL_START', '(', ['', 'implicit function call', token[2]]
            i += 1 if not j?
    
          endImplicitCall = ->
            stack.pop()
            tokens.splice i, 0, generate 'CALL_END', ')', ['', 'end of input', token[2]]
            i += 1
    
          startImplicitObject = (j, startsLine = yes) ->
            idx = j ? i
            stack.push ['{', idx, sameLine: yes, startsLine: startsLine, ours: yes]
            val = new String '{'
            val.generated = yes
            tokens.splice idx, 0, generate '{', val, token
            i += 1 if not j?
    
          endImplicitObject = (j) ->
            j = j ? i
            stack.pop()
            tokens.splice j, 0, generate '}', '}', token
            i += 1
  • ¶

    Don't end an implicit call on the next indent if any of these appears in an argument

          if inImplicitCall() and tag in ['IF', 'TRY', 'FINALLY', 'CATCH',
            'CLASS', 'SWITCH']
            stack.push ['CONTROL', i, ours: yes]
            return forward(1)
    
          if tag is 'INDENT' and inImplicit()
  • ¶

    An INDENT closes an implicit call unless:

    1. We have seen a CONTROL argument on the line.
    2. The last token before the indent is in the list below.
            if prevTag not in ['=>', '->', '[', '(', ',', '{', 'TRY', 'ELSE', '=']
              endImplicitCall() while inImplicitCall()
            stack.pop() if inImplicitControl()
            stack.push [tag, i]
            return forward(1)
  • ¶

    Straightforward start of an explicit expression

          if tag in EXPRESSION_START
            stack.push [tag, i]
            return forward(1)
  • ¶

    Close all implicit expressions inside of explicitly closed expressions.

          if tag in EXPRESSION_END
            while inImplicit()
              if inImplicitCall()
                endImplicitCall()
              else if inImplicitObject()
                endImplicitObject()
              else
                stack.pop()
            start = stack.pop()
  • ¶

    Recognize standard implicit calls such as f a, f() b, f? c, h[0] d, etc.

          if (tag in IMPLICIT_FUNC and token.spaced or
              tag is '?' and i > 0 and not tokens[i - 1].spaced) and
             (nextTag in IMPLICIT_CALL or
              nextTag in IMPLICIT_UNSPACED_CALL and
              not tokens[i + 1]?.spaced and not tokens[i + 1]?.newLine)
            tag = token[0] = 'FUNC_EXIST' if tag is '?'
            startImplicitCall i + 1
            return forward(2)
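
    Seen from the language side, each implicit form recognized here behaves like its explicit counterpart. The following is a small illustrative sketch; `f`, `g`, `h`, `inc`, and `maybe` are hypothetical names defined only for this example.

```coffeescript
f = (x) -> "f(#{x})"
g = -> f
h = [f]
inc = (x) -> x + 1
maybe = null   # stands in for a function that may be missing

a = 1
console.log (f a) is f(a)             # true
console.log (g() a) is g()(a)         # true
console.log (h[0] a) is h[0](a)       # true
console.log (f? a) is f?(a)           # true
console.log (maybe? a) is undefined   # true

# An unspaced sign after a spaced function name starts an argument:
console.log inc -1                    # 0, parsed as inc(-1)
```

    The last line corresponds to the `IMPLICIT_UNSPACED_CALL` check: `inc -1` is a call, whereas `inc - 1` would be a subtraction.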
  • ¶

    Implicit call taking an implicit indented object as its first argument.

    f
      a: b
      c: d
    

    and

    f
      1
      a: b
      b: c
    

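    Don't accept implicit calls of this type when they are on the same line as one of the control structures below, because that could misinterpret constructs like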

    if f
       a: 1
    

    as

    if f(a: 1)
    

    which is probably never intended. Furthermore, don't allow this in array literals, because that creates grammatical ambiguities.

          if tag in IMPLICIT_FUNC and
             @indexOfTag(i + 1, 'INDENT') > -1 and @looksObjectish(i + 2) and
             not @findTagsBackwards(i, ['CLASS', 'EXTENDS', 'IF', 'CATCH',
              'SWITCH', 'LEADING_WHEN', 'FOR', 'WHILE', 'UNTIL'])
            startImplicitCall i + 1
            stack.push ['INDENT', i + 2]
            return forward(3)
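
    A short sketch of how this rule shows up in practice. The function `f` and the `calls` counter are hypothetical and exist only to reveal whether a call took place.

```coffeescript
calls = 0
f = (args...) ->
  calls += 1
  args

# The indented object becomes the single argument of an implicit call.
one = f
  a: 1
  b: 2

console.log JSON.stringify one         # [{"a":1,"b":2}]

# After `if`, the same layout is a condition followed by a block,
# so `f` is only tested for truthiness and is not called.
r = if f
  a: 1

console.log JSON.stringify(r), calls   # {"a":1} 1
```

    The counter stays at 1 after the second statement, which is the behavior that the exclusion list passed to `findTagsBackwards` is there to protect.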
  • ¶

    Implicit objects start here

          if tag is ':'
  • ¶

    Go back to the (implicit) start of the object

            s = switch
              when @tag(i - 1) in EXPRESSION_END then start[1]
              when @tag(i - 2) is '@' then i - 2
              else i - 1
            s -= 2 while @tag(s - 2) is 'HERECOMMENT'
  • ¶

    Mark whether the value is a for loop

            @insideForDeclaration = nextTag is 'FOR'
    
            startsLine = s is 0 or @tag(s - 1) in LINEBREAKS or tokens[s - 1].newLine
  • ¶

    Are we just continuing an already declared object?

            if stackTop()
              [stackTag, stackIdx] = stackTop()
              if (stackTag is '{' or stackTag is 'INDENT' and @tag(stackIdx - 1) is '{') and
                 (startsLine or @tag(s - 1) is ',' or @tag(s - 1) is '{')
                return forward(1)
    
            startImplicitObject(s, !!startsLine)
            return forward(2)
  • ¶

    End implicit calls when chaining method calls, as in

    f ->
      a
    .g b, ->
      c
    .h a
    

    and also

    f a
    .g b
    .h a
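
    A runnable sketch of the second form, assuming a small chainable class. `Chain` and its methods are hypothetical and defined only for this illustration.

```coffeescript
# Hypothetical chainable object: each method records its call and
# returns the instance so that calls can be chained.
class Chain
  constructor: ->
    @calls = []
  f: (x) ->
    @calls.push "f:#{x}"
    this
  g: (x) ->
    @calls.push "g:#{x}"
    this

c = new Chain
c.f 1
.g 2
.f 3

console.log c.calls.join ' '   # f:1 g:2 f:3
```

    Each leading `.` closes the implicit call on the previous line, so the chain reads as `c.f(1).g(2).f(3)` rather than passing the rest of the chain as an argument.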
    
  • ¶

    Mark all enclosing objects as not sameLine

          if tag in LINEBREAKS
            for stackItem in stack by -1
              break unless isImplicit stackItem
              stackItem[2].sameLine = no if isImplicitObject stackItem
    
          newLine = prevTag is 'OUTDENT' or prevToken.newLine
          if tag in IMPLICIT_END or tag in CALL_CLOSERS and newLine
            while inImplicit()
              [stackTag, stackIdx, {sameLine, startsLine}] = stackTop()
  • ¶

    Close implicit calls when the end of the argument list is reached

              if inImplicitCall() and prevTag isnt ','
                endImplicitCall()
  • ¶

    Close implicit objects such as: return a: 1, b: 2 unless true

              else if inImplicitObject() and not @insideForDeclaration and sameLine and
                      tag isnt 'TERMINATOR' and prevTag isnt ':'
                endImplicitObject()
  • ¶

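    Close implicit objects at the end of a line when the line did not end with a comma and either the implicit object did not start the line or the next line does not look like the continuation of an object.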

              else if inImplicitObject() and tag is 'TERMINATOR' and prevTag isnt ',' and
                      not (startsLine and @looksObjectish(i + 1))
                return forward 1 if nextTag is 'HERECOMMENT'
                endImplicitObject()
              else
                break
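
    The `return a: 1, b: 2 unless true` case mentioned above can be tried directly. In this sketch `pair` and its `skip` parameter are hypothetical names.

```coffeescript
# The postfix `unless` ends the implicit object, so it guards the whole
# `return` rather than becoming part of the value of `b`.
pair = (skip) ->
  return a: 1, b: 2 unless skip

console.log JSON.stringify pair no   # {"a":1,"b":2}
console.log pair yes                 # undefined
```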
  • ¶

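    Close the implicit object if a comma is the last character and what comes after it doesn't look like it belongs to the object. This is used for trailing commas and for calls, such as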

    x =
        a: b,
        c: d,
    e = 2
    

    and

    f a, b: c, d: e, f, g: h: i, j
    
          if tag is ',' and not @looksObjectish(i + 1) and inImplicitObject() and
             not @insideForDeclaration and
             (nextTag isnt 'TERMINATOR' or not @looksObjectish(i + 2))
  • ¶

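    When nextTag is OUTDENT, the comma is insignificant and should simply be ignored, so embed it in the implicit object.

    When it is not, the comma goes on to play a role in a call or array further up the stack, so give it a chance.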

            offset = if nextTag is 'OUTDENT' then 1 else 0
            while inImplicitObject()
              endImplicitObject i + offset
          return forward(1)
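
    The call example above can be checked with a function that simply returns its arguments. `collect` is a hypothetical helper, and literal numbers replace the identifiers of the original example.

```coffeescript
# Returns its arguments as an array so the grouping becomes visible.
collect = (args...) -> args

args = collect 1, b: 2, d: 3, 4, g: h: 5, 6

console.log JSON.stringify args
# [1,{"b":2,"d":3},4,{"g":{"h":5}},6]
```

    Each comma that is not followed by something object-like closes every implicit object that is open at that point, which is why `6` ends both the `h:` and the `g:` objects.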
  • ¶

    Add location data to all tokens generated by the rewriter.

      addLocationDataToGeneratedTokens: ->
        @scanTokens (token, i, tokens) ->
          return 1 if     token[2]
          return 1 unless token.generated or token.explicit
          if token[0] is '{' and nextLocation=tokens[i + 1]?[2]
            {first_line: line, first_column: column} = nextLocation
          else if prevLocation = tokens[i - 1]?[2]
            {last_line: line, last_column: column} = prevLocation
          else
            line = column = 0
          token[2] =
            first_line:   line
            first_column: column
            last_line:    line
            last_column:  column
          return 1
  • ¶

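    OUTDENT tokens should always be positioned at the last character of the previous token, so that AST nodes ending in an OUTDENT token end up with a location corresponding to the last "real" token under the node.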

      fixOutdentLocationData: ->
        @scanTokens (token, i, tokens) ->
          return 1 unless token[0] is 'OUTDENT' or
            (token.generated and token[0] is 'CALL_END') or
            (token.generated and token[0] is '}')
          prevLocationData = tokens[i - 1][2]
          token[2] =
            first_line:   prevLocationData.last_line
            first_column: prevLocationData.last_column
            last_line:    prevLocationData.last_line
            last_column:  prevLocationData.last_column
          return 1
  • ¶

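    Because our grammar is LALR(1), it can't handle some single-line expressions that lack ending delimiters. The Rewriter adds the implicit blocks so that the grammar doesn't need to. To keep the grammar clean and tidy, trailing newlines within expressions are removed and the indentation tokens of empty blocks are added.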

      normalizeLines: ->
        starter = indent = outdent = null
    
        condition = (token, i) ->
          token[1] isnt ';' and token[0] in SINGLE_CLOSERS and
          not (token[0] is 'TERMINATOR' and @tag(i + 1) in EXPRESSION_CLOSE) and
          not (token[0] is 'ELSE' and starter isnt 'THEN') and
          not (token[0] in ['CATCH', 'FINALLY'] and starter in ['->', '=>']) or
          token[0] in CALL_CLOSERS and
          (@tokens[i - 1].newLine or @tokens[i - 1][0] is 'OUTDENT')
    
        action = (token, i) ->
          @tokens.splice (if @tag(i - 1) is ',' then i - 1 else i), 0, outdent
    
        @scanTokens (token, i, tokens) ->
          [tag] = token
          if tag is 'TERMINATOR'
            if @tag(i + 1) is 'ELSE' and @tag(i - 1) isnt 'OUTDENT'
              tokens.splice i, 1, @indentation()...
              return 1
            if @tag(i + 1) in EXPRESSION_CLOSE
              tokens.splice i, 1
              return 0
          if tag is 'CATCH'
            for j in [1..2] when @tag(i + j) in ['OUTDENT', 'TERMINATOR', 'FINALLY']
              tokens.splice i + j, 0, @indentation()...
              return 2 + j
          if tag in SINGLE_LINERS and @tag(i + 1) isnt 'INDENT' and
             not (tag is 'ELSE' and @tag(i + 1) is 'IF')
            starter = tag
            [indent, outdent] = @indentation tokens[i]
            indent.fromThen   = true if starter is 'THEN'
            tokens.splice i + 1, 0, indent
            @detectEnd i + 2, condition, action
            tokens.splice i, 1 if tag is 'THEN'
            return 1
          return 1
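
    In source terms, this pass is what lets the single-line forms behave like their indented equivalents. A brief sketch with made-up values:

```coffeescript
x = 5

# `then` and `else` on one line ...
inline = if x > 3 then 'big' else 'small'

# ... match the indented block form.
block = if x > 3
  'big'
else
  'small'

square      = (n) -> n * n
squareBlock = (n) ->
  n * n

# Single-line try/catch; JSON.parse throws on the malformed input.
attempt = try JSON.parse('{') catch err then 'invalid'

console.log inline is block                # true
console.log square(4) is squareBlock(4)    # true
console.log attempt                        # invalid
```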
  • ¶

    Tag postfix conditionals as such, so that we can parse them with a different precedence.

      tagPostfixConditionals: ->
    
        original = null
    
        condition = (token, i) ->
          [tag] = token
          [prevTag] = @tokens[i - 1]
          tag is 'TERMINATOR' or (tag is 'INDENT' and prevTag not in SINGLE_LINERS)
    
        action = (token, i) ->
          if token[0] isnt 'INDENT' or (token.generated and not token.fromThen)
            original[0] = 'POST_' + original[0]
    
        @scanTokens (token, i) ->
          return 1 unless token[0] is 'IF'
          original = token
          @detectEnd i + 1, condition, action
          return 1
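
    The difference in precedence is visible in ordinary code. In this small sketch the variables are illustrative only.

```coffeescript
a = 'old'
a = 'new' if no        # postfix: the whole assignment is skipped

b = 'old'
b = if no then 'new'   # prefix: the assignment runs, the value is undefined

console.log a, b       # old undefined
```

    A postfix `if` wraps the entire statement to its left, while a prefix `if` is just an expression on the right-hand side.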
  • ¶

    Generate the indentation tokens, based on another token on the same line.

      indentation: (origin) ->
        indent  = ['INDENT', 2]
        outdent = ['OUTDENT', 2]
        if origin
          indent.generated = outdent.generated = yes
          indent.origin = outdent.origin = origin
        else
          indent.explicit = outdent.explicit = yes
        [indent, outdent]
    
      generate: generate
  • ¶

    Look up a tag by token index.

      tag: (i) -> @tokens[i]?[0]
  • ¶

    Constants

  • ¶

    List of the token pairs that must be balanced.

    BALANCED_PAIRS = [
      ['(', ')']
      ['[', ']']
      ['{', '}']
      ['INDENT', 'OUTDENT'],
      ['CALL_START', 'CALL_END']
      ['PARAM_START', 'PARAM_END']
      ['INDEX_START', 'INDEX_END']
      ['STRING_START', 'STRING_END']
      ['REGEX_START', 'REGEX_END']
    ]
  • ¶

    The inverse mappings of the BALANCED_PAIRS we're trying to fix up, so we can look things up from either end.

    exports.INVERSES = INVERSES = {}
  • ¶

    The tokens that signal the start/end of a balanced pair.

    EXPRESSION_START = []
    EXPRESSION_END   = []
    
    for [left, rite] in BALANCED_PAIRS
      EXPRESSION_START.push INVERSES[rite] = left
      EXPRESSION_END  .push INVERSES[left] = rite
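
    For a quick look at what this loop produces, here is a self-contained sketch that runs the same loop over a shortened, illustrative pair list.

```coffeescript
# Shortened pair list, for illustration only.
BALANCED_PAIRS = [
  ['(', ')']
  ['INDENT', 'OUTDENT']
  ['CALL_START', 'CALL_END']
]

INVERSES = {}
EXPRESSION_START = []
EXPRESSION_END   = []

# Each assignment fills INVERSES and yields the value that gets pushed.
for [left, rite] in BALANCED_PAIRS
  EXPRESSION_START.push INVERSES[rite] = left
  EXPRESSION_END.push INVERSES[left] = rite

console.log INVERSES['INDENT'], INVERSES['CALL_END']   # OUTDENT CALL_START
console.log EXPRESSION_END.join ' '                    # ) OUTDENT CALL_END
```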
  • ¶

    Tokens that indicate the close of a clause of an expression.

    EXPRESSION_CLOSE = ['CATCH', 'THEN', 'ELSE', 'FINALLY'].concat EXPRESSION_END
  • ¶

    Tokens that, if followed by an IMPLICIT_CALL, indicate a function invocation.

    IMPLICIT_FUNC    = ['IDENTIFIER', 'PROPERTY', 'SUPER', ')', 'CALL_END', ']', 'INDEX_END', '@', 'THIS']
  • ¶

    Tokens that, if preceded by an IMPLICIT_FUNC, indicate a function invocation.

    IMPLICIT_CALL    = [
      'IDENTIFIER', 'PROPERTY', 'NUMBER', 'INFINITY', 'NAN'
      'STRING', 'STRING_START', 'REGEX', 'REGEX_START', 'JS'
      'NEW', 'PARAM_START', 'CLASS', 'IF', 'TRY', 'SWITCH', 'THIS'
      'UNDEFINED', 'NULL', 'BOOL'
      'UNARY', 'YIELD', 'UNARY_MATH', 'SUPER', 'THROW'
      '@', '->', '=>', '[', '(', '{', '--', '++'
    ]
    
    IMPLICIT_UNSPACED_CALL = ['+', '-']
  • ¶

    Tokens that always mark the end of an implicit call for single-liners.

    IMPLICIT_END     = ['POST_IF', 'FOR', 'WHILE', 'UNTIL', 'WHEN', 'BY',
      'LOOP', 'TERMINATOR']
  • ¶

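    Single-line flavors of block expressions that have unclosed endings. The grammar can't disambiguate them, so we insert the implicit indentation.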

    SINGLE_LINERS    = ['ELSE', '->', '=>', 'TRY', 'FINALLY', 'THEN']
    SINGLE_CLOSERS   = ['TERMINATOR', 'CATCH', 'FINALLY', 'ELSE', 'OUTDENT', 'LEADING_WHEN']
  • ¶

    Tokens that end a line.

    LINEBREAKS       = ['TERMINATOR', 'INDENT', 'OUTDENT']
  • ¶

    Tokens that close open calls when they follow a newline.

    CALL_CLOSERS     = ['.', '?.', '::', '?::']