Learning compilation and TypeScript from the Vue source (1): parse

Question

QC-L opened this issue 5 years ago · 0 comments

QC-L commented 5 years ago

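I have recently been studying TypeScript and compilers. To deepen my understanding, I read through the compiler-core package of vue-next.

Preparation

Directory structure

Before reading the source, it helps to gather some basic information, such as the project's dependencies, build process, and configuration files.

First, here is the overall directory structure: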

.
├── __tests__/
├── api-extractor.json
├── dist/
├── index.js
├── node_modules/
├── package.json
└── src/

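From the directory and files above, we can learn the following:

This package depends on estree-walker, source-map, and Babel;
package.json has no scripts field, which means the build is driven from the repository root;
The build uses Microsoft's @microsoft/api-extractor;
Jest is the test framework;
The package entry point is index.js, which checks NODE_ENV to decide whether to load the production build.

With the basics covered, let's build compiler-core: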

yarn build compiler-core -t

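After the build, the dist directory contains:

├── dist
    ├── compiler-core.cjs.js         # CJS build with error messages included
    ├── compiler-core.cjs.prod.js    # CJS build for production
    ├── compiler-core.d.ts           # TypeScript declaration file
    └── compiler-core.esm-bundler.js # ESM build for use with bundlers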

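Now that the project structure is clear, let's turn to compilation.

How compilation works

Using Babel as an example, here is a brief introduction to the principles involved:

Babel processes JavaScript source code through an AST (Abstract Syntax Tree).

The workflow is shown in the figure below:

In Babel, different packages handle different parts of the work.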

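@babel/parser parses source code into an AST
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
@babel/traverse traverses and modifies the AST
@babel/generator generates new source code from the AST, but does not format it for you (prettier can do that)
@babel/core is the core library that many Babel components depend on; it loads presets and plugins
@babel/types contains all the node types used in the AST and makes the AST easier to modify
@babel/template simplifies AST modification by letting you write templates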

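ps: Compilers share the same basic principles, so learning by comparison works best.

compiler-core overview

With a rough picture of how Babel works, let's look at the files in compiler-core/src: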

└── src
    ├── ast.ts
    ├── codegen.ts
    ├── compile.ts
    ├── errors.ts
    ├── index.ts
    ├── options.ts
    ├── parse.ts
    ├── runtimeHelpers.ts
    ├── transform.ts
    ├── transforms
    │   ├── hoistStatic.ts
    │   ├── noopDirectiveTransform.ts
    │   ├── transformElement.ts
    │   ├── transformExpression.ts
    │   ├── transformSlotOutlet.ts
    │   ├── transformText.ts
    │   ├── vBind.ts
    │   ├── vFor.ts
    │   ├── vIf.ts
    │   ├── vModel.ts
    │   ├── vOn.ts
    │   ├── vOnce.ts
    │   └── vSlot.ts
    └── utils.ts

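The directory listing is enough to map most files to their Babel counterparts and to infer what each ts file does:

parse corresponds to @babel/parser
transform corresponds to @babel/traverse
codegen corresponds to @babel/generator

Substituting these into the Babel figure gives the figure below:

The following excerpt from the source makes this concrete: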

function baseCompile(template, options = {}) {
    // ...
    const prefixIdentifiers =  (options.prefixIdentifiers === true || isModuleMode);
    // ...
    const ast = shared.isString(template) ? baseParse(template, options) : template;
    const [nodeTransforms, directiveTransforms] = getBaseTransformPreset(prefixIdentifiers);
    transform(ast, {
        ...options,
        prefixIdentifiers,
        nodeTransforms: [
            ...nodeTransforms,
            ...(options.nodeTransforms || []) // user transforms
        ],
        directiveTransforms: {
            ...directiveTransforms,
            ...(options.directiveTransforms || {}) // user transforms
        }
    });
    return generate(ast, {
        ...options,
        prefixIdentifiers
    });
}

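ps: Error handling is omitted from the code above; only the core logic is kept so that it is easier to follow.

The code shows that compiler-core reserves options.nodeTransforms (and options.directiveTransforms), which means the AST transformation stage can be customized.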
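The extension point can be illustrated with a toy pipeline. The sketch below is not the compiler-core API: MiniNode, MiniTransform, miniParse, miniGenerate, and miniCompile are hypothetical names invented for illustration, and the "template" is just a `tag:text` string. It only mirrors the pattern in baseCompile, where the built-in transforms run first and any user-supplied transforms are appended after them.

```typescript
// Hypothetical miniature of the parse -> transform -> generate pipeline.
// None of these names exist in @vue/compiler-core.
interface MiniNode {
  tag: string
  text: string
}

type MiniTransform = (node: MiniNode) => void

interface MiniOptions {
  nodeTransforms?: MiniTransform[]
}

// "parse": turn `tag:text` into a node (assumes the source contains a colon).
function miniParse(source: string): MiniNode {
  const separator = source.indexOf(':')
  return { tag: source.slice(0, separator), text: source.slice(separator + 1) }
}

// A built-in transform that always runs first.
const trimText: MiniTransform = node => {
  node.text = node.text.trim()
}

// "generate": print the node back out as markup.
function miniGenerate(node: MiniNode): string {
  return `<${node.tag}>${node.text}</${node.tag}>`
}

function miniCompile(source: string, options: MiniOptions = {}): string {
  const ast = miniParse(source)
  // Built-in transforms first, then user transforms, as in baseCompile.
  const transforms = [trimText, ...(options.nodeTransforms || [])]
  for (const transform of transforms) transform(ast)
  return miniGenerate(ast)
}

// A user transform supplied through the options object.
const upperCaseText: MiniTransform = node => {
  node.text = node.text.toUpperCase()
}

const plain = miniCompile('p: hello ')
const shouted = miniCompile('p: hello ', { nodeTransforms: [upperCaseText] })
```

Because user transforms are appended rather than replacing the built-in ones, a caller can adjust the tree without having to re-implement the default behaviour.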

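With a rough idea of what compiler-core does, let's use it to compile a piece of Vue template code.

Compilation

The Vue team provides vue-next-template-explorer for previewing compiler output, so we use that site to compile here.

Before compilation: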

<div v-if="item.isShow" v-for="(item, index) in items">{{item.name}}</div>

After compilation:

import { renderList as _renderList, Fragment as _Fragment, openBlock as _openBlock, createBlock as _createBlock, toDisplayString as _toDisplayString, createVNode as _createVNode, createCommentVNode as _createCommentVNode } from "vue"

export function render(_ctx, _cache) {
  return (_ctx.item.isShow)
    ? (_openBlock(true), _createBlock(_Fragment, { key: 0 }, _renderList(_ctx.items, (item, index) => {
        return (_openBlock(), _createBlock("div", null, _toDisplayString(item.name), 1 /* TEXT */))
      }), 256 /* UNKEYED_FRAGMENT */))
    : _createCommentVNode("v-if", true)
}

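After compiler-core compiles the template, the result is a render function.

With the output in mind, we can begin studying the compiler itself.

Parse stage: template -> AST

As the figure shows, a Vue template is converted into an AST; this step corresponds to parse.ts in the source.

The analysis of Parse is split into two parts: the TypeScript involved, and the core parsing logic.

ps: index.ts only imports and re-exports the other ts files, so it needs no further explanation.

1.TypeScript

For basic syntax, see the official TypeScript documentation; only a few practical points are covered here.

Start with this type: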

type MergedParserOptions = Omit<Required<ParserOptions>, OptionalOptions> &
  Pick<ParserOptions, OptionalOptions>

It is built from three utility types:

Required
Pick
Omit

Required

/**
 * Make all properties in T required
 */
type Required<T> = {
    [P in keyof T]-?: T[P];
};

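This is easy to understand: as the name says, it marks properties as required.

The key operation is -?, which turns optional properties into required ones.

Its counterpart is Partial, which makes every property optional.

Beyond that, keyof deserves an introduction.

keyof

keyof is a little like Object.keys: it takes all the keys of an interface and produces a union type of them.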
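A small example may help tie Required, Partial, and keyof together. RetryOptions and readOption below are hypothetical names used only for illustration; they do not come from the Vue source.

```typescript
// Hypothetical interface used only for illustration.
interface RetryOptions {
  url: string
  retries?: number
  timeoutMs?: number
}

// Every property becomes mandatory.
type StrictRetryOptions = Required<RetryOptions>

// Every property becomes optional.
type RetryOverrides = Partial<RetryOptions>

// The union 'url' | 'retries' | 'timeoutMs'.
type RetryOptionKey = keyof RetryOptions

const defaults: StrictRetryOptions = { url: '/', retries: 3, timeoutMs: 1000 }
const overrides: RetryOverrides = { retries: 5 }

// Later spreads win, so overrides replace the matching defaults.
const merged: StrictRetryOptions = { ...defaults, ...overrides }

// The key parameter only accepts names from the keyof union.
function readOption(options: StrictRetryOptions, key: RetryOptionKey): string | number {
  return options[key]
}

const retries = readOption(merged, 'retries')
```

Passing a name outside the union, such as 'retry', would be rejected at compile time, which is the practical benefit of keyof over a plain string parameter.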

Pick

/**
 * From T, pick a set of properties whose keys are in the union K
 */
type Pick<T, K extends keyof T> = {
    [P in K]: T[P];
};

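To keep things simple, pull out K extends keyof T and look at it on its own.

K extends keyof T means that K must be a subset of the union of T's keys.

Pick then takes the properties named by the union K out of T and produces a new type from them.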
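As an illustration, the sketch below pairs the Pick type with a runtime helper that does the same thing to a value. User and pickKeys are hypothetical names, not part of the Vue source.

```typescript
// Hypothetical interface used only for illustration.
interface User {
  id: number
  name: string
  email: string
  passwordHash: string
}

// Only the listed keys survive.
type PublicUser = Pick<User, 'id' | 'name'>

// Runtime counterpart of Pick: copy only the requested keys.
// K extends keyof T guarantees every key really exists on T.
function pickKeys<T, K extends keyof T>(source: T, keys: K[]): Pick<T, K> {
  const result = {} as Pick<T, K>
  for (const key of keys) {
    result[key] = source[key]
  }
  return result
}

const user: User = { id: 1, name: 'Ada', email: 'ada@example.com', passwordHash: 'x' }
const publicUser: PublicUser = pickKeys(user, ['id', 'name'])
```

The constraint on K is what lets the compiler reject a call such as pickKeys(user, ['age']), because 'age' is not in keyof User.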

Omit

/**
 * Exclude from T those types that are assignable to U
 */
type Exclude<T, U> = T extends U ? never : T;

/**
 * Construct a type with the properties of T except for those in type K.
 */
type Omit<T, K extends keyof any> = Pick<T, Exclude<keyof T, K>>;

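Exclude removes from the union T every member that is assignable to U.

Omit removes the properties named by the union K from T and produces a new type from what remains.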
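The difference between the two is easier to see side by side: Exclude works on the members of a union, while Omit works on the properties of an object type. Shape, Account, and toSafeAccount in this sketch are hypothetical names chosen for illustration.

```typescript
// Exclude filters the members of a union.
type Shape = 'circle' | 'square' | 'triangle'
type AngularShape = Exclude<Shape, 'circle'> // 'square' | 'triangle'

const angularShapes: AngularShape[] = ['square', 'triangle']

// Omit filters the properties of an object type.
interface Account {
  id: number
  name: string
  passwordHash: string
}
type SafeAccount = Omit<Account, 'passwordHash'>

// Runtime counterpart: drop one property with rest destructuring.
function toSafeAccount(account: Account): SafeAccount {
  const { passwordHash: _dropped, ...rest } = account
  return rest
}

const safeAccount = toSafeAccount({ id: 7, name: 'Ada', passwordHash: 'secret' })
```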

Explaining MergedParserOptions

type OptionalOptions = 'isNativeTag' | 'isBuiltInComponent'
type MergedParserOptions = Omit<Required<ParserOptions>, OptionalOptions> &
  Pick<ParserOptions, OptionalOptions>

Put simply, the properties of interface ParserOptions that are named in the union OptionalOptions stay optional, while every other property becomes required.
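The same construction can be tried on a cut-down interface. SketchParserOptions below is a simplified stand-in with only four fields (the real ParserOptions has more), and the Sketch-prefixed names and mergeSketchOptions are assumptions made for this example.

```typescript
// Simplified stand-in for ParserOptions; the real interface has more fields.
interface SketchParserOptions {
  delimiters?: [string, string]
  isVoidTag?: (tag: string) => boolean
  isNativeTag?: (tag: string) => boolean
  isBuiltInComponent?: (tag: string) => boolean
}

type SketchOptionalOptions = 'isNativeTag' | 'isBuiltInComponent'

// Required for everything except the two optional keys, which keep their "?".
type SketchMergedOptions = Omit<Required<SketchParserOptions>, SketchOptionalOptions> &
  Pick<SketchParserOptions, SketchOptionalOptions>

// delimiters and isVoidTag must be present; the two optional keys may be left out.
const sketchDefaults: SketchMergedOptions = {
  delimiters: ['{{', '}}'],
  isVoidTag: () => false
}

// User options are spread over the defaults, so the result is fully populated
// for every required key.
function mergeSketchOptions(user: SketchParserOptions): SketchMergedOptions {
  return { ...sketchDefaults, ...user }
}

const mergedOptions = mergeSketchOptions({ isNativeTag: tag => tag === 'div' })
```

Leaving delimiters out of sketchDefaults would be a compile error, while leaving out isNativeTag is accepted, which is exactly the split the type describes.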

Verification

The source below defines the default parser options. Every option except isNativeTag and isBuiltInComponent is given a default value.

export const defaultParserOptions: MergedParserOptions = {
  delimiters: [`{{`, `}}`],
  getNamespace: () => Namespaces.HTML,
  getTextMode: () => TextModes.DATA,
  isVoidTag: NO,
  isPreTag: NO,
  isCustomElement: NO,
  decodeEntities: (rawText: string): string =>
    rawText.replace(decodeRE, (_, p1) => decodeMap[p1]),
  onError: defaultOnError
}

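Those are the slightly more advanced TypeScript usages in parse.ts, all built on Utility Types.

2.Parse core logic

Before studying the core logic, let's see how vue-next debugs the compiler.

Local debugging

At the start of the article we mentioned vue-next-template-explorer. Besides serving as a learning reference, it is also the debugging tool for the compiler.

Reading through the source, I found the commands that start template-explorer: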

yarn dev-compiler
yarn open

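From here we can experiment with the source code freely.

Core logic

We keep using the example from the beginning of the article (because it contains both if and for):

<div v-if="item.isShow" v-for="(item, index) in items">{{item.name}}</div>

First, find baseParse, the main function of Parse: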

export function baseParse(
  content: string,
  options: ParserOptions = {}
): RootNode {
  const context = createParserContext(content, options)
  const start = getCursor(context)
  return createRoot(
    parseChildren(context, TextModes.DATA, []),
    getSelection(context, start)
  )
}

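As established at the start, the template is compiled into an AST during the parse stage.

It follows that the value returned by createRoot in the code above is the parsed AST object, and its type is RootNode.

An AST is essentially a tree of plain objects that can be written out as JSON. Here is the basic structure of the AST for the template above: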

{
  "type": 0,
  "children": [
    {
      "type": 1,
      "ns": 0,
      "tag": "div",
      "tagType": 0,
      "props": [
        {
          "type": 7,
          "name": "if",
          "exp": {
            "type": 4,
            "content": "item.isShow",
            "isStatic": false,
            "isConstant": false,
            "loc": {
              "start": {
                "column": 12,
                "line": 1,
                "offset": 11
              },
              "end": {
                "column": 23,
                "line": 1,
                "offset": 22
              },
              "source": "item.isShow"
            }
          },
          "modifiers": [],
          "loc": {
            "start": {
              "column": 6,
              "line": 1,
              "offset": 5
            },
            "end": {
              "column": 24,
              "line": 1,
              "offset": 23
            },
            "source": "v-if=\"item.isShow\""
          }
        },
        {
          "type": 7,
          "name": "for",
          "exp": {
            "type": 4,
            "content": "(item, index) in items",
            "isStatic": false,
            "isConstant": false,
            "loc": {
              "start": {
                "column": 32,
                "line": 1,
                "offset": 31
              },
              "end": {
                "column": 54,
                "line": 1,
                "offset": 53
              },
              "source": "(item, index) in items"
            }
          },
          "modifiers": [],
          "loc": {
            "start": {
              "column": 25,
              "line": 1,
              "offset": 24
            },
            "end": {
              "column": 55,
              "line": 1,
              "offset": 54
            },
            "source": "v-for=\"(item, index) in items\""
          }
        }
      ],
      "isSelfClosing": false,
      "children": [
        {
          "type": 5,
          "content": {
            "type": 4,
            "isStatic": false,
            "isConstant": false,
            "content": "item.name",
            "loc": {
              "start": {
                "column": 58,
                "line": 1,
                "offset": 57
              },
              "end": {
                "column": 67,
                "line": 1,
                "offset": 66
              },
              "source": "item.name"
            }
          },
          "loc": {
            "start": {
              "column": 56,
              "line": 1,
              "offset": 55
            },
            "end": {
              "column": 69,
              "line": 1,
              "offset": 68
            },
            "source": "{{item.name}}"
          }
        }
      ],
      "loc": {
        "start": {
          "column": 1,
          "line": 1,
          "offset": 0
        },
        "end": {
          "column": 75,
          "line": 1,
          "offset": 74
        },
        "source": "<div v-if=\"item.isShow\" v-for=\"(item, index) in items\">{{item.name}}</div>"
      }
    }
  ],
  "helpers": [],
  "components": [],
  "directives": [],
  "hoists": [],
  "imports": [],
  "cached": 0,
  "temps": 0,
  "loc": {
    "start": {
      "column": 1,
      "line": 1,
      "offset": 0
    },
    "end": {
      "column": 75,
      "line": 1,
      "offset": 74
    },
    "source": "<div v-if=\"item.isShow\" v-for=\"(item, index) in items\">{{item.name}}</div>"
  }
}

baseParse calls five functions:

createParserContext
getCursor
createRoot
getSelection
parseChildren: the core processing logic

createParserContext

This function creates a context object that holds the parser's state while parsing proceeds.

function createParserContext(
  content: string,
  options: ParserOptions
): ParserContext {
  return {
    options: {
      ...defaultParserOptions,
      ...options
    },
    column: 1,
    line: 1,
    offset: 0,
    originalSource: content,
    source: content,
    inPre: false,
    inVPre: false
  }
}

getCursor

Using cursors, one can search an AST for a selected node and replace, delete, update, or detach it. —— AST_Cursors

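A cursor can be thought of as an index that marks a position in the source; this function reads the cursor's current value from the context.

A cursor consists of column, line, and offset.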

function getCursor(context: ParserContext): Position {
  const { column, line, offset } = context
  return { column, line, offset }
}
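To see how the three fields move together, here is a minimal sketch of advancing a cursor over consumed text. advanceCursor is a hypothetical helper written for this example, not a function quoted from parse.ts; it assumes line and column are 1-based and offset is 0-based, which matches the loc values in the AST shown earlier.

```typescript
interface Position {
  offset: number // 0-based index into the source string
  line: number   // 1-based line number
  column: number // 1-based column number
}

// Hypothetical helper: return the cursor after `consumed` has been read.
function advanceCursor(cursor: Position, consumed: string): Position {
  let { offset, line, column } = cursor
  for (let i = 0; i < consumed.length; i++) {
    if (consumed.charCodeAt(i) === 10 /* newline */) {
      // A newline moves to the start of the next line.
      line++
      column = 1
    } else {
      column++
    }
  }
  offset += consumed.length
  return { offset, line, column }
}

const source = '<div>\n  <p>text</p>\n</div>'
const start: Position = { offset: 0, line: 1, column: 1 }
// Consume '<div>', the newline, two spaces of indentation, and '<p>'.
const afterOpenTags = advanceCursor(start, source.slice(0, 11))
// Two cursors delimit a slice of the source, much like a selection.
const selected = source.slice(start.offset, afterOpenTags.offset)
```

Keeping offset alongside line and column means a location can be used both for slicing the source and for reporting a human-readable error position.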

createRoot

Its purpose is to create the root of the AST JSON.

You can think of every template as having the same kind of root node.

Its parameters are children and loc.

export const locStub: SourceLocation = {
  source: '',
  start: { line: 1, column: 1, offset: 0 },
  end: { line: 1, column: 1, offset: 0 }
}

export function createRoot(
  children: TemplateChildNode[],
  loc = locStub
): RootNode {
  return {
    type: NodeTypes.ROOT,
    children,
    helpers: [],
    components: [],
    directives: [],
    hoists: [],
    imports: [],
    cached: 0,
    temps: 0,
    codegenNode: undefined,
    loc
  }
}

getSelection

function getSelection(
  context: ParserContext,
  start: Position,
  end?: Position
): SourceLocation {
  end = end || getCursor(context)
  return {
    start,
    end,
    source: context.originalSource.slice(start.offset, end.offset)
  }
}

parseChildren

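This function holds the core processing logic. (The most important part is saved for last.)

Most of us learned the ways of traversing a tree at university.

Depth-first traversal
Breadth-first traversal

An AST is essentially a tree, so both traversal methods apply to it.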
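As a quick refresher, both traversals can be sketched over a simplified tree. TreeNode, depthFirst, and breadthFirst are hypothetical names for this example; real AST nodes carry far more fields than a tag and children.

```typescript
// Simplified tree node for illustration only.
interface TreeNode {
  tag: string
  children: TreeNode[]
}

// Depth-first (pre-order): visit a node, then each subtree in full.
function depthFirst(root: TreeNode): string[] {
  const visited: string[] = [root.tag]
  for (const child of root.children) {
    visited.push(...depthFirst(child))
  }
  return visited
}

// Breadth-first: visit level by level using a queue.
function breadthFirst(root: TreeNode): string[] {
  const visited: string[] = []
  const queue: TreeNode[] = [root]
  while (queue.length > 0) {
    const node = queue.shift() as TreeNode
    visited.push(node.tag)
    queue.push(...node.children)
  }
  return visited
}

const tree: TreeNode = {
  tag: 'div',
  children: [
    { tag: 'ul', children: [{ tag: 'li', children: [] }] },
    { tag: 'p', children: [] }
  ]
}

const depthOrder = depthFirst(tree)     // div, ul, li, p
const breadthOrder = breadthFirst(tree) // div, ul, p, li
```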

So how would we go about turning a template into an AST?

Take this example:

<div>
  <div>
    <span>示例</span>
  </div>
</div>

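Setting aside self-closing tags, comments, attributes, directives, and interpolation, the simplified AST looks like this:

ps: The features set aside here are all handled in the actual source.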

{
  type: 0,
  children: [
    {
      tag: 'div',
      children: [
        {
          tag: 'div',
          children: [
            {
              tag: 'span',
              children: [
                {
                  content: '示例'
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

The source is consumed in order from the front, and < and </ are used to decide whether a node is starting or ending.

When parsing, vue-next handles several text modes:

Text mode	Applies to
DATA	General content
RCDATA	`<textarea>`
RAWTEXT	`<style>`,`<script>`
CDATA	`<![CDATA[]]>` sections in XML
ATTRIBUTE_VALUE	Attribute values
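The tag-to-mode part of the table can be expressed as a small lookup. This is a sketch built only from the table above: SketchTextMode and textModeForTag are hypothetical names, and the string union stands in for the TextModes enum used by the source. CDATA and ATTRIBUTE_VALUE are not chosen by tag name, so the function never returns them.

```typescript
// String union standing in for the TextModes enum (an assumption of this sketch).
type SketchTextMode = 'DATA' | 'RCDATA' | 'RAWTEXT' | 'CDATA' | 'ATTRIBUTE_VALUE'

// Hypothetical helper: choose the mode for an element's content by tag name,
// following the table above.
function textModeForTag(tag: string): SketchTextMode {
  if (tag === 'textarea') return 'RCDATA'
  if (tag === 'style' || tag === 'script') return 'RAWTEXT'
  return 'DATA'
}

const modes = ['div', 'textarea', 'script', 'style'].map(textModeForTag)
```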

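With comments standing in for code:

function parseChildren(
  context: ParserContext,
  mode: TextModes,
  ancestors: ElementNode[]
): TemplateChildNode[] {
  const nodes: TemplateChildNode[] = []
  // Loop until isEnd reports that the current level of the template is finished
  while (!isEnd(context, mode, ancestors)) {
    // Handle interpolation
    // Handle comments
    // Handle tags
    //   recursively call parseChildren
    // Handle elements (custom components)
    //   recursively call parseChildren
    // Handle all attributes
    //   handle directives
    // Append each parsed node to nodes
  }
  return nodes
}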
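To make the skeleton above concrete, here is a heavily simplified recursive parser. It is a sketch under strong assumptions, not the vue-next implementation: it supports only attribute-free tags and plain text, drops whitespace-only text, and every name in it (SketchContext, sketchParseChildren, sketchParseElement, sketchParse) is hypothetical. It does show the central idea: parsing children and parsing an element call each other recursively while consuming the source from the front.

```typescript
interface SketchElement {
  tag: string
  children: SketchNode[]
}
interface SketchText {
  content: string
}
type SketchNode = SketchElement | SketchText

// The context holds the part of the source that has not been consumed yet.
interface SketchContext {
  source: string
}

function sketchParseChildren(context: SketchContext): SketchNode[] {
  const nodes: SketchNode[] = []
  // Stop at the end of input or at the closing tag of the parent element.
  while (context.source.length > 0 && !context.source.startsWith('</')) {
    if (context.source.startsWith('<')) {
      nodes.push(sketchParseElement(context))
    } else {
      const end = context.source.indexOf('<')
      const raw = end === -1 ? context.source : context.source.slice(0, end)
      context.source = context.source.slice(raw.length)
      // Whitespace-only text between tags is dropped to keep the tree small.
      if (raw.trim().length > 0) nodes.push({ content: raw.trim() })
    }
  }
  return nodes
}

function sketchParseElement(context: SketchContext): SketchElement {
  // Only attribute-free opening tags such as <div> are recognised.
  const open = /^<([a-z][a-z0-9]*)>/i.exec(context.source)
  if (open === null) throw new Error('Expected an opening tag')
  const tag = open[1]
  context.source = context.source.slice(open[0].length)
  // Recursion: the element's children are parsed by the same routine.
  const children = sketchParseChildren(context)
  const close = `</${tag}>`
  if (!context.source.startsWith(close)) throw new Error(`Missing ${close}`)
  context.source = context.source.slice(close.length)
  return { tag, children }
}

function sketchParse(template: string) {
  return { type: 0, children: sketchParseChildren({ source: template }) }
}

const sketchAst = sketchParse('<div>\n  <div>\n    <span>example</span>\n  </div>\n</div>')
```

Run on a template shaped like the nested div/span example, this produces the same simplified tree structure: a root with type 0 whose children nest div, div, span, and a text node.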
