Lexical grammar

This page describes JavaScript's lexical grammar. JavaScript source text is just a sequence of characters; for the interpreter to understand it, the text has to be parsed into a more structured representation. The initial step of parsing is called lexical analysis, in which the text is scanned from left to right and converted into a sequence of individual, atomic input elements. Some input elements are insignificant to the interpreter and are stripped after this step: they include white space and comments. The others, including identifiers, keywords, literals, and punctuators (mostly operators), are used for further syntax analysis. Line terminators and multiline comments are also syntactically insignificant, but they guide the process of automatic semicolon insertion, which makes certain otherwise invalid token sequences valid.

Format-control characters

Format-control characters have no visual representation but are used to control the interpretation of the text.

| Code point | Name | Abbreviation | Description |
| --- | --- | --- | --- |
| U+200C | Zero width non-joiner | <ZWNJ> | Placed between characters to prevent them from being connected into ligatures in certain languages (Wikipedia). |
| U+200D | Zero width joiner | <ZWJ> | Placed between characters that would not normally be connected, so that they are rendered using their connected form in certain languages (Wikipedia). |
| U+FEFF | Byte order mark | <BOM> | Used at the start of the script to mark it as Unicode and to indicate the text's byte order (Wikipedia). |

In JavaScript source text, <ZWNJ> and <ZWJ> are treated as identifier parts, while <BOM> (also called a zero-width no-break space, <ZWNBSP>, when not at the start of text) is treated as white space.
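As a rough illustration of the paragraph above, the following sketch builds small pieces of source text that contain <ZWJ> and <BOM> and evaluates them with the Function constructor. The variable names are made up for this example, and the characters are written as escapes only so that they are visible here.

```javascript
// Invisible characters, spelled out as escapes (assumed sample values).
const ZWJ = "\u200D";
const BOM = "\uFEFF";

// <ZWJ> is accepted inside an identifier, after its first character.
const readWithJoiner = new Function(`var a${ZWJ}b = 42; return a${ZWJ}b;`);
const joinerResult = readWithJoiner();

// <BOM> in the middle of source text acts as white space between tokens.
const readWithBom = new Function(`return${BOM}7;`);
const bomResult = readWithBom();

console.log(joinerResult, bomResult);
```

Note that an identifier containing <ZWJ> is a different name from the same identifier without it, which is one reason such characters are best avoided in hand-written code.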

White space

White space characters improve the readability of source text and separate tokens from each other. These characters are usually unnecessary for the functionality of the code. Minification tools are often used to remove white space in order to reduce the amount of data that needs to be transferred.

| Code point | Name | Abbreviation | Description | Escape sequence |
| --- | --- | --- | --- | --- |
| U+0009 | Character tabulation | <TAB> | Horizontal tabulation | \t |
| U+000B | Line tabulation | <VT> | Vertical tabulation | \v |
| U+000C | Form feed | <FF> | Page breaking control character (Wikipedia). | \f |
| U+0020 | Space | <SP> | Normal space | |
| U+00A0 | No-break space | <NBSP> | Normal space, but with no point at which a line may break | |
| U+FEFF | Zero-width no-break space | <ZWNBSP> | When not at the start of a script, the BOM marker is a normal white space character. | |
| Others | Other Unicode space characters | <USP> | Characters in the "Space_Separator" general category | |

Note: Of the characters that have the "White_Space" property but are not in the "Space_Separator" general category, U+0009, U+000B, and U+000C are still treated as white space in JavaScript; U+0085 NEXT LINE has no special role; the others form the set of line terminators.

Note: Changes to the Unicode standard used by the JavaScript engine may affect programs' behavior. For example, ES2016 upgraded the reference Unicode standard from 5.1 to 8.0.0, which caused U+180E MONGOLIAN VOWEL SEPARATOR to be moved from the "Space_Separator" category to the "Format (Cf)" category, so it is no longer white space. As a result, "\u180E".trim().length changed from 0 to 1.
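The two notes above can be checked with a short sketch. It assumes an engine that follows ES2016 or later; the sample string and variable names are invented for illustration.

```javascript
// Sample white space characters, written as escapes so they are visible.
const tab = "\t";
const nbsp = "\u00A0";
const zwnbsp = "\uFEFF";

// trim() removes every character the lexical grammar treats as white space.
const padded = `${tab}${nbsp}${zwnbsp}value${zwnbsp}${nbsp}${tab}`;
const trimmed = padded.trim();

// U+180E is no longer white space in engines based on ES2016 or later.
const mvsLength = "\u180E".trim().length;

console.log(JSON.stringify(trimmed), mvsLength);
```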

Line terminators

In addition to white space characters, line terminator characters are used to improve the readability of the source text. However, in some cases, line terminators can influence the execution of JavaScript code, as there are a few places where they are forbidden. Line terminators also affect the process of automatic semicolon insertion.

Outside the context of lexical grammar, white space and line terminators are often conflated. For example, String.prototype.trim() removes all white space and line terminators from the beginning and end of a string. The \s character class escape in regular expressions matches all white space and line terminators.

Only the following Unicode code points are treated as line terminators in ECMAScript. Other line-breaking characters are not line terminators; for example, U+0085 NEXT LINE (NEL) has no special role in the lexical grammar.

| Code point | Name | Abbreviation | Description | Escape sequence |
| --- | --- | --- | --- | --- |
| U+000A | Line feed | <LF> | New line character in UNIX systems. | \n |
| U+000D | Carriage return | <CR> | New line character in Commodore and early Mac systems. | \r |
| U+2028 | Line separator | <LS> | Wikipedia | |
| U+2029 | Paragraph separator | <PS> | Wikipedia | |
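A small sketch of how the four line terminators behave at runtime, under the assumption that the Function constructor is available in the host. All names are illustrative.

```javascript
const lineTerminators = ["\n", "\r", "\u2028", "\u2029"];

// \s and trim() treat line terminators like white space.
const allMatchWhitespaceClass = lineTerminators.every((ch) => /^\s$/.test(ch));
const allTrimmed = lineTerminators.every((ch) => `${ch}x${ch}`.trim() === "x");

// <LS> ends a line comment, so the code after it still runs.
const afterLineSeparator = new Function("// comment\u2028return 'ran';")();

// U+0085 NEXT LINE is not matched by \s.
const nelMatchesWhitespaceClass = /\s/.test("\u0085");

console.log(allMatchWhitespaceClass, allTrimmed, afterLineSeparator, nelMatchesWhitespaceClass);
```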

Comments

Comments are used to add hints, notes, suggestions, or warnings to JavaScript code, which can make it easier to read and understand. They can also be used to disable code to prevent it from being executed, which can be a valuable debugging tool.

JavaScript has two long-standing ways to add comments to code: line comments and block comments. In addition, there's a special hashbang comment syntax.

Line comments

The first way is the // comment; it turns all text following it on the same line into a comment. For example:

js
function comment() {
  // This is a one line JavaScript comment
  console.log("Hello world!");
}
comment();

Block comments

The second way is the /* */ style, which is much more flexible.

For example, you can use it on a single line:

js
function comment() {
  /* This is a one line JavaScript comment */
  console.log("Hello world!");
}
comment();

You can also make multiple-line comments, like this:

js
function comment() {
  /* This comment spans multiple lines. Notice
     that we don't need to end the comment until we're done. */
  console.log("Hello world!");
}
comment();

You can also use it in the middle of a line, although this can make your code harder to read, so it should be used with caution:

js
function comment(x) {
  console.log("Hello " + x /* insert the value of x */ + " !");
}
comment("world");

In addition, you can use it to disable code and prevent it from running, by wrapping the code in a comment, like this:

js
function comment() {
  /* console.log("Hello world!"); */
}
comment();

In this case, the console.log() call is never issued, since it's inside a comment. Any number of lines of code can be disabled this way.

Block comments that contain at least one line terminator behave like line terminators in automatic semicolon insertion.

Hashbang comments

There's a special third comment syntax, the hashbang comment. A hashbang comment behaves exactly like a single-line (//) comment, except that it begins with #! and is only valid at the absolute start of a script or module. Note also that no white space of any kind is permitted before the #!. The comment consists of all the characters after #! up to the end of the first line; only one such comment is permitted.

Hashbang comments in JavaScript resemble shebangs in Unix, which provide the path to the specific JavaScript interpreter that you want to use to execute the script. Before the hashbang comment was standardized, it had already been implemented de facto in non-browser hosts like Node.js, where it was stripped from the source text before being passed to the engine. An example is as follows:

js
#!/usr/bin/env node

console.log("Hello world");

The JavaScript interpreter treats it as a normal comment; it only has semantic meaning to the shell if the script is run directly in a shell.

Warning: If you want scripts to be runnable directly in a shell environment, encode them in UTF-8 without a BOM. Although a BOM will not cause any problems for code running in a browser (because it's stripped during UTF-8 decoding, before the source text is analyzed), a Unix/Linux shell will not recognize the hashbang if it's preceded by a BOM character.

You must only use the #! comment style to specify a JavaScript interpreter. In all other cases, just use a // comment (or a multiline comment).

Identifiers

An identifier is used to link a value with a name. Identifiers can be used in various places:

js
const decl = 1; // Variable declaration (may also be `let` or `var`)
function fn() {} // Function declaration
const obj = { key: "value" }; // Object keys
// Class declaration
class C {
  #priv = "value"; // Private property
}
lbl: console.log(1); // Label

In JavaScript, identifiers are commonly made of alphanumeric characters, underscores (_), and dollar signs ($). Identifiers are not allowed to start with a digit. However, JavaScript identifiers are not limited to ASCII: many Unicode code points are allowed as well. Namely, any character in the ID_Start category can start an identifier, while any character in the ID_Continue category can appear after the first character.

Note: If, for some reason, you need to parse some JavaScript source yourself, do not assume all identifiers follow the pattern /[A-Za-z_$][\w$]*/ (i.e. ASCII-only)! The range of identifiers can be described by the regex /[$_\p{ID_Start}][$\u200c\u200d\p{ID_Continue}]*/u (excluding Unicode escape sequences).
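To make the difference between the two patterns in the note concrete, here is a minimal sketch that anchors both regexes and runs them over a few sample names. The sample list and variable names are assumptions made for this example, and escape sequences in identifiers are not handled.

```javascript
// The two patterns from the note, anchored so the whole name must match.
const asciiOnly = /^[A-Za-z_$][\w$]*$/;
const unicodeAware = /^[$_\p{ID_Start}][$\u200c\u200d\p{ID_Continue}]*$/u;

// Illustrative sample names; "caf\u00e9" is "café" with a precomposed é.
const samples = ["decl", "$el", "你好", "caf\u00e9", "1abc", "a-b"];

const results = samples.map((name) => ({
  name,
  ascii: asciiOnly.test(name),
  unicode: unicodeAware.test(name),
}));

console.table(results);
```

The ASCII-only pattern rejects 你好 and café even though both are valid identifiers, while both patterns reject 1abc and a-b.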

In addition, JavaScript allows using Unicode escape sequences in the form of \u0000 or \u{000000} in identifiers, which encode the same string value as the actual Unicode characters. For example, 你好 and \u4f60\u597d are the same identifier:

js
const 你好 = "Hello";
console.log(\u4f60\u597d); // Hello

Not all places accept the full range of identifiers. Certain syntaxes, such as function declarations, function expressions, and variable declarations, require identifier names that are not reserved words.

js
function import() {} // Illegal: import is a reserved word.

Most notably, private properties and object properties allow reserved words.

js
const obj = { import: "value" }; // Legal despite `import` being reserved
class C {
  #import = "value";
}

Keywords

Keywords are tokens that look like identifiers but have special meanings in JavaScript. For example, the keyword async before a function declaration indicates that the function is asynchronous.

Some keywords are reserved, meaning that they cannot be used as identifiers in variable declarations, function declarations, etc. They are often called reserved words. A list of these reserved words is provided below. Not all keywords are reserved: for example, async can be used as an identifier anywhere. Some keywords are only contextually reserved: for example, await is only reserved within the body of an async function, and let is only reserved in strict mode code and in const and let declarations.

Identifiers are always compared by string value, so escape sequences are interpreted. For example, this is still a syntax error:

js
const els\u{65} = 1;
// `els\u{65}` encodes the same identifier as `else`
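One way to observe this without breaking the surrounding file is to hand the source text to the Function constructor and catch the SyntaxError. The helper name parses is hypothetical, and this is only a sketch of the behavior described above.

```javascript
// Hypothetical helper: reports whether a function body parses.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error; // Anything else is unexpected.
  }
}

// `els\u{65}` spells the reserved word `else`, so the declaration is rejected.
const escapedKeyword = parses("const els\\u{65} = 1;");

// `\u{65}lse2` spells `else2`, an ordinary identifier, so it is accepted.
const escapedPlainName = parses("const \\u{65}lse2 = 1;");

console.log(escapedKeyword, escapedPlainName);
```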

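Reserved words

These keywords cannot be used as identifiers for variables, functions, classes, etc. anywhere in JavaScript source.

  • break
  • case
  • catch
  • class
  • const
  • continue
  • debugger
  • default
  • delete
  • do
  • else
  • export
  • extends
  • false
  • finally
  • for
  • function
  • if
  • import
  • in
  • instanceof
  • new
  • null
  • return
  • super
  • switch
  • this
  • throw
  • true
  • try
  • typeof
  • var
  • void
  • while
  • with

The following are only reserved when they are found in strict mode code:

  • let (also reserved in const, let, and class declarations)
  • static
  • yield (also reserved in generator function bodies)

The following are only reserved when they are found in module code or async function bodies:

  • await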

Future reserved words

The following are reserved as future keywords by the ECMAScript specification. They have no special functionality at present, but they might at some future time, so they cannot be used as identifiers.

These are always reserved:

  • enum

The following are only reserved when they are found in strict mode code:

  • implements
  • interface
  • package
  • private
  • protected
  • public

Future reserved words in older standards

The following are reserved as future keywords by older ECMAScript specifications (ECMAScript 1 through 3).

  • abstract
  • boolean
  • byte
  • char
  • double
  • final
  • float
  • goto
  • int
  • long
  • native
  • short
  • synchronized
  • throws
  • transient
  • volatile

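Identifiers with special meanings

A few identifiers have a special meaning in some contexts without being reserved words of any kind. They include:

  • arguments (not a keyword, but cannot be declared as an identifier in strict mode)
  • as (import * as ns from "mod")
  • async
  • eval (not a keyword, but cannot be declared as an identifier in strict mode)
  • from (import x from "mod")
  • get
  • of
  • set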

Literals

Note: This section discusses literals that are atomic tokens. Object literals and array literals are expressions that consist of a series of tokens.

Null literal

See also null for more information.

js
null

Boolean literal

See also boolean type for more information.

js
true
false

Numeric literals

The Number and BigInt types use numeric literals.

Decimal

js
1234567890
42

Decimal literals can start with a zero (0) followed by another decimal digit, but if all digits after the leading 0 are smaller than 8, the number is interpreted as an octal number. This is considered a legacy syntax, and number literals prefixed with 0, whether interpreted as octal or decimal, cause a syntax error in strict mode, so use the 0o prefix instead.

js
0888 // 888 parsed as decimal
0777 // parsed as octal, 511 in decimal
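Because the legacy syntax is rejected in strict mode (and therefore in modules), the sketch below evaluates it through the Function constructor, whose bodies are non-strict unless they opt in. The variable names are illustrative only.

```javascript
// Function constructor bodies are sloppy-mode code unless they say otherwise.
const legacyOctal = new Function("return 0777;")(); // read as octal
const legacyDecimal = new Function("return 0888;")(); // 8 is not an octal digit

// The explicit 0o prefix works in every mode.
const modernOctal = 0o777;

// In strict mode, the legacy form does not even parse.
let strictError = null;
try {
  new Function('"use strict"; return 0777;');
} catch (error) {
  strictError = error;
}

console.log(legacyOctal, legacyDecimal, modernOctal, strictError instanceof SyntaxError);
```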

Exponential

The decimal exponential literal has the format beN, where b is a base number (integer or floating-point), followed by an E or e character (which serves as the separator or exponent indicator) and N, the exponent, which is a signed integer.

js
0e-5   // 0
0e+5   // 0
5e1    // 50
175e-2 // 1.75
1e3    // 1000
1e-3   // 0.001
1E3    // 1000

Binary

Binary number syntax uses a leading zero followed by a lowercase or uppercase Latin letter "B" (0b or 0B). Any character after the 0b that is not 0 or 1 will terminate the literal sequence.

js
0b10000000000000000000000000000000 // 2147483648
0b01111111100000000000000000000000 // 2139095040
0B00000000011111111111111111111111 // 8388607

Octal

Octal number syntax uses a leading zero followed by a lowercase or uppercase Latin letter "O" (0o or 0O). Any character after the 0o that is outside the range (01234567) will terminate the literal sequence.

js
0O755 // 493
0o644 // 420

Hexadecimal

Hexadecimal number syntax uses a leading zero followed by a lowercase or uppercase Latin letter "X" (0x or 0X). Any character after the 0x that is outside the range (0123456789ABCDEF, in either case) will terminate the literal sequence.

js
0xFFFFFFFFFFFFFFFFF // 295147905179352830000
0x123456789ABCDEF   // 81985529216486900
0XA                 // 10

BigInt literal

The BigInt type is a numeric primitive in JavaScript that can represent integers with arbitrary precision. BigInt literals are created by appending n to the end of an integer.

js
123456789123456789n     // 123456789123456789
0o777777777777n         // 68719476735
0x123456789ABCDEFn      // 81985529216486895
0b11101001010101010101n // 955733
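The comments in the hexadecimal and BigInt examples differ in their last digits because Number literals above the safe integer range are rounded. A brief sketch, with made-up variable names, makes the rounding visible:

```javascript
// The same digits written as a Number literal and as a BigInt literal.
const asNumber = 0x123456789ABCDEF;
const asBigInt = 0x123456789ABCDEFn;

// The Number value is outside the safe integer range, so it was rounded.
const isSafe = Number.isSafeInteger(asNumber);

// Converting the rounded Number back to BigInt does not recover the digits.
const roundTrips = BigInt(asNumber) === asBigInt;
const difference = BigInt(asNumber) - asBigInt;

console.log(isSafe, roundTrips, difference);
```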

BigInt literals cannot start with 0, to avoid confusion with legacy octal literals.

js
0755n; // SyntaxError: invalid BigInt syntax

For octal BigInt numbers, always use zero followed by the letter "o" (uppercase or lowercase):

js
0o755n;

For more information about BigInt, see also JavaScript data structures.

Numeric separators

To improve the readability of numeric literals, underscores (_, U+005F) can be used as separators:

js
1_000_000_000_000
1_050.95
0b1010_0001_1000_0101
0o2_2_5_6
0xA0_B0_C0
1_000_000_000_000_000_000_000n

Note these limitations:

js
// More than one underscore in a row is not allowed
100__000; // SyntaxError

// Not allowed at the end of numeric literals
100_; // SyntaxError

// Can not be used after leading 0
0_1; // SyntaxError
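Separators are purely a feature of literals in source text. The following sketch, using illustrative names, shows that they do not change the value and are not understood by the string-to-number conversions:

```javascript
// The separator does not change the value of the literal.
const literal = 1_000_000;
const sameValue = literal === 1000000;

// Separators are not part of the printed form.
const printed = String(literal);

// String parsing does not accept separators.
const fromNumber = Number("1_000"); // NaN
const fromParseInt = parseInt("1_000", 10); // stops at the underscore

console.log(sameValue, printed, fromNumber, fromParseInt);
```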

String literals

A string literal is zero or more Unicode code points enclosed in single or double quotes. Unicode code points may also be represented by an escape sequence. All code points may appear literally in a string literal except for these code points:

  • U+005C \ (backslash)
  • U+000D <CR>
  • U+000A <LF>
  • The same kind of quote that begins the string literal

Any code point may appear in the form of an escape sequence. String literals evaluate to ECMAScript String values. When generating these String values, Unicode code points are UTF-16 encoded.

js
'foo'
"bar"

The following subsections describe various escape sequences (\ followed by one or more characters) available in string literals. Any escape sequence not listed below is an "identity escape" that stands for the code point itself. For example, \z is the same as z. There's a deprecated octal escape sequence syntax described in the Deprecated and obsolete features page. Many of these escape sequences are also valid in regular expressions; see Character escape.

Escape sequences

Special characters can be encoded using escape sequences:

| Escape sequence | Unicode code point |
| --- | --- |
| \0 | null character (U+0000 NULL) |
| \' | single quote (U+0027 APOSTROPHE) |
| \" | double quote (U+0022 QUOTATION MARK) |
| \\ | backslash (U+005C REVERSE SOLIDUS) |
| \n | newline (U+000A LINE FEED; LF) |
| \r | carriage return (U+000D CARRIAGE RETURN; CR) |
| \v | vertical tab (U+000B LINE TABULATION) |
| \t | tab (U+0009 CHARACTER TABULATION) |
| \b | backspace (U+0008 BACKSPACE) |
| \f | form feed (U+000C FORM FEED) |
| \ followed by a line terminator | empty string |

The last escape sequence, \ followed by a line terminator, is useful for splitting a string literal across multiple lines without changing its meaning.

js
const longString =
  "This is a very long string which needs \
to wrap across multiple lines because \
otherwise my code is unreadable.";

Make sure there is no space or any other character after the backslash (except for a line break), otherwise it will not work. If the next line is indented, the extra spaces will also be present in the string's value.
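The effect of indentation on a continued line can be seen in a short sketch. The two strings below are invented for this example and differ only in whether the continuation line is indented.

```javascript
// Continuation line starts at column 0: nothing extra ends up in the string.
const flush =
  "first \
second";

// Continuation line is indented: the indentation becomes part of the string.
const indented =
  "first \
  second";

console.log(JSON.stringify(flush));
console.log(JSON.stringify(indented));
console.log(indented.length - flush.length); // number of indentation characters kept
```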

You can also use the + operator to concatenate multiple strings, like this:

js
const longString =
  "This is a very long string which needs " +
  "to wrap across multiple lines because " +
  "otherwise my code is unreadable.";

Both of the above methods result in identical strings.

Hexadecimal escape sequences

Hexadecimal escape sequences consist of \x followed by exactly two hexadecimal digits representing a code unit or code point in the range 0x0000 to 0x00FF.

js
"\xA9"; // "©"

Unicode escape sequences

A Unicode escape sequence consists of exactly four hexadecimal digits following \u. It represents a code unit in the UTF-16 encoding. For code points U+0000 to U+FFFF, the code unit is equal to the code point. Code points U+10000 to U+10FFFF require two escape sequences representing the two code units (a surrogate pair) used to encode the character; the surrogate pair is distinct from the code point.

See also String.fromCharCode() and String.prototype.charCodeAt().

js
"\u00A9"; // "©" (U+A9)

Unicode code point escapes

A Unicode code point escape consists of \u{, followed by a code point in hexadecimal, followed by }. The value of the hexadecimal digits must be in the range 0 to 0x10FFFF inclusive. Code points in the range U+10000 to U+10FFFF do not need to be represented as a surrogate pair.

See also String.fromCodePoint() and String.prototype.codePointAt().

js
"\u{2F804}"; // CJK COMPATIBILITY IDEOGRAPH-2F804 (U+2F804)

// the same character represented as a surrogate pair
"\uD87E\uDC04";
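The relationship between the three escape forms can be checked directly. This sketch uses the same character as the example above; the variable names are illustrative.

```javascript
// One code point, written two ways.
const viaCodePoint = "\u{2F804}";
const viaSurrogates = "\uD87E\uDC04";
const same = viaCodePoint === viaSurrogates;

// The string holds two UTF-16 code units but a single code point.
const units = viaCodePoint.length;
const codePoint = viaCodePoint.codePointAt(0).toString(16);
const firstUnit = viaCodePoint.charCodeAt(0).toString(16);

// For code points up to U+00FF, all three escape forms agree.
const hexEscapeAgrees = "\xA9" === "\u00A9" && "\u00A9" === "\u{A9}";

console.log(same, units, codePoint, firstUnit, hexEscapeAgrees);
```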

Regular expression literals

Regular expression literals are enclosed by two forward slashes (/). The lexer consumes all characters up to the next unescaped forward slash or the end of the line, unless the forward slash appears within a character class ([]). Some characters (namely, those that are identifier parts) can appear after the closing slash, denoting flags.

The lexical grammar is very lenient: not all regular expression literals that get identified as one token are valid regular expressions.

See also RegExp for more information.

js
/ab+c/g
/[/]/

A regular expression literal cannot start with two forward slashes (//), because that would be a line comment. To specify an empty regular expression, use /(?:)/.
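As a quick check of the points above, the following sketch compares the /(?:)/ literal with an empty pattern passed to the RegExp constructor, and shows an unescaped slash inside a character class. Variable names are illustrative.

```javascript
// The empty regular expression, written as a literal.
const empty = /(?:)/;
const matchesEmptyString = empty.test("");
const matchesAnything = empty.test("abc");

// An empty pattern given to the constructor reports the same source text.
const constructedSource = new RegExp("").source;

// Inside a character class, a forward slash does not end the literal.
const slashInClass = /[/]/.test("a/b");

console.log(matchesEmptyString, matchesAnything, constructedSource, slashInClass);
```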

Template literals

One template literal consists of several tokens: `xxx${ (template head), }xxx${ (template middle), and }xxx` (template tail) are individual tokens, while any expression may come between them.

See also template literals for more information.

js
`string text`

`string text line 1
 string text line 2`

`string text ${expression} string text`

tag`string text ${expression} string text`
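For the tagged form in the last example, a small sketch shows what the tag function receives. The function tag and the variable expression are placeholders, as in the example above; the returned object shape is made up for illustration.

```javascript
// A placeholder tag function that exposes what it was called with.
function tag(strings, ...values) {
  return {
    cooked: [...strings], // escape sequences interpreted
    raw: [...strings.raw], // text exactly as written in the source
    values, // results of the ${} expressions
  };
}

const expression = 42;
const parts = tag`string text ${expression} string\ttext`;

console.log(parts.cooked, parts.raw, parts.values);
```

The text pieces correspond to the template head and tail tokens, and the cooked and raw forms differ only where the source contains an escape sequence.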

Automatic semicolon insertion

Some JavaScript statements' syntax definitions require semicolons (;) at the end. They include:

  • var, let, const
  • Expression statements
  • do...while
  • continue, break, return, throw
  • debugger
  • Class field declarations (public or private)
  • import, export

However, to make the language more approachable and convenient, JavaScript is able to automatically insert semicolons when consuming the token stream, so that some invalid token sequences can be "fixed" into valid syntax. This step happens after the program text has been parsed into tokens according to the lexical grammar. There are three cases when semicolons are automatically inserted:

1. When a token not allowed by the grammar is encountered, and it's separated from the previous token by at least one line terminator (including a block comment that contains at least one line terminator), or the token is "}", a semicolon is inserted before the token.

js
{ 1
2 } 3

// is transformed by ASI into:

{ 1
;2 ;} 3;

// Which is valid grammar encoding three statements,
// each consisting of a number literal

The ending ")" of do...while is taken care of as a special case by this rule as well.

js
do {
  // ...
} while (condition) /* ; */ // ASI here
const a = 1

However, semicolons are not inserted if the semicolon would then become the separator in the for statement's head.

js
for (
  let a = 1 // No ASI here
  a < 10 // No ASI here
  a++
) {}

Semicolons are also never inserted as empty statements. For example, in the code below, if a semicolon were inserted after ")", the code would be valid, with an empty statement as the if body and the const declaration as a separate statement. However, because automatically inserted semicolons cannot become empty statements, the declaration becomes the body of the if statement, which is not valid.

js
if (Math.random() > 0.5)
const x = 1 // SyntaxError: Unexpected token 'const'
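
2. When the end of the input stream of tokens is reached and the parser is unable to parse the single input stream as a complete program, a semicolon is inserted at the end.

js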
const a = 1 /* ; */ // ASI here

This rule complements the previous one, specifically for the case where there is no "offending token" but the input stream ends.

3. When the grammar forbids line terminators in some place but a line terminator is found, a semicolon is inserted. These places include:

  • expr <here> ++, expr <here> --
  • continue <here> lbl
  • break <here> lbl
  • return <here> expr
  • throw <here> expr
  • yield <here> expr
  • yield <here> * expr
  • (param) <here> => {}
  • async <here> function, async <here> prop(), async <here> function*, async <here> *prop(), async <here> (param) <here> => {}

Here, ++ is not treated as a postfix operator applying to variable b, because a line terminator occurs between b and ++.

js
a = b
++c

// is transformed by ASI into

a = b;
++c;

Here, the return statement returns undefined, and a + b becomes an unreachable statement.

js
return
a + b

// is transformed by ASI into

return;
a + b;
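The same transformation can be observed at runtime. In this sketch, brokenSum and fixedSum are hypothetical functions that differ only in where the line break falls relative to return.

```javascript
// A semicolon is inserted right after `return`, so `a + b` is never evaluated.
function brokenSum(a, b) {
  return
  a + b;
}

// Opening a parenthesis on the `return` line keeps the expression attached.
function fixedSum(a, b) {
  return (
    a + b
  );
}

console.log(brokenSum(1, 2), fixedSum(1, 2));
```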

Note that ASI is only triggered if a line break separates tokens that would otherwise produce invalid syntax. If the next token can be parsed as part of a valid structure, no semicolon is inserted. For example:

js
const a = 1
(1).toString()

const b = 1
[1, 2, 3].forEach(console.log)

Because () can be seen as a function call, it usually does not trigger ASI. Similarly, [] may be a member access. The code above is equivalent to:

js
const a = 1(1).toString();

const b = 1[1, 2, 3].forEach(console.log);

This happens to be valid syntax. 1[1, 2, 3] is a property accessor with a comma-joined expression. Therefore, you would get errors like "1 is not a function" and "Cannot read properties of undefined (reading 'forEach')" when running the code.
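To confirm that these are runtime errors rather than syntax errors, the snippets can be evaluated as text. The helper runAndCapture is hypothetical and relies on the Function constructor being available in the host.

```javascript
// Hypothetical helper: runs a function body and returns the error it throws, if any.
function runAndCapture(source) {
  try {
    new Function(source)();
    return null;
  } catch (error) {
    return error;
  }
}

// Parsed as `1(1).toString()`: a call on the number 1.
const callError = runAndCapture("const a = 1\n(1).toString()");

// Parsed as `1[1, 2, 3].forEach(...)`: a property access on the number 1.
const memberError = runAndCapture("const b = 1\n[1, 2, 3].forEach(console.log)");

// With explicit semicolons, the first snippet runs without error.
const withSemicolons = runAndCapture("const a = 1;\n(1).toString();");

console.log(callError?.name, memberError?.name, withSemicolons);
```

Both failing snippets parse successfully and only fail when executed, which is why this kind of mistake is not caught by a syntax check.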

Within classes, class fields and generator methods can be a pitfall as well.

js
class A {
  a = 1
  *gen() {}
}

It is seen as:

js
class A {
  a = 1 * gen() {}
}

It is therefore a syntax error around {.

If you want to enforce a semicolon-less style, the following rules of thumb help in dealing with ASI:

  • Write postfix ++ and -- on the same line as their operands.
    js
    const a = b
    ++
    console.log(a) // ReferenceError: Invalid left-hand side expression in prefix operation
    
    js
    const a = b++
    console.log(a)
    
  • The expressions after return, throw, or yield should be on the same line as the keyword.
    js
    function foo() {
      return
        1 + 1 // Returns undefined; 1 + 1 is ignored
    }
    
    js
    function foo() {
      return 1 + 1
    }
    
    function foo() {
      return (
        1 + 1
      )
    }
    
  • Similarly, the label identifier after break or continue should be on the same line as the keyword.
    js
    outerBlock: {
      innerBlock: {
        break
          outerBlock // SyntaxError: Illegal break statement
      }
    }
    
    js
    outerBlock: {
      innerBlock: {
        break outerBlock
      }
    }
    
  • The => of an arrow function should be on the same line as the end of its parameters.
    js
    const foo = (a, b)
      => a + b
    
    js
    const foo = (a, b) =>
      a + b
    
  • The async of async functions, methods, etc. cannot be directly followed by a line terminator.
    js
    async
    function foo() {}
    
    js
    async function
    foo() {}
    
  • If a line starts with one of (, [, `, +, -, / (as in regex literals), prefix it with a semicolon, or end the previous line with a semicolon.
    js
    // The () may be merged with the previous line as a function call
    (() => {
      // ...
    })()
    
    // The [ may be merged with the previous line as a property access
    [1, 2, 3].forEach(console.log)
    
    // The ` may be merged with the previous line as a tagged template literal
    `string text ${data}`.match(pattern).forEach(console.log)
    
    // The + may be merged with the previous line as a binary + expression
    +a.toString()
    
    // The - may be merged with the previous line as a binary - expression
    -a.toString()
    
    // The / may be merged with the previous line as a division expression
    /pattern/.exec(str).forEach(console.log)
    
    js
    ;(() => {
      // ...
    })()
    ;[1, 2, 3].forEach(console.log)
    ;`string text ${data}`.match(pattern).forEach(console.log)
    ;+a.toString()
    ;-a.toString()
    ;/pattern/.exec(str).forEach(console.log)
    
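  • Class fields should preferably always end with semicolons: in addition to the previous rule (which covers a field declaration followed by a computed property, since the latter starts with [), semicolons are also required between a field declaration and a generator method.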
    js
    class A {
      a = 1
      [b] = 2
      *gen() {} // Seen as a = 1[b] = 2 * gen() {}
    }
    
    js
    class A {
      a = 1;
      [b] = 2;
      *gen() {}
    }
    

Specifications

ECMAScript Language Specification
