Literal character: a, b
A literal character specifies exactly itself to be matched in the input text.
Syntax
Parameters
Description
In regular expressions, most characters can appear literally. They are usually the most basic building blocks of patterns. For example, here is a pattern from the Removing HTML tags example:
const pattern = /<.+?>/g;
In this example, ., +, and ? are called syntax characters. They have special meanings in regular expressions. The remaining characters in the pattern (< and >) are literal characters. They match themselves in the input text: the left and right angle brackets.
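As a minimal sketch of how the literal and syntax characters cooperate, the snippet below applies the same pattern to a sample string. The `html` input and the `text` variable are illustrative assumptions, not part of the original example.

```javascript
// `<` and `>` are literal characters; `.`, `+`, and `?` are syntax characters.
const pattern = /<.+?>/g;

// Hypothetical sample input, chosen only for illustration.
const html = "<p>Hello, <b>world</b>!</p>";

// Each lazily matched "<...>" run is replaced with an empty string.
const text = html.replace(pattern, "");
console.log(text); // "Hello, world!"
```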
The following characters are syntax characters in regular expressions, and they cannot appear as literal characters:
^ $ \ . * + ? ( ) [ ] { } |
Within character classes, more characters can appear literally. For more information, see the Character class page. For example, \. and [.] both match a literal . character. In v-mode character classes, however, a different set of characters is reserved as syntax characters. For completeness, below is a table of ASCII characters and whether they may appear escaped or unescaped in different contexts, where "✅" means the character represents itself, "❌" means it throws a syntax error, and "⚠️" means the character is valid but means something other than itself.
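The difference between a literal dot and the wildcard dot can be sketched as follows. The variable names and test strings are assumptions made for illustration.

```javascript
// Escaped outside a character class: matches only a literal ".".
const escapedDot = /^a\.b$/;
// Unescaped inside a character class: also matches only a literal ".".
const classDot = /^a[.]b$/;
// Unescaped outside a character class: "." is a syntax character (wildcard).
const wildcardDot = /^a.b$/;

console.log(escapedDot.test("a.b")); // true
console.log(escapedDot.test("axb")); // false
console.log(classDot.test("a.b")); // true
console.log(classDot.test("axb")); // false
console.log(wildcardDot.test("axb")); // true
```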
| Characters | Outside character classes in u or v mode: unescaped | Outside character classes in u or v mode: escaped | In u-mode character classes: unescaped | In u-mode character classes: escaped | In v-mode character classes: unescaped | In v-mode character classes: escaped |
|---|---|---|---|---|---|---|
| 123456789 "' | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ |
| !#%&,:;<=>@`~ | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ |
| ] | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ |
| ()[{} | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ |
| *+? | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ |
| / | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| 0DSWbdfnrstvw | ✅ | ⚠️ | ✅ | ⚠️ | ✅ | ⚠️ |
| B | ✅ | ⚠️ | ✅ | ❌ | ✅ | ❌ |
| $. | ⚠️ | ✅ | ✅ | ✅ | ✅ | ✅ |
| &#124; (vertical bar) | ⚠️ | ✅ | ✅ | ✅ | ❌ | ✅ |
| - | ✅ | ❌ | ✅⚠️ | ✅ | ❌⚠️ | ✅ |
| ^ | ⚠️ | ✅ | ✅⚠️ | ✅ | ✅⚠️ | ✅ |
| \ | ❌⚠️ | ✅ | ❌⚠️ | ✅ | ❌⚠️ | ✅ |
Note: The characters that can be both escaped and unescaped in v-mode character classes are exactly those forbidden as "double punctuators". See v-mode character classes for more information.
Whenever you want to match a syntax character literally, you need to escape it with a backslash (\). For example, to match a literal * in a pattern, you need to write \* in the pattern. Using syntax characters as literal characters either leads to unexpected results or causes syntax errors. For example, /*/ is not a valid regular expression because the quantifier is not preceded by a pattern. In Unicode-unaware mode, ], {, and } may appear literally if they cannot be parsed as the end of a character class or as quantifier delimiters. This is a deprecated syntax for web compatibility, and you should not rely on it.
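The behaviors described above can be sketched as follows. The RegExp() constructor is used for the invalid patterns so that the errors surface at runtime and can be caught; the variable names are illustrative assumptions.

```javascript
// An escaped "*" matches a literal asterisk.
const star = /\*/;
console.log(star.test("2 * 3")); // true

// An unescaped "*" with nothing before it is a syntax error.
let quantifierError = null;
try {
  new RegExp("*");
} catch (e) {
  quantifierError = e;
}
console.log(quantifierError instanceof SyntaxError); // true

// In Unicode-unaware mode, a lone "]" is tolerated as a literal (deprecated).
const legacyBracket = new RegExp("]");
console.log(legacyBracket.test("a]b")); // true

// With the u flag, the same pattern is rejected.
let bracketError = null;
try {
  new RegExp("]", "u");
} catch (e) {
  bracketError = e;
}
console.log(bracketError instanceof SyntaxError); // true
```

Writing `\]` works in both modes, which is why escaping is the safer habit.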
Regular expression literals cannot be specified with certain non-syntax literal characters. / cannot appear as a literal character in a regular expression literal, because / is used as the delimiter of the literal itself. You need to escape it as \/ if you want to match a literal /. Line terminators cannot appear as literal characters in a regular expression literal either, because a literal cannot span multiple lines. You need to use a character escape like \n instead. There are no such restrictions when using the RegExp() constructor, although string literals have their own escaping rules (for example, "\\" actually denotes a single backslash character, so new RegExp("\\*") and /\*/ are equivalent).
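A short sketch comparing the literal and constructor forms may help here. The variable names and sample strings are assumptions made for illustration.

```javascript
// The string "\\*" contains the two characters \ and *, so both forms agree.
const fromLiteral = /\*/;
const fromConstructor = new RegExp("\\*");
console.log(fromLiteral.source === fromConstructor.source); // true

// "/" must be escaped in a literal but not in a constructor string.
const slashLiteral = /a\/b/;
const slashConstructor = new RegExp("a/b");
console.log(slashLiteral.test("a/b")); // true
console.log(slashConstructor.test("a/b")); // true

// A raw line feed may appear in the constructor's pattern string;
// here the string escape "\n" produces it before the pattern is parsed.
const newlineConstructor = new RegExp("a\nb");
console.log(newlineConstructor.test("a\nb")); // true
```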
In Unicode-unaware mode, the pattern is interpreted as a sequence of UTF-16 code units. This means a surrogate pair actually represents two literal characters. This causes unexpected behavior when combined with other features:
/^[😄]$/.test("😄"); // false, because the pattern is interpreted as /^[\ud83d\ude04]$/
/^😄+$/.test("😄😄"); // false, because the pattern is interpreted as /^\ud83d\ude04+$/
In Unicode-aware mode, the pattern is interpreted as a sequence of Unicode code points, and surrogate pairs do not get split. Therefore, you should always prefer to use the u flag.
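For comparison, here is a sketch of the same two patterns with the u flag added. The variable names are illustrative assumptions.

```javascript
// With the u flag, the emoji is treated as one code point, not two code units.
const classMatch = /^[😄]$/u.test("😄");
const repeatMatch = /^😄+$/u.test("😄😄");
console.log(classMatch); // true
console.log(repeatMatch); // true

// The underlying string still consists of two UTF-16 code units,
// but iterating it yields a single code point.
const codeUnits = "😄".length;
const codePoints = [..."😄"].length;
console.log(codeUnits); // 2
console.log(codePoints); // 1
```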
Examples
Using literal characters
The following example is copied from Character escape. The a and b characters are literal characters in the pattern, and \n is an escaped character because a line feed cannot appear literally in a regular expression literal.
const pattern = /a\nb/;
const string = `a
b`;
console.log(pattern.test(string)); // true
Specifications
| Specification | 
|---|
| ECMAScript Language Specification  # prod-PatternCharacter  | 
Browser compatibility
See also