Literal character: a, b

A literal character specifies exactly itself to be matched in the input text.

Syntax

regex

Description

In regular expressions, most characters can appear literally. They are usually the most basic building blocks of patterns. For example, here is a pattern from the Removing HTML tags example:

const pattern = /<.+?>/g;

In this example, ., +, and ? are called syntax characters. They have special meanings in regular expressions. The remaining characters in the pattern (< and >) are literal characters. They match themselves in the input text: the left and right angle brackets.
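As a minimal sketch of the pattern above in use, the snippet below strips tags from a short string. The sample input `html` is made up for illustration; only the pattern itself comes from the text.

```javascript
// The literal characters < and > match themselves;
// ., +, and ? keep their special meanings (any character, one or more, lazily).
const pattern = /<.+?>/g;

// Hypothetical sample input, chosen only for illustration.
const html = "<p>Hello, <b>world</b>!</p>";

const text = html.replace(pattern, "");
console.log(text); // "Hello, world!"
```

Each match starts at a literal < and ends at the nearest literal >, so every tag is removed and the text between tags is left alone.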

The following characters are syntax characters in regular expressions, and they cannot appear as literal characters: ^ $ \ . * + ? ( ) [ ] { } |

Within character classes, more characters can appear literally. For more information, see the Character class page. For example, \. and [.] both match a literal dot. In v-mode character classes, however, a different set of characters is reserved as syntax characters. For completeness, below is a table of ASCII characters and whether they may appear escaped or unescaped in different contexts, where "✅" means the character represents itself, "❌" means it throws a syntax error, and "⚠️" means the character is valid but means something other than itself.

Characters	Outside character classes in `u` or `v` mode		In `u`-mode character classes		In `v`-mode character classes
Characters	Unescaped	Escaped	Unescaped	Escaped	Unescaped	Escaped
`123456789 "' ACEFGHIJKLMN OPQRTUVXYZ_ aceghijklmop quxyz`	✅	❌	✅	❌	✅	❌
!#%&,:;<=>@`~	✅	❌	✅	❌	✅	✅
`]`	❌	✅	❌	✅	❌	✅
`()[{}`	❌	✅	✅	✅	❌	✅
`*+?`	❌	✅	✅	✅	✅	✅
`/`	✅	✅	✅	✅	❌	✅
`0DSWbdfnrstvw`	✅	⚠️	✅	⚠️	✅	⚠️
`B`	✅	⚠️	✅	❌	✅	❌
`$.`	⚠️	✅	✅	✅	✅	✅
`|`	⚠️	✅	✅	✅	❌	✅
`-`	✅	❌	✅⚠️	✅	❌⚠️	✅
`^`	⚠️	✅	✅⚠️	✅	✅⚠️	✅
`\`	❌⚠️	✅	❌⚠️	✅	❌⚠️	✅

Note: The characters that can be both escaped and unescaped in v-mode character classes are exactly those forbidden as "double punctuators". See v-mode character classes for more information.
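A few rows of the table can be checked directly. The sketch below sticks to u mode so it runs on engines without the v flag; the helper name `isValidPattern` is hypothetical and exists only for this illustration.

```javascript
// Outside a character class, "." is a syntax character and must be escaped.
const escapedDot = /^\.$/u;
// Inside a character class, "." stands for itself.
const classDot = /^[.]$/u;

console.log(escapedDot.test("."), classDot.test(".")); // true true
console.log(escapedDot.test("a"), classDot.test("a")); // false false

// Quantifier characters are also literal inside a u-mode character class.
const quantifierChars = /^[*+?]+$/u;
console.log(quantifierChars.test("?+*")); // true

// Hypothetical helper: reports whether the RegExp constructor accepts a pattern.
function isValidPattern(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (error) {
    return false;
  }
}

// In u mode, escaping a character such as "!" is a syntax error;
// without the u flag the same escape is tolerated.
console.log(isValidPattern("\\!", "u")); // false
console.log(isValidPattern("\\!", "")); // true
```

Going through the constructor instead of a literal lets the invalid pattern be caught at run time rather than failing when the script is parsed.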

Whenever you want to match a syntax character literally, you need to escape it with a backslash (\). For example, to match a literal * in a pattern, you need to write \* in the pattern. Using syntax characters as literal characters either leads to unexpected results or causes syntax errors. For example, /*/ is not a valid regular expression because the quantifier is not preceded by a pattern. In Unicode-unaware mode, ], {, and } may appear literally if they cannot be parsed as the end of a character class or as quantifier delimiters. This is a deprecated syntax kept for web compatibility, and you should not rely on it.
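One common way to apply this rule is to escape every syntax character in arbitrary text before building a pattern from it. The helper below is a sketch, not a standard API: the name `escapeSyntaxCharacters` and the sample `userInput` are assumptions made for illustration.

```javascript
// Hypothetical helper: prefixes each syntax character with a backslash
// so the resulting pattern matches the text literally.
function escapeSyntaxCharacters(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical sample input containing the syntax characters + and ?.
const userInput = "1+1=2?";
const escaped = escapeSyntaxCharacters(userInput);
console.log(escaped); // 1\+1=2\?

// With escaping, the pattern matches the input literally.
console.log(new RegExp(escaped).test("1+1=2?")); // true
// Without escaping, + and ? act as quantifiers and the match fails.
console.log(new RegExp(userInput).test("1+1=2?")); // false

// A lone quantifier has nothing to repeat, so the pattern is rejected.
let loneQuantifierError = null;
try {
  new RegExp("*");
} catch (error) {
  loneQuantifierError = error;
}
console.log(loneQuantifierError instanceof SyntaxError); // true
```

The lone-quantifier case goes through the constructor because `/*/` cannot even be written as a literal: `/*` would begin a block comment.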

Certain non-syntax characters cannot appear literally in a regular expression literal. / cannot appear as a literal character in a regular expression literal, because / is used as the delimiter of the literal itself. You need to escape it as \/ if you want to match a literal /. Line terminators cannot appear as literal characters in a regular expression literal either, because a literal cannot span multiple lines. You need to use a character escape like \n instead. There are no such restrictions when using the RegExp() constructor, although string literals have their own escaping rules (for example, "\\" actually denotes a single backslash character, so new RegExp("\\*") and /\*/ are equivalent).
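The difference between the two forms can be sketched as follows. The variable names are illustrative only.

```javascript
// In a literal, / must be escaped because it delimits the literal.
const slashLiteral = /\//;
// In the constructor, the string delimiters are quotes, so / needs no escape.
const slashConstructed = new RegExp("/");
console.log(slashLiteral.test("a/b"), slashConstructed.test("a/b")); // true true

// "\\*" is the two-character string \* once the string literal is parsed,
// so both forms produce the same pattern source.
const starLiteral = /\*/;
const starConstructed = new RegExp("\\*");
console.log(starLiteral.source === starConstructed.source); // true

// Here "\n" is a string escape, so the constructor receives a real line feed,
// which is allowed as a literal character in the pattern.
const newlineConstructed = new RegExp("a\nb");
console.log(newlineConstructed.test("a\nb")); // true
```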

In Unicode-unaware mode, the pattern is interpreted as a sequence of UTF-16 code units. This means a surrogate pair actually represents two literal characters. This causes unexpected behavior when combined with other features:

/^[😄]$/.test("😄"); // false, because the pattern is interpreted as /^[\ud83d\ude04]$/
/^😄+$/.test("😄😄"); // false, because the pattern is interpreted as /^\ud83d\ude04+$/

In Unicode-aware mode, the pattern is interpreted as a sequence of Unicode code points, and surrogate pairs do not get split. Therefore, you should always prefer to use the u flag.
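For comparison, here is a short sketch of the same two patterns with the u flag added, along with the code-unit and code-point counts that explain the difference.

```javascript
const emoji = "😄";

// One code point, stored as two UTF-16 code units (a surrogate pair).
console.log(emoji.length); // 2
console.log([...emoji].length); // 1

// With the u flag, the emoji is a single literal character in the pattern.
console.log(/^[😄]$/u.test(emoji)); // true
console.log(/^😄+$/u.test("😄😄")); // true

// The same applies to the wildcard: it matches one code unit without u,
// and one code point with u.
console.log(/^.$/.test(emoji)); // false
console.log(/^.$/u.test(emoji)); // true
```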

Examples

Using literal characters

The following example is copied from Character escape. The a and b characters are literal characters in the pattern, and \n is an escaped character because a line feed cannot appear literally in a regular expression literal.

const pattern = /a\nb/;
const string = `a
b`;
console.log(pattern.test(string)); // true

Specifications

Specification
ECMAScript Language Specification # prod-PatternCharacter

Browser compatibility

See also