Backreference: \1, \2

A backreference refers to the submatch of a previous capturing group and matches the same text as that group. For named capturing groups, you may prefer to use the named backreference syntax.
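As a rough illustration of the two forms mentioned above, the following sketch matches a word wrapped in matching quotes, once with a numbered backreference and once with a named one. The group name quote and the sample strings are chosen only for this example.

```javascript
// Numbered backreference: \1 repeats whatever the first group matched.
const numbered = /(["'])hello\1/;

// Named backreference: \k<quote> repeats the group named "quote".
// (The name "quote" is just an illustrative choice.)
const named = /(?<quote>["'])hello\k<quote>/;

console.log(numbered.test('"hello"')); // true
console.log(named.test("'hello'")); // true
console.log(named.test("\"hello'")); // false: opening and closing quotes differ
```

Both patterns accept the same strings; the named form mainly helps readability when a pattern has several groups.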

Syntax

regex
\N

Note: N is a placeholder for a number, not a literal character.

Parameters

N

A positive integer referring to the number of a capturing group.

Description

A backreference is a way to match the same text as previously matched by a capturing group. Capturing groups count from 1, so the first capturing group's result can be referenced with \1, the second with \2, and so on. \0 is not a backreference; it is a character escape for the NUL character.
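The numbering described above can be sketched as follows. The pattern and sample strings are made up for illustration; they are not taken from any particular application.

```javascript
// \1 and \2 repeat the text of the first and second groups respectively.
const repeatedPair = /(\d{2})-(\d{2}) \1-\2/;

console.log(repeatedPair.test("12-31 12-31")); // true
console.log(repeatedPair.test("12-31 31-12")); // false: the order is swapped

// \0 is a character escape for NUL (U+0000), not a reference to a group.
console.log(/\0/.test("\u0000")); // true
console.log(/(a)\0/.test("a")); // false: there is no NUL character after "a"
```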

In case-insensitive matching, the backreference may match text with different casing from the original text.

js
/(b)\1/i.test("bB"); // true

The backreference must refer to an existent capturing group. If the number it specifies is greater than the total number of capturing groups, a syntax error is thrown.

js
/(a)\2/u; // SyntaxError: Invalid regular expression: Invalid escape

In Unicode-unaware mode, invalid backreferences become a legacy octal escape sequence. This is a deprecated syntax for web compatibility, and you should not rely on it.

js
/(a)\2/.test("a\x02"); // true

If the referenced capturing group is unmatched (for example, because it belongs to an unmatched alternative in a disjunction), or the group hasn't matched yet (for example, because it lies to the right of the backreference), the backreference always succeeds (as if it matches the empty string).

js
/(?:a|(b))\1c/.test("ac"); // true
/\1(a)/.test("a"); // true
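To make the "matches the empty string" behavior easier to see, the sketch below inspects the match results of the same two patterns with exec(). The variable names are illustrative.

```javascript
// Group 1 sits in the alternative that did not match, so it stays undefined
// and \1 consumes nothing.
const unmatched = /(?:a|(b))\1c/.exec("ac");
console.log(unmatched[0]); // "ac"
console.log(unmatched[1]); // undefined

// \1 appears before group 1 has captured anything, so it also consumes nothing.
const forward = /\1(a)/.exec("a");
console.log(forward[0]); // "a"
console.log(forward[1]); // "a"
```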

Examples

Pairing quotes

The following function matches the title='xxx' and title="xxx" patterns in a string. To ensure the quotes match, we use a backreference to refer to the first quote. Accessing the second capturing group ([2]) returns the string between the matching quote characters:

js
function parseTitle(metastring) {
  return metastring.match(/title=(["'])(.*?)\1/)[2];
}

parseTitle('title="foo"'); // 'foo'
parseTitle("title='foo' lang='en'"); // 'foo'
parseTitle('title="Named capturing groups\' advantages"'); // "Named capturing groups' advantages"
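Note that match() returns null when nothing matches, so parseTitle() as written throws a TypeError on input without a title. One possible null-safe variant is sketched below; the name parseTitleSafe is hypothetical and not part of the original example.

```javascript
// Hypothetical variant of parseTitle() that returns undefined instead of
// throwing when the pattern does not match.
function parseTitleSafe(metastring) {
  const match = metastring.match(/title=(["'])(.*?)\1/);
  return match ? match[2] : undefined;
}

console.log(parseTitleSafe('title="foo"')); // 'foo'
console.log(parseTitleSafe("lang='en'")); // undefined: no title attribute
console.log(parseTitleSafe("title='foo\"")); // undefined: the quotes do not pair up
```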

Matching duplicate words

The following function finds duplicate words in a string (which are usually typos). Note that it uses the \w character class escape, which matches only ASCII letters, digits, and underscores, not accented letters or characters from other alphabets. If you want more generic matching, you may want to split the string by whitespace and iterate over the resulting array.

js
function findDuplicates(text) {
  return text.match(/\b(\w+)\s+\1\b/i)?.[1];
}

findDuplicates("foo foo bar"); // 'foo'
findDuplicates("foo bar foo"); // undefined
findDuplicates("Hello hello"); // 'Hello'
findDuplicates("Hello hellos"); // undefined
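The split-and-iterate approach suggested above could look roughly like the following. This is only a sketch: the name findDuplicateWords is hypothetical, and comparing lowercased words is an assumption chosen to mirror the i flag of the regex version. Punctuation stays attached to words here, so "foo foo." would not be reported.

```javascript
// Hypothetical helper: returns the first word that is immediately repeated,
// comparing case-insensitively, or undefined if there is none.
function findDuplicateWords(text) {
  // Split on runs of whitespace and drop empty strings from leading/trailing spaces.
  const words = text.split(/\s+/).filter(Boolean);
  for (let i = 1; i < words.length; i++) {
    if (words[i].toLowerCase() === words[i - 1].toLowerCase()) {
      return words[i - 1];
    }
  }
  return undefined;
}

console.log(findDuplicateWords("café café au lait")); // 'café'
console.log(findDuplicateWords("foo bar foo")); // undefined
console.log(findDuplicateWords("Hello hello")); // 'Hello'
```

Because this version does not depend on \w, it also handles accented letters and non-Latin scripts, at the cost of treating any whitespace-separated token as a word.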

Specifications

Specification
ECMAScript Language Specification
# prod-DecimalEscape

Browser compatibility

See also