Capturing group: (...)

A capturing group groups a subpattern, allowing you to apply a quantifier to the entire group or use disjunctions within it. It memorizes information about the subpattern match, so that you can refer back to it later with a backreference, or access the information through the match results.

If you don't need the result of the subpattern match, use a non-capturing group instead, which improves performance and avoids refactoring hazards.
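As a minimal sketch of the difference (the variable names here are illustrative, not part of the original page), the same repetition can be written with a capturing group or with a non-capturing group, `(?:...)`. Only the capturing version adds an entry to the result array:

```js
// Capturing: the group's last match is stored at index 1.
const withCapture = /(ab)+c/.exec("ababc"); // ['ababc', 'ab']

// Non-capturing: the group only provides grouping, so nothing is stored.
const withoutCapture = /(?:ab)+c/.exec("ababc"); // ['ababc']
```

Because a non-capturing group takes no number, adding or removing one later does not shift the numbers of the other groups in the pattern.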

Syntax

regex
(pattern)

Parameters

pattern

A pattern consisting of anything you may use in a regex literal, including a disjunction.

Description

A capturing group acts like the grouping operator in JavaScript expressions, allowing you to use a subpattern as a single atom.

Capturing groups are numbered by the order of their opening parentheses. The first capturing group is numbered 1, the second 2, and so on. Named capturing groups are also capturing groups and are numbered together with other (unnamed) capturing groups. Information about a capturing group's match can be accessed through the array returned by exec(), match(), and matchAll(); through $1, $2, etc. in the replacement string (or the callback arguments) of replace() and replaceAll(); and through backreferences within the same pattern.
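The numbering and access paths described above can be sketched as follows. The pattern and variable names are chosen for illustration only:

```js
// Group 2 is named, but it is still numbered in source order with the others.
const dateRe = /(\d{4})-(?<month>\d{2})-(\d{2})/;
const dateMatch = dateRe.exec("2019-06-19");
// dateMatch[1] is "2019", dateMatch[2] is "06", dateMatch[3] is "19"
// dateMatch.groups.month is also "06"

// In a replacement string, $1, $2, ... refer to the groups by number.
const swapped = "2019-06-19".replace(dateRe, "$3/$2/$1"); // "19/06/2019"

// Inside the pattern itself, \1 is a backreference to group 1.
const hasDoubledLetter = /(\w)\1/.test("hello"); // true (matches "ll")
```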

Note: Even in exec()'s result array, capturing groups are accessed by numbers 1, 2, etc., because the 0 element is the entire match. \0 is not a backreference, but a character escape for the NUL character.
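A small sketch of the note above, using illustrative variable names: `\0` matches a literal NUL character and creates no group, while `\1` refers back to the first capturing group.

```js
// \0 is a character escape: it matches U+0000 and captures nothing.
const nulResult = /a\0b/.exec("a\0b");
// nulResult.length is 1: only the entire match at index 0.

// \1 is a backreference to capturing group 1.
const backrefResult = /(a)\1/.exec("aa"); // ['aa', 'a']
```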

Capturing groups in the regex source code correspond to their results one-to-one. If a capturing group is not matched (for example, it belongs to an unmatched alternative in a disjunction), the corresponding result is undefined.

js
/(ab)|(cd)/.exec("cd"); // ['cd', undefined, 'cd']

Capturing groups can be quantified. In this case, the match information corresponding to this group is the last match of the group.

js
/([ab])+/.exec("abc"); // ['ab', 'b']; because "b" comes after "a", this result overwrites the previous one

Capturing groups can be used in lookahead and lookbehind assertions. Because lookbehind assertions match their atoms backwards, the final match corresponding to this group is the one that appears closest to the left end of the string. However, the indices of the match groups still correspond to their relative locations in the regex source.

js
/c(?=(ab))/.exec("cab"); // ['c', 'ab']
/(?<=(a)(b))c/.exec("abc"); // ['c', 'a', 'b']
/(?<=([ab])+)c/.exec("abc"); // ['c', 'a']; because "a" is seen by the lookbehind after the lookbehind has seen "b"

Capturing groups can be nested, in which case the outer group is numbered first, then the inner group, because they are ordered by their opening parentheses. If a nested group is repeated by a quantifier, then each time the group matches, the subgroups' results are all overwritten, sometimes with undefined.

js
/((a+)?(b+)?(c))*/.exec("aacbbbcac"); // ['aacbbbcac', 'ac', 'a', undefined, 'c']

In the example above, the outer group is matched three times:

  1. Matches "aac", with subgroups "aa", undefined, and "c".
  2. Matches "bbbc", with subgroups undefined, "bbb", and "c".
  3. Matches "ac", with subgroups "a", undefined, and "c".

The "bbb" result from the second match is not preserved, because the third match overwrites it with undefined.

You can get the start and end indices of each capturing group in the input string by using the d flag. This creates an extra indices property on the array returned by exec().
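A minimal sketch of the d flag, with an illustrative input string. Each entry of `indices` is a `[start, end]` pair (end exclusive) for the group with the same number, and entry 0 covers the entire match:

```js
const indexed = /(\d{4})-(\d{2})/d.exec("on 2019-06");
// indexed.indices[0] is [3, 10]  (entire match "2019-06")
// indexed.indices[1] is [3, 7]   (group 1, "2019")
// indexed.indices[2] is [8, 10]  (group 2, "06")

// A group that did not participate has undefined in place of a pair.
const partial = /(ab)|(cd)/d.exec("cd");
// partial.indices[1] is undefined; partial.indices[2] is [0, 2]
```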

You can optionally give a capturing group a name, which helps avoid pitfalls related to group positions and indexing. See Named capturing groups for more information.
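To illustrate the indexing pitfall mentioned above, here is a rough sketch with hypothetical patterns: adding a group at the front of a pattern shifts every number after it, while the names stay stable.

```js
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const named = isoDate.exec("2019-06-19");
// named.groups.year is "2019", and so is named[1]

// A new leading group takes number 1, so the year moves to number 2.
const prefixedDate = /(on|at) (?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const prefixed = prefixedDate.exec("on 2019-06-19");
// prefixed[1] is "on", prefixed[2] is "2019"
// prefixed.groups.year is still "2019"
```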

Parentheses have other purposes in different regex syntaxes. For example, they also enclose lookahead and lookbehind assertions. Because these syntaxes all start with ?, and ? is a quantifier, which normally cannot occur directly after (, this does not lead to ambiguities.
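A short sketch of both points, with illustrative names: the parentheses of an assertion do not create a capturing group, and a bare `?` directly after `(` is rejected as a syntax error rather than treated as a quantifier.

```js
// The lookahead's parentheses capture nothing, so the result has one entry.
const lookaheadOnly = /a(?=b)/.exec("ab"); // ['a']

// "(?)" has a quantifier with nothing to quantify, so the pattern is invalid.
let groupError = null;
try {
  new RegExp("(?)");
} catch (err) {
  groupError = err;
}
// groupError is a SyntaxError
```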

Examples

Matching date

The following example matches a date in the format YYYY-MM-DD:

js
function parseDate(input) {
  const parts = /^(\d{4})-(\d{2})-(\d{2})$/.exec(input);
  if (!parts) {
    return null;
  }
  return parts.slice(1).map((p) => parseInt(p, 10));
}

parseDate("2019-01-01"); // [2019, 1, 1]
parseDate("2019-06-19"); // [2019, 6, 19]

Pairing quotes

The following function matches the title='xxx' and title="xxx" patterns in a string. To ensure the quotes match, we use a backreference to refer to the first quote. Accessing the second capturing group ([2]) returns the string between the matching quote characters:

js
function parseTitle(metastring) {
  return metastring.match(/title=(["'])(.*?)\1/)[2];
}

parseTitle('title="foo"'); // 'foo'
parseTitle("title='foo' lang='en'"); // 'foo'
parseTitle('title="Named capturing groups\' advantages"'); // "Named capturing groups' advantages"

Specifications

Specification: ECMAScript Language Specification, # prod-Atom

Browser compatibility

See also