Lookbehind assertion: (?<=...), (?<!...)
A lookbehind assertion "looks behind": it attempts to match the input preceding the current position against the given pattern, but it does not consume any of the input. If the match succeeds, the current position in the input stays the same. It matches the atoms of its pattern in reverse order.
Syntax
Parameters
Description
A regular expression generally matches from left to right. This is why lookahead and lookbehind assertions are so named: a lookahead asserts what is on the right of the current position, and a lookbehind asserts what is on the left.
For a (?<=pattern) assertion to succeed, pattern must match the input immediately to the left of the current position, and the current position is not changed before the subsequent input is matched. The (?<!pattern) form negates the assertion: it succeeds if pattern does not match the input immediately to the left of the current position.
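As a minimal sketch of the two forms described above (the sample strings and variable names are illustrative assumptions, not part of the original page), the following shows that the lookbehind text is asserted but never included in the match:

```javascript
// Positive lookbehind: match digits only when "#" is immediately to their left.
const positive = /(?<=#)\d+/.exec("issue #42");
console.log(positive[0]); // "42" (the "#" is asserted, not included)
console.log(positive.index); // 7

// Negative lookbehind: match a whole number that is not preceded by "#".
// The \b keeps the match from starting in the middle of "42".
const negative = /(?<!#)\b\d+/.exec("#42 and 17");
console.log(negative[0]); // "17"
```

Without the \b, the negative form would match the "2" of "#42", because the character to the left of "2" is "4", not "#".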
Lookbehind generally has the same semantics as lookahead. However, within a lookbehind assertion, the regular expression matches backwards. For example:
/(?<=([ab]+)([bc]+))$/.exec("abc"); // ['', 'a', 'bc']
// Not ['', 'ab', 'c']
If the lookbehind matched from left to right, it would first greedily match [ab]+, which would make the first group capture "ab" and leave the remaining "c" to be captured by [bc]+. However, because [bc]+ is matched first, it greedily grabs "bc", leaving only "a" for [ab]+.
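To make the contrast concrete, this sketch runs the same two groups once inside a lookbehind and once inside a lookahead on the same input. The lookahead variant and the variable names are added here for illustration only:

```javascript
const input = "abc";

// Lookbehind anchored at the end: atoms are matched right to left,
// so [bc]+ is greedy first and takes "bc".
const behind = /(?<=([ab]+)([bc]+))$/.exec(input);
console.log(behind.slice(1)); // ["a", "bc"]

// Lookahead anchored at the start: atoms are matched left to right,
// so [ab]+ is greedy first and takes "ab".
const ahead = /^(?=([ab]+)([bc]+))/.exec(input);
console.log(ahead.slice(1)); // ["ab", "c"]
```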
This behavior is reasonable: the matcher does not know where to start the match (because the lookbehind may not be fixed-length), but it does know where to end (at the current position). Therefore, it starts from the current position and works backwards. (Regexes in some other languages forbid non-fixed-length lookbehind to avoid this issue.)
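A small sketch of a non-fixed-length lookbehind, assuming a made-up "price:" label format: the \s* inside the assertion can span any number of spaces, which JavaScript accepts because matching proceeds backwards from the current position.

```javascript
// The lookbehind has no fixed length: \s* may cover zero or more spaces.
const amounts = "price: 10, price:   25".match(/(?<=\bprice:\s*)\d+/g);
console.log(amounts); // ["10", "25"]
```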
For quantified capturing groups inside the lookbehind, the match furthest to the left of the input string, rather than the one furthest to the right, is captured because of backward matching. See the capturing groups page for more information. A backreference inside the lookbehind must appear to the left of the group it refers to, also because of backward matching. However, disjunctions are still attempted left to right.
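The three rules above can be sketched as follows. The patterns and variable names are illustrative assumptions chosen to isolate each rule:

```javascript
// 1. Quantified group: the last iteration wins, and inside a lookbehind
//    the last iteration is the leftmost character.
const forward = /([ab])+c/.exec("abc");
const backward = /(?<=([ab])+)c/.exec("abc");
console.log(forward[1]); // "b"
console.log(backward[1]); // "a"

// 2. Backreference: \1 sits to the left of the group, so the group is
//    captured first (matching runs right to left) and \1 can then use it.
const doubled = /(?<=\1(a))b/.exec("aab");
const single = /(?<=\1(a))b/.exec("ab");
console.log(doubled[1]); // "a"
console.log(single); // null ("aa" is required before "b")

// 3. Disjunction: alternatives are still tried in written order.
const shortFirst = /(?<=(b|ab))c/.exec("abc");
const longFirst = /(?<=(ab|b))c/.exec("abc");
console.log(shortFirst[1]); // "b"
console.log(longFirst[1]); // "ab"
```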
Examples
Matching strings without consuming them
Similar to lookaheads, lookbehinds can be used to match strings without consuming them, so that only the useful information is extracted. For example, the following regex matches the number in a price label:
function getPrice(label) {
  return /(?<=\$)\d+(?:\.\d*)?/.exec(label)?.[0];
}
getPrice("$10.53"); // "10.53"
getPrice("10.53"); // undefined
A similar effect can be achieved by capturing the submatch you are interested in.
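A minimal sketch of that capturing-group alternative, using a hypothetical helper named getPriceWithGroup (not part of the original page). Here the "$" is consumed as part of the overall match, and the number is read from group 1 instead:

```javascript
function getPriceWithGroup(label) {
  // The "$" is matched and consumed; the number is taken from capture group 1.
  return /\$(\d+(?:\.\d*)?)/.exec(label)?.[1];
}

console.log(getPriceWithGroup("$10.53")); // "10.53"
console.log(getPriceWithGroup("10.53")); // undefined
```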
Specifications
Specification |
---|
ECMAScript Language Specification # prod-Assertion |
Browser compatibility
See also