String.prototype.codePointAt()
The codePointAt() method of String values returns a non-negative integer that is the Unicode code point value of the character starting at the given index. Note that the index is still based on UTF-16 code units, not Unicode code points.
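Syntax
codePointAt(index)
Parameters
index
The zero-based index, counted in UTF-16 code units, of the character whose code point is returned.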
Return value
A non-negative integer representing the code point value of the character at the given index.
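- If index is outside the range 0 – str.length - 1, codePointAt() returns undefined.
- If the element at index is a UTF-16 leading surrogate, returns the code point of the surrogate pair.
- If the element at index is a UTF-16 trailing surrogate, returns only the trailing surrogate code unit.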
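The three return-value cases can be checked side by side. The following is a minimal sketch; the variable names (sample, outOfRange, and so on) are illustrative only and not part of the API.

```javascript
// "a" followed by U+1F60D, which is stored as the surrogate pair \ud83d\ude0d.
// Code units: index 0 = "a", index 1 = 0xd83d (leading), index 2 = 0xde0d (trailing).
const sample = "a\u{1F60D}";

const outOfRange = sample.codePointAt(sample.length); // index past the last code unit
const negative = sample.codePointAt(-1); // index below 0
const atLeading = sample.codePointAt(1); // index of the leading surrogate
const atTrailing = sample.codePointAt(2); // index of the trailing surrogate

console.log(outOfRange); // undefined
console.log(negative); // undefined
console.log(atLeading.toString(16)); // 1f60d
console.log(atTrailing.toString(16)); // de0d
```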
Description
Characters in a string are indexed from left to right. The index of the first character is 0, and the index of the last character in a string called str is str.length - 1.
Unicode code points range from 0 to 1114111 (0x10FFFF). In UTF-16, each string index is a code unit with value 0 – 65535. Higher code points are represented by a pair of 16-bit surrogate pseudo-characters. Therefore, codePointAt() returns a code point that may span two string indices. For information on Unicode, see UTF-16 characters, Unicode code points, and grapheme clusters.
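To make the relationship between a surrogate pair and its code point concrete, the sketch below recombines the two code units by hand using the standard UTF-16 arithmetic and compares the result with codePointAt(). The helper combineSurrogates is hypothetical and written only for this illustration; charCodeAt() is used to read the individual code units.

```javascript
// Hypothetical helper: rebuild a code point from a leading and a trailing
// surrogate. Assumes both arguments are valid surrogates (no validation).
function combineSurrogates(lead, trail) {
  return (lead - 0xd800) * 0x400 + (trail - 0xdc00) + 0x10000;
}

const face = "\ud83d\ude0d"; // one code point, two string indices
const lead = face.charCodeAt(0); // 0xd83d, the leading surrogate
const trail = face.charCodeAt(1); // 0xde0d, the trailing surrogate

console.log(face.length); // 2
console.log(combineSurrogates(lead, trail)); // 128525
console.log(combineSurrogates(lead, trail) === face.codePointAt(0)); // true
```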
Examples
Using codePointAt()
"ABC".codePointAt(0); // 65
"ABC".codePointAt(0).toString(16); // 41
"😍".codePointAt(0); // 128525
"\ud83d\ude0d".codePointAt(0); // 128525
"\ud83d\ude0d".codePointAt(0).toString(16); // 1f60d
"😍".codePointAt(1); // 56845
"\ud83d\ude0d".codePointAt(1); // 56845
"\ud83d\ude0d".codePointAt(1).toString(16); // de0d
"ABC".codePointAt(42); // undefined
Looping with codePointAt()
Looping over string indices visits the same code point twice (once at the leading surrogate, once at the trailing surrogate), and on the second visit codePointAt() returns only the trailing surrogate. For that reason, it is better to avoid looping by index.
const str = "\ud83d\udc0e\ud83d\udc71\u2764";
for (let i = 0; i < str.length; i++) {
  console.log(str.codePointAt(i).toString(16));
}
// '1f40e', 'dc0e', '1f471', 'dc71', '2764'
Instead, use a for...of statement or spread the string, both of which invoke the string's [Symbol.iterator](), which iterates by code points. Then, use codePointAt(0) to get the code point of each element.
for (const codePoint of str) {
  console.log(codePoint.codePointAt(0).toString(16));
}
// '1f40e', '1f471', '2764'
[...str].map((cp) => cp.codePointAt(0).toString(16));
// ['1f40e', '1f471', '2764']
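If the starting index of each code point is also needed, one possible approach is an index-based loop that advances by two code units whenever codePointAt() returns a value above 0xFFFF. This is only a sketch: the function name codePointsWithIndex and the variable horses are made up for this example, and the iterator-based forms above remain the simpler choice when indices are not required.

```javascript
// Hypothetical helper: collect [startIndex, hexCodePoint] pairs, skipping
// the trailing surrogate of every pair so no code point is visited twice.
function codePointsWithIndex(s) {
  const out = [];
  let i = 0;
  while (i < s.length) {
    const cp = s.codePointAt(i);
    out.push([i, cp.toString(16)]);
    // Code points above 0xFFFF occupy two UTF-16 code units.
    i += cp > 0xffff ? 2 : 1;
  }
  return out;
}

const horses = "\ud83d\udc0e\ud83d\udc71\u2764";
console.log(codePointsWithIndex(horses));
// [[0, '1f40e'], [2, '1f471'], [4, '2764']]
```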
Specifications
Specification |
---|
ECMAScript Language Specification # sec-string.prototype.codepointat |