encodeURIComponent()

The encodeURIComponent() function encodes a URI component by replacing each instance of certain characters with one, two, three, or four escape sequences representing the UTF-8 encoding of the character (four escape sequences occur only for characters composed of two surrogate code units). Compared to encodeURI(), this function encodes more characters, including those that are part of the URI syntax.
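
As a small illustration of the "one to four escape sequences" wording, the sketch below encodes one character from each UTF-8 length class. The sample characters and variable names are chosen here for illustration only.

```js
// Each escape sequence (%XX) stands for one byte of the character's UTF-8 encoding.
const oneByte = encodeURIComponent("?"); // U+003F, 1 byte in UTF-8
const twoBytes = encodeURIComponent("\u00E9"); // é, 2 bytes in UTF-8
const threeBytes = encodeURIComponent("\u20AC"); // €, 3 bytes in UTF-8
const fourBytes = encodeURIComponent("\u{1F600}"); // 😀, a surrogate pair, 4 bytes in UTF-8

console.log(oneByte); // "%3F"
console.log(twoBytes); // "%C3%A9"
console.log(threeBytes); // "%E2%82%AC"
console.log(fourBytes); // "%F0%9F%98%80"
```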

Syntax

js
encodeURIComponent(uriComponent)

Parameters

uriComponent

A string to be encoded as a URI component (a path, query string, fragment, etc.). Other values are converted to strings.

Return value

A new string representing the provided uriComponent encoded as a URI component.

Exceptions

URIError

Thrown if uriComponent contains a lone surrogate.

Description

encodeURIComponent() is a function property of the global object.

encodeURIComponent() uses the same encoding algorithm as described in encodeURI(). It escapes all characters except:

A–Z a–z 0–9 - _ . ! ~ * ' ( )
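
A quick check of this set, written as a sketch with sample strings chosen for illustration: the characters listed above pass through unchanged, while the URI delimiters that encodeURI() leaves alone are escaped.

```js
// A sample drawn from the unescaped set: letters, digits, and - _ . ! ~ * ' ( )
const unescaped = "AZaz09-_.!~*'()";
const unescapedResult = encodeURIComponent(unescaped);

// Characters with a role in URI syntax: kept by encodeURI(), escaped by encodeURIComponent().
const delimiters = ";/?:@&=+$,#";
const asComponent = encodeURIComponent(delimiters);
const asUri = encodeURI(delimiters);

console.log(unescapedResult === unescaped); // true
console.log(asComponent); // "%3B%2F%3F%3A%40%26%3D%2B%24%2C%23"
console.log(asUri === delimiters); // true
```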

Compared to encodeURI(), encodeURIComponent() escapes a larger set of characters. Use encodeURIComponent() on user-entered fields from forms POSTed to the server; this encodes & symbols that may inadvertently be generated during data entry for character references or other characters that require encoding/decoding. For example, if a user writes Jack & Jill, without encodeURIComponent(), the ampersand could be interpreted on the server as the start of a new field and jeopardize the integrity of the data.
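
The Jack & Jill case can be sketched as follows. The field names (name, age) are hypothetical, and URLSearchParams stands in for whatever parser the server actually uses.

```js
const userInput = "Jack & Jill";

// Without encoding, the ampersand splits the value into two fields.
const rawBody = `name=${userInput}&age=10`;
// With encoding, the ampersand travels as %26 and the value stays intact.
const encodedBody = `name=${encodeURIComponent(userInput)}&age=10`;

// URLSearchParams is used here only as a stand-in for a server-side parser.
const rawName = new URLSearchParams(rawBody).get("name");
const encodedName = new URLSearchParams(encodedBody).get("name");

console.log(encodedBody); // "name=Jack%20%26%20Jill&age=10"
console.log(JSON.stringify(rawName)); // "Jack " (truncated at the ampersand)
console.log(JSON.stringify(encodedName)); // "Jack & Jill"
```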

For application/x-www-form-urlencoded, spaces are to be replaced by +, so one may wish to follow an encodeURIComponent() call with an additional replacement of %20 with +.
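
A minimal sketch of that extra replacement is shown below. The helper name encodeFormComponent is hypothetical, and the function covers only the space convention described here, not every detail of form encoding.

```js
// Hypothetical helper: percent-encode, then switch %20 to "+" for form bodies.
function encodeFormComponent(str) {
  // A literal "+" in the input is already escaped as %2B at this point,
  // so every "+" in the result unambiguously means a space.
  return encodeURIComponent(str).replace(/%20/g, "+");
}

console.log(encodeFormComponent("Jack & Jill")); // "Jack+%26+Jill"
console.log(encodeFormComponent("1 + 1")); // "1+%2B+1"
```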

Examples

Encoding for Content-Disposition and Link headers

The following example provides the special encoding required within UTF-8 Content-Disposition and Link server response header parameters (e.g., UTF-8 filenames):

js
const fileName = "my file(2).txt";
const header = `Content-Disposition: attachment; filename*=UTF-8''${encodeRFC5987ValueChars(
  fileName,
)}`;

console.log(header);
// "Content-Disposition: attachment; filename*=UTF-8''my%20file%282%29.txt"

function encodeRFC5987ValueChars(str) {
  return (
    encodeURIComponent(str)
      // The following creates the sequences %27 %28 %29 %2A (Note that
      // the valid encoding of "*" is %2A, which necessitates calling
      // toUpperCase() to properly encode). Although RFC3986 reserves "!",
      // RFC5987 does not, so we do not need to escape it.
      .replace(
        /['()*]/g,
        (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`,
      )
      // The following are not required for percent-encoding per RFC5987,
      // so we can allow for a little better readability over the wire: |`^
      .replace(/%(7C|60|5E)/g, (match, hex) =>
        String.fromCharCode(parseInt(hex, 16)),
      )
  );
}

Encoding for RFC3986

The more recent RFC3986 reserves !, ', (, ), and *, even though these characters have no formalized URI delimiting uses. The following function encodes a string for RFC3986-compliant URL component format. It also encodes [ and ], which are part of the IPv6 URI syntax. An RFC3986-compliant encodeURI implementation should not escape them, which is demonstrated in the encodeURI() example.

js
function encodeRFC3986URIComponent(str) {
  return encodeURIComponent(str).replace(
    /[!'()*]/g,
    (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`,
  );
}
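
To see the difference on a concrete input, the sketch below compares plain encodeURIComponent() with the RFC3986 variant. The function is repeated only so the snippet runs by itself, and the sample string is made up for illustration.

```js
// Same function as above, repeated so this snippet is self-contained.
function encodeRFC3986URIComponent(str) {
  return encodeURIComponent(str).replace(
    /[!'()*]/g,
    (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`,
  );
}

const sample = "it's (a) test!*";
const plain = encodeURIComponent(sample);
const strict = encodeRFC3986URIComponent(sample);

console.log(plain); // "it's%20(a)%20test!*"
console.log(strict); // "it%27s%20%28a%29%20test%21%2A"
```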

Encoding a lone surrogate throws

A URIError will be thrown if one attempts to encode a surrogate which is not part of a high-low pair. For example:

js
// High-low pair OK
encodeURIComponent("\uD800\uDFFF"); // "%F0%90%8F%BF"

// Lone high-surrogate code unit throws "URIError: malformed URI sequence"
encodeURIComponent("\uD800");

// Lone low-surrogate code unit throws "URIError: malformed URI sequence"
encodeURIComponent("\uDFFF");

You can use String.prototype.toWellFormed(), which replaces lone surrogates with the Unicode replacement character (U+FFFD), to avoid this error. You can also use String.prototype.isWellFormed() to check if a string contains lone surrogates before passing it to encodeURIComponent().
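
Both approaches can be sketched as below. The helper names encodeWellFormed and canEncode are hypothetical. The fallbacks for runtimes without toWellFormed() or isWellFormed() are an assumption added here, not something the text above prescribes.

```js
// Hypothetical helper: replace lone surrogates with U+FFFD, then encode.
function encodeWellFormed(str) {
  const wellFormed =
    typeof str.toWellFormed === "function"
      ? str.toWellFormed()
      : // Assumed fallback for older runtimes: match a high surrogate not followed
        // by a low one, or a low surrogate not preceded by a high one.
        str.replace(
          /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g,
          "\uFFFD",
        );
  return encodeURIComponent(wellFormed);
}

// Hypothetical helper: report whether encodeURIComponent() would accept the string.
function canEncode(str) {
  if (typeof str.isWellFormed === "function") {
    return str.isWellFormed();
  }
  // Assumed fallback: attempt the encoding and treat URIError as "not well formed".
  try {
    encodeURIComponent(str);
    return true;
  } catch (e) {
    if (e instanceof URIError) return false;
    throw e;
  }
}

console.log(encodeWellFormed("ab\uD800")); // "ab%EF%BF%BD"
console.log(encodeWellFormed("\uD800\uDFFF")); // "%F0%90%8F%BF"
console.log(canEncode("\uDFFF")); // false
console.log(canEncode("\uD800\uDFFF")); // true
```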

Specifications

Specification
ECMAScript Language Specification
# sec-encodeuricomponent-uricomponent

Browser compatibility

See also