encodeURI()

The encodeURI() function encodes a URI by replacing each instance of certain characters with one, two, three, or four escape sequences representing the UTF-8 encoding of the character (four escape sequences occur only for characters composed of two surrogate code units). Compared to encodeURIComponent(), this function encodes fewer characters, preserving those that are part of the URI syntax.

Syntax

js
encodeURI(uri)

Parameters

uri

A string to be encoded as a URI.

Return value

A new string representing the provided string encoded as a URI.

Exceptions

URIError

Thrown if uri contains a lone surrogate.

Description

encodeURI() is a function property of the global object.

The encodeURI() function escapes characters by UTF-8 code units, with each octet encoded in the format %XX, left-padded with 0 if necessary. Because lone surrogates in UTF-16 do not encode any valid Unicode character, they cause encodeURI() to throw a URIError.
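The escaping rule described above can be reproduced by hand. The following is a minimal sketch, not the engine's actual implementation: percentEncodeUtf8 is a hypothetical helper name, and it assumes a runtime that provides the global TextEncoder.

```javascript
// Hypothetical helper (not part of any standard API): percent-encode every
// UTF-8 octet of a string, the way encodeURI() does for characters it escapes.
function percentEncodeUtf8(text) {
  const octets = new TextEncoder().encode(text); // UTF-8 bytes
  return Array.from(
    octets,
    // Two uppercase hex digits per octet, left-padded with 0 if necessary.
    (octet) => `%${octet.toString(16).toUpperCase().padStart(2, "0")}`,
  ).join("");
}

console.log(percentEncodeUtf8("é")); // "%C3%A9" (two octets)
console.log(percentEncodeUtf8("€")); // "%E2%82%AC" (three octets)
console.log(percentEncodeUtf8("😀")); // "%F0%9F%98%80" (surrogate pair, four octets)
console.log(percentEncodeUtf8("\n")); // "%0A" (left-padded)
console.log(percentEncodeUtf8("é") === encodeURI("é")); // true
```

Unlike encodeURI(), this sketch escapes every character it is given, so it only matches encodeURI() for characters outside the unescaped set.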

encodeURI() escapes all characters except:

A–Z a–z 0–9 - _ . ! ~ * ' ( )

; / ? : @ & = + $ , #

The characters on the second line may be part of the URI syntax, and are only escaped by encodeURIComponent(). Neither encodeURI() nor encodeURIComponent() encodes the characters -.!~*'(), known as "unreserved marks", which do not have a reserved purpose but are allowed in a URI "as is". (See RFC2396.)

The encodeURI() function does not encode characters that have special meaning (reserved characters) for a URI. The following example shows all the parts that a URI can possibly contain. Note how certain characters are used to signify special meaning:

url
http://www.example.com:80/path/to/file.php?foo=316&bar=this+has+spaces#anchor
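To see how those reserved characters delimit the parts, the example URI can be parsed with the standard URL class. This is only an illustrative sketch; the variable names uri and parts are assumptions made for the example.

```javascript
const uri =
  "http://www.example.com:80/path/to/file.php?foo=316&bar=this+has+spaces#anchor";
const parts = new URL(uri);

console.log(parts.protocol); // "http:"
console.log(parts.hostname); // "www.example.com"
console.log(parts.pathname); // "/path/to/file.php"
console.log(parts.search); // "?foo=316&bar=this+has+spaces"
console.log(parts.hash); // "#anchor"

// Every character in this URI is in the unescaped set, so encodeURI()
// returns it unchanged.
console.log(encodeURI(uri) === uri); // true
```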

encodeURI, as the name implies, is used to encode a URL as a whole, assuming it is already well-formed. If you want to dynamically assemble string values into a URL, you probably want to use encodeURIComponent() on each dynamic segment instead, to avoid URL syntax characters in unwanted places.

js
const name = "Ben & Jerry's";

// This is bad: "&" is left unescaped and splits the query string.
const badLink = encodeURI(`https://example.com/?choice=${name}`);
// "https://example.com/?choice=Ben%20&%20Jerry's"
console.log([...new URL(badLink).searchParams]); // [['choice', 'Ben '], [" Jerry's", '']]

// Instead, escape only the dynamic segment:
const goodLink = `https://example.com/?choice=${encodeURIComponent(name)}`;
// "https://example.com/?choice=Ben%20%26%20Jerry's"
console.log([...new URL(goodLink).searchParams]); // [['choice', "Ben & Jerry's"]]
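As an alternative sketch, the URL and URLSearchParams APIs can escape a dynamic query value for you, so no manual encoding call is needed. The variable names choice, url, and roundTripped are assumptions made for this example, and the serialized form may differ from what encodeURIComponent() produces (for instance, spaces in query values are written as +).

```javascript
const choice = "Ben & Jerry's";

// Build the URL from its parts and let searchParams escape the value.
const url = new URL("https://example.com/");
url.searchParams.set("choice", choice);

// Parsing the serialized URL again yields the original value.
const roundTripped = new URL(url.href).searchParams.get("choice");
console.log(roundTripped === choice); // true
```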

Examples

encodeURI() vs. encodeURIComponent()

encodeURI() differs from encodeURIComponent() as follows:

js
const set1 = ";/?:@&=+$,#"; // Reserved Characters
const set2 = "-.!~*'()"; // Unreserved Marks
const set3 = "ABC abc 123"; // Alphanumeric Characters + Space

console.log(encodeURI(set1)); // ;/?:@&=+$,#
console.log(encodeURI(set2)); // -.!~*'()
console.log(encodeURI(set3)); // ABC%20abc%20123 (the space gets encoded as %20)

console.log(encodeURIComponent(set1)); // %3B%2F%3F%3A%40%26%3D%2B%24%2C%23
console.log(encodeURIComponent(set2)); // -.!~*'()
console.log(encodeURIComponent(set3)); // ABC%20abc%20123 (the space gets encoded as %20)

Encoding a lone surrogate throws

A URIError is thrown if you attempt to encode a surrogate that is not part of a high-low pair. For example:

js
// High-low pair OK
encodeURI("\uD800\uDFFF"); // "%F0%90%8F%BF"

// Lone high-surrogate code unit throws "URIError: malformed URI sequence"
encodeURI("\uD800");

// Lone low-surrogate code unit throws "URIError: malformed URI sequence"
encodeURI("\uDFFF");

You can use String.prototype.toWellFormed(), which replaces lone surrogates with the Unicode replacement character (U+FFFD), to avoid this error. You can also use String.prototype.isWellFormed() to check if a string contains lone surrogates before passing it to encodeURI().
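A minimal sketch of that approach follows. safeEncodeURI and LONE_SURROGATE are hypothetical names introduced for illustration, and the regular-expression fallback is an assumption added only so the sketch also runs on runtimes that lack toWellFormed().

```javascript
// Matches a high surrogate not followed by a low surrogate, or a low
// surrogate not preceded by a high surrogate (fallback only).
const LONE_SURROGATE =
  /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g;

// Hypothetical helper: replace lone surrogates with U+FFFD, then encode.
function safeEncodeURI(str) {
  const wellFormed =
    typeof str.toWellFormed === "function"
      ? str.toWellFormed()
      : str.replace(LONE_SURROGATE, "\uFFFD");
  return encodeURI(wellFormed);
}

console.log(safeEncodeURI("a\uD800b")); // "a%EF%BF%BDb" (U+FFFD as UTF-8)
console.log(safeEncodeURI("\uD800\uDFFF")); // "%F0%90%8F%BF" (valid pair, unchanged)
```

Replacing lone surrogates silently changes the data, so checking with isWellFormed() and rejecting the input may be the better choice when the original string must be preserved.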

Encoding for RFC3986

The more recent RFC3986 makes square brackets reserved (for IPv6), so they are not encoded when forming something that could be part of a URL (such as a host). It also reserves !, ', (, ), and *, even though these characters have no formalized URI delimiting uses. The following function encodes a string for RFC3986-compliant URL format.

js
function encodeRFC3986URI(str) {
  return encodeURI(str)
    .replace(/%5B/g, "[")
    .replace(/%5D/g, "]")
    .replace(
      /[!'()*]/g,
      (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`,
    );
}
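As a usage sketch, the function can be compared with plain encodeURI() on a URL that contains an IPv6 host and some of the newly reserved characters. The function is repeated here so the snippet runs on its own, and the input URL and variable names are made up for illustration.

```javascript
function encodeRFC3986URI(str) {
  return encodeURI(str)
    .replace(/%5B/g, "[")
    .replace(/%5D/g, "]")
    .replace(
      /[!'()*]/g,
      (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`,
    );
}

const input = "http://[::1]/it's (ok)!*";

// encodeURI() escapes the brackets but keeps ! ' ( ) *
const plainResult = encodeURI(input);
console.log(plainResult); // "http://%5B::1%5D/it's%20(ok)!*"

// The RFC3986 variant restores the brackets and escapes ! ' ( ) *
const rfc3986Result = encodeRFC3986URI(input);
console.log(rfc3986Result); // "http://[::1]/it%27s%20%28ok%29%21%2A"
```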

Specifications

ECMAScript Language Specification
# sec-encodeuri-uri


See also