Two JavaScript built-ins, two character sets, two completely different jobs. encodeURI and encodeURIComponent look like they do the same thing, but they do not. Confusing them is one of the most common causes of broken query strings, mangled email addresses, and the dreaded %2520 in production logs.
This cheatsheet walks through what RFC 3986 actually says, what each function escapes and what it leaves alone, the form-encoding rules that overlap with URI rules but are not the same, and the practical bugs — double-encoding, disappearing pluses, vanishing slashes — that follow from picking the wrong tool.
The RFC 3986 Character Categories
URLs are built from a tightly specified set of ASCII characters. RFC 3986 divides them into two groups:
- Unreserved: A–Z a–z 0–9 - _ . ~. Always safe to use literally in any URI component, never need encoding, never have special meaning.
- Reserved: : / ? # [ ] @ (gen-delims) and ! $ & ' ( ) * + , ; = (sub-delims). These carry structural meaning in URLs: ? starts a query, & separates parameters, = joins key to value, and so on. They must be percent-encoded when used as data rather than structure.
Anything else — spaces, non-ASCII bytes, control characters — must always be
percent-encoded. Percent-encoding takes a byte (UTF-8 for non-ASCII) and writes it as %XX where XX is the two-digit uppercase hex value.
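As a rough illustration of that rule, the sketch below percent-encodes every byte of a string by hand. The helper name percentEncodeAll is hypothetical (it is not a built-in), and it deliberately escapes unreserved characters as well, which a real encoder would leave alone.

```javascript
// Hypothetical helper (not a built-in): percent-encode every UTF-8 byte of a string.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text); // string -> UTF-8 bytes
  return Array.from(
    bytes,
    (b) => '%' + b.toString(16).toUpperCase().padStart(2, '0')
  ).join('');
}

console.log(percentEncodeAll(' '));      // "%20"
console.log(percentEncodeAll('\u00E9')); // é -> "%C3%A9" (two UTF-8 bytes)
console.log(percentEncodeAll('\u20AC')); // € -> "%E2%82%AC" (three UTF-8 bytes)
```

For non-ASCII input the result should match what encodeURIComponent produces, since both work from the UTF-8 bytes.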
What encodeURI Does
encodeURI is designed for one job: take an already-assembled full URI and escape the characters that would otherwise corrupt it during transport. It deliberately leaves almost every reserved character alone, on the assumption that they are already serving their structural role. The exceptions are [ and ], which it does escape.
Characters left unescaped by encodeURI:
- Unreserved: A–Z a–z 0–9 - _ . ~
- Reserved (gen-delims and sub-delims): ; , / ? : @ & = + $ ! * ' ( ) #
Everything else gets percent-encoded. The practical effect: pass it a full URL with spaces
and Unicode and you get back a valid URL with spaces replaced by %20 and
non-ASCII characters UTF-8 encoded. Pass it a piece of data that contains &, =, ?, or # and the function cheerfully
leaves those characters intact — corrupting your URL.
encodeURI('https://example.com/path with space?q=a&b=c#fragment');
// "https://example.com/path%20with%20space?q=a&b=c#fragment"
// Spaces are encoded as %20; ? & = # are left alone (correct for a full URL)
encodeURI('a&b=c');
// "a&b=c": & and = NOT encoded (almost certainly a bug)
What encodeURIComponent Does
encodeURIComponent is for a single piece of a URL — one path segment, one query parameter
name, one query parameter value, one fragment. It assumes nothing about the structure of the surrounding
URL, so it escapes every reserved character that could accidentally take on a delimiter role.
Characters left unescaped by encodeURIComponent:
- Unreserved per RFC 3986: A–Z a–z 0–9 - _ . ~
- Plus, for historical ECMAScript reasons: ! * ' ( )
Everything else — /, ?, #, &, =, +, space, Unicode — gets percent-encoded.
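One way to check the two character lists is to probe both functions directly. The sketch below, which assumes only printable ASCII matters here, collects the characters each function returns unchanged and prints the ones that encodeURI keeps but encodeURIComponent escapes. The helper name unescapedBy is made up for this example.

```javascript
// Collect the printable ASCII characters that an encoder returns unchanged.
function unescapedBy(encode) {
  let kept = '';
  for (let code = 0x20; code <= 0x7e; code++) {
    const ch = String.fromCharCode(code);
    if (encode(ch) === ch) kept += ch;
  }
  return kept;
}

const keptByEncodeURI = unescapedBy(encodeURI);
const keptByComponent = unescapedBy(encodeURIComponent);

// Characters encodeURI leaves alone but encodeURIComponent escapes.
const onlyEncodeURI = [...keptByEncodeURI]
  .filter((ch) => !keptByComponent.includes(ch))
  .join('');

console.log(onlyEncodeURI); // "#$&+,/:;=?@"
```

Those eleven characters are exactly the delimiters that turn data into structure when they leak into a query value.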
encodeURIComponent('a&b=c');
// "a%26b%3Dc": & encoded to %26, = to %3D (correct for a query value)
encodeURIComponent('path/with/slashes');
// "path%2Fwith%2Fslashes": / encoded to %2F
The Right Tool for the Job
A rule of thumb that works in the vast majority of cases: build URLs one piece at a time using encodeURIComponent on each piece, and you will almost never reach for encodeURI.
// Wrong: the query value contains an ampersand and an equals sign
const q = 'a&b=c';
const wrongUrl = encodeURI(`https://api.example.com/search?q=${q}`);
// "https://api.example.com/search?q=a&b=c"
// Server parses ?q=a&b=c as q="a", b="c" (wrong)
// Right: encode the value as a component
const rightUrl = `https://api.example.com/search?q=${encodeURIComponent(q)}`;
// "https://api.example.com/search?q=a%26b%3Dc"
// Server parses it as q="a&b=c" (correct)
The modern, even more robust alternative is the URL and URLSearchParams classes, which handle the encoding rules for you:
const url = new URL('https://api.example.com/search');
url.searchParams.set('q', 'a&b=c');
url.toString();
// "https://api.example.com/search?q=a%26b%3Dc"
URLSearchParams uses the form-encoding rules (see below), so spaces become + rather than %20. Both are accepted by virtually every server.
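Putting the piece-by-piece rule together, a small helper might look like the sketch below. The name buildUrl and its argument shape are assumptions made for this example, not part of any library. Path segments go through encodeURIComponent, and query values go through URLSearchParams.

```javascript
// Hypothetical helper: assemble a URL from an origin, raw path segments and raw query values.
function buildUrl(origin, segments, params) {
  // Each path segment is encoded on its own, then joined with literal slashes.
  const path = segments.map((s) => encodeURIComponent(s)).join('/');
  const url = new URL(`${origin}/${path}`);
  // URLSearchParams applies form encoding to each name and value.
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

console.log(buildUrl('https://api.example.com', ['search', 'a b'], { q: 'a&b=c' }));
// "https://api.example.com/search/a%20b?q=a%26b%3Dc"
```

No raw value is ever concatenated into the URL, so no delimiter in the data can be mistaken for structure.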
application/x-www-form-urlencoded Is a Different Beast
This is where the most surprising bugs come from. The media type application/x-www-form-urlencoded, used by HTML form submissions (in the query string for GET, in the request body for POST), has its own rules that overlap with but are not identical to RFC 3986 URI percent-encoding.
The differences that matter:
- Space encodes as +, not %20.
- Literal + must be encoded as %2B, because otherwise it decodes back to a space.
- The ~ character is percent-encoded (as %7E) by some implementations, even though it is unreserved in RFC 3986.
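These differences can be seen by encoding one value both ways. The sketch below uses URLSearchParams as a representative form encoder; the sample value and the parameter name v are arbitrary.

```javascript
const value = 'a b+c~d';

// RFC 3986 style component encoding.
const uriEncoded = encodeURIComponent(value);

// Form encoding as produced by URLSearchParams; slice(2) strips the leading "v=".
const formEncoded = new URLSearchParams({ v: value }).toString().slice(2);

console.log(uriEncoded);  // "a%20b%2Bc~d"
console.log(formEncoded); // "a+b%2Bc%7Ed"
```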
Servers are forgiving and usually accept both + and %20 as a space
when decoding query strings. The dangerous case is the reverse: writing "a+b@example.com" into a query string without encoding the +, then
watching the server decode it to "a b@example.com".
// In a browser address bar this URL...
https://example.com/?email=a+b@example.com
// ...will be parsed by most server frameworks as
email = "a b@example.com"
// The user's plus has become a space.
The fix is always the same: encodeURIComponent('a+b@example.com') produces "a%2Bb%40example.com", and that round-trips correctly through both URI-decoding and form-decoding paths.
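The round trip can be checked locally. In the sketch below, URLSearchParams stands in for a server framework's form decoder, which is an assumption: a given framework may differ in detail.

```javascript
// A raw "+" in a query string is read as a space by form decoding.
const raw = new URLSearchParams('email=a+b@example.com').get('email');

// Encoding the value first survives the round trip.
const encoded = encodeURIComponent('a+b@example.com');
const safe = new URLSearchParams(`email=${encoded}`).get('email');

console.log(raw);                         // "a b@example.com"
console.log(safe);                        // "a+b@example.com"
console.log(decodeURIComponent(encoded)); // "a+b@example.com"
```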
%20 vs +: Quick Decision Table
- In a URL path: spaces must be %20. + is literal.
- In a URL query string per RFC 3986: spaces should be %20. + is literal.
- In a URL query string per HTML form encoding: spaces become +. Literal + must be %2B.
- In an application/x-www-form-urlencoded POST body: same as HTML form encoding. Spaces are +, literal + is %2B.
- In a URL fragment: spaces must be %20. + is literal.
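The same a+b text can be read back from three positions in one URL to see the context dependence. This is a small sketch using the standard URL class; the example.com address is a placeholder.

```javascript
const url = new URL('https://example.com/a+b?q=a+b#a+b');

console.log(decodeURIComponent(url.pathname)); // "/a+b" (path: + is literal)
console.log(url.searchParams.get('q'));        // "a b"  (form decoding: + is a space)
console.log(decodeURIComponent(url.hash));     // "#a+b" (fragment: + is literal)
```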
Double-Encoding and How to Spot It
Double-encoding happens when one layer of code encodes a value and a later layer encodes the
already-encoded string a second time. The signature is recognisable: percent signs
themselves get encoded as %25, so a space (%20) becomes %2520, an ampersand (%26) becomes %2526, and so on.
Common sources:
- A framework that auto-encodes path or query parameters, on top of a value the developer already encoded.
- A reverse proxy that decodes the request once, the application server decodes again, and the application code decodes a third time.
- A client that copies a URL from the address bar (where the browser has already decoded it for display) and pastes it into another encoding pass.
Detection is straightforward: search the affected URL or log line for %25 followed by two hex digits. If you see any of %2520, %2526, %253F, %2523, you are looking at double-encoded data.
The fix is never to "decode it twice" in the consuming code — that just papers over the bug. Find the duplicate encode call and remove it.
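The signature is easy to reproduce and to search for. The sketch below encodes a value twice and applies a regular-expression check; looksDoubleEncoded is a hypothetical helper, and it is only a heuristic, because data that legitimately contains the text "%20" also encodes to "%2520" after a single pass.

```javascript
const once = encodeURIComponent('a b&c');
const twice = encodeURIComponent(once); // the bug: encoding an already-encoded string

// Hypothetical detector: %25 followed by two hex digits suggests a second encoding pass.
function looksDoubleEncoded(s) {
  return /%25[0-9A-Fa-f]{2}/.test(s);
}

console.log(once);  // "a%20b%26c"
console.log(twice); // "a%2520b%2526c"
console.log(looksDoubleEncoded(once));  // false
console.log(looksDoubleEncoded(twice)); // true
```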
The Forward-Slash Trap
/ in a URL is often interpreted as a path separator by intermediate infrastructure, even when it is percent-encoded as %2F. Historically, both NGINX and Apache have, by default, mishandled requests that contain %2F in the path: they either reject the request outright or silently decode the slash and treat it as a separator, breaking your routing.
Affected scenarios: an identifier that legitimately contains a slash (a Git ref name like refs/heads/main, a file path inside an archive, a hierarchical username), an opaque token that happens to include a slash, or a standard (non-URL-safe) Base64 signature in a path segment.
Workarounds:
- Use base64url (see the Base64 encoding article) instead of standard Base64, so slashes never appear in the input.
- Move the value into a query parameter, where %2F is safe.
- Configure the server to allow encoded slashes: Apache AllowEncodedSlashes On (or NoDecode to keep them encoded), NGINX requires module-level configuration to preserve them.
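The first two workarounds can be sketched in a few lines. The helper toBase64Url is hypothetical and only swaps the alphabet of a string that is already standard Base64; the /repo path and the ref parameter name are placeholders.

```javascript
// Hypothetical helper: convert a standard Base64 string to the URL-safe alphabet.
function toBase64Url(base64) {
  return base64
    .replace(/\+/g, '-') // "+" becomes "-"
    .replace(/\//g, '_') // "/" becomes "_"
    .replace(/=+$/, ''); // drop trailing padding
}

// "+/8=" is the standard Base64 of the two bytes 0xFB 0xFF.
console.log(toBase64Url('+/8=')); // "-_8"

// Moving a slash-bearing identifier into a query parameter instead of the path.
const url = new URL('https://example.com/repo');
url.searchParams.set('ref', 'refs/heads/main');
console.log(url.toString()); // "https://example.com/repo?ref=refs%2Fheads%2Fmain"
```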
Practical Examples Across Languages
JavaScript
encodeURIComponent('hello world'); // "hello%20world"
encodeURIComponent('a+b@example.com'); // "a%2Bb%40example.com"
decodeURIComponent('a%2Bb%40example.com'); // "a+b@example.com"
// URLSearchParams uses form encoding
new URLSearchParams({email: 'a+b@example.com'}).toString();
// "email=a%2Bb%40example.com"
new URLSearchParams({q: 'hello world'}).toString();
// "q=hello+world" (space becomes +)
Python
from urllib.parse import quote, quote_plus
quote('hello world')  # "hello%20world" (closest to encodeURIComponent, but leaves / alone by default)
quote_plus('hello world')  # "hello+world" (form-encoding equivalent)
quote('a/b', safe='')  # "a%2Fb" (pass safe='' to encode the slash too)
curl
# --data sends the raw bytes; you must pre-encode any reserved characters
curl --data 'q=a&b=c' https://api.example.com/search
# --data-urlencode does it for you
curl --data-urlencode 'q=a&b=c' https://api.example.com/search
# Sends: q=a%26b%3Dc
Decoding Pitfalls
decodeURIComponent throws a URIError (reported as "malformed URI sequence" or "URI malformed", depending on the engine) on input that contains a stray % not followed by two hex digits, or an invalid UTF-8 sequence. When the input comes from a user, or is pasted from anywhere outside your control, wrap the call:
function safeDecode(s) {
  try {
    return decodeURIComponent(s);
  } catch {
    return s; // or null, or a sanitised fallback
  }
}
On the server, the equivalent guard depends on your stack. Node's built-in URL parser tolerates some malformed input that decodeURIComponent rejects, so be explicit about which decoder is running on the suspect string.
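The difference between the two decoders shows up on a stray percent sign. The sketch below feeds the same malformed input to decodeURIComponent and to URLSearchParams; the variable names are illustrative only.

```javascript
const suspect = '100%'; // stray "%" with no hex digits after it

let strictResult;
try {
  strictResult = decodeURIComponent(suspect);
} catch (err) {
  strictResult = err instanceof URIError ? 'URIError' : 'other error';
}

// URLSearchParams leaves the malformed sequence in place instead of throwing.
const lenientResult = new URLSearchParams(`q=${suspect}`).get('q');

console.log(strictResult);  // "URIError"
console.log(lenientResult); // "100%"
```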
A Concrete Quick Reference
- Building a query string from scratch? Use new URLSearchParams or call encodeURIComponent on each value.
- Inserting a value into a path segment? encodeURIComponent, and beware of the slash trap.
- You have a whole pre-built URL and just want to escape spaces and non-ASCII? encodeURI. This is the only time it is the right answer.
- Reading data from a query string in code that uses application/x-www-form-urlencoded parsing? Remember that + decodes to space and adjust your producers accordingly.
- Seeing %2520 in logs? You are encoding twice. Trace the layers and remove one.
For round-tripping a value through both encoding flavours and seeing exactly which bytes change, the URL Encoder & Decoder shows the standard URI and form-encoded variants side by side. Useful when you need to confirm, for example, what your shell, your HTTP client, and your server are each going to do with the same input string. For neighbouring fundamentals, the top JSON errors piece covers the related territory of malformed text formats and the parser errors they produce.