URL Encoding — Why %20 Isn't Just Random Letters and When It Goes Wrong

📅 July 14, 2025 ✍️ StoreDropship 🏷️ Developer Tools ⏱️ 10 min read
You've seen %20 in a URL. You've probably also seen a broken link that showed a question mark where a character should be, or an API that returned garbled text instead of a search result. Both of these are URL encoding stories — one working correctly, one not. Here's the complete picture.

The Problem URLs Have With Most Human Text

URLs were designed in the early 1990s when the internet was almost entirely ASCII — the 128-character set covering English letters, digits, and basic punctuation. The URL specification (RFC 3986) allows only a specific subset of ASCII characters to appear unencoded. Everything else must be converted.

Fast-forward to today. The web is global. Search queries are in Hindi, Tamil, Arabic, Japanese, and hundreds of other languages. Email addresses contain international characters. Product names include symbols like ₹, ©, and ™. Affiliate links contain tracking parameters with spaces, equals signs, and ampersands inside values.

None of these characters are safe in a raw URL. They need to be encoded — and that's what percent-encoding does. It's not arbitrary. It's a precisely defined system that makes any character safe for use anywhere in a URL.

The Mechanism — What %20 Actually Means

Every character in a computer has a numeric code. The ASCII code for a space is 32. In hexadecimal, 32 is 20. So a space, when percent-encoded, becomes %20. The percent sign announces "what follows is a hex code, not a literal character." The two hex digits are the byte value of the character.

For characters that require multiple bytes in UTF-8, which includes every character outside ASCII, each byte gets its own %XX triplet. The Indian rupee sign ₹ (U+20B9) encodes to three UTF-8 bytes: E2, 82, B9. So ₹ becomes %E2%82%B9 in a URL.

This means a Hindi search query like "नमस्ते दुनिया" doesn't become two words with a space. It becomes a long string of percent triplets that is unambiguous, safe, and fully reversible. Every web browser, every server, every API client that follows the standard can decode it back to the original text.

Input:    नमस्ते दुनिया
Encoded:  %E0%A4%A8%E0%A4%AE%E0%A4%B8%E0%A5%8D%E0%A4%A4%E0%A5%87%20%E0%A4%A6%E0%A5%81%E0%A4%A8%E0%A4%BF%E0%A4%AF%E0%A4%BE

Input:    price = ₹999 & discount = 10%
Encoded:  price%20%3D%20%E2%82%B9999%20%26%20discount%20%3D%2010%25
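The byte-by-byte mechanism can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not how browsers implement it: percentEncode is a hypothetical helper name, and it keeps just the RFC 3986 unreserved characters literal, so it is slightly stricter than encodeURIComponent for characters such as ! and *.

```javascript
// Hypothetical helper: percent-encode a string one UTF-8 byte at a time.
// A-Z, a-z, 0-9 and - _ . ~ stay literal; every other byte becomes %XX.
function percentEncode(text) {
  const unreserved = /[A-Za-z0-9\-_.~]/;
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the input
  let out = "";
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && unreserved.test(ch)) {
      out += ch;
    } else {
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode(" "));  // %20
console.log(percentEncode("₹"));  // %E2%82%B9

// For this input the sketch agrees with the built-in function.
const sample = "price = ₹999 & discount = 10%";
console.log(percentEncode(sample) === encodeURIComponent(sample)); // true
```

In real code you would call encodeURIComponent directly; the sketch is only there to show where each %XX triplet comes from.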

encodeURI vs encodeURIComponent — The Distinction That Trips Everyone Up

This is where most encoding bugs come from. JavaScript provides two encoding functions and they behave very differently. Using the wrong one produces a URL that appears to work but silently corrupts your data.

encodeURI is designed to encode a complete URL. It knows that characters like :, /, ?, #, &, and = have structural meaning in a URL, so it leaves them alone. If you pass a full URL like https://storedropship.in/search?q=hello world to encodeURI, you get https://storedropship.in/search?q=hello%20world — the space is encoded but the structure is preserved.

encodeURIComponent encodes everything except letters, digits, and a handful of safe characters (- _ . ! ~ * ' ( )). This is what you should use when encoding a value that will be placed inside a URL — a query parameter value, a form field, a redirect destination. It encodes colons, slashes, ampersands, equals signs — all of them.

💡 The rule: Use encodeURIComponent for values (query param values, form fields, anything inside a URL). Use encodeURI for a complete URL that already has its structure. When in doubt, encodeURIComponent is the safer choice.
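A quick side-by-side makes the difference concrete. The sample value below is made up for illustration; the search URL reuses the one from the paragraph above.

```javascript
// A value that contains characters with structural meaning in a URL.
const value = "tea & coffee/latte?";

console.log(encodeURI(value));          // tea%20&%20coffee/latte?
console.log(encodeURIComponent(value)); // tea%20%26%20coffee%2Flatte%3F

// Building a query string by hand: encode the value, not the whole URL.
const searchUrl = "https://storedropship.in/search?q=" + encodeURIComponent(value);
console.log(searchUrl);
// https://storedropship.in/search?q=tea%20%26%20coffee%2Flatte%3F
```

encodeURI leaves the &, / and ? untouched because it assumes they are part of the URL's structure, which is exactly why it is the wrong tool for a value.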

The Silent Bug That Breaks APIs

Here's a scenario that happens constantly in real codebases. A developer builds an API call that passes a redirect URL as a query parameter:

// Wrong: using encodeURI for a parameter value
const redirectUrl = "https://storedropship.in/dashboard?ref=email&tab=orders";
const wrongUrl = "https://api.example.com/auth?redirect=" + encodeURI(redirectUrl);

// Result:
// https://api.example.com/auth?redirect=https://storedropship.in/dashboard?ref=email&tab=orders
// The & inside the redirect value is NOT encoded!
// The API server reads: redirect=https://storedropship.in/dashboard?ref=email
// And sees: tab=orders as a second query parameter on the API URL
// Your redirect destination is silently truncated.

// Correct: using encodeURIComponent for a parameter value
const correctUrl = "https://api.example.com/auth?redirect=" + encodeURIComponent(redirectUrl);

// Result:
// https://api.example.com/auth?redirect=https%3A%2F%2Fstoredropship.in%2Fdashboard%3Fref%3Demail%26tab%3Dorders
// The entire redirect URL is encoded as a single opaque value.
// The API server correctly receives and decodes the full destination URL.

This bug is insidious because in simple cases it works fine. A typical query-string parser splits pairs on & and splits each name from its value at the first =, so a redirect URL with a single parameter such as ?ref=email usually survives encodeURI intact. The bug only appears when the value contains a character the outer URL treats as a delimiter, most often & or #. By then it can be hard to trace.
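One way to sidestep the choice entirely is to let the standard URL and URLSearchParams objects do the encoding. The sketch below assumes the same hypothetical api.example.com endpoint and a redirect value that also contains an &, the character most likely to split a value in two.

```javascript
// Let the URL API encode the value instead of concatenating strings.
const redirectTarget = "https://storedropship.in/dashboard?ref=email&tab=orders";

const authUrl = new URL("https://api.example.com/auth"); // hypothetical endpoint
authUrl.searchParams.set("redirect", redirectTarget);

console.log(authUrl.toString());
// https://api.example.com/auth?redirect=https%3A%2F%2Fstoredropship.in%2Fdashboard%3Fref%3Demail%26tab%3Dorders

// Reading the parameter back yields the full, untruncated value.
const received = new URL(authUrl.toString()).searchParams.get("redirect");
console.log(received === redirectTarget); // true
```

Because the value never passes through hand-written concatenation, there is no opportunity to pick the wrong encoding function.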

%20 vs + — The Space Encoding Confusion

If you've ever looked at a URL from a search engine, you've seen both %20 and + used for spaces in different URLs. They're not the same thing and they come from different specifications.

%20 is the correct percent-encoding of a space, defined in RFC 3986 and used by encodeURIComponent and encodeURI. It works everywhere a URL is used.

+ representing a space comes from the HTML form encoding format (application/x-www-form-urlencoded), which predates the modern URL spec. When an HTML form is submitted via GET, browsers encode the form data in this older format, using + for spaces and %XX for everything else.

The problem arises when you mix them. A URL containing + interpreted by a system expecting %20 will show a literal plus sign, not a space. If you're building an API or processing query strings in JavaScript, always encode spaces as %20 (via encodeURIComponent) and decode with decodeURIComponent. If you're dealing with form-encoded data, replace each + with a space before calling decodeURIComponent; doing it afterwards would also convert any literal plus sign that arrived as %2B.
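Here is a small sketch of the two decoding behaviours. decodeFormValue is a hypothetical helper name; the point it illustrates is that the + to space swap has to happen before percent-decoding, otherwise a literal plus sent as %2B would be turned into a space too.

```javascript
// Hypothetical helper for application/x-www-form-urlencoded values.
function decodeFormValue(raw) {
  // Swap + for space first, then percent-decode.
  return decodeURIComponent(raw.replace(/\+/g, " "));
}

console.log(decodeURIComponent("red+shoes")); // red+shoes (plus kept literally)
console.log(decodeFormValue("red+shoes"));    // red shoes
console.log(decodeFormValue("1%2B1+%3D+2"));  // 1+1 = 2

// URLSearchParams follows the form rules in both directions.
const params = new URLSearchParams({ q: "red shoes" });
console.log(params.toString());                           // q=red+shoes
console.log(new URLSearchParams("q=red+shoes").get("q")); // red shoes
```

If the query string came from an HTML form, URLSearchParams is usually the simpler choice, since it already applies the form rules.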

Indian Languages and Multilingual URLs — The Real-World Case

India has 22 officially recognised languages. Websites serving Indian audiences increasingly use Hindi, Tamil, Bengali, and other regional languages in URLs — for SEO in regional search, for user-friendly addresses, and for legal and government portals.

When a browser navigates to a URL with non-ASCII characters in the path — say, a Hindi slug like storedropship.in/ब्लॉग/ — the browser automatically percent-encodes the non-ASCII portion before sending the HTTP request. The server receives the encoded version. When the URL is displayed in the browser's address bar, modern browsers decode it back to the readable script for display purposes only.

🇮🇳 SEO implication for Indian websites: If you're building URLs for Hindi or regional language content, your server should canonicalise both the encoded and decoded forms to the same page. A user might share the decoded-looking URL from their browser bar, while a scraper or crawler might use the percent-encoded form. Both should resolve correctly and have a consistent canonical tag.

For Indian e-commerce and content sites: Store your URLs internally as percent-encoded strings. Display the decoded (readable) form to users. Use the encoded form in all server-side operations, API calls, and canonical tags. This prevents character set issues across different environments.
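As a rough sketch of that advice, the standard URL parser can reduce both spellings of an address to the same percent-encoded key. canonicalKey is a hypothetical helper name, and a real site would also have to decide how to treat trailing slashes, letter case, and query strings.

```javascript
// Hypothetical helper: map any spelling of a URL to one percent-encoded key.
function canonicalKey(input) {
  const url = new URL(input); // the parser percent-encodes non-ASCII path characters
  return url.origin + url.pathname;
}

const readable = "https://storedropship.in/ब्लॉग/";
const encoded = "https://storedropship.in/" + encodeURIComponent("ब्लॉग") + "/";

// Both forms resolve to the same stored key.
console.log(canonicalKey(readable) === canonicalKey(encoded)); // true

// Decode only when showing the path to a person.
console.log(decodeURIComponent(new URL(encoded).pathname)); // /ब्लॉग/
```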

When Decoding Goes Wrong — Malformed Sequences

Decoding is usually the simpler half of URL encoding — but it has one significant failure mode. If a percent sign appears in a URL without exactly two hexadecimal digits following it, the sequence is invalid. %GG is not valid (G is not a hex digit). A lone % at the end of a string is not valid.

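JavaScript's decodeURIComponent throws a URIError when it encounters these, and also when the percent sequences do not form valid UTF-8 (for example, a multi-byte character cut off halfway). If your code doesn't handle this exception, the error propagates up and can crash your page, your API handler, or your server process.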

This matters practically when you're processing user-supplied URLs — in a URL shortener, a redirect handler, an analytics platform, or a browser extension. Always wrap your decoding in a try-catch. If decoding fails, either reject the input with a clear error or pass the raw string through without decoding rather than producing corrupted output.
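A minimal version of that try-catch might look like the following. safeDecode and its return shape are assumptions made for this example; whether you reject the input or pass the raw string through depends on your application.

```javascript
// Hypothetical helper: decode user-supplied text without letting URIError escape.
function safeDecode(raw) {
  try {
    return { ok: true, value: decodeURIComponent(raw) };
  } catch (err) {
    if (err instanceof URIError) {
      return { ok: false, value: raw }; // hand back the raw string untouched
    }
    throw err; // anything else is a genuine bug and should surface
  }
}

console.log(safeDecode("hello%20world").value); // hello world
console.log(safeDecode("100%").ok);             // false (lone % at the end)
console.log(safeDecode("%GG").ok);              // false (G is not a hex digit)
console.log(safeDecode("%E0%A4").ok);           // false (truncated UTF-8 sequence)
```

Returning an ok flag rather than silently falling back lets the caller choose between showing an error and continuing with the raw value.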

⚠️ Security note: Double-decoding attacks occur when an application decodes a URL twice. An attacker encodes a malicious string twice. The first decode produces what looks like a safe percent-encoded string. The second decode reveals the attack. Always decode exactly once, and validate input after decoding, not before.
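The double-decoding problem is easy to reproduce. In the sketch below, looksLikeMarkup is a deliberately naive, hypothetical check that stands in for whatever validation your application performs; it is not a real defence against injection.

```javascript
const attack = "%253Cscript%253E"; // "<script>" percent-encoded twice

const once = decodeURIComponent(attack); // "%3Cscript%3E": still inert text
const twice = decodeURIComponent(once);  // "<script>": the payload appears

// Hypothetical validation step, run on a decoded value.
function looksLikeMarkup(value) {
  return /[<>]/.test(value);
}

console.log(looksLikeMarkup(once));  // false: the check passes after one decode
console.log(looksLikeMarkup(twice)); // true: a later second decode reveals the payload
```

If some later layer decodes the already-validated value again, the check on the first result has protected nothing, which is why the value should be decoded exactly once and validated in its final form.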

Practical Checklist — URL Encoding in Your Workflow

  • Always use encodeURIComponent when building query parameter values programmatically — never string concatenation without encoding.
  • When copying a URL from a browser address bar to share it, the browser typically shows the decoded form. The actual URL being used may be percent-encoded. Use a tool to verify if precision matters.
  • When reading campaign tracking URLs from ad platforms, decode them before analysing the parameters — the raw encoded form is hard to read and error-prone to edit manually.
  • In server-side code (Node.js, Python, PHP, etc.), use the language's built-in URL parsing library rather than manually handling percent-encoding. These libraries handle edge cases you'll miss doing it by hand.
  • When a URL looks "broken" with question marks, boxes, or garbled characters, the first thing to check is whether the encoding is wrong or whether the server is interpreting the encoding with a different character set than the client expected.

URL Encoding Across Languages

Hindi
URL एन्कोडिंग — वेब पर सुरक्षित अक्षर रूपांतरण
Tamil
URL குறியாக்கம் — இணையத்தில் பாதுகாப்பான எழுத்து மாற்றம்
Telugu
URL ఎన్‌కోడింగ్ — వెబ్‌లో సురక్షిత అక్షర మార్పిడి
Bengali
URL এনকোডিং — ওয়েবে নিরাপদ অক্ষর রূপান্তর
Marathi
URL एन्कोडिंग — वेबवर सुरक्षित अक्षर रूपांतरण
Gujarati
URL એન્કોડિંગ — વેબ પર સુરક્ષિત અક્ષર રૂપાંતરણ
Kannada
URL ಎನ್‌ಕೋಡಿಂಗ್ — ವೆಬ್‌ನಲ್ಲಿ ಸುರಕ್ಷಿತ ಅಕ್ಷರ ಪರಿವರ್ತನೆ
Malayalam
URL എൻകോഡിംഗ് — വെബ്ബിൽ സുരക്ഷിത അക്ഷര പരിവർത്തനം
Spanish
Codificación de URL — conversión segura de caracteres en la web
French
Encodage d'URL — conversion sécurisée des caractères sur le web
German
URL-Kodierung — sichere Zeichenkonvertierung im Web
Japanese
URLエンコード — ウェブ上での安全な文字変換
Arabic
ترميز URL — تحويل آمن للأحرف على الويب
Portuguese
Codificação de URL — conversão segura de caracteres na web
Korean
URL 인코딩 — 웹에서 안전한 문자 변환

Encode or Decode Any URL Now

Paste your URL or text and get the encoded or decoded output instantly — with Full Encode and Query String modes, right in your browser.

Open the URL Encoder Decoder →
