URL decoding
You are encouraged to solve this task according to the task description, using any language you may know.
This task (the reverse of URL encoding) is to provide a function or mechanism to convert a url-encoded string into its original unencoded form.
Example
The encoded string "http%3A%2F%2Ffoo%20bar%2F" should be reverted to the unencoded form "http://foo bar/".
Ada
<lang Ada>with AWS.URL;
with Ada.Text_IO; use Ada.Text_IO;
procedure Decode is
   Encoded : constant String := "http%3A%2F%2Ffoo%20bar%2F";
begin
   Put_Line (AWS.URL.Decode (Encoded));
end Decode;</lang>
AutoHotkey
<lang AutoHotkey>encURL := "http%3A%2F%2Ffoo%20bar%2F"
SetFormat, Integer, hex
Loop Parse, encURL
   If A_LoopField = `%
      reading := 2, read := ""
   else if reading
   {
      read .= A_LoopField, --reading
      if not reading
         out .= Chr("0x" . read)
   }
   else out .= A_LoopField
MsgBox % out ; http://foo bar/</lang>
AWK
<lang AWK>
# syntax: GAWK -f URL_DECODING.AWK
BEGIN {
    str = "http%3A%2F%2Ffoo%20bar%2F" # "http://foo bar/"
    printf("%s\n",str)
    while (match(str,/%/)) {
      L = substr(str,1,RSTART-1) # chars to left of "%"
      M = substr(str,RSTART+1,2) # 2 chars to right of "%"
      R = substr(str,RSTART+3)   # chars to right of "%xx"
      str = sprintf("%s%c%s",L,hex2dec(M),R)
    }
    printf("%s\n",str)
    exit(0)
}
function hex2dec(s,  num) {
    num = index("0123456789ABCDEF",toupper(substr(s,length(s)))) - 1
    sub(/.$/,"",s)
    return num + (length(s) ? 16*hex2dec(s) : 0)
}
</lang>
output:
http%3A%2F%2Ffoo%20bar%2F
http://foo bar/
C
<lang c>#include <stdio.h>
#include <string.h>

/* static: a plain C99 "inline" definition can fail to link on its own */
static inline int ishex(int x)
{
	return	(x >= '0' && x <= '9')	||
		(x >= 'a' && x <= 'f')	||
		(x >= 'A' && x <= 'F');
}

/* Decode s into dec (or only count if dec is 0).
   Returns the decoded length including the terminating NUL, or -1 on a bad escape. */
int decode(char *s, char *dec)
{
	char *o, *end = s + strlen(s);
	int c;

	for (o = dec; s <= end; o++) {
		c = *s++;
		if (c == '+') c = ' ';
		else if (c == '%' && (	!ishex(*s++)	||
					!ishex(*s++)	||
					!sscanf(s - 2, "%2x", &c)))
			return -1;

		if (dec) *o = c;
	}

	return o - dec;
}

int main()
{
	char url[] = "http%3A%2F%2ffoo+bar%2fabcd";
	char out[sizeof(url)];

	printf("length: %d\n", decode(url, 0));
	puts(decode(url, out) < 0 ? "bad string" : out);

	return 0;
}</lang>
C++
Using the Poco libraries, compiled with g++ -lPocoFoundation file.cpp -o file
<lang C++>#include <string>
#include "Poco/URI.h"
#include <iostream>

int main( ) {
   std::string encoded( "http%3A%2F%2Ffoo%20bar%2F" ) ;
   std::string decoded ;
   Poco::URI::decode ( encoded , decoded ) ;
   std::cout << encoded << " is decoded: " << decoded << " !" << std::endl ;
   return 0 ;
}</lang>
Output:
http%3A%2F%2Ffoo%20bar%2F is decoded: http://foo bar/ !
C#
<lang c sharp>using System;

namespace URLEncode
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            Console.WriteLine(Decode("http%3A%2F%2Ffoo%20bar%2F"));
        }

        private static string Decode(string uri)
        {
            return Uri.UnescapeDataString(uri);
        }
    }
}</lang>
Output
http://foo bar/
Clojure
<lang clojure>(java.net.URLDecoder/decode "http%3A%2F%2Ffoo%20bar%2F")</lang>
CoffeeScript
<lang coffeescript> console.log decodeURIComponent "http%3A%2F%2Ffoo%20bar%2F?name=Foo%20Barson" </lang>
<lang>> coffee foo.coffee
http://foo bar/?name=Foo Barson</lang>
D
<lang d>import std.stdio, std.uri;
void main() {
writeln(decodeComponent("http%3A%2F%2Ffoo%20bar%2F"));
}</lang>
http://foo bar/
Delphi
<lang Delphi>program URLEncoding;
{$APPTYPE CONSOLE}
uses IdURI;
begin
Writeln(TIdURI.URLDecode('http%3A%2F%2Ffoo%20bar%2F'));
end.</lang>
Go
<lang go>package main

import (
	"fmt"
	"net/url"
)

const escaped = "http%3A%2F%2Ffoo%20bar%2F"

func main() {
	if u, err := url.QueryUnescape(escaped); err == nil {
		fmt.Println(u)
	} else {
		fmt.Println(err)
	}
}</lang>
Output:
http://foo bar/
Icon and Unicon
<lang Icon>link hexcvt

procedure main()
ue := "http%3A%2F%2Ffoo%20bar%2F"
ud := decodeURL(ue) | stop("Improperly encoded string ",image(ue))
write("encoded = ",image(ue))
write("decoded = ",image(ud))
end

procedure decodeURL(s)              #: decode URL/URI encoded data
static de
initial {                           # build lookup table for everything
  de := table()
  every de[hexstring(ord(c := !string(&ascii)),2)] := c
  }

c := ""
s ? until pos(0) do                 # decode every %xx or fail
   c ||:= if ="%" then \de[move(2)] | fail
   else move(1)
return c
end</lang>
Output:
encoded = "http%3A%2F%2Ffoo%20bar%2F"
decoded = "http://foo bar/"
J
J does not have a native urldecode (until version 7 when jhs includes a jurldecode).
Here is an implementation:
<lang j>require'strings convert'
urldecode=: rplc&(;"_1&a."2(,:tolower)'%',.hfd i.#a.)</lang>
Example use:
<lang j>   urldecode 'http%3A%2F%2Ffoo%20bar%2F'
http://foo bar/</lang>
Note that a minor efficiency improvement is possible, by eliminating duplicated escape codes:
<lang j>urldecode=: rplc&(~.,/;"_1&a."2(,:tolower)'%',.hfd i.#a.)</lang>
Java
<lang java>import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class Main
{
    public static void main(String[] args) throws UnsupportedEncodingException
    {
        String encoded = "http%3A%2F%2Ffoo%20bar%2F";
        String normal = URLDecoder.decode(encoded, "utf-8");
        System.out.println(normal);
    }
}</lang>
Output:
http://foo bar/
JavaScript
<lang javascript>decodeURIComponent("http%3A%2F%2Ffoo%20bar%2F")</lang>
Liberty BASIC
<lang lb>
dim lookUp$( 256)
for i =0 to 256
    lookUp$( i) ="%" +dechex$( i)
next i

url$ ="http%3A%2F%2Ffoo%20bar%2F"

print "Supplied URL '"; url$; "'"
print "As string    '"; url2string$( url$); "'"

end

function url2string$( i$)
    for j =1 to len( i$)
        c$ =mid$( i$, j, 1)
        if c$ ="%" then
            nc$         =chr$( hexdec( mid$( i$, j +1, 2)))
            url2string$ =url2string$ +nc$
            j =j +2
        else
            url2string$ =url2string$ +c$
        end if
    next j
end function </lang>
Supplied URL 'http%3A%2F%2Ffoo%20bar%2F'
As string    'http://foo bar/'
NetRexx
<lang NetRexx>/* NetRexx */
options replace format comments java crossref savelog symbols nobinary

url = [ -
  'http%3A%2F%2Ffoo%20bar%2F', -
  'mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E', -
  '%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E' -
  ]

loop u_ = 0 to url.length - 1
  say url[u_]
  say DecodeURL(url[u_])
  say
  end u_

return

method DecodeURL(arg) public static

  Parse arg encoded
  decoded = ''
  PCT = '%'

  loop label e_ while encoded.length() > 0
    parse encoded head (PCT) +1 code +2 tail
    decoded = decoded || head
    select
      when code.strip('T').length() = 2 & code.datatype('X') then do
        code = code.x2c()
        decoded = decoded || code
        end
      when code.strip('T').length() \= 0 then do
        decoded = decoded || PCT
        tail = code || tail
        end
      otherwise do
        nop
        end
      end
    encoded = tail
    end e_

  return decoded
</lang>
Output:
http%3A%2F%2Ffoo%20bar%2F
http://foo bar/

mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E
mailto:"Ivan Aim" <ivan.aim@email.com>

%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E
mailto:"Irma User" <irma.user@mail.com>
Objective-C
<lang objc>NSString *encoded = @"http%3A%2F%2Ffoo%20bar%2F";
NSString *normal = [encoded stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSLog(@"%@", normal);</lang>
OCaml
Using the library ocamlnet from the interactive loop:
<lang ocaml>$ ocaml
# #use "topfind";;
# #require "netstring";;

# Netencoding.Url.decode "http%3A%2F%2Ffoo%20bar%2F" ;;
- : string = "http://foo bar/"</lang>
ooRexx
While the Rexx implementation shown here will also work with ooRexx, this version uses ooRexx syntax to invoke the built-in functions.
<lang ooRexx>/* Rexx */
  Do
    X = 0
    url. = ''
    X = X + 1; url.0 = X; url.X = 'http%3A%2F%2Ffoo%20bar%2F'
    X = X + 1; url.0 = X; url.X = 'mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E'
    X = X + 1; url.0 = X; url.X = '%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E'

    Do u_ = 1 to url.0
      Say url.u_
      Say DecodeURL(url.u_)
      Say
      End u_

    Return
  End
  Exit

DecodeURL: Procedure
  Do
    Parse Arg encoded
    decoded = ''
    PCT = '%'

    Do label e_ while encoded~length() > 0
      Parse Var encoded head (PCT) +1 code +2 tail
      decoded = decoded || head
      Select
        when code~strip('T')~length() = 2 & code~datatype('X') then Do
          code = code~x2c()
          decoded = decoded || code
          End
        when code~strip('T')~length() \= 0 then Do
          decoded = decoded || PCT
          tail = code || tail
          End
        otherwise Do
          Nop
          End
        End
      encoded = tail
      End e_

    Return decoded
  End
  Exit
</lang>
Output:
http%3A%2F%2Ffoo%20bar%2F
http://foo bar/

mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E
mailto:"Ivan Aim" <ivan.aim@email.com>

%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E
mailto:"Irma User" <irma.user@mail.com>
Perl
<lang Perl>#!/usr/bin/perl -w
use strict ;
use URI::Escape ;

my $encoded = "http%3A%2F%2Ffoo%20bar%2F" ;
my $unencoded = uri_unescape( $encoded ) ;
print "The unencoded string is $unencoded !\n" ;</lang>
Perl 6
<lang perl6>my $url = "http%3A%2F%2Ffoo%20bar%2F";
say $url.subst: :g,
    /'%'(<:hexdigit>**2)/,
    -> ($ord ) { chr(:16(~$ord)) }</lang>
Alternately, you can use an in-place substitution:
<lang perl6>$url ~~ s:g[ '%' (<:hexdigit> ** 2) ] = chr :16(~$0);
say $url;</lang>
PHP
<lang php><?php
$encoded = "http%3A%2F%2Ffoo%20bar%2F";
$unencoded = rawurldecode($encoded);
echo "The unencoded string is $unencoded !\n";
?></lang>
PicoLisp
: (ht:Pack (chop "http%3A%2F%2Ffoo%20bar%2F"))
-> "http://foo bar/"
PureBasic
<lang PureBasic>URL$ = URLDecoder("http%3A%2F%2Ffoo%20bar%2F")
Debug URL$ ; http://foo bar/</lang>
Python
<lang Python># Python 3; in Python 2 the same function was urllib.unquote
from urllib.parse import unquote
print(unquote("http%3A%2F%2Ffoo%20bar%2F"))</lang>
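For comparison with the library call, the decoding itself can be sketched by hand. The helper below is a minimal illustration, not a library API: the name url_decode and its parameters plus_as_space and encoding are assumptions made for this example. It collects the decoded bytes first and converts them to text only at the end, so multi-byte UTF-8 sequences such as %C3%A9 come out as a single character.

```python
def url_decode(text, plus_as_space=False, encoding="utf-8"):
    """Decode %XX escapes by hand (hypothetical helper, not a library API)."""
    hex_digits = "0123456789abcdefABCDEF"
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "%":
            # A valid escape is "%" followed by exactly two hex digits.
            pair = text[i + 1:i + 3]
            if len(pair) != 2 or any(c not in hex_digits for c in pair):
                raise ValueError("malformed escape at index %d" % i)
            out.append(int(pair, 16))
            i += 3
        elif ch == "+" and plus_as_space:
            # Form-style encoding writes a space as "+".
            out.append(0x20)
            i += 1
        else:
            # Ordinary characters are copied through as bytes.
            out.extend(ch.encode(encoding))
            i += 1
    # Interpret the collected bytes as text in one step.
    return out.decode(encoding)


print(url_decode("http%3A%2F%2Ffoo%20bar%2F"))
```

By default a "+" is left untouched, as the task's example uses %20 for the space; passing plus_as_space=True gives the form-style behaviour that the standard library provides as urllib.parse.unquote_plus.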
Retro
This is provided by the casket library (used for web app development).
<lang Retro>create buffer 32000 allot
{{
  create bit 5 allot
  : extract  ( $c-$a ) drop @+ bit ! @+ bit 1+ ! bit ;
  : render   ( $c-$n )
    dup '+ = [ drop 32 ] ifTrue
    dup 13 = [ drop 32 ] ifTrue
    dup 10 = [ drop 32 ] ifTrue
    dup '% = [ extract hex toNumber decimal ] ifTrue ;
  : <decode> ( $-$ ) repeat @+ 0; render ^buffer'add again ;
---reveal---
: decode ( $- ) buffer ^buffer'set <decode> drop ;
}}
"http%3A%2F%2Ffoo%20bar%2F" decode buffer puts</lang>
REXX
Tested with the ooRexx and Regina interpreters.
<lang REXX>/* Rexx */
  Do
    X = 0
    url. = ''
    X = X + 1; url.0 = X; url.X = 'http%3A%2F%2Ffoo%20bar%2F'
    X = X + 1; url.0 = X; url.X = 'mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E'
    X = X + 1; url.0 = X; url.X = '%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E'

    Do u_ = 1 to url.0
      Say url.u_
      Say DecodeURL(url.u_)
      Say
      End u_

    Return
  End
  Exit

DecodeURL: Procedure
  Do
    Parse Arg encoded
    decoded = ''
    PCT = '%'

    Do while length(encoded) > 0
      Parse Var encoded head (PCT) +1 code +2 tail
      decoded = decoded || head
      Select
        When length(strip(code, 'T')) = 2 & datatype(code, 'X') then Do
          code = x2c(code)
          decoded = decoded || code
          End
        When length(strip(code, 'T')) \= 0 then Do
          decoded = decoded || PCT
          tail = code || tail
          End
        Otherwise Do
          Nop
          End
        End
      encoded = tail
      End

    Return decoded
  End
  Exit
</lang>
Output:
http%3A%2F%2Ffoo%20bar%2F
http://foo bar/

mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E
mailto:"Ivan Aim" <ivan.aim@email.com>

%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E
mailto:"Irma User" <irma.user@mail.com>
Ruby
Use any one of CGI.unescape or URI.decode_www_form_component. These methods also convert "+" to " ".
<lang ruby>require 'cgi'
puts CGI.unescape("http%3A%2F%2Ffoo%20bar%2F")
# => "http://foo bar/"</lang>
<lang ruby>require 'uri'
puts URI.decode_www_form_component("http%3A%2F%2Ffoo%20bar%2F")
# => "http://foo bar/"</lang>
URI.unescape (alias URI.unencode) still works in older Ruby versions. It has been obsolete since Ruby 1.9.2 because of problems with its sibling URI.escape, and both were removed in Ruby 3.0.
Scala
<lang scala>import java.net._
val encoded="http%3A%2F%2Ffoo%20bar%2F"
val decoded=URLDecoder.decode(encoded, "UTF-8")
println(decoded) // -> http://foo bar/</lang>
Seed7
The library encoding.s7i defines functions to handle URL encoding and percent-encoding. The function fromPercentEncoded decodes a percent-encoded string. The function fromUrlEncoded works like fromPercentEncoded and additionally decodes '+' to a space. Both functions return byte sequences. To decode Unicode characters, the result must afterwards be converted from UTF-8 with utf8ToStri.
<lang seed7>$ include "seed7_05.s7i";
  include "encoding.s7i";

const proc: main is func
  begin
    writeln(fromPercentEncoded("http%3A%2F%2Ffoo%20bar%2F"));
    writeln(fromUrlEncoded("http%3A%2F%2Ffoo+bar%2F"));
  end func;</lang>
Output:
http://foo bar/
http://foo bar/
Tcl
This code is careful to ensure that any untoward metacharacters in the input string still do not cause any problems.
<lang tcl>proc urlDecode {str} {
    set specialMap {"[" "%5B" "]" "%5D"}
    set seqRE {%([0-9a-fA-F]{2})}
    set replacement {[format "%c" [scan "\1" "%2x"]]}
    set modStr [regsub -all $seqRE [string map $specialMap $str] $replacement]
    return [encoding convertfrom utf-8 [subst -nobackslash -novariable $modStr]]
}</lang>
Demonstrating:
<lang tcl>puts [urlDecode "http%3A%2F%2Ffoo%20bar%2F"]</lang>
Output:
http://foo bar/
TUSCRIPT
<lang tuscript>
$$ MODE TUSCRIPT
url_encoded="http%3A%2F%2Ffoo%20bar%2F"
BUILD S_TABLE hex=":%><:><2<>2<%:"
hex=STRINGS (url_encoded,hex), hex=SPLIT(hex)
hex=DECODE (hex,hex)
url_decoded=SUBSTITUTE(url_encoded,":%><2<>2<%:",0,0,hex)
PRINT "encoded: ", url_encoded
PRINT "decoded: ", url_decoded
</lang>
Output:
encoded: http%3A%2F%2Ffoo%20bar%2F
decoded: http://foo bar/