URL decoding

From Rosetta Code
Revision as of 21:49, 18 July 2011 by Ulrie (talk | contribs)
Task
URL decoding
You are encouraged to solve this task according to the task description, using any language you may know.

This task (the reverse of URL encoding) is to provide a function or mechanism to convert a url-encoded string into its original unencoded form.

Example

The encoded string "http%3A%2F%2Ffoo%20bar%2F" should be reverted to the unencoded form "http://foo bar/".

Icon and Unicon

<lang Icon>link hexcvt

procedure main() ue := "http%3A%2F%2Ffoo%20bar%2F" ud := decodeURL(ue) | stop("Improperly encoded string ",image(ue)) write("encoded = ",image(ue)) write("decoded = ",image(ue)) end

procedure decodeURL(s) #: decode URL/URI encoded data static de initial { # build lookup table for everything

 de := table()
 every de[hexstring(ord(c := !string(&ascii)),2)] := c
 }

c := "" s ? until pos(0) do # decode every %xx or fail

  c ||:= if ="%" then \de[move(2)] | fail
  else move(1)

return c end</lang>

hexcvt provides hexstring

Output:

encoded = "http%3A%2F%2Ffoo%20bar%2F"
decoded = "http://foo bar/"

J

J does not have a native urldecode (until version 7 when jhs includes a jurldecode).

Here is an implementation:

<lang j>require'strings convert' urldecode=: rplc&(;"_1&a."2(,:tolower)'%',.hfd i.#a.)</lang>

Example use:

<lang j> urldecode 'http%3A%2F%2Ffoo%20bar%2F' http://foo bar/</lang>

Note that a minor efficiency improvement is possible, by eliminating duplicated escape codes: <lang j>urldecode=: rplc&(~.,/;"_1&a."2(,:tolower)'%',.hfd i.#a.)</lang>

Perl

<lang Perl>#!/usr/bin/perl -w use strict ; use URI::Escape ;

my $encoded = "http%3A%2F%2Ffoo%20bar%2F" ; my $unencoded = uri_unescape( $encoded ) ; print "The unencoded string is $unencoded !\n" ;</lang>

Perl 6

<lang Perl 6>use v6;

my $url = "http%3A%2F%2Ffoo%20bar%2F";

my regex url {

   [ <text=&text> [\% <hex=&hex>]+ ]+ <text2=&text>?

}

my regex hex {

   \w\w

}

my regex text {

   \w+

}

$url ~~ /<url=&url>/;

my $dec_url; for $<url>.caps {

   if .key eq "hex"
   {

$dec_url ~= :10("0x" ~ .value).chr;

   } else {

$dec_url ~= .value;

   }

}

say $dec_url;</lang>

PicoLisp

: (ht:Pack (chop "http%3A%2F%2Ffoo%20bar%2F"))
-> "http://foo bar/"

PureBasic

<lang PureBasic>URL$ = URLDecoder("http%3A%2F%2Ffoo%20bar%2F")

Debug URL$  ; http://foo bar/</lang>

Retro

This is provided by the casket library (used for web app development).

<lang Retro>create buffer 32000 allot

{{

 create bit 5 allot
 : extract  ( $c-$a ) drop @+ bit ! @+ bit 1+ ! bit ;
 : render   ( $c-$n )
   dup '+ = [ drop 32 ] ifTrue
   dup 13 = [ drop 32 ] ifTrue
   dup 10 = [ drop 32 ] ifTrue
   dup '% = [ extract hex toNumber decimal ] ifTrue ;
 : <decode> (  $-$  ) repeat @+ 0; render ^buffer'add again ;

---reveal---

 : decode   (  $-   ) buffer ^buffer'set <decode> drop ;

}}

"http%3A%2F%2Ffoo%20bar%2F" decode buffer puts</lang>

Scala

<lang scala>import java.net._ val encoded="http%3A%2F%2Ffoo%20bar%2F" val decoded=URLDecoder.decode(encoded, "UTF-8") println(decoded) // -> http://foo bar/</lang>

Tcl

This code is careful to ensure that any untoward metacharacters in the input string still do not cause any problems. <lang tcl>proc urlDecode {str} {

   set specialMap {"[" "%5B" "]" "%5D"}
   set seqRE {%([0-9a-fA-F]{2})}
   set replacement {[format "%c" [scan "\1" "%2x"]]}
   set modStr [regsub -all $seqRE [string map $specialMap $str] $replacement]
   return [encoding convertfrom utf-8 [subst -nobackslash -novariable $modStr]]

}</lang> Demonstrating: <lang tcl>puts [urlDecode "http%3A%2F%2Ffoo%20bar%2F"]</lang> Output:

http://foo bar/

TUSCRIPT

<lang tuscript> $$ MODE TUSCRIPT url_encoded="http%3A%2F%2Ffoo%20bar%2F" BUILD S_TABLE hex=":%><:><2<>2<%:" hex=STRINGS (url_encoded,hex), hex=SPLIT(hex) hex=DECODE (hex,hex) url_decoded=SUBSTITUTE(url_encoded,":%><2<>2<%:",0,0,hex) PRINT "encoded: ", url_encoded PRINT "decoded: ", url_decoded </lang> Output:

encoded: http%3A%2F%2Ffoo%20bar%2F
decoded: http://foo bar/