URL decoding

From Rosetta Code
Revision as of 03:11, 28 July 2011 by 208.80.119.67 (talk) (added objective-c)

This task (the reverse of URL encoding) is to provide a function or mechanism that converts a URL-encoded string into its original unencoded form.

Example

The encoded string "http%3A%2F%2Ffoo%20bar%2F" should be reverted to the unencoded form "http://foo bar/".
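As a rough illustration of what the task asks for, the following Python sketch scans the input and turns each `%XX` escape into the byte it names. The function name `url_decode` is hypothetical, and two policies are assumptions rather than part of the task description: malformed escapes are copied through unchanged, and the resulting bytes are interpreted as UTF-8.

```python
def url_decode(encoded: str) -> str:
    """Decode %XX escapes; malformed escapes are copied through unchanged (assumed policy)."""
    hex_digits = b"0123456789abcdefABCDEF"
    raw = encoded.encode("utf-8")
    out = bytearray()
    i = 0
    while i < len(raw):
        pair = raw[i + 1:i + 3]  # the two bytes following the current one
        if raw[i] == ord("%") and len(pair) == 2 and all(b in hex_digits for b in pair):
            out.append(int(pair.decode("ascii"), 16))  # %XX -> one byte
            i += 3
        else:
            out.append(raw[i])  # ordinary byte, or a '%' without two hex digits
            i += 1
    # Assumption: the decoded bytes are UTF-8 text.
    return out.decode("utf-8", errors="replace")


print(url_decode("http%3A%2F%2Ffoo%20bar%2F"))  # http://foo bar/
```

Working on bytes rather than characters means a multi-byte sequence such as `%C3%A9` is reassembled into a single character before the final decode.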

Delphi

<lang Delphi>program URLEncoding;

{$APPTYPE CONSOLE}

uses IdURI;

begin

 Writeln(TIdURI.URLDecode('http%3A%2F%2Ffoo%20bar%2F'));

end.</lang>

Go

<lang go>package main

import (
	"fmt"
	"net/url"
	"os"
)

func main() {
	// QueryUnescape turns each %XX escape into its byte (and "+" into a space).
	decoded, err := url.QueryUnescape("http%3A%2F%2Ffoo%20bar%2F")
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	fmt.Println(decoded)
}</lang>

Icon and Unicon

<lang Icon>link hexcvt

procedure main()
   ue := "http%3A%2F%2Ffoo%20bar%2F"
   ud := decodeURL(ue) | stop("Improperly encoded string ",image(ue))
   write("encoded = ",image(ue))
   write("decoded = ",image(ud))
end

procedure decodeURL(s)              #: decode URL/URI encoded data
   static de
   initial {                        # build lookup table for everything
      de := table()
      every de[hexstring(ord(c := !string(&ascii)),2)] := c
      }

   c := ""
   s ? until pos(0) do              # decode every %xx or fail
      c ||:= if ="%" then \de[move(2)] | fail
             else move(1)
   return c
end</lang>

hexcvt provides hexstring

Output:

encoded = "http%3A%2F%2Ffoo%20bar%2F"
decoded = "http://foo bar/"

J

J does not have a native urldecode (until version 7 when jhs includes a jurldecode).

Here is an implementation:

<lang j>require'strings convert'
urldecode=: rplc&(;"_1&a."2(,:tolower)'%',.hfd i.#a.)</lang>

Example use:

<lang j>   urldecode 'http%3A%2F%2Ffoo%20bar%2F'
http://foo bar/</lang>

Note that a minor efficiency improvement is possible by eliminating duplicated escape codes:

<lang j>urldecode=: rplc&(~.,/;"_1&a."2(,:tolower)'%',.hfd i.#a.)</lang>
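The duplicates arise because escape codes made only of decimal digits (such as `%20`) look the same in upper and lower case. A rough Python sketch of the same table-building idea follows. The names `upper`, `lower`, `table` and `urldecode` are illustrative only, and this is not a translation of the J code.

```python
import re

# '%XX' -> character for all 256 byte values, once in upper and once in lower case.
# Like a byte table, values above 127 map to a single character (no UTF-8 decoding).
upper = {"%{:02X}".format(n): chr(n) for n in range(256)}
lower = {"%{:02x}".format(n): chr(n) for n in range(256)}

pairs = list(upper.items()) + list(lower.items())  # both tables, with repeats
table = dict(pairs)                                # a dict keeps each key once

print(len(pairs), len(table))  # 512 412


def urldecode(s: str) -> str:
    # Look each escape up in the table; unknown (mixed-case) escapes stay as they are.
    return re.sub(r"%[0-9A-Fa-f]{2}", lambda m: table.get(m.group(0), m.group(0)), s)


print(urldecode("http%3A%2F%2Ffoo%20bar%2F"))  # http://foo bar/
```

The 100 codes whose two hex digits are both in the range 0 to 9 account for the difference between the two counts.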

NetRexx

<lang NetRexx>/* NetRexx */
options replace format comments java crossref savelog symbols nobinary

url = [ -

 'http%3A%2F%2Ffoo%20bar%2F', -
 'mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E', -
 '%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E' -
 ]

loop u_ = 0 to url.length - 1

 say url[u_]
 say DecodeURL(url[u_])
 say
 end u_

return

method DecodeURL(arg) public static

 Parse arg encoded
 decoded = ''
 PCT = '%'
 loop label e_ while encoded.length() > 0
   parse encoded head (PCT) +1 code +2 tail
   decoded = decoded || head
   select
     when code.strip('T').length() = 2 & code.datatype('X') then do
       code = code.x2c()
       decoded = decoded || code
       end
     when code.strip('T').length() \= 0 then do
       decoded = decoded || PCT
       tail = code || tail
       end
     otherwise do
       nop
       end
     end
   encoded = tail
   end e_
 return decoded

</lang>

Output:

http%3A%2F%2Ffoo%20bar%2F
http://foo bar/

mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E
mailto:"Ivan Aim" <ivan.aim@email.com>

%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E
mailto:"Irma User" <irma.user@mail.com>

Objective-C

Works with: Cocoa version Mac OS X 10.3+

<lang objc>NSString *encoded = @"http%3A%2F%2Ffoo%20bar%2F";
NSString *normal = [encoded stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSLog(@"%@", normal);</lang>

Perl

<lang Perl>#!/usr/bin/perl -w
use strict ;
use URI::Escape ;

my $encoded = "http%3A%2F%2Ffoo%20bar%2F" ;
my $unencoded = uri_unescape( $encoded ) ;
print "The unencoded string is $unencoded !\n" ;</lang>

Perl 6

<lang Perl 6>use v6;

my $url = "http%3A%2F%2Ffoo%20bar%2F";

my regex url {
    [ <text=&text> [\% <hex=&hex>]+ ]+ <text2=&text>?
}

my regex hex {
    \w\w
}

my regex text {
    \w+
}

$url ~~ /<url=&url>/;

my $dec_url;
for $<url>.caps {
    if .key eq "hex" {
        $dec_url ~= :10("0x" ~ .value).chr;
    } else {
        $dec_url ~= .value;
    }
}

say $dec_url;</lang>

PHP

<lang php><?php
$encoded = "http%3A%2F%2Ffoo%20bar%2F";
$unencoded = rawurldecode($encoded);
echo "The unencoded string is $unencoded !\n";
?></lang>

PicoLisp

: (ht:Pack (chop "http%3A%2F%2Ffoo%20bar%2F"))
-> "http://foo bar/"

PureBasic

<lang PureBasic>URL$ = URLDecoder("http%3A%2F%2Ffoo%20bar%2F")

Debug URL$  ; http://foo bar/</lang>

Python

<lang Python># Python 3: unquote lives in urllib.parse (it was urllib.unquote in Python 2)
from urllib.parse import unquote
print(unquote("http%3A%2F%2Ffoo%20bar%2F"))</lang>

Retro

This is provided by the casket library (used for web app development).

<lang Retro>create buffer 32000 allot

{{

 create bit 5 allot
 : extract  ( $c-$a ) drop @+ bit ! @+ bit 1+ ! bit ;
 : render   ( $c-$n )
   dup '+ = [ drop 32 ] ifTrue
   dup 13 = [ drop 32 ] ifTrue
   dup 10 = [ drop 32 ] ifTrue
   dup '% = [ extract hex toNumber decimal ] ifTrue ;
 : <decode> (  $-$  ) repeat @+ 0; render ^buffer'add again ;

---reveal---

 : decode   (  $-   ) buffer ^buffer'set <decode> drop ;

}}

"http%3A%2F%2Ffoo%20bar%2F" decode buffer puts</lang>

REXX

<lang REXX>/* Rexx */

Do

 X = 0
 url. = ''
 X = X + 1; url.0 = X; url.X = 'http%3A%2F%2Ffoo%20bar%2F'
 X = X + 1; url.0 = X; url.X = 'mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E'
 X = X + 1; url.0 = X; url.X = '%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E'
 Do u_ = 1 to url.0
   Say url.u_
   Say DecodeURL(url.u_)
   Say
   End u_
 Return

End
Exit

DecodeURL: Procedure
Do

 Parse Arg encoded
 decoded = ''
 PCT = '%'
 Do label e_ while encoded~length() > 0
   Parse Var encoded head (PCT) +1 code +2 tail
   decoded = decoded || head
   Select
     when code~strip('T')~length() = 2 & code~datatype('X') then Do
       code = code~x2c()
       decoded = decoded || code
       End
     when code~strip('T')~length() \= 0 then Do
       decoded = decoded || PCT
       tail = code || tail
       End
     otherwise Do
       Nop
       End
     End
   encoded = tail
   End e_
 Return decoded

End
Exit
</lang>

Output:

http%3A%2F%2Ffoo%20bar%2F
http://foo bar/

mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E
mailto:"Ivan Aim" <ivan.aim@email.com>

%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E
mailto:"Irma User" <irma.user@mail.com>

Scala

<lang scala>import java.net._
val encoded="http%3A%2F%2Ffoo%20bar%2F"
val decoded=URLDecoder.decode(encoded, "UTF-8")
println(decoded) // -> http://foo bar/</lang>

Tcl

This code takes care that metacharacters in the input string (such as square brackets) cannot cause problems when the decoded text is substituted.

<lang tcl>proc urlDecode {str} {

   set specialMap {"[" "%5B" "]" "%5D"}
   set seqRE {%([0-9a-fA-F]{2})}
   set replacement {[format "%c" [scan "\1" "%2x"]]}
   set modStr [regsub -all $seqRE [string map $specialMap $str] $replacement]
   return [encoding convertfrom utf-8 [subst -nobackslash -novariable $modStr]]

}</lang>

Demonstrating:

<lang tcl>puts [urlDecode "http%3A%2F%2Ffoo%20bar%2F"]</lang>

Output:

http://foo bar/
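The Tcl solution rewrites each escape into a command and then evaluates the result with `subst`, which is why brackets in the input have to be neutralised first. For comparison, here is a sketch in Python that replaces each escape through a callback, so no part of the input is ever evaluated. The name `url_decode_re` is hypothetical, and the final UTF-8 decode mirrors the `encoding convertfrom utf-8` step.

```python
import re

# Matches one percent escape and captures its two hex digits (as bytes).
_ESCAPE = re.compile(rb"%([0-9a-fA-F]{2})")


def url_decode_re(text: str) -> str:
    raw = text.encode("utf-8")
    # The callback returns the single byte named by the escape; its result is
    # inserted literally, so brackets or dollar signs in the input are inert.
    decoded = _ESCAPE.sub(lambda m: bytes([int(m.group(1).decode("ascii"), 16)]), raw)
    return decoded.decode("utf-8")


print(url_decode_re("http%3A%2F%2Ffoo%20bar%2F"))  # http://foo bar/
print(url_decode_re("%5Bexpr%201%5D%20$x"))        # [expr 1] $x
```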

TUSCRIPT

<lang tuscript>
$$ MODE TUSCRIPT
url_encoded="http%3A%2F%2Ffoo%20bar%2F"
BUILD S_TABLE hex=":%><:><2<>2<%:"
hex=STRINGS (url_encoded,hex), hex=SPLIT(hex)
hex=DECODE (hex,hex)
url_decoded=SUBSTITUTE(url_encoded,":%><2<>2<%:",0,0,hex)
PRINT "encoded: ", url_encoded
PRINT "decoded: ", url_decoded
</lang>

Output:

encoded: http%3A%2F%2Ffoo%20bar%2F
decoded: http://foo bar/