URL decoding: Difference between revisions

From Rosetta Code

Revision as of 23:53, 1 November 2011

Task
URL decoding
You are encouraged to solve this task according to the task description, using any language you may know.

This task (the reverse of URL encoding) is to provide a function or mechanism to convert a url-encoded string into its original unencoded form.

Example

The encoded string "http%3A%2F%2Ffoo%20bar%2F" should be reverted to the unencoded form "http://foo bar/".
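As a language-neutral illustration of the mechanism, here is a minimal Python 3 sketch that decodes by hand instead of calling a library. The function name `percent_decode` is hypothetical, and two choices in it are assumptions rather than part of the task: malformed escapes are copied through unchanged, and the decoded bytes are interpreted as UTF-8.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"


def percent_decode(text, encoding="utf-8"):
    """Replace each %XX escape with the byte it names, then decode the bytes.

    Hypothetical helper for illustration; assumes the encoded input is ASCII.
    """
    raw = text.encode("ascii")
    out = bytearray()
    i = 0
    while i < len(raw):
        pair = raw[i + 1:i + 3]
        # A valid escape is '%' followed by exactly two hexadecimal digits.
        if raw[i:i + 1] == b"%" and len(pair) == 2 and all(b in HEX_DIGITS for b in pair):
            out.append(int(pair, 16))
            i += 3
        else:
            # Ordinary character, or a malformed escape left as-is (assumption).
            out.append(raw[i])
            i += 1
    return out.decode(encoding)


print(percent_decode("http%3A%2F%2Ffoo%20bar%2F"))  # http://foo bar/
```

Collecting bytes first and decoding once at the end lets multi-byte UTF-8 sequences such as `%C3%A9` come out as a single character.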

Ada

Library: AWS

<lang Ada>with AWS.URL;
with Ada.Text_IO; use Ada.Text_IO;

procedure Decode is
   Encoded : constant String := "http%3A%2F%2Ffoo%20bar%2F";
begin
   Put_Line (AWS.URL.Decode (Encoded));
end Decode;</lang>

AutoHotkey

<lang AutoHotkey>encURL := "http%3A%2F%2Ffoo%20bar%2F"
SetFormat, Integer, hex
Loop Parse, encURL
   If A_LoopField = `%
      reading := 2, read := ""
   else if reading
   {
      read .= A_LoopField, --reading
      if not reading
         out .= Chr("0x" . read)
   }
   else out .= A_LoopField
MsgBox % out ; http://foo bar/
</lang>

AWK

<lang AWK>
# syntax: GAWK -f URL_DECODING.AWK
BEGIN {
    str = "http%3A%2F%2Ffoo%20bar%2F" # "http://foo bar/"
    printf("%s\n",str)
    while (match(str,/%/)) {
      L = substr(str,1,RSTART-1) # chars to left of "%"
      M = substr(str,RSTART+1,2) # 2 chars to right of "%"
      R = substr(str,RSTART+3)   # chars to right of "%xx"
      str = sprintf("%s%c%s",L,hex2dec(M),R)
    }
    printf("%s\n",str)
    exit(0)
}
function hex2dec(s,  num) {
    num = index("0123456789ABCDEF",toupper(substr(s,length(s)))) - 1
    sub(/.$/,"",s)
    return num + (length(s) ? 16*hex2dec(s) : 0)
}
</lang>

output:

http%3A%2F%2Ffoo%20bar%2F
http://foo bar/

C

<lang c>#include <stdio.h>
#include <string.h>

/* static: a plain C99 "inline" function provides no external definition to link against */
static inline int ishex(int x)
{
	return	(x >= '0' && x <= '9') ||
		(x >= 'a' && x <= 'f') ||
		(x >= 'A' && x <= 'F');
}

/* Decode s into dec; pass dec = 0 to only count the bytes (terminating NUL included).
   Returns -1 on a malformed %XX escape. */
int decode(char *s, char *dec)
{
	char *o, *end = s + strlen(s);
	int c;

	for (o = dec; s <= end; o++) {
		c = *s++;
		if (c == '+') c = ' ';
		else if (c == '%' && (	!ishex(*s++) ||
					!ishex(*s++) ||
					!sscanf(s - 2, "%2x", &c)))
			return -1;

		if (dec) *o = c;
	}

	return o - dec;
}

int main()
{
	char url[] = "http%3A%2F%2ffoo+bar%2fabcd";
	char out[sizeof(url)];

	printf("length: %d\n", decode(url, 0));
	puts(decode(url, out) < 0 ? "bad string" : out);

	return 0;
}</lang>

C#

<lang c sharp>using System;

namespace URLEncode {

   internal class Program
   {
       private static void Main(string[] args)
       {
           Console.WriteLine(Decode("http%3A%2F%2Ffoo%20bar%2F"));
       }
       private static string Decode(string uri)
       {
           return Uri.UnescapeDataString(uri);
       }
   }

}</lang>

Output

http://foo bar/

Delphi

<lang Delphi>program URLEncoding;

{$APPTYPE CONSOLE}

uses IdURI;

begin

 Writeln(TIdURI.URLDecode('http%3A%2F%2Ffoo%20bar%2F'));

end.</lang>

Go

<lang go>package main

import (
    "fmt"
    "net/url" // the url package has lived under net/ since Go 1
)

const escaped = "http%3A%2F%2Ffoo%20bar%2F"

func main() {

   if u, err := url.QueryUnescape(escaped); err == nil {
       fmt.Println(u)
   } else {
       fmt.Println(err)
   }

}</lang>

Icon and Unicon

<lang Icon>link hexcvt

procedure main()
ue := "http%3A%2F%2Ffoo%20bar%2F"
ud := decodeURL(ue) | stop("Improperly encoded string ",image(ue))
write("encoded = ",image(ue))
write("decoded = ",image(ud))
end

procedure decodeURL(s)              #: decode URL/URI encoded data
static de
initial {                           # build lookup table for everything
  de := table()
  every de[hexstring(ord(c := !string(&ascii)),2)] := c
  }

c := ""
s ? until pos(0) do                 # decode every %xx or fail
   c ||:= if ="%" then \de[move(2)] | fail
   else move(1)
return c
end</lang>

hexcvt provides hexstring

Output:

encoded = "http%3A%2F%2Ffoo%20bar%2F"
decoded = "http://foo bar/"

J

J does not have a native urldecode (until version 7 when jhs includes a jurldecode).

Here is an implementation:

<lang j>require'strings convert'
urldecode=: rplc&(;"_1&a."2(,:tolower)'%',.hfd i.#a.)</lang>

Example use:

<lang j>   urldecode 'http%3A%2F%2Ffoo%20bar%2F'
http://foo bar/</lang>

Note that a minor efficiency improvement is possible, by eliminating duplicated escape codes:

<lang j>urldecode=: rplc&(~.,/;"_1&a."2(,:tolower)'%',.hfd i.#a.)</lang>

Java

<lang java>import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class Main {

   public static void main(String[] args) throws UnsupportedEncodingException
   {
       String encoded = "http%3A%2F%2Ffoo%20bar%2F";
       String normal = URLDecoder.decode(encoded, "utf-8");
       System.out.println(normal);
   }

}</lang>

Output:

http://foo bar/

JavaScript

<lang javascript>decodeURIComponent("http%3A%2F%2Ffoo%20bar%2F")</lang>

NetRexx

<lang NetRexx>/* NetRexx */
options replace format comments java crossref savelog symbols nobinary

url = [ -

 'http%3A%2F%2Ffoo%20bar%2F', -
 'mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E', -
 '%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E' -
 ]

loop u_ = 0 to url.length - 1

 say url[u_]
 say DecodeURL(url[u_])
 say
 end u_

return

method DecodeURL(arg) public static

 Parse arg encoded
 decoded = ''
 PCT = '%'
 loop label e_ while encoded.length() > 0
   parse encoded head (PCT) +1 code +2 tail
   decoded = decoded || head
   select
     when code.strip('T').length() = 2 & code.datatype('X') then do
       code = code.x2c()
       decoded = decoded || code
       end
     when code.strip('T').length() \= 0 then do
       decoded = decoded || PCT
       tail = code || tail
       end
     otherwise do
       nop
       end
     end
   encoded = tail
   end e_
 return decoded

</lang>

Output:

http%3A%2F%2Ffoo%20bar%2F
http://foo bar/

mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E
mailto:"Ivan Aim" <ivan.aim@email.com>

%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E
mailto:"Irma User" <irma.user@mail.com>

Objective-C

Works with: Cocoa version Mac OS X 10.3+

<lang objc>NSString *encoded = @"http%3A%2F%2Ffoo%20bar%2F";
NSString *normal = [encoded stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSLog(@"%@", normal);</lang>

Perl

<lang Perl>#!/usr/bin/perl -w
use strict ;
use URI::Escape ;

my $encoded = "http%3A%2F%2Ffoo%20bar%2F" ;
my $unencoded = uri_unescape( $encoded ) ;
print "The unencoded string is $unencoded !\n" ;</lang>

Perl 6

<lang Perl 6>use v6;

my $url = "http%3A%2F%2Ffoo%20bar%2F";

my regex url {

   [ <text=&text> [\% <hex=&hex>]+ ]+ <text2=&text>?

}

my regex hex {

   \w\w

}

my regex text {

   \w+

}

$url ~~ /<url=&url>/;

my $dec_url; for $<url>.caps {

   if .key eq "hex"
   {

$dec_url ~= :10("0x" ~ .value).chr;

   } else {

$dec_url ~= .value;

   }

}

say $dec_url;</lang>

PHP

<lang php><?php
$encoded = "http%3A%2F%2Ffoo%20bar%2F";
$unencoded = rawurldecode($encoded);
echo "The unencoded string is $unencoded !\n";
?></lang>

PicoLisp

: (ht:Pack (chop "http%3A%2F%2Ffoo%20bar%2F"))
-> "http://foo bar/"

PureBasic

<lang PureBasic>URL$ = URLDecoder("http%3A%2F%2Ffoo%20bar%2F")

Debug URL$  ; http://foo bar/</lang>

Python

<lang Python>import urllib
print urllib.unquote("http%3A%2F%2Ffoo%20bar%2F")</lang>
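The example above is written for Python 2. As a hedged sketch of the Python 3 equivalent: the function moved to `urllib.parse`, and the related `unquote_plus` additionally turns "+" into a space, which matters for form-encoded data. The variable names are illustrative only.

```python
from urllib.parse import unquote, unquote_plus

encoded = "http%3A%2F%2Ffoo%20bar%2F"

# unquote decodes %XX escapes and leaves "+" untouched.
decoded = unquote(encoded)
print(decoded)  # http://foo bar/

# unquote_plus also maps "+" to a space (form-style encoding).
form_decoded = unquote_plus("http%3A%2F%2Ffoo+bar%2F")
print(form_decoded)  # http://foo bar/
```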

Retro

This is provided by the casket library (used for web app development).

<lang Retro>create buffer 32000 allot

{{

 create bit 5 allot
 : extract  ( $c-$a ) drop @+ bit ! @+ bit 1+ ! bit ;
 : render   ( $c-$n )
   dup '+ = [ drop 32 ] ifTrue
   dup 13 = [ drop 32 ] ifTrue
   dup 10 = [ drop 32 ] ifTrue
   dup '% = [ extract hex toNumber decimal ] ifTrue ;
 : <decode> (  $-$  ) repeat @+ 0; render ^buffer'add again ;

---reveal---

 : decode   (  $-   ) buffer ^buffer'set <decode> drop ;

}}

"http%3A%2F%2Ffoo%20bar%2F" decode buffer puts</lang>

REXX

<lang REXX>/* Rexx */

Do

 X = 0
 url. = ''
 X = X + 1; url.0 = X; url.X = 'http%3A%2F%2Ffoo%20bar%2F'
 X = X + 1; url.0 = X; url.X = 'mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E'
 X = X + 1; url.0 = X; url.X = '%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E'
 Do u_ = 1 to url.0
   Say url.u_
   Say DecodeURL(url.u_)
   Say
   End u_
 Return

End
Exit

DecodeURL:
Procedure
Do
 Parse Arg encoded
 decoded = ''
 PCT = '%'
 Do label e_ while encoded~length() > 0
   Parse Var encoded head (PCT) +1 code +2 tail
   decoded = decoded || head
   Select
     when code~strip('T')~length() = 2 & code~datatype('X') then Do
       code = code~x2c()
       decoded = decoded || code
       End
     when code~strip('T')~length() \= 0 then Do
       decoded = decoded || PCT
       tail = code || tail
       End
     otherwise Do
       Nop
       End
     End
   encoded = tail
   End e_
 Return decoded

End
Exit
</lang>

Output:

http%3A%2F%2Ffoo%20bar%2F
http://foo bar/

mailto%3A%22Ivan%20Aim%22%20%3Civan%2Eaim%40email%2Ecom%3E
mailto:"Ivan Aim" <ivan.aim@email.com>

%6D%61%69%6C%74%6F%3A%22%49%72%6D%61%20%55%73%65%72%22%20%3C%69%72%6D%61%2E%75%73%65%72%40%6D%61%69%6C%2E%63%6F%6D%3E
mailto:"Irma User" <irma.user@mail.com>

Ruby

Use either CGI.unescape or URI.decode_www_form_component. Both methods also convert "+" to " ".

<lang ruby>require 'cgi'
puts CGI.unescape("http%3A%2F%2Ffoo%20bar%2F")
# => "http://foo bar/"</lang>
Works with: Ruby version 1.9.2

<lang ruby>require 'uri'
puts URI.decode_www_form_component("http%3A%2F%2Ffoo%20bar%2F")
# => "http://foo bar/"</lang>

URI.unescape (alias URI.unencode) still works, but it has been obsolete since Ruby 1.9.2 because of problems with its sibling URI.escape.

Scala

<lang scala>import java.net._
val encoded="http%3A%2F%2Ffoo%20bar%2F"
val decoded=URLDecoder.decode(encoded, "UTF-8")
println(decoded) // -> http://foo bar/</lang>

Seed7

The library encoding.s7i defines functions to handle URL encoding and percent-encoding. The function fromPercentEncoded decodes a percent-encoded string. The function fromUrlEncoded works like fromPercentEncoded and additionally decodes '+' to a space. Both functions return byte sequences. To decode Unicode characters, the result must afterwards be converted from UTF-8 with utf8ToStri.

<lang seed7>$ include "seed7_05.s7i";

 include "encoding.s7i";

const proc: main is func

 begin
   writeln(fromPercentEncoded("http%3A%2F%2Ffoo%20bar%2F"));
   writeln(fromUrlEncoded("http%3A%2F%2Ffoo+bar%2F"));
 end func;</lang>

Output:

http://foo bar/
http://foo bar/

Tcl

This code is careful to ensure that any untoward metacharacters in the input string do not cause any problems.

<lang tcl>proc urlDecode {str} {
    set specialMap {"[" "%5B" "]" "%5D"}
    set seqRE {%([0-9a-fA-F]{2})}
    set replacement {[format "%c" [scan "\1" "%2x"]]}
    set modStr [regsub -all $seqRE [string map $specialMap $str] $replacement]
    return [encoding convertfrom utf-8 [subst -nobackslash -novariable $modStr]]
}</lang>

Demonstrating:

<lang tcl>puts [urlDecode "http%3A%2F%2Ffoo%20bar%2F"]</lang>

Output:

http://foo bar/

TUSCRIPT

<lang tuscript>
$$ MODE TUSCRIPT
url_encoded="http%3A%2F%2Ffoo%20bar%2F"
BUILD S_TABLE hex=":%><:><2<>2<%:"
hex=STRINGS (url_encoded,hex), hex=SPLIT(hex)
hex=DECODE (hex,hex)
url_decoded=SUBSTITUTE(url_encoded,":%><2<>2<%:",0,0,hex)
PRINT "encoded: ", url_encoded
PRINT "decoded: ", url_decoded
</lang>

Output:

encoded: http%3A%2F%2Ffoo%20bar%2F
decoded: http://foo bar/