URL encoding: Difference between revisions
Revision as of 02:53, 28 July 2011
The task is to provide a function or mechanism to convert a provided string into URL encoding representation.
In URL encoding, special characters, control characters and extended characters are converted into a percent symbol followed by a two-digit hexadecimal code, so a space character encodes into %20 within the string.
The following characters require conversion:
- ASCII control codes (Character ranges 00-1F hex (0-31 decimal) and 7F hex (127 decimal))
- ASCII symbols (Character ranges 32-47 decimal (20-2F hex))
- ASCII symbols (Character ranges 58-64 decimal (3A-40 hex))
- ASCII symbols (Character ranges 91-96 decimal (5B-60 hex))
- ASCII symbols (Character ranges 123-126 decimal (7B-7E hex))
- Extended characters with character codes of 128 decimal (80 hex) and above.
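Taken together, the ranges above leave only the ASCII letters and digits unencoded. A minimal Python sketch of that rule follows; the function name `url_encode` is hypothetical, and converting extended characters via UTF-8 is an assumption, since the task does not name a character encoding.

```python
import string

# Byte values that stay as-is: ASCII letters and digits only.
KEEP = frozenset((string.ascii_letters + string.digits).encode("ascii"))


def url_encode(text):
    """Percent-encode every byte of text that is not an ASCII letter or digit."""
    # Assumption: extended characters are converted through their UTF-8 bytes.
    return "".join(
        chr(b) if b in KEEP else "%{:02X}".format(b)
        for b in text.encode("utf-8")
    )


print(url_encode("http://foo bar/"))  # prints http%3A%2F%2Ffoo%20bar%2F
```

Note that under these ranges the characters "-", ".", "_" and "~" are encoded as well, which is stricter than what many library routines do.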
Example
The string "http://foo bar/" would be encoded as "http%3A%2F%2Ffoo%20bar%2F".
Options
It is permissible for an exception string (containing a set of symbols that do not need to be converted) to be utilized. However, this is an optional feature and is not a requirement of this task.
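One way the optional exception string could look is sketched below in Python. The name `url_encode_except` and its `exceptions` parameter are hypothetical, and UTF-8 is again assumed for extended characters; the exception string itself is restricted to ASCII symbols here.

```python
import string

ALPHANUMERIC = string.ascii_letters + string.digits


def url_encode_except(text, exceptions=""):
    """Percent-encode text, leaving letters, digits and `exceptions` untouched."""
    # The exception string must be ASCII; a non-ASCII symbol raises UnicodeEncodeError.
    keep = frozenset((ALPHANUMERIC + exceptions).encode("ascii"))
    # Assumption: extended characters are converted through their UTF-8 bytes.
    return "".join(
        chr(b) if b in keep else "%{:02X}".format(b)
        for b in text.encode("utf-8")
    )


print(url_encode_except("http://foo bar/"))            # prints http%3A%2F%2Ffoo%20bar%2F
print(url_encode_except("http://foo bar/", "-._~:/"))  # prints http://foo%20bar/
```

With an empty exception string the result matches the plain encoding described by the task.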
See also
Go
<lang go>package main

import (
    "fmt"
    "http"
    "strings"
)

func main() {
    url := http.URLEscape("http://foo bar/")
    // http.URLEscape replaces ' ' with '+', so:
    url = strings.Replace(url, "+", "%20", -1)
    fmt.Println(url)
}</lang>
Icon and Unicon
<lang Icon>link hexcvt

procedure main()
   write("text = ",image(u := "http://foo bar/"))
   write("encoded = ",image(ue := encodeURL(u)))
end

procedure encodeURL(s)           #: encode data for inclusion in a URL/URI
   static en
   initial {                     # build lookup table for everything
      en := table()
      every en[c := !string(~(&digits++&letters))] := "%"||hexstring(ord(c),2)
      every /en[c := !string(&cset)] := c
   }

   every (c := "") ||:= en[!s]   # re-encode everything
   return c
end</lang>
Output:
text = "http://foo bar/"
encoded = "http%3A%2F%2Ffoo%20bar%2F"
J
J has a urlencode in the gethttp package, but this task requires that all non-alphanumeric characters be encoded.
Here's an implementation that does that:
<lang j>require'strings convert'
urlencode=: rplc&((#~2|_1 47 57 64 90 96 122 I.i.@#)a.;"_1'%',.hfd i.#a.)</lang>
Example use:
<lang j>   urlencode 'http://foo bar/'
http%3A%2F%2Ffoo%20bar%2F</lang>
Java
The built-in URLEncoder in Java converts the space " " into a plus-sign "+" instead of "%20":
<lang java>import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class Main {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String normal = "http://foo bar/";
        String encoded = URLEncoder.encode(normal, "utf-8");
        System.out.println(encoded);
    }
}</lang>
Output:
http%3A%2F%2Ffoo+bar%2F
Perl
<lang perl>use URI::Escape;

my $s = 'http://foo/bar/';
print uri_escape($s);</lang>
Use standard CGI module:
<lang perl>use 5.10.0;
use CGI;

my $s = 'http://foo/bar/';
say $s = CGI::escape($s);
say $s = CGI::unescape($s);</lang>
Perl 6
<lang perl6>my $url = 'http://foo bar/';
say $url.subst(/<-[ A..Z a..z 0..9 ]>/, *.ord.fmt("%%%02X"), :g);</lang>
Output:
http%3A%2F%2Ffoo%20bar%2F
PHP
<lang php><?php
$s = 'http://foo/bar/';
$s = rawurlencode($s);
?></lang>
There is also urlencode(), which encodes spaces as "+" signs instead of "%20".
PicoLisp
<lang PicoLisp>(de urlEncodeTooMuch (Str)
   (pack
      (mapcar
         '((C)
            (if (or (>= "9" C "0") (>= "Z" (uppc C) "A"))
               C
               (list '% (hex (char C))) ) )
         (chop Str) ) ) )</lang>
Test:
: (urlEncodeTooMuch "http://foo bar/")
-> "http%3A%2F%2Ffoo%20bar%2F"
PureBasic
<lang PureBasic>URL$ = URLEncoder("http://foo bar/")</lang>
Python
<lang python>import urllib
s = 'http://foo/bar/'
s = urllib.quote(s)</lang>
There is also urllib.quote_plus(), which encodes spaces as "+" signs instead of "%20".
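The calls above are the Python 2 spelling. In Python 3 the same functions live in the urllib.parse module; a short sketch of how the two differ on the task's example string (the variable names are illustrative only):

```python
from urllib.parse import quote, quote_plus

s = "http://foo bar/"

# quote() leaves "/" alone by default, because "/" is in its default safe set.
default_quoted = quote(s)
# Passing safe="" makes quote() encode "/" too; "-", ".", "_" and "~" are still never encoded.
strict_quoted = quote(s, safe="")
# quote_plus() encodes "/" by default and turns spaces into "+".
plus_quoted = quote_plus(s)

print(default_quoted)  # prints http%3A//foo%20bar/
print(strict_quoted)   # prints http%3A%2F%2Ffoo%20bar%2F
print(plus_quoted)     # prints http%3A%2F%2Ffoo+bar%2F
```

Because quote() never encodes "-", ".", "_" or "~", it does not cover every character range listed in the task description.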
Tcl
<lang tcl># Encode all except "unreserved" characters; use UTF-8 for extended chars.
# See http://tools.ietf.org/html/rfc3986 §2.4 and §2.5
proc urlEncode {str} {
    set uStr [encoding convertto utf-8 $str]
    set chRE {[^-A-Za-z0-9._~\n]};		# Newline is special case!
    set replacement {%[format "%02X" [scan "\\\0" "%c"]]}
    return [string map {"\n" "%0A"} [subst [regsub -all $chRE $uStr $replacement]]]
}</lang>
Demonstrating:
<lang tcl>puts [urlEncode "http://foo bar/"]</lang>
Output:
http%3A%2F%2Ffoo%20bar%2F%E2%82%AC
TUSCRIPT
<lang tuscript>
$$ MODE TUSCRIPT
text="http://foo bar/"
BUILD S_TABLE spez_char="::>/:</::<%:"
spez_char=STRINGS (text,spez_char)
LOOP/CLEAR c=spez_char
c=ENCODE(c,hex),c=concat("%",c),spez_char=APPEND(spez_char,c)
ENDLOOP
url_encoded=SUBSTITUTE(text,spez_char,0,0,spez_char)
print "text: ", text
PRINT "encoded: ", url_encoded
- Program error: url=ENCODE (url,cgi/utf8)
</lang>
Output:
text: http://foo bar/
encoded: http%3A%2F%2Ffoo%20bar%2F