Talk:URL encoding: Difference between revisions

 
:: I suppose as a bonus, we could provide an exception string, which contains a list of characters that do not become encoded. --[[User:Markhobley|Markhobley]] 17:57, 20 June 2011 (UTC)
 
== Encoding by RFC 3986 or HTML 5 ==
 
The current task lists six groups of characters to encode. That leaves open the question of which characters to preserve.
 
* The current task preserves only "0-9A-Za-z".
* My interpretation of RFC 3986 is to preserve "-._~0-9A-Za-z".
* My interpretation of HTML 5, [http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#url-encoded-form-data URL-encoded form data], is to preserve "-._*0-9A-Za-z" and to encode " " to "+".
 
I added this information to the task. If I understand correctly, RFC 3986 preserves '~' and encodes '*', while HTML 5 preserves '*' and encodes '~'. RFC 3986 also permits lowercase hex digits, so "http%3a%2f%2ffoo%20bar%2f" is valid. HTML 5 has a specific rule to always encode to uppercase. --[[User:Kernigh|Kernigh]] 00:29, 31 July 2011 (UTC)
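
For illustration, the three preserve sets above could be compared with a small sketch like the following. It is not taken from the task or from either specification; the function name <code>url_encode</code>, its parameters, and the constant names are hypothetical, and it assumes the input is encoded as UTF-8 before percent-encoding.

```python
import string

# Preserve sets as described in the list above (names are illustrative only).
ALNUM = string.ascii_letters + string.digits
TASK_PRESERVE = ALNUM               # current task: "0-9A-Za-z"
RFC3986_PRESERVE = ALNUM + "-._~"   # one reading of RFC 3986
HTML5_PRESERVE = ALNUM + "-._*"     # one reading of HTML 5 form data


def url_encode(text, preserve=TASK_PRESERVE, space_as_plus=False, lowercase=False):
    """Percent-encode every UTF-8 byte of text that is not in preserve.

    Hypothetical helper: space_as_plus mimics the HTML 5 form rule,
    lowercase emits lowercase hex digits.
    """
    fmt = "%%%02x" if lowercase else "%%%02X"
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if byte < 128 and ch in preserve:
            out.append(ch)            # preserved ASCII character
        elif space_as_plus and ch == " ":
            out.append("+")           # HTML 5 form data: space becomes "+"
        else:
            out.append(fmt % byte)    # everything else becomes %XX
    return "".join(out)


if __name__ == "__main__":
    print(url_encode("http://foo bar/"))
    print(url_encode("a~b*c d", RFC3986_PRESERVE))
    print(url_encode("a~b*c d", HTML5_PRESERVE, space_as_plus=True))
```

Passing the preserve set as a parameter is close to the "exception string" idea suggested earlier on this page: the caller decides which characters escape encoding, and the three readings differ only in that argument and the space rule.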