Talk:Find URI in text

Unicode Chars

My hunch is just to leave Unicode characters alone. Converting them can be regarded as a separate step before the URL is used. It depends on the purpose of extracting URLs from text. (Are they headed for a processing stage which deals with those characters fine?) 24.85.131.247 19:01, 3 January 2012 (UTC)

that's the intention exactly. non-ASCII characters are mentioned because they should be included. a parser that only accepts characters that are legal in a URI would not do that. --eMBee 02:14, 4 January 2012 (UTC)

So, since spaces can be entered in a browser, they can be accepted as part of a URI, here? --Rdm 18:17, 5 January 2012 (UTC)

i suppose if the url has a delimiter like quotes or <http://go.here/to this place>, then i don't see why not. it depends on the ability to figure out the user's intent, and on the application. depending on where the parser is used there might even be an opportunity to verify that a url actually exists. (now that would actually be an interesting feature: you type some text on some website, and the browser or server tells you that the url you typed does not exist) --eMBee 03:38, 6 January 2012 (UTC)
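
For illustration, the two points above (leave non-ASCII characters alone, and accept spaces only when the URI is explicitly delimited) could be sketched roughly as follows. This is only a sketch under assumptions, not the task's reference solution: the pattern, the name find_uris, and the choice to require "://" after the scheme are all made up for this example.

```python
import re

# Hypothetical pattern, for illustration only.
# First alternative: a URI wrapped in <...>, where spaces are allowed
# because the angle brackets mark where it ends.
# Second alternative: a bare URI, which ends at whitespace, '<', '>' or '"'.
# Neither alternative restricts characters to ASCII.
URI = re.compile(
    r"<(?P<delimited>[A-Za-z][A-Za-z0-9+.-]*://[^<>]+)>"
    r"|(?P<bare>[A-Za-z][A-Za-z0-9+.-]*://[^\s<>\"]+)"
)

def find_uris(text):
    """Return URI candidates in order of appearance."""
    return [m.group("delimited") or m.group("bare")
            for m in URI.finditer(text)]

sample = "see <http://go.here/to this place> or http://example.org/Kästner today"
for uri in find_uris(sample):
    print(uri)
```

A bare URI followed directly by punctuation (a full stop at the end of a sentence, say) would keep that punctuation in this sketch; deciding whether to strip it is the same "figure out the user's intent" problem mentioned above.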