Talk:Find URI in text: Difference between revisions

(Task and RFC are not aligned)
Line 11:
* nothing in the RFC indicates that parentheses must be balanced, and the characters are allowed via the 'segment' parts of URIs.
Based on this, a solution that returns "stop:" or URIs containing unbalanced parentheses is technically valid but probably not what the author intended. --[[User:Dgamey|Dgamey]] 02:17, 8 January 2012 (UTC)
: not sure about "stop:", because for one, new schemes can be made up. some applications have internal schemes that are known to us. the task only asks to find URIs, not process them, so the decision whether to deal with "stop:" or not can be handled in the processing stage. for example, in some cases you may only be interested in http, https, and maybe ftp. in such a case you'd go through the list of matches and remove anything that is not of interest. of course one could write the parser so that it takes a list of schemes to decide which ones should be found, but by default there is no harm in finding too much.
: nothing in the task indicates that parentheses must be balanced either. unbalanced parentheses are certainly valid and are what the author intended too. please look at the live example i found on wikipedia: [http://en.wikipedia.org/wiki/-) http://en.wikipedia.org/wiki/-)] (and note how mediawiki parses it wrong :-).--[[User:EMBee|eMBee]] 03:57, 8 January 2012 (UTC)
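
The two-stage approach described above (find everything that looks like a URI, then drop uninteresting schemes afterwards) could be sketched roughly as follows. This is only an illustration, not a solution to the task: the regular expression is a loose approximation of the RFC 3986 character set rather than the full grammar, and the names <code>find_uris</code> and <code>filter_by_scheme</code> are hypothetical helpers made up for this example.

```python
import re

# Loose approximation (an assumption, not the full RFC 3986 grammar):
# a scheme, a colon, then any run of unreserved, reserved or '%' characters.
# The lookbehind keeps a match from starting in the middle of a word.
URI_PATTERN = re.compile(
    r"(?<![A-Za-z0-9+.-])"                      # not preceded by a scheme character
    r"[A-Za-z][A-Za-z0-9+.-]*:"                 # scheme followed by ':'
    r"[A-Za-z0-9._~:/?#\[\]@!$&'()*+,;=%-]*"    # rest of the URI, may be empty
)


def find_uris(text):
    """Stage 1: return every candidate, including odd ones such as 'stop:'."""
    return URI_PATTERN.findall(text)


def filter_by_scheme(uris, wanted):
    """Stage 2: keep only candidates whose scheme is in the wanted set."""
    wanted = {scheme.lower() for scheme in wanted}
    return [uri for uri in uris if uri.split(":", 1)[0].lower() in wanted]


if __name__ == "__main__":
    sample = ("Please stop: see http://en.wikipedia.org/wiki/-) "
              "and ftp://example.org/a.txt today")
    candidates = find_uris(sample)
    print(candidates)
    print(filter_by_scheme(candidates, {"http", "https", "ftp"}))
```

With this sketch the finder reports "stop:" and keeps the unbalanced closing parenthesis in the Wikipedia URL, and the decision to discard "stop:" is left to the filtering step. Because the pattern accepts every reserved character, sentence punctuation directly after a URI would also be swallowed; handling that is outside what this sketch attempts.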