Talk:XML/Input: Difference between revisions

:::I added a <del>entity</del> numeric character reference, since XML processors in general need to be able to handle & and the full character set. --[[User:Kevin Reid|Kevin Reid]] 00:44, 2 June 2009 (UTC)

::::Are you suggesting that the program should convert HTML entities and numeric references into some character encoding? I think that should be a separate task. And, AFAIK, it is HTML specific, not XML. --[[User:PauliKL|PauliKL]] 09:03, 2 June 2009 (UTC)

:::::No. Numeric references, a small set of predefined entities, and the permitted character set, are [http://www.w3.org/TR/xml11/ part of the XML specification]. All XML parsers must support them. Practically, I think it is better for Rosetta Code if our examples show ''robust'', fully-general solutions rather than just-enough-for-the-example-at-hand. Don't spread code that will break when someone with an accent in their name comes along. --[[User:Kevin Reid|Kevin Reid]] 12:23, 2 June 2009 (UTC)
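To make the point about references concrete, here is a minimal Python sketch. It assumes a made-up two-student sample (the <code>SAMPLE</code> string and the <code>student_names</code> helper are hypothetical, not the task's actual input or any posted solution). A conforming parser such as the standard library's <code>xml.etree.ElementTree</code> resolves the numeric character reference and the predefined <code>&amp;amp;</code> entity without any extra code:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input, not the task's actual file.
SAMPLE = (
    '<Students>'
    '<Student Name="&#x00C9;mily" />'
    '<Student Name="Tom &amp; Jerry" />'
    '</Students>'
)


def student_names(xml_text):
    """Return the Name attribute of every Student element."""
    root = ET.fromstring(xml_text)
    # The parser has already expanded &#x00C9; and &amp; by this point.
    return [student.get("Name") for student in root.iter("Student")]


if __name__ == "__main__":
    for name in student_names(SAMPLE):
        print(name)
```

With this sample the helper returns the names with the accented capital E and the bare ampersand already decoded, which is the behaviour a just-enough solution based on plain string matching would miss.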

::Donal, the problem is that the AWK implementation does not interpret the structure at all. It is quite possible to do some parsing even if there are no ready-made library routines for that. But that does not mean that we should implement a full XML parser. The task should be kept relatively simple.

::I notice that the XML input file has now been changed. But the task description needs to be changed, too. --[[User:PauliKL|PauliKL]] 09:14, 2 June 2009 (UTC)
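As a rough illustration of the middle ground discussed above (some parsing without a library, but no full XML parser), the following Python sketch pulls <code>Name</code> attributes out of <code>Student</code> tags with regular expressions and decodes numeric character references and the five predefined entities by hand. The function names and the tag and attribute names are assumptions made for the example. It deliberately ignores much of XML: single-quoted attributes, a literal <code>&gt;</code> inside an attribute value, comments, and CDATA sections are not handled.

```python
import re

# The five entities predefined by the XML specification.
PREDEFINED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}

REFERENCE = re.compile(r"&(#x[0-9A-Fa-f]+|#[0-9]+|[A-Za-z]+);")


def decode_refs(text):
    """Expand numeric character references and predefined entities."""
    def expand(match):
        body = match.group(1)
        if body.startswith("#x"):
            return chr(int(body[2:], 16))   # hexadecimal reference
        if body.startswith("#"):
            return chr(int(body[1:]))       # decimal reference
        return PREDEFINED[body]             # KeyError on an undefined entity
    return REFERENCE.sub(expand, text)


def student_names(xml_text):
    """Collect decoded Name attributes from Student start tags."""
    names = []
    for tag in re.finditer(r"<Student\b([^>]*)>", xml_text):
        attr = re.search(r'\bName\s*=\s*"([^"]*)"', tag.group(1))
        if attr:
            names.append(decode_refs(attr.group(1)))
    return names
```

This stays simple while still surviving a name with an accent in it. Whether that trade-off is acceptable for the task is exactly the question under discussion here.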