Talk:Natural sorting

:I may be blinded by the only way I can think of to do it in Java, but it seems to me that the task is a super-complex version of [[Sort using a custom comparator]]. I'm not sure that linking to a particular example's output is the best way to define the task. We ran into problems with that in [[Multisplit]]. Beyond all of that, it does seem like a lot of work. Do umlauts count as accents? Should they be sorted as their expansion, like the scharfes S (ß)? What about circumflexes (circumfleces? circumfli?)? There is a whole list of marks [[wp:Diacritic|here]]. Some of them represent condensations of letters (like ss condensed to ß), some represent accents, and some represent different pronunciations (like a cedilla in French or a tilde in Spanish). As you can see, this can get pretty complicated quickly. --[[User:Mwn3d|Mwn3d]] 15:25, 27 April 2011 (UTC)
:: I'll only answer the point about accents at the moment. Unicode is a pig for me too, so you only ''need'' to handle the accents mentioned in the particular test for that section if you don't have a convenient Unicode class to make it more generic. You might implement parts in an extensible way, for example with a table that can be expanded to handle more than the task examples require. --[[User:Paddy3118|Paddy3118]] 18:01, 27 April 2011 (UTC)
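
The table idea above might look something like the following in Python. This is only a rough sketch, not a task solution: <code>ACCENT_TABLE</code>, <code>fold_accents</code> and <code>accent_key</code> are made-up names, and the handful of table entries is an assumption about which marks a test might use. Anything not in the table is passed through unchanged, so the table can be extended without touching the code.

```python
# Hypothetical lookup table: each accented character maps to the plain text
# it should sort as. Only a few entries are shown; extend as needed.
ACCENT_TABLE = {
    "é": "e", "è": "e", "ê": "e", "ë": "e",
    "ä": "a", "ö": "o", "ü": "u",
    "ç": "c", "ñ": "n",
    "ß": "ss",  # a condensation of letters rather than an accent
}


def fold_accents(text, table=ACCENT_TABLE):
    """Replace every character found in the table; leave the rest alone."""
    return "".join(table.get(ch, ch) for ch in text)


def accent_key(text):
    """Sort key: ignore case first, then fold the accents via the table."""
    return fold_accents(text.lower())


if __name__ == "__main__":
    words = ["Zoë", "zebra", "Straße", "strasse", "élan", "eland"]
    print(sorted(words, key=accent_key))
```

Words whose keys come out equal (here "Straße" and "strasse") keep their input order, because Python's sort is stable.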
 
=== Lexical ordering system ===
 
I always place symbols before numbers and letters and caps after lowercase, and I sort symbols in a different order from the one in which they appear in the ASCII table:
 
foo bar
Foo bar
FOO bar
foo_bar
foo-bar
foo9bar
foobar
 
[[User:Markhobley|Markhobley]] 19:46, 27 April 2011 (UTC)
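
One possible way to get the ordering listed above, sketched in Python under some assumptions: the symbol order (space, underscore, hyphen) is inferred only from the seven strings shown, symbols not in that list are assumed to follow it in code-point order, and case is assumed to act only as a tie-breaker between otherwise equal strings. The names <code>SYMBOL_ORDER</code>, <code>char_rank</code> and <code>lexical_key</code> are invented for the illustration.

```python
# Assumed symbol order, inferred from the example list: space, '_', '-'.
SYMBOL_ORDER = " _-"


def char_rank(ch):
    """Rank one character: listed symbols, other symbols, digits, letters."""
    if ch in SYMBOL_ORDER:
        return (0, SYMBOL_ORDER.index(ch))
    if ch.isdigit():
        return (2, ord(ch))
    if ch.isalpha():
        return (3, ord(ch.lower()))  # case is ignored at this level
    return (1, ord(ch))  # any symbol that is not in SYMBOL_ORDER


def lexical_key(text):
    """Primary key ignores case; the secondary key puts lowercase first."""
    primary = [char_rank(ch) for ch in text]
    case = [ch.isupper() for ch in text]  # False (lowercase) sorts first
    return (primary, case)


if __name__ == "__main__":
    names = ["foobar", "foo9bar", "foo-bar", "foo_bar",
             "FOO bar", "Foo bar", "foo bar"]
    for name in sorted(names, key=lexical_key):
        print(name)
```

Changing the symbol order only means editing the <code>SYMBOL_ORDER</code> string.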