how is this sorting different?
- Almost all of the other sort tasks are dealing with multiple records (or multiple strings), or an array (whose elements are either strings or numbers). This Rosetta Code task is restricted to sorting one string (of characters, which could include duplicates, blanks, and upper- and lower-cased letters). I would've preferred a particular string that would be used (in common) for all computer programming languages for this task to make comparison a little easier. -- Gerard Schildberger (talk) 17:22, 24 July 2021 (UTC)
rename this task ?
Perhaps this (draft) task should be renamed to: Sort a string of characters
which would "fit in" with the nomenclature of other Rosetta Code sorting tasks. -- Gerard Schildberger (talk) 17:43, 24 July 2021 (UTC)
- It certainly needs renaming, or the difference between alphabetical ordering and alphabitical ordering needs explaining!--Nigel Galloway (talk) 19:06, 24 July 2021 (UTC)
- "Sort a string of characters in lexicographical order" with an included link to the wikipedia entry for lexicographical - given the current task description? --Paddy3118 (talk) 08:11, 25 July 2021 (UTC)
- The requirement to write a sort routine, even if one comes with your language, muddles the language comparisons, though, unless those languages with the built-in ability show that too. --Paddy3118 (talk) 08:15, 25 July 2021 (UTC)
task usage of alphabetical order
I suggest that some wording be added to the (draft) task that explains that the phrase alphabetical order depends on the hardware system being used.
The ASCII order is (essentially):
digits, uppercase Latin letters, lowercase Latin letters, with other special characters strewn about.
The EBCDIC order is (essentially):
lowercase Latin letters, uppercase Latin letters, digits, with other special characters strewn about.
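To make the difference concrete, here is a minimal Python sketch. It is an illustration only, not a task entry: it leans on the built-in `sorted`, and it assumes code page `cp037` as a representative EBCDIC variant. The helper name `sort_by_encoding` is made up for this example.

```python
def sort_by_encoding(text: str, encoding: str) -> str:
    """Order characters by their byte value in the given encoding.

    Illustration only: uses the built-in sorted(), which the task itself
    asks entries not to rely on.
    """
    return "".join(sorted(text, key=lambda ch: ch.encode(encoding)))


sample = "Na0"
ascii_order = sort_by_encoding(sample, "ascii")   # digits, uppercase, lowercase
ebcdic_order = sort_by_encoding(sample, "cp037")  # lowercase, uppercase, digits (assumed EBCDIC variant)
print(ascii_order, ebcdic_order)
```

The same three characters come out in opposite orders, which is the point being made above about the ordering depending on the character set in use.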
- ASCII or EBCDIC are both completely immaterial. The task SPECIFICALLY states alphabetical (used to be alphabitical but whatever). In the alphabet, 'N' does not come before 'a' (as most of these examples wrongly demonstrate). The task is stupid and should be removed. But if we insist on keeping it, the examples should at least follow the actual task title / description. Or, the title / description should be changed to what the task is actually demonstrating. --Thundergnat (talk) 13:13, 26 July 2021 (UTC)
- Not completely immaterial. An alphabet is an ordered set of symbols, which children learn during early learning. The task should specify the alphabet and its ordering. Perhaps using a non-English alphabet such that not only the order of N n and a n have to be considered but also à á â ä æ ã å and their capitalized partners. I suggest 3 ways of ordering the symbols on a computer: 1) Form a bijection between the symbols and a subset of the set of integers, and sort the integers. ASCII and EBCDIC are such bijections, but may not produce the required result. 2) Write the alphabet as a list, given 2 symbols search through the list and return the one found first as the smaller. 3) Build the alphabet as a binary tree.--Nigel Galloway (talk) 14:44, 26 July 2021 (UTC)
- <pedantic-mode: on;> COMPLETELY immaterial. Neither is an alphabet. True, the task doesn't specify which alphabet to use, but neither of those are one. As for not using a sort, the task specifically requires writing a routine to sort. In what way could you interpret that to mean you don't need to sort? Yes, yes, I agree, you don't need to sort to get the end result required by the task, but it is pretty clear "Write a sort function, don't just use a built-in" requires there to be some kind of sort involved. <pedantic-mode: off;> --Thundergnat (talk) 14:59, 26 July 2021 (UTC)
- I think you meant to say utterly failed to demonstrate. I see no function there, then or now. Otherwise I assume you are agreeing with me, or is there an interpretation of "Write the function" that specifically means "not a one-liner" that I don't know about, or for that matter is there an interpretation of "Write the function" that specifically means "or not, top level code will do just fine too"? --Pete Lomax (talk)
- Yeah, and my pedantic point is that if he had written an actual function that just simply used Ring's built-in Sort(), it would not actually disagree with the task description, and of course this is all a bit "Call me a taxi." "You are a taxi.", but no task entry, even if it is the first, can override the task description. ☺ --Pete Lomax (talk)
- Well, we can argue till the cows come home about what words mean or allow us to do and, if someone set their mind to it, they could probably drive a coach and horses through many of the task descriptions on RC. However, there is such a thing as the spirit of a task and, for a non-pedant such as me, that at least is clear and is not to use built-in sort functionality at all for this particular task. Simply hiding it behind the facade of your own function would go against that spirit. --PureFox (talk) 08:50, 27 July 2021 (UTC)
- The task title and description are currently at odds with each other. The former says sort in alphabetical order and the latter in lexicographical order. Given that we have example strings containing both upper and lower case letters and non-letters, lexicographical order seems the most appropriate to me and I therefore agree with Paddy3118's suggested title change above. --PureFox (talk) 15:48, 26 July 2021 (UTC)
- Yes, it was me who changed 'alphabetical' to 'lexicographical' shortly after the task was introduced because I thought it was more appropriate and I knew that the task would have to be renamed anyway because 'alphabitical' was mis-spelled. Unfortunately, CalmoSoft went ahead and corrected the spelling but ignored the lexicographical aspect. --PureFox (talk) 18:03, 26 July 2021 (UTC)
- COMPLETELY Shouty and Bold, my favourite mode of discussion (when the pedant is wrong). "Neither is an alphabet. True," I did not write that ASCII or EBCDIC are alphabets; I wrote that they are examples of bijections between the characters comprising the alphabet and a subset of the set of integers. In F#, "int 'a'" binds 97 and "char 97" binds 'a'. UTF-8 is probably another example of a bijection between the characters comprising the alphabet and a subset of the set of integers. 'a' < 'b' binds true because 97 is less than 98. Using the first method I identified is really sorting integers, not "the characters of a string".--Nigel Galloway (talk) 14:00, 27 July 2021 (UTC)
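As a rough sketch of the first two orderings suggested in this thread (the code-point bijection and the alphabet written out as a list), the following Python uses a hand-written insertion sort that takes a ranking function. The names `ALPHABET` and `insertion_sort_chars` are hypothetical, and the lowercase-before-uppercase alphabet is only one possible choice, since the task does not specify one. The binary-tree variant is not shown.

```python
import string
from typing import Callable

# Hypothetical alphabet: each lowercase letter directly before its uppercase partner.
ALPHABET = "".join(lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase))


def insertion_sort_chars(text: str, rank: Callable[[str], int]) -> str:
    """Hand-written insertion sort; characters are compared via rank(ch)."""
    chars = list(text)
    for i in range(1, len(chars)):
        current = chars[i]
        j = i - 1
        # Shift larger-ranked characters one place right to make room.
        while j >= 0 and rank(chars[j]) > rank(current):
            chars[j + 1] = chars[j]
            j -= 1
        chars[j + 1] = current
    return "".join(chars)


# 1) Bijection with integers: ord() maps 'a' to 97, chr() maps 97 back to 'a'.
by_code_point = insertion_sort_chars("banaNa", ord)
# 2) Alphabet as a list: the rank is the position in ALPHABET.
#    ALPHABET.index raises ValueError for characters outside the alphabet.
by_alphabet = insertion_sort_chars("banaNa", ALPHABET.index)
print(by_code_point, by_alphabet)
```

With the code-point ranking 'N' lands before 'a'; with the explicit alphabet it lands after 'n', which is the disagreement the thread is about.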
For this task it is not necessary to perform a sort!
Given the alphabet as an ordered list of characters. Pass through the string once counting the number of occurrences of each character. Pass through the alphabet once outputting the corresponding character the appropriate number of times.--Nigel Galloway (talk) 14:44, 26 July 2021 (UTC)
- Fair point but (not that this task merits such attention to detail) I wonder what the output of "baNAnaBAnaNA" should be...
- Just to be awkward, I'm going with "aAaAaAbBNnnN". --Pete Lomax (talk) 16:26, 26 July 2021 (UTC)
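The counting approach described above can be sketched in Python as follows. This assumes a hypothetical alphabet in which each lowercase letter comes directly before its uppercase partner; the task does not fix one, and characters outside the chosen alphabet are silently dropped in this sketch. `counting_order` is an illustrative name.

```python
import string
from collections import Counter

# Assumed alphabet for illustration: a, A, b, B, ..., z, Z.
ALPHABET = "".join(lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase))


def counting_order(text: str, alphabet: str = ALPHABET) -> str:
    """One pass over the string to count, one pass over the alphabet to emit."""
    counts = Counter(text)
    # Characters not present in `alphabet` are never emitted (dropped).
    return "".join(ch * counts[ch] for ch in alphabet)


counted = counting_order("baNAnaBAnaNA")

# For comparison only: a stable sort keyed on the lowercased character keeps
# upper/lower variants in their order of appearance.
stable_caseless = "".join(sorted("baNAnaBAnaNA", key=str.lower))
print(counted, stable_caseless)
```

Under these assumptions the counting pass gives "aaaAAAbBnnNN", while the stable case-insensitive ordering reproduces the "aAaAaAbBNnnN" answer offered above, so the expected output really does depend on which ordering the task has in mind.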
Would anyone be interested to see the Slacksort method developed further and added to the existing sort block of tasks? It may be too trivial, but it does offer some versatility. Asking ahead of time, in case it is undesirable to clutter up the site. Perhaps the algorithm goes by another name and is already there. --Enter your username (talk) 16:19, 25 July 2021 (UTC)
- I might suggest your Slacksort is a variant of Sorting_algorithms/Selection_sort except that one inefficiency of the latter is it can make multiple passes down the rest of the array in order to obtain the same minimum value, whereas one inefficiency of the former is it will make multiple passes down the entire array that achieve nothing.
- Plus as written it does not appear it can actually be used as a sort method, as in s := Slacksort(s), and is integer/char-only. --Pete Lomax (talk) 20:15, 25 July 2021 (UTC)
Request collation and sample input/output
The task appears to require a collation sequence, but doesn't clearly specify one. Further, title says "letters in alphabetical order", while task says "characters in lexicographical order". (I would presume task description as more-authoritative than title) It also isn't clear how to sort upper/lower of same letter - does sort("CcBbAa") → "aAbBcC" or → "AaBbCc"? (dictionaries and phone books often disagree on that point) Numeric characters will have a "natural for your language" order, but what about punctuation characters, et al? Which comes first: comma or period? And do those come before zero? And so on for every other character to be considered. Author seems to assume this "natural" order is self-evident; I think it's not. ASCII or EBCDIC or UTF or similar are certainly NOT "natural" to any language (that I'm aware of!) so let's rule those out. I'd propose presenting an input string of all characters to be considered, as well as an output string showing the desired sequencing (i.e. the answer for given input). Further, it would be helpful to specify what to do with any characters encountered that aren't in the input string - discard? keep-in-place where encountered? append-at-end unsorted in order of encounter? (several other possibilities here) Additionally, if the collation sequence is indeed the focus of this task, then perhaps the (seemingly odd) final line of the task description (Write the function even if your language has a built-in function for it.) becomes unnecessary - because then the task would really be about writing the comparator rather than the sorter, so why not allow a built-in sort if available? --Davbol (talk) 19:11, 12 August 2022 (UTC)
- As written, the task allows ASCII/UTF or EBCDIC ordering (or, hypothetically, some other ordering).
- Also, here, sorting does not require a comparator.
- But... when you say it would be helpful to specify what to do with any characters encountered that aren't in the input string -- what do you mean? If the characters are not in the input string, how would they be encountered??
- --Rdm (talk) 19:32, 12 August 2022 (UTC)
- Maybe I'm just reading too strictly, but the wording of the task encourages my confusion..
- As I read it, it asks for characters "natural to the language". ASCII is certainly not "natural", EBCDIC perhaps even less-so. Both are machine constructs, not language constructs, so it seemed to me like the task-writer might have had something else in mind.
- If this is just another pointless variation of "rewrite a sort algorithm" then still same request: please clarify/formalize the specification, provide an example, etc. There are hints that the task-writer might have been attempting to convey the use of a natural-language-specific collated sort.
- It's a question of domain, and how to handle values outside the domain. Let's assume the task-author agrees to provide sample input/output strings in order to demonstrate the intended collation sequence, but still allows implementations to choose their OWN input strings, natural to their language (as is currently the case). If, hypothetically, the task-author failed to include/demonstrate the sorting of punctuation, then how should an implementation handle punctuation characters if encountered? Either that, or require the use of the example input, and accept only the example output. (eliminating free choice input entirely, so that the domain of characters is predetermined)
- If the task-writer merely wants characters of a string sorted by whatever machine-collation (whether ASCII, EBCDIC or other) happens to be in effect, then I'd recommend the task be written in simpler terms, so as not to suggest that something more might have been intended. --Davbol (talk) 05:38, 13 August 2022 (UTC)
- Maybe I'm just reading too strictly, but the wording of the task encourages my confusion..
- Indeed "natural to the language" and the most typical meaning for "natural" represent very different concepts (and rely on different "natural language" definitions of the adjective).
- But perhaps you could suggest a better phrasing which you think would serve here? --Rdm (talk) 07:53, 13 August 2022 (UTC)
- The task says "A character for this purpose should be whatever is natural for your language" - surely, this means the programming language, not the programmer's natural language. The collating sequence is thus defined by the character set being used - ASCII, Unicode, EBCDIC or whatever. Most languages have a character/byte/rune/whatever type and so the task author is saying use whatever's best for the language.
- Perhaps the wording could be changed to "A character for this purpose should be whatever is natural for your programming language".
- Mandating the collating sequence would disadvantage languages where the specified sequence wasn't the one "natural" to the programming language and surely over-complicate the code. The task is a simple sorting application, requiring demonstration of how to get at the individual characters of a string.--Tigerofdarkness (talk) 08:59, 13 August 2022 (UTC)
- So it'd be ok if "N" comes before "a"? In ASCII it does, alphabetically it does not. Even case-insensitive, it still isn't specified which of e.g. "nN" or "Nn" would be correct. If the task indeed just wants a simple ASCII sort then it should just come out and say so - all of the extraneous language in the task description does NOT clarify that. If the task is merely to demonstrate individual character retrieval, then there are simpler ways to demonstrate (e.g, just count the chars, or reverse the string, etc) without opening up the can-of-worms of lexicographical order. If the task has something else in mind, then it fails to convey it adequately. --Davbol (talk) 18:10, 15 August 2022 (UTC)
- I agree - looking at the existing samples, it would seem that almost all of them have N before a, a small number have a before A, so the "acceptable but not required" interpretation seems good to me. I see your Lua sample is one of the exceptions, which is fine. I don't think making 99% of the others invalid is worth it. --Tigerofdarkness (talk) 20:22, 15 August 2022 (UTC)
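For reference, the open questions in this thread (which case comes first, and what to do with characters outside the agreed domain) can be sketched in Python as below. None of this is mandated by the task; `lower_first`, `upper_first` and `sort_within_domain` are hypothetical names, the built-in `sorted` stands in for "a built-in sort plus a custom key", and "append unknown characters at the end in order of encounter" is just one of the policies mentioned above.

```python
import string
from typing import Callable, Tuple


def lower_first(ch: str) -> Tuple[str, bool]:
    """Case-insensitive key; ties broken with lowercase before uppercase."""
    return (ch.lower(), ch.isupper())


def upper_first(ch: str) -> Tuple[str, bool]:
    """Case-insensitive key; ties broken with uppercase before lowercase."""
    return (ch.lower(), ch.islower())


def sort_within_domain(text: str, domain: str, key: Callable[[str], Tuple[str, bool]]) -> str:
    """Sort characters found in `domain`; append all others unsorted, in order of encounter."""
    known = [ch for ch in text if ch in domain]
    unknown = [ch for ch in text if ch not in domain]
    return "".join(sorted(known, key=key)) + "".join(unknown)


letters = string.ascii_letters  # assumed domain: unaccented Latin letters only
print(sort_within_domain("CcBbAa", letters, lower_first))
print(sort_within_domain("CcBbAa", letters, upper_first))
print(sort_within_domain("b,a.C", letters, lower_first))
```

Swapping the key switches between the "aAbBcC" and "AaBbCc" answers, and the punctuation in the last call is carried to the end untouched; a sample input/output pair in the task description would settle which of these behaviours is intended.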