Talk:Words containing "the" substring

From Rosetta Code

Trivial task

This seems to me to be just a trivial subtask of String matching. Can we have more original tasks and less pointless busywork please? Thebigh (talk) 09:36, 6 December 2020 (UTC)

The Rosetta Code task String matching doesn't handle a dictionary, and because this task uses a dictionary, there are (or should be) more concerns, such as (possible extra) whitespace (either tabs and/or leading/trailing blanks, and/or double words) in the dictionary, as well as handling caseless searches (although the particular dictionary specified has no capital letters in it, nor duplicate words). Another restriction is the situation when a word in the dictionary is too short (as per the task requirements). Also, a minor detail is counting the number of words found and also possibly showing the number of words found (searched) in the dictionary. -- Gerard Schildberger (talk) 17:13, 6 December 2020 (UTC)
<quote>doesn't handle a dictionary</quote> So? <quote>this task uses a dictionary</quote> So? <quote>whitespace ... double words ... in the dictionary ... caseless searches</quote> So? The dictionary to use was specified. Which makes all of that completely beside the point. <quote>counting the number of words</quote> Where is there ANY mention or requirement to count the words?
Those are all incidental and peripheral to the task. It also doesn't specify to power your computer on first. I suppose I should look forward to the task "Turn on your computer then find words containing "the" substring" because "Thatss completetly diffent!!1!1" Sheesh. --Thundergnat (talk) 17:48, 6 December 2020 (UTC)
I agree. Finding a substring of a string is just problem 2 of String matching, and reading in a dictionary first is a trivial addition, especially since so many of the recent tasks involve reading in the same dictionary. Thebigh (talk) 18:20, 6 December 2020 (UTC)
I'll try to answer the "So?" queries as politely as possible and keep my answers as civil as possible, ignoring your use of a strawman argument. The addition of reading/processing the words in a dictionary (file) is somewhat trivial, but it is part of the task, and part of Rosetta Code's purpose is to compare how different computer programming languages (and programmers) implement even small requirements, albeit maybe somewhat trivial, but not incidental. I don't know what the author of this task considers incidental, but I won't say that he considers it trivial or not. As trivial as it seems, it is necessary to read/process the input file (the dictionary) and it is one of the task's requirements (although implied), but different computer programming languages could do it much differently and/or simply; SAS and APL come to mind. I don't understand the need to mention your non sequitur comment about powering on your computer first. Furthermore, I never said nor implied that showing the number of substrings found was a task requirement. It's common sense (but not required) to either show a running index count of the words found (especially if the number of words found isn't easily countable), or a summary total at the end of the displayed list, but that is something the programmer decides to implement (or not). -- Gerard Schildberger (talk) 19:11, 6 December 2020 (UTC)
Looks like you need to go and reread the definition of strawman since you seem to be confused about it. I directly refuted the points you brought up, quoting your words directly.
There is no requirement that you load a dictionary, so talking about task requirements (even trivial ones) dealing with loading, filtering and storing the dictionary IS COMPLETELY BESIDE THE POINT. The requirement is that you use a specific dictionary. How it is done is peripheral to the task. The task is Find words containing the substring "the". In what way does loading the dictionary for this task differ from that in: Prime words or Odd words or Alternade words or ABC words or Teacup_rim_text or many others? It doesn't. <quote>It's common sense</quote> But it is not a requirement, so using that as a justification for why this task is different IS COMPLETELY BESIDE THE POINT.
That brings us back to the actual task requirements: essentially, filter a list based on some substring thereof, much like String matching or ABC words, two tasks among many that quite admirably cover this concept. The point this whole thread started with was "Please let's cut down on the largely redundant tasks". Using examples of irrelevant implementation details and what the programmer <quote>decides to implement</quote> as an argument for why it is not is disingenuous at best. --Thundergnat (talk) 20:14, 6 December 2020 (UTC)
I know what a strawman argument is. It was your last statement (Where is there ANY mention ...). In any case, saying "So?" after a statement isn't a refutation. Some people think I disagree about the triviality of some of the task's requirements. I don't think they aren't trivial (yeah, I know, double negative). I was attempting to explain the differences as I see them (however trivial they appear), and doing so doesn't make my thoughts/writings on these matters disingenuous. Calling them that isn't a productive way to have a discussion when you start categorizing/defaming people's opinions that don't agree with yours. -- Gerard Schildberger (talk) 20:41, 6 December 2020 (UTC)
<quote>So? after a statement isn't a refutation</quote> Quite right, the refutation would have been the sentence: <quote>The dictionary to use was specified. Which makes all of that completely beside the point.</quote> just following that.
Ok, so why is "filter a list based on some substring" not already adequately covered by String matching or ABC words among others? We've already established that all of the specifics of dictionary loading and handling are not requirements, nor is the specific layout, order, counting or tracking of the output. The "trivial" part of the remaining requirements is the filtering, and it isn't that the filtering is trivial, it is that there are already many existing tasks for which this is a trivial variation; which was why this whole thread started. --Thundergnat (talk) 21:12, 6 December 2020 (UTC)
I never used it as a justification, it was just a comment. Just saying that it's completely beside the point doesn't make it so. Saying it in all caps doesn't make it true. -- Gerard Schildberger (talk) 20:46, 6 December 2020 (UTC)
True, I agree. Saying it is beside the point doesn't make it so. It is the fact that it is beside the point that makes it so. <quote>Saying it in all caps doesn't make it true.</quote> Also true. It is just a method of emphasising a point that I think was worth emphasising since you seem to be missing it. Cheers! --Thundergnat (talk) 21:12, 6 December 2020 (UTC)
If I may insert my 2 cents... although Gerard is possibly overstating the case, the use of the dictionary does make it a little different as we have to look at actual words in a language (English in the unixdict case). Questions like "which words in the language contain "the" as a substring" are (perhaps not enormously) interesting questions that are IMHO different to "show how to check a string contains a substring".
I see Thundergnat has (rightly IMHO) deleted the "find words that contain all the vowels" task as it is little different from this and the "find the words that contain "a", "b" and "c" in order" task etc. (Actually, I think the "find the words that contain all the vowels" is actually a marginally more interesting task than the ABC one but the ABC one came first...). --Tigerofdarkness (talk) 20:09, 6 December 2020 (UTC)
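For reference, the concerns raised in this thread (stray whitespace, double words, caseless matching, a minimum word length, and an optional count of the words found) might be sketched in Python roughly as follows. This is only an illustrative sketch, not any posted solution: the function name `words_containing`, the `min_length` parameter and its value, and the sample word list are all assumptions made up for this example, and the task's actual length limit is deliberately not filled in here.

```python
def words_containing(lines, substring="the", min_length=0):
    """Return cleaned-up words containing `substring`, compared caselessly.

    `min_length` is a hypothetical parameter standing in for whatever
    minimum word length a task might require.
    """
    needle = substring.lower()
    seen = set()   # lower-cased words already reported, to skip duplicates
    found = []
    for line in lines:
        # split() discards tabs and leading/trailing blanks, and separates
        # several words that share one line
        for word in line.split():
            key = word.lower()
            if len(word) >= min_length and needle in key and key not in seen:
                seen.add(key)
                found.append(word)
    return found


# Made-up sample input standing in for lines read from a dictionary file
sample = ["  Authenticate\t", "other", "cat", "mathematics mathematics", "THEREFORE"]
matches = words_containing(sample, min_length=6)
for index, word in enumerate(matches, start=1):
    print(index, word)            # running index of the words found
print("words found:", len(matches))
```

Whether the whitespace, case and counting handling is worth having for a dictionary already known to be clean is exactly the point being argued above; the sketch only shows how small the extra code is.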

Some Comparisons

We can directly compare answers to the tasks said to be similar. Would a curious programmer not versed in these particular languages, having read a language's example in the String matching task, gain much from the language's extra code in this new task?



--Paddy3118 (talk) 00:23, 7 December 2020 (UTC)

"(on this page)"?

I think I overlooked that little part; am I right in interpreting this as that very RosettaPage's contents? I've changed my code snippet, but do not see it in other solutions, and now I am a bit confused. Sorry, me stupid; if so I can revert to the old. Cg (talk) 14:52, 9 December 2020 (CET)

Unlikely, it probably just meant "show output here", and in any case any pertinent words in unixdict.txt are all by now repeated multiple times on this page, which would make any such filtering utterly pointless. --Pete Lomax (talk) 16:16, 9 December 2020 (UTC)
Yup, you're right. Counting might have made a little sense, but these numbers would also change... Thanks. Cg (talk) 17:24, 9 December 2020 (CET)