Talk:Textonyms: Difference between revisions
→Correct number of Textonyms in unixdict.txt?: added a comment and a new talk section.
(The Go and [[Textonyms#J|J]] examples agree on 661 Textonyms in the [[Textonyms/wordlist]]; the other examples don't give any other values).
:: They do now (specifically, REXX). See the talk section on '''duplicate words in dictionary''' (below). -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 18:27, 10 February 2015 (UTC)
—[[User:dchapes|dchapes]] ([[User talk:dchapes|talk]] | [[Special:Contributions/dchapes|contribs]]) 23:33, 8 February 2015 (UTC)
:::: Good point, I don't think I can mark it "inconsistent" instead of "incorrect" tho :). I think the task is fairly clear: "#{3} is the number of #{2} which represent more than one word." Only 1473 of the 22903 numbers "represent more than one word". The task should probably have included a short example that made the "correct" output more obvious. From what little I can guess from the python code, I thought/assumed that "<tt>sum(1 for w in num2words if len(w) > 1)</tt>" was counting the entries where <tt>len(w) > 1</tt>; if it's instead summing the lengths, I think it must be a mistake/typo, given that the variable name itself is <tt>morethan1word</tt> (rather than something like <tt>pairsOfTextonyms</tt>). —[[User:dchapes|dchapes]] ([[User talk:dchapes|talk]] | [[Special:Contributions/dchapes|contribs]]) 01:22, 9 February 2015 (UTC)
::::: When a digit combination is entered it maps to: A) Gibberish; B) An English word; C) Multiple English Words. Case C) are Textonyms. #{3} is the number of digit combinations which fall into category C). So 1473 is the correct answer.--[[User:Nigel Galloway|Nigel Galloway]] ([[User talk:Nigel Galloway|talk]]) 15:33, 9 February 2015 (UTC)
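To make the two readings discussed above concrete, here is a minimal Python sketch. It is not taken from any of the task's entries: the tiny word list, the helper names <tt>to_digits</tt> and <tt>group_by_digits</tt>, and the standard phone keypad layout are all assumptions chosen for illustration. It computes both the number of digit combinations that represent more than one word (category C) and the total number of words those combinations cover, which are different figures.

```python
from collections import defaultdict

# Standard phone keypad layout (assumed here; the task page defines the real mapping).
KEYS = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
        "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
CHAR_TO_DIGIT = {ch: d for d, letters in KEYS.items() for ch in letters}


def to_digits(word):
    """Map a purely alphabetic word to its key-digit string."""
    return "".join(CHAR_TO_DIGIT[ch] for ch in word.lower())


def group_by_digits(words):
    """Group words by the digit combination that represents them."""
    groups = defaultdict(list)
    for w in words:
        groups[to_digits(w)].append(w)
    return groups


# Hypothetical mini word list, not the task's word list.
words = ["good", "home", "gone", "hood", "cat", "act", "bat", "dog"]
groups = group_by_digits(words)

# Reading 1: count the digit combinations that represent more than one word.
combos_with_many = sum(1 for ws in groups.values() if len(ws) > 1)

# Reading 2: sum the words covered by those combinations (a different quantity).
words_in_those = sum(len(ws) for ws in groups.values() if len(ws) > 1)

print(len(groups), combos_with_many, words_in_those)
```

In this toy list, "good", "home", "gone" and "hood" share one digit combination and "cat", "act" and "bat" share another, so the first reading counts two combinations while the second counts seven words.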
==duplicate words in dictionary==
Consider the dictionary file:
<pre>
aleph
aleph
aleph
bet
</pre>
Question: how many words are in that dictionary file?
If you say "two", then that affects the count of how many words can be represented by the ''key digits'' (as described by this Rosetta Code task).
The REXX program that I coded detects duplicate words (and ignores them, but displays a count). I believe that having duplicate words shouldn't alter the count of words representable by ''key digits''. As the REXX program is currently coded, it ignores duplicated words and it shows a different digit combination count ('''650''' digit combinations instead of '''661'''; the latter counts duplicate words and reflects another way to count words representable by ''key digits'').
Better still, it would be nice to have a clean dictionary, or at the least, agree on whether or not duplicate words should be ignored (and instead report on unique words that are in the dictionary).
It was asked (elsewhere): "what is being counted?" (by Rdm). This is the crux of the ambiguity.
The '''UNIXDICT''' dictionary doesn't have that problem, fortunately. In reality, almost all dictionaries have duplicate words (either by meaning, by use, by their derivation/root, by case/capitalization, or by whatever). That shouldn't preclude the correct/accurate counting of (unique) words. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 18:27, 10 February 2015 (UTC)
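A minimal Python sketch of the duplicate-word effect, using the four-line sample dictionary from this section. This is not the REXX program's logic; the function name <tt>count_textonym_combos</tt>, the <tt>dedupe</tt> flag and the standard keypad layout are assumptions made for illustration.

```python
from collections import defaultdict

# Standard phone keypad layout (assumed here; the task page defines the real mapping).
KEYS = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
        "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
CHAR_TO_DIGIT = {ch: d for d, letters in KEYS.items() for ch in letters}


def to_digits(word):
    """Map a purely alphabetic word to its key-digit string."""
    return "".join(CHAR_TO_DIGIT[ch] for ch in word.lower())


def count_textonym_combos(words, dedupe):
    """Count digit combinations that represent more than one word.

    With dedupe=True, repeated dictionary lines count as a single word;
    with dedupe=False, every line counts separately.
    """
    if dedupe:
        words = set(words)
    groups = defaultdict(list)
    for w in words:
        groups[to_digits(w)].append(w)
    return sum(1 for ws in groups.values() if len(ws) > 1)


# The sample dictionary file from this section.
sample = ["aleph", "aleph", "aleph", "bet"]

with_duplicates = count_textonym_combos(sample, dedupe=False)
without_duplicates = count_textonym_combos(sample, dedupe=True)
print(with_duplicates, without_duplicates)
```

If every line is counted, the three copies of "aleph" make their digit combination look like it represents several words; once duplicates are ignored, no combination in this sample does.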
-----