Talk:Textonyms: Difference between revisions
(→Correct number of Textonyms in unixdict.txt?: added a comment and a new talk section.)
(The Go and [[Textonyms#J|J]] examples agree on 661 Textonyms in the [[Textonyms/wordlist]]; the other examples don't give any other values).
:: They do now (specifically, REXX). See the talk section on '''duplicate words in dictionary''' (below). -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 18:27, 10 February 2015 (UTC)
--[[User:dchapes|dchapes]] ([[User talk:dchapes|talk]] | [[Special:Contributions/dchapes|contribs]]) 23:33, 8 February 2015 (UTC)
:::: Good point, I don't think I can mark it "inconsistent" instead of "incorrect" though :). I think the task is fairly clear: "#{3} is the number of #{2} which represent more than one word." Only 1473 of the 22903 numbers "represent more than one word". The task should probably have included a short example that made the "correct" output more obvious. From what little I can guess from the Python code, I thought/assumed that "<tt>sum(1 for w in num2words if len(w) > 1)</tt>" was counting the entries where <tt>len(w) > 1</tt>; if it's instead summing the lengths, I think it must be a mistake/typo, given that the variable name itself is <tt>morethan1word</tt> (rather than something like <tt>pairsOfTextonyms</tt>). --[[User:dchapes|dchapes]] ([[User talk:dchapes|talk]] | [[Special:Contributions/dchapes|contribs]]) 01:22, 9 February 2015 (UTC)
::::: When a digit combination is entered it maps to: A) gibberish; B) one English word; C) multiple English words. The words in case C) are Textonyms. #{3} is the number of digit combinations which fall into category C), so 1473 is the correct answer. --[[User:Nigel Galloway|Nigel Galloway]] ([[User talk:Nigel Galloway|talk]]) 15:33, 9 February 2015 (UTC)
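To make the counting rule above concrete, here is a minimal Python sketch. It is not taken from any of the posted solutions: the names <tt>to_digits</tt> and <tt>count_textonyms</tt> and the sample word list are made up for illustration, and it assumes that words containing characters with no key (such as an apostrophe) are skipped and that repeated words are collapsed.

```python
from collections import defaultdict

# Standard phone keypad letter groups.
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def to_digits(word):
    """Return the key-digit string for word, or None if a character has no key."""
    try:
        return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())
    except KeyError:
        return None


def count_textonyms(words):
    """Return (digit combinations in use, combinations with more than one word)."""
    num2words = defaultdict(set)  # assumption: a set, so repeated words collapse
    for w in words:
        digits = to_digits(w)
        if digits is not None:
            num2words[digits].add(w.lower())
    combos = len(num2words)
    # Count the combinations (category C), not the words that share them.
    more_than_one = sum(1 for ws in num2words.values() if len(ws) > 1)
    return combos, more_than_one


# Hypothetical sample: cat/bat/act share 228, dog/fog share 364.
sample = ["cat", "bat", "act", "dog", "fog", "hello", "it's"]
print(count_textonyms(sample))  # (3, 2)
```

Here five words are Textonyms, but only two digit combinations fall into category C), which is the figure that #{3} asks for.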
==duplicate words in dictionary==

Consider the dictionary file:

<pre>
aleph
aleph
aleph
bet
</pre>

Question: how many words are in that dictionary file?

If you say "two", then that affects the count of how many words can be represented by the ''key digits'' (as described by this Rosetta Code task).

The REXX program that I coded detects duplicate words (it ignores them, but displays a count of them). I believe that duplicate words shouldn't alter the count of words representable by ''key digits''. As currently coded, the REXX program therefore shows a different digit combination count: '''650''' digit combinations instead of '''661'''. The latter figure counts duplicate words and reflects another way to count words representable by ''key digits''.
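
The effect of duplicates can be shown on the four-line dictionary above. The following Python sketch is only an illustration, not the REXX program's actual logic: the function name <tt>textonym_combos</tt> and the <tt>ignore_duplicates</tt> flag are invented here, and the words are assumed to be lowercase a-z only.

```python
from collections import defaultdict

# Letter -> key digit on a standard phone keypad.
LETTER_TO_DIGIT = {
    ch: digit
    for digit, letters in zip(
        "23456789",
        ["abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"])
    for ch in letters
}


def textonym_combos(words, ignore_duplicates):
    """Return (digit combinations holding more than one entry, duplicates skipped)."""
    groups = defaultdict(list)
    seen = set()
    skipped = 0
    for w in words:
        if ignore_duplicates and w in seen:
            skipped += 1  # report the duplicate, but don't count it as a word
            continue
        seen.add(w)
        groups["".join(LETTER_TO_DIGIT[ch] for ch in w)].append(w)
    multi = sum(1 for ws in groups.values() if len(ws) > 1)
    return multi, skipped


dictionary = ["aleph", "aleph", "aleph", "bet"]
print(textonym_combos(dictionary, ignore_duplicates=False))  # (1, 0)
print(textonym_combos(dictionary, ignore_duplicates=True))   # (0, 2)
```

When duplicates are kept, the combination 25374 looks as if it represents three words and is counted; when they are ignored, no combination represents more than one word. A difference such as 650 versus 661 could arise in the same way.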

Better still, it would be nice to have a clean dictionary or, at the least, to agree on whether or not duplicate words should be ignored (and instead report on the unique words that are in the dictionary).

Rdm asked (elsewhere): "what is being counted?" This is the crux of the ambiguity.

The '''UNIXDICT''' dictionary doesn't have that problem, fortunately. In reality, almost all dictionaries have duplicate words (either by meaning, by use, by their derivation/root, by case/capitalization, or by whatever). That shouldn't preclude the correct/accurate counting of (unique) words. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 18:27, 10 February 2015 (UTC)

----- |