Talk:Textonyms: Difference between revisions

→‎Correct number of Textonyms in unixdict.txt?: added a comment and a new talk section.
 
(The Go and [[Textonyms#J|J]] examples agree on 661 Textonyms in the [[Textonyms/wordlist]]; the other examples don't give any other values).
 
:: They do now   (specifically, REXX).   See the talk section on   '''duplicate words in dictionary'''   (below). -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 18:27, 10 February 2015 (UTC)
 
—[[User:dchapes|dchapes]] ([[User talk:dchapes|talk]] | [[Special:Contributions/dchapes|contribs]]) 23:33, 8 February 2015 (UTC)
:::: Good point, I don't think I can mark it "inconsistent" instead of "incorrect" though :). I think the task is fairly clear: "#{3} is the number of #{2} which represent more than one word." Only 1473 of the 22903 numbers "represent more than one word". The task should probably have included a short example that made the "correct" output more obvious. From what little I can guess from the Python code, I thought/assumed that "<tt>sum(1 for w in num2words if len(w) > 1)</tt>" was counting the entries where <tt>len(w) > 1</tt>; if it's instead summing the lengths, I think it must be a mistake/typo, given that the variable name itself is <tt>morethan1word</tt> (rather than something like <tt>pairsOfTextonyms</tt>). &mdash;[[User:dchapes|dchapes]] ([[User talk:dchapes|talk]] | [[Special:Contributions/dchapes|contribs]]) 01:22, 9 February 2015 (UTC)
::::: When a digit combination is entered it maps to: A) Gibberish; B) An English word; C) Multiple English words. Combinations in case C) are Textonyms. #{3} is the number of digit combinations which fall into category C). So 1473 is the correct answer.--[[User:Nigel Galloway|Nigel Galloway]] ([[User talk:Nigel Galloway|talk]]) 15:33, 9 February 2015 (UTC)
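
To make the two readings discussed above concrete, here is a minimal Python sketch. It is not taken from any of the task's solutions: the word list is a made-up toy example, and the names <tt>to_digits</tt>, <tt>group_by_digits</tt>, <tt>combos_with_many</tt> and <tt>words_in_those</tt> are hypothetical. It only assumes the usual phone keypad letter layout. The first sum counts digit combinations that represent more than one word (the #{3} reading); the second sums the number of words in those groups, which is a different quantity.

```python
from collections import defaultdict

# Standard phone keypad layout (assumed; letters only).
KEYS = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
        "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYS.items() for ch in letters}


def to_digits(word):
    """Map a lowercase alphabetic word to its key-digit string."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def group_by_digits(words):
    """Group words by the digit combination that represents them."""
    groups = defaultdict(list)
    for w in words:
        groups[to_digits(w)].append(w)
    return groups


# Hypothetical toy word list, not the task's dictionary.
words = ["a", "b", "c", "ad", "be", "go"]
groups = group_by_digits(words)
# "a", "b", "c" share "2"; "ad", "be" share "23"; "go" alone maps to "46".

# Reading 1: number of digit combinations representing more than one word.
combos_with_many = sum(1 for ws in groups.values() if len(ws) > 1)

# Reading 2: total number of words inside those groups (a different count).
words_in_those = sum(len(ws) for ws in groups.values() if len(ws) > 1)

print(combos_with_many, words_in_those)  # prints: 2 5
```

With this toy list the two readings give 2 and 5 respectively, so a program that sums group sizes will report a larger figure than one that counts groups.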
 
==duplicate words in dictionary==
 
Consider the dictionary file:
<pre>
aleph
aleph
aleph
bet
</pre>
Question: &nbsp; how many words are in that dictionary file?
 
If you say "two", then that affects the count of how many words can be represented by the ''key digits'' &nbsp; (as described by this Rosetta Code task).
 
In the REXX program that I coded, it detects duplicate words (and ignores them, but displays a count). &nbsp; I believe that having duplicate words shouldn't alter the count of words representable by ''key digits''. &nbsp; As the REXX program is currently coded, it ignores duplicated words and it shows a different digit combination count: &nbsp; '''650''' digit combinations instead of '''661'''. &nbsp; The latter figure counts duplicate words and reflects another way to count words representable by ''key digits''.
 
Better still, it would be nice to have a clean dictionary, or at the least, agree on whether or not duplicate words should be ignored &nbsp; (and instead report on unique words that are in the dictionary).
 
It was asked (elsewhere): &nbsp; "what is being counted?" &nbsp; (by Rdm). &nbsp; This is the crux of the ambiguity.
 
The &nbsp; '''UNIXDICT''' &nbsp; dictionary doesn't have that problem, fortunately. &nbsp; In reality, almost all dictionaries have duplicate words (either by meaning, by use, by their derivation/root, by case/capitalization, or by whatever). &nbsp; That shouldn't preclude the correct/accurate counting of (unique) words. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 18:27, 10 February 2015 (UTC)
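
The effect of duplicate entries described in this section can be illustrated with a small Python sketch. This is not the REXX program's logic, only an assumed equivalent: the function name <tt>count_textonym_combos</tt> and its <tt>dedupe</tt> flag are hypothetical, and the keypad layout is the usual one. It runs the four-line example dictionary from above with and without ignoring duplicates.

```python
from collections import defaultdict

# Standard phone keypad layout (assumed; letters only).
KEYS = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
        "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYS.items() for ch in letters}


def count_textonym_combos(words, dedupe):
    """Count digit combinations that represent more than one entry.

    When dedupe is True, repeated words are collapsed first, so a word
    listed several times cannot form a group on its own.
    """
    if dedupe:
        words = set(words)
    groups = defaultdict(list)
    for w in words:
        digits = "".join(LETTER_TO_DIGIT[ch] for ch in w.lower())
        groups[digits].append(w)
    return sum(1 for ws in groups.values() if len(ws) > 1)


# The example dictionary file from this section.
dictionary = ["aleph", "aleph", "aleph", "bet"]

raw = count_textonym_combos(dictionary, dedupe=False)
unique = count_textonym_combos(dictionary, dedupe=True)

print(len(dictionary), len(set(dictionary)))  # prints: 4 2
print(raw, unique)                            # prints: 1 0
```

Without deduplication the three copies of "aleph" are counted as a digit combination with more than one word; with deduplication no combination qualifies. That is the same kind of discrepancy as the 661 versus 650 figures mentioned above, although this sketch does not reproduce those numbers.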
 
-----