Talk:Word frequency: Difference between revisions

It seems the original task author used the regexp \w+ in the Clojure and first Python examples. Maybe he should expand on what \w+ matches and define it as the meaning of a word for the purposes of the task? --[[User:Paddy3118|Paddy3118]] ([[User talk:Paddy3118|talk]]) 20:09, 17 August 2017 (UTC)
:\w means [A-Za-z0-9_]. This could be extended to include accented Latin characters: [A-Za-z0-9_À-ÿ]. But this would not change the fact that the answers are wrong. There are 41082 occurrences of the word 'the', not 41036. The text contains, for instance, "BOOK SECOND--THE FALL". I suspect that the Python and Clojure solutions miss this.--[[User:Nigel Galloway|Nigel Galloway]] ([[User talk:Nigel Galloway|talk]]) 11:51, 18 August 2017 (UTC)
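For anyone who wants to check what \w+ actually picks up, here is a minimal Python 3 sketch. It is not the code from the task page: the helper name <code>word_counts</code> and the short sample string are made up for illustration, and the only assumption about the real solutions is that they tokenise with \w+. It suggests that whether "THE" is counted depends on case folding rather than on the character class, and that in Python 3 \w on a str already matches underscores and accented letters unless re.ASCII is set.

```python
import re
from collections import Counter


def word_counts(text):
    """Count \\w+ tokens after lowercasing (hypothetical helper, not from the task page)."""
    # Lowercase first so 'THE', 'The' and 'the' all land in the same bucket.
    return Counter(re.findall(r"\w+", text.lower()))


# Illustrative sample assembled from the two fragments quoted in this thread.
sample = (
    "BOOK SECOND--THE FALL\n"
    '"The beds," pursued the director, "are very much crowded against each'
)

folded = word_counts(sample)
unfolded = Counter(re.findall(r"\w+", sample))

print(folded["the"])    # all three spellings counted together
print(unfolded["the"])  # only the lowercase spelling counted

# On str patterns \w is Unicode-aware by default; re.ASCII restricts it
# to [A-Za-z0-9_].
print(re.findall(r"\w+", "café_1"))
print(re.findall(r"\w+", "café_1", re.ASCII))
```

If a solution skips the lowercasing step, every capitalised "The" and "THE" is counted as a separate word, which is one plausible way to end up with a lower total for 'the'.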
:They are probably missing two of the three occurrences of 'the' in:
<pre>
"The beds," pursued the director, "are very much crowded against each