Talk:Word frequency: Difference between revisions
Content added Content deleted
Line 41: | Line 41: | ||
It seems the original task author used the regexp \w+ in the Clojure and first Python examples. Maybe he should expand on what \w+ matches and define it as the meaning of a word for the purposes of the task? --[[User:Paddy3118|Paddy3118]] ([[User talk:Paddy3118|talk]]) 20:09, 17 August 2017 (UTC) |
It seems the original task author used the regexp \w+ in the Clojure and first Python examples. Maybe he should expand on what \w+ matches and define it as the meaning of a word for the purposes of the task? --[[User:Paddy3118|Paddy3118]] ([[User talk:Paddy3118|talk]]) 20:09, 17 August 2017 (UTC) |
||
:\w means [A-z0-9]. This could be extended to include accented Latin characters: [A-z0-9À-ÿ]. But this would not change that the answers are wrong. There are 41082 occurrences of the word 'the', not 41036. The text contains for instance "BOOK SECOND--THE FALL". I suspect that the Python and Clojure solutions miss this.--[[User:Nigel Galloway|Nigel Galloway]] ([[User talk:Nigel Galloway|talk]]) 11:51, 18 August 2017 (UTC) |
:\w means [A-z0-9]. This could be extended to include accented Latin characters: [A-z0-9À-ÿ]. But this would not change that the answers are wrong. There are 41082 occurrences of the word 'the', not 41036. The text contains for instance "BOOK SECOND--THE FALL". I suspect that the Python and Clojure solutions miss this.--[[User:Nigel Galloway|Nigel Galloway]] ([[User talk:Nigel Galloway|talk]]) 11:51, 18 August 2017 (UTC) |
||
:They are probably missing the two occurrences of 'the' in: |
:They are probably missing the two of the three occurrences of 'the' in: |
||
<pre> |
<pre> |
||
"The beds," pursued the director, "are very much crowded against each |
"The beds," pursued the director, "are very much crowded against each |