WiktionaryDumps to words: Difference between revisions
(Added C) |
(useful for spell checkers) |
{{draft task}} |
;NOTE: Please help address the issues with this task on the discussion page. If you add another language, be aware that this task may change in the future, and that you will need to update your example.
||
;Task: |
Make a file that can be used with [https://en.wikipedia.org/wiki/Spell_checker spell checkers] such as [https://fr.wikipedia.org/wiki/Ispell Ispell] and [https://en.wikipedia.org/wiki/GNU_Aspell Aspell].
|||
Use the [https://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-articles.xml.bz2 wiktionary dump] (input) to create a file equivalent to [https://manpages.ubuntu.com/manpages/bionic/man5/spanish.5.html "/usr/share/dict/spanish"] (output). The input file is an XML dump of Wiktionary, distributed as a bz2-compressed file of about 800 MB. The output should be a plain text file similar to "/usr/share/dict/spanish", containing one word of a given language per line. An example of such a file is available in Ubuntu in the package '''wspanish'''.
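
One possible approach to the task described above is sketched here in Python. It is an illustration under stated assumptions, not a reference solution: it assumes that an entry belongs to a language when its wikitext contains a level-2 heading such as <code>==Spanish==</code>, and it skips any title containing a colon as a rough way to exclude namespaced pages (this also drops the few real words that contain a colon). The names <code>iter_words</code> and <code>main</code> are hypothetical helpers introduced for this sketch.

```python
import bz2
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_words(stream, language="Spanish"):
    """Yield titles of pages whose wikitext has a '==language==' heading.

    Assumption: a level-2 heading with the language name marks an entry
    for that language. Titles containing ':' are skipped (heuristic).
    """
    heading = re.compile(
        r"^==\s*" + re.escape(language) + r"\s*==\s*$", re.MULTILINE
    )
    title, text = None, ""
    # iterparse streams the document, so the dump is never loaded whole.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            if title and ":" not in title and heading.search(text):
                yield title
            title, text = None, ""
            elem.clear()  # drop the finished page's children to save memory


def main(dump_path, out_path, language="Spanish"):
    """Write one matching word per line, reading the .bz2 dump directly."""
    with bz2.open(dump_path, "rb") as dump, \
            open(out_path, "w", encoding="utf-8") as out:
        for word in iter_words(dump, language):
            out.write(word + "\n")
```

Calling <code>main("enwiktionary-latest-pages-articles.xml.bz2", "spanish.txt")</code> would stream the compressed dump without unpacking it to disk first. The output is in dump order, so it may still need sorting and de-duplication to resemble "/usr/share/dict/spanish".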
||