Inverted index
An inverted index is a data structure that maps each word to the documents containing it. It is commonly used to implement full-text search.
You are encouraged to solve this task according to the task description, using any language you may know.
Given a set of text files, implement a program to create an inverted index. Also create a user interface to search that index, returning a list of the files that contain the query term or terms. The search index can be held in memory.
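As an illustration of the data structure the task describes, here is a minimal Python sketch. It is not one of the submitted solutions, and it makes several assumptions: a "word" is a run of two or more ASCII letters, matching is case-insensitive, a multi-term query returns only the documents that contain every term, and the documents are held in a dictionary rather than read from disk. The names `WORD_RE`, `build_index`, `search` and the sample `docs` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of two or more ASCII letters.
WORD_RE = re.compile(r"[a-zA-Z]{2,}")


def build_index(docs):
    """Map each lowercased word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in WORD_RE.findall(text):
            index[word.lower()].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents that contain every query term."""
    terms = [t.lower() for t in WORD_RE.findall(query)]
    if not terms:
        return []
    # Use .get so that looking up an unknown term does not add it to the index.
    postings = [index.get(t, set()) for t in terms]
    return sorted(set.intersection(*postings))


# Hypothetical sample data standing in for the contents of text files.
docs = {
    "a.txt": "it is what it is",
    "b.txt": "what is it",
    "c.txt": "it is a banana",
}
index = build_index(docs)
print(search(index, "what is"))  # ['a.txt', 'b.txt']
print(search(index, "banana"))   # ['c.txt']
print(search(index, "missing"))  # []
```

Storing each posting list as a set keeps a repeated word from listing the same document twice, and lets a multi-term query be answered with a plain set intersection.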
AutoHotkey
<lang AutoHotkey>
inputbox, files, files, file pattern such as c:\files\*.txt
word2docs := object() ; autohotkey_L is needed.
stime := A_tickcount
; index every matching file: 0 = files only, 1 = recurse into subfolders
Loop, %files%, 0,1
{
   tooltip,%A_index% / 500
   wordList := WordsIn(A_LoopFileFullPath)
   InvertedIndex(wordList, A_loopFileFullpath)
}
tooltip
msgbox, % "total time " (A_tickcount-stime)/1000
gosub, search
return

search:
Loop
{
   InputBox, keyword , input single keyword only
   ; stop asking once the user cancels the dialog
   if ErrorLevel
      break
   msgbox, % foundDocs := findword(keyword)
}
return
WordsIn(docpath) {
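   FileRead, content, %docpath%
   spos = 1
   Loop
   {
      ; find the next run of two or more ASCII letters, starting at spos
      if !(spos := Regexmatch(content, "[a-zA-Z]{2,}",match, spos))
         break
      spos += strlen(match)
      this_wordList .= match "`n"
   }
   ; sort the newline-separated list and drop duplicate words
   Sort, this_wordList, U
   return this_wordList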
}
InvertedIndex(byref words, docpath) {
global word2docs
   loop, parse, words, `n,`r
   {
      if A_loopField =
         continue
      word2docs[A_loopField] := word2docs[A_loopField] docpath "`n"
   }
}
findWord(word2find) {
global word2docs
   if (word2docs[word2find] = "")
      return ""
   else
      return word2docs[word2find]
} </lang>