Talk:Soundex: Difference between revisions
::The bottom line is that there is no standard soundex definition. Referencing Wikipedia is pointless since it's constantly changing, and referencing Knuth doesn't seem to be any better. It would have helped if this task had actually included the algorithm (any algorithm) as part of the task definition, but it's far too late to fix that now. Just accept the fact that half the solutions are going to implement the H/W rule and half aren't. --[[User:J4 james|j4_james]] ([[User talk:J4 james|talk]]) 00:44, 23 October 2015 (UTC)
:::I have the second edition of Knuth's TAOCP vol. 3, and it states in rule 3 p. 394: ''"If two or more letters with the same code were adjacent in the original name (before step 1), or adjacent except for intervening h's and w's, omit all but the first."'' This seems to follow the "A261" rule. However, a Google search shows me that both conventions are widespread. There is even a case where both algorithms seem to be used in the same place (probably a bug): [http://rsl.rootsweb.ancestry.com/cgi-bin/rslsql.cgi here] and [http://resources.rootsweb.ancestry.com/cgi-bin/soundexconverter here] at rootsweb.ancestry.com. Try ASHCROFT in both (select "soundex" in "Select type of search" in the first page). There is a [http://search.cpan.org/~rjbs/Text-Soundex-3.05/Soundex.pm Perl package] providing both functions as '''soundex''' and '''soundex_nara''', and so does [https://www.stata.com/help.cgi?soundex Stata]. The book "SQL for Smarties: Advanced SQL Programming" states on page 245 that both methods were actually used by the Census Bureau (
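To make the difference between the two conventions concrete, here is a minimal Python sketch. It is an illustration only, not taken from any of the task's solutions: the function name <code>soundex</code> and the <code>hw_rule</code> flag are hypothetical, and the sketch assumes plain ASCII names. With <code>hw_rule=True</code> it follows the rule quoted from Knuth above (H and W do not separate letters with the same code); with <code>hw_rule=False</code> H and W separate codes just as vowels do.

```python
# Letter-to-digit table shared by both variants.
CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name, hw_rule=True):
    """Return a 4-character Soundex code for an ASCII name.

    hw_rule=True : H and W are transparent, so equal codes on either
                   side of them collapse into one (the rule quoted above).
    hw_rule=False: H and W separate equal codes, like vowels.
    """
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")  # code of the previous significant letter
    for c in letters[1:]:
        if hw_rule and c in "HW":
            continue  # skipped entirely: prev is left unchanged
        code = CODES.get(c, "")  # vowels (and H/W without the rule) give ""
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]


print(soundex("ASHCROFT", hw_rule=True))   # A261
print(soundex("ASHCROFT", hw_rule=False))  # A226
```

Names without an H or W between same-coded letters, such as ROBERT, come out the same under both settings, which is probably why the discrepancy often goes unnoticed.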