Talk:Soundex: Difference between revisions
Revision as of 15:35, 13 November 2009
Task Improvement
It's all very well to have "do soundex" as a task, but it would be far better if we had a more concrete task. For example, attempting spelling corrections on a short text using soundex matching against a supplied dictionary. Right now, it feels like this task isn't really going anywhere. –Donal Fellows 15:51, 12 November 2009 (UTC)
- A contributor to the problem is that there's no algorithm. I looked at the one on WP and it doesn't make that much sense to me. I checked the talk page and there's an alternate algorithm proposed, but it apparently doesn't cover all cases. Also, for languages without built-in soundex libraries, doing the conversion alone seems like task enough to me. --Mwn3d 16:07, 12 November 2009 (UTC)
- Fair point about languages without soundex in libs. Maybe the other idea would be better as a task that builds on this one… –Donal Fellows 16:30, 12 November 2009 (UTC)
- As I understand, there are different soundex algorithms, based somewhat on the language and on the applications. I also read the Wikipedia entry and it does not present the algorithm clearly. A couple of years ago I needed an algorithm to match information for new entries in a database to existing names. That's when I ran across the soundex algorithm. --Rldrenth 21:03, 12 November 2009 (UTC)
Which Soundex?
It isn't clear which Soundex algorithm each example is implementing. For example, the US Census rules have a special case for "H" and "W": they are ignored, but unlike vowels they don't separate consonants that share a code. I suggest adding a set of test cases to the problem description which can distinguish between the many variants of Soundex out there. For starters:
A261 for Ashcraft
B620 for Burroughs and Burrows
--IanOsgood 15:35, 13 November 2009 (UTC)
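To make the difference concrete, here is a minimal sketch in Python of how the "H"/"W" rule could separate the variants. It is only an illustration, not the task's reference implementation: the function name `soundex` and the `hw_transparent` flag are hypothetical, names are assumed to be plain ASCII, and the variant where "H" and "W" behave like vowels is just one of the simplified readings seen in practice.

```python
# Letter groups for the usual Soundex digits 1..6.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name, hw_transparent=True):
    """Return a 4-character Soundex code for an ASCII name.

    hw_transparent=True  : H and W are skipped and do NOT separate
                           consonants with the same code (US Census rule).
    hw_transparent=False : H and W act like vowels and DO separate them
                           (a simplified variant, assumed for comparison).
    """
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    # The first letter's own code also suppresses an immediate repeat.
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if hw_transparent and ch in "HW":
            continue  # skipped; the previous code is still remembered
        code = CODES.get(ch, "")  # vowels (and H/W in the simple variant) give ""
        if code and code != prev:
            result += code
        prev = code
    # Pad with zeros and cut to one letter plus three digits.
    return (result + "000")[:4]


for name in ("Ashcraft", "Burroughs", "Burrows"):
    print(name, soundex(name), soundex(name, hw_transparent=False))
```

With the Census rule this sketch gives A261 for Ashcraft and B620 for both Burroughs and Burrows. With "H" and "W" treated like vowels, Ashcraft becomes A226 and Burroughs becomes B622, so those two names do tell the variants apart; Burrows stays B620 either way.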