Talk:Soundex: Difference between revisions


Revision as of 15:35, 13 November 2009

Task Improvement

It's all very well to have "do soundex" as a task, but it would be far better if we had a more concrete task. For example, attempting spelling corrections on a short text using soundex matching against a supplied dictionary. Right now, it feels like this task isn't really going anywhere. –Donal Fellows 15:51, 12 November 2009 (UTC)

Part of the problem is that the task doesn't specify an algorithm. I looked at the one on WP and it doesn't make much sense to me. I checked its talk page and there's an alternate algorithm proposed there, but it apparently doesn't cover all cases. Also, for languages without built-in soundex libraries, doing the conversion alone seems like task enough to me. --Mwn3d 16:07, 12 November 2009 (UTC)

Fair point about languages without soundex in libs. Maybe the other idea would be better as a task that builds on this one… –Donal Fellows 16:30, 12 November 2009 (UTC)

As I understand it, there are different soundex algorithms, based somewhat on the language and on the applications. I also read the Wikipedia entry and it does not present the algorithm clearly. A couple of years ago I needed an algorithm to match information for new entries in a database to existing names. That's when I ran across the soundex algorithm. --Rldrenth 21:03, 12 November 2009 (UTC)

Which Soundex?

It isn't clear which Soundex algorithm each example is implementing. For example, the US Census rules have a special case for "H" and "W": they are ignored, but unlike vowels they don't separate runs of consonants, so two letters with the same code on either side of an "H" or "W" are coded only once. I suggest adding a set of test cases to the problem description which can distinguish between the many variants of Soundex out there. For starters:

A261 for Ashcraft
B620 for Burroughs and Burrows

--IanOsgood 15:35, 13 November 2009 (UTC)
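
To make the difference concrete, here is a minimal Python sketch of the H/W rule described above. It is not taken from any of the task's examples: the function name `soundex` and the `hw_transparent` flag are made up for illustration, and it assumes the usual American letter-to-digit table (BFPV=1, CGJKQSXZ=2, DT=3, L=4, MN=5, R=6) with zero padding to four characters. With the flag on, "H" and "W" are skipped without breaking a run of same-coded consonants; with it off, they are treated like vowels, which is one common simplified variant.

```python
# Letter-to-digit table assumed for American Soundex.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name, hw_transparent=True):
    """Return a 4-character Soundex code (illustrative sketch only).

    hw_transparent=True  : H and W are skipped and do NOT separate
                           consonants that share a code.
    hw_transparent=False : H and W behave like vowels and DO separate them.
    """
    # Keep only ASCII letters, upper-cased.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = letters[0]
    # The first letter's own code takes part in duplicate suppression.
    prev = CODES.get(letters[0], "")

    for ch in letters[1:]:
        if hw_transparent and ch in "HW":
            continue  # skip without resetting prev
        code = CODES.get(ch, "")  # vowels (and Y) map to ""
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept

    # Pad with zeros and truncate to four characters.
    return (result + "000")[:4]


for test_name in ("Ashcraft", "Burroughs", "Burrows"):
    print(test_name, soundex(test_name), soundex(test_name, hw_transparent=False))
```

Under these assumptions the sketch gives A261, B620 and B620 with the flag on, matching the proposed test cases. With the flag off, Ashcraft becomes A226 and Burroughs becomes B622, while Burrows stays B620, so Ashcraft and Burroughs are the cases that actually tell the two variants apart.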

@@ Line 4: / Line 4: @@
 ::Fair point about languages without soundex in libs. Maybe the other idea would be better as a task that builds on this one… –[[User:Dkf|Donal Fellows]] 16:30, 12 November 2009 (UTC)
 :As I understand it, there are different soundex algorithms, based somewhat on the language and on the applications. I also read the Wikipedia entry and it does not present the algorithm clearly. A couple of years ago I needed an algorithm to match information for new entries in a database to existing names. That's when I ran across the soundex algorithm. --[[User:Rldrenth|Rldrenth]] 21:03, 12 November 2009 (UTC)
+== Which Soundex? ==
+It isn't clear which Soundex algorithm each example is implementing. For example, the US Census rules have a special case for "H" and "W": they are ignored, but unlike vowels they don't separate runs of consonants, so two letters with the same code on either side of an "H" or "W" are coded only once. I suggest adding a set of test cases to the problem description which can distinguish between the many variants of Soundex out there. For starters:
+ A261 for Ashcraft
+ B620 for Burroughs and Burrows
+--[[User:IanOsgood|IanOsgood]] 15:35, 13 November 2009 (UTC)