Talk:Soundex: Difference between revisions

From Rosetta Code
Revision as of 15:35, 13 November 2009

Task Improvement

It's all very well to have "do soundex" as a task, but it would be far better if we had a more concrete task. For example, attempting spelling corrections on a short text using soundex matching against a supplied dictionary. Right now, it feels like this task isn't really going anywhere. –Donal Fellows 15:51, 12 November 2009 (UTC)

A contributor to the problem is that there's no algorithm. I looked at the one on WP and it doesn't make that much sense to me. I checked the talk page and there's an alternate algorithm proposed, but it apparently doesn't cover all cases. Also, for languages without built-in soundex libraries, doing the conversion alone seems like task enough to me. --Mwn3d 16:07, 12 November 2009 (UTC)
Fair point about languages without soundex in libs. Maybe the other idea would be better as a task that builds on this one… –Donal Fellows 16:30, 12 November 2009 (UTC)
As I understand it, there are different soundex algorithms, based somewhat on the language and on the applications. I also read the Wikipedia entry and it does not present the algorithm clearly. A couple of years ago I needed an algorithm to match information for new entries in a database to existing names. That's when I ran across the soundex algorithm. --Rldrenth 21:03, 12 November 2009 (UTC)

Which Soundex?

It isn't clear which Soundex algorithm each example is implementing. For example, the US Census rules have a special case for "H" and "W": they are ignored, but unlike vowels they don't separate runs of consonants that share a code. I suggest adding a set of test cases to the problem description which can distinguish between the many variants of Soundex out there. For starters:

A261 for Ashcraft
B620 for Burroughs and Burrows

--IanOsgood 15:35, 13 November 2009 (UTC)
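
To illustrate how the test cases above tell variants apart, here is a minimal Python sketch. It assumes the commonly documented American Soundex letter groups, and the function name `soundex` and the flag `census_hw_rule` are made up for this example; it is not taken from any entry on the task page. With the flag on, "H" and "W" are skipped without breaking a run of same-coded consonants; with it off, they break runs the way vowels do.

```python
# Letter groups of American Soundex: BFPV=1, CGJKQSXZ=2, DT=3, L=4, MN=5, R=6.
CODES = {}
for digit, group in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for letter in group:
        CODES[letter] = str(digit)


def soundex(name: str, census_hw_rule: bool = True) -> str:
    """Return a 4-character Soundex code (hypothetical helper for illustration).

    census_hw_rule=True : H and W are ignored and do NOT separate
                          consonants that share a code.
    census_hw_rule=False: H and W behave like vowels and DO separate them.
    """
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    # The first letter's own code also takes part in duplicate suppression.
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if census_hw_rule and ch in "HW":
            continue  # skipped entirely; prev is left unchanged
        code = CODES.get(ch, "")  # "" for vowels, Y, and (if rule off) H/W
        if code and code != prev:
            result += code
        prev = code
    # Pad with zeros and truncate to one letter plus three digits.
    return (result + "000")[:4]


for surname in ("Ashcraft", "Burroughs", "Burrows"):
    print(surname, soundex(surname), soundex(surname, census_hw_rule=False))
```

Under these assumptions, Ashcraft and Burroughs come out differently depending on the flag, while Burrows gives B620 either way, so the first two are the cases that actually discriminate between the variants.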