Jaro similarity: Difference between revisions
Correct distance to similarity; tried to clarify definition of transpositions as well.
m (Markjreed moved page Jaro distance to Jaro similarity: Described task calculates the similarity (1=identical) rather than the distance (0=identical))
Line 1:
{{task}}
The Jaro distance is a measure of edit distance between two strings; its complement, called the ''Jaro similarity'', is a measure of two strings' similarity: the higher the value, the more similar the strings are. The similarity score is normalized such that '''0''' means no similarity and '''1''' means an exact match.
;Definition
The Jaro
: <math>d_j = \left\{
Line 24 ⟶ 20:
Two characters, one from <math>s_1</math> and one from <math>s_2</math>, are considered ''matching'' only if they are the same and are not farther apart than <math>\left\lfloor\frac{\max(|s_1|,|s_2|)}{2}\right\rfloor-1</math> characters.
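
The matching rule above can be sketched in Python as follows. This is only an illustrative sketch: the helper names <code>match_window</code> and <code>can_match</code> are hypothetical, and clamping the window at 0 for very short strings (where the formula would otherwise give -1) is an assumption, not something the definition states.

```python
def match_window(s1: str, s2: str) -> int:
    """Largest index gap allowed between two matching characters."""
    # floor(max(|s1|, |s2|) / 2) - 1, clamped at 0 (assumption) so that
    # one-character strings can still match at the same position.
    return max(0, max(len(s1), len(s2)) // 2 - 1)


def can_match(s1: str, i: int, s2: str, j: int) -> bool:
    """True if s1[i] and s2[j] are equal and close enough to match."""
    return s1[i] == s2[j] and abs(i - j) <= match_window(s1, s2)
```

For ("MARTHA", "MARHTA") the window is 2, so the T at index 3 of the first string may match the T at index 4 of the second, while the A at index 1 may not match the A at index 5.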
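Each character of <math>s_1</math> is compared with all its matching characters in <math>s_2</math>. Each difference in position is half a ''transposition''; that is, the number of transpositions is half the number of characters which are common to the two strings but occupy different positions in each one. For example, in ("MARTHA", "MARHTA") all six characters match, but T and H appear in swapped order, so two characters are out of position and the pair counts as one transposition.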
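
Putting matching and transpositions together, a minimal Python sketch might look like the one below. It assumes the commonly used Jaro formula, (m/|s1| + m/|s2| + (m - t)/m) / 3 with a result of 0 when m = 0, where m is the number of matching characters and t the number of transpositions. The function name <code>jaro</code>, the greedy "first unmatched character in the window" pairing, and the treatment of two empty strings as identical are assumptions made for illustration.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity of s1 and s2 (sketch; see assumptions above)."""
    if not s1 and not s2:
        return 1.0  # assumption: two empty strings count as an exact match

    # Maximum index gap for a match, clamped at 0 for very short strings.
    window = max(0, max(len(s1), len(s2)) // 2 - 1)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)

    m = 0  # number of matching characters
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            # Greedy pairing: take the first unused equal character in range.
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                m += 1
                break

    if m == 0:
        return 0.0

    # Matched characters of each string, kept in their original order.
    seq1 = [c for c, used in zip(s1, matched1) if used]
    seq2 = [c for c, used in zip(s2, matched2) if used]

    # Each position where the two sequences disagree is half a transposition.
    t = sum(a != b for a, b in zip(seq1, seq2)) / 2

    return (m / len(s1) + m / len(s2) + (m - t) / m) / 3


if __name__ == "__main__":
    print(f'{jaro("MARTHA", "MARHTA"):.4f}')
```

With ("MARTHA", "MARHTA") this sketch finds m = 6 and t = 1, giving (1 + 1 + 5/6) / 3, roughly 0.9444.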
Line 50 ⟶ 42:
;Task
Implement the Jaro
* ("MARTHA", "MARHTA")