Jaro-Winkler distance: Difference between revisions
(Added TypeScript implementation)
m (typo)
(4 intermediate revisions by 4 users not shown)
Line 2:
The Jaro-Winkler distance is a metric for measuring the edit distance between words.
It is similar to the more basic
for transpositions between letters in the words. With the Winkler modification to the Jaro
metric, the Jaro-Winkler distance also adds an increase in similarity for words which
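As an illustration of the description above, here is a minimal Python sketch that is not taken from any entry on this page. It assumes the conventional Winkler parameters (a scaling factor <code>p</code> of 0.1 and a common prefix capped at 4 characters) and takes the distance to be 1 minus the Jaro-Winkler similarity. The function and parameter names are illustrative only.

<syntaxhighlight lang="python">
def jaro_similarity(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two equal characters count as a match only within this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that line up in a different order are transposed.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) / 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler_distance(s1: str, s2: str,
                          p: float = 0.1, max_prefix: int = 4) -> float:
    """1 - Jaro-Winkler similarity (p and max_prefix are assumed defaults)."""
    sim = jaro_similarity(s1, s2)
    # Length of the shared prefix, capped at max_prefix characters.
    prefix = 0
    for a, b in zip(s1[:max_prefix], s2[:max_prefix]):
        if a != b:
            break
        prefix += 1
    # Winkler modification: boost similarity for words with a common prefix.
    return 1.0 - (sim + prefix * p * (1.0 - sim))
</syntaxhighlight>

Under these assumptions, <code>jaro_winkler_distance("accomodate", "accommodate")</code> evaluates to about 0.0182.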
Line 76:
{{trans|Python}}
<syntaxhighlight lang="11l">
V MISSPELLINGS = [‘accomodate’,
‘definately’,
Line 123:
print("\nClose dictionary words ( distance < 0.15 using Jaro-Winkler distance) to \" "STR" \" are:\n Word | Distance")
L(w) within_distance(0.15, STR, 5)
print(‘#14 | #.4’.format(w, jaro_winkler_distance(STR, w)))</syntaxhighlight>
{{out}}
Line 155:
=={{header|Elm}}==
Author: zh5
<syntaxhighlight lang="elm">
Line 317:
else
result
</syntaxhighlight>
=={{header|ALGOL 68}}==
Line 326:
<br>
Prints the 6 closest matches regardless of their distance (i.e. we don't restrict it to matches closer than 0.15).
<syntaxhighlight lang="algol68">
IF STRING s1 = sp1[ AT 0 ];
STRING s2 = sp2[ AT 0 ];
Line 457:
print( ( newline ) )
OD
FI</syntaxhighlight>
{{out}}
<pre>
Line 535:
=={{header|C++}}==
{{trans|Swift}}
<syntaxhighlight lang="cpp">
#include <cstdlib>
#include <fstream>
Line 638:
}
return EXIT_SUCCESS;
}</syntaxhighlight>
{{out}}
Line 718:
=={{header|F_Sharp|F#}}==
This task uses [http://www.rosettacode.org/wiki/Jaro_distance#F.23 Jaro Distance (F#)]
<syntaxhighlight lang="fsharp">
// Calculate Jaro-Winkler Similarity of 2 Strings. Nigel Galloway: August 7th., 2020
let Jw P n g=let L=float(let i=Seq.map2(fun n g->n=g) n g in (if Seq.length i>4 then i|>Seq.take 4 else i)|>Seq.takeWhile id|>Seq.length)
Line 727:
["accomodate";"definately";"goverment";"occured";"publically";"recieve";"seperate";"untill";"wich"]|>
List.iter(fun n->printfn "%s" n;fN n|>Array.take 5|>Array.iter(fun n->printf "%A" n);printfn "\n")
</syntaxhighlight>
{{out}}
<pre>
Line 760:
=={{header|Go}}==
This uses unixdict and borrows code from the [[Jaro_distance#Go]] task. Otherwise it is a translation of the Wren entry.
<syntaxhighlight lang="go">
import (
Line 889:
fmt.Println()
}
}</syntaxhighlight>
{{out}}
Line 967:
Implementation:
<syntaxhighlight lang="j">
Eq=. (x=/y)*(<.<:-:x>.&#y)>:|x -/&i.&# y
xM=. (+./"1 Eq)#x
Line 981:
simj=. x jaro y
-.simj + l*p*-.simj
}}</syntaxhighlight>
Task example:
<syntaxhighlight lang="j">
words=. <;._2 fread '/usr/share/dict/words'
for_word. ;:'accomodate definately goverment occured publically recieve seperate untill wich' do.
Line 1,040:
0.0533333 winch
0.0533333 witch
</syntaxhighlight>
=={{header|Java}}==
{{trans|C++}}
<syntaxhighlight lang="java">
import java.util.*;
Line 1,152:
}
}
}</syntaxhighlight>
{{out}}
Line 1,236:
This entry, which uses unixdict.txt, borrows the implementation in jq of the Jaro similarity measure as defined at
[[Jaro_similarity#jq]]; since it is quite long, it is not repeated here.
<syntaxhighlight lang="jq">
def length_of_common_prefix($s1; $s2):
Line 1,267:
(.[] | "\(.[0] | lpad(21)) : \(.[-1] * 1000 | round / 1000)") ;
task</syntaxhighlight>
{{out}}
Invocation: jq -rRn -f program.jq unixdict.txt
Line 1,322:
=={{header|Julia}}==
<syntaxhighlight lang="julia">
const words = read("linuxwords.txt", String) |> split .|> strip
Line 1,372:
end
end
</syntaxhighlight>
<pre>
Close dictionary words ( distance < 0.15 using Jaro-Winkler distance) to 'accomodate' are:
Line 1,448:
=={{header|Mathematica}}/{{header|Wolfram Language}}==
<syntaxhighlight lang="mathematica">
JWD[a_][b_]:=Experimental`JaroWinklerDistance[a,b]
dict=DictionaryLookup[];
Line 1,459:
TakeSmallestBy[dict->{"Element","Value"},JWD["seperate"],5]//Grid
TakeSmallestBy[dict->{"Element","Value"},JWD["untill"],5]//Grid
TakeSmallestBy[dict->{"Element","Value"},JWD["wich"],5]//Grid</syntaxhighlight>
{{out}}
<pre>accommodate 0.0181818
Line 1,517:
=={{header|Nim}}==
{{trans|Go}}
<syntaxhighlight lang="nim">
func jaroSim(s1, s2: string): float =
Line 1,589:
echo &"{c.dist:0.4f} {c.word}"
if i == 5: break
echo()</syntaxhighlight>
{{out}}
Line 1,662:
=={{header|Perl}}==
<syntaxhighlight lang="perl">
use warnings;
use List::Util qw(min max head);
Line 1,714:
printf "%15s : %0.4f\n", $_, $J{$_}
for head 5, sort { $J{$a} <=> $J{$b} or $a cmp $b } grep { $J{$_} < 0.15 } keys %J;
}</syntaxhighlight>
{{out}}
<pre style="height:40ex">Closest 5 dictionary words with a Jaro-Winkler distance < .15 from 'accomodate':
Line 1,781:
=={{header|Phix}}==
Uses jaro() from [[Jaro_distance#Phix]] (reproduced below for your convenience) and the standard unix_dict()
<!--<syntaxhighlight lang="phix">-->
<span style="color: #008080;">function</span> <span style="color: #000000;">jaro</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">str1</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">str2</span><span style="color: #0000FF;">)</span>
<span style="color: #000000;">str1</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">trim</span><span style="color: #0000FF;">(</span><span style="color: #7060A8;">upper</span><span style="color: #0000FF;">(</span><span style="color: #000000;">str1</span><span style="color: #0000FF;">))</span>
Line 1,858:
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<!--</syntaxhighlight>-->
Output identical to <del>Go/Wren</del> Algol68
=={{header|Python}}==
<syntaxhighlight lang="python">"""
Test Jaro-Winkler distance metric.
linuxwords.txt is from http://users.cs.duke.edu/~ola/ap/linuxwords
Line 1,928:
for w in within_distance(0.15, STR, 5):
print('{:>14} | {:6.4f}'.format(w, jaro_winkler_distance(STR, w)))
</syntaxhighlight>
<pre>
Close dictionary words ( distance < 0.15 using Jaro-Winkler distance) to " accomodate " are:
Line 2,009:
using the unixdict.txt file from www.puzzlers.org
<syntaxhighlight lang="raku">
return 0 if $s eq $t;
Line 2,070:
printf "%15s : %0.4f\n", .key, .value for %result.grep({ .value < .15 }).sort({+.value, ~.key}).head(5);
}</syntaxhighlight>
{{out}}
<pre>Closest 5 dictionary words with a Jaro-Winkler distance < .15 from accomodate:
Line 2,136:
=={{header|Rust}}==
{{trans|Python}}
<syntaxhighlight lang="rust">
use std::io::{self, BufRead};
Line 2,246:
Err(error) => eprintln!("{}", error),
}
}</syntaxhighlight>
{{out}}
Line 2,326:
=={{header|Swift}}==
{{trans|Rust}}
<syntaxhighlight lang="swift">
func loadDictionary(_ path: String) throws -> [String] {
Line 2,415:
} catch {
print(error.localizedDescription)
}</syntaxhighlight>
{{out}}
Line 2,495:
=={{header|Typescript}}==
{{trans|Java}}
<syntaxhighlight lang="typescript">
var fs = require('fs')
Line 2,530:
}
}
if (!matches) {
return 1.0
}
Line 2,620:
}
main();
</syntaxhighlight>
{{out}}
<pre>
Line 2,687:
</pre>
=={{header|V (Vlang)}}==
{{trans|Go}}
<syntaxhighlight lang="v (vlang)">import os
fn jaro_sim(str1 string, str2 string) f64 {
Line 2,805:
println('')
}
}</syntaxhighlight>
{{out}}
Line 2,883:
{{libheader|Wren-sort}}
This uses unixdict and borrows code from the [[Jaro_distance#Wren]] task.
<syntaxhighlight lang="wren">
import "./fmt" for Fmt
import "./sort" for Sort
var jaroSim = Fn.new { |s1, s2|
Line 2,955:
for (c in closest.take(6)) Fmt.print("$0.4f $s", c[1], c[0])
System.print()
}</syntaxhighlight>
{{out}}