How to resolve the algorithm Jaro similarity step by step in the Haxe programming language
Problem Statement
The Jaro distance is a measure of edit distance between two strings; its complement, the Jaro similarity, measures how alike two strings are: the higher the value, the more similar the strings. The score is normalized so that 0 means no similarity and 1 means an exact match.
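The Jaro similarity $d_j$ of two given strings $s_1$ and $s_2$ is

$$d_j = \begin{cases} 0 & \text{if } m = 0 \\ \frac{1}{3}\left(\frac{m}{|s_1|} + \frac{m}{|s_2|} + \frac{m - t}{m}\right) & \text{otherwise} \end{cases}$$

Where:
$m$ is the number of matching characters;
$t$ is the number of transpositions (half the number of matching characters that occupy different positions in the two strings).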
Two characters from $s_1$ and $s_2$ respectively are considered matching only if they are the same and not farther apart than $\left\lfloor \frac{\max(|s_1|, |s_2|)}{2} \right\rfloor - 1$ characters.

Each character of $s_1$ is compared with all its matching characters in $s_2$. Each difference in position is half a transposition; that is, the number of transpositions is half the number of characters which are common to the two strings but occupy different positions in each one.
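The matching window described above can be sketched in a few lines of Haxe. This is only an illustration: the class name MatchWindow and the helper matchDistance are hypothetical, and the strings DWAYNE and DUANE are used simply as sample input.

```haxe
class MatchWindow {
    // Hypothetical helper: floor(max(len1, len2) / 2) - 1, as in the definition above.
    public static function matchDistance(len1:Int, len2:Int):Int {
        var longest = len1 > len2 ? len1 : len2;
        // Int / Int yields a Float in Haxe, so truncate before subtracting.
        return Std.int(longest / 2) - 1;
    }

    static function main() {
        var s1 = "DWAYNE";
        var s2 = "DUANE";
        var d = matchDistance(s1.length, s2.length);
        trace('match distance: $d');
        for (i in 0...s1.length) {
            // Clamp the window [i - d, i + d] to the bounds of s2.
            var start = i - d < 0 ? 0 : i - d;
            var end = i + d + 1 > s2.length ? s2.length : i + d + 1;
            trace('${s1.charAt(i)} (index $i): compare with s2 indices $start to ${end - 1}');
        }
    }
}
```

Truncating before the subtraction keeps the window an integer even when the longer string has odd length.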
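Given the strings $s_1$ = DWAYNE and $s_2$ = DUANE we find:
$m = 4$ matching characters (D, A, N, E)
$|s_1| = 6$
$|s_2| = 5$
$t = 0$ transpositions

We find a Jaro score of:

$$d_j = \frac{1}{3}\left(\frac{4}{6} + \frac{4}{5} + \frac{4 - 0}{4}\right) \approx 0.822$$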
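The arithmetic for this example can be checked with a short Haxe sketch. It assumes the counts have already been found by hand: for DWAYNE and DUANE, 4 matching characters (D, A, N, E), no transpositions, and lengths 6 and 5. The class name JaroFormula and the helper fromCounts are hypothetical; the expression mirrors the final return statement of the full implementation.

```haxe
class JaroFormula {
    // Hypothetical helper: Jaro similarity from already-counted quantities.
    // m = matching characters, t = transpositions (half the out-of-order matches).
    public static function fromCounts(m:Int, t:Float, len1:Int, len2:Int):Float {
        if (m == 0) return 0.0;
        // Int / Int yields a Float in Haxe, so no explicit casts are needed.
        return (m / len1 + m / len2 + (m - t) / m) / 3.0;
    }

    static function main() {
        // DWAYNE vs DUANE: m = 4, t = 0, lengths 6 and 5 (about 0.822)
        trace(fromCounts(4, 0, 6, 5));
        // MARTHA vs MARHTA: m = 6, t = 1 because T and H are swapped (about 0.944)
        trace(fromCounts(6, 1, 6, 6));
    }
}
```

The second call shows the effect of a nonzero transposition count: all six characters match, but the swapped pair lowers the third term from 1 to 5/6.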
Implement the Jaro algorithm and show the similarity scores for each of the following pairs:
("MARTHA", "MARHTA")
("DIXON", "DICKSONX")
("JELLYFISH", "SMELLYFISH")
Source code in the Haxe programming language
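class Jaro {
    // Returns the Jaro similarity of s1 and s2: 0 means no similarity, 1 an exact match.
    private static function jaro(s1: String, s2: String): Float {
        var s1_len = s1.length;
        var s2_len = s2.length;
        if (s1_len == 0 && s2_len == 0) return 1;
        // floor(max(|s1|, |s2|) / 2) - 1. Int / Int yields a Float in Haxe,
        // so truncate before subtracting to keep the window an Int.
        var match_distance = Std.int(Math.max(s1_len, s2_len) / 2) - 1;
        // Flags recording which characters of each string have been matched.
        var matches = { s1: [for (n in 0...s1_len) false], s2: [for (n in 0...s2_len) false] };
        var m = 0;
        // Pair each character of s1 with the first unmatched equal character of s2 in its window.
        for (i in 0...s1_len) {
            var start = Std.int(Math.max(0, i - match_distance));
            var end = Std.int(Math.min(i + match_distance + 1, s2_len));
            for (j in start...end)
                if (!matches.s2[j] && s1.charAt(i) == s2.charAt(j)) {
                    matches.s1[i] = true;
                    matches.s2[j] = true;
                    m++;
                    break;
                }
        }
        if (m == 0) return 0;
        // Walk the matched characters of both strings in order;
        // each pair that differs counts as half a transposition.
        var k = 0;
        var t = 0.;
        for (i in 0...s1_len)
            if (matches.s1[i]) {
                while (!matches.s2[k]) k++;
                if (s1.charAt(i) != s2.charAt(k++)) t += 0.5;
            }
        return (m / s1_len + m / s2_len + (m - t) / m) / 3.0;
    }

    public static function main() {
        Sys.println(jaro("MARTHA", "MARHTA"));
        Sys.println(jaro("DIXON", "DICKSONX"));
        Sys.println(jaro("JELLYFISH", "SMELLYFISH"));
    }
}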