How to resolve the algorithm Jaro similarity step by step in the Haxe programming language

Published on 12 May 2024 09:40 PM

Problem Statement

The Jaro distance is a measure of edit distance between two strings; its complement, the Jaro similarity, measures how alike two strings are: the higher the value, the more similar the strings. The score is normalized so that 0 means no similarity and 1 is an exact match.

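The Jaro similarity $d_j$ of two given strings $s_1$ and $s_2$ is

$$d_j = \begin{cases} 0 & \text{if } m = 0 \\ \dfrac{1}{3}\left(\dfrac{m}{|s_1|} + \dfrac{m}{|s_2|} + \dfrac{m - t}{m}\right) & \text{otherwise} \end{cases}$$

Where:

$m$ is the number of matching characters;
$t$ is the number of transpositions.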

Two characters from $s_1$ and $s_2$ respectively are considered matching only if they are the same and not farther apart than $\left\lfloor \frac{\max(|s_1|,|s_2|)}{2} \right\rfloor - 1$ characters. Each character of $s_1$ is compared with all its matching characters in $s_2$. Each difference in position is half a transposition; that is, the number of transpositions is half the number of characters which are common to the two strings but occupy different positions in each one.
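The matching window described above can be sketched in Haxe as follows. This is a minimal illustration, not part of the original listing: the class `MatchWindow` and the helper `matchDistance` are hypothetical names introduced here, and the sketch only reports which index range of the second string each character of the first string is allowed to match in.

```haxe
class MatchWindow {
    // Hypothetical helper (not from the original listing): the largest index
    // gap at which two equal characters still count as matching.
    public static function matchDistance(len1:Int, len2:Int):Int {
        var longer = len1 > len2 ? len1 : len2;
        // Int / Int yields a Float in Haxe; Std.int truncates it, which
        // equals floor for non-negative values.
        return Std.int(longer / 2) - 1;
    }

    public static function main() {
        var s1 = "DWAYNE";
        var s2 = "DUANE";
        var d = matchDistance(s1.length, s2.length);
        Sys.println('match distance: $d'); // prints "match distance: 2"
        // For each character of s1, show the half-open index range of s2
        // in which an equal character would count as a match.
        for (i in 0...s1.length) {
            var lo = i - d < 0 ? 0 : i - d;
            var hi = i + d + 1 > s2.length ? s2.length : i + d + 1;
            Sys.println('${s1.charAt(i)} at index $i may match s2[$lo...$hi)');
        }
    }
}
```

Truncating before subtracting 1 matters: with a longest length of 5 the window is 1, not 1.5.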

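Given the strings $s_1$ = DWAYNE and $s_2$ = DUANE we find:

$m = 4$ matching characters (D, A, N, E)
$|s_1| = 6$
$|s_2| = 5$
$t = 0$ transpositions

We find a Jaro score of:

$$d_j = \frac{1}{3}\left(\frac{4}{6} + \frac{4}{5} + \frac{4 - 0}{4}\right) \approx 0.822$$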

Implement the Jaro algorithm and show the similarity scores for each of the following pairs:

("MARTHA", "MARHTA")
("DIXON", "DICKSONX")
("JELLYFISH", "SMELLYFISH")

Let's start with the solution:

Step-by-step solution: Jaro similarity in the Haxe programming language

Source code in the Haxe programming language

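class Jaro {
    // Returns the Jaro similarity of s1 and s2, a value between 0 and 1.
    private static function jaro(s1: String, s2: String): Float {
        var s1_len = s1.length;
        var s2_len = s2.length;
        // Two empty strings are treated as an exact match.
        if (s1_len == 0 && s2_len == 0) return 1;

        // Equal characters match only if their positions differ by at most
        // floor(max(|s1|, |s2|) / 2) - 1. Truncate before subtracting 1:
        // Int / Int yields a Float in Haxe, and a fractional distance would
        // widen the window for odd lengths.
        var match_distance = Std.int(Math.max(s1_len, s2_len) / 2) - 1;
        // Clamp so that single-character strings can still match each other.
        if (match_distance < 0) match_distance = 0;
        // Flags marking which characters of each string have been matched.
        var matches = { s1: [for (n in 0...s1_len) false], s2: [for (n in 0...s2_len) false] };
        var m = 0; // number of matching characters
        for (i in 0...s1_len) {
            var start = Std.int(Math.max(0, i - match_distance));
            var end = Std.int(Math.min(i + match_distance + 1, s2_len));
            for (j in start...end)
                if (!matches.s2[j] && s1.charAt(i) == s2.charAt(j)) {
                    matches.s1[i] = true;
                    matches.s2[j] = true;
                    m++;
                    break;
                }
        }
        if (m == 0) return 0;

        // Walk the matched characters of both strings in order; each position
        // where they differ counts as half a transposition.
        var k = 0;
        var t = 0.;
        for (i in 0...s1_len)
            if (matches.s1[i]) {
                while (!matches.s2[k]) k++;
                if (s1.charAt(i) != s2.charAt(k++)) t += 0.5;
            }

        return (m / s1_len + m / s2_len + (m - t) / m) / 3.0;
    }

    public static function main() {
        Sys.println(jaro(   "MARTHA",      "MARHTA")); // about 0.944
        Sys.println(jaro(    "DIXON",    "DICKSONX")); // about 0.767
        Sys.println(jaro("JELLYFISH",  "SMELLYFISH")); // about 0.896
    }
}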
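As a hand check of the first call in `main`, here is a sketch of the arithmetic for MARTHA and MARHTA, assuming the greedy left-to-right matching used in the listing. With a match distance of 2, all six characters find a partner, so $m = 6$. Read in order, the matched characters are M A R T H A in the first string and M A R H T A in the second, so two positions differ and $t = 1$. Substituting into the return expression of `jaro`:

```latex
d_j = \frac{1}{3}\left(\frac{6}{6} + \frac{6}{6} + \frac{6 - 1}{6}\right)
    = \frac{1}{3} \cdot \frac{17}{6}
    = \frac{17}{18} \approx 0.944
```

The other two pairs have no transpositions ($t = 0$), with $m = 4$ for DIXON and DICKSONX and $m = 8$ for JELLYFISH and SMELLYFISH.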


  
