How to resolve the algorithm Jaro similarity step by step in the CLU programming language

Published on 12 May 2024 09:40 PM


Problem Statement

The Jaro distance is a measure of edit distance between two strings; its complement (1 minus the distance), called the Jaro similarity, is a measure of two strings' similarity: the higher the value, the more similar the strings are. The score is normalized such that 0 equates to no similarities and 1 is an exact match.

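The Jaro similarity $d_j$ of two given strings $s_1$ and $s_2$ is

$$d_j = \begin{cases} 0 & \text{if } m = 0 \\ \frac{1}{3}\left(\frac{m}{|s_1|} + \frac{m}{|s_2|} + \frac{m-t}{m}\right) & \text{otherwise} \end{cases}$$

Where:

$m$ is the number of matching characters;
$t$ is the number of transpositions (half the number of matching characters that appear in a different order in the two strings).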

Two characters from $s_1$ and $s_2$ respectively are considered matching only if they are the same and not farther apart than $\left\lfloor \frac{\max(|s_1|,|s_2|)}{2} \right\rfloor - 1$ characters. Each character of $s_1$ is compared with all its matching characters in $s_2$. Each difference in position is half a transposition; that is, the number of transpositions is half the number of characters which are common to the two strings but occupy different positions in each one.
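The matching rule above can be sketched in a few lines. The snippet below is a minimal illustration written in Python rather than CLU, only so that it can be run without a CLU compiler; the helper names `match_window` and `matched_flags` are hypothetical and are not part of the task or of the CLU solution.

```python
def match_window(s1: str, s2: str) -> int:
    """Largest allowed index difference between two matching characters."""
    return max(len(s1), len(s2)) // 2 - 1


def matched_flags(s1: str, s2: str):
    """Greedily pair equal characters that lie within the match window.

    Returns two lists of booleans marking which positions of s1 and s2
    take part in a match.
    """
    dist = match_window(s1, s2)
    m1 = [False] * len(s1)
    m2 = [False] * len(s2)
    for i, ch in enumerate(s1):
        lo = max(0, i - dist)
        hi = min(i + dist, len(s2) - 1)
        for k in range(lo, hi + 1):
            # Skip characters of s2 that are already paired.
            if not m2[k] and s2[k] == ch:
                m1[i] = m2[k] = True
                break
    return m1, m2


if __name__ == "__main__":
    print(match_window("DWAYNE", "DUANE"))      # window size
    flags1, _ = matched_flags("DWAYNE", "DUANE")
    print(sum(flags1))                          # number of matching characters
```

For DWAYNE and DUANE the window is 2 and four characters (D, A, N, E) are matched.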

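Given the strings $s_1$ = DWAYNE and $s_2$ = DUANE we find:

$m = 4$ matching characters (D, A, N, E)
$|s_1| = 6$
$|s_2| = 5$
$t = 0$ transpositions

We find a Jaro score of:

$$d_j = \frac{1}{3}\left(\frac{4}{6} + \frac{4}{5} + \frac{4-0}{4}\right) \approx 0.822$$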
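The arithmetic for this example can be checked with exact fractions. This is a small Python sketch (not CLU), assuming 4 matching characters and 0 transpositions for DWAYNE and DUANE; the function name `jaro_from_counts` is hypothetical.

```python
from fractions import Fraction


def jaro_from_counts(m: int, t: int, len1: int, len2: int) -> Fraction:
    """Jaro similarity from the match count m, the (already halved)
    transposition count t, and the two string lengths."""
    if m == 0:
        return Fraction(0)
    return (Fraction(m, len1) + Fraction(m, len2) + Fraction(m - t, m)) / 3


# DWAYNE (length 6) vs DUANE (length 5): 4 matches, 0 transpositions.
score = jaro_from_counts(m=4, t=0, len1=6, len2=5)
print(score)                    # exact value as a fraction
print(round(float(score), 3))   # rounded decimal value
```

The exact result is 37/45, which rounds to 0.822.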

Implement the Jaro algorithm and show the similarity scores for each of the following pairs:

("MARTHA", "MARHTA")
("DIXON", "DICKSONX")
("JELLYFISH", "SMELLYFISH")

Let's start with the solution:

Step-by-step solution: Jaro similarity in the CLU programming language

Source code in the CLU programming language

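% Generic maximum of two values of any type that supports "<".
max = proc [T: type] (a, b: T) returns (T)
      where T has lt: proctype (T,T) returns (bool)
    if a<b then return(b) else return(a) end
end max

% Generic minimum of two values of any type that supports "<".
min = proc [T: type] (a, b: T) returns (T)
      where T has lt: proctype (T,T) returns (bool)
    if a<b then return(a) else return(b) end
end min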

jaro = proc (s1, s2: string) returns (real)
    s1_len: int := string$size(s1)
    s2_len: int := string$size(s2)
    
    if s1_len = 0 & s2_len = 0 then return(1.0)
    elseif s1_len = 0 | s2_len = 0 then return(0.0)
    end

    % Matching window: equal characters match only if they are at most
    % dist positions apart.
    dist: int := max[int](s1_len, s2_len)/2 - 1
    s1_match: array[bool] := array[bool]$fill(1,s1_len,false)
    s2_match: array[bool] := array[bool]$fill(1,s2_len,false)

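    matches: real := 0.0
    transpositions: real := 0.0
    % Pass 1: pair each character of s1 with the first unused equal
    % character of s2 inside the window.
    for i: int in int$from_to(1, s1_len) do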
        start: int := max[int](1, i-dist)
        end_: int := min[int](i+dist, s2_len)
        
        for k: int in int$from_to(start, end_) do
            if s2_match[k] then continue end
            if s1[i] ~= s2[k] then continue end
            s1_match[i] := true
            s2_match[k] := true
            matches := matches + 1.0
            break
        end
    end

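    if matches=0.0 then return(0.0) end
    % Pass 2: walk the matched characters of both strings in order and
    % count the positions where they differ.
    k: int := 1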
    for i: int in int$from_to(1, s1_len) do
        if ~s1_match[i] then continue end
        while ~s2_match[k] do k := k + 1 end
        if s1[i] ~= s2[k] then
            transpositions := transpositions + 1.0
        end
        k := k+1
    end

    % Each out-of-order matched character counts as half a transposition.
    transpositions := transpositions / 2.0
    return( ((matches / real$i2r(s1_len)) + 
             (matches / real$i2r(s2_len)) +
             ((matches - transpositions) / matches)) / 3.0)
end jaro

start_up = proc ()
    po: stream := stream$primary_output()
    stream$putl(po, f_form(jaro("MARTHA", "MARHTA"), 1, 6))
    stream$putl(po, f_form(jaro("DIXON", "DICKSONX"), 1, 6))
    stream$putl(po, f_form(jaro("JELLYFISH", "SMELLYFISH"), 1, 6))
end start_up
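As a cross-check for the CLU program above, here is a minimal Python port of the same two-pass procedure. It is a sketch for comparison only, not part of the CLU solution; it follows the CLU listing line by line, except that Python indices start at 0 instead of 1.

```python
def jaro(s1: str, s2: str) -> float:
    len1, len2 = len(s1), len(s2)
    if len1 == 0 and len2 == 0:
        return 1.0
    if len1 == 0 or len2 == 0:
        return 0.0

    # Matching window, as in the CLU version.
    dist = max(len1, len2) // 2 - 1
    m1 = [False] * len1
    m2 = [False] * len2

    # Pass 1: find matching characters inside the window.
    matches = 0
    for i in range(len1):
        for k in range(max(0, i - dist), min(i + dist, len2 - 1) + 1):
            if not m2[k] and s1[i] == s2[k]:
                m1[i] = m2[k] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Pass 2: count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if not m1[i]:
            continue
        while not m2[k]:
            k += 1
        if s1[i] != s2[k]:
            out_of_order += 1
        k += 1
    t = out_of_order / 2  # each mismatch is half a transposition

    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


pairs = [("MARTHA", "MARHTA"), ("DIXON", "DICKSONX"), ("JELLYFISH", "SMELLYFISH")]
for a, b in pairs:
    print(f"{jaro(a, b):.6f}")
```

The port prints 0.944444, 0.766667 and 0.896296 for the three pairs; the CLU program should report the same values, up to output formatting.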

  
