How to implement the Jaro similarity algorithm step by step in the Turbo-Basic XL programming language

Published on 12 May 2024 09:40 PM


Problem Statement

The Jaro distance is a measure of edit distance between two strings; its complement (one minus the distance), called the Jaro similarity, is a measure of two strings' similarity: the higher the value, the more similar the strings are. The score is normalized such that 0 equates to no similarities and 1 is an exact match.

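The Jaro similarity $d_j$ of two given strings $s_1$ and $s_2$ is

$$d_j = \begin{cases} 0 & \text{if } m = 0 \\ \dfrac{1}{3}\left(\dfrac{m}{|s_1|} + \dfrac{m}{|s_2|} + \dfrac{m - t}{m}\right) & \text{otherwise} \end{cases}$$

Where:

$m$ is the number of matching characters;
$t$ is the number of transpositions.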

Two characters from $s_1$ and $s_2$ respectively are considered matching only if they are the same and not farther apart than $\left\lfloor \frac{\max(|s_1|, |s_2|)}{2} \right\rfloor - 1$ characters.

Each character of $s_1$ is compared with all its matching characters in $s_2$. Each difference in position is half a transposition; that is, the number of transpositions is half the number of characters which are common to the two strings but occupy different positions in each one.
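The definition above can be sketched in a few lines. The block below is a minimal reference sketch in Python rather than Turbo-Basic XL, chosen only so that the numbers can be run and cross-checked easily; the function name `jaro_similarity` and the treatment of two empty strings are assumptions, not part of the task statement.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Jaro similarity per the textbook definition (0.0 to 1.0)."""
    if not s1 and not s2:
        return 1.0  # assumption: two empty strings count as identical

    # Matching window: floor(max(|s1|, |s2|) / 2) - 1, never negative.
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    used1 = [False] * len(s1)
    used2 = [False] * len(s2)

    m = 0  # number of matching characters
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for k in range(lo, hi):
            if not used2[k] and s2[k] == ch:
                used1[i] = used2[k] = True
                m += 1
                break

    if m == 0:
        return 0.0

    # Read the matched characters in order from each string; every
    # position where they disagree counts as half a transposition.
    seq1 = [c for c, u in zip(s1, used1) if u]
    seq2 = [c for c, u in zip(s2, used2) if u]
    t = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (m / len(s1) + m / len(s2) + (m - t) / m) / 3


if __name__ == "__main__":
    for a, b in [("MARTHA", "MARHTA"), ("DIXON", "DICKSONX"),
                 ("JELLYFISH", "SMELLYFISH")]:
        print(a, b, round(jaro_similarity(a, b), 4))
```

Each character of the second string is marked as used once it has been matched, so a repeated letter cannot be counted twice.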

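Given the strings $s_1$ = DWAYNE and $s_2$ = DUANE we find:

$m = 4$ (the matching characters are D, A, N and E)
$|s_1| = 6$
$|s_2| = 5$
$t = 0$ (the matching characters appear in the same order in both strings)

We find a Jaro score of:

$$d_j = \frac{1}{3}\left(\frac{4}{6} + \frac{4}{5} + \frac{4 - 0}{4}\right) \approx 0.822$$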

Implement the Jaro algorithm and show the similarity scores for each of the following pairs:

("MARTHA", "MARHTA")
("DIXON", "DICKSONX")
("JELLYFISH", "SMELLYFISH")

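Step-by-step solution

Source code in the Turbo-Basic XL programming language

The procedure _JWD_ below prints the Jaro score for each pair, then applies the Winkler common-prefix adjustment and prints a Jaro-Winkler score as well. The program labels both values "Distance", although they are similarity scores in which 1 is an exact match.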

10 DIM Word_1$(20), Word_2$(20), Z$(20)
11 CLS
20 Word_1$="MARTHA" : Word_2$="MARHTA" : ? Word_1$;" - ";Word_2$ : EXEC _JWD_ : ?
30 Word_1$="DIXON" : Word_2$="DICKSONX" : ? Word_1$;" - ";Word_2$ : EXEC _JWD_ : ?
40 Word_1$="JELLYFISH" : Word_2$="SMELLYFISH" : ? Word_1$;" - ";Word_2$ : EXEC _JWD_ : ?

11000 END
11990 REM JaroWinklerDistance INPUT(Word_1$, Word_2$) USE(Z$, I, J, K, L, M, N, S1, S2, Min, Max) RETURN(FLOAT Result)
12000 PROC _JWD_
12010   Result=0 : S1=LEN(Word_1$) : S2=LEN(Word_2$)
12020   IF S1>S2 THEN Z$=Word_1$ : Word_1$=Word_2$ : Word_2$=Z$ : M=S1 : S1=S2 : S2=M
12030   J=1: M=0 : N=0 : L=INT(S2/2) : Z$=Word_2$
12040   FOR I=1 TO S1
12050     IF Word_1$(I,I)=Word_2$(J,J) THEN M=M+1: Word_2$(J,J)=" ": GO# JMP_JWD
12060     Max=1 : IF Max<(I-L) THEN Max=I-L
12070     Min=S2 : IF Min>(I+L-1) THEN Min=I+L-1
12080     FOR K=Max TO Min
12090       IF Word_1$(I,I)=Word_2$(K,K) THEN N=N+1: M=M+1: Word_2$(K,K)=" ": IF K>J THEN J=K
12100     NEXT K
12110     #JMP_JWD : IF J<S2 THEN J=J+1
12120   NEXT I
12130   IF M=0
12140     Result=0 : REM jaro distance
12150   ELSE 
12160     N=INT(N/2)
12170     Result=(M/S1+M/S2+((M-N)/M))/3. : REM jaro distance
12180   ENDIF
12190   ? "Jaro Distance=";Result
12200   Min=S1 : IF Min>S2 THEN Min=S2
12210   M=Min : IF M>3 THEN M=3
12220   M=M+1 : L=0 : Word_2$=Z$ : IF M>Min THEN M=Min
12230   FOR I=1 TO M
12240     IF Word_1$(I,I)=Word_2$(I,I)
12250       L=L+1
12260     ELSE
12270       EXIT
12280     ENDIF
12290   NEXT I
12300   Result=Result + (L*0.1*(1.0 - Result)) : REM Winkler
12310   ? "Jaro Winkler Distance=";Result
12320 ENDPROC
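The last part of the procedure (lines 12200 to 12300) applies the Winkler adjustment: it counts how many leading characters the two words share, up to four, and moves the Jaro score toward 1 by 0.1 per shared character. As a rough cross-check, the same step can be sketched in Python; the function name `winkler_adjust` and its parameter names are hypothetical, and the defaults simply mirror the constants used in the listing.

```python
def winkler_adjust(jaro: float, s1: str, s2: str,
                   scale: float = 0.1, max_prefix: int = 4) -> float:
    """Boost a Jaro score by the length of the common prefix.

    scale and max_prefix mirror the 0.1 factor and the four-character
    cap used in the listing; they are illustrative defaults.
    """
    prefix = 0
    # zip stops at the shorter string, so no explicit length check is needed.
    for a, b in zip(s1[:max_prefix], s2[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return jaro + prefix * scale * (1.0 - jaro)


if __name__ == "__main__":
    # 17/18 is the Jaro score of MARTHA / MARHTA; the shared prefix is "MAR".
    print(round(winkler_adjust(17 / 18, "MARTHA", "MARHTA"), 4))
```

Because the boost is proportional to `1 - jaro`, an exact match stays at 1 and a score of 0 is only raised when the words share a prefix.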

  
