How to resolve the algorithm Jaro similarity step by step in the VBA programming language

Published on 12 May 2024 09:40 PM

Problem Statement

The Jaro distance is a measure of edit distance between two strings; its inverse, called the Jaro similarity, measures how similar two strings are: the higher the value, the more similar the strings. The score is normalized so that 0 means no similarity and 1 means an exact match.

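The Jaro similarity $d_j$ of two given strings $s_1$ and $s_2$ is

$$d_j = \begin{cases} 0 & \text{if } m = 0 \\ \dfrac{1}{3}\left(\dfrac{m}{|s_1|} + \dfrac{m}{|s_2|} + \dfrac{m - t}{m}\right) & \text{otherwise} \end{cases}$$

Where:

$m$ is the number of matching characters;
$t$ is the number of transpositions.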

Two characters from $s_1$ and $s_2$ respectively, are considered matching only if they are the same and not farther apart than $\left\lfloor \frac{\max(|s_1|, |s_2|)}{2} \right\rfloor - 1$ characters. Each character of $s_1$ is compared with all its matching characters in $s_2$. Each difference in position is half a transposition; that is, the number of transpositions is half the number of characters which are common to the two strings but occupy different positions in each one.
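The matching window described above can be sketched as a small helper. This is only an illustration and not part of the original listing: the name JaroMatchWindow is hypothetical, and clamping the result to 0 for very short strings is an assumption.

```vba
' Hypothetical helper: the farthest apart two equal characters may be
' and still count as matching, i.e. floor(max(len1, len2) / 2) - 1.
Function JaroMatchWindow(ByVal len1 As Long, ByVal len2 As Long) As Long
    Dim longest As Long
    If len1 > len2 Then longest = len1 Else longest = len2
    ' The \ operator is integer division, which floors non-negative values.
    JaroMatchWindow = longest \ 2 - 1
    ' Assumption: clamp to 0 so strings of length 0 or 1 never give a negative window.
    If JaroMatchWindow < 0 Then JaroMatchWindow = 0
End Function
```

For DWAYNE (6 characters) and DUANE (5 characters), JaroMatchWindow(6, 5) returns 2, so a character may match a partner at most two positions away.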

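Given the strings $s_1$ = DWAYNE and $s_2$ = DUANE we find:

$m = 4$ (the matching characters D, A, N, E)
$|s_1| = 6$
$|s_2| = 5$
$t = 0$

We find a Jaro score of:

$$d_j = \frac{1}{3}\left(\frac{4}{6} + \frac{4}{5} + \frac{4 - 0}{4}\right) \approx 0.822$$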
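The arithmetic for this example can be checked with a short sketch that turns the counts into a score. The helper name JaroFromCounts is hypothetical, and the counts for DWAYNE and DUANE (4 matches, 0 transpositions) are worked out by hand from the definitions above rather than computed by the code.

```vba
' Hypothetical helper: Jaro similarity from pre-computed counts.
'   m = number of matching characters
'   t = number of transpositions (half the out-of-order matches)
Function JaroFromCounts(ByVal m As Long, ByVal t As Double, _
                        ByVal len1 As Long, ByVal len2 As Long) As Double
    ' No matching characters means no similarity (and avoids dividing by zero).
    If m = 0 Then
        JaroFromCounts = 0
        Exit Function
    End If
    JaroFromCounts = (m / len1 + m / len2 + (m - t) / m) / 3
End Function

Sub DemoDwayneDuane()
    ' DWAYNE vs DUANE: D, A, N, E match in the same order, so m = 4 and t = 0.
    Debug.Print Format(JaroFromCounts(4, 0, 6, 5), "0.000")
End Sub
```

Running DemoDwayneDuane should print 0.822 in the Immediate window on a system that uses a period as the decimal separator.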

Implement the Jaro algorithm and show the similarity scores for each of the following pairs:

Let's start with the solution:

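Step-by-step solution for Jaro similarity in VBA

Source code in the VBA programming language

The function below computes the Jaro-Winkler similarity, which adds a common-prefix bonus scaled by p to the Jaro score. Calling it with p = 0 returns the plain Jaro similarity.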

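Option Explicit

' Returns the Jaro-Winkler similarity of text1 and text2 (matching ignores case).
' Pass p = 0 to get the plain Jaro similarity. Requires Excel (WorksheetFunction).
' Arguments are ByVal because text2 is overwritten while matching.
Function JaroWinkler(ByVal text1 As String, ByVal text2 As String, Optional ByVal p As Double = 0.1) As Double
' Give every variable its own type; "Dim a, b As String" would leave a as Variant.
Dim dummyChar As String, match1 As String, match2 As String
Dim i As Long, f As Long, t As Long, j As Long, m As Long, l As Long
Dim s1 As Long, s2 As Long, limit As Long

' Find a placeholder character that occurs in neither string.
i = 1
Do
    dummyChar = Chr(i)
    i = i + 1
Loop Until InStr(1, text1 & text2, dummyChar, vbTextCompare) = 0

s1 = Len(text1)
s2 = Len(text2)
' Matching window: floor(max(s1, s2) / 2) - 1, never negative.
limit = WorksheetFunction.Max(0, Int(WorksheetFunction.Max(s1, s2) / 2) - 1)
match1 = String(s1, dummyChar)
match2 = String(s2, dummyChar)

' l = length of the common prefix, capped at 4 (used by the Winkler bonus).
For l = 1 To WorksheetFunction.Min(4, s1, s2)
    If Mid(text1, l, 1) <> Mid(text2, l, 1) Then Exit For
Next l
l = l - 1

' Look up each character of text1 inside its window [f, t] of text2.
For i = 1 To s1
    f = WorksheetFunction.Max(i - limit, 1)
    t = WorksheetFunction.Min(i + limit, s2)
    j = 0
    ' Skip positions whose window lies entirely beyond the end of text2.
    If f <= t Then j = InStr(1, Mid(text2, f, t - f + 1), Mid(text1, i, 1), vbTextCompare)
    If j > 0 Then
        m = m + 1
        ' Blank out the matched character so it cannot be matched twice.
        text2 = Mid(text2, 1, f + j - 2) & dummyChar & Mid(text2, f + j)
        match1 = Mid(match1, 1, i - 1) & Mid(text1, i, 1) & Mid(match1, i + 1)
        match2 = Mid(match2, 1, f + j - 2) & Mid(text1, i, 1) & Mid(match2, f + j)
    End If
Next i

' No matches (including an empty input) means similarity 0; this also avoids division by zero.
If m = 0 Then
    JaroWinkler = 0
    Exit Function
End If

' Keep only the matched characters, in the order they appear in each string.
match1 = Replace(match1, dummyChar, "", 1, -1, vbTextCompare)
match2 = Replace(match2, dummyChar, "", 1, -1, vbTextCompare)
' t counts matched characters that are out of order; t / 2 is the number of transpositions.
t = 0
For i = 1 To m
    If Mid(match1, i, 1) <> Mid(match2, i, 1) Then t = t + 1
Next i

' Jaro similarity, then the Winkler prefix bonus (scaling factor capped at 0.25).
JaroWinkler = (m / s1 + m / s2 + (m - t / 2) / m) / 3
JaroWinkler = JaroWinkler + (1 - JaroWinkler) * l * WorksheetFunction.Min(0.25, p)
End Function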
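
A possible way to call the function on the DWAYNE and DUANE pair from the problem statement is sketched here. The Sub name DemoJaroWinkler is hypothetical, and the sketch assumes it is placed in the same Excel VBA module as the listing above.

```vba
Sub DemoJaroWinkler()
    ' p = 0 switches off the prefix bonus, leaving the plain Jaro similarity.
    Debug.Print Format(JaroWinkler("DWAYNE", "DUANE", 0), "0.000")
    ' The default p = 0.1 gives the Jaro-Winkler score instead.
    Debug.Print Format(JaroWinkler("DWAYNE", "DUANE"), "0.000")
End Sub
```

On a system that uses a period as the decimal separator, this should print 0.822 followed by 0.840. The second value is higher because the two strings share a one-character prefix, D.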

  
