How to resolve the algorithm De Bruijn sequences step by step in the Python programming language

Published on 12 May 2024 09:40 PM

How to resolve the algorithm De Bruijn sequences step by step in the Python programming language

Table of Contents

Problem Statement

The sequences are named after the Dutch mathematician   Nicolaas Govert de Bruijn.

A note on Dutch capitalization:   Nicolaas' last name is   de Bruijn,   the   de   isn't normally capitalized unless it's the first word in a sentence.   Rosetta Code (more or less by default or by fiat) requires the first word in the task name to be capitalized.

In combinatorial mathematics,   a   de Bruijn sequence   of order   n   on a   size-k   alphabet (computer science)   A   is a cyclic sequence in which every possible   length-n   string (computer science, formal theory)   on   A   occurs exactly once as a contiguous substring. Such a sequence is denoted by   B(k, n)   and has length   kn,   which is also the number of distinct substrings of length   n   on   A;     de Bruijn sequences are therefore optimally short. There are: distinct de Bruijn sequences   B(k, n).

For this Rosetta Code task,   a   de Bruijn   sequence is to be generated that can be used to shorten a brute-force attack on a   PIN-like   code lock that does not have an "enter" key and accepts the last   n   digits entered.

Note:   automated teller machines (ATMs)   used to work like this,   but their software has been updated to not allow a brute-force attack.

A   digital door lock   with a 4-digit code would have B (10, 4) solutions,   with a length of   10,000   (digits). Therefore, only at most     10,000 + 3     (as the solutions are cyclic or wrap-around)   presses are needed to open the lock. Trying all 4-digit codes separately would require   4 × 10,000   or   40,000   presses.

(The last requirement is to ensure that the verification tests performs correctly.   The verification processes should list any and all missing PIN codes.) Show all output here, on this page.

Let's start with the solution:

Step by Step solution about How to resolve the algorithm De Bruijn sequences step by step in the Python programming language

The provided Python code implements a De Bruijn sequence algorithm, which generates a sequence of symbols from a given alphabet in such a way that every possible subsequence of length n appears exactly once. De Bruijn is used heavily in Computer Science and especially in coding theory and cryptography.

  1. De Bruijn Sequence Function:

    • This function, de_bruijn(k, n), generates a De Bruijn sequence for an alphabet of size k and subsequences of length n.
    • It first checks whether k can be cast to an integer and, if so, creates an alphabet list mapping 0 to k-1. Otherwise, it assumes k is already an alphabet.
    • It initializes an array a with k * n zeros and an empty list sequence.
    • The recursive function db is defined to create the sequence. It takes parameters t (current position) and p (previous position) and follows a recursive algorithm to build the sequence.
    • If t is greater than n, it checks if n is divisible by p, indicating that a valid subsequence has been created and appends it to the sequence.
    • Otherwise, it continues building the sequence by setting a[t] to the value at a[t-p] and recursively calling db with updated parameters.
    • The db function systematically explores all possible subsequences and stores them in the sequence.
    • Finally, the function returns the De Bruijn sequence by concatenating the characters from the alphabet list using the indices stored in the sequence.
  2. Validation Function:

    • The validate(db) function is a utility function used to check if the generated De Bruijn sequence db contains all possible combinations of digits (0-9) of length n.
    • It creates a wrapped version of db by concatenating the first three characters of db at the end (dbwithwrap) to ensure it contains all possible combinations.
    • It iterates over all possible 4-digit combinations of 0-9, creating teststrings. If any teststring is not found in dbwithwrap, it is considered an error, and its error message is appended to the errorstrings list.
    • Finally, it prints the number of errors found or a message indicating that there are no errors.
  3. Usage of the Code:

    • The code generates a De Bruijn sequence for an alphabet of digits (0-9) and a subsequence length of 4.
    • It prints the length of the generated sequence, the first and last 130 digits of the sequence, and validates the sequence.
    • Additionally, it validates the reversed and overlaid versions of the De Bruijn sequence.

Source code in the python programming language

# from https://en.wikipedia.org/wiki/De_Bruijn_sequence

def de_bruijn(k, n):
    """
    de Bruijn sequence for alphabet k
    and subsequences of length n.
    """
    try:
        # let's see if k can be cast to an integer;
        # if so, make our alphabet a list
        _ = int(k)
        alphabet = list(map(str, range(k)))

    except (ValueError, TypeError):
        alphabet = k
        k = len(k)

    a = [0] * k * n
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    return "".join(alphabet[i] for i in sequence)
    
def validate(db):
    """
    
    Check that all 10,000 combinations of 0-9 are present in 
    De Bruijn string db.
    
    Validating the reversed deBruijn sequence:
      No errors found
    
    Validating the overlaid deBruijn sequence:
      4 errors found:
        PIN number 1459 missing
        PIN number 4591 missing
        PIN number 5814 missing
        PIN number 8145 missing
    
    """
    
    dbwithwrap = db+db[0:3]
    
    digits = '0123456789'
    
    errorstrings = []
    
    for d1 in digits:
        for d2 in digits:
            for d3 in digits:
                for d4 in digits:
                    teststring = d1+d2+d3+d4
                    if teststring not in dbwithwrap:
                        errorstrings.append(teststring)
                        
    if len(errorstrings) > 0:
        print("  "+str(len(errorstrings))+" errors found:")
        for e in errorstrings:
            print("  PIN number "+e+"  missing")
    else:
        print("  No errors found")

db = de_bruijn(10, 4)

print(" ")
print("The length of the de Bruijn sequence is ", str(len(db)))
print(" ")
print("The first 130 digits of the de Bruijn sequence are: "+db[0:130])
print(" ")
print("The last 130 digits of the de Bruijn sequence are: "+db[-130:])
print(" ")
print("Validating the deBruijn sequence:")
validate(db)
dbreversed = db[::-1]
print(" ")
print("Validating the reversed deBruijn sequence:")
validate(dbreversed)
dboverlaid = db[0:4443]+'.'+db[4444:]
print(" ")
print("Validating the overlaid deBruijn sequence:")
validate(dboverlaid)


  

You may also check:How to resolve the algorithm Repeat a string step by step in the Pure programming language
You may also check:How to resolve the algorithm Boyer-Moore string search step by step in the FreeBASIC programming language
You may also check:How to resolve the algorithm Align columns step by step in the Ursala programming language
You may also check:How to resolve the algorithm Longest increasing subsequence step by step in the Haskell programming language
You may also check:How to resolve the algorithm Program name step by step in the Python programming language