How to resolve the algorithm Burrows–Wheeler transform step by step in the jq programming language

Published on 12 May 2024 09:40 PM
#Jq

How to resolve the algorithm Burrows–Wheeler transform step by step in the jq programming language

Table of Contents

Problem Statement

The Burrows–Wheeler transform (BWT, also called block-sorting compression) rearranges a character string into runs of similar characters. This is useful for compression, since it tends to be easy to compress a string that has runs of repeated characters by techniques such as move-to-front transform and run-length encoding. More importantly, the transformation is reversible, without needing to store any additional data. The BWT is thus a "free" method of improving the efficiency of text compression algorithms, costing only some extra computation.

Source: Burrows–Wheeler transform

Let's start with the solution:

Step by Step solution about How to resolve the algorithm Burrows–Wheeler transform step by step in the jq programming language

Source code in the jq programming language

# substitute ^ for STX and | for ETX
def makePrintable:
  if . == null then null
  else sub("\u0002"; "^") | sub("\u0003"; "|")
  end;
 
def bwt:
  {stx: "\u0002", etx: "\u0003"} as $x
  | if index($x.stx) >= 0 or index($x.etx) >= 0 then null
    else $x.stx + . + $x.etx
    | . as $s
    | (reduce range(0; length) as $i ([];
         .[$i] = $s[$i:] + $s[:$i]) | sort) as $table
    | reduce range(0; length) as $i ("";
         . + $table[$i][-1:])
    end;

def ibwt:
  . as $r
  | length as $len
  | reduce range(0;$len) as $i ([]; 
        reduce range(0; $len) as $j (.;
	.[$j] = $r[$j:$j+1] + .[$j]) | sort)
  | first( .[] | select(endswith("\u0003")))
  | .[1:-1] ;

def tests:
  (
    "banana",
    "appellee",
    "dogwood",
    "TO BE OR NOT TO BE OR WANT TO BE OR NOT?",
    "SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES",
    "\u0002ABC\u0003"
  )
  | . as $test
  | bwt as $t
  | "\(makePrintable)\n --> \($t | makePrintable
      // "ERROR: String can't contain STX or ETX" )",
    (if $t then " --> \($t | ibwt)\n" else "" end) ;

tests

  

You may also check:How to resolve the algorithm Guess the number step by step in the ABAP programming language
You may also check:How to resolve the algorithm Empty program step by step in the Toka programming language
You may also check:How to resolve the algorithm Random number generator (included) step by step in the PL/SQL programming language
You may also check:How to resolve the algorithm Abbreviations, automatic step by step in the Perl programming language
You may also check:How to resolve the algorithm FizzBuzz step by step in the GFA Basic programming language