How to resolve the algorithm Huffman coding step by step in the Mathematica / Wolfram Language programming language
How to resolve the algorithm Huffman coding step by step in the Mathematica / Wolfram Language programming language
Table of Contents
Problem Statement
Huffman encoding is a way to assign binary codes to symbols that reduces the overall number of bits used to encode a typical string of those symbols. For example, if you use letters as symbols and have details of the frequency of occurrence of those letters in typical strings, then you could just encode each letter with a fixed number of bits, such as in ASCII codes. You can do better than this by encoding more frequently occurring letters such as e and a, with smaller bit strings; and less frequently occurring letters such as q and x with longer bit strings. Any string of letters will be encoded as a string of bits that are no-longer of the same length per letter. To successfully decode such as string, the smaller codes assigned to letters such as 'e' cannot occur as a prefix in the larger codes such as that for 'x'. The Huffman coding scheme takes each symbol and its weight (or frequency of occurrence), and generates proper encodings for each symbol taking account of the weights of each symbol, so that higher weighted symbols have fewer bits in their encoding. (See the WP article for more information). A Huffman encoding can be computed by first creating a tree of nodes:
Traverse the constructed binary tree from root to leaves assigning and accumulating a '0' for one branch and a '1' for the other at each node. The accumulated zeros and ones at each leaf constitute a Huffman encoding for those symbols and weights:
Using the characters and their frequency from the string: create a program to generate a Huffman encoding for each character as a table.
Let's start with the solution:
Step by Step solution about How to resolve the algorithm Huffman coding step by step in the Mathematica / Wolfram Language programming language
This Wolfram code implements Huffman encoding, a lossless data compression algorithm based on the idea of assigning shorter bit codes to more frequent characters or symbols. Here's how the code works:
1. Input:
- The code takes a string
s
or a listl
as input. The strings
is converted to a list of characters usingCharacters[s]
.
2. Huffman Tree Generation:
-
The code initializes some variables:
merge
: A function to merge the two branches of the Huffman tree based on their frequencies.structure
: Represents the Huffman tree structure.rules
: Maps characters to their corresponding bit codes.
-
It uses
FixedPoint
to iteratively merge the branches of the Huffman tree until a single branch remains. The branches are merged in the order of increasing frequency. -
structure
now contains the final Huffman tree as a nested list.
3. Bit Code Generation:
-
The
rules
are generated by extracting unique characters from the input, and then usingPosition
to find their positions in thestructure
. -
The positions are converted to binary bit codes by subtracting 1 from each position.
4. Output:
- The output is a tuple:
- The first element is the compressed bitstream, which is the list of characters in the input with their corresponding Huffman codes.
- The second element is the
rules
which maps characters to their bit codes.
Source code in the wolfram programming language
huffman[s_String] := huffman[Characters[s]];
huffman[l_List] := Module[{merge, structure, rules},
(*merge front two branches. list is assumed to be sorted*)
merge[k_] := Replace[k, {{a_, aC_}, {b_, bC_}, rest___} :> {{{a, b}, aC + bC}, rest}];
structure = FixedPoint[
Composition[merge, SortBy[#, Last] &],
Tally[l]][[1, 1]];
rules = (# -> Flatten[Position[structure, #] - 1]) & /@ DeleteDuplicates[l];
{Flatten[l /. rules], rules}];
You may also check:How to resolve the algorithm FizzBuzz step by step in the Lobster programming language
You may also check:How to resolve the algorithm Factorions step by step in the AWK programming language
You may also check:How to resolve the algorithm Break OO privacy step by step in the Forth programming language
You may also check:How to resolve the algorithm Parametric polymorphism step by step in the Fortran programming language
You may also check:How to resolve the algorithm Musical scale step by step in the C++ programming language