How to solve the Word frequency task step by step in the Seed7 programming language

Published on 12 May 2024 09:40 PM

Problem Statement

Given a text file and an integer n, print/display the n most common words in the file (and the number of their occurrences) in decreasing frequency.

For the purposes of this task:

Show example output using Les Misérables from Project Gutenberg as the text file input and display the top 10 most used words.

This task was originally taken from the Programming Pearls column in Communications of the ACM, June 1986, Volume 29, Number 6, where Donald Knuth solves the problem using literate programming and Doug McIlroy then critiques the solution, demonstrating that the problem can be solved with a 6-line Unix shell script.
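Before turning to Seed7, the task itself can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the names top_words and WORD_RE are hypothetical, and a "word" is assumed to be a maximal run of Unicode letters, compared case-insensitively. Other word definitions would give different counts.

```python
import re
from collections import Counter

# Assumption: a "word" is a maximal run of Unicode letters; digits,
# underscores, apostrophes and hyphens all act as separators.
WORD_RE = re.compile(r"[^\W\d_]+")

def top_words(text, n):
    """Return the n most common lowercase words as (word, count) pairs."""
    words = WORD_RE.findall(text.lower())
    return Counter(words).most_common(n)

if __name__ == "__main__":
    sample = "The cat and the hat and the bat"
    for word, count in top_words(sample, 2):
        print(f"{word:<8}{count}")
```

Lowercasing before counting makes "The" and "the" a single entry, and most_common(n) returns the pairs in decreasing order of frequency.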

Let's start with the solution:

Step-by-step solution: Word frequency in the Seed7 programming language

Source code in the Seed7 programming language

$ include "seed7_05.s7i";
  include "gethttp.s7i";
  include "strifile.s7i";
  include "scanfile.s7i";
  include "chartype.s7i";
  include "console.s7i";

const type: wordHash is hash [string] integer;
const type: countHash is hash [integer] array string;

const proc: main is func
  local
    var file: inFile is STD_NULL;
    var string: aWord is "";
    var wordHash: numberOfWords is wordHash.EMPTY_HASH;
    var countHash: countWords is countHash.EMPTY_HASH;
    var array integer: countKeys is 0 times 0;
    var integer: index is 0;
    var integer: number is 0;
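  begin
    # Send standard output to the console file.
    OUT := STD_CONSOLE;
    # Download the text and wrap the resulting string in a file interface.
    inFile := openStrifile(getHttp("www.gutenberg.org/files/135/135-0.txt"));
    # Read symbol by symbol; count only symbols that start with a letter.
    while hasNext(inFile) do
      aWord := lower(getSimpleSymbol(inFile));
      if aWord <> "" and aWord[1] in letter_char then
        if aWord in numberOfWords then
          incr(numberOfWords[aWord]);
        else
          numberOfWords @:= [aWord] 1;
        end if;
      end if;
    end while;
    # Invert word -> count into count -> words, then sort the distinct counts.
    countWords := flip(numberOfWords);
    countKeys := sort(keys(countWords));
    writeln("Word    Frequency");
    # Walk the ten highest counts; words sharing a count are printed in sorted order.
    for index range length(countKeys) downto length(countKeys) - 9 do
      number := countKeys[index];
      for aWord range sort(countWords[number]) do
        writeln(aWord rpad 8 <& number);
      end for;
    end for;
  end func;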
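
The listing above does not sort the words by count directly. It inverts the word-to-count hash with flip, sorts the distinct counts, and walks the ten highest. One consequence is that every word tied on one of those counts is printed, so the output can run to more than ten rows. A rough Python sketch of the same idea follows; the helpers flip and top_by_count are hypothetical names written for this illustration and are not Seed7 library calls.

```python
from collections import defaultdict

def flip(word_counts):
    """Invert {word: count} into {count: [words with that count]}."""
    by_count = defaultdict(list)
    for word, count in word_counts.items():
        by_count[count].append(word)
    return dict(by_count)

def top_by_count(word_counts, k):
    """List (word, count) rows for the k highest distinct counts."""
    by_count = flip(word_counts)
    # Highest counts first; slicing also copes with fewer than k distinct counts.
    highest = sorted(by_count, reverse=True)[:k]
    return [(word, count) for count in highest for word in sorted(by_count[count])]
```

For example, with the counts {"a": 3, "b": 3, "c": 1, "d": 2} and k = 2, the two highest distinct counts are 3 and 2, so three rows come back: a and b (tied at 3) followed by d.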

  
