How to resolve the algorithm Word frequency step by step in the UNIX Shell programming language

Published on 12 May 2024 09:40 PM

Problem Statement

Given a text file and an integer n, print/display the n most common words in the file (and the number of their occurrences) in decreasing frequency.

For the purposes of this task:

Show example output using Les Misérables from Project Gutenberg as the text file input and display the top 10 most used words.

This task was originally taken from the "Programming Pearls" column in Communications of the ACM, June 1986, Volume 29, Number 6, where the problem is solved by Donald Knuth using literate programming and then critiqued by Doug McIlroy, who demonstrates solving it in a six-line Unix shell script (provided as an example below).

Let's start with the solution:

Step-by-step solution: Word frequency in the UNIX Shell programming language

Source code in the UNIX Shell programming language

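#!/bin/sh
# Usage: sh wordfreq.sh FILE N   (the script name is only an example)
# Split FILE into one word per line, fold to lowercase, count, and list the N most frequent words.
<"$1" tr -cs A-Za-z '\n' | tr A-Z a-z | LC_ALL=C sort | uniq -c | sort -rn | head -n "$2"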


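# Same pipeline fed directly from Project Gutenberg; "sed 10q" prints the first 10 lines and quits.
curl "https://www.gutenberg.org/files/135/135-0.txt" | tr -cs A-Za-z '\n' | tr A-Z a-z | sort | uniq -c | sort -rn | sed 10q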
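
The one-line pipelines above are easier to follow with each stage on its own line. The following is a minimal sketch rather than a definitive implementation: the function name word_freq and the default of 10 are assumptions made for illustration, and the stages are the same ones used in the pipelines above.

```shell
#!/bin/sh
# word_freq N: read text on standard input and print the N most common words.
# (word_freq and the default of 10 are illustrative choices, not part of the original.)
word_freq() {
    tr -cs 'A-Za-z' '\n' |   # turn every run of non-letters into one newline: one word per line
    tr 'A-Z' 'a-z' |         # fold uppercase to lowercase so "The" and "the" count together
    LC_ALL=C sort |          # bring identical words onto adjacent lines
    uniq -c |                # collapse each run of identical lines into "count word"
    sort -rn |               # order numerically by count, largest first
    sed "${1:-10}q"          # print the first N lines, then quit
}
```

For example, printf 'the cat and the dog and the bird\n' | word_freq 2 lists "the" with a count of 3 followed by "and" with a count of 2; the padding in front of each count depends on the uniq implementation. Because only A-Z and a-z are treated as letters, accented characters such as the é in Misérables act as word separators, which the task statement allows when it says a solution may choose what counts as a word.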


  
