How to resolve the algorithm File size distribution step by step in the REXX programming language

Problem Statement
Step by Step Solution
Sourcecode

Problem Statement

Beginning from the current directory, or optionally from a directory specified as a command-line argument, determine how many files there are of various sizes in a directory hierarchy.

My suggestion is to sort by logarithmn of file size, since a few bytes here or there, or even a factor of two or three, may not be that significant. Don't forget that empty files may exist, to serve as a marker.

Is your file system predominantly devoted to a large number of smaller files, or a smaller number of huge files?

Let's start with the solution:

Step by Step solution about How to resolve the algorithm File size distribution step by step in the REXX programming language

Source code in the rexx programming language

/*REXX program displays a histogram of filesize distribution of a directory structure(s)*/
numeric digits 30                                /*ensure enough decimal digits for a #.*/
parse arg ds .                                   /*obtain optional argument from the CL.*/
parse source . . path .                          /*   "   the path of this REXX program.*/
fID= substr(path, 1 + lastpos('\', path) )       /*   "   the filename and the filetype.*/
parse var  fID   fn  '.'                         /*   "   just the pure filename of pgm.*/
sw=max(79, linesize() - 1)                       /*   "   terminal width (linesize) - 1.*/
                                work= fn".OUT"   /*filename for workfile output of  DIR.*/
'DIR'   ds   '/s /-c /a-d  >'   work             /*do (DOS) DIR cmd for a data structure*/
call linein 0, 1                                 /*open output file, point to 1st record*/
maxL= 0;    @.= 00;      g= 0                    /*max len size; log array; # good recs.*/
$=0                                              /*$:  total bytes used by files found. */
     do while lines(work)\==0;  _= linein(work)  /*process the data in the DIR work file*/
     if left(_, 1)==' '    then iterate          /*Is the record not legitimate?  Skip. */
     parse upper  var   _    .  .  sz  .         /*uppercase the suffix  (if any).      */
     sz= space( translate(sz, , ','),  0)        /*remove any commas if present in the #*/

     if \datatype(sz,'W')  then do; #= left(sz, length(sz) - 1)       /*SZ has a suffix?*/
                                    if \datatype(#,'N')  then iterate /*Meat ¬ numeric? */
                                    sz= # * 1024 ** pos( right(sz, 1), 'KMGTPEZYXWVU') / 1
                                end                                   /* [↑]  use suffix*/
     $= $ + sz                                   /*keep a running total for the filesize*/
     if sz==0  then L= 0                         /*handle special case for an empty file*/
               else L= length(sz)                /*obtain the length of filesize number.*/
     g= g + 1                                    /*bump the counter of # of good records*/
     maxL= max(L, maxL)                          /*get max length filesize for alignment*/
     @.L= @.L + 1                                /*bump counter of record size category.*/
     end   /*j*/                                 /* [↑]   categories:  split by log ten.*/

if g==0  then do;  say 'file not found: '  ds;  exit 13;    end        /*no good records*/
say  ' record size range    count   '
hdr= '══════════════════ ══════════ ';     say hdr;         Lhdr=length(hdr)
mC=0                                             /*mC:  the maximum count for any range.*/
     do   t=1  to 2                              /*T==1   is used to find the max count.*/
       do k=0  to maxL;  mC= max(mC, @.k);  if t==1  then iterate           /*1st pass? */
                             if k==0  then y= center('zero',  length( word(hdr, 1)  ) )
                                      else y= '10^'left(k-1,2)  "──► 10^"left(k,2)  '-1'
       say y || right( commas(@.k), 11)   copies('─', max(1, (@.k / mC * sw % 1) - LHdr) )
       end   /*k*/
     end     /*y*/
say
trace off;   'ERASE'  work                       /*perform clean─up (erase a work file).*/
say commas(g)      ' files detected, '       commas($)        " total bytes."
exit                                             /*stick a fork in it,  we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
commas: parse arg _;  do j#=length(_)-3  to 1  by -3; _=insert(',', _, j#); end;  return _

You may also check:How to resolve the algorithm Strip whitespace from a string/Top and tail step by step in the Logtalk programming language
You may also check:How to resolve the algorithm Sorting algorithms/Counting sort step by step in the Ruby programming language
You may also check:How to resolve the algorithm Empty program step by step in the Wren programming language
You may also check:How to resolve the algorithm Function definition step by step in the NewLISP programming language
You may also check:How to resolve the algorithm Quickselect algorithm step by step in the Haskell programming language

How to resolve the algorithm File size distribution step by step in the REXX programming language

Table of Contents

Problem Statement

Step by Step solution about How to resolve the algorithm File size distribution step by step in the REXX programming language

Source code in the rexx programming language