How to resolve the algorithm File size distribution step by step in the Go programming language

Published on 12 May 2024 09:40 PM
#Go

Problem Statement

Beginning from the current directory, or optionally from a directory specified as a command-line argument, determine how many files there are of various sizes in a directory hierarchy.

My suggestion is to sort by the logarithm of file size, since a few bytes here or there, or even a factor of two or three, may not be that significant. Don't forget that empty files may exist, to serve as markers.

Is your file system predominantly devoted to a large number of smaller files, or a smaller number of huge files?

Let's start with the solution:

Step-by-step solution: File size distribution in the Go programming language

The Go program below calculates and prints the distribution of file sizes in a directory and its subdirectories. It uses the filepath.Walk function to traverse the directory tree and reads the size of each entry it visits.

Here's a step-by-step explanation of the code:

  1. Importing Necessary Packages:

    import (
       "fmt"
       "log"
       "math"
       "os"
       "path/filepath"
    )

    The program imports five standard-library packages: fmt for formatted output, log for reporting a fatal error, math for the base-10 logarithm, os for the os.FileInfo type, and path/filepath for walking the directory tree.

  2. Commatizing Function (commatize):

    func commatize(n int64) string {
       s := fmt.Sprintf("%d", n)
       if n < 0 {
           s = s[1:]
       }
       le := len(s)
       for i := le - 3; i >= 1; i -= 3 {
           s = s[0:i] + "," + s[i:]
       }
       if n >= 0 {
           return s
       }
       return "-" + s
    }

    The commatize function is a helper that takes an int64 n and returns its decimal representation with a comma between each group of three digits. For a negative number it strips the minus sign first, inserts the commas, and then puts the sign back.
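To make the behaviour concrete, here is a small self-contained sketch that repeats the commatize function from the listing and calls it on a few values. The sample inputs in main are chosen only for illustration.

```go
package main

import "fmt"

// commatize is copied from the listing above: it formats n with a comma
// between each group of three digits.
func commatize(n int64) string {
	s := fmt.Sprintf("%d", n)
	if n < 0 {
		s = s[1:] // drop the sign while inserting commas
	}
	le := len(s)
	// Work from the right, inserting a comma every three digits.
	for i := le - 3; i >= 1; i -= 3 {
		s = s[0:i] + "," + s[i:]
	}
	if n >= 0 {
		return s
	}
	return "-" + s
}

func main() {
	fmt.Println(commatize(0))       // 0
	fmt.Println(commatize(999))     // 999
	fmt.Println(commatize(1234567)) // 1,234,567
	fmt.Println(commatize(-1234))   // -1,234
}
```

Because the commas are inserted from right to left, the positions of the digits still to be processed never shift.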

  3. File Size Distribution Function (fileSizeDistribution):

    func fileSizeDistribution(root string) {
       var sizes [12]int
       files := 0
       directories := 0
       totalSize := int64(0)
       walkFunc := func(path string, info os.FileInfo, err error) error {
           if err != nil {
               return err
           }
           files++
           if info.IsDir() {
               directories++
           }
           size := info.Size()
           if size == 0 {
               sizes[0]++
               return nil
           }
           totalSize += size
           logSize := math.Log10(float64(size))
           index := int(math.Floor(logSize))
           sizes[index+1]++
           return nil
       }
       err := filepath.Walk(root, walkFunc)
       if err != nil {
           log.Fatal(err)
       }
       fmt.Printf("File size distribution for '%s' :-\n\n", root)
       for i := 0; i < len(sizes); i++ {
           if i == 0 {
               fmt.Print("  ")
           } else {
               fmt.Print("+ ")
           }
           fmt.Printf("Files less than 10 ^ %-2d bytes : %5d\n", i, sizes[i])
       }
       fmt.Println("                                  -----")
       fmt.Printf("= Total number of files         : %5d\n", files)
       fmt.Printf("  including directories         : %5d\n", directories)
       c := commatize(totalSize)
       fmt.Println("\n  Total size of files           :", c, "bytes")
    }

    The fileSizeDistribution function is the main function for calculating and printing the file size distribution. It takes a root directory as an argument.

    • It initializes several variables:

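      • sizes is an array of 12 counters: sizes[0] counts empty entries, and sizes[k] for k ≥ 1 counts entries of at least 10^(k-1) but fewer than 10^k bytes. An entry of 10^11 bytes (100 GB) or more would index past the end of the array and cause a panic.
      • files counts every entry visited, directories included; directories counts the directories alone.
      • totalSize accumulates the size reported for every entry, directories included. For a directory, the value returned by info.Size() is system-dependent.
    • It defines an anonymous function walkFunc that serves as the callback for the filepath.Walk function. The walkFunc:

      • Returns any error passed in by filepath.Walk, which aborts the walk.
      • Counts every entry, and counts directories separately.
      • Increments sizes[0] for an entry of size zero; otherwise adds the size to totalSize, computes the floor of its base-10 logarithm, and increments the counter at that index plus one.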
    • The program uses filepath.Walk to traverse the root directory and its subdirectories, calling walkFunc for each file/directory.

    • After traversing the directory structure, the function prints out the distribution of file sizes in log10 intervals, the total number of files and directories, and the total size of all files.
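The bucket selection described above can also be done with integer arithmetic alone. The following is a minimal sketch, not the original program's method: bucketIndex and numBuckets are hypothetical names introduced here. It counts decimal digits instead of calling math.Log10, which sidesteps any floating-point rounding concern at exact powers of ten, and it clamps very large sizes into the last bucket rather than indexing out of range.

```go
package main

import "fmt"

// numBuckets mirrors the 12-element sizes array in the listing (assumption:
// the same number of buckets is wanted).
const numBuckets = 12

// bucketIndex is a hypothetical helper. It returns 0 for an empty entry and
// otherwise the number of decimal digits in size, which equals
// floor(log10(size)) + 1. Results are clamped to the last bucket.
func bucketIndex(size int64) int {
	if size <= 0 {
		return 0
	}
	idx := 1
	for size >= 10 {
		size /= 10
		idx++
	}
	if idx >= numBuckets {
		idx = numBuckets - 1 // clamp: sizes of 10^11 bytes or more share the last bucket
	}
	return idx
}

func main() {
	for _, size := range []int64{0, 1, 9, 10, 999, 1000} {
		fmt.Printf("size %d -> bucket %d\n", size, bucketIndex(size))
	}
}
```

Note that clamping changes the meaning of the last bucket to "10^10 bytes or more", so the label printed for that row would need to be adjusted if this helper were used in the program.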

  4. Main Function:

    func main() {
       fileSizeDistribution("./")
    }

    The main function calls the fileSizeDistribution function with the current working directory ("./") as the root, so the program reports the file size distribution for the current directory and its subdirectories. Note that, as written, it does not read the optional command-line argument mentioned in the problem statement.
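The problem statement allows the starting directory to be given on the command line. One possible way to support that is sketched below; rootFromArgs is a hypothetical helper that does not appear in the original program, and main here only prints the chosen directory instead of walking it.

```go
package main

import (
	"fmt"
	"os"
)

// rootFromArgs is a hypothetical helper. It returns the first argument after
// the program name, or "." (the current directory) when none is given.
func rootFromArgs(args []string) string {
	if len(args) > 1 {
		return args[1]
	}
	return "."
}

func main() {
	root := rootFromArgs(os.Args)
	// In the full program this value would be passed to fileSizeDistribution.
	fmt.Println("root directory:", root)
}
```

Taking the argument slice as a parameter, rather than reading os.Args inside the helper, keeps the choice of directory easy to check in isolation.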

In summary, this Go program calculates how many files in a given directory and its subdirectories fall into different size categories (in log10 steps). It then prints out the distribution of file sizes, the total number of files and directories, and the total size of all files.

Source code in the Go programming language

package main

import (
    "fmt"
    "log"
    "math"
    "os"
    "path/filepath"
)

func commatize(n int64) string {
    s := fmt.Sprintf("%d", n)
    if n < 0 {
        s = s[1:]
    }
    le := len(s)
    for i := le - 3; i >= 1; i -= 3 {
        s = s[0:i] + "," + s[i:]
    }
    if n >= 0 {
        return s
    }
    return "-" + s
}

func fileSizeDistribution(root string) {
    var sizes [12]int
    files := 0
    directories := 0
    totalSize := int64(0)
    walkFunc := func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err
        }
        files++
        if info.IsDir() {
            directories++
        }
        size := info.Size()
        if size == 0 {
            sizes[0]++
            return nil
        }
        totalSize += size
        logSize := math.Log10(float64(size))
        index := int(math.Floor(logSize))
        sizes[index+1]++
        return nil
    }
    err := filepath.Walk(root, walkFunc)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("File size distribution for '%s' :-\n\n", root)
    for i := 0; i < len(sizes); i++ {
        if i == 0 {
            fmt.Print("  ")
        } else {
            fmt.Print("+ ")
        }
        fmt.Printf("Files less than 10 ^ %-2d bytes : %5d\n", i, sizes[i])
    }
    fmt.Println("                                  -----")
    fmt.Printf("= Total number of files         : %5d\n", files)
    fmt.Printf("  including directories         : %5d\n", directories)
    c := commatize(totalSize)
    fmt.Println("\n  Total size of files           :", c, "bytes")
}

func main() {
    fileSizeDistribution("./")
}