How to solve the Text processing/2 task step by step in the PureBasic programming language

Published on 12 May 2024 09:40 PM

Problem Statement

The following task concerns data from a pollution monitoring station with twenty-four instruments, each monitoring one aspect of air pollution. Periodically a record is added to the file. Each record is a line of 49 fields separated by white-space, which can be one or more space or tab characters. From the left, the fields are a datestamp followed by twenty-four repetitions of a floating-point instrument value and that instrument's associated integer flag. A flag value is >= 1 if the instrument is working and < 1 if there is some problem with it, in which case that instrument's value should be ignored. The full data file, readings.txt, is also used in the Text processing/1 task. The data is no longer available at its original link, but a zipped mirror was provided. The solution below checks that each record has the expected format, reports duplicated datestamps, and counts the records in which every instrument reading is flagged as good.
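The record layout described above can be illustrated with a minimal sketch. It is not part of the solution that follows. It builds a made-up record (the datestamp and readings are invented for illustration, not taken from readings.txt), collapses each run of tabs and spaces into one space, and then checks the field count and the flags. The names NormalizeWhitespace, #FieldsPerRecord and #PairCount are hypothetical.

```purebasic
; Illustrative sketch only: the record below is made up, not taken from readings.txt.
#FieldsPerRecord = 49   ; 1 datestamp + 24 * (value, flag)
#PairCount = 24

; Collapse every run of tabs/spaces into a single space so StringField() can split on " ".
Procedure.s NormalizeWhitespace(text.s)
  text = Trim(ReplaceString(text, #TAB$, " "))
  While FindString(text, "  ", 1)
    text = ReplaceString(text, "  ", " ")
  Wend
  ProcedureReturn text
EndProcedure

; Build a hypothetical record: a datestamp, then 24 value/flag pairs
; separated by a mix of tabs and spaces.
Define record.s = "1991-03-30", i
For i = 1 To #PairCount
  record = record + #TAB$ + "10.000" + "  " + "1"
Next

Define clean.s = NormalizeWhitespace(record)
Define fieldCount = CountString(clean, " ") + 1
Define goodInstruments = 0

If fieldCount = #FieldsPerRecord
  For i = 1 To #PairCount
    ; Field 1 is the datestamp; value i is field 2*i and its flag is field 2*i + 1.
    If Val(StringField(clean, 2 * i + 1, " ")) >= 1
      goodInstruments + 1
    EndIf
  Next
EndIf

If OpenConsole()
  PrintN("Fields: " + Str(fieldCount))
  PrintN("Working instruments: " + Str(goodInstruments))
  CloseConsole()
EndIf
```

Run as a console program, this should report 49 fields and 24 working instruments for the invented record. Splitting with StringField() is only one option; the solution below uses regular expressions instead.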

Let's start with the solution:

Step-by-step solution for Text processing/2 in the PureBasic programming language

Source code in the PureBasic programming language

Define filename.s = "readings.txt"
#instrumentCount = 24

Enumeration
  #exp_date
  #exp_instruments
  #exp_instrumentStatus
EndEnumeration

Structure duplicate
  date.s
  firstLine.i
  line.i
EndStructure

NewMap dates() ;maps each datestamp to the line number where it first occurs
NewList duplicated.duplicate()
NewList syntaxError()
Define goodRecordCount, totalLines, line.s, i
Dim inputDate.s(0)
Dim instruments.s(0)
  
If ReadFile(0, filename)
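  ;a datestamp: three groups of digits joined by hyphens
  CreateRegularExpression(#exp_date, "\d+-\d+-\d+")
  ;one value/flag pair: whitespace, a decimal value, whitespace, then the flag's optional sign and first digit
  CreateRegularExpression(#exp_instruments, "(\t|\x20)+(\d+\.\d+)(\t|\x20)+\-?\d")
  ;the whitespace and value of a pair; removing this part leaves only the flag
  CreateRegularExpression(#exp_instrumentStatus, "(\t|\x20)+(\d+\.\d+)(\t|\x20)+")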
  Repeat
    line = ReadString(0, #PB_Ascii)
    If line = "": Break: EndIf ;an empty string marks the end of the file (a blank line also stops reading)
    totalLines + 1
  
    ;extract the datestamp; a date that is already in the map is a duplicate
    ExtractRegularExpression(#exp_date, line, inputDate())
    If FindMapElement(dates(), inputDate(0))
      AddElement(duplicated())
      duplicated()\date = inputDate(0)
      duplicated()\firstLine = dates()
      duplicated()\line = totalLines
    Else
      dates(inputDate(0)) = totalLines
    EndIf
    
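    ;extract every value/flag pair that follows the datestamp
    ExtractRegularExpression(#exp_instruments, Mid(line, Len(inputDate(0)) + 1), instruments())
    ;pairsCount is the highest array index, i.e. the number of pairs found minus one
    Define pairsCount = ArraySize(instruments()), containsBadValues = #False
    For i = 0 To pairsCount
      ;strip the value so only the flag remains; a flag below 1 marks a faulty instrument
      If Val(ReplaceRegularExpression(#exp_instrumentStatus, instruments(i), "")) < 1
        containsBadValues = #True
        Break
      EndIf
    Next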
    
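    ;a record needs exactly #instrumentCount pairs; malformed records are reported and never counted as good
    If pairsCount <> #instrumentCount - 1
      AddElement(syntaxError()): syntaxError() = totalLines
    ElseIf Not containsBadValues
      goodRecordCount + 1
    EndIf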
  ForEver
  CloseFile(0)
  
  If OpenConsole()
    ForEach duplicated()
      PrintN("Duplicate date: " + duplicated()\date + " occurs on lines " + Str(duplicated()\firstLine) + " and " + Str(duplicated()\line) + ".")
    Next
    ForEach syntaxError()
      PrintN( "Syntax error in line " + Str(syntaxError()))
    Next
    PrintN(#CRLF$ + Str(goodRecordCount) + " of " + Str(totalLines) + " lines read were valid records.")
    
    Print(#CRLF$ + #CRLF$ + "Press ENTER to exit"): Input()
    CloseConsole()
  EndIf
EndIf

  
