Merge and aggregate datasets: a step-by-step solution in the AutoHotkey programming language
Problem Statement
Merge and aggregate datasets
Merge and aggregate two datasets as provided in .csv files into a new resulting dataset. Use the appropriate methods and data structures depending on the programming language. Use the most common libraries only when built-in functionality is not sufficient.
Either load the data from the .csv files or create the required data structures hard-coded.
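patients.csv file contents:
PATIENT_ID,LASTNAME
1001,Hopper
4004,Wirth
3003,Kemeny
2002,Gosling
5005,Kurtz
visits.csv file contents:
PATIENT_ID,VISIT_DATE,SCORE
2002,2020-09-10,6.8
1001,2020-09-17,5.5
4004,2020-09-24,8.4
2002,2020-10-08,
1001,,6.6
3003,2020-11-12,
4004,2020-11-05,7.0
1001,2020-11-19,5.3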
Create a resulting dataset in-memory or output it to screen or file, whichever is appropriate for the programming language at hand. Merge and group per patient id and last name, get the maximum visit date, and get the sum and average of the scores per patient to get the resulting dataset.
Note that the visit date is purposefully provided as ISO format, so that it could also be processed as text and sorted alphabetically to determine the maximum date.
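The note on ISO dates can be illustrated with a minimal sketch, assuming AutoHotkey v1 syntax. MaxDate is a hypothetical helper introduced here for illustration only; it relies on the fact that non-numeric strings are compared alphabetically, so the latest ISO date is simply the greatest string and an empty date never wins.

```autohotkey
; Hypothetical helper (not part of the task): returns the greatest of the
; given ISO-formatted date strings, ignoring empty values.
MaxDate(dates*) {
    best := ""
    for i, d in dates
        if (d > best)    ; alphabetical comparison, since the operands are not numeric
            best := d
    return best
}

latest := MaxDate("2020-09-17", "", "2020-11-19")
; latest is now "2020-11-19"
```

The same comparison would not be reliable for formats such as DD/MM/YYYY, where alphabetical order and chronological order differ.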
This task is aimed in particular at programming languages that are used in data science and data processing, such as F#, Python, R, SPSS, MATLAB etc.
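Step-by-step solution to Merge and aggregate datasets in the AutoHotkey programming language
The function below splits each dataset into lines, skips the header row, and stores the records in associative arrays keyed by patient id. For each visit it keeps the later of the stored and the current visit date (a plain string comparison, which works because the dates are in ISO format), adds the score to a running sum, and counts only the visits that have a score, so that the average ignores blanks. It then emits one tab-separated row per patient.
Source code in the AutoHotkey programming language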
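Merge_and_aggregate(patients, visits){
    ; Associative arrays keyed by patient id
    NAMES := [], LAST_VISIT := [], SCORE_SUM := [], SCORE_COUNT := []
    for i, line in StrSplit(patients, "`n", "`r"){
        if (i=1)    ; skip the header row
            continue
        x := StrSplit(line, ",")
        NAMES[x.1] := x.2
    }
    for i, line in StrSplit(visits, "`n", "`r"){
        if (i=1)    ; skip the header row
            continue
        x := StrSplit(line, ",")
        ; ISO dates compare correctly as strings; an empty date never wins
        LAST_VISIT[x.1] := x.2 > LAST_VISIT[x.1] ? x.2 : LAST_VISIT[x.1]
        SCORE_SUM[x.1] := (SCORE_SUM[x.1] ? SCORE_SUM[x.1] : 0) + (x.3 ? x.3 : 0)
        if (x.3 != "")    ; count only visits that have a score (a score of 0 still counts)
            SCORE_COUNT[x.1] := (SCORE_COUNT[x.1] ? SCORE_COUNT[x.1] : 0) + 1
    }
    output := "PATIENT_ID`tLASTNAME`tLAST_VISIT`tSCORE_SUM`tSCORE_AVG`n"
    ; The array is named NAMES rather than ID because variable names are
    ; case-insensitive, so a loop variable "id" would alias an array "ID".
    for id, name in NAMES
        output .= id "`t" name "`t" LAST_VISIT[id] "`t" SCORE_SUM[id] "`t" SCORE_SUM[id]/SCORE_COUNT[id] "`n"
    return output
}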
patients =
(
PATIENT_ID,LASTNAME
1001,Hopper
4004,Wirth
3003,Kemeny
2002,Gosling
5005,Kurtz
)
visits =
(
PATIENT_ID,VISIT_DATE,SCORE
2002,2020-09-10,6.8
1001,2020-09-17,5.5
4004,2020-09-24,8.4
2002,2020-10-08,
1001,,6.6
3003,2020-11-12,
4004,2020-11-05,7.0
1001,2020-11-19,5.3
)
MsgBox % Merge_and_aggregate(patients, visits)
return
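The script above hard-codes both datasets. The problem statement also allows loading them from the .csv files, which could be sketched as follows, assuming AutoHotkey v1 and that both files sit next to the script. LoadCsv is a hypothetical helper name, not part of the original solution; it strips trailing line breaks because the parser above treats every line after the header as a record.

```autohotkey
; Hypothetical helper: reads a file into a string and returns "" if the
; file cannot be read. Trailing line breaks are removed so that the last
; line of the file is not followed by an empty record.
LoadCsv(path) {
    FileRead, text, %path%
    return ErrorLevel ? "" : RTrim(text, "`r`n")
}

; Assumed file locations: the same folder as the script.
patients := LoadCsv(A_ScriptDir "\patients.csv")
visits := LoadCsv(A_ScriptDir "\visits.csv")
```

The resulting strings can then be passed to Merge_and_aggregate(patients, visits) in place of the hard-coded continuation sections.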