Text processing/1

You are encouraged to solve this task according to the task description, using any language you may know.
This task has been flagged for clarification. Code on this page in its current state may be flagged incorrect once this task has been clarified. See this page's Talk page for discussion.

Often data is produced by one program in a format unsuited to later use by another program or person. In these situations another program can be written to parse and transform the original data into a format useful to the other. The term "data munging" is often used in programming circles for this task.

A request on the comp.lang.awk newsgroup led to a typical data munging task:

I have to analyse data files that have the following format:
Each row corresponds to 1 day and the field logic is: $1 is the date,
followed by 24 value/flag pairs, representing measurements at 01:00,
02:00 ... 24:00 of the respective day. In short:

<date> <val1> <flag1> <val2> <flag2> ...  <val24> <flag24>

Some test data is available at:
... (no longer available at original location)

I have to sum up the values (per day and only valid data, i.e. with
flag>0) in order to calculate the mean. That's not too difficult.
However, I also need to know what the "maximum data gap" is, i.e. the
longest period with successive invalid measurements (i.e values with
flag<=0)

The data is free to download and use and is of this format:

1991-03-30	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1
1991-03-31	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	10.000	1	20.000	1	20.000	1	20.000	1	35.000	1	50.000	1	60.000	1	40.000	1	30.000	1	30.000	1	30.000	1	25.000	1	20.000	1	20.000	1	20.000	1	20.000	1	20.000	1	35.000	1
1991-03-31	40.000	1	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2	0.000	-2
1991-04-01	0.000	-2	13.000	1	16.000	1	21.000	1	24.000	1	22.000	1	20.000	1	18.000	1	29.000	1	44.000	1	50.000	1	43.000	1	38.000	1	27.000	1	27.000	1	24.000	1	23.000	1	18.000	1	12.000	1	13.000	1	14.000	1	15.000	1	13.000	1	10.000	1
1991-04-02	8.000	1	9.000	1	11.000	1	12.000	1	12.000	1	12.000	1	27.000	1	26.000	1	27.000	1	33.000	1	32.000	1	31.000	1	29.000	1	31.000	1	25.000	1	25.000	1	24.000	1	21.000	1	17.000	1	14.000	1	15.000	1	12.000	1	12.000	1	10.000	1
1991-04-03	10.000	1	9.000	1	10.000	1	10.000	1	9.000	1	10.000	1	15.000	1	24.000	1	28.000	1	24.000	1	18.000	1	14.000	1	12.000	1	13.000	1	14.000	1	15.000	1	14.000	1	15.000	1	13.000	1	13.000	1	13.000	1	12.000	1	10.000	1	10.000	1

Only a sample of the data showing its format is given above. The full example file may be downloaded here.

Structure your program to show statistics for each line of the file (similar to the original Python, Perl, and AWK examples below), followed by summary statistics for the file. When showing example output, show just a few line statistics and the full end summary.
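For orientation, the sketch below (not one of the catalogued solutions) shows the required processing in Python. It assumes a tab-separated file named readings.txt in the format shown above, treats a value as valid when its flag is greater than zero, and prints per-line statistics followed by the file summary and the longest gap.

def munge(filename="readings.txt"):
    total = 0.0        # sum of all valid values in the file
    readings = 0       # number of valid values in the file
    bad_run = 0        # current run of consecutive invalid values
    max_run = 0        # longest run of invalid values seen so far
    max_run_date = ""  # date on which that longest run started
    run_start = ""     # date on which the current run started
    with open(filename) as f:
        for line in f:
            fields = line.split()
            date, pairs = fields[0], fields[1:]
            line_sum, line_ok = 0.0, 0
            for value, flag in zip(pairs[0::2], pairs[1::2]):
                if int(flag) > 0:              # valid reading
                    line_sum += float(value)
                    line_ok += 1
                    bad_run = 0
                else:                          # invalid reading (flag <= 0)
                    if bad_run == 0:
                        run_start = date       # a new gap begins here
                    bad_run += 1
                    if bad_run > max_run:
                        max_run, max_run_date = bad_run, run_start
            total += line_sum
            readings += line_ok
            mean = line_sum / line_ok if line_ok else 0.0
            print("Line: %s  Accept: %2d  Line_tot: %10.3f  Line_avg: %10.3f"
                  % (date, line_ok, line_sum, mean))
    print("\nTotal    = %.3f" % total)
    print("Readings = %d" % readings)
    print("Average  = %.3f" % (total / readings))
    print("Longest gap = %d consecutive invalid readings, starting on %s"
          % (max_run, max_run_date))

if __name__ == "__main__":
    munge()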


Ada

with Ada.Text_IO;            use Ada.Text_IO;
with Strings_Edit; use Strings_Edit;
with Strings_Edit.Floats; use Strings_Edit.Floats;
with Strings_Edit.Integers; use Strings_Edit.Integers;
 
procedure Data_Munging is
Syntax_Error : exception;
type Gap_Data is record
Count  : Natural := 0;
Line  : Natural := 0;
Pointer : Integer;
Year  : Integer;
Month  : Integer;
Day  : Integer;
end record;
File  : File_Type;
Max  : Gap_Data;
This  : Gap_Data;
Current : Gap_Data;
Count  : Natural := 0;
Sum  : Float  := 0.0;
begin
Open (File, In_File, "readings.txt");
loop
declare
Line  : constant String := Get_Line (File);
Pointer : Integer := Line'First;
Flag  : Integer;
Data  : Float;
begin
Current.Line := Current.Line + 1;
Get (Line, Pointer, SpaceAndTab);
Get (Line, Pointer, Current.Year);
Get (Line, Pointer, Current.Month);
Get (Line, Pointer, Current.Day);
while Pointer <= Line'Last loop
Get (Line, Pointer, SpaceAndTab);
Current.Pointer := Pointer;
Get (Line, Pointer, Data);
Get (Line, Pointer, SpaceAndTab);
Get (Line, Pointer, Flag);
if Flag < 0 then
if This.Count = 0 then
This := Current;
end if;
This.Count := This.Count + 1;
else
if This.Count > 0 and then Max.Count < This.Count then
Max := This;
end if;
This.Count := 0;
Count := Count + 1;
Sum  := Sum + Data;
end if;
end loop;
exception
when End_Error =>
raise Syntax_Error;
end;
end loop;
exception
when End_Error =>
Close (File);
if This.Count > 0 and then Max.Count < This.Count then
Max := This;
end if;
Put_Line ("Average " & Image (Sum / Float (Count)) & " over " & Image (Count));
if Max.Count > 0 then
Put ("Max. " & Image (Max.Count) & " false readings start at ");
Put (Image (Max.Line) & ':' & Image (Max.Pointer) & " stamped ");
Put_Line (Image (Max.Year) & Image (Max.Month) & Image (Max.Day));
end if;
when others =>
Close (File);
Put_Line ("Syntax error at " & Image (Current.Line) & ':' & Image (Max.Pointer));
end Data_Munging;

The implementation performs minimal checks. The average is calculated over all valid data. For the longest chain of consecutive invalid data, the source line number, the column number, and the time stamp of the first invalid reading are printed.

Sample output:
Average 10.47915 over 129628
Max. 589 false readings start at 1136:20 stamped 1993-2-9

ALGOL 68

Translation of: Python
Works with: ALGOL 68G version Any - tested with release mk15-0.8b.fc9.i386
INT no data := 0;               # Current run of consecutive flags<0 in lines of file #
INT no data max := -1; # Max consecutive flags<0 in lines of file #
FLEX[0]STRING no data max line; # ... and line number(s) where it occurs #
 
REAL tot file := 0; # Sum of file data #
INT num file := 0; # Number of file data items with flag>0 #
 
# CHAR fs = " "; #
INT nf = 24;
 
INT upb list := nf;
FORMAT list repr = $n(upb list-1)(g", ")g$;
 
PROC exception = ([]STRING args)VOID:(
putf(stand error, ($"Exception"$, $", "g$, args, $l$));
stop
);
 
PROC raise io error = (STRING message)VOID:exception(("io error", message));
 
OP +:= = (REF FLEX []STRING rhs, STRING append)REF FLEX[]STRING: (
HEAP [UPB rhs+1]STRING out rhs;
out rhs[:UPB rhs] := rhs;
out rhs[UPB rhs+1] := append;
rhs := out rhs;
out rhs
);
 
INT upb opts = 3; # these are "a68g" "./Data_Munging.a68" & "-" #
[argc - upb opts]STRING in files;
FOR arg TO UPB in files DO in files[arg] := argv(upb opts + arg) OD;
 
MODE FIELD = STRUCT(REAL data, INT flag);
FORMAT field repr = $2(g)$;
 
FOR index file TO UPB in files DO
STRING file name = in files[index file], FILE file;
IF open(file, file name, stand in channel) NE 0 THEN
raise io error("Cannot open """+file name+"""") FI;
on logical file end(file, (REF FILE f)BOOL: logical file end done);
REAL tot line, INT num line;
# make term(file, ", ") for CSV data #
STRING date;
DO
tot line := 0; # sum of line data #
num line := 0; # number of line data items with flag>0 #
# extract field info #
[nf]FIELD data;
getf(file, ($10a$, date, field repr, data, $l$));
 
FOR key TO UPB data DO
FIELD field = data[key];
IF flag OF field<1 THEN
no data +:= 1
ELSE
# check run of data-absent data #
IF no data max = no data AND no data>0 THEN
no data max line +:= date FI;
IF no data max<no data AND no data>0 THEN
no data max := no data;
no data max line := date FI;
# re-initialise run of no data counter #
no data := 0;
# gather values for averaging #
tot line +:= data OF field;
num line +:= 1
FI
OD;
 
# totals for the file so far #
tot file +:= tot line;
num file +:= num line;
 
printf(($"Line: "g" Reject: "g(-2)" Accept: "g(-2)" Line tot: "g(-14, 3)" Line avg: "g(-14, 3)l$,
date,
UPB(data) -num line,
num line, tot line,
IF num line>0 THEN tot line/num line ELSE 0 FI))
OD;
logical file end done:
close(file)
OD;
 
FORMAT plural = $b(" ", "s")$,
p = $b("", "s")$;
 
upb list := UPB in files;
printf(($l"File"f(plural)" = "$, upb list = 1, list repr, in files, $l$,
$"Total = "g(-0, 3)l$, tot file,
$"Readings = "g(-0)l$, num file,
$"Average = "g(-0, 3)l$, tot file / num file));
 
upb list := UPB no data max line;
printf(($l"Maximum run"f(p)" of "g(-0)" consecutive false reading"f(p)" ends at line starting with date"f(p)": "$,
upb list = 1, no data max, no data max = 0, upb list = 1, list repr, no data max line, $l$))

Command:

$ a68g ./Data_Munging.a68 - data
Output:
Line: 1991-03-30  Reject:  0  Accept: 24  Line tot:        240.000  Line avg:         10.000
Line: 1991-03-31  Reject:  0  Accept: 24  Line tot:        565.000  Line avg:         23.542
Line: 1991-03-31  Reject: 23  Accept:  1  Line tot:         40.000  Line avg:         40.000
Line: 1991-04-01  Reject:  1  Accept: 23  Line tot:        534.000  Line avg:         23.217
Line: 1991-04-02  Reject:  0  Accept: 24  Line tot:        475.000  Line avg:         19.792
Line: 1991-04-03  Reject:  0  Accept: 24  Line tot:        335.000  Line avg:         13.958

File     = data
Total    = 2189.000
Readings = 120
Average  = 18.242

Maximum run of 24 consecutive false readings ends at line starting with date: 1991-04-01

Aime

integer bads, count, max_bads;
file f;
list l;
real s;
text bad_day, worst_day;
 
f_affix(f, "/dev/stdin");
 
max_bads = 0;
count = 0;
bads = 0;
s = 0;
 
while (f_list(f, l, 0) ^ -1) {
integer e, i;
 
i = 2;
while (i < 49) {
e = atoi(l_q_text(l, i));
if (0 < e) {
count += 1;
s += atof(l_q_text(l, i - 1));
if (max_bads < bads) {
max_bads = bads;
worst_day = bad_day;
}
bads = 0;
} else {
if (!bads) {
bad_day = l_q_text(l, 0);
}
bads += 1;
}
i += 2;
}
}
 
o_text("Averaged ");
o_real(3, s / count);
o_text(" over ");
o_integer(count);
o_text(" readings.\n");
 
o_text("Longest bad run ");
o_integer(max_bads);
o_text(", started ");
o_text(worst_day);
o_text(".\n");

Run as:

cat readings.txt | tr -d \\r | aime SOURCE_FILE
Output:
Averaged 10.497 over 129403 readings.
Longest bad run 589, started 1993-02-09.

AutoHotkey

; Author AlephX Aug 17 2011
 
SetFormat, float, 4.2
SetFormat, FloatFast, 4.2
 
data = %A_scriptdir%\readings.txt
result = %A_scriptdir%\results.txt
totvalid := 0
totsum := 0
totavg:= 0
 
Loop, Read, %data%, %result%
{
sum := 0
Valid := 0
Couples := 0
Lines := A_Index
Loop, parse, A_LoopReadLine, %A_Tab%
{
;MsgBox, Field number %A_Index% is %A_LoopField%
if A_index = 1
{
Date := A_LoopField
Counter := 0
}
else
{
Counter++
couples := Couples + 0.5
if Counter = 1
{
value := A_LoopField / 1
}
else
{
if A_loopfield > 0
{
Sum := Sum + value
Valid++
 
if (wrong > maxwrong)
{
maxwrong := wrong
lastwrongdate := currwrongdate
startwrongdate := firstwrongdate
startoccurrence := firstoccurrence
lastoccurrence := curroccurrence
}
wrong := 0
}
else
{
wrong++
currwrongdate := date
curroccurrence := (A_index-1) / 2
if (wrong = 1)
{
firstwrongdate := date
firstoccurrence := curroccurrence
}
}
Counter := 0
}
}
}
avg := sum / valid
TotValid := Totvalid+valid
TotSum := Totsum+sum
FileAppend, Day: %date% sum: %sum% avg: %avg% Readings: %valid%/%couples%`n
}
 
Totavg := TotSum / TotValid
FileAppend, `n`nDays %Lines%`nMaximal wrong readings: %maxwrong% from %startwrongdate% at %startoccurrence% to %lastwrongdate% at %lastoccurrence%`n`n, %result%
FileAppend, Valid readings: %TotValid%`nTotal Value: %TotSUm%`nAverage: %TotAvg%, %result%
Sample output:
Day: 1990-01-01 sum: 590.00 avg: 26.82 Readings: 22/24.00
Day: 1990-01-02 sum: 410.00 avg: 17.08 Readings: 24/24.00
Day: 1990-01-03 sum: 1415.00 avg: 58.96 Readings: 24/24.00
Day: 1990-01-04 sum: 1800.00 avg: 75.00 Readings: 24/24.00
Day: 1990-01-05 sum: 1130.00 avg: 47.08 Readings: 24/24.00
...
Day: 2004-12-31 sum: 47.30 avg: 2.06 Readings: 23/24.00


Days 5471
Maximal wrong readings: 589 from 1993-02-09 at 2.00 to 1993-03-05 at 14.00

Valid readings: 129403
Total Value: 1358393.40
Average: 10.50

AWK

BEGIN{
nodata = 0; # Current run of consecutive flags<0 in lines of file
nodata_max=-1; # Max consecutive flags<0 in lines of file
nodata_maxline="!"; # ... and line number(s) where it occurs
}
FNR==1 {
# Accumulate input file names
if(infiles){
infiles = infiles "," FILENAME
} else {
infiles = FILENAME
}
}
{
tot_line=0; # sum of line data
num_line=0; # number of line data items with flag>0
 
# extract field info, skipping initial date field
for(field=2; field<=NF; field+=2){
datum=$field;
flag=$(field+1);
if(flag<1){
nodata++
}else{
# check run of data-absent fields
if(nodata_max==nodata && (nodata>0)){
nodata_maxline=nodata_maxline ", " $1
}
if(nodata_max<nodata && (nodata>0)){
nodata_max=nodata
nodata_maxline=$1
}
# re-initialise run of nodata counter
nodata=0;
# gather values for averaging
tot_line+=datum
num_line++;
}
}
 
# totals for the file so far
tot_file += tot_line
num_file += num_line
 
printf "Line: %11s Reject: %2i Accept: %2i Line_tot: %10.3f Line_avg: %10.3f\n", \
$1, ((NF -1)/2) -num_line, num_line, tot_line, (num_line>0)? tot_line/num_line: 0
 
# debug prints of original data plus some of the computed values
#printf "%s  %15.3g  %4i\n", $0, tot_line, num_line
#printf "%s\n  %15.3f  %4i  %4i  %4i  %s\n", $0, tot_line, num_line, nodata, nodata_max, nodata_maxline
 
 
}
 
END{
printf "\n"
printf "File(s) = %s\n", infiles
printf "Total = %10.3f\n", tot_file
printf "Readings = %6i\n", num_file
printf "Average = %10.3f\n", tot_file / num_file
 
printf "\nMaximum run(s) of %i consecutive false readings ends at line starting with date(s): %s\n", nodata_max, nodata_maxline
}
Sample output:
bash$ awk -f readings.awk readings.txt | tail
Line:  2004-12-29  Reject:  1  Accept: 23  Line_tot:     56.300  Line_avg:      2.448
Line:  2004-12-30  Reject:  1  Accept: 23  Line_tot:     65.300  Line_avg:      2.839
Line:  2004-12-31  Reject:  1  Accept: 23  Line_tot:     47.300  Line_avg:      2.057

File(s)  = readings.txt
Total    = 1358393.400
Readings = 129403
Average  =     10.497

Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): 1993-03-05
bash$ 

Batch File

@echo off
setlocal ENABLEDELAYEDEXPANSION
set maxrun= 0
set maxstart=
set maxend=
set notok=0
set inputfile=%1
for /F "tokens=1,*" %%i in (%inputfile%) do (
set date=%%i
call :processline %%j
)
 
echo\
echo max false: %maxrun% from %maxstart% until %maxend%
 
goto :EOF
 
:processline
set sum=0000
set count=0
set hour=1
:loop
if "%1"=="" goto :result
set num=%1
if "%2"=="1" (
if "%notok%" NEQ "0" (
set notok= !notok!
if /I "!notok:~-5!" GTR "%maxrun%" (
set maxrun=!notok:~-5!
set maxstart=%nok0date% %nok0hour%
set maxend=%nok1date% %nok1hour%
)
set notok=0
)
set /a sum+=%num:.=%
set /a count+=1
) else (
if "%notok%" EQU "0" (
set nok0date=%date%
set nok0hour=%hour%
) else (
set nok1date=%date%
set nok1hour=%hour%
)
set /a notok+=1
)
shift
shift
set /a hour+=1
goto :loop
 
:result
if "%count%"=="0" (
set mean=0
) else (
set /a mean=%sum%/%count%
)
if "%mean%"=="0" set mean=0000
if "%sum%"=="0" set sum=0000
set mean=%mean:~0,-3%.%mean:~-3%
set sum=%sum:~0,-3%.%sum:~-3%
set count= %count%
set sum= %sum%
set mean= %mean%
echo Line: %date% Accept: %count:~-3% tot: %sum:~-8% avg: %mean:~-8%
 
goto :EOF
Output:
C:\ >batch-fileparsing.bat readings-2.txt
Line: 1990-01-01 Accept:  22  tot:  590.000  avg:   26.818
Line: 1990-01-02 Accept:  24  tot:  410.000  avg:   17.083
Line: 1990-01-03 Accept:  24  tot: 1415.000  avg:   58.958
Line: 1990-01-04 Accept:  24  tot: 1800.000  avg:   75.000
Line: 1990-01-05 Accept:  24  tot: 1130.000  avg:   47.083
Line: 1990-01-06 Accept:  24  tot: 1820.000  avg:   75.833
...
Line: 1993-12-26 Accept:  24  tot:  195.000  avg:    8.125
Line: 1993-12-27 Accept:  24  tot:  112.000  avg:    4.666
Line: 1993-12-28 Accept:  24  tot:  303.000  avg:   12.625
Line: 1993-12-29 Accept:  24  tot:  339.000  avg:   14.125
Line: 1993-12-30 Accept:  24  tot:  593.000  avg:   24.708
Line: 1993-12-31 Accept:  24  tot:  865.000  avg:   36.041
...
max false:   589  from 1993-02-09 2 until 1993-03-05 14

BBC BASIC

      file% = OPENIN("readings.txt")
IF file% = 0 THEN PRINT "Could not open test data file" : END
 
Total = 0
Count% = 0
BadMax% = 0
bad% = 0
WHILE NOT EOF#file%
text$ = GET$#file%
IF text$<>"" THEN
tab% = INSTR(text$, CHR$(9))
date$ = LEFT$(text$, tab% - 1)
acc = 0
cnt% = 0
FOR field% = 1 TO 24
dval = VALMID$(text$, tab%+1)
tab% = INSTR(text$, CHR$(9), tab%+1)
flag% = VALMID$(text$, tab%+1)
tab% = INSTR(text$, CHR$(9), tab%+1)
IF flag% > 0 THEN
acc += dval
cnt% += 1
bad% = 0
ELSE
bad% += 1
IF bad% > BadMax% BadMax% = bad% : BadDate$ = date$
ENDIF
NEXT field%
@% = &90A
PRINT "Date: " date$ " Good = "; cnt%, " Bad = "; 24-cnt%, ;
@% = &20308
IF cnt% THEN PRINT " Total = " acc " Mean = " acc / cnt% ;
PRINT
Total += acc
Count% += cnt%
ENDIF
ENDWHILE
CLOSE #file%
PRINT ' "Grand total = " ; Total
PRINT "Number of valid readings = " ; STR$(Count%)
PRINT "Overall mean = " ; Total / Count%
@% = &90A
PRINT '"Longest run of bad readings = " ; BadMax% " ending " BadDate$
Output:
Date: 1990-01-01  Good = 22     Bad = 2   Total =  590.000  Mean =   26.818
Date: 1990-01-02  Good = 24     Bad = 0   Total =  410.000  Mean =   17.083
Date: 1990-01-03  Good = 24     Bad = 0   Total = 1415.000  Mean =   58.958
Date: 1990-01-04  Good = 24     Bad = 0   Total = 1800.000  Mean =   75.000
Date: 1990-01-05  Good = 24     Bad = 0   Total = 1130.000  Mean =   47.083
Date: 1990-01-06  Good = 24     Bad = 0   Total = 1820.000  Mean =   75.833
Date: 1990-01-07  Good = 24     Bad = 0   Total = 1385.000  Mean =   57.708
....
Date: 2004-12-29  Good = 23     Bad = 1   Total =   56.300  Mean =    2.448
Date: 2004-12-30  Good = 23     Bad = 1   Total =   65.300  Mean =    2.839
Date: 2004-12-31  Good = 23     Bad = 1   Total =   47.300  Mean =    2.057

Grand total = 1358393.402
Number of valid readings = 129403
Overall mean = 10.497

Longest run of bad readings = 589 ending 1993-03-05

C

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
 
static int badHrs, maxBadHrs;
 
static double hrsTot = 0.0;
static int rdgsTot = 0;
char bhEndDate[40];
 
int mungeLine( char *line, int lno, FILE *fout )
{
char date[40], *tkn;
int dHrs, flag, hrs2, hrs;
double hrsSum;
int hrsCnt = 0;
double avg;
 
tkn = strtok(line, ".");
if (tkn) {
int n = sscanf(tkn, "%s %d", date, &hrs2);
if (n<2) {
printf("badly formatted line - %d %s\n", lno, tkn);
return 0;
}
hrsSum = 0.0;
while( tkn= strtok(NULL, ".")) {
n = sscanf(tkn,"%d %d %d", &dHrs, &flag, &hrs);
if (n>=2) {
if (flag > 0) {
hrsSum += 1.0*hrs2 + .001*dHrs;
hrsCnt += 1;
if (maxBadHrs < badHrs) {
maxBadHrs = badHrs;
strcpy(bhEndDate, date);
}
badHrs = 0;
}
else {
badHrs += 1;
}
hrs2 = hrs;
}
else {
printf("bad file syntax line %d: %s\n",lno, tkn);
}
}
avg = (hrsCnt > 0)? hrsSum/hrsCnt : 0.0;
fprintf(fout, "%s Reject: %2d Accept: %2d Average: %7.3f\n",
date, 24-hrsCnt, hrsCnt, avg);
hrsTot += hrsSum;
rdgsTot += hrsCnt;
}
return 1;
}
 
int main()
{
FILE *infile, *outfile;
int lineNo = 0;
char line[512];
const char *ifilename = "readings.txt";
outfile = fopen("V0.txt", "w");
 
infile = fopen(ifilename, "rb");
if (!infile) {
printf("Can't open %s\n", ifilename);
exit(1);
}
while (NULL != fgets(line, 512, infile)) {
lineNo += 1;
if (0 == mungeLine(line, lineNo, outfile))
printf("Bad line at %d",lineNo);
}
fclose(infile);
 
fprintf(outfile, "File:  %s\n", ifilename);
fprintf(outfile, "Total:  %.3f\n", hrsTot);
fprintf(outfile, "Readings: %d\n", rdgsTot);
fprintf(outfile, "Average:  %.3f\n", hrsTot/rdgsTot);
fprintf(outfile, "\nMaximum number of consecutive bad readings is %d\n", maxBadHrs);
fprintf(outfile, "Ends on date %s\n", bhEndDate);
fclose(outfile);
return 0;
}
Sample output:
1990-01-01  Reject:  2  Accept: 22  Average:  26.818
1990-01-02  Reject:  0  Accept: 24  Average:  17.083
1990-01-03  Reject:  0  Accept: 24  Average:  58.958
1990-01-04  Reject:  0  Accept: 24  Average:  75.000
1990-01-05  Reject:  0  Accept: 24  Average:  47.083
...
File:     readings.txt
Total:    1358393.400
Readings: 129403
Average:  10.497

Maximum number of consecutive bad readings is 589
Ends on date 1993-03-05

C++

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <iomanip>
#include <boost/lexical_cast.hpp>
#include <boost/algorithm/string.hpp>
 
using std::cout;
using std::endl;
const int NumFlags = 24;
 
int main()
{
std::fstream file("readings.txt");
 
int badCount = 0;
std::string badDate;
int badCountMax = 0;
while(true)
{
std::string line;
getline(file, line);
if(!file.good())
break;
 
std::vector<std::string> tokens;
boost::algorithm::split(tokens, line, boost::is_space());
 
if(tokens.size() != NumFlags * 2 + 1)
{
cout << "Bad input file." << endl;
return 0;
}
 
double total = 0.0;
int accepted = 0;
for(size_t i = 1; i < tokens.size(); i += 2)
{
double val = boost::lexical_cast<double>(tokens[i]);
int flag = boost::lexical_cast<int>(tokens[i+1]);
if(flag > 0)
{
total += val;
++accepted;
badCount = 0;
}
else
{
++badCount;
if(badCount > badCountMax)
{
badCountMax = badCount;
badDate = tokens[0];
}
}
}
 
cout << tokens[0];
cout << " Reject: " << std::setw(2) << (NumFlags - accepted);
cout << " Accept: " << std::setw(2) << accepted;
cout << " Average: " << std::setprecision(5) << total / accepted << endl;
}
cout << endl;
cout << "Maximum number of consecutive bad readings is " << badCountMax << endl;
cout << "Ends on date " << badDate << endl;
}
Output:
1990-01-01  Reject:  2  Accept: 22  Average: 26.818
1990-01-02  Reject:  0  Accept: 24  Average: 17.083
1990-01-03  Reject:  0  Accept: 24  Average: 58.958
1990-01-04  Reject:  0  Accept: 24  Average: 75
1990-01-05  Reject:  0  Accept: 24  Average: 47.083
...
Maximum number of consecutive bad readings is 589
Ends on date 1993-03-05

COBOL

       IDENTIFICATION DIVISION.
PROGRAM-ID. data-munging.
 
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT input-file ASSIGN TO INPUT-FILE-PATH
ORGANIZATION LINE SEQUENTIAL
FILE STATUS file-status.
 
DATA DIVISION.
FILE SECTION.
FD input-file.
01 input-record.
03 date-stamp PIC X(10).
03 FILLER PIC X.
*> Curse whoever decided to use tabs and variable length
*> data in the file!
03 input-data-pairs PIC X(300).
 
WORKING-STORAGE SECTION.
78 INPUT-FILE-PATH VALUE "readings.txt".
 
01 file-status PIC 99.
88 file-is-ok VALUE 0.
88 end-of-file VALUE 10.
 
01 data-pair.
03 val PIC 9(3)V9(3).
03 flag PIC S9.
88 invalid-flag VALUE -9 THRU 0.
 
01 val-length PIC 9.
01 flag-length PIC 9.
01 offset PIC 99.
 
01 day-total PIC 9(5)V9(3).
01 grand-total PIC 9(8)V9(3).
01 mean-val PIC 9(8)V9(3).
 
01 day-rejected PIC 9(5).
01 day-accepted PIC 9(5).
 
01 total-rejected PIC 9(8).
01 total-accepted PIC 9(8).
 
01 current-data-gap PIC 9(8).
01 max-data-gap PIC 9(8).
01 max-data-gap-end PIC X(10).
 
PROCEDURE DIVISION.
DECLARATIVES.
*> Terminate the program if an error occurs on input-file.
input-file-error SECTION.
USE AFTER STANDARD ERROR ON input-file.
 
DISPLAY
"An error occurred while reading input.txt. "
"File error: " file-status
". The program will terminate."
END-DISPLAY
 
GOBACK
.
 
END DECLARATIVES.
 
main-line.
*> Terminate the program if the file cannot be opened.
OPEN INPUT input-file
IF NOT file-is-ok
DISPLAY "File could not be opened. The program will "
"terminate."
GOBACK
END-IF
 
*> Process the data in the file.
PERFORM FOREVER
*> Stop processing if at the end of the file.
READ input-file
AT END
EXIT PERFORM
END-READ
 
*> Split the data up and process the value-flag pairs.
PERFORM UNTIL input-data-pairs = SPACES
*> Split off the value-flag pair at the front of the
*> record.
UNSTRING input-data-pairs DELIMITED BY X"09"
INTO val COUNT val-length, flag COUNT flag-length
 
COMPUTE offset = val-length + flag-length + 3
MOVE input-data-pairs (offset:) TO input-data-pairs
 
*> Process according to flag.
IF NOT invalid-flag
ADD val TO day-total, grand-total
 
ADD 1 TO day-accepted, total-accepted
 
IF max-data-gap < current-data-gap
MOVE current-data-gap TO max-data-gap
MOVE date-stamp TO max-data-gap-end
END-IF
 
MOVE ZERO TO current-data-gap
ELSE
ADD 1 TO current-data-gap, day-rejected,
total-rejected
END-IF
END-PERFORM
 
*> Display day stats.
DIVIDE day-total BY day-accepted GIVING mean-val
DISPLAY
date-stamp
" Reject: " day-rejected
" Accept: " day-accepted
" Average: " mean-val
END-DISPLAY
 
INITIALIZE day-rejected, day-accepted, mean-val,
day-total
END-PERFORM
 
CLOSE input-file
 
*> Display overall stats.
DISPLAY SPACE
DISPLAY "File: " INPUT-FILE-PATH
DISPLAY "Total: " grand-total
DISPLAY "Readings: " total-accepted
 
DIVIDE grand-total BY total-accepted GIVING mean-val
DISPLAY "Average: " mean-val
 
DISPLAY SPACE
DISPLAY "Bad readings: " total-rejected
DISPLAY "Maximum number of consecutive bad readings is "
max-data-gap
DISPLAY "Ends on date " max-data-gap-end
 
GOBACK
.
Example output:
1990-01-01 Reject: 00002 Accept: 00022 Average: 00000026.818
1990-01-02 Reject: 00000 Accept: 00024 Average: 00000017.083
1990-01-03 Reject: 00000 Accept: 00024 Average: 00000058.958
...
2004-12-29 Reject: 00001 Accept: 00023 Average: 00000002.447
2004-12-30 Reject: 00001 Accept: 00023 Average: 00000002.839
2004-12-31 Reject: 00001 Accept: 00023 Average: 00000002.056
 
File:         readings.txt
Total:        01358393.400
Readings:     00129403
Average:      00000010.497
 
Bad readings: 00001901
Maximum number of consecutive bad readings is 00000589
Ends on date 1993-03-05

Common Lisp

(defvar *invalid-count*)
(defvar *max-invalid*)
(defvar *max-invalid-date*)
(defvar *total-sum*)
(defvar *total-valid*)
 
(defun read-flag (stream date)
(let ((flag (read stream)))
(if (plusp flag)
(setf *invalid-count* 0)
(when (< *max-invalid* (incf *invalid-count*))
(setf *max-invalid* *invalid-count*)
(setf *max-invalid-date* date)))
flag))
 
(defun parse-line (line)
(with-input-from-string (s line)
(let ((date (make-string 10)))
(read-sequence date s)
(cons date (loop repeat 24 collect (list (read s)
(read-flag s date)))))))
 
(defun analyze-line (line)
(destructuring-bind (date &rest rest) line
(let* ((valid (remove-if-not #'plusp rest :key #'second))
(n (length valid))
(sum (apply #'+ (mapcar #'rationalize (mapcar #'first valid))))
(avg (if valid (/ sum n) 0)))
(incf *total-valid* n)
(incf *total-sum* sum)
(format t "Line: ~a Reject: ~2d Accept: ~2d ~
Line_tot: ~8,3f Line_avg: ~7,3f~%"

date (- 24 n) n sum avg))))
 
(defun process (pathname)
(let ((*invalid-count* 0) (*max-invalid* 0) *max-invalid-date*
(*total-sum* 0) (*total-valid* 0))
(with-open-file (f pathname)
(loop for line = (read-line f nil nil)
while line
do (analyze-line (parse-line line))))
(format t "~%File = ~a" pathname)
(format t "~&Total = ~f" *total-sum*)
(format t "~&Readings = ~a" *total-valid*)
(format t "~&Average = ~10,3f~%" (/ *total-sum* *total-valid*))
(format t "~%Maximum run(s) of ~a consecutive false readings ends at ~
line starting with date(s): ~a~%"

*max-invalid* *max-invalid-date*)))
Example output:
...
Line: 2004-12-29  Reject:  1  Accept: 23  Line_tot:   56.300  Line_avg:   2.448
Line: 2004-12-30  Reject:  1  Accept: 23  Line_tot:   65.300  Line_avg:   2.839
Line: 2004-12-31  Reject:  1  Accept: 23  Line_tot:   47.300  Line_avg:   2.057

File     = readings.txt
Total    = 1358393.4
Readings = 129403
Average  =     10.497

Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): 1993-03-05

D

Translation of: Python
void main(in string[] args) {
import std.stdio, std.conv, std.string;
 
const fileNames = (args.length == 1) ? ["readings.txt"] :
args[1 .. $];
 
int noData, noDataMax = -1;
string[] noDataMaxLine;
 
double fileTotal = 0.0;
int fileValues;
 
foreach (const fileName; fileNames) {
foreach (char[] line; fileName.File.byLine) {
double lineTotal = 0.0;
int lineValues;
 
// Extract field info.
const parts = line.split;
const date = parts[0];
const fields = parts[1 .. $];
assert(fields.length % 2 == 0,
format("Expected even number of fields, not %d.",
fields.length));
 
for (int i; i < fields.length; i += 2) {
immutable value = fields[i].to!double;
immutable flag = fields[i + 1].to!int;
 
if (flag < 1) {
noData++;
continue;
}
 
// Check run of data-absent fields.
if (noDataMax == noData && noData > 0)
noDataMaxLine ~= date.idup;
 
if (noDataMax < noData && noData > 0) {
noDataMax = noData;
noDataMaxLine.length = 1;
noDataMaxLine[0] = date.idup;
}
 
// Re-initialise run of noData counter.
noData = 0;
 
// Gather values for averaging.
lineTotal += value;
lineValues++;
}
 
// Totals for the file so far.
fileTotal += lineTotal;
fileValues += lineValues;
 
writefln("Line: %11s Reject: %2d Accept: %2d" ~
" Line_tot: %10.3f Line_avg: %10.3f",
date,
fields.length / 2 - lineValues,
lineValues,
lineTotal,
(lineValues > 0) ? lineTotal / lineValues : 0.0);
}
}
 
writefln("\nFile(s) = %-(%s, %)", fileNames);
writefln("Total = %10.3f", fileTotal);
writefln("Readings = %6d", fileValues);
writefln("Average = %10.3f", fileTotal / fileValues);
 
writefln("\nMaximum run(s) of %d consecutive false " ~
"readings ends at line starting with date(s): %-(%s, %)",
noDataMax, noDataMaxLine);
}

The output matches that of the Python version.

Erlang

The function file_contents/1 is used by Text_processing/2; please update that task's solution if you make any interface changes here.

 
-module( text_processing ).
 
-export( [file_contents/1, main/1] ).
 
-record( acc, {failed={"", 0, 0}, files=[], ok=0, total=0} ).
 
file_contents( Name ) ->
{ok, Binary} = file:read_file( Name ),
[line_contents(X) || X <- binary:split(Binary, <<"\r\n">>, [global]), X =/= <<>>].
 
main( Files ) ->
Acc = lists:foldl( fun file/2, #acc{}, Files ),
{Failed_date, Failed, _Continuation} = Acc#acc.failed,
io:fwrite( "~nFile(s)=~p~nTotal=~.2f~nReadings=~p~nAverage=~.2f~n~nMaximum run(s) of ~p consecutive false readings ends at line starting with date(s): ~p~n",
[lists:reverse(Acc#acc.files), Acc#acc.total, Acc#acc.ok, Acc#acc.total / Acc#acc.ok, Failed, Failed_date] ).
 
 
 
file( Name, #acc{files=Files}=Acc ) ->
try
Line_contents = file_contents( Name ),
lists:foldl( fun file_content_line/2, Acc#acc{files=[Name | Files]}, Line_contents )
 
catch
_:Error ->
io:fwrite( "Error: Failed to read ~s: ~p~n", [Name, Error] ),
Acc
end.
 
file_content_line( {Date, Value_flags}, #acc{failed=Failed, ok=Ok, total=Total}=Acc ) ->
New_failed = file_content_line_failed( Value_flags, Date, Failed ),
{Sum, Oks, Average} = file_content_line_oks_0( [X || {X, ok} <- Value_flags] ),
io:fwrite( "Line=~p\tRejected=~p\tAccepted=~p\tLine total=~.2f\tLine average=~.2f~n", [Date, erlang:length(Value_flags) - Oks, Oks, Sum, Average] ),
Acc#acc{failed=New_failed, ok=Ok + Oks, total=Total + Sum}.
 
file_content_line_failed( [], Date, {_Failed_date, Failed, Acc} ) when Acc > Failed ->
{Date, Acc, Acc};
file_content_line_failed( [], _Date, Failed ) ->
Failed;
file_content_line_failed( [{_V, error} | T], Date, {Failed_date, Failed, Acc} ) ->
file_content_line_failed( T, Date, {Failed_date, Failed, Acc + 1} );
file_content_line_failed( [_H | T], Date, {_Failed_date, Failed, Acc} ) when Acc > Failed ->
file_content_line_failed( T, Date, {Date, Acc, 0} );
file_content_line_failed( [_H | T], Date, {Failed_date, Failed, _Acc} ) ->
file_content_line_failed( T, Date, {Failed_date, Failed, 0} ).
 
file_content_line_flag( N ) when N > 0 -> ok;
file_content_line_flag( _N ) -> error.
 
file_content_line_oks_0( [] ) -> {0.0, 0, 0.0};
file_content_line_oks_0( Ok_value_flags ) ->
Sum = lists:sum( Ok_value_flags ),
Oks = erlang:length( Ok_value_flags ),
{Sum, Oks, Sum / Oks}.
 
file_content_line_value_flag( Binary, {[], Acc} ) ->
Flag = file_content_line_flag( erlang:list_to_integer(binary:bin_to_list(Binary)) ),
{[Flag], Acc};
file_content_line_value_flag( Binary, {[Flag], Acc} ) ->
Value = erlang:list_to_float( binary:bin_to_list(Binary) ),
{[], [{Value, Flag} | Acc]}.
 
line_contents( Line ) ->
[Date_binary | Rest] = binary:split( Line, <<"\t">>, [global] ),
{_Previous, Value_flags} = lists:foldr( fun file_content_line_value_flag/2, {[], []}, Rest ), % Preserve order
{binary:bin_to_list( Date_binary ), Value_flags}.
 
Output:
macbook-pro:rosettacode bengt$ escript text_processing.erl readings.txt
Line="1990-01-01"       Rejected=2      Accepted=22     Line total=590.00       Line average=26.82
Line="1990-01-02"       Rejected=0      Accepted=24     Line total=410.00       Line average=17.08
...
Line="2004-12-30"       Rejected=1      Accepted=23     Line total=65.30        Line average=2.84
Line="2004-12-31"       Rejected=1      Accepted=23     Line total=47.30        Line average=2.06

File(s)=["readings.txt"]
Total=1358393.40
Readings=129403
Average=10.50

Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): "1993-03-05"

Forth

Works with: GNU Forth
\ data munging
 
\ 1991-03-30[\t10.000\t[-]1]*24
 
\ 1. mean of valid (flag > 0) values per day and overall
\ 2. length of longest run of invalid values, and when it happened
 
fvariable day-sum
variable day-n
 
fvariable total-sum
variable total-n
 
10 constant date-size \ yyyy-mm-dd
create cur-date date-size allot
 
create bad-date date-size allot
variable bad-n
 
create worst-date date-size allot
variable worst-n
 
: split ( buf len char -- buf' l2 buf l1 ) \ where buf'[0] = char, l1 = len-l2
>r 2dup r> scan
2swap 2 pick - ;
 
: next-sample ( buf len -- buf' len' fvalue flag )
#tab split >float drop 1 /string
#tab split snumber? drop >r 1 /string r> ;
 
: ok? 0> ;
 
: add-sample ( value -- )
day-sum f@ f+ day-sum f!
1 day-n +! ;
 
: add-day
day-sum f@ total-sum f@ f+ total-sum f!
day-n @ total-n +! ;
 
: add-bad-run
bad-n @ 0= if
cur-date bad-date date-size move
then
1 bad-n +! ;
 
: check-worst-run
bad-n @ worst-n @ > if
bad-n @ worst-n !
bad-date worst-date date-size move
then
0 bad-n ! ;
 
: hour ( buf len -- buf' len' )
next-sample ok? if
add-sample
check-worst-run
else
fdrop
add-bad-run
then ;
 
: .mean ( sum count -- ) 0 d>f f/ f. ;
 
: day ( line len -- )
2dup + #tab swap c! 1+ \ append tab for parsing
#tab split cur-date swap move 1 /string \ skip date
0e day-sum f!
0 day-n !
24 0 do hour loop 2drop
cur-date date-size type ." mean = "
day-sum f@ day-n @ .mean cr
add-day ;
 
stdin value input
 
: main
s" input.txt" r/o open-file throw to input
0e total-sum f!
0 total-n !
0 worst-n !
begin pad 512 input read-line throw
while pad swap day
repeat
input close-file throw
worst-n @ if
." Longest interruption: " worst-n @ .
." hours starting " worst-date date-size type cr
then
." Total mean = "
total-sum f@ total-n @ .mean cr ;
 
main bye

Go

package main
 
import (
"bufio"
"fmt"
"io"
"os"
"strconv"
"strings"
)
 
var fn = "readings.txt"
 
func main() {
f, err := os.Open(fn)
if err != nil {
fmt.Println(err)
return
}
defer f.Close()
var (
badRun, maxRun int
badDate, maxDate string
fileSum float64
fileAccept int
)
for lr := bufio.NewReader(f); ; {
line, pref, err := lr.ReadLine()
if err == io.EOF {
break
}
if err != nil {
fmt.Println(err)
return
}
if pref {
fmt.Println("Unexpected long line.")
return
}
f := strings.Fields(string(line))
if len(f) != 49 {
fmt.Println("unexpected format,", len(f), "fields.")
return
}
var accept int
var sum float64
for i := 1; i < 49; i += 2 {
flag, err := strconv.Atoi(f[i+1])
if err != nil {
fmt.Println(err)
return
}
if flag > 0 { // value is good
if badRun > 0 { // terminate bad run
if badRun > maxRun {
maxRun = badRun
maxDate = badDate
}
badRun = 0
}
value, err := strconv.ParseFloat(f[i], 64)
if err != nil {
fmt.Println(err)
return
}
sum += value
accept++
} else { // value is bad
if badRun == 0 {
badDate = f[0]
}
badRun++
}
}
fmt.Printf("Line: %s Reject %2d Accept: %2d Line_tot:%9.3f",
f[0], 24-accept, accept, sum)
if accept > 0 {
fmt.Printf(" Line_avg:%8.3f\n", sum/float64(accept))
} else {
fmt.Println("")
}
fileSum += sum
fileAccept += accept
}
fmt.Println("\nFile =", fn)
fmt.Printf("Total = %.3f\n", fileSum)
fmt.Println("Readings = ", fileAccept)
if fileAccept > 0 {
fmt.Printf("Average =  %.3f\n", fileSum/float64(fileAccept))
}
if badRun > 0 && badRun > maxRun {
maxRun = badRun
maxDate = badDate
}
if maxRun == 0 {
fmt.Println("\nAll data valid.")
} else {
fmt.Printf("\nMax data gap = %d, beginning on line %s.\n",
maxRun, maxDate)
}
}
Output:
...
Line: 2004-12-28  Reject  1  Accept: 23  Line_tot:   77.800  Line_avg:   3.383
Line: 2004-12-29  Reject  1  Accept: 23  Line_tot:   56.300  Line_avg:   2.448
Line: 2004-12-30  Reject  1  Accept: 23  Line_tot:   65.300  Line_avg:   2.839
Line: 2004-12-31  Reject  1  Accept: 23  Line_tot:   47.300  Line_avg:   2.057

File     = readings.txt
Total    = 1358393.400
Readings =  129403
Average  = 10.497

Max data gap = 589, beginning on line 1993-02-09.

Haskell

import Data.List
import Numeric
import Control.Arrow
import Control.Monad
import Text.Printf
import System.Environment
import Data.Function
 
type Date = String
type Value = Double
type Flag = Bool
 
readFlg :: String -> Flag
readFlg = (> 0).read
 
readNum :: String -> Value
readNum = fst.head.readFloat
 
take2 = takeWhile(not.null).unfoldr (Just.splitAt 2)
 
parseData :: [String] -> (Date,[(Value,Flag)])
parseData = head &&& map(readNum.head &&& readFlg.last).take2.tail
 
sumAccs :: (Date,[(Value,Flag)]) -> (Date, ((Value,Int),[Flag]))
sumAccs = second (((sum &&& length).concat.uncurry(zipWith(\v f -> [v|f])) &&& snd).unzip)
 
maxNAseq :: [Flag] -> [(Int,Int)]
maxNAseq = head.groupBy((==) `on` fst).sortBy(flip compare)
. concat.uncurry(zipWith(\i (r,b)->[(r,i)|not b]))
. first(init.scanl(+)0). unzip
. map ((fst &&& id).(length &&& head)). group
 
main = do
file:_ <- getArgs
f <- readFile file
let dat :: [(Date,((Value,Int),[Flag]))]
dat = map (sumAccs. parseData. words).lines $ f
summ = ((sum *** sum). unzip *** maxNAseq.concat). unzip $ map snd dat
totalFmt = "\nSummary\t\t accept: %d\t total: %.3f \taverage: %6.3f\n\n"
lineFmt = "%8s\t accept: %2d\t total: %11.3f \taverage: %6.3f\n"
maxFmt = "Maximum of %d consecutive false readings, starting on line /%s/ and ending on line /%s/\n"
-- output statistics
putStrLn "\nSome lines:\n"
mapM_ (\(d,((v,n),_)) -> printf lineFmt d n v (v/fromIntegral n)) $ take 4 $ drop 2200 dat
(\(t,n) -> printf totalFmt n t (t/fromIntegral n)) $ fst summ
mapM_ ((\(l, d1,d2) -> printf maxFmt l d1 d2)
. (\(a,b)-> (a,(fst.(dat!!).(`div`24))b,(fst.(dat!!).(`div`24))(a+b)))) $ snd summ
Output:
*Main> :main ["./RC/readings.txt"]
Some lines:

1996-01-11       accept: 24      total:     437.000     average: 18.208
1996-01-12       accept: 24      total:     536.000     average: 22.333
1996-01-13       accept: 24      total:    1062.000     average: 44.250
1996-01-14       accept: 24      total:     787.000     average: 32.792

Summary          accept: 129403  total: 1358393.400     average: 10.497

Maximum of 589 consecutive false readings, starting on line /1993-02-09/ and ending on line /1993-03-05/

Icon and Unicon

record badrun(count,fromdate,todate)  # record to track bad runs
 
procedure main()
return mungetask1("readings1-input.txt","readings1-output.txt")
end
 
procedure mungetask1(fin,fout)
 
fin := open(fin) | stop("Unable to open input file ",fin)
fout := open(fout,"w") | stop("Unable to open output file ",fout)
 
F_tot := F_acc := F_rej := 0 # data set totals
rejmax := badrun(-1) # longest reject runs
rejcur := badrun(0) # current reject runs
 
while line := read(fin) do {
line ? {
ldate := tab(many(&digits ++ '-')) # date (poorly checked)
fields := tot := rej := 0 # record counters & totals
 
while tab(many(' \t')) do { # whitespace before every pair
value := real(tab(many(&digits++'-.'))) | stop("Bad value in ",ldate)
tab(many(' \t'))
flag := integer(tab(many(&digits++'-'))) | stop("Bad flag in ",ldate)
fields +:= 1
 
if flag > 0 then { # good data, ends a bad run
if rejcur.count > rejmax.count then rejmax := rejcur
rejcur := badrun(0)
tot +:= value
}
else { # bad (flagged) data
if rejcur.count = 0 then rejcur.fromdate := ldate
rejcur.todate := ldate
rejcur.count +:= 1
rej +:= 1
}
}
}
F_tot +:= tot
F_acc +:= acc := fields - rej
F_rej +:= rej
write(fout,"Line: ",ldate," Reject: ", rej," Accept: ", acc," Line_tot: ",tot," Line_avg: ", if acc > 0 then tot / acc else 0)
}
 
write(fout,"\nTotal = ",F_tot,"\nReadings = ",F_acc,"\nRejects = ",F_rej,"\nAverage = ",F_tot / F_acc)
if rejmax.count > 0 then
write(fout,"Maximum run of bad data was ",rejmax.count," readings from ",rejmax.fromdate," to ",rejmax.todate)
else
write(fout,"No bad runs of data")
end
Sample output:
...
Line: 2004-12-28 Reject: 1 Accept: 23 Line_tot: 77.80000000000001 Line_avg: 3.382608695652174
Line: 2004-12-29 Reject: 1 Accept: 23 Line_tot: 56.3 Line_avg: 2.447826086956522
Line: 2004-12-30 Reject: 1 Accept: 23 Line_tot: 65.3 Line_avg: 2.839130434782609
Line: 2004-12-31 Reject: 1 Accept: 23 Line_tot: 47.3 Line_avg: 2.056521739130435

Total    = 1358393.399999999
Readings = 129403
Rejects  = 1901
Average  = 10.49738723213526
Maximum run of bad data was 589 readings from 1993-02-09 to 1993-03-05

J

Solution:

  load 'files'
parseLine=: 10&({. ,&< (_99&".;._1)@:}.) NB. custom parser
summarize=: # , +/ , +/ % # NB. count,sum,mean
filter=: #~ 0&< NB. keep valid measurements
 
'Dates dat'=: |: parseLine;._2 CR -.~ fread jpath '~temp/readings.txt'
Vals=: (+: i.24){"1 dat
Flags=: (>: +: i.24){"1 dat
DailySummary=: Vals summarize@filter"1 Flags
RunLengths=: ([: #(;.1) 0 , }. *. }:) , 0 >: Flags
]MaxRun=: >./ RunLengths
589
]StartDates=: Dates {~ (>:@I.@e.&MaxRun (24 <.@%~ +/)@{. ]) RunLengths
1993-03-05

Formatting Output
Define report formatting verbs:

formatDailySumry=: dyad define
labels=. , ];.2 'Line: Accept: Line_tot: Line_avg: '
labels , x ,. 7j0 10j3 10j3 ": y
)
formatFileSumry=: dyad define
labels=. ];.2 'Total: Readings: Average: '
sumryvals=. (, %/) 1 0{ +/y
out=. labels ,. 12j3 12j0 12j3 ":&> sumryvals
'maxrun dates'=. x
out=. out,LF,'Maximum run(s) of ',(": maxrun),' consecutive false readings ends at line(s) starting with date(s): ',dates
)
Show output:
   (_4{.Dates) formatDailySumry _4{. DailySummary
Line: Accept: Line_tot: Line_avg:
2004-12-28 23 77.800 3.383
2004-12-29 23 56.300 2.448
2004-12-30 23 65.300 2.839
2004-12-31 23 47.300 2.057
 
(MaxRun;StartDates) formatFileSumry DailySummary
Total: 1358393.400
Readings: 129403
Average: 10.497
 
Maximum run(s) of 589 consecutive false readings ends at line(s) starting with date(s): 1993-03-05

JavaScript

Works with: JScript
var filename = 'readings.txt';
var show_lines = 5;
var file_stats = {
'num_readings': 0,
'total': 0,
'reject_run': 0,
'reject_run_max': 0,
'reject_run_date': ''
};
 
var fh = new ActiveXObject("Scripting.FileSystemObject").openTextFile(filename, 1); // 1 = for reading
while ( ! fh.atEndOfStream) {
var line = fh.ReadLine();
line_stats(line, (show_lines-- > 0));
}
fh.close();
 
WScript.echo(
"\nFile(s) = " + filename + "\n" +
"Total = " + dec3(file_stats.total) + "\n" +
"Readings = " + file_stats.num_readings + "\n" +
"Average = " + dec3(file_stats.total / file_stats.num_readings) + "\n\n" +
"Maximum run of " + file_stats.reject_run_max +
" consecutive false readings ends at " + file_stats.reject_run_date
);
 
function line_stats(line, print_line) {
var readings = 0;
var rejects = 0;
var total = 0;
var fields = line.split('\t');
var date = fields.shift();
 
while (fields.length > 0) {
var value = parseFloat(fields.shift());
var flag = parseInt(fields.shift(), 10);
readings++;
if (flag <= 0) {
rejects++;
file_stats.reject_run++;
}
else {
total += value;
if (file_stats.reject_run > file_stats.reject_run_max) {
file_stats.reject_run_max = file_stats.reject_run;
file_stats.reject_run_date = date;
}
file_stats.reject_run = 0;
}
}
 
file_stats.num_readings += readings - rejects;
file_stats.total += total;
 
if (print_line) {
WScript.echo(
"Line: " + date + "\t" +
"Reject: " + rejects + "\t" +
"Accept: " + (readings - rejects) + "\t" +
"Line_tot: " + dec3(total) + "\t" +
"Line_avg: " + ((readings == rejects) ? "0.0" : dec3(total / (readings - rejects)))
);
}
}
 
// round a number to 3 decimal places
function dec3(value) {
return Math.round(value * 1e3) / 1e3;
}
Output:
Line: 1990-01-01        Reject: 2       Accept: 22      Line_tot: 590   Line_avg: 26.818
Line: 1990-01-02        Reject: 0       Accept: 24      Line_tot: 410   Line_avg: 17.083
Line: 1990-01-03        Reject: 0       Accept: 24      Line_tot: 1415  Line_avg: 58.958
Line: 1990-01-04        Reject: 0       Accept: 24      Line_tot: 1800  Line_avg: 75
Line: 1990-01-05        Reject: 0       Accept: 24      Line_tot: 1130  Line_avg: 47.083

File(s)  = readings.txt
Total    = 1358393.4
Readings = 129403
Average  = 10.497

Maximum run of 589 consecutive false readings ends at 1993-03-05

Lua

filename = "readings.txt"
io.input( filename )
 
file_sum, file_cnt_data, file_lines = 0, 0, 0
max_rejected, n_rejected = 0, 0
max_rejected_date, rejected_date = "", ""
 
while true do
data = io.read("*line")
if data == nil then break end
 
date = string.match( data, "%d+%-%d+%-%d+" )
if date == nil then break end
 
val = {}
for w in string.gmatch( data, "%s%-*%d+[%.%d]*" ) do
val[#val+1] = tonumber(w)
end
 
sum, cnt = 0, 0
for i = 1, #val, 2 do
if val[i+1] > 0 then
sum = sum + val[i]
cnt = cnt + 1
n_rejected = 0
else
if n_rejected == 0 then
rejected_date = date
end
n_rejected = n_rejected + 1
if n_rejected > max_rejected then
max_rejected = n_rejected
max_rejected_date = rejected_date
end
end
end
 
file_sum = file_sum + sum
file_cnt_data = file_cnt_data + cnt
file_lines = file_lines + 1
 
print( string.format( "%s:\tRejected: %d\tAccepted: %d\tLine_total: %f\tLine_average: %f", date, #val/2-cnt, cnt, sum, sum/cnt ) )
end
 
print( string.format( "\nFile:\t  %s", filename ) )
print( string.format( "Total:\t  %f", file_sum ) )
print( string.format( "Readings: %d", file_lines ) )
print( string.format( "Average:  %f", file_sum/file_cnt_data ) )
print( string.format( "Maximum %d consecutive false readings starting at %s.", max_rejected, max_rejected_date ) )
Output:
File:	  readings.txt
Total:	  1358393.400000
Readings: 5471
Average:  10.497387
Maximum 589 consecutive false readings starting at 1993-02-09.

Mathematica

FileName = "Readings.txt"; data = Import[FileName,"TSV"];
 
Scan[(a=Position[#[[3;;All;;2]],1];
Print["Line:",#[[1]] ,"\tReject:", 24 - Length[a], "\t Accept:", Length[a], "\tLine_tot:",
Total@Part[#, Flatten[2*a]] , "\tLine_avg:", Total@Part[#, Flatten[2*a]]/Length[a]])&, data]
 
GlobalSum = Nb = Running = MaxRunRecorded = 0; MaxRunTime = {};
Scan[ For[i = 3, i < 50, i = i + 2,
If[#[[i]] == 1,
Running=0; GlobalSum += #[[i-1]]; Nb++;,
Running ++; If[MaxRunRecorded < Running, MaxRunRecorded = Running;MaxRunTime={ #[[1]]}; ];
]] &, data ]
 
Print["\nFile(s) : ",FileName,"\nTotal : ",AccountingForm@GlobalSum,"\nReadings : ",Nb,
"\nAverage : ",GlobalSum/Nb,"\n\nMaximum run(s) of ",MaxRunRecorded,
" consecutive false readings ends at line starting with date(s):",MaxRunTime]
Line:1990-01-01	Reject:2	 Accept:22	Line_tot:590.	Line_avg:26.8182
Line:1990-01-02	Reject:0	 Accept:24	Line_tot:410.	Line_avg:17.0833
Line:1990-01-03	Reject:0	 Accept:24	Line_tot:1415.	Line_avg:58.9583
Line:1990-01-04	Reject:0	 Accept:24	Line_tot:1800.	Line_avg:75.
Line:1990-01-05	Reject:0	 Accept:24	Line_tot:1130.	Line_avg:47.0833
....

File(s) : Readings.txt
Total : 1358393.
Readings : 129403
Average : 10.4974

Maximum run(s) of 589 consecutive false readings ends at line starting with date(s):{1993-03-05}

OCaml

let input_line ic =
try Some(input_line ic)
with End_of_file -> None
 
let fold_input f ini ic =
let rec fold ac =
match input_line ic with
| Some line -> fold (f ac line)
| None -> ac
in
fold ini
 
let ic = open_in "readings.txt"
 
let scan line =
Scanf.sscanf line "%s\
\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d\
\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d\
\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d\
\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d\t%f\t%d"

(fun date
v1 f1 v2 f2 v3 f3 v4 f4 v5 f5 v6 f6
v7 f7 v8 f8 v9 f9 v10 f10 v11 f11 v12 f12
v13 f13 v14 f14 v15 f15 v16 f16 v17 f17 v18 f18
v19 f19 v20 f20 v21 f21 v22 f22 v23 f23 v24 f24 ->
(date),
[ (v1, f1 ); (v2, f2 ); (v3, f3 ); (v4, f4 ); (v5, f5 ); (v6, f6 );
(v7, f7 ); (v8, f8 ); (v9, f9 ); (v10, f10); (v11, f11); (v12, f12);
(v13, f13); (v14, f14); (v15, f15); (v16, f16); (v17, f17); (v18, f18);
(v19, f19); (v20, f20); (v21, f21); (v22, f22); (v23, f23); (v24, f24); ])
 
let tot_file, num_file, _, nodata_max, nodata_maxline =
fold_input
(fun (tot_file, num_file, nodata, nodata_max, nodata_maxline) line ->
let date, datas = scan line in
let _datas = List.filter (fun (_, flag) -> flag > 0) datas in
let ok = List.length _datas in
let tot = List.fold_left (fun ac (value, _) -> ac +. value) 0.0 _datas in
let nodata, nodata_max, nodata_maxline =
List.fold_left
(fun (nodata, nodata_max, nodata_maxline) (_, flag) ->
if flag <= 0
then (succ nodata, nodata_max, nodata_maxline)
else
if nodata_max = nodata && nodata > 0
then (0, nodata_max, date::nodata_maxline)
else if nodata_max < nodata && nodata > 0
then (0, nodata, [date])
else (0, nodata_max, nodata_maxline)
)
(nodata, nodata_max, nodata_maxline) datas in
Printf.printf "Line: %s" date;
Printf.printf " Reject: %2d Accept: %2d" (24 - ok) ok;
Printf.printf "\tLine_tot: %8.3f" tot;
Printf.printf "\tLine_avg: %8.3f\n" (tot /. float ok);
(tot_file +. tot, num_file + ok, nodata, nodata_max, nodata_maxline))
(0.0, 0, 0, 0, [])
ic ;;
 
close_in ic ;;
 
Printf.printf "Total = %f\n" tot_file;
Printf.printf "Readings = %d\n" num_file;
Printf.printf "Average = %f\n" (tot_file /. float num_file);
Printf.printf "Maximum run(s) of %d consecutive false readings \
ends at line starting with date(s): %s\n"

nodata_max (String.concat ", " nodata_maxline);

Perl

An AWK-like solution

use strict;
use warnings;
 
my $nodata = 0; # Current run of consecutive flags<0 in lines of file
my $nodata_max = -1; # Max consecutive flags<0 in lines of file
my $nodata_maxline = "!"; # ... and line number(s) where it occurs
 
my $infiles = join ", ", @ARGV;
 
my $tot_file = 0;
my $num_file = 0;
 
while (<>) {
chomp;
my $tot_line = 0; # sum of line data
my $num_line = 0; # number of line data items with flag>0
my $rejects = 0;
 
# extract field info, skipping initial date field
my ($date, @fields) = split;
while (@fields and my ($datum, $flag) = splice @fields, 0, 2) {
if ($flag+1 < 2) {
$nodata++;
$rejects++;
next;
}
 
# check run of data-absent fields
if($nodata_max == $nodata and $nodata > 0){
$nodata_maxline = "$nodata_maxline, $date";
}
if($nodata_max < $nodata and $nodata > 0){
$nodata_max = $nodata;
$nodata_maxline = $date;
}
# re-initialise run of nodata counter
$nodata = 0;
# gather values for averaging
$tot_line += $datum;
$num_line++;
}
 
# totals for the file so far
$tot_file += $tot_line;
$num_file += $num_line;
 
printf "Line: %11s Reject: %2i Accept: %2i Line_tot: %10.3f Line_avg: %10.3f\n",
$date, $rejects, $num_line, $tot_line, ($num_line>0)? $tot_line/$num_line: 0;
 
}
 
printf "\n";
printf "File(s) = %s\n", $infiles;
printf "Total = %10.3f\n", $tot_file;
printf "Readings = %6i\n", $num_file;
printf "Average = %10.3f\n", $tot_file / $num_file;
 
printf "\nMaximum run(s) of %i consecutive false readings ends at line starting with date(s): %s\n",
$nodata_max, $nodata_maxline;
Sample output:
bash$ perl -f readings.pl readings.txt | tail
Line:  2004-12-29  Reject:  1  Accept: 23  Line_tot:     56.300  Line_avg:      2.448
Line:  2004-12-30  Reject:  1  Accept: 23  Line_tot:     65.300  Line_avg:      2.839
Line:  2004-12-31  Reject:  1  Accept: 23  Line_tot:     47.300  Line_avg:      2.057

File(s)  = readings.txt
Total    = 1358393.400
Readings = 129403
Average  =     10.497

Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): 1993-03-05
bash$

An object-oriented solution

use strict;
use warnings;
 
use constant RESULT_TEMPLATE => "%-19s = %12.3f / %-6u = %.3f\n";
 
my $parser = Parser->new;
 
# parse lines and print results
printf RESULT_TEMPLATE, $parser->parse(split)
while <>;
 
$parser->finish;
 
# print total and summary
printf "\n".RESULT_TEMPLATE."\n", $parser->result;
printf "the maximum of %u consecutive bad values was reached %u time(s)\n",
$parser->bad_max, scalar $parser->bad_ranges;
 
# print bad ranges
print for map { ' '.join(' - ', @$_)."\n" } $parser->bad_ranges;
 
BEGIN {
package main::Parser;
 
sub new {
my $obj = {
SUM => 0,
COUNT => 0,
CURRENT_DATE => undef,
BAD_DATE => undef,
BAD_RANGES => [],
BAD_MAX => 0,
BAD_COUNT => 0
};
 
return bless $obj;
}
 
sub _average {
my ($sum, $count) = @_;
return ($sum, $count, $count && $sum / $count);
}
 
sub _push_bad_range_if_necessary {
my ($parser) = @_;
my ($count, $max) = @$parser{qw(BAD_COUNT BAD_MAX)};
 
return if $count < $max;
 
if ($count > $max) {
$parser->{BAD_RANGES} = [];
$parser->{BAD_MAX} = $count;
}
 
push @{$parser->{BAD_RANGES}}, [ @$parser{qw(BAD_DATE CURRENT_DATE)} ];
}
 
sub _check {
my ($parser, $flag) = @_;
if ($flag <= 0) {
++$parser->{BAD_COUNT};
$parser->{BAD_DATE} = $parser->{CURRENT_DATE}
unless defined $parser->{BAD_DATE};
 
return 0;
}
else {
$parser->_push_bad_range_if_necessary;
$parser->{BAD_COUNT} = 0;
$parser->{BAD_DATE} = undef;
return 1;
}
}
 
sub bad_max {
my ($parser) = @_;
return $parser->{BAD_MAX}
}
 
sub bad_ranges {
my ($parser) = @_;
return @{$parser->{BAD_RANGES}}
}
 
sub parse {
my $parser = shift;
my $date = shift;
 
$parser->{CURRENT_DATE} = $date;
 
my $sum = 0;
my $count = 0;
 
while (my ($value, $flag) = splice @_, 0, 2) {
next unless $parser->_check($flag);
$sum += $value;
++$count;
}
 
$parser->{SUM} += $sum;
$parser->{COUNT} += $count;
 
return ("average($date)", _average($sum, $count));
}
 
sub result {
my ($parser) = @_;
return ('total-average', _average(@$parser{qw(SUM COUNT)}));
}
 
sub finish {
my ($parser) = @_;
$parser->_push_bad_range_if_necessary
}
}
Sample output:
$ perl readings.pl < readings.txt | tail
average(2004-12-27) =       57.100 / 23     = 2.483
average(2004-12-28) =       77.800 / 23     = 3.383
average(2004-12-29) =       56.300 / 23     = 2.448
average(2004-12-30) =       65.300 / 23     = 2.839
average(2004-12-31) =       47.300 / 23     = 2.057

total-average       =  1358393.400 / 129403 = 10.497

the maximum of 589 consecutive bad values was reached 1 time(s)
  1993-02-09 - 1993-03-05

$

Perl 6

my @gaps;
my $previous = 'valid';
 
for $*IN.lines -> $line {
my ($date, @readings) = split /\s+/, $line;
my @valid;
my $hour = 0;
for @readings -> $reading, $flag {
if $flag > 0 {
@valid.push($reading);
if $previous eq 'invalid' {
@gaps[*-1]{'end'} = "$date $hour:00";
$previous = 'valid';
}
}
else
{
if $previous eq 'valid' {
@gaps.push( {start => "$date $hour:00"} );
}
@gaps[*-1]{'count'}++;
$previous = 'invalid';
}
$hour++;
}
say "$date: { ( +@valid ?? ( ( [+] @valid ) / +@valid ).fmt("%.3f") !! 0 ).fmt("%8s") }",
" mean from { (+@valid).fmt("%2s") } valid.";
};
 
my $longest = @gaps.sort({-$^a<count>})[0];
 
say "Longest period of invalid readings was {$longest<count>} hours,\n",
"from {$longest<start>} till {$longest<end>}."
Output:
1990-01-01:   26.818 mean from 22 valid.
1990-01-02:   17.083 mean from 24 valid.
1990-01-03:   58.958 mean from 24 valid.
1990-01-04:   75.000 mean from 24 valid.
1990-01-05:   47.083 mean from 24 valid.
...
(many lines omitted)
...
2004-12-27:    2.483 mean from 23 valid.
2004-12-28:    3.383 mean from 23 valid.
2004-12-29:    2.448 mean from 23 valid.
2004-12-30:    2.839 mean from 23 valid.
2004-12-31:    2.057 mean from 23 valid.
Longest period of invalid readings was 589 hours,
from 1993-02-09 1:00 till 1993-03-05 14:00.

PicoLisp

Translation of: AWK

Put the following into an executable file "readings":

#!/usr/bin/picolisp /usr/lib/picolisp/lib.l
 
(let (NoData 0 NoDataMax -1 NoDataMaxline "!" TotFile 0 NumFile 0)
(let InFiles
(glue ","
(mapcar
'((File)
(in File
(while (split (line) "^I")
(let (Len (length @) Date (car @) TotLine 0 NumLine 0)
(for (L (cdr @) L (cddr L))
(if (> 1 (format (cadr L)))
(inc 'NoData)
(when (gt0 NoData)
(when (= NoDataMax NoData)
(setq NoDataMaxline (pack NoDataMaxline ", " Date)) )
(when (> NoData NoDataMax)
(setq NoDataMax NoData NoDataMaxline Date) ) )
(zero NoData)
(inc 'TotLine (format (car L) 3))
(inc 'NumLine) ) )
(inc 'TotFile TotLine)
(inc 'NumFile NumLine)
(tab (-7 -12 -7 3 -9 3 -11 11 -11 11)
"Line:" Date
"Reject:" (- (/ (dec Len) 2) NumLine)
" Accept:" NumLine
" Line_tot:" (format TotLine 3)
" Line_avg:"
(and (gt0 NumLine) (format (*/ TotLine @) 3)) ) ) ) )
File )
(argv) ) )
(prinl)
(prinl "File(s) = " InFiles)
(prinl "Total = " (format TotFile 3))
(prinl "Readings = " NumFile)
(prinl "Average = " (format (*/ TotFile NumFile) 3))
(prinl)
(prinl
"Maximum run(s) of " NoDataMax
" consecutive false readings ends at line starting with date(s): " NoDataMaxline ) ) )
 
(bye)

Then it can be called as

$ ./readings readings.txt |tail
Line:  2004-12-29  Reject:  1  Accept: 23  Line_tot:     56.300  Line_avg:      2.448
Line:  2004-12-30  Reject:  1  Accept: 23  Line_tot:     65.300  Line_avg:      2.839
Line:  2004-12-31  Reject:  1  Accept: 23  Line_tot:     47.300  Line_avg:      2.057

File(s)  = readings.txt
Total    = 1358393.400
Readings = 129403
Average  = 10.497

Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): 1993-03-05
$

[edit] PL/I

text1: procedure options (main); /* 13 May 2010 */
 
declare line character (2000) varying;
declare 1 pairs(24),
2 value fixed (10,4),
2 flag fixed;
declare date character (12) varying;
declare no_items fixed decimal (10);
declare (nv, sum, line_no, ndud_values, max_ndud_values) fixed;
declare (i, k) fixed binary;
declare in file input;
 
open file (in) title ('/TEXT1.DAT,TYPE(TEXT),RECSIZE(2000)' );
 
on endfile (in) go to finish_up;
 
line_no = 0;
loop:
do forever;
get file (in) edit (line) (L);
/* put skip list (line); */
line = translate(line, ' ', '09'x);
line_no = line_no + 1;
line = trim(line);
no_items = tally(line, ' ') - tally(line, '  ') + 1; /* count of blank-separated items */
if no_items ^= 49 then
do; put skip list ('There are not 49 items on this line'); iterate loop; end;
k = index(line, ' '); /* Find the first blank in the line. */
date = substr(line, 1, k);
line = substr(line, k) || ' ';
on conversion go to loop;
get string (line) list (pairs);
sum, nv, ndud_values, max_ndud_values = 0;
do i = 1 to 24;
if flag(i) > 0 then
do; sum = sum + value(i); nv = nv + 1;
ndud_values = 0; /* reset the counter of dud values */
end;
else
do; /* we have a dud reading. */
ndud_values = ndud_values + 1;
if ndud_values > max_ndud_values then
max_ndud_values = ndud_values;
end;
end;
if nv = 0 then iterate;
put skip list ('Line ' || trim(line_no) || ' average=', divide(sum, nv, 10,4) );
if max_ndud_values > 0 then
put skip list ('Maximum run of dud readings =', max_ndud_values);
end;
 
finish_up:
 
end text1;

[edit] PureBasic

#TASK="Text processing/1"
Define File$, InLine$, Part$, i, Out$, ErrEnds$, Errcnt, ErrMax
Define lsum.d, tsum.d, rejects, val.d, readings
 
File$=OpenFileRequester(#TASK,"readings.txt","",0)
If OpenConsole() And ReadFile(0,File$)
While Not Eof(0)
InLine$=ReadString(0)
For i=1 To 1+2*24
Part$=StringField(InLine$,i,#TAB$)
If i=1 ; Date
Out$=Part$: lsum=0: rejects=0
ElseIf i%2=0 ; Recorded value
val=ValD(Part$)
Else ; Status part
If Val(Part$)>0
Errcnt=0 : readings+1
lsum+val : tsum+val
Else
rejects+1: Errcnt+1
If Errcnt>ErrMax
ErrMax=Errcnt
ErrEnds$=Out$
EndIf
EndIf
EndIf
Next i
Out$+" Rejects: " + Str(rejects)
Out$+" Accepts: " + Str(24-rejects)
Out$+" Line_tot: "+ StrD(lsum,3)
If rejects<24
Out$+" Line_avg: "+StrD(lsum/(24-rejects),3)
Else
Out$+" Line_avg: N/A"
EndIf
PrintN("Line: "+Out$)
Wend
PrintN(#CRLF$+"File = "+GetFilePart(File$))
PrintN("Total = "+ StrD(tsum,3))
PrintN("Readings = "+ Str(readings))
PrintN("Average = "+ StrD(tsum/readings,3))
Print(#CRLF$+"Maximum of "+Str(ErrMax))
PrintN(" consecutive false readings, ends at "+ErrEnds$)
CloseFile(0)
;
Print("Press ENTER to exit"): Input()
EndIf
Sample output:
...
Line: 2004-12-27 Rejects: 1 Accepts: 23 Line_tot: 57.100 Line_avg: 2.483
Line: 2004-12-28 Rejects: 1 Accepts: 23 Line_tot: 77.800 Line_avg: 3.383
Line: 2004-12-29 Rejects: 1 Accepts: 23 Line_tot: 56.300 Line_avg: 2.448
Line: 2004-12-30 Rejects: 1 Accepts: 23 Line_tot: 65.300 Line_avg: 2.839
Line: 2004-12-31 Rejects: 1 Accepts: 23 Line_tot: 47.300 Line_avg: 2.057

File     = readings.txt
Total    = 1358393.400
Readings = 129403
Average  = 10.497

Maximum of 589 consecutive false readings, ends at 1993-03-05

[edit] Python

import fileinput
import sys
 
nodata = 0; # Current run of consecutive flags<0 in lines of file
nodata_max=-1; # Max consecutive flags<0 in lines of file
nodata_maxline=[]; # ... and line number(s) where it occurs
 
tot_file = 0 # Sum of file data
num_file = 0 # Number of file data items with flag>0
 
infiles = sys.argv[1:]
 
for line in fileinput.input():
    tot_line=0   # sum of line data
    num_line=0   # number of line data items with flag>0
 
    # extract field info
    field = line.split()
    date = field[0]
    data = [float(f) for f in field[1::2]]
    flags = [int(f) for f in field[2::2]]
 
    for datum, flag in zip(data, flags):
        if flag<1:
            nodata += 1
        else:
            # check run of data-absent fields
            if nodata_max==nodata and nodata>0:
                nodata_maxline.append(date)
            if nodata_max<nodata and nodata>0:
                nodata_max=nodata
                nodata_maxline=[date]
            # re-initialise run of nodata counter
            nodata=0
            # gather values for averaging
            tot_line += datum
            num_line += 1
 
    # totals for the file so far
    tot_file += tot_line
    num_file += num_line
 
    print "Line: %11s  Reject: %2i  Accept: %2i  Line_tot: %10.3f  Line_avg: %10.3f" % (
        date,
        len(data) - num_line,
        num_line, tot_line,
        tot_line/num_line if (num_line>0) else 0)
 
print ""
print "File(s) = %s" % (", ".join(infiles),)
print "Total = %10.3f" % (tot_file,)
print "Readings = %6i" % (num_file,)
print "Average = %10.3f" % (tot_file / num_file,)
 
print "\nMaximum run(s) of %i consecutive false readings ends at line starting with date(s): %s" % (
nodata_max, ", ".join(nodata_maxline))
Sample output:
bash$ /cygdrive/c/Python26/python readings.py readings.txt|tail
Line:  2004-12-29  Reject:  1  Accept: 23  Line_tot:     56.300  Line_avg:      2.448
Line:  2004-12-30  Reject:  1  Accept: 23  Line_tot:     65.300  Line_avg:      2.839
Line:  2004-12-31  Reject:  1  Accept: 23  Line_tot:     47.300  Line_avg:      2.057

File(s)  = readings.txt
Total    = 1358393.400
Readings = 129403
Average  =     10.497

Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): 1993-03-05
bash$

[edit] R

#Read in data from file
dfr <- read.delim("readings.txt", header=FALSE)  # the data file has no header row
#Calculate daily means
flags <- as.matrix(dfr[,seq(3,49,2)])>0
vals <- as.matrix(dfr[,seq(2,49,2)])
daily.means <- rowSums(ifelse(flags, vals, 0))/rowSums(flags)
#Calculate time between good measurements
times <- strptime(dfr[1,1], "%Y-%m-%d", tz="GMT") + 3600*seq(1,24*nrow(dfr),1)
hours.between.good.measurements <- diff(times[t(flags)])/3600

[edit] Racket

#lang racket
;; Use SRFI 48 to make %n.nf formats convenient.
(require (prefix-in srfi/48: srfi/48)) ; SRFI 48: Intermediate Format Strings
 
;; Parameter allows us to use exact decimal strings
(read-decimal-as-inexact #f)
 
;; files-to-read is a sequence, so it could be either a list or a vector of files
(define (text-processing/1 files-to-read)
 
(define (print-line-info d r a t)
(srfi/48:format #t "Line: ~11F Reject: ~2F Accept: ~2F Line_tot: ~10,3F Line_avg: ~10,3F~%"
d r a t (if (zero? a) +nan.0 (/ t a))))
 
 ;; returns something that can be used as args to an apply
(define (handle-and-tag-max consecutive-false tag max-consecutive-false max-false-tags)
(let ((consecutive-false+1 (add1 consecutive-false)))
(list consecutive-false+1
(max max-consecutive-false consecutive-false+1)
(cond ((= consecutive-false+1 max-consecutive-false) (cons tag max-false-tags))
((= consecutive-false max-consecutive-false) (list tag))
(else max-false-tags)))))
 
(define (sub-t-p/1 N sum consecutive-false max-consecutive-false max-false-tags)
(for/fold ((N N) (sum sum) (consecutive-false consecutive-false) (max-consecutive-false max-consecutive-false) (max-false-tags max-false-tags))
((l (in-lines)))
(match l
[(app string-split `(,tag ,(app string->number vs.ss) ...))
(let get-line-pairs
((vs.ss vs.ss) (line-N 0) (reject 0) (line-sum 0) (consecutive-false consecutive-false)
(max-consecutive-false max-consecutive-false) (max-false-tags max-false-tags))
(match vs.ss
['()
(print-line-info tag reject line-N line-sum)
(values (+ N line-N) (+ sum line-sum) consecutive-false max-consecutive-false max-false-tags)]
[(list-rest v (? positive?) tl)
(get-line-pairs tl (add1 line-N) reject (+ line-sum v) 0 max-consecutive-false max-false-tags)]
[(list-rest _ _ tl)
(apply get-line-pairs tl line-N (add1 reject) line-sum
(handle-and-tag-max consecutive-false tag max-consecutive-false max-false-tags))]))]
(x (fprintf (current-error-port) "mismatch ~s~%" x)
(values N sum consecutive-false max-consecutive-false max-false-tags)))))
 
(for/fold ((N 0) (sum 0) (consecutive-false 0) (max-consecutive-false 0) (max-false-tags null))
((f files-to-read))
(with-input-from-file f
(lambda () (sub-t-p/1 N sum consecutive-false max-consecutive-false max-false-tags)))))
 
(let ((files (vector->list (current-command-line-arguments))))
(let-values (([N sum consecutive-false max-consecutive-false max-false-tags] (text-processing/1 files)))
(srfi/48:format #t "~%File(s) = ~a~%Total = ~10,3F~%Readings = ~6F~%" (string-join files) sum N)
(unless (zero? N) (srfi/48:format #t "Average = ~10,3F~%" (/ sum N)))
(srfi/48:format #t "~%Maximum run(s) of ~a consecutive false readings ends at line starting with date(s): ~a~%"
max-consecutive-false (string-join max-false-tags))))
Sample run:
$ racket 1.rkt readings/readings.txt | tail
Line:  2004-12-29  Reject:  1  Accept: 23  Line_tot:     56.300  Line_avg:      2.448
Line:  2004-12-30  Reject:  1  Accept: 23  Line_tot:     65.300  Line_avg:      2.839
Line:  2004-12-31  Reject:  1  Accept: 23  Line_tot:     47.300  Line_avg:      2.057

File(s)  = readings/readings.txt
Total    = 1358393.400
Readings = 129403
Average  =     10.497

Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): 1993-03-05

[edit] REXX

/*REXX program to process  instrument data  from a  data file.          */
numeric digits 20 /*allow for bigger numbers. */
ifid='READINGS.TXT' /*the input file. */
ofid='READINGS.OUT' /*the output file. */
grandSum=0 /*grand sum of whole file. */
grandflg=0 /*grand num of flagged data. */
grandOKs=0
longFlag=0 /*longest period of flagged data.*/
contFlag=0 /*current run of continuous flagged data.*/
w=16 /*width of fields when displayed.*/
 
do recs=1 while lines(ifid)\==0 /*read until finished. */
rec=linein(ifid) /*read the next record (line). */
parse var rec datestamp Idata /*pick off the dateStamp & data. */
sum=0
flg=0
OKs=0
 
do j=1 until Idata='' /*process the instrument data. */
parse var Idata data.j flag.j Idata
 
if flag.j>0 then do /*if good data, ... */
OKs=OKs+1
sum=sum+data.j
if contFlag>longFlag then do
longdate=datestamp
longFlag=contFlag
end
contFlag=0
end
else do /*flagged data ... */
flg=flg+1
contFlag=contFlag+1
end
end /*j*/
 
if OKs\==0 then avg=format(sum/OKs,,3)
else avg='[n/a]'
grandOKs=grandOKs+OKs
_=right(comma(avg),w)
grandSum=grandSum+sum
grandFlg=grandFlg+flg
if flg==0 then call sy datestamp ' average='_
else call sy datestamp ' average='_ ' flagged='right(flg,2)
end /*recs*/
 
recs=recs-1 /*adjust for reading end-of-file.*/
if grandOKs\==0 then Gavg=format(grandsum/grandOKs,,3)
else Gavg='[n/a]'
call sy
call sy copies('═',60)
call sy ' records read:' right(comma(recs),w)
call sy ' grand sum:' right(comma(grandSum),w+4)
call sy ' grand average:' right(comma(Gavg),w+4)
call sy ' grand OK data:' right(comma(grandOKs),w)
call sy ' grand flagged:' right(comma(grandFlg),w)
if longFlag\==0 then
call sy ' longest flagged:' right(comma(longFlag),w) " ending at " longdate
call sy copies('═',60)
exit /*stick a fork in it, we're done.*/
/*──────────────────────────────────SY subroutine───────────────────────*/
sy: procedure; parse arg stuff; say stuff
if 1==0 then call lineout ofid,stuff
return
/*──────────────────────────────────COMMA subroutine────────────────────*/
comma: procedure; parse arg _,c,p,t;arg ,cu;c=word(c ",",1)
if cu=='BLANK' then c=' ';o=word(p 3,1);p=abs(o);t=word(t 999999999,1)
if \datatype(p,'W')|\datatype(t,'W')|p==0|arg()>4 then return _;n=_'.9'
#=123456789;k=0;if o<0 then do;b=verify(_,' ');if b==0 then return _
e=length(_)-verify(reverse(_),' ')+1;end;else do;b=verify(n,#,"M")
e=verify(n,#'0',,verify(n,#"0.",'M'))-p-1;end
do j=e to b by -p while k<t;_=insert(c,_,j);k=k+1;end;return _
Output:
   ∙
   ∙
   ∙
1991-10-16  average=           4.167   flagged= 6
1991-10-17  average=          10.867   flagged= 9
1991-10-18  average=           3.083
   ∙
   ∙
   ∙
============================================================
      records read:            5,471
     grand     sum:        1,358,393.400
     grand average:               10.497
     grand OK data:          129,403
     grand flagged:            1,901
   longest flagged:              589  ending at  1993-03-05
============================================================

[edit] Ruby

filename = "readings.txt"
total = { "num_readings" => 0, "num_good_readings" => 0, "sum_readings" => 0.0 }
invalid_count = 0
max_invalid_count = 0
invalid_run_end = ""
 
File.new(filename).each do |line|
num_readings = 0
num_good_readings = 0
sum_readings = 0.0
 
fields = line.split
fields[1..-1].each_slice(2) do |reading, flag|
num_readings += 1
if Integer(flag) > 0
num_good_readings += 1
sum_readings += Float(reading)
invalid_count = 0
else
invalid_count += 1
if invalid_count > max_invalid_count
max_invalid_count = invalid_count
invalid_run_end = fields[0]
end
end
end
 
printf "Line: %11s Reject: %2d Accept: %2d Line_tot: %10.3f Line_avg: %10.3f\n",
fields[0], num_readings - num_good_readings, num_good_readings, sum_readings,
num_good_readings > 0 ? sum_readings/num_good_readings : 0.0
 
total["num_readings"] += num_readings
total["num_good_readings"] += num_good_readings
total["sum_readings"] += sum_readings
end
 
puts ""
puts "File(s) = #{filename}"
printf "Total = %.3f\n", total['sum_readings']
puts "Readings = #{total['num_good_readings']}"
printf "Average = %.3f\n", total['sum_readings']/total['num_good_readings']
puts ""
puts "Maximum run(s) of #{max_invalid_count} consecutive false readings ends at #{invalid_run_end}"

[edit] Scala

Works with: Scala version 2.8

A fully functional solution, except for its use of iterators:

object DataMunging {
import scala.io.Source
 
def spans[A](list: List[A]) = list.tail.foldLeft(List((list.head, 1))) {
case ((a, n) :: tail, b) if a == b => (a, n + 1) :: tail
case (l, b) => (b, 1) :: l
}
 
type Flag = ((Boolean, Int), String)
type Flags = List[Flag]
type LineIterator = Iterator[Option[(Double, Int, Flags)]]
 
val pattern = """^(\d+-\d+-\d+)""" + """\s+(\d+\.\d+)\s+(-?\d+)""" * 24 + "$" r;
 
def linesIterator(file: java.io.File) = Source.fromFile(file).getLines().map(
pattern findFirstMatchIn _ map (
_.subgroups match {
case List(date, rawData @ _*) =>
val dataset = (rawData map (_ toDouble) iterator) grouped 2 toList;
val valid = dataset filter (_.last > 0) map (_.head)
val validSize = valid length;
val validSum = valid sum;
val flags = spans(dataset map (_.last > 0)) map ((_, date))
println("Line: %11s Reject: %2d Accept: %2d Line_tot: %10.3f Line_avg: %10.3f" format
(date, 24 - validSize, validSize, validSum, validSum / validSize))
(validSum, validSize, flags)
}
)
)
 
def totalizeLines(fileIterator: LineIterator) =
fileIterator.foldLeft(0.0, 0, List[Flag]()) {
case ((totalSum, totalSize, ((flag, size), date) :: tail), Some((validSum, validSize, flags))) =>
val ((firstFlag, firstSize), _) = flags.last
if (firstFlag == flag) {
(totalSum + validSum, totalSize + validSize, flags.init ::: ((flag, size + firstSize), date) :: tail)
} else {
(totalSum + validSum, totalSize + validSize, flags ::: ((flag, size), date) :: tail)
}
case ((_, _, Nil), Some(partials)) => partials
case (totals, None) => totals
}
 
def main(args: Array[String]) {
val files = args map (new java.io.File(_)) filter (file => file.isFile && file.canRead)
val lines = files.iterator flatMap linesIterator
val (totalSum, totalSize, flags) = totalizeLines(lines)
val ((_, invalidCount), startDate) = flags.filter(!_._1._1).max
val report = """|
|File(s) = %s
|Total = %10.3f
|Readings = %6d
|Average = %10.3f
|
|Maximum run(s) of %d consecutive false readings began at %s"
"".stripMargin
println(report format (files mkString " ", totalSum, totalSize, totalSum / totalSize, invalidCount, startDate))
}
}

A quick-and-dirty solution:

object AltDataMunging {
import scala.io.Source
import DataMunging.spans  // reuse the spans helper defined in the entry above
def main(args: Array[String]) {
var totalSum = 0.0
var totalSize = 0
var maxInvalidDate = ""
var maxInvalidCount = 0
var invalidDate = ""
var invalidCount = 0
val files = args map (new java.io.File(_)) filter (file => file.isFile && file.canRead)
 
files.iterator flatMap (file => Source fromFile file getLines ()) map (_.trim split "\\s+") foreach {
case Array(date, rawData @ _*) =>
val dataset = (rawData map (_ toDouble) iterator) grouped 2 toList;
val valid = dataset filter (_.last > 0) map (_.head)
val flags = spans(dataset map (_.last > 0)) map ((_, date))
println("Line: %11s Reject: %2d Accept: %2d Line_tot: %10.3f Line_avg: %10.3f" format
(date, 24 - valid.size, valid.size, valid.sum, valid.sum / valid.size))
totalSum += valid.sum
totalSize += valid.size
dataset foreach {
case _ :: flag :: Nil if flag > 0 =>
if (invalidCount > maxInvalidCount) {
maxInvalidDate = invalidDate
maxInvalidCount = invalidCount
}
invalidCount = 0
case _ =>
if (invalidCount == 0) invalidDate = date
invalidCount += 1
}
}
 
val report = """|
|File(s) = %s
|Total = %10.3f
|Readings = %6d
|Average = %10.3f
|
|Maximum run(s) of %d consecutive false readings began at %s"
"".stripMargin
println(report format (files mkString " ", totalSum, totalSize, totalSum / totalSize, maxInvalidCount, maxInvalidDate))
}
}

Last few lines of the sample output (either version):

Line:  2004-12-29  Reject:  1  Accept: 23  Line_tot:     56.300  Line_avg:      2.448
Line:  2004-12-30  Reject:  1  Accept: 23  Line_tot:     65.300  Line_avg:      2.839
Line:  2004-12-31  Reject:  1  Accept: 23  Line_tot:     47.300  Line_avg:      2.057

File(s)  = readings.txt
Total    = 1358393.400
Readings = 129403
Average  =     10.497

Maximum run(s) of 589 consecutive false readings began at 1993-02-09

Though it is easier to report where a run of consecutive false readings ends, if the longest run is the last thing in the file it has not really "ended".
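
As an illustration of that edge case, here is a minimal Python sketch (the helper name longest_bad_run is hypothetical and not part of either Scala entry) in which one extra comparison after the loop also counts a run that is still open at end-of-file:

def longest_bad_run(rows):
    """rows: iterable of (date, [(value, flag), ...]) records."""
    max_run, max_date = 0, None      # longest run seen so far and where it started
    run, run_date = 0, None          # run currently being accumulated
    for date, pairs in rows:
        for value, flag in pairs:
            if flag <= 0:
                if run == 0:
                    run_date = date  # remember the line on which the run started
                run += 1
            else:
                if run > max_run:
                    max_run, max_date = run, run_date
                run = 0
    if run > max_run:                # flush a run still open at end-of-file
        max_run, max_date = run, run_date
    return max_run, max_date

The final if is the only difference from the in-loop bookkeeping used by the entries on this page.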

[edit] Tcl

set max_invalid_run 0
set max_invalid_run_end ""
set tot_file 0
set num_file 0
 
set linefmt "Line: %11s Reject: %2d Accept: %2d Line_tot: %10.3f Line_avg: %10.3f"
 
set filename readings.txt
set fh [open $filename]
while {[gets $fh line] != -1} {
set tot_line [set count [set num_line 0]]
set fields [regexp -all -inline {\S+} $line]
set date [lindex $fields 0]
foreach {val flag} [lrange $fields 1 end] {
incr count
if {$flag > 0} {
incr num_line
incr num_file
set tot_line [expr {$tot_line + $val}]
set invalid_run_count 0
} else {
incr invalid_run_count
if {$invalid_run_count > $max_invalid_run} {
set max_invalid_run $invalid_run_count
set max_invalid_run_end $date
}
}
}
set tot_file [expr {$tot_file + $tot_line}]
puts [format $linefmt $date [expr {$count - $num_line}] $num_line $tot_line \
[expr {$num_line > 0 ? $tot_line / $num_line : 0}]]
}
close $fh
 
puts ""
puts "File(s) = $filename"
puts "Total = [format %.3f $tot_file]"
puts "Readings = $num_file"
puts "Average = [format %.3f [expr {$tot_file / $num_file}]]"
puts ""
puts "Maximum run(s) of $max_invalid_run consecutive false readings ends at $max_invalid_run_end"

[edit] Ursala

The input file is parsed into a list of assignments, each mapping a date string to a list of pairs of floats and booleans (type %ebXLm). The same function is used to compute both the daily and the cumulative statistics.
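
For readers unfamiliar with Ursala's type notation, a rough Python equivalent of that parsed structure might look as follows (the function name parse_readings is hypothetical and not part of the Ursala entry):

def parse_readings(path):
    # Each line becomes one assignment: a date string paired with a list of
    # (float value, boolean "flag is positive") pairs, much like the %ebXLm type.
    parsed = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            date, rest = fields[0], fields[1:]
            pairs = [(float(v), int(flag) > 0)
                     for v, flag in zip(rest[0::2], rest[1::2])]
            parsed.append((date, pairs))
    return parsed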

#import std
#import nat
#import flo
 
parsed_data = ^|A(~&,* ^|/%ep@iNC ~&h==`1)*htK27K28pPCS (sep 9%cOi&)*FyS readings_dot_txt
 
daily_stats =
 
* ^|A(~&,@rFlS ^/length ^/plus:-0. ||0.! ~&i&& mean); mat` + <.
~&n,
'accept: '--+ @ml printf/'%7.0f'+ float,
'total: '--+ @mrl printf/'%10.1f',
'average: '--+ @mrr printf/'%7.3f'>
 
long_run =
 
-+
~&i&& ^|TNC('maximum of '--@h+ %nP,' consecutive false readings ending on line '--),
@nmrSPDSL -&~&,leql$^; ^/length ~&zn&-@hrZPF+ rlc both ~&rZ+-
 
main = ^T(daily_stats^lrNCT/~& @mSL 'summary ':,long_run) parsed_data

Last few lines of output:

2004-12-29 accept:      23 total:       56.3 average:   2.448
2004-12-30 accept:      23 total:       65.3 average:   2.839
2004-12-31 accept:      23 total:       47.3 average:   2.057
summary    accept:  129403 total:  1358393.4 average:  10.497
maximum of 589 consecutive false readings ending on line 1993-03-05

[edit] Vedit macro language

Translation of: AWK

Vedit does not have a floating-point data type, so fixed-point calculations are used here: values are kept as integer thousandths (three decimal digits).
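
The trick, shown here as an illustrative Python sketch rather than Vedit code, is to keep every value as an integer count of thousandths and to round a quotient to the nearest thousandth by adding half the divisor before dividing; that is what the (#20+#21/2)/#21 and (#10+#11/2)/#11 expressions in the macro below do:

def to_fixed(s):
    # "10.000" -> 10000, i.e. the value in integer thousandths
    whole, _, frac = s.partition('.')
    return int(whole) * 1000 + int((frac + '000')[:3])

def fixed_avg(total, count):
    # integer average in thousandths, rounded to nearest (half the divisor added first)
    return (total + count // 2) // count

readings = ["10.000", "13.500", "16.250"]
total = sum(to_fixed(r) for r in readings)
avg = fixed_avg(total, len(readings))
print("%d.%03d" % divmod(avg, 1000))   # prints 13.250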

#50 = Buf_Num		// Current edit buffer (source data)
File_Open("output.txt")
#51 = Buf_Num // Edit buffer for output file
Buf_Switch(#50)
#10 = 0 // total sum of file data
#11 = 0 // number of valid data items in file
#12 = 0 // Current run of consecutive flags<0 in lines of file
#13 = -1 // Max consecutive flags<0 in lines of file
Reg_Empty(15) // ... and date tag(s) at line(s) where it occurs
 
While(!At_EOF) {
#20 = 0 // sum of line data
#21 = 0 // number of line data items with flag>0
#22 = 0 // number of line data items with flag<0
Reg_Copy_Block(14, Cur_Pos, Cur_Pos+10) // date field
 
// extract field info, skipping initial date field
Repeat(ALL) {
Search("|{|T,|N}", ADVANCE+ERRBREAK) // next Tab or Newline
if (Match_Item==2) { Break } // end of line
#30 = Num_Eval(ADVANCE) * 1000 // #30 = value
Char // fixed point, 3 decimal digits
#30 += Num_Eval(ADVANCE+SUPPRESS)
#31 = Num_Eval(ADVANCE) // #31 = flag
if (#31 < 1) { // not valid field?
#12++
#22++
} else { // valid field
// check run of data-absent fields
if(#13 == #12 && #12 > 0) {
Reg_Set(15, ", ", APPEND)
Reg_Set(15, @14, APPEND)
}
if(#13 < #12 && #12 > 0) {
#13 = #12
Reg_Set(15, @14)
}
 
// re-initialise run of nodata counter
#12 = 0
// gather values for averaging
#20 += #30
#21++
}
}
 
// totals for the file so far
#10 += #20
#11 += #21
 
Buf_Switch(#51) // buffer for output data
IT("Line: ") Reg_Ins(14)
IT(" Reject:") Num_Ins(#22, COUNT, 3)
IT(" Accept:") Num_Ins(#21, COUNT, 3)
IT(" Line tot:") Num_Ins(#20, COUNT, 8) Char(-3) IC('.') EOL
IT(" Line avg:") Num_Ins((#20+#21/2)/#21, COUNT, 7) Char(-3) IC('.') EOL IN
Buf_Switch(#50) // buffer for input data
}
 
Buf_Switch(#51) // buffer for output data
IN
IT("Total: ") Num_Ins(#10, FORCE+NOCR) Char(-3) IC('.') EOL IN
IT("Readings: ") Num_Ins(#11, FORCE)
IT("Average: ") Num_Ins((#10+#11/2)/#11, FORCE+NOCR) Char(-3) IC('.') EOL IN
IN
IT("Maximum run(s) of ") Num_Ins(#13, LEFT+NOCR)
IT(" consecutive false readings ends at line starting with date(s): ") Reg_Ins(15)
IN
Sample output:
Line: 2004-12-28  Reject:  1  Accept: 23  Line tot:   77.800  Line avg:   3.383
Line: 2004-12-29  Reject:  1  Accept: 23  Line tot:   56.300  Line avg:   2.448
Line: 2004-12-30  Reject:  1  Accept: 23  Line tot:   65.300  Line avg:   2.839
Line: 2004-12-31  Reject:  1  Accept: 23  Line tot:   47.300  Line avg:   2.057

Total:   1358393.400
Readings:     129403
Average:      10.497

Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): 1993-03-05