Text processing/1: Difference between revisions

Revision as of 11:39, 14 February 2024

62,786 bytes added , 3 months ago

→‎{{header|Wren}}: Minor tidy

PureFox

Revision as of 23:52, 3 January 2015 rosettacode>Def (Nimrod -> Nim)
Latest revision as of 11:39, 14 February 2024 PureFox m (→‎{{header|Wren}}: Minor tidy)
(47 intermediate revisions by 21 users not shown)
Line 2:
Often data is produced by one program, in the wrong format for later use by another program or person. In these situations another program can be written to parse and transform the original data into a format useful to the other. The term "Data Munging" is [http://www.google.co.uk/search?q=%22data+munging%22 often] used in programming circles for this task.

A [http://groups.google.co.uk/group/comp.lang.awk/msg/0ecba3a3fbf247d8?hl=en request] on the comp.lang.awk newsgroup ~~lead~~led to a typical data munging task:
<pre>I have to analyse data files that have the following format:
Each row corresponds to 1 day and the field logic is: $1 is the date,
Line 20:
The data is [http://www.eea.europa.eu/help/eea-help-centre/faqs/how-do-i-obtain-eea-reports free to download and use] and is of this format:

Data is no longer available at that link. Zipped mirror available [https://github.com/thundergnat/rc/blob/master/resouces/readings.zip here] (offsite mirror).
<pre style="overflow:scroll">
1991-03-30 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1 10.000 1
Line 32 ⟶ 34:
Structure your program to show statistics for each line of the file, (similar to the original Python, Perl, and AWK examples below), followed by summary statistics for the file. When showing example output just show a few line statistics and the full end summary.
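The requirements above (per-line statistics, whole-file totals, and the longest run of consecutive rejected readings) can be sketched in a few lines of Python. This is only an illustrative outline, not one of the page's solutions: the function name <code>munge</code>, the result keys, and the two-line <code>sample</code> are made up for the example, and it assumes that a reading counts only when its flag is 1 or more, as the task's data format implies.

```python
def munge(lines):
    """Return per-line and whole-file statistics for readings data.

    Each line holds a date followed by (value, flag) pairs; a reading
    counts only when its flag is >= 1 (assumption taken from the task text).
    """
    total, count = 0.0, 0        # sum and number of accepted readings
    run, max_run = 0, 0          # current and longest run of rejected readings
    max_run_dates = []           # date(s) of the line on which a longest run ended
    per_line = []                # (date, rejected, accepted, line_total, line_avg)
    date = None

    for line in lines:
        fields = line.split()    # tabs or spaces both work
        if not fields:
            continue             # ignore blank lines
        date = fields[0]
        values = [float(v) for v in fields[1::2]]
        flags = [int(f) for f in fields[2::2]]
        line_total, accepted = 0.0, 0
        for value, flag in zip(values, flags):
            if flag < 1:
                run += 1         # a rejected reading extends the current run
                continue
            # A good reading closes any run in progress.
            if run > max_run:
                max_run, max_run_dates = run, [date]
            elif run and run == max_run:
                max_run_dates.append(date)
            run = 0
            line_total += value
            accepted += 1
        line_avg = line_total / accepted if accepted else 0.0
        per_line.append((date, len(flags) - accepted, accepted, line_total, line_avg))
        total += line_total
        count += accepted

    # A run still open at end of file is closed against the last date seen.
    if run > max_run:
        max_run, max_run_dates = run, [date]
    elif run and run == max_run:
        max_run_dates.append(date)

    return {
        "per_line": per_line,
        "total": total,
        "count": count,
        "average": total / count if count else 0.0,
        "max_run": max_run,
        "max_run_dates": max_run_dates,
    }


# Made-up sample in the task's layout (not taken from readings.txt).
sample = [
    "2000-01-01\t1.0\t1\t2.0\t-2\t3.0\t0\t4.0\t1",
    "2000-01-02\t5.0\t0\t6.0\t1",
]
summary = munge(sample)
for row in summary["per_line"]:
    print("Line: %s  Reject: %d  Accept: %d  Line_tot: %.3f  Line_avg: %.3f" % row)
print("Total = %.3f  Readings = %d  Average = %.3f"
      % (summary["total"], summary["count"], summary["average"]))
print("Maximum run of %d consecutive false readings ends at: %s"
      % (summary["max_run"], ", ".join(summary["max_run_dates"])))
```

The run counter deliberately carries over from one line to the next, since a gap in the readings may span several days; only a good reading (or the end of the input) closes it.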
=={{header\|11l}}== {{trans\|Python}} <syntaxhighlight lang="11l">V nodata = 0 V nodata_max = -1 [String] nodata_maxline V tot_file = 0.0 V num_file = 0 :start: L(line) File(:argv[1]).read().rtrim("\n").split("\n") V tot_line = 0.0 V num_line = 0 V field = line.split("\t") V date = field[0] V data = field[(1..).step(2)].map(f -> Float(f)) V flags = field[(2..).step(2)].map(f -> Int(f)) L(datum, flag) zip(data, flags) I flag < 1 nodata++ E I nodata_max == nodata & nodata > 0 nodata_maxline.append(date) I nodata_max < nodata & nodata > 0 nodata_max = nodata nodata_maxline = [date] nodata = 0 tot_line += datum num_line++ tot_file += tot_line num_file += num_line print(‘Line: #11 Reject: #2 Accept: #2 Line_tot: #6.3 Line_avg: #6.3’.format( date, data.len - num_line, num_line, tot_line, I (num_line > 0) {tot_line / num_line} E 0)) print() print(‘File(s) = #.’.format(:argv[1])) print(‘Total = #6.3’.format(tot_file)) print(‘Readings = #6’.format(num_file)) print(‘Average = #6.3’.format(tot_file / num_file)) print("\nMaximum run(s) of #. consecutive false readings ends at line starting with date(s): #.".format(nodata_max, nodata_maxline.join(‘, ’)))</syntaxhighlight> {{out}} <pre> ... 
Line: 2004-12-29 Reject: 1 Accept: 23 Line_tot: 56.300 Line_avg: 2.448 Line: 2004-12-30 Reject: 1 Accept: 23 Line_tot: 65.300 Line_avg: 2.839 Line: 2004-12-31 Reject: 1 Accept: 23 Line_tot: 47.300 Line_avg: 2.057 File(s) = readings.txt Total = 1358393.400 Readings = 129403 Average = 10.497 Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): 1993-03-05 </pre> =={{header\|Ada}}== {{libheader\|Simple components for Ada}} <~~lang~~syntaxhighlight lang="ada">with Ada.Text_IO; use Ada.Text_IO; with Strings_Edit; use Strings_Edit; with Strings_Edit.Floats; use Strings_Edit.Floats; Line 110 ⟶ 172: Close (File); Put_Line ("Syntax error at " & Image (Current.Line) & ':' & Image (Max.Pointer)); end Data_Munging;</~~lang~~syntaxhighlight> The implementation performs minimal checks. The average is calculated over all valid data. For the maximal chain of consequent invalid data, the source line number, the column number, and the time stamp of the first invalid data is printed. {{out\|Sample output}} Line 117 ⟶ 179: Max. 589 false readings start at 1136:20 stamped 1993-2-9 </pre> =={{header\|Aime}}== <syntaxhighlight lang="aime">integer bads, count, max_bads; file f; list l; real s; text bad_day, worst_day; f.stdin; max_bads = count = bads = s = 0; while (f.list(l, 0) ^ -1) { integer i; i = 2; while (i < 49) { if (0 < atoi(l[i])) { count += 1; s += atof(l[i - 1]); if (max_bads < bads) { max_bads = bads; worst_day = bad_day; } bads = 0; } else { if (!bads) { bad_day = l[0]; } bads += 1; } i += 2; } } o_form("Averaged /d3/ over ~ readings.\n", s / count, count); o_("Longest bad run ", max_bads, ", started ", worst_day, ".\n");</syntaxhighlight> Run as: <pre>cat readings.txt \| tr -d \\r \| aime SOURCE_FILE</pre> {{out}} <pre>Averaged 10.497 over 129403 readings. 
Longest bad run 589, started 1993-02-09.</pre> =={{header\|ALGOL 68}}== Line 123 ⟶ 227: {{works with\|ALGOL 68G\|Any - tested with release mk15-0.8b.fc9.i386}} <!--{{does not work with\|ELLA ALGOL 68\|Any (with appropriate job cards) - argc and argv are extensions}} --> <~~lang~~syntaxhighlight lang="algol68">INT no data := 0; # Current run of consecutive flags<0 in lines of file # INT no data max := -1; # Max consecutive flags<0 in lines of file # FLEX[0]STRING no data max line; # ... and line number(s) where it occurs # Line 217 ⟶ 321: upb list := UPB no data max line; printf(($l"Maximum run"f(p)" of "g(-0)" consecutive false reading"f(p)" ends at line starting with date"f(p)": "$, upb list = 1, no data max, no data max = 0, upb list = 1, list repr, no data max line, $l$))</~~lang~~syntaxhighlight> Command: $ a68g ./Data_Munging.a68 - data Line 236 ⟶ 340: Maximum run of 24 consecutive false readings ends at line starting with date: 1991-04-01 </pre> ~~=={{header\|Aime}}==~~ ~~<lang aime>integer bads, count, max_bads;~~ ~~file f;~~ ~~list l;~~ ~~real s;~~ ~~text bad_day, worst_day;~~ ~~f_affix(f, "/dev/stdin");~~ ~~max_bads = 0;~~ ~~count = 0;~~ ~~bads = 0;~~ ~~s = 0;~~ ~~while (f_list(f, l, 0) ^ -1) {~~ ~~integer e, i;~~ ~~i = 2;~~ ~~while (i < 49) {~~ ~~e = atoi(l_q_text(l, i));~~ ~~if (0 < e) {~~ ~~count += 1;~~ ~~s += atof(l_q_text(l, i - 1));~~ ~~if (max_bads < bads) {~~ ~~max_bads = bads;~~ ~~worst_day = bad_day;~~ } ~~bads = 0;~~ ~~} else {~~ ~~if (!bads) {~~ ~~bad_day = l_q_text(l, 0);~~ } ~~bads += 1;~~ } ~~i += 2;~~ } } ~~o_text("Averaged ");~~ ~~o_real(3, s / count);~~ ~~o_text(" over ");~~ ~~o_integer(count);~~ ~~o_text(" readings.\n");~~ ~~o_text("Longest bad run ");~~ ~~o_integer(max_bads);~~ ~~o_text(", started ");~~ ~~o_text(worst_day);~~ ~~o_text(".\n");</lang>~~ ~~Run as:~~ ~~<pre>cat readings.txt \| tr -d \\r \| aime SOURCE_FILE</pre>~~ ~~{{out}}~~ ~~<pre>Averaged 10.497 over 129403 readings.~~ ~~Longest bad run 589, started 1993-02-09.</pre>~~ 
=={{header\|AutoHotkey}}== <~~lang~~syntaxhighlight ~~AutoHotkey~~lang="autohotkey"># Author AlephX Aug 17 2011 SetFormat, float, 4.2 Line 366 ⟶ 415: Totavg := TotSum / TotValid FileAppend, `n`nDays %Lines%`nMaximal wrong readings: %maxwrong% from %startwrongdate% at %startoccurrence% to %lastwrongdate% at %lastoccurrence%`n`n, %result% FileAppend, Valid readings: %TotValid%`nTotal Value: %TotSUm%`nAverage: %TotAvg%, %result%</~~lang~~syntaxhighlight> {{out\|Sample output}} <pre>Day: 1990-01-01 sum: 590.00 avg: 26.82 Readings: 22/24.00 Line 385 ⟶ 434: =={{header\|AWK}}== <~~lang~~syntaxhighlight lang="awk">BEGIN{ nodata = 0; # Current run of consecutive flags<0 in lines of file nodata_max=-1; # Max consecutive flags<0 in lines of file Line 447 ⟶ 496: printf "\nMaximum run(s) of %i consecutive false readings ends at line starting with date(s): %s\n", nodata_max, nodata_maxline }</~~lang~~syntaxhighlight> {{out\|Sample output}} <pre>bash$ awk -f readings.awk readings.txt \| tail Line 463 ⟶ 512: =={{header\|Batch File}}== <~~lang~~syntaxhighlight lang="dos">@echo off setlocal ENABLEDELAYEDEXPANSION set maxrun= 0 Line 529 ⟶ 578: echo Line: %date% Accept: %count:~-3% tot: %sum:~-8% avg: %mean:~-8% goto :EOF</~~lang~~syntaxhighlight> {{out}} <pre> Line 551 ⟶ 600: =={{header\|BBC BASIC}}== <~~lang~~syntaxhighlight lang="bbcbasic"> file% = OPENIN("readings.txt") IF file% = 0 THEN PRINT "Could not open test data file" : END Line 593 ⟶ 642: PRINT "Overall mean = " ; Total / Count% @% = &90A PRINT '"Longest run of bad readings = " ; BadMax% " ending " BadDate$</~~lang~~syntaxhighlight> {{out}} <pre> Line 616 ⟶ 665: =={{header\|C}}== <~~lang~~syntaxhighlight lang="c">#include <stdio.h> #include <stdlib.h> #include <string.h> Line 700 ⟶ 749: fclose(outfile); return 0; }</~~lang~~syntaxhighlight> {{out\|Sample output}} <pre>1990-01-01 Reject: 2 Accept: 22 Average: 26.818 Line 717 ⟶ 766: =={{header\|C++}}== <~~lang~~syntaxhighlight ~~Cpp~~lang="cpp">#include <iostream> #include 
<fstream> #include <string> Line 783 ⟶ 832: cout << "Maximum number of consecutive bad readings is " << badCountMax << endl; cout << "Ends on date " << badDate << endl; }</~~lang~~syntaxhighlight> {{out}} <pre>1990-01-01 Reject: 2 Accept: 22 Average: 26.818 Line 793 ⟶ 842: Maximum number of consecutive bad readings is 589 Ends on date 1993-03-05</pre> =={{header\|Clojure}}== <syntaxhighlight lang="clojure"> (ns rosettacode.textprocessing1 (:require [clojure.string :as str])) (defn parse-line [s] (let [[date & data-toks] (str/split s #"\s+")] {:date date :hour-vals (for [[v flag] (partition 2 data-toks)] {:val (Double. v) :flag (Long. flag)})})) (defn analyze-line [m] (let [valid? (fn [rec] (pos? (:flag rec))) data (->> (filter valid? (:hour-vals m)) (map :val)) n-vals (count data) sum (reduce + data)] {:date (:date m) :n-vals n-vals :sum (double sum) :avg (if (zero? n-vals) 0.0 (/ sum n-vals)) :gaps (for [hr (:hour-vals m)] {:gap? (not (valid? hr)) :date (:date m)})})) (defn print-line [m] (println (format "%s: %d valid, sum: %7.3f, mean: %6.3f" (:date m) (:n-vals m) (:sum m) (:avg m)))) (defn process-line [s] (let [m (parse-line s) line-info (analyze-line m)] (print-line line-info) line-info)) (defn update-file-stats [file-m line-m] (let [append (fn [a b] (reduce conj a b))] (-> file-m (update-in [:sum] + (:sum line-m)) (update-in [:n-vals] + (:n-vals line-m)) (update-in [:gap-recs] append (:gaps line-m))))) (defn process-file [path] (let [file-lines (->> (slurp path) str/split-lines) summary (reduce (fn [res line] (update-file-stats res (process-line line))) {:sum 0 :n-vals 0 :gap-recs []} file-lines) max-gap (->> (partition-by :gap? (:gap-recs summary)) (filter #(:gap? (first %))) (sort-by count >) first)] (println (format "Sum: %f\n# Values: %d\nAvg: %f" (:sum summary) (:n-vals summary) (/ (:sum summary) (:n-vals summary)))) (println (format "Max gap of %d recs started on %s" (count max-gap) (:date (first max-gap)))))) </syntaxhighlight> {{out}} <pre> ... 
Many lines elided ... 2004-12-24: 23 valid, sum: 67.500, mean: 2.935 2004-12-25: 23 valid, sum: 137.500, mean: 5.978 2004-12-26: 23 valid, sum: 154.600, mean: 6.722 2004-12-27: 23 valid, sum: 57.100, mean: 2.483 2004-12-28: 23 valid, sum: 77.800, mean: 3.383 2004-12-29: 23 valid, sum: 56.300, mean: 2.448 2004-12-30: 23 valid, sum: 65.300, mean: 2.839 2004-12-31: 23 valid, sum: 47.300, mean: 2.057 Sum: 1358393.400000 # Values: 129403 Avg: 10.497387 Max gap of 589 recs started on 1993-02-09 </pre> =={{header\|COBOL}}== <~~lang~~syntaxhighlight lang="cobol"> IDENTIFICATION DIVISION. PROGRAM-ID. data-munging. Line 938 ⟶ 1,074: GOBACK .</~~lang~~syntaxhighlight> {{Out\|Example output}} Line 961 ⟶ 1,097: =={{header\|Common Lisp}}== <~~lang~~syntaxhighlight lang="lisp">(defvar invalid-count) (defvar max-invalid) (defvar max-invalid-date) Line 1,008 ⟶ 1,144: (format t "~%Maximum run(s) of ~a consecutive false readings ends at ~ line starting with date(s): ~a~%" max-invalid max-invalid-date)))</~~lang~~syntaxhighlight> {{out\|Example output}} <pre>... Line 1,024 ⟶ 1,160: =={{header\|D}}== {{trans\|Python}} <~~lang~~syntaxhighlight lang="d">void main(in string[] args) { import std.stdio, std.conv, std.string; Line 1,098 ⟶ 1,234: "readings ends at line starting with date(s): %-(%s, %)", noDataMax, noDataMaxLine); }</~~lang~~syntaxhighlight> The output matches that of the [[#Python\|Python]] version. =={{header\|Eiffel}}== <syntaxhighlight lang="eiffel"> class APPLICATION create make feature make -- Summary statistics for 'hash'. 
local reject, accept, reading_total: INTEGER total, average, file_total: REAL do read_wordlist across hash as h loop io.put_string (h.key + "%T") reject := 0 accept := 0 total := 0 across h.item as data loop if data.item.flag > 0 then accept := accept + 1 total := total + data.item.val else reject := reject + 1 end end file_total := file_total + total reading_total := reading_total + accept io.put_string ("accept: " + accept.out + "%Treject: " + reject.out + "%Ttotal: " + total.out + "%T") average := total / accept.to_real io.put_string ("average: " + average.out + "%N") end io.put_string ("File total: " + file_total.out + "%N") io.put_string ("Readings total: " + reading_total.out + "%N") find_longest_gap end find_longest_gap -- Longest gap (flag values <= 0). local count: INTEGER longest_gap: INTEGER end_date: STRING do create end_date.make_empty across hash as h loop across h.item as data loop if data.item.flag <= 0 then count := count + 1 else if count > longest_gap then longest_gap := count end_date := h.key end count := 0 end end end io.put_string ("%NThe longest gap is " + longest_gap.out + ". It ends at the date stamp " + end_date + ". %N") end original_list: STRING = "readings.txt" read_wordlist -- Preprocessed wordlist in 'hash'. 
local l_file: PLAIN_TEXT_FILE data: LIST [STRING] by_dates: LIST [STRING] date: STRING data_tup: TUPLE [val: REAL; flag: INTEGER] data_arr: ARRAY [TUPLE [val: REAL; flag: INTEGER]] i: INTEGER do create l_file.make_open_read_write (original_list) l_file.read_stream (l_file.count) data := l_file.last_string.split ('%N') l_file.close create hash.make (data.count) across data as d loop if not d.item.is_empty then by_dates := d.item.split ('%T') date := by_dates [1] by_dates.prune (date) create data_tup create data_arr.make_empty from i := 1 until i > by_dates.count - 1 loop data_tup := [by_dates [i].to_real, by_dates [i + 1].to_integer] data_arr.force (data_tup, data_arr.count + 1) i := i + 2 end hash.put (data_arr, date) if not hash.inserted then date.append ("_double_date_stamp") hash.put (data_arr, date) end end end end hash: HASH_TABLE [ARRAY [TUPLE [val: REAL; flag: INTEGER]], STRING] end </syntaxhighlight> {{out}} Only the last three lines of the per line summary statistics are shown. <pre> . . . 2004-12-29 accept: 23 reject: 1 total: 56.3 average: 2.44 2004-12-30 accept: 23 reject: 1 total: 65.3 average: 2.83 2004-12-31 accept: 23 reject: 1 total: 47.3 average: 2.05 File total: 1.35839e+006 Readings total: 129403 The longest gap is 589. It ends at the date stamp 1993-03-05. </pre> </pre> =={{header\|Erlang}}== The function file_contents/1 is used by [[Text_processing/2]]. Please update the user if you make any interface changes. <syntaxhighlight lang="erlang"> ~~<lang Erlang>~~ -module( text_processing ). Line 1,171 ⟶ 1,446: {_Previous, Value_flags} = lists:foldr( fun file_content_line_value_flag/2, {[], []}, Rest ), % Preserve order {binary:bin_to_list( Date_binary ), Value_flags}. 
</syntaxhighlight> ~~</lang>~~
{{out}}
<pre>
Line 1,191 ⟶ 1,466:
=={{header|Forth}}==
{{works with|GNU Forth}}
<~~lang~~syntaxhighlight lang="forth">\ data munging
\ 1991-03-30[\t10.000\t[-]1]24
Line 1,283 ⟶ 1,558:
total-sum f@ total-n @ .mean cr ;
main bye</~~lang~~syntaxhighlight>

=={{header|Fortran}}==
Aside from formatted I/O, Fortran also offers free-format or "list-directed" I/O that accepts numerical data without many constraints - though complex numbers must be presented in (x,y) style. There are complications when character data are to be read in, but this example does not involve that. Unfortunately, although the date part could be considered as three integers (with the hyphens separating the tokens), Fortran's free-format scheme requires an actual delimiter, not an implied delimiter. If slashes were to be used, its behaviour is even less helpful, as the / is recognised as terminating the scan of the line! This may well allow comments to be added to data lines, but it makes reading such dates difficult.

The free-format rule is that to read N data, input will be read from as many records as necessary until N data have been obtained. Should the last-read record have further data, they will not be seen by the next READ because it will start on a new record. So, to handle this, the plan becomes to read the record into a CHARACTER variable, read the date part with a FORMAT statement working on the first ten characters, and then read the rest of the input via free-format. If the data field is corrupt (say, a letter in place of a digit) the ERR label will be selected. Similarly, when reading from a CHARACTER variable there is no read-next-record available, so if there is a shortage of data (a number is missing) the END label will be selected. A complaint is printed, a HIC counted, and if HIC is not too large, control goes back to try the next record.
This means that record count no longer relates to the number of values read, so a count of bad values is maintained as well. In the event, no such troublesome records were encountered in this example.

An advantage of reading the record into a scratchpad is that any error messages can quote the text of the record. Similarly, tabs could be converted to spaces, etc. but this isn't needed as the free-format reading allows either, and commas. More generally, a much more comprehensive analysis of the record's text could be made should the need arise, most obviously that the date is a valid one, its hyphens are placed as expected, and the dates are in sequence without gaps. Another useful check is for the appearance of additional text beyond the twenty-four pairs specified - for example, a CRLF has been "lost" and the next record's content follows on the same line. Somehow. I suspect bungles with "cut&paste" hand editing, but the data suppliers never even apologised.

In decades of working with (half-)hourly data on the generation and consumption of electricity, it has been demonstrated that modern data suppliers endlessly manifest an inability to stick with a particular format, even one of their own design, and that the daylight savings changeover days (where there are ''not'' twenty-four hours in a day) surpass their competence annually. Persons from a mainframe background do a much more reliable job than those who have only tinkered with spreadsheets.

Incidentally, a daily average of a set of measurements may be unsuitable when data are missing, as when there is a regular pattern over a day. The N.Z. electricity supply association ruled that in calculating the ratio of daytime to nighttime usage, should there be four or more missing data in a day, then the entire day's data were to be rejected when computing the monthly or quarterly ratio.
<syntaxhighlight lang="fortran">
Crunches a set of hourly data. Starts with a date, then 24 pairs of value,indicator for that day, on one line.
      INTEGER Y,M,D             !Year, month, and day.
      INTEGER GOOD(24)          !The indicators.
      REAL*8 V(24),VTOT,T       !The grist.
      INTEGER NV,N,NB           !Number of good values overall, and in a day.
      INTEGER I,NREC,HIC        !Some counters.
      INTEGER BI,BN,BBI,BBN     !Stuff to locate the longest run of bad data,
      CHARACTER*10 BDATE,BBDATE !Along with the starting date.
      LOGICAL INGOOD            !State flipper for the runs of data.
      INTEGER IN,MSG            !I/O mnemonics.
      CHARACTER*666 ACARD       !Scratchpad, of sufficient length for all expectation.
      IN = 10           !Unit number for the input file.
      MSG = 6           !Output.
      OPEN (IN,FILE="Readings1.txt", FORM="FORMATTED", !This should be a function.
     1 STATUS ="OLD",ACTION="READ")                    !Returning success, or failure.
      NB = 0            !No bad values read.
      NV = 0            !Nor good values read.
      VTOT = 0          !Their average is to come.
      NREC = 0          !No records read.
      HIC = 0           !Provoking no complaints.
      INGOOD = .TRUE.   !I start in hope.
      BBN = 0           !And the longest previous bad run is short.
Chew into the file.
   10 READ (IN,11,END=100,ERR=666) L,ACARD(1:MIN(L,LEN(ACARD)))  !With some protection.
      NREC = NREC + 1           !So, a record has been read.
   11 FORMAT (Q,A)              !Obviously, Q ascertains the length of the record being read.
      READ (ACARD,12,END=600,ERR=601) Y,M,D     !The date part is trouble, as always.
   12 FORMAT (I4,2(1X,I2))                      !Because there are no delimiters between the parts.
      READ (ACARD(11:L),*,END=600,ERR=601) (V(I),GOOD(I),I = 1,24)  !But after the date, delimiters abound.
Calculations. Could use COUNT(array) and SUM(array), but each requires its own pass through the array.
   20 T = 0             !Start on the day's statistics.
      N = 0             !No values yet.
      DO I = 1,24       !So, scan the cargo and do all the twiddling in one pass..
        IF (GOOD(I).GT.0) THEN  !A good value?
          N = N + 1             !Yes. Count it in.
          T = T + V(I)          !And augment for the average.
          IF (.NOT.INGOOD) THEN !Had we been ungood?
            INGOOD = .TRUE.     !Yes. But now it changes.
            IF (BN.GT.BBN) THEN !The run just ending: is it longer?
              BBN = BN          !Yes. Make it the new baddest.
              BBI = BI          !Recalling its start index,
              BBDATE = BDATE    !And its start date.
            END IF              !So much for bigger badness.
          END IF                !Now we're in good data.
         ELSE                   !Otherwise, a bad value is upon us.
          IF (INGOOD) THEN      !Were we good?
            INGOOD = .FALSE.    !No longer. A new bad run is starting.
            BDATE = ACARD(1:10) !Recall the date for this starter.
            BI = I              !And its index.
            BN = 0              !Start the run-length counter.
          END IF                !So much for a fall.
          BN = BN + 1           !Count another bad value.
        END IF                  !Good or bad, so much for that value.
      END DO            !On to the next.
Commentary for the day's data..
      IF (N.LE.0) THEN  !I prefer to avoid dividing by zero.
        WRITE (MSG,21) NREC,ACARD(1:10) !So, no average to report.
   21   FORMAT ("Record",I8," (",A,") has no good data!")       !Just a remark.
       ELSE             !But otherwise,
        WRITE(MSG,22) NREC,ACARD(1:10),N,T/N    !An average is possible.
   22   FORMAT("Record",I8," (",A,")",I3," good, average",F9.3) !So here it is.
        NB = NB + 24 - N        !Count the bad by implication.
        NV = NV + N             !Count the good directly.
        VTOT = VTOT + T         !Should really sum deviations from a working average.
      END IF            !So much for that line.
      GO TO 10          !More! More! I want more!!

Complaints. Should really distinguish between trouble in the date part and in the data part.
  600 WRITE (MSG,*) '"END" declared - insufficient data?'       !Not enough numbers, presumably.
      GO TO 602                         !Reveal the record.
  601 WRITE (MSG,*) '"ERR" declared - improper number format?'  !Ah, but which number?
  602 WRITE (MSG,603) NREC,L,ACARD(1:L) !Anyway, reveal the uninterpreted record.
  603 FORMAT(" Record ",I0,", length ",I0," reads ",A)  !Just so.
      HIC = HIC + 1                     !This may grow into a habit.
      IF (HIC.LE.12) GO TO 10           !But if not yet, try the next record.
      STOP "Enough distaste."           !Or, give up.
  666 WRITE (MSG,101) NREC,"format error!"      !For A-style data? Should never happen!
      GO TO 900                         !But if it does, give up!

Closedown.
  100 WRITE (MSG,101) NREC,"then end-of-file"   !Discovered on the next attempt.
  101 FORMAT (" Record ",I0,": ",A)             !A record number plus a remark.
      WRITE (MSG,102) NV,NB,VTOT/NV             !The overall results.
  102 FORMAT (I8," values, ",I0," bad. Average",F9.4)   !This should do.
      IF (BBN.LE.0) THEN                !Now for a special report.
        WRITE (MSG,*) "No bad value presented, so no longest run."      !Unneeded!
       ELSE                             !But actually, the example data has some bad values.
        WRITE (MSG,103) BBN,BBI,BBDATE  !And this is for the longest encountered.
  103   FORMAT ("Longest bad run: ",I0,", starting hour ",I0," on ",A)  !Just so.
      END IF                            !Enough remarks.
  900 CLOSE(IN)         !Done.
      END       !Spaghetti rules.
</syntaxhighlight>

Output:
 Record 1 (1990-01-01) 22 good, average 26.818
 Record 2 (1990-01-02) 24 good, average 17.083
 Record 3 (1990-01-03) 24 good, average 58.958
 ...etc.
 Record 945 (1992-08-01) has no good data!
 ...etc.
 Record 5471 (2004-12-31) 23 good, average 2.057
 Record 5471: then end-of-file
 129403 values, 1901 bad. Average 10.4974
 Longest bad run: 589, starting hour 2 on 1993-02-09

=={{header|Go}}==
<~~lang~~syntaxhighlight lang="go">package main

import (
Line 1,376 ⟶ 1,767:
maxRun, maxDate)
}
}</~~lang~~syntaxhighlight>
{{out}}
<pre>
Line 1,394 ⟶ 1,785:
=={{header|Haskell}}==
<~~lang~~syntaxhighlight ~~Haskell~~lang="haskell">import Data.List
import Numeric
import Control.Arrow
Line 1,440 ⟶ 1,831:
(\(t,n) -> printf totalFmt n t (t/fromIntegral n)) $ fst summ
mapM_ ((\(l, d1,d2) -> printf maxFmt l d1 d2)
. (\(a,b)-> (a,(fst.(dat!!).(`div`24))b,(fst.(dat!!).(`div`24))(a+b)))) $ snd summ</~~lang~~syntaxhighlight>
{{out}}
<~~lang~~syntaxhighlight ~~Haskell~~lang="haskell">Main> :main ["./RC/readings.txt"]</~~lang~~syntaxhighlight>
<pre>Some lines:
Line 1,455 ⟶ 1,846:
=={{header|Icon}} and {{header|Unicon}}==
<~~lang~~syntaxhighlight ~~Icon~~lang="icon">record badrun(count,fromdate,todate) # record to track bad runs

procedure main()
Line 1,505 ⟶ 1,896:
else write(fout,"No bad runs of data")
end</~~lang~~syntaxhighlight>
{{out|Sample output}}
<pre>...
Line 1,521 ⟶ 1,912:
=={{header|J}}==
'''Solution:'''
<~~lang~~syntaxhighlight lang="j">   load 'files'
   parseLine=: 10&({. ,&< (_99&".;._1)@:}.)  NB. custom parser
   summarize=: # , +/ , +/ % #               NB. count,sum,mean
Line 1,534 ⟶ 1,925:
589
   ]StartDates=: Dates {~ (>:@I.@e.&MaxRun (24 <.@%~ +/)@{. ]) RunLengths
1993-03-05</~~lang~~syntaxhighlight>
'''Formatting Output'''<br>
Define report formatting verbs:
<~~lang~~syntaxhighlight lang="j">formatDailySumry=: dyad define
  labels=. , ];.2 'Line: Accept: Line_tot: Line_avg: '
  labels , x ,. 7j0 10j3 10j3 ": y
Line 1,547 ⟶ 1,938:
  'maxrun dates'=. x
  out=. out,LF,'Maximum run(s) of ',(": maxrun),' consecutive false readings ends at line(s) starting with date(s): ',dates
)</~~lang~~syntaxhighlight>
{{out|Show output}}
<~~lang~~syntaxhighlight lang="j">   (_4{.Dates) formatDailySumry _4{. DailySummary
Line: Accept: Line_tot: Line_avg:
2004-12-28 23 77.800 3.383
Line 1,561 ⟶ 1,952:
Average: 10.497
Maximum run(s) of 589 consecutive false readings ends at line(s) starting with date(s): 1993-03-05</~~lang~~syntaxhighlight>

=={{header|Java}}==
{{works with|Java|7}}
<~~lang~~syntaxhighlight lang="java">import java.io.File;
import java.util.*;
import static java.lang.System.out;
Line 1,653 ⟶ 2,044:
}
}
}</~~lang~~syntaxhighlight>
<pre>1990-01-01 out: 2 in: 22 tot: 590.000 avg: 26.818
Line 1,668 ⟶ 2,059:
=={{header|JavaScript}}==
{{works with|JScript}}
<~~lang~~syntaxhighlight lang="javascript">var filename = 'readings.txt';
var show_lines = 5;
var file_stats = {
Line 1,736 ⟶ 2,127:
function dec3(value) {
    return Math.round(value * 1e3) / 1e3;
}</~~lang~~syntaxhighlight>
{{out}}
<pre>Line: 1990-01-01 Reject: 2 Accept: 22 Line_tot: 590 Line_avg: 26.818
Line 1,750 ⟶ 2,141:
Maximum run of 589 consecutive false readings ends at 1993-03-05</pre>

=={{header|jq}}==
{{works with|jq|with foreach}}

This article highlights jq's recently added "foreach" and "inputs" filters, as they allow the input file to be processed efficiently on a line-by-line
basis, with minimal memory requirements.

The "foreach" syntax is:
<syntaxhighlight lang="jq">foreach STREAM as $row ( INITIAL; EXPRESSION; VALUE ).</syntaxhighlight>
The basic idea is that for each $row in STREAM, the value specified by VALUE is emitted.

If we wished only to produce per-line synopses of the "readings.txt" file, the following pattern could be used:
<syntaxhighlight lang="jq">foreach (inputs | split("\t")) as $line (INITIAL; EXPRESSION; VALUE)</syntaxhighlight>
In order to distinguish the single-line synopsis from the whole-file synopsis, we will use the following pattern instead:
<syntaxhighlight lang="jq">foreach ((inputs | split("\t")), null) as $line (INITIAL; EXPRESSION; VALUE)</syntaxhighlight>
The "null" is added so that the stream of per-line values can be distinguished from the last value in the stream.

In this section, the whole-file synopsis is focused on the runs of lines having at least one flag<=0. The maximal length of such runs is computed, and the starting line(s) and date(s) of all such runs are recorded.

One point of interest in the following program is the use of JSON objects to store values. This allows mnemonic names to be used instead of local variables.
<syntaxhighlight lang="jq"># Input: { "max": max_run_length,
#          "starts": array_of_start_line_values, # of all the maximal runs
#          "start_dates": array_of_start_dates   # of all the maximal runs
#        }
def report:
  (.starts | length) as $l
  | if $l == 1 then
      "There is one maximal run of lines with flag<=0.",
      "The maximal run has length \(.max) and starts at line \(.starts[0]) and has start date \(.start_dates[0])."
    elif $l == 0 then
      "There is no lines with flag<=0."
    else
      "There are \($l) maximal runs of lines with flag<=0.",
      "These runs have length \(.max) and start at the following line numbers:",
      "\(.starts)",
      "The corresponding dates are:",
      "\(.start_dates)"
    end;

# "process" processes "tab-separated string values" on stdin
def process:

  # Given a line in the form of an array [date, datum1, flag2, ...],
  # "synopsis" returns [ number of data items on the line with flag>0, sum, number of data items on the line with flag<=0 ]
  def synopsis:  # of a line
    . as $row
    | reduce range(0; (length - 1) / 2) as $i
        ( [0,0,0];
          ($row[1+ (2*$i)] | tonumber) as $datum
          | ($row[2+(2*$i)] | tonumber) as $flag
          | if ($flag>0) then .[0] += 1 | .[1] += $datum else .[2] += 1 end );

  # state: {"line": line_number                           # (first line is line 0)
  #         "synopsis": _,                                # value returned by "synopsis"
  #         "start": line_number_of_start_of_current_run,
  #         "start_date": date_of_start_of_current_run,
  #         "length": length_of_current_run               # so far
  #         "max": max_run_length                         # so far
  #         "starts": array_of_start_values               # of all the maximal runs
  #         "start_dates": array_of_start_dates           # of all the maximal runs
  #        }
  foreach ((inputs | split("\t")), null) as $line   # null signals END
    # Slots are effectively initialized by default to null
    ( { "line": -1, "length": 0, "max": 0, "starts": [], "start_dates": [] };
      if $line == null then .line = null
      else
        .line += 1    # | debug
        # synopsis returns [number with flag>0, sum, number with flag<=0 ]
        | .synopsis = ($line | synopsis)
        | if .synopsis[2] > 0
          then
            if .start then . else .start = .line | .start_date = $line[0] end
            | .length += 1
            | if .max < .length
              then (.max = .length)
                   | .starts = [ .start ]
                   | .start_dates = [ .start_date ]
              elif .max == .length
              then .starts += [ .start ]
                   | .start_dates += [ .start_date ]
              else .
              end
          else .start = null | .length = 0
          end
      end;
      .)
    | if .line == null then {max, starts, start_dates} | report
      else .synopsis
      end;

process</syntaxhighlight>
{{out}}
<syntaxhighlight lang="sh">$ jq -c -n -R -r -f Text_processing_1.jq readings.txt
[22,590,2]
[24,410,0]
...
[23,47.3,1]
There is one maximal run of lines with flag<=0.
The maximal run has length 93 and starts at line 5378 and has start date 2004-09-30.</syntaxhighlight>

=={{header|Julia}}==
<syntaxhighlight lang="julia">
using DataFrames

function mungdata(filename)
    lines = readlines(filename)
    numlines = length(lines)
    dates = Array{DateTime, 1}(numlines)
    means = zeros(Float64, numlines)
    numvalid = zeros(Int, numlines)
    invalidlength = zeros(Int, numlines)
    invalidpos = zeros(Int, numlines)
    datamatrix = Array{Float64,2}(numlines, 24)
    datamatrix .= NaN
    totalsum = 0.0
    totalgood = 0
    for (linenum,line) in enumerate(lines)
        data = split(line)
        validcount = badlength = 0
        validsum = 0.0
        for i in 2:2:length(data)-1
            if parse(Int, data[i+1]) >= 0
                validsum += (datamatrix[linenum, Int(i/2)] = parse(Float64, data[i]))
                validcount += 1
                badlength = 0
            else
                badlength += 1
                if badlength > invalidlength[linenum]
                    invalidlength[linenum] = badlength
                    invalidpos[linenum] = Int(i/2) - invalidlength[linenum] + 1
                end
            end
        end
        dates[linenum] = DateTime(data[1], "y-m-d")
        means[linenum] = validsum / validcount
        numvalid[linenum] = validcount
        totalsum += validsum
        totalgood += validcount
    end
    dt = DataFrame(Date = dates, Mean = means, ValidValues = numvalid,
                   MaximumGap = invalidlength, GapPosition = invalidpos)
    for i in 1:size(datamatrix)[2]
        dt[Symbol("$(i-1):00")] = datamatrix[:,i]
    end
    dt, totalsum/totalgood
end

datafilename = "data.txt" # this is taken from the example listed on the task, since the actual text file is not available
df, dmean = mungdata(datafilename)
println(df)
println("The overall mean is $dmean")
maxbadline = indmax(df[:MaximumGap])
maxbadval = df[:MaximumGap][maxbadline]
maxbadtime = df[:GapPosition][maxbadline] - 1
maxbaddate = replace("$(df[:Date][maxbadline])",
r"T.+$", "") println("The largest run of bad values is $(maxbadval), on $(maxbaddate) beginning at $(maxbadtime):00 hours.") </syntaxhighlight> {{output}} <pre> 6×29 DataFrames.DataFrame │ Row │ Date │ Mean │ ValidValues │ MaximumGap │ GapPosition │ 0:00 │ 1:00 │ 2:00 │ 3:00 │ 4:00 │ ├─────┼─────────────────────┼─────────┼─────────────┼────────────┼─────────────┼──────┼──────┼──────┼──────┼──────┤ │ 1 │ 1991-03-30T00:00:00 │ 10.0 │ 24 │ 0 │ 0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ │ 2 │ 1991-03-31T00:00:00 │ 23.5417 │ 24 │ 0 │ 0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ │ 3 │ 1991-03-31T00:00:00 │ 40.0 │ 1 │ 23 │ 2 │ 40.0 │ NaN │ NaN │ NaN │ NaN │ │ 4 │ 1991-04-01T00:00:00 │ 23.2174 │ 23 │ 1 │ 1 │ NaN │ 13.0 │ 16.0 │ 21.0 │ 24.0 │ │ 5 │ 1991-04-02T00:00:00 │ 19.7917 │ 24 │ 0 │ 0 │ 8.0 │ 9.0 │ 11.0 │ 12.0 │ 12.0 │ │ 6 │ 1991-04-03T00:00:00 │ 13.9583 │ 24 │ 0 │ 0 │ 10.0 │ 9.0 │ 10.0 │ 10.0 │ 9.0 │ │ Row │ 5:00 │ 6:00 │ 7:00 │ 8:00 │ 9:00 │ 10:00 │ 11:00 │ 12:00 │ 13:00 │ 14:00 │ 15:00 │ 16:00 │ 17:00 │ 18:00 │ ├─────┼──────┼──────┼──────┼──────┼──────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┤ │ 1 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ │ 2 │ 10.0 │ 10.0 │ 20.0 │ 20.0 │ 20.0 │ 35.0 │ 50.0 │ 60.0 │ 40.0 │ 30.0 │ 30.0 │ 30.0 │ 25.0 │ 20.0 │ │ 3 │ NaN │ NaN │ NaN │ NaN │ NaN │ NaN │ NaN │ NaN │ NaN │ NaN │ NaN │ NaN │ NaN │ NaN │ │ 4 │ 22.0 │ 20.0 │ 18.0 │ 29.0 │ 44.0 │ 50.0 │ 43.0 │ 38.0 │ 27.0 │ 27.0 │ 24.0 │ 23.0 │ 18.0 │ 12.0 │ │ 5 │ 12.0 │ 27.0 │ 26.0 │ 27.0 │ 33.0 │ 32.0 │ 31.0 │ 29.0 │ 31.0 │ 25.0 │ 25.0 │ 24.0 │ 21.0 │ 17.0 │ │ 6 │ 10.0 │ 15.0 │ 24.0 │ 28.0 │ 24.0 │ 18.0 │ 14.0 │ 12.0 │ 13.0 │ 14.0 │ 15.0 │ 14.0 │ 15.0 │ 13.0 │ │ Row │ 19:00 │ 20:00 │ 21:00 │ 22:00 │ 23:00 │ ├─────┼───────┼───────┼───────┼───────┼───────┤ │ 1 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ 10.0 │ │ 2 │ 20.0 │ 20.0 │ 20.0 │ 20.0 │ 35.0 │ │ 3 │ NaN │ NaN │ NaN │ NaN │ NaN │ │ 4 │ 13.0 │ 14.0 │ 15.0 │ 13.0 │ 10.0 │ │ 
5 │ 14.0 │ 15.0 │ 12.0 │ 12.0 │ 10.0 │ │ 6 │ 13.0 │ 13.0 │ 12.0 │ 10.0 │ 10.0 │ The overall mean is 18.241666666666667 The largest run of bad values is 23, on 1991-03-31 beginning at 1:00 hours. </pre> =={{header\|Kotlin}}== <syntaxhighlight lang="scala">// version 1.2.31 import java.io.File fun main(args: Array<String>) { val rx = Regex("""\s+""") val file = File("readings.txt") val fmt = "Line: %s Reject: %2d Accept: %2d Line_tot: %7.3f Line_avg: %7.3f" var grandTotal = 0.0 var readings = 0 var date = "" var run = 0 var maxRun = -1 var finishLine = "" file.forEachLine { line -> val fields = line.split(rx) date = fields[0] if (fields.size == 49) { var accept = 0 var total = 0.0 for (i in 1 until fields.size step 2) { if (fields[i + 1].toInt() >= 1) { accept++ total += fields[i].toDouble() if (run > maxRun) { maxRun = run finishLine = date } run = 0 } else run++ } grandTotal += total readings += accept println(fmt.format(date, 24 - accept, accept, total, total / accept)) } else println("Line: $date does not have 49 fields and has been ignored") } if (run > maxRun) { maxRun = run finishLine = date } val average = grandTotal / readings println("\nFile = ${file.name}") println("Total = ${"%7.3f".format(grandTotal)}") println("Readings = $readings") println("Average = ${"%-7.3f".format(average)}") println("\nMaximum run of $maxRun consecutive false readings") println("ends at line starting with date: $finishLine") }</syntaxhighlight> {{out}} Abbreviated output: <pre> Line: 1990-01-01 Reject: 2 Accept: 22 Line_tot: 590.000 Line_avg: 26.818 Line: 1990-01-02 Reject: 0 Accept: 24 Line_tot: 410.000 Line_avg: 17.083 Line: 1990-01-03 Reject: 0 Accept: 24 Line_tot: 1415.000 Line_avg: 58.958 Line: 1990-01-04 Reject: 0 Accept: 24 Line_tot: 1800.000 Line_avg: 75.000 Line: 1990-01-05 Reject: 0 Accept: 24 Line_tot: 1130.000 Line_avg: 47.083 .... 
Line: 2004-12-27 Reject: 1 Accept: 23 Line_tot: 57.100 Line_avg: 2.483 Line: 2004-12-28 Reject: 1 Accept: 23 Line_tot: 77.800 Line_avg: 3.383 Line: 2004-12-29 Reject: 1 Accept: 23 Line_tot: 56.300 Line_avg: 2.448 Line: 2004-12-30 Reject: 1 Accept: 23 Line_tot: 65.300 Line_avg: 2.839 Line: 2004-12-31 Reject: 1 Accept: 23 Line_tot: 47.300 Line_avg: 2.057 File = readings.txt Total = 1358393.400 Readings = 129403 Average = 10.497 Maximum run of 589 consecutive false readings ends at line starting with date: 1993-03-05 </pre> =={{header\|Lua}}== <~~lang~~syntaxhighlight ~~Lua~~lang="lua">filename = "readings.txt" io.input( filename ) Line 1,800 ⟶ 2,460: print( string.format( "Readings: %d", file_lines ) ) print( string.format( "Average: %f", file_sum/file_cnt_data ) ) print( string.format( "Maximum %d consecutive false readings starting at %s.", max_rejected, max_rejected_date ) )</~~lang~~syntaxhighlight> <pre>Output: File: readings.txt Line 1,808 ⟶ 2,468: Maximum 589 consecutive false readings starting at 1993-02-09.</pre> =={{header\|Mathematica}}/{{header\|Wolfram Language}}== <~~lang~~syntaxhighlight ~~Mathematica~~lang="mathematica">FileName = "Readings.txt"; data = Import[FileName,"TSV"]; Scan[(a=Position[#[[3;;All;;2]],1]; Print["Line:",#[[1]] ,"\tReject:", 24 - Length[a], "\t Accept:", Length[a], "\tLine_tot:", Line 1,824 ⟶ 2,483: Print["\nFile(s) : ",FileName,"\nTotal : ",AccountingForm@GlobalSum,"\nReadings : ",Nb, "\nAverage : ",GlobalSum/Nb,"\n\nMaximum run(s) of ",MaxRunRecorded, " consecutive false readings ends at line starting with date(s):",MaxRunTime]</~~lang~~syntaxhighlight> <pre>Line:1990-01-01 Reject:2 Accept:22 Line_tot:590. 
Line_avg:26.8182 Line 1,842 ⟶ 2,501: =={{header\|Nim}}== {{trans\|Python}} <~~lang~~syntaxhighlight lang="nim">import os, sequtils, strutils, ~~sequtils~~strformat var nodata = 0 nodataMax = -1 nodataMaxLine: seq[string~~] = @[~~] totFile = 0.0 Line 1,853 ⟶ 2,512: for filename in commandLineParams(): ~~var~~for fline =in ~~open(~~filename).lines: ~~for line in f.lines:~~ var totLine = 0.0 numLine = 0 ~~field~~data: ~~= line.split()~~seq[float] ~~date~~flags: ~~= field~~seq[0int] ~~data: seq[float] = @[]~~ ~~flags: seq[int] = @[]~~ let fields = line.split() ~~for i, f in field[1 .. -1]:~~ let date = fields[0] ~~if i mod 2 == 0: data.add parseFloat(f)~~ ~~else: flags.add parseInt(f)~~ for ~~datum~~i, ~~flag~~field in ~~items(zip(data, flags))~~fields[1..^1]: if i mod 2 == 0: data.add parseFloat(field) else: flags.add parseInt(field) for datum, flag in zip(data, flags).items: if flag < 1: inc nodata Line 1,883 ⟶ 2,542: numFile += numLine ~~echo~~let ~~"Line:~~average $#= if ~~Reject:~~numLine $#> ~~Accept~~0: $#totLine / ~~LineTot: $#~~float(numLine) ~~LineAvg~~else: ~~$#"~~0.0 echo &"Line: ~~.format(~~{date,} Reject: {data.len - numLine,:2} Accept: {numLine:2} ", &"LineTot: ~~formatFloat(~~{totLine~~, precision =~~:6.2f} ~~0),~~LineAvg: ~~formatFloat(~~{average:4.2f}" ~~(if numLine > 0: totLine / float(numLine) else: 0.0), precision = 0))~~ echo() echo &"""File(s) = {commandLineParams().join(" ")}""" echo &"Total = {totFile:.2f}" echo &"Readings = {numFile}" echo &"Average = {totFile / float(numFile):.2f}" echo "" echo &"Maximum run(s) of {nodataMax} consecutive false readings ", ~~echo "File(s) = ", commandLineParams().join(" ")~~ &"""ends at line starting with date(s): {nodataMaxLine.join(" ")}."""</syntaxhighlight> ~~echo "Total = ", formatFloat(totFile, precision = 0)~~ ~~echo "Readings = ", numFile~~ {{out}} ~~echo "Average = ", formatFloat(totFile / float(numFile), precision = 0)~~ <pre>$ ./textproc1 readings.txt \| tail ~~echo ""~~ Line: 2004-12-29 Reject: 1 
Accept: 23 LineTot: 56.30 LineAvg: 2.45 ~~echo "Maximum run(s) of ", nodataMax, " consecutive false readings ends at line starting with date(s): ", nodataMaxLine.join(" ")</lang>~~ Line: 2004-12-30 Reject: 1 Accept: 23 LineTot: 65.30 LineAvg: 2.84 ~~Output:~~ Line: 2004-12-31 Reject: 1 Accept: 23 LineTot: 47.30 LineAvg: 2.06 ~~<pre>$ ./textproc1 readings.txt\|tail~~ ~~Line: 2004-12-29 Reject: 1 Accept: 23 LineTot: 56.3 LineAvg: 2.44783~~ ~~Line: 2004-12-30 Reject: 1 Accept: 23 LineTot: 65.3 LineAvg: 2.83913~~ ~~Line: 2004-12-31 Reject: 1 Accept: 23 LineTot: 47.3 LineAvg: 2.05652~~ File(s) = readings.txt Total = ~~1.35839e+06~~1358393.40 Readings = 129403 Average = 10.~~4974~~50 Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): 1993-03-05.</pre>
=={{header\|OCaml}}== <~~lang~~syntaxhighlight lang="ocaml">let input_line ic = try Some(input_line ic) with End_of_file -> None Line 1,975 ⟶ 2,635: Printf.printf "Maximum run(s) of %d consecutive false readings \ ends at line starting with date(s): %s\n" nodata_max (String.concat ", " nodata_maxline);</~~lang~~syntaxhighlight>
=={{header\|Perl}}== ===An AWK-like solution=== <~~lang~~syntaxhighlight lang="perl">use strict; use warnings; Line 2,037 ⟶ 2,697: printf "\nMaximum run(s) of %i consecutive false readings ends at line starting with date(s): %s\n", $nodata_max, $nodata_maxline;</~~lang~~syntaxhighlight> {{out\|Sample output}} <pre>bash$ perl -f readings.pl readings.txt \| tail Line 2,053 ⟶ 2,713: ===An object-oriented solution=== <~~lang~~syntaxhighlight lang="perl">use strict; use warnings; Line 2,167 ⟶ 2,827: $parser->_push_bad_range_if_necessary } }</~~lang~~syntaxhighlight> {{out\|Sample output}} <pre>$ perl readings.pl < readings.txt \| tail Line 2,183 ⟶ 2,843: $</pre>
=={{header\|~~Perl 6~~Phix}}== <!--<syntaxhighlight lang="phix">(phixonline)--> ~~<lang perl6>my @gaps;~~ -- demo\rosetta\TextProcessing1.exw ~~my $previous = 'valid';~~ with javascript_semantics -- (include version/first of next three lines only) include readings.e -- global constant lines, or: --assert(write_lines("readings.txt",lines)!=-1) -- first run, then: --constant lines = read_lines("readings.txt") include builtins\timedate.e integer count = 0, max_count = 0, ntot = 0 atom readtot = 0 timedate run_start, max_start procedure end_bad_run() if count then if count>max_count then max_count = count max_start = run_start end if count = 0 end if end procedure for i=1 to length(lines) do sequence oneline = split(lines[i],'\t'), r if length(oneline)!=49 then ?"bad line (length!=49)" else r = parse_date_string(oneline[1],{"YYYY-MM-DD"}) if not timedate(r) then ?{"bad date",oneline[1]} else timedate td = r integer rejects=0, accepts=0 atom readsum = 0 for j=2 to 48 by 2 do r = scanf(oneline[j],"%f") if length(r)!=1 then ?{"error scanning",oneline[j]} rejects += 1 else atom reading = r[1][1] r = scanf(oneline[j+1],"%d") if length(r)!=1 then ?{"error scanning",oneline[j+1]} rejects += 1 else integer flag = r[1][1] if flag<=0 then if count=0 then run_start = td end if count += 1 rejects += 1 else end_bad_run() accepts += 1 readsum += reading end if end if end if end for readtot += readsum ntot += accepts if i>=length(lines)-2 then string average = iff(accepts=0?"N/A":sprintf("%6.3f",readsum/accepts)) printf(1,"Date: %s, Rejects: %2d, Accepts: %2d, Line total: %7.3f, Average %s\n", {format_timedate(td,"DD/MM/YYYY"),rejects, accepts, readsum, average}) end if end if end if end for printf(1,"Average: %.3f (of %d readings)\n",{readtot/ntot,ntot}) end_bad_run() if max_count then printf(1,"Maximum run of %d consecutive false readings starting: %s\n", {max_count,format_timedate(max_start,"DD/MM/YYYY")}) end if ?"done" {} = wait_key() <!--</syntaxhighlight>--> {{out}} <pre> Date: 29/12/2004, Rejects: 1, Accepts: 23, Line total: 56.300, Average 2.448 Date: 30/12/2004, Rejects: 1, Accepts: 23, Line total: 65.300, Average 2.839 Date: 31/12/2004, Rejects: 1, Accepts: 23, Line total: 47.300, Average 2.057 Average: 10.497 (of 129403 readings) Maximum run of 589 consecutive false readings starting: 09/02/1993 </pre>
=={{header\|Picat}}== ~~for $IN.lines -> $line {~~ {{trans\|Ruby}} ~~my ($date, @readings) = split /\s+/, $line;~~ <syntaxhighlight lang="picat">go => ~~my @valid;~~ File = "readings.txt", ~~my $hour = 0;~~ Total = new_map([num_readings=0,num_good_readings=0,sum_readings=0.0]), ~~for @readings -> $reading, $flag {~~ InvalidCount = 0, ~~if $flag > 0 {~~
MaxInvalidCount = 0, ~~@valid.push($reading);~~ InvalidRunEnd = "", ~~if $previous eq 'invalid' {~~ ~~@gaps[-1]{'end'} = "$date $hour:00";~~ ~~$previous = 'valid';~~ } } ~~else~~ { ~~if $previous eq 'valid' {~~ ~~@gaps.push( {start => "$date $hour:00"} );~~ } ~~@gaps[-1]{'count'}++;~~ ~~$previous = 'invalid';~~ } ~~$hour++;~~ } ~~say "$date: { ( +@valid ?? ( ( [+] @valid ) / +@valid ).fmt("%.3f") !! 0 ).fmt("%8s") }",~~ ~~" mean from { (+@valid).fmt("%2s") } valid.";~~ }; Id = 0, ~~my $longest = @gaps.sort({-$^a<count>})[0];~~ foreach(Line in read_file_lines(File)) Id := Id + 1, NumReadings = 0, NumGoodReadings = 0, SumReadings = 0, Fields = Line.split, Rec = Fields.tail.map(to_float), foreach([Reading,Flag] in chunks_of(Rec,2)) NumReadings := NumReadings + 1, if Flag > 0 then NumGoodReadings := NumGoodReadings + 1, SumReadings := SumReadings + Reading, InvalidCount := 0 else InvalidCount := InvalidCount + 1, if InvalidCount > MaxInvalidCount then MaxInvalidCount := InvalidCount, InvalidRunEnd := Fields[1] end end end, Total.put(num_readings,Total.get(num_readings) + NumReadings), Total.put(num_good_readings,Total.get(num_good_readings) + NumGoodReadings), Total.put(sum_readings,Total.get(sum_readings) + SumReadings), if Id <= 3 then printf("date:%w accept:%w reject:%w sum:%w\n", Fields[1],NumGoodReadings, NumReadings-NumGoodReadings, SumReadings) end end, nl, printf("readings: %d good readings: %d sum: %0.3f avg: %0.3f\n",Total.get(num_readings), Total.get(num_good_readings), Total.get(sum_readings), Total.get(sum_readings) / Total.get(num_good_readings)), nl, println(maxInvalidCount=MaxInvalidCount), println(invalidRunEnd=InvalidRunEnd), nl.</syntaxhighlight> ~~say "Longest period of invalid readings was {$longest<count>} hours,\n",~~ ~~"from {$longest<start>} till {$longest<end>}."</lang>~~ {{out}} <pre> date:1990-01-01 accept:22 reject:2 ~~26.818 mean from 22 valid~~sum:590.0 date:1990-01-02 accept:24 reject:0 ~~17.083 mean from 24 valid~~sum:410.0 
date:1990-01-03 accept:24 reject:0 ~~58.958 mean from 24 valid~~sum:1415.0 ~~1990-01-04: 75.000 mean from 24 valid.~~ readings: 131304 good readings: 129403 sum: 1358393.400 avg: 10.497 ~~1990-01-05: 47.083 mean from 24 valid.~~ ~~...~~ maxInvalidCount = 589 ~~(many lines omitted)~~ invalidRunEnd = 1993-03-05</pre> ~~...~~ ~~2004-12-27: 2.483 mean from 23 valid.~~ ~~2004-12-28: 3.383 mean from 23 valid.~~ ~~2004-12-29: 2.448 mean from 23 valid.~~ ~~2004-12-30: 2.839 mean from 23 valid.~~ ~~2004-12-31: 2.057 mean from 23 valid.~~ ~~Longest period of invalid readings was 589 hours,~~ ~~from 1993-02-09 1:00 till 1993-03-05 14:00.~~ ~~</pre>~~
=={{header\|PicoLisp}}== {{trans\|AWK}} Put the following into an executable file "readings": <~~lang~~syntaxhighlight ~~PicoLisp~~lang="picolisp">#!/usr/bin/picolisp /usr/lib/picolisp/lib.l (let (NoData 0 NoDataMax -1 NoDataMaxline "!" TotFile 0 NumFile 0) Line 2,281 ⟶ 3,048: " consecutive false readings ends at line starting with date(s): " NoDataMaxline ) ) ) (bye)</~~lang~~syntaxhighlight> Then it can be called as <pre>$ ./readings readings.txt \|tail Line 2,297 ⟶ 3,064:
=={{header\|PL/I}}== <~~lang~~syntaxhighlight lang="pli">text1: procedure options (main); /* 13 May 2010 */ declare line character (2000) varying; Line 2,350 ⟶ 3,117: finish_up: end text1;</~~lang~~syntaxhighlight>
=={{header\|PowerShell}}== <syntaxhighlight lang="powershell">$file = '.\readings.txt' $lines = Get-Content $file # $args[0] $valid = $true $startDate = $currStart = $endDate = '' $startHour = $endHour = $currHour = $max = $currMax = $total = $readings = 0 $task = @() foreach ($var in $lines) { $date, $rest = [regex]::Split($var,'\s+') $reject = $accept = $sum = $cnt = 0 while ($rest) { $cnt += 1 [Double]$val, [Double]$flag, $rest = $rest if (0 -lt $flag) { $currMax = 0 $sum += $val $accept += 1 } else { if (0 -eq $currMax) { $currStart = $date $currHour = $cnt } $currMax += 1 $reject += 1 if ($max -lt $currMax) { $startDate, $endDate = $currStart,
$date $startHour, $endHour = $currHour, $cnt $max = $currMax } } } $readings += $accept $total += $sum $average = if (0 -lt $accept) {$sum/$accept} else {0} $task += [PSCustomObject]@{ 'Line' = $date 'Reject' = $reject 'Accept' = $accept 'Sum' = $sum.ToString("N") 'Average' = $average.ToString("N3") } $valid = 0 -eq $reject } $task \| Select -Last 3 $average = $total/$readings "File(s) = $file" "Total = {0}" -f $total.ToString("N") "Readings = $readings" "Average = {0}" -f $average.ToString("N3") "" "Maximum run(s) of $max consecutive false readings." if (0 -lt $max) { "Consecutive false readings starts at line starting with date $startDate at hour {0:0#}:00." -f $startHour "Consecutive false readings ends at line starting with date $endDate at hour {0:0#}:00." -f $endHour }</syntaxhighlight> <pre>Line : 2004-12-29 Reject : 1 Accept : 23 Sum : 56.30 Average : 2.448 Line : 2004-12-30 Reject : 1 Accept : 23 Sum : 65.30 Average : 2.839 Line : 2004-12-31 Reject : 1 Accept : 23 Sum : 47.30 Average : 2.057 File(s) = .\readings.txt Total = 1,358,393.40 Readings = 129403 Average = 10.497 Maximum run(s) of 589 consecutive false readings. Consecutive false readings starts at line starting with date 1993-02-09 at hour 02:00. Consecutive false readings ends at line starting with date 1993-03-05 at hour 14:00.</pre> =={{header\|PureBasic}}== <~~lang~~syntaxhighlight ~~PureBasic~~lang="purebasic">#TASK="Text processing/1" Define File$, InLine$, Part$, i, Out$, ErrEnds$, Errcnt, ErrMax Define lsum.d, tsum.d, rejects, val.d, readings Line 2,399 ⟶ 3,249: ; Print("Press ENTER to exit"): Input() EndIf</~~lang~~syntaxhighlight> {{out\|Sample output}} <pre>... 
Line 2,416 ⟶ 3,266:
=={{header\|Python}}== <~~lang~~syntaxhighlight lang="python">import fileinput import sys Line 2,471 ⟶ 3,321: print "\nMaximum run(s) of %i consecutive false readings ends at line starting with date(s): %s" % ( nodata_max, ", ".join(nodata_maxline))</~~lang~~syntaxhighlight> {{out\|Sample output}} <pre>bash$ /cygdrive/c/Python26/python readings.py readings.txt\|tail Line 2,487 ⟶ 3,337:
=={{header\|R}}== <~~lang~~syntaxhighlight ~~R~~lang="r">#Read in data from file dfr <- read.delim("readings.txt") #Calculate daily means Line 2,495 ⟶ 3,345: #Calculate time between good measurements times <- strptime(dfr[1,1], "%Y-%m-%d", tz="GMT") + 3600*seq(1,24*nrow(dfr),1) hours.between.good.measurements <- diff(times[t(flags)])/3600</~~lang~~syntaxhighlight>
=={{header\|Racket}}== <~~lang~~syntaxhighlight lang="racket">#lang racket ;; Use SRFI 48 to make %n.nf formats convenient. (require (prefix-in srfi/48: srfi/48)) ; SRFI 48: Intermediate Format Strings Line 2,551 ⟶ 3,401: (unless (zero? N) (srfi/48:format #t "Average = ~10,3F~%" (/ sum N))) (srfi/48:format #t "~%Maximum run(s) of ~a consecutive false readings ends at line starting with date(s): ~a~%" max-consecutive-false (string-join max-false-tags))))</~~lang~~syntaxhighlight> {{out\|Sample run}} <pre>$ racket 1.rkt readings/readings.txt \| tail Line 2,564 ⟶ 3,414: Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): 1993-03-05</pre>
=={{header\|Raku}}== (formerly Perl 6) <syntaxhighlight lang="raku" line>my @gaps; my $previous = 'valid'; for $*IN.lines -> $line { my ($date, @readings) = split /\s+/, $line; my @valid; my $hour = 0; for @readings -> $reading, $flag { if $flag > 0 { @valid.push($reading); if $previous eq 'invalid' { @gaps[*-1]{'end'} = "$date $hour:00"; $previous = 'valid'; } } else { if $previous eq 'valid' { @gaps.push( {start => "$date $hour:00"} ); } @gaps[*-1]{'count'}++; $previous = 'invalid'; } $hour++; } say "$date: { ( +@valid ??
( ( [+] @valid ) / +@valid ).fmt("%.3f") !! 0 ).fmt("%8s") }", " mean from { (+@valid).fmt("%2s") } valid."; }; my $longest = @gaps.sort({-$^a<count>})[0]; say "Longest period of invalid readings was {$longest<count>} hours,\n", "from {$longest<start>} till {$longest<end>}."</syntaxhighlight> {{out}} <pre> 1990-01-01: 26.818 mean from 22 valid. 1990-01-02: 17.083 mean from 24 valid. 1990-01-03: 58.958 mean from 24 valid. 1990-01-04: 75.000 mean from 24 valid. 1990-01-05: 47.083 mean from 24 valid. ... (many lines omitted) ... 2004-12-27: 2.483 mean from 23 valid. 2004-12-28: 3.383 mean from 23 valid. 2004-12-29: 2.448 mean from 23 valid. 2004-12-30: 2.839 mean from 23 valid. 2004-12-31: 2.057 mean from 23 valid. Longest period of invalid readings was 589 hours, from 1993-02-09 1:00 till 1993-03-05 14:00. </pre> =={{header\|REXX}}== <~~lang~~syntaxhighlight lang="rexx">/*REXX program to process instrument data from a data file. */ numeric digits 20 /*allow for bigger (precision) numbers. */ ifid='READINGS.TXT' /*the name ~~input file.~~ of the input file. */ ofid='READINGS.OUT' /*~~the~~ " ~~outut~~ ~~file.~~ " " " output " */ grandSum=0 /*the grand sum of whole file. */ ~~grandflg~~grandFlg=0 /*the grand ~~num~~number of flagged data. */ grandOKs=0 ~~longFlag~~Lflag=0 /*the longest period of flagged data. */ ~~contFlag~~Cflag=0 /*the longest continuous flagged data. */ w=16 /*the width of fields when displayed. */ do recs=1 while lines(ifid)\==0 /*~~read~~keep reading records until finished. */ rec=linein(ifid) /*read the next record (line). of file. */ parse var rec datestamp Idata /*pick off the dateStamp &and the data. */ sum=0 flg=0 OKs=0 do j=1 until Idata='' /*process the instrument data. 
*/ parse var Idata data.j flag.j Idata if flag.j>0 then do /*ifprocess good data, ~~...~~··· */ OKs=OKs+1 sum=sum+data.j if ~~contFlag~~Cflag>~~longFlag~~Lflag then do ~~longdate~~Ldate=datestamp ~~longFlag~~Lflag=~~contFlag~~Cflag end ~~contFlag~~Cflag=0 end else do /*process flagged data ~~...~~ ··· */ flg=flg+1 ~~contFlag~~Cflag=~~contFlag~~Cflag+1 end end /*j*/ if OKs\==0 then avg=format(sum/OKs,,3) else avg='[n/a]' grandOKs=grandOKs+OKs _=right(~~comma~~commas(avg),w) grandSum=grandSum+sum grandFlg=grandFlg+flg if flg==0 then call sy datestamp ' average='_ else call sy datestamp ' average='_ ' flagged='right(flg,2) end /*recs*/ recs=recs-1 /*adjust for reading ~~end-of-file~~the end─of─file. */ if grandOKs\==0 then Gavg=format(~~grandsum~~grandSum/grandOKs,,3) else Gavg='[n/a]' call sy call sy copies('═',60) call sy ' records read:' right(~~comma~~commas(recs), w) call sy ' grand sum:' right(~~comma~~commas(grandSum), w+4) call sy ' grand average:' right(~~comma~~commas(Gavg), w+4) call sy ' grand OK data:' right(~~comma~~commas(grandOKs), w) call sy ' grand flagged:' right(~~comma~~commas(grandFlg), w) if Lflag\==0 then call sy ' longest flagged:' right(commas(Lflag),w) " ending at " Ldate ~~if longFlag\==0 then~~ ~~call sy ' longest flagged:' right(comma(longFlag),w) " ending at " longdate~~ call sy copies('═',60) exit /*stick a fork in it, we're all done. 
*/ /*────────────────────────────────────────────────────────────────────────────*/ ~~/*──────────────────────────────────SY subroutine───────────────────────*/~~ sycommas: procedure; parse arg ~~stuff~~_; n=_'.9'; #=123456789; ~~say~~ ~~stuff~~b=verify(n,#,"M") e=verify(n,#'0',,verify(n,#"0.",'M'))-4 ~~if 1==0 then call lineout ofid,stuff~~ do j=e to b by -3; _=insert(',',_,j); end /*j*/; return _ ~~return~~ /*────────────────────────────────────────────────────────────────────────────*/ ~~/*──────────────────────────────────COMMA subroutine────────────────────*/~~ sy: say arg(1); call lineout ofid,arg(1); return</syntaxhighlight> ~~comma: procedure; parse arg _,c,p,t;arg ,cu;c=word(c ",",1)~~ '''output'''   when using the default input file: ~~if cu=='BLANK' then c=' ';o=word(p 3,1);p=abs(o);t=word(t 999999999,1)~~ <pre style="height:40ex"> ~~if \datatype(p,'W')\|\datatype(t,'W')\|p==0\|arg()>4 then return _;n=_'.9'~~ ~~#=123456789;k=0;if o<0 then do;b=verify(_,' ');if b==0 then return _~~ ~~e=length(_)-verify(reverse(_),' ')+1;end;else do;b=verify(n,#,"M")~~ ~~e=verify(n,#'0',,verify(n,#"0.",'M'))-p-1;end~~ ~~do j=e to b by -p while k<t;_=insert(c,_,j);k=k+1;end;return _</lang>~~ ~~{{out}}~~ ~~<pre style="height:40ex;overflow:scroll">~~ ∙ ∙ Line 2,660 ⟶ 3,557: =={{header\|Ruby}}== <~~lang~~syntaxhighlight lang="ruby">filename = "readings.txt" total = { "num_readings" => 0, "num_good_readings" => 0, "sum_readings" => 0.0 } invalid_count = 0 Line 2,702 ⟶ 3,599: printf "Average = %.3f\n", total['sum_readings']/total['num_good_readings'] puts "" puts "Maximum run(s) of #{max_invalid_count} consecutive false readings ends at #{invalid_run_end}"</~~lang~~syntaxhighlight> Alternate implementation: <syntaxhighlight lang="ruby">Reading = Struct.new(:date, :value, :flag) DailyReading = Struct.new(:date, :readings) do def good() readings.select(&:flag) end def bad() readings.reject(&:flag) end def sum() good.map(&:value).inject(0.0) {\|sum, val\| sum + val } end def avg() good.size > 
0 ? (sum / good.size) : 0 end def print_status puts "%11s: good: %2d bad: %2d total: %8.3f avg: %6.3f" % [date, good.count, bad.count, sum, avg] self end end daily_readings = IO.foreach(ARGV.first).map do \|line\| (date, *parts) = line.chomp.split(/\s/) readings = parts.each_slice(2).map {\|pair\| Reading.new(date, pair.first.to_f, pair.last.to_i > 0)} DailyReading.new(date, readings).print_status end all_readings = daily_readings.flat_map(&:readings) good_readings = all_readings.select(&:flag) all_streaks = all_readings.slice_when {\|bef, aft\| bef.flag != aft.flag } worst_streak = all_streaks.reject {\|grp\| grp.any?(&:flag)}.sort_by(&:size).last total = good_readings.map(&:value).reduce(:+) num_readings = good_readings.count puts puts "Total: %.3f" % total puts "Readings: #{num_readings}" puts "Average %.3f" % total./(num_readings) puts puts "Max run of #{worst_streak.count} consecutive false readings from #{worst_streak.first.date} until #{worst_streak.last.date}" </syntaxhighlight> =={{header\|Scala}}== Line 2,708 ⟶ 3,640: A fully functional solution, minus the fact that it uses iterators: <~~lang~~syntaxhighlight lang="scala">object DataMunging { import scala.io.Source Line 2,765 ⟶ 3,697: println(report format (files mkString " ", totalSum, totalSize, totalSum / totalSize, invalidCount, startDate)) } }</~~lang~~syntaxhighlight> A quick&dirty solution: <~~lang~~syntaxhighlight lang="scala">object AltDataMunging { def main(args: Array[String]) { var totalSum = 0.0 Line 2,808 ⟶ 3,740: println(report format (files mkString " ", totalSum, totalSize, totalSum / totalSize, maxInvalidCount, maxInvalidDate)) } }</~~lang~~syntaxhighlight> Last few lines of the sample output (either version): <pre> Line 2,823 ⟶ 3,755: </pre> Though it is easier to show where the run of consecutive false readings ends, if the longest run is the last thing in the file, it hasn't really "ended". 
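The caveat above can be made concrete by recording both ends of each run and noting whether the run was closed by a good reading or merely cut off by the end of the file. The following is a minimal Python sketch of that idea, not a translation of the Scala code; the names <code>Run</code>, <code>longest_bad_run</code> and <code>open_at_eof</code> are hypothetical, and it assumes whitespace-separated fields with a flag of 1 or more meaning "good", as in the task description.

```python
from typing import Iterable, NamedTuple, Optional


class Run(NamedTuple):
    length: int         # number of consecutive flagged (bad) readings
    start: str          # date of the first bad reading in the run
    last: str           # date of the last bad reading in the run
    open_at_eof: bool   # True if the file ended before a good reading arrived


def longest_bad_run(lines: Iterable[str]) -> Optional[Run]:
    """Return the longest run of readings whose flag is <= 0, or None."""
    best: Optional[Run] = None
    length, start, last = 0, "", ""

    def close(open_at_eof: bool) -> None:
        # Keep the current run if it is strictly longer than the best so far.
        nonlocal best
        if length and (best is None or length > best.length):
            best = Run(length, start, last, open_at_eof)

    for line in lines:
        fields = line.split()
        if not fields:
            continue
        date, rest = fields[0], fields[1:]
        # Fields after the date alternate: value, flag, value, flag, ...
        for flag in rest[1::2]:
            if int(flag) <= 0:
                if length == 0:
                    start = date
                length += 1
                last = date
            else:
                close(open_at_eof=False)
                length = 0
    close(open_at_eof=True)   # a run still in progress never really "ended"
    return best
```

A caller could then print "ends at" only when <code>open_at_eof</code> is false, and "still running at end of file" otherwise.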
=={{header\|Sidef}}== {{trans\|Raku}} <syntaxhighlight lang="ruby">var gaps = []; var previous = :valid; ARGF.each { \|line\| var (date, *readings) = line.words...; var valid = []; var hour = 0; readings.map{.to_n}.each_slice(2, { \|reading, flag\| if (flag > 0) { valid << reading; if (previous == :invalid) { gaps[-1]{:end} = "#{date} #{hour}:00"; previous = :valid; } } else { if (previous == :valid) { gaps << Hash(start => "#{date} #{hour}:00"); } gaps[-1]{:count} := 0 ++; previous = :invalid; } ++hour; }) say ("#{date}: #{ '%8s' % (valid ? ('%.3f' % Math.avg(valid...)) : 0) }", " mean from #{ '%2s' % valid.len } valid."); } var longest = gaps.sort_by{\|a\| -a{:count} }.first; say ("Longest period of invalid readings was #{longest{:count}} hours,\n", "from #{longest{:start}} till #{longest{:end}}.");</syntaxhighlight> {{out}} <pre> 1991-03-30: 10.000 mean from 24 valid. 1991-03-31: 23.542 mean from 24 valid. 1991-03-31: 40.000 mean from 1 valid. 1991-04-01: 23.217 mean from 23 valid. 1991-04-02: 19.792 mean from 24 valid. 1991-04-03: 13.958 mean from 24 valid. Longest period of invalid readings was 24 hours, from 1991-03-31 1:00 till 1991-04-01 1:00. </pre> ''Output is from the sample of the task.'' =={{header\|Swift}}== {{trans\|Rust}} <syntaxhighlight lang="swift">import Foundation let fmtDbl = { String(format: "%10.3f", $0) } Task.detached { let formatter = DateFormatter() formatter.dateFormat = "yyyy-MM-dd" let (data, _) = try await URLSession.shared.bytes(from: URL(fileURLWithPath: CommandLine.arguments[1])) var rowStats = [(Date, Double, Int)]() var invalidPeriods = 0 var invalidStart: Date? var sumFile = 0.0 var readings = 0 var longestInvalid = 0 var longestInvalidStart: Date? var longestInvalidEnd: Date? 
for try await line in data.lines { let lineSplit = line.components(separatedBy: "\t") guard !lineSplit.isEmpty, let date = formatter.date(from: lineSplit[0]) else { fatalError("Invalid date \(lineSplit[0])") } let data = Array(lineSplit.dropFirst()) let parsed = stride(from: 0, to: data.endIndex, by: 2).map({idx -> (Double, Int) in let slice = data[idx..<idx+2] return (Double(slice[idx]) ?? 0, Int(slice[idx+1]) ?? 0) }) var sum = 0.0 var numValid = 0 for (val, flag) in parsed { if flag <= 0 { if invalidStart == nil { invalidStart = date } invalidPeriods += 1 } else { if invalidPeriods > longestInvalid { longestInvalid = invalidPeriods longestInvalidStart = invalidStart longestInvalidEnd = date } sumFile += val sum += val numValid += 1 readings += 1 invalidPeriods = 0 invalidStart = nil } } if numValid != 0 { rowStats.append((date, sum / Double(numValid), parsed.count - numValid)) } } for stat in rowStats.lazy.reversed().prefix(5) { print("\(stat.0): Average: \(fmtDbl(stat.1)); Valid Readings: \(24 - stat.2); Invalid Readings: \(stat.2)") } print(""" Sum File: \(fmtDbl(sumFile)) Average: \(fmtDbl(sumFile / Double(readings))) Readings: \(readings) Longest Invalid: \(longestInvalid) (\(longestInvalidStart!) 
- \(longestInvalidEnd!)) """) exit(0) } dispatchMain() </syntaxhighlight> {{out}} <pre>2004-12-31 05:00:00 +0000: Average: 2.057; Valid Readings: 23; Invalid Readings: 1 2004-12-30 05:00:00 +0000: Average: 2.839; Valid Readings: 23; Invalid Readings: 1 2004-12-29 05:00:00 +0000: Average: 2.448; Valid Readings: 23; Invalid Readings: 1 2004-12-28 05:00:00 +0000: Average: 3.383; Valid Readings: 23; Invalid Readings: 1 2004-12-27 05:00:00 +0000: Average: 2.483; Valid Readings: 23; Invalid Readings: 1 Sum File: 1358393.400 Average: 10.497 Readings: 129403 Longest Invalid: 589 (1993-02-09 05:00:00 +0000 - 1993-03-05 05:00:00 +0000)</pre> =={{header\|Tcl}}== <~~lang~~syntaxhighlight lang="tcl">set max_invalid_run 0 set max_invalid_run_end "" set tot_file 0 Line 2,865 ⟶ 3,943: puts "Average = [format %.3f [expr {$tot_file / $num_file}]]" puts "" puts "Maximum run(s) of $max_invalid_run consecutive false readings ends at $max_invalid_run_end"</~~lang~~syntaxhighlight> =={{header\|Ursala}}== Line 2,871 ⟶ 3,949: and booleans (type <code>%ebXLm</code>) in the parsed data. The same function is used to compute the daily and the cumulative statistics. 
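The design described above, one statistics function applied both to each day and to the whole file, can be sketched in Python as follows. This is an illustration under assumptions, not a rendering of the Ursala code: the names <code>parse</code> and <code>stats</code> are hypothetical, and the parsed data is modelled as a list of (date, list of (value, accepted) pairs) so that repeated dates are kept as separate lines.

```python
from typing import Iterable, List, Tuple

Reading = Tuple[float, bool]   # (value, accepted?)


def parse(lines: Iterable[str]) -> List[Tuple[str, List[Reading]]]:
    """Turn each non-empty line into (date, [(value, accepted), ...])."""
    parsed = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        date, rest = fields[0], fields[1:]
        # Fields after the date alternate value, flag; a flag >= 1 means accepted.
        pairs = [(float(v), int(f) > 0) for v, f in zip(rest[0::2], rest[1::2])]
        parsed.append((date, pairs))
    return parsed


def stats(readings: Iterable[Reading]) -> Tuple[int, float, float]:
    """Return (accepted count, total, average) over the accepted readings."""
    good = [value for value, ok in readings if ok]
    total = sum(good)
    return len(good), total, (total / len(good) if good else 0.0)
```

Daily figures come from <code>stats(day)</code> for each parsed line, and the summary from <code>stats(r for _, day in parsed for r in day)</code>, so the two cannot drift apart.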
<~~lang~~syntaxhighlight ~~Ursala~~lang="ursala">#import std #import nat #import flo Line 2,891 ⟶ 3,969: @nmrSPDSL -&~&,leql$^; ^/length ~&zn&-@hrZPF+ rlc both ~&rZ+- main = ^T(daily_stats^lrNCT/~& @mSL 'summary ':,long_run) parsed_data</~~lang~~syntaxhighlight> last few lines of output: <pre> Line 2,899 ⟶ 3,977: summary accept: 129403 total: 1358393.4 average: 10.497 maximum of 589 consecutive false readings ending on line 1993-03-05 </pre> =={{header\|VBScript}}== <syntaxhighlight lang="vb">Set objFSO = CreateObject("Scripting.FileSystemObject") Set objFile = objFSO.OpenTextFile(objFSO.GetParentFolderName(WScript.ScriptFullName) &_ "\data.txt",1) bad_readings_total = 0 good_readings_total = 0 data_gap = 0 start_date = "" end_date = "" tmp_data_gap = 0 tmp_start_date = "" Do Until objFile.AtEndOfStream bad_readings = 0 good_readings = 0 line_total = 0 line = objFile.ReadLine token = Split(line,vbTab) n = 1 Do While n <= UBound(token) If n + 1 <= UBound(token) Then If CInt(token(n+1)) < 1 Then bad_readings = bad_readings + 1 bad_readings_total = bad_readings_total + 1 'Account for bad readings. If tmp_start_date = "" Then tmp_start_date = token(0) End If tmp_data_gap = tmp_data_gap + 1 Else good_readings = good_readings + 1 line_total = line_total + CInt(token(n)) good_readings_total = good_readings_total + 1 'Sum up the bad readings. 
If (tmp_start_date <> "") And (tmp_data_gap > data_gap) Then start_date = tmp_start_date end_date = token(0) data_gap = tmp_data_gap tmp_start_date = "" tmp_data_gap = 0 Else tmp_start_date = "" tmp_data_gap = 0 End If End If End If n = n + 2 Loop 'Avoid division by zero on lines with no good readings. If good_readings > 0 Then line_avg = line_total/good_readings Else line_avg = 0 End If WScript.StdOut.Write "Date: " & token(0) & vbTab &_ "Bad Reads: " & bad_readings & vbTab &_ "Good Reads: " & good_readings & vbTab &_ "Line Total: " & FormatNumber(line_total,3) & vbTab &_ "Line Avg: " & FormatNumber(line_avg,3) WScript.StdOut.WriteLine Loop WScript.StdOut.WriteLine WScript.StdOut.Write "Maximum run of " & data_gap &_ " consecutive bad readings from " & start_date & " to " &_ end_date & "." WScript.StdOut.WriteLine objFile.Close Set objFSO = Nothing</syntaxhighlight> {{Out}} <pre> Date: 1991-03-30 Bad Reads: 0 Good Reads: 24 Line Total: 240.000 Line Avg: 10.000 Date: 1991-03-31 Bad Reads: 0 Good Reads: 24 Line Total: 565.000 Line Avg: 23.542 Date: 1991-03-31 Bad Reads: 23 Good Reads: 1 Line Total: 40.000 Line Avg: 40.000 Date: 1991-04-01 Bad Reads: 1 Good Reads: 23 Line Total: 534.000 Line Avg: 23.217 Date: 1991-04-02 Bad Reads: 0 Good Reads: 24 Line Total: 475.000 Line Avg: 19.792 Date: 1991-04-03 Bad Reads: 0 Good Reads: 24 Line Total: 335.000 Line Avg: 13.958 Maximum run of 24 consecutive bad readings from 1991-03-31 to 1991-04-01. </pre> Line 2,904 ⟶ 4,058: {{trans\|AWK}} Vedit does not have a floating-point data type, so fixed-point calculations are used here. 
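For readers unfamiliar with the technique, fixed-point arithmetic stores each reading as an integer count of thousandths, so sums and averages need only integer operations. The Python sketch below illustrates the general idea and is not a transcription of the Vedit macro; the scale of 1000 (three decimals, matching the data format) and the helper names <code>to_fixed</code>, <code>fixed_to_str</code> and <code>fixed_mean</code> are assumptions made for this example, and only non-negative values are handled.

```python
from typing import Sequence

SCALE = 1000  # assumed scale: three decimal places, as in the readings


def to_fixed(text: str) -> int:
    """Parse a non-negative decimal string such as '10.000' into thousandths."""
    whole, _, frac = text.partition(".")
    frac = (frac + "000")[:3]          # pad or truncate to three digits
    return int(whole) * SCALE + int(frac)


def fixed_to_str(value: int) -> str:
    """Format a non-negative count of thousandths as a decimal string."""
    return f"{value // SCALE}.{value % SCALE:03d}"


def fixed_mean(values: Sequence[int]) -> int:
    """Integer mean, rounded to the nearest thousandth (halves round up)."""
    n = len(values)
    return (sum(values) + n // 2) // n
```

For example, <code>fixed_to_str(fixed_mean([to_fixed("1.000"), to_fixed("2.000"), to_fixed("2.000")]))</code> gives <code>"1.667"</code> without any floating-point operation.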
<~~lang~~syntaxhighlight lang="vedit">#50 = Buf_Num // Current edit buffer (source data) File_Open("output.txt") #51 = Buf_Num // Edit buffer for output file Line 2,971 ⟶ 4,125: IT("Maximum run(s) of ") Num_Ins(#13, LEFT+NOCR) IT(" consecutive false readings ends at line starting with date(s): ") Reg_Ins(15) IN</~~lang~~syntaxhighlight> {{out\|Sample output}} <pre> Line 2,985 ⟶ 4,139: Maximum run(s) of 589 consecutive false readings ends at line starting with date(s): 1993-03-05 </pre> =={{header\|Wren}}== {{trans\|Kotlin}} {{libheader\|Wren-pattern}} {{libheader\|Wren-fmt}} <syntaxhighlight lang="wren">import "io" for File import "./pattern" for Pattern import "./fmt" for Fmt var p = Pattern.new("+1/s") var fileName = "readings.txt" var lines = File.read(fileName).trimEnd().split("\r\n") var f = "Line: $s Reject: $2d Accept: $2d Line_tot: $8.3f Line_avg: $7.3f" var grandTotal = 0 var readings = 0 var date = "" var run = 0 var maxRun = -1 var finishLine = "" for (line in lines) { var fields = p.splitAll(line) date = fields[0] if (fields.count == 49) { var accept = 0 var total = 0 var i = 1 while (i < fields.count) { if (Num.fromString(fields[i+1]) >= 1) { accept = accept + 1 total = total + Num.fromString(fields[i]) if (run > maxRun) { maxRun = run finishLine = date } run = 0 } else { run = run + 1 } i = i + 2 } grandTotal = grandTotal + total readings = readings + accept Fmt.print(f, date, 24-accept, accept, total, total/accept) } else { Fmt.print("Line: $s does not have 49 fields and has been ignored", date) } } if (run > maxRun) { maxRun = run finishLine = date } var average = grandTotal / readings Fmt.print("\nFile = $s", fileName) Fmt.print("Total = $0.3f", grandTotal) Fmt.print("Readings = $d", readings) Fmt.print("Average = $0.3f", average) Fmt.print("\nMaximum run of $d consecutive false readings", maxRun) Fmt.print("ends at line starting with date: $s", finishLine)</syntaxhighlight> {{out}} Abridged output. 
<pre> Line: 1990-01-01 Reject: 2 Accept: 22 Line_tot: 590.000 Line_avg: 26.818 Line: 1990-01-02 Reject: 0 Accept: 24 Line_tot: 410.000 Line_avg: 17.083 Line: 1990-01-03 Reject: 0 Accept: 24 Line_tot: 1415.000 Line_avg: 58.958 Line: 1990-01-04 Reject: 0 Accept: 24 Line_tot: 1800.000 Line_avg: 75.000 Line: 1990-01-05 Reject: 0 Accept: 24 Line_tot: 1130.000 Line_avg: 47.083 ... Line: 2004-12-27 Reject: 1 Accept: 23 Line_tot: 57.100 Line_avg: 2.483 Line: 2004-12-28 Reject: 1 Accept: 23 Line_tot: 77.800 Line_avg: 3.383 Line: 2004-12-29 Reject: 1 Accept: 23 Line_tot: 56.300 Line_avg: 2.448 Line: 2004-12-30 Reject: 1 Accept: 23 Line_tot: 65.300 Line_avg: 2.839 Line: 2004-12-31 Reject: 1 Accept: 23 Line_tot: 47.300 Line_avg: 2.057 File = readings.txt Total = 1358393.400 Readings = 129403 Average = 10.497 Maximum run of 589 consecutive false readings ends at line starting with date: 1993-03-05 </pre> {{omit from\|Openscad}}