Text processing/1: Difference between revisions

(→‎Process whole file at once: Add example output)
(→‎Process file in chunks: add commentary)
Line 648:
<lang j>
'Dates DailySumry Flags'=: mungeDataBlocks jpath '~temp/readings.txt'
Line: Accept: Line_tot: Line_avg:
2004-12-28 23 77.800 3.383
2004-12-29 23 56.300 2.448
2004-12-30 23 65.300 2.839
2004-12-31 23 47.300 2.057
 
Total: 1358393.400
Readings: 129403
Average: 10.497
 
Maximum run(s) of 589 consecutive false readings ends at line(s) starting with date(s): 1993-03-05
 
$ each Dates;DailySumry;Flags NB. show array shapes (rows cols) of each of the nouns defined
+-------+------+-------+
|5471 10|5471 3|5471 24|
+-------+------+-------+
</lang>
 
==== Process lines at a time ====
The <tt>fapplylines</tt> adverb reads the file in chunks of up to 1,000,000 bytes, retaining any trailing partial line in a buffer. It processes the complete lines one at a time before reading the next chunk from the file.
 
Because this approach results in more operations on smaller arrays, it is likely to be slower in J than the preceding methods.
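
The chunk-and-buffer strategy described above can be sketched outside J to make the control flow explicit. The following Python is only an illustration of the general technique, not the source of <tt>fapplylines</tt>; the names <tt>apply_lines</tt>, <tt>process_line</tt> and <tt>chunk_size</tt> are hypothetical, and the sketch counts characters rather than bytes.

```python
import io


def apply_lines(process_line, stream, chunk_size=1_000_000):
    """Call process_line on every line of stream, reading chunk_size characters at a time.

    Illustrative sketch only: names and defaults are assumptions, not J's API.
    """
    buffer = ""  # holds the trailing partial line carried over between chunks
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last newline is a complete line;
        # the final piece (possibly empty) is kept for the next chunk.
        *lines, buffer = buffer.split("\n")
        for line in lines:
            process_line(line)
    if buffer:
        # The file did not end with a newline: process the last line too.
        process_line(buffer)


if __name__ == "__main__":
    collected = []
    sample = io.StringIO("line one\nline two\nline three")
    # A tiny chunk size forces lines to straddle chunk boundaries.
    apply_lines(collected.append, sample, chunk_size=5)
    print(collected)
```

Each line triggers a separate call to the processing function, which is the per-line overhead the performance note above refers to.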
 
Example using the <tt>fapplylines</tt> adverb to process a line at a time.
<lang j>
processLine=: monad define
Line 687 ⟶ 702:
Example usage
<lang j>
mungeDataLines jpath '~temp/readings.txt'
NB. add output as for previous example
$ each Dates;DailySumry;MaxRuns
NB. add output as for previous example
</lang>
 