Convert CSV records to TSV: Difference between revisions

→‎{{header|J}}: include some crude documentation
(J draft)
(→‎{{header|J}}: include some crude documentation)
Line 115:
Implementation:
<syntaxhighlight lang=J>tokenize=: (0;(0 10#:10*do@>cutLF{{)n
1.1 1.1 1.1 1.1 1.1 NB. 0 start here
1.2 2.0 3.2 1.2 1.0 NB. 1 , or CR or LF starts a new "token"
2 4 2 2 2 NB. 2 quote toggles "quoted field mode"
1.2 2.0 1.0 2.2 1.0 NB. 3 CR,LF: 1 "token" CR,CR and CR,',': 2 "tokens"
1.2 2.0 3.2 1.2 5.3 NB. 4 closing quote must be followed by a delimiter
5 5 1.1 1.1 5 NB. 5 resync on newline after encountering nonsense
}});(;/',"',LF,CR);0 _1 0 _1) ;: LF,]
NB. , " CR LF ...
Line 153:
 
Here, too, we treat as "nonsense" any text that begins with a closing quote which is not followed by a delimiter.
 
For CSV parsing we first break out fields using [[j:Vocabulary/semico#dyadic|;:]], so that each field is preceded by its delimiter. (We discard an optional trailing newline from the CSV text and prepend a newline at the beginning, so that every field, including the first, has a preceding delimiter. If we were given a file reference, we work with the text of the file rather than its name.)
 
Then these fields are formed into rows (a field which begins with a newline starts a new row), each field is stripped of its leading delimiter, and non-textual quotes are removed.
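
For readers who do not read J, the parsing steps described above can be sketched roughly in Python. This is not the J implementation: the names <code>tokenize</code>, <code>unquote</code> and <code>rows</code> are hypothetical, and the sketch assumes well-formed input with LF-only line endings, so it omits the CR handling and the "nonsense" resynchronization of the state table.

```python
def tokenize(text):
    """Split CSV text into fields, each still prefixed by its delimiter."""
    if text.endswith("\n"):
        text = text[:-1]            # discard one optional trailing newline
    text = "\n" + text              # give the first field a delimiter too
    fields, start, quoted = [], 0, False
    for i, ch in enumerate(text):
        if ch == '"':
            quoted = not quoted     # a quote toggles quoted-field mode
        elif ch in ",\n" and not quoted and i > 0:
            fields.append(text[start:i])   # close the previous field
            start = i                      # the delimiter opens the next one
    fields.append(text[start:])
    return fields


def unquote(field):
    """Drop the leading delimiter and any non-textual quotes."""
    body = field[1:]
    if len(body) >= 2 and body.startswith('"') and body.endswith('"'):
        # enclosing quotes are removed, doubled quotes collapse to one
        body = body[1:-1].replace('""', '"')
    return body


def rows(text):
    """Group fields into rows; a field that begins with a newline starts a row."""
    out = []
    for field in tokenize(text):
        if field.startswith("\n"):
            out.append([])
        out[-1].append(unquote(field))
    return out
```

Keeping the delimiter attached to each field is what makes the row-forming step a simple test on the field's first character.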
 
To translate to TSV form, we first escape the special characters in each field, then insert a delimiter between adjacent fields and terminate each record with a newline.
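
As an illustration only, the TSV step might look like the following Python sketch. The names <code>escape</code> and <code>to_tsv</code> are hypothetical, and the set of escapes (backslash, tab, LF, CR) is an assumption about which characters count as special here.

```python
def escape(field):
    """Escape characters that would otherwise break TSV structure."""
    # Backslash goes first so the backslashes added below are not doubled.
    return (field.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def to_tsv(rows):
    """Join escaped fields with tabs and end every record with a newline."""
    return "".join("\t".join(escape(f) for f in row) + "\n" for row in rows)
```

For example, <code>to_tsv([["a", "b c"], ["d"]])</code> yields two records, each terminated by a newline, with a single tab between <code>a</code> and <code>b c</code>.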
 
Task example: