CSV to HTML translation

From Rosetta Code
CSV to HTML translation is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Consider a simplified CSV format where all rows are separated by a newline and all columns are separated by commas. No commas are allowed as field data, but the data may contain other characters and character sequences that would normally be escaped when converted to HTML.

The task is to create a function that takes a string representation of the CSV data and returns a text string of an HTML table representing the CSV data. Use the following data as the CSV text to convert, and show your output.

Character,Speech
The multitude,The messiah! Show us the messiah!
Brians mother,<angry>Now you listen here! He's not the messiah; he's a very naughty boy! Now go away!</angry>
The multitude,Who are you?
Brians mother,I'm his mother; that's who!
The multitude,Behold his mother! Behold his mother!

For extra credit, optionally allow special formatting for the first row of the table as if it is the table's header row.
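As a rough illustration of the task described above, the following minimal Python sketch escapes each row and wraps its fields in table cells. The names `csv_to_html` and `header` are hypothetical and are not taken from any of the solutions below; it assumes the simplified format (no quoting, no commas inside fields) and uses the standard library's `html.escape`.

```python
from html import escape


def csv_to_html(text, header=False):
    """Convert simplified CSV (no quoting, no embedded commas) to an HTML table.

    `header` is an assumed option for the extra-credit part: when true,
    the first row is emitted with <th> cells instead of <td> cells.
    """
    out = ["<table>"]
    for i, line in enumerate(text.split("\n")):
        tag = "th" if header and i == 0 else "td"
        # Escape first, then split: escaping '&', '<' and '>' never
        # introduces a comma, so the field boundaries are unchanged.
        cells = escape(line, quote=False).split(",")
        out.append("<tr>" + "".join(f"<{tag}>{c}</{tag}>" for c in cells) + "</tr>")
    out.append("</table>")
    return "\n".join(out)


sample = "Character,Speech\nBrians mother,<angry>Now go away!</angry>"
print(csv_to_html(sample, header=True))
```

Escaping the whole row before splitting keeps the sketch short; a fuller solution might escape field by field instead.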

C

This produces a complete, bare HTML document (tidy gives 1 warning) with styles embedded but not "inlined". It does not escape characters that do not need to be escaped (provided that the declared encoding matches the encoding of the input; currently hard-coded to UTF-8). <lang c>#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BUF_LEN 12

void html_min_header(const char *enc, const char *title) {

 printf("<html><head><title>%s</title>"
        "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=%s\">"
        "<style type=\"text/css\">"
        "</style></head><body>", title, enc);

}

void html_min_footer(void) {

 printf("</body></html>");

}

void escape_html(char *o, int c) {

 static const char *specials = "<>&";
 static const char *map[] = { "&lt;", "&gt;", "&amp;"};
 ptrdiff_t pos;
 char *p;
 
 if ( (p = strchr(specials, c)) != NULL ) {
   pos = p - specials;
   if (o != NULL) strcpy(o, map[pos]);
 } else {
   o[0] = c;
   o[1] = '\0';
 }

}

void add_column(const char *type, int c) {

 char buf[BUF_LEN];
 if ( c == '\n' ) return;
 printf("<%s>", type);
 for(; c != EOF && c != '\n'; c = getchar()) {
   if (c == ',') {
     printf("</%s><%s>", type, type);
     continue;
   }
   escape_html(buf, c);
   printf("%s", buf);
 }
 printf("</%s>", type);

}

enum mode {

 FIRST = 1, NEXT

};

int main(int argc, char **argv) {

 int c;
 enum mode status = FIRST;
 html_min_header("utf-8", "CSV converted into HTML");

 printf("<table>");
 while( (c = getchar()) != EOF ) {
   printf("<tr>");
   switch(status) {
   case FIRST:
     add_column("th", c);   /* first row: header cells */
     status = NEXT;
     break;
   case NEXT:
   default:
     add_column("td", c);   /* remaining rows: data cells */
     break;
   }
   printf("</tr>");
 }
 printf("</table>");

 html_min_footer();
 return EXIT_SUCCESS;

}</lang>

Icon and Unicon

This solution for the extra credit works in both Icon and Unicon. The simple CSV is read from standard input and written to standard output. The presence or absence of "-heading" in the argument list sets the variable thead to either the procedure writes or the integer 1 (for more on this see Introduction to Icon/Unicon - Conjunction yielding different results).

<lang Icon>procedure main(arglist)
local pchar,row,thead

pchar := &letters ++ &digits ++ '!?;. ' # printable chars

write("<TABLE>")
thead := if !arglist == "-heading" then writes else 1
while row := trim(read()) do {
   thead("<THEAD>")
   writes("<TR><TD>")
   while *row > 0 do
      row ?:= ( (=",",writes("</TD><TD>")) | writes( tab(many(pchar)) | ("&#" || ord(move(1))) ), tab(0))
   write("</TD></TR>")
   thead("</THEAD>")
   thead := 1
   }
every write("</TBODY>" | "</TABLE>")
end</lang>

J

Solution (extra credit) <lang j>require 'strings tables/csv'
encodeHTML=: ('&';'&amp;';'<';'&lt;';'>';'&gt;')&stringreplace

tag=: adverb define

 'starttag endtag'=.m
 (,&.>/)"1 (starttag , ,&endtag) L:0 y

)

markupCells=:    ('<td>';'</td>') tag
markupHdrCells=: ('<th>';'</th>') tag
markupRows=:     ('<tr>';'</tr>',LF) tag
markupTable=:    (('<table>',LF);'</table>') tag

makeHTMLtablefromCSV=: verb define

 0 makeHTMLtablefromCSV y             NB. default left arg is 0 (no header row)
:
 t=. fixcsv encodeHTML y
 if. x do. t=. (markupHdrCells@{. , markupCells@}.) t
     else. t=. markupCells t
 end.
 ;markupTable markupRows t

)</lang>

For those interested, equivalent tacit versions of tag and makeHTMLtablefromCSV are: <lang j>tag=: adverb def '[: (,&.>/)"1 m&(0&{::@[ , 1&{::@[ ,~ ]) L:0@]'
makeHTMLtablefromCSV6=: 0&$: : ([: ; markupTable@markupRows@([ markupCells`(markupHdrCells@{. , markupCells@}.)@.[ fixcsv@encodeHTML))</lang>

Example <lang j>   CSVstrng=: noun define
Character,Speech
The multitude,The messiah! Show us the messiah!
Brians mother,<angry>Now you listen here! He's not the messiah; he's a very naughty boy! Now go away!</angry>
The multitude,Who are you?
Brians mother,I'm his mother; that's who!
The multitude,Behold his mother! Behold his mother!
)

   1 makeHTMLtablefromCSV CSVstrng</lang>

HTML output:

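<lang html><table>
<tr><th>Character</th><th>Speech</th></tr>
<tr><td>The multitude</td><td>The messiah! Show us the messiah!</td></tr>
<tr><td>Brians mother</td><td>&lt;angry&gt;Now you listen here! He's not the messiah; he's a very naughty boy! Now go away!&lt;/angry&gt;</td></tr>
<tr><td>The multitude</td><td>Who are you?</td></tr>
<tr><td>Brians mother</td><td>I'm his mother; that's who!</td></tr>
<tr><td>The multitude</td><td>Behold his mother! Behold his mother!</td></tr>
</table></lang>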

OCaml

<lang ocaml>let csv_data = "\
Character,Speech
The multitude,The messiah! Show us the messiah!
Brians mother,<angry>Now you listen here! He's not the messiah; \
              he's a very naughty boy! Now go away!</angry>
The multitude,Who are you?
Brians mother,I'm his mother; that's who!
The multitude,Behold his mother! Behold his mother!"

(* some utility functions *)

let string_of_char = String.make 1 ;;

let string_of_string_list = String.concat ""

let char_list_of_string str =

 let lst = ref [] in
 String.iter (fun c -> lst := c :: !lst) str;
 (List.rev !lst)

(** escape chars that need to be escaped *)
let escape str =

 let chars = char_list_of_string str in
 let rec aux acc = function
 | [] -> (List.rev acc)
 | c :: tl ->
     match c with
     | 'A'..'Z'
     | 'a'..'z'
     | '0'..'9'
     | ' ' | ';' | '!' | '?' ->
         aux ((string_of_char c)::acc) tl
     | c ->
         let esc_char = (Printf.sprintf "&#%04d;" (Char.code c)) in
         aux (esc_char::acc) tl
 in
 string_of_string_list (aux [] chars)

(* now the main part *)

let extract_csv_data ~csv_data:s =

 let len = String.length s in
 let rec aux acc_line acc i j =
   if i = len
   then
     let sub = String.sub s j (i - j) in
     List.rev ((acc_line @ [escape sub])::acc)
   else
     match s.[i] with
     | ',' ->
         let sub = String.sub s (j+1) (i - j - 1) in
         aux ((escape sub)::acc_line) acc (succ i) (succ i)
     | '\n' ->
         let sub = String.sub s j (i - j) in
         let acc_line = List.rev (escape sub::acc_line) in
         aux [] (acc_line::acc) (succ i) i
     | _ ->
         aux acc_line acc (succ i) j
 in
 aux [] [] 0 (-1)

let print_html_table segments =
  print_string "<table>\n";
  List.iter (fun line ->
    print_string "<tr>\n";
    List.iter (Printf.printf " <td>%s</td>") line;
    print_string "\n</tr>\n";
  ) segments;
  print_string "</table>\n"

let () =

 let segments = extract_csv_data ~csv_data in
 print_html_table segments</lang>

Sample html output:

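<lang html><table>
<tr>
 <td>Character</td> <td>Speech</td>
</tr>
<tr>
 <td>The multitude</td> <td>The messiah! Show us the messiah!</td>
</tr>
<tr>
 <td>Brians mother</td> <td>&#0060;angry&#0062;Now you listen here! He&#0039;s not the messiah; he&#0039;s a very naughty boy! Now go away!&#0060;&#0047;angry&#0062;</td>
</tr>
<tr>
 <td>The multitude</td> <td>Who are you?</td>
</tr>
<tr>
 <td>Brians mother</td> <td>I&#0039;m his mother; that&#0039;s who!</td>
</tr>
<tr>
 <td>The multitude</td> <td>Behold his mother! Behold his mother!</td>
</tr>
</table></lang>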

Extra credit version: <lang ocaml>let csv_data = "\
Character,Speech
The multitude,The messiah! Show us the messiah!
Brians mother,<angry>Now you listen here! He's not the messiah; \
              he's a very naughty boy! Now go away!</angry>
The multitude,Who are you?
Brians mother,I'm his mother; that's who!
The multitude,Behold his mother! Behold his mother!"

(* some utility functions *)

let string_of_char = String.make 1 ;;

let string_of_string_list = String.concat ""

let char_list_of_string str =

 let lst = ref [] in
 String.iter (fun c -> lst := c :: !lst) str;
 (List.rev !lst)

(** escape chars that need to be escaped *)
let escape str =

 let chars = char_list_of_string str in
 let rec aux acc = function
 | [] -> (List.rev acc)
 | c :: tl ->
     match c with
     | 'A'..'Z'
     | 'a'..'z'
     | '0'..'9'
     | ' ' | ';' | '!' | '?' ->
         aux ((string_of_char c)::acc) tl
     | c ->
         let esc_char = (Printf.sprintf "&#%04d;" (Char.code c)) in
         aux (esc_char::acc) tl
 in
 string_of_string_list (aux [] chars)

(* now the main part *)

let extract_csv_data ~csv_data:s =

 let len = String.length s in
 let rec aux acc_line acc i j =
   if i = len
   then
     let sub = String.sub s j (i - j) in
     List.rev ((acc_line @ [escape sub])::acc)
   else
     match s.[i] with
     | ',' ->
         let sub = String.sub s (j+1) (i - j - 1) in
         aux (escape sub::acc_line) acc (succ i) (succ i)
     | '\n' ->
         let sub = String.sub s j (i - j) in
         let acc_line = List.rev (escape sub::acc_line) in
         aux [] (acc_line::acc) (succ i) i
     | _ ->
         aux acc_line acc (succ i) j
 in
 aux [] [] 0 (-1)


let style_th = "style='color:#000; background:#FF0;'"
let style_td = "style='color:#000; background:#8FF; \
                       border:1px #000 solid; padding:0.6em;'"

let print_html_table segments =
  print_string "<table>\n";
  let print_line tag_open tag_close lines =
    List.iter (fun line ->
      Printf.printf "    %s%s%s" tag_open line tag_close;
    ) lines
  in
  begin match segments with
  | head :: body ->
      (* first row becomes the table header *)
      print_string "  <thead>\n";
      print_string "  <tr>\n";
      print_line ("<th " ^ style_th ^ ">") "</th>\n" head;
      print_string "  </tr>\n";
      print_string "  </thead>\n";
      ();
      print_string "  <tbody>\n";
      List.iter (fun line ->
        print_string "  <tr>\n";
        List.iter (Printf.printf "    <td %s>%s</td>\n" style_td) line;
        print_string "  </tr>\n";
      ) body;
      print_string "  </tbody>\n";
  | _ -> ()
  end;
  print_string "</table>\n"

let () =

 let segments = extract_csv_data ~csv_data in
 print_html_table segments</lang>

Output:

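<lang html><table>
  <thead>
  <tr>
    <th style='color:#000; background:#FF0;'>Character</th>
    <th style='color:#000; background:#FF0;'>Speech</th>
  </tr>
  </thead>
  <tbody>
  <tr>
    <td style='color:#000; background:#8FF; border:1px #000 solid; padding:0.6em;'>The multitude</td>
    <td style='color:#000; background:#8FF; border:1px #000 solid; padding:0.6em;'>The messiah! Show us the messiah!</td>
  </tr>
  <tr>
    <td style='color:#000; background:#8FF; border:1px #000 solid; padding:0.6em;'>Brians mother</td>
    <td style='color:#000; background:#8FF; border:1px #000 solid; padding:0.6em;'>&#0060;angry&#0062;Now you listen here! He&#0039;s not the messiah; he&#0039;s a very naughty boy! Now go away!&#0060;&#0047;angry&#0062;</td>
  </tr>
  <tr>
    <td style='color:#000; background:#8FF; border:1px #000 solid; padding:0.6em;'>The multitude</td>
    <td style='color:#000; background:#8FF; border:1px #000 solid; padding:0.6em;'>Who are you?</td>
  </tr>
  <tr>
    <td style='color:#000; background:#8FF; border:1px #000 solid; padding:0.6em;'>Brians mother</td>
    <td style='color:#000; background:#8FF; border:1px #000 solid; padding:0.6em;'>I&#0039;m his mother; that&#0039;s who!</td>
  </tr>
  <tr>
    <td style='color:#000; background:#8FF; border:1px #000 solid; padding:0.6em;'>The multitude</td>
    <td style='color:#000; background:#8FF; border:1px #000 solid; padding:0.6em;'>Behold his mother! Behold his mother!</td>
  </tr>
  </tbody>
</table></lang>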

Perl

Provide the CSV data as standard input. With a command-line argument, the first row will use <th> instead of <td>.

<lang perl>use HTML::Entities;

sub row {
    my $elem = shift;
    my @cells = map {"<$elem>$_</$elem>"} split ',', shift;
    print '<tr>', @cells, "</tr>\n";
}

# Read all lines, strip the newline and escape HTML entities.
my ($first, @rest) = map {my $x = $_; chomp $x; encode_entities $x} <STDIN>;
print "<table>\n";
row @ARGV ? 'th' : 'td', $first;
row 'td', $_ foreach @rest;
print "</table>\n";</lang>

Output (with a command-line argument):

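<lang html4strict><table>
<tr><th>Character</th><th>Speech</th></tr>
<tr><td>The multitude</td><td>The messiah! Show us the messiah!</td></tr>
<tr><td>Brians mother</td><td>&lt;angry&gt;Now you listen here! He&#39;s not the messiah; he&#39;s a very naughty boy! Now go away!&lt;/angry&gt;</td></tr>
<tr><td>The multitude</td><td>Who are you?</td></tr>
<tr><td>Brians mother</td><td>I&#39;m his mother; that&#39;s who!</td></tr>
<tr><td>The multitude</td><td>Behold his mother! Behold his mother!</td></tr>
</table></lang>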

PicoLisp

Simple solution

<lang PicoLisp>(load "@lib/http.l")

(in "text.csv"
   (<table> 'myStyle NIL NIL
      (prinl)
      (while (split (line) ",")
         (<row> NIL (ht:Prin (pack (car @))) (ht:Prin (pack (cadr @))))
         (prinl) ) ) )</lang>
Output:
<table class="myStyle">
<tr><td>Character</td><td>Speech</td></tr>
<tr><td>The multitude</td><td>The messiah! Show us the messiah!</td></tr>
<tr><td>Brians mother</td><td>&lt;angry&gt;Now you listen here! He's not the messiah; he's a very naughty boy! Now go away!&lt;/angry&gt;</td></tr>
<tr><td>The multitude</td><td>Who are you?</td></tr>
<tr><td>Brians mother</td><td>I'm his mother; that's who!</td></tr>
<tr><td>The multitude</td><td>Behold his mother! Behold his mother!</td></tr>
</table>

Extra credit solution

<lang PicoLisp>(load "@lib/http.l")

(in "text.csv"
   (when (split (line) ",")
      (<table> 'myStyle NIL (mapcar '((S) (list NIL (pack S))) @)
         (prinl)
         (while (split (line) ",")
            (<row> NIL (ht:Prin (pack (car @))) (ht:Prin (pack (cadr @))))
            (prinl) ) ) ) )</lang>
Output:
<table class="myStyle"><tr><th>Character</th><th>Speech</th></tr>
<tr><td>The multitude</td><td>The messiah! Show us the messiah!</td></tr>
<tr><td>Brians mother</td><td>&lt;angry&gt;Now you listen here! He's not the messiah; he's a very naughty boy! Now go away!&lt;/angry&gt;</td></tr>
<tr><td>The multitude</td><td>Who are you?</td></tr>
<tr><td>Brians mother</td><td>I'm his mother; that's who!</td></tr>
<tr><td>The multitude</td><td>Behold his mother! Behold his mother!</td></tr>
</table>

Python

(Note: rendered versions of both outputs are shown at the foot of this section).

Simple solution

<lang python>csvtxt = '''\
Character,Speech
The multitude,The messiah! Show us the messiah!
Brians mother,<angry>Now you listen here! He's not the messiah; he's a very naughty boy! Now go away!</angry>
The multitude,Who are you?
Brians mother,I'm his mother; that's who!
The multitude,Behold his mother! Behold his mother!\
'''

# cgi.escape is gone from current Python; html.escape with quote=False
# escapes the same characters (&, < and >).
from html import escape

def _row2tr(row, attr=None):
    cols = escape(row, quote=False).split(',')
    return ('<TR>'
            + ''.join('<TD>%s</TD>' % data for data in cols)
            + '</TR>')

def csv2html(txt):
    htmltxt = '<TABLE>\n'
    for rownum, row in enumerate(txt.split('\n')):
        htmlrow = _row2tr(row)
        htmlrow = '  <TBODY>%s</TBODY>\n' % htmlrow
        htmltxt += htmlrow
    htmltxt += '</TABLE>\n'
    return htmltxt

htmltxt = csv2html(csvtxt)
print(htmltxt)</lang>

Sample HTML output

<lang html><TABLE>
  <TBODY><TR><TD>Character</TD><TD>Speech</TD></TR></TBODY>
  <TBODY><TR><TD>The multitude</TD><TD>The messiah! Show us the messiah!</TD></TR></TBODY>
  <TBODY><TR><TD>Brians mother</TD><TD>&lt;angry&gt;Now you listen here! He's not the messiah; he's a very naughty boy! Now go away!&lt;/angry&gt;</TD></TR></TBODY>
  <TBODY><TR><TD>The multitude</TD><TD>Who are you?</TD></TR></TBODY>
  <TBODY><TR><TD>Brians mother</TD><TD>I'm his mother; that's who!</TD></TR></TBODY>
  <TBODY><TR><TD>The multitude</TD><TD>Behold his mother! Behold his mother!</TD></TR></TBODY>
</TABLE></lang>

Extra credit solution

<lang python>def _row2trextra(row, attr=None):
    cols = escape(row, quote=False).split(',')
    attr_tr = attr.get('TR', '')
    attr_td = attr.get('TD', '')
    return (('<TR%s>' % attr_tr)
            + ''.join('<TD%s>%s</TD>' % (attr_td, data) for data in cols)
            + '</TR>')

def csv2htmlextra(txt, header=True, attr=None):
    ' attr is a dictionary mapping tags to attributes to add to that tag'
    attr_table = attr.get('TABLE', '')
    attr_thead = attr.get('THEAD', '')
    attr_tbody = attr.get('TBODY', '')
    htmltxt = '<TABLE%s>\n' % attr_table
    for rownum, row in enumerate(txt.split('\n')):
        htmlrow = _row2trextra(row, attr)
        rowclass = ('THEAD%s' % attr_thead) if (header and rownum == 0) else ('TBODY%s' % attr_tbody)
        # rowclass[:5] drops the attributes again for the closing tag
        htmlrow = '  <%s>%s</%s>\n' % (rowclass, htmlrow, rowclass[:5])
        htmltxt += htmlrow
    htmltxt += '</TABLE>\n'
    return htmltxt

htmltxt = csv2htmlextra(csvtxt, True,
                        dict(TABLE=' border="1" summary="csv2html extra program output"',
                             THEAD=' bgcolor="yellow"',
                             TBODY=' bgcolor="orange"'
                             )
                        )

print(htmltxt)</lang>

Sample HTML output

<lang html><TABLE border="1" summary="csv2html extra program output">
  <THEAD bgcolor="yellow"><TR><TD>Character</TD><TD>Speech</TD></TR></THEAD>
  <TBODY bgcolor="orange"><TR><TD>The multitude</TD><TD>The messiah! Show us the messiah!</TD></TR></TBODY>
  <TBODY bgcolor="orange"><TR><TD>Brians mother</TD><TD>&lt;angry&gt;Now you listen here! He's not the messiah; he's a very naughty boy! Now go away!&lt;/angry&gt;</TD></TR></TBODY>
  <TBODY bgcolor="orange"><TR><TD>The multitude</TD><TD>Who are you?</TD></TR></TBODY>
  <TBODY bgcolor="orange"><TR><TD>Brians mother</TD><TD>I'm his mother; that's who!</TD></TR></TBODY>
  <TBODY bgcolor="orange"><TR><TD>The multitude</TD><TD>Behold his mother! Behold his mother!</TD></TR></TBODY>
</TABLE></lang>

HTML rendered in firefox browser

Tcl

Library: Tcllib (Package: csv)
Library: Tcllib (Package: html)
Library: Tcllib (Package: struct::queue)

<lang tcl>package require Tcl 8.5
package require csv
package require html
package require struct::queue

set csvData "Character,Speech
The multitude,The messiah! Show us the messiah!
Brians mother,<angry>Now you listen here! He's not the messiah; he's a very naughty boy! Now go away!</angry>
The multitude,Who are you?
Brians mother,I'm his mother; that's who!
The multitude,Behold his mother! Behold his mother!"

struct::queue rows
foreach line [split $csvData "\n"] {
    csv::split2queue rows $line
}
html::init
puts [subst {
    [html::openTag table {summary="csv2html program output"}]
    [html::while {[rows size]} {
        [html::row {*}[html::quoteFormValue [rows get]]]
    }]
    [html::closeTag]
}]</lang>

Extra credit version: <lang tcl>package require Tcl 8.5
package require csv
package require html
package require struct::queue

set csvData "Character,Speech
The multitude,The messiah! Show us the messiah!
Brians mother,<angry>Now you listen here! He's not the messiah; he's a very naughty boy! Now go away!</angry>
The multitude,Who are you?
Brians mother,I'm his mother; that's who!
The multitude,Behold his mother! Behold his mother!"

html::init {
    table.border 1
    table.summary "csv2html program output"
    tr.bgcolor orange
}

# Helpers; the html package is a little primitive otherwise
proc table {contents {opts ""}} {
    set out [html::openTag table $opts]
    append out [uplevel 1 [list subst $contents]]
    append out [html::closeTag]
}
proc tr {list {ropt ""}} {
    set out [html::openTag tr $ropt]
    foreach x $list {append out [html::cell "" $x td]}
    append out [html::closeTag]
}

# Parse the CSV data
struct::queue rows
foreach line [split $csvData "\n"] {
    csv::split2queue rows $line
}

# Generate the output
puts [subst {
    [table {
        [tr [html::quoteFormValue [rows get]] {bgcolor="yellow"}]
        [html::while {[rows size]} {
            [tr [html::quoteFormValue [rows get]]]
        }]
    }]
}]</lang>
Output:

<lang html4strict>

CharacterSpeech
The multitudeThe messiah! Show us the messiah!
Brians mother<angry>Now you listen here! He's not the messiah; he's a very naughty boy! Now go away! </angry>
The multitudeWho are you?
Brians motherI'm his mother; that's who!
The multitudeBehold his mother! Behold his mother!

</lang>