Talk:Strip block comments

 
Do we have a (constant) definition of the delimiters to use or are they parameters to the stripping function? This is important because it leads to quite different solutions… –[[User:Dkf|Donal Fellows]] 09:56, 3 November 2010 (UTC)
 
== J draft ==
 
I have not seen any progress on this task, so I threw together a quick example in which the comment delimiters are hardcoded (which, in my opinion, is a good design decision for this task).
 
That said, the code uses a state machine, so it probably deserves some explanation.
 
First, here is the version of the code I am commenting on (the main page may well be updated with a different version):
 
<lang j>str=:#~1 0 _1*./@:(|."0 1)2>4{"1(5;(0,"0~".;._2]0 :0);'/*'i.a.)&;:
1 0 0
0 2 0
2 3 2
0 2 2
)</lang>
 
The core of this code is a state machine processing a sequence of characters. It first classifies characters into three classes: '/', '*' and everything else. These character classes correspond to the three columns of numbers you see there. ('/' corresponds to the left column, '*' corresponds to the middle column, and everything else corresponds to the right column.)
 
The rows of numbers correspond to states. State 0 corresponds to the top row, state 1 corresponds to the next row, ..., state 3 corresponds to the bottom row.
 
The previous character's state (which is initially 0) selects a row and the current character's class selects a column; the number found there is the current character's state. (Those are the numeric values you see arranged in that four-row table.)
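
For anyone who does not read J, here is a rough Python sketch of that table lookup. It is only a transliteration of the description above, not the J code itself: the names <code>TABLE</code>, <code>char_class</code> and <code>trace_states</code> are made up for illustration, and the sample string is my own.

```python
# Hypothetical transliteration of the four-row table from the J draft.
# Rows are states 0..3; columns are the classes '/', '*', everything else.
TABLE = [
    [1, 0, 0],  # state 0
    [0, 2, 0],  # state 1
    [2, 3, 2],  # state 2
    [0, 2, 2],  # state 3
]


def char_class(ch):
    """Return 0 for '/', 1 for '*', and 2 for any other character."""
    return {"/": 0, "*": 1}.get(ch, 2)


def trace_states(text):
    """Return the state assigned to each character, starting from state 0."""
    state = 0
    states = []
    for ch in text:
        # The previous state picks the row, the character class picks the column.
        state = TABLE[state][char_class(ch)]
        states.append(state)
    return states


print(trace_states("a/*b*/c"))  # [0, 1, 2, 2, 3, 0, 0]
```

In this trace the opening '/' gets state 1, the comment body gets states 2 and 3, and the closing '/' drops back to state 0.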
 
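We then keep only the characters in the original text which have a state less than 2 and whose neighbors on both sides also have a state less than 2. (States 2 and 3 only occur inside a comment. The neighbor test is what discards the '/' that opens a comment and the '/' that closes it, since each of those has a state below 2 itself.)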
 
And we throw out everything else.
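
The selection step can be sketched in Python as well. Again this is only an illustration under some assumptions: <code>keep_outside_comments</code> is a made-up name, the list of states for the sample string was worked out by hand from the table, and the wrap-around at both ends of the text is my reading of the circular rotate (<code>|.</code>) used in the J draft.

```python
def keep_outside_comments(text, states):
    """Keep characters whose own state and both neighbors' states are below 2.

    Neighbors wrap around the ends of the text, which is assumed to mirror
    the circular rotate in the J draft.
    """
    ok = [s < 2 for s in states]
    n = len(text)
    kept = []
    for i in range(n):
        # ok[i - 1] is the last element when i == 0; (i + 1) % n wraps at the end.
        if ok[i - 1] and ok[i] and ok[(i + 1) % n]:
            kept.append(text[i])
    return "".join(kept)


sample = "a/*b*/c"
sample_states = [0, 1, 2, 2, 3, 0, 0]  # derived by hand from the table
print(keep_outside_comments(sample, sample_states))  # ac
```

One consequence of the wrap-around, if that reading is right, is that the first and last characters of the text count as neighbors of each other.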