Strip block comments
You are encouraged to solve this task according to the task description, using any language you may know.
A block comment begins with a beginning delimiter and ends with an ending delimiter; the comment includes both delimiters. These delimiters are often multi-character sequences.
Task: Strip block comments from program text (of a programming language much like classic C). Your demos should at least handle simple, non-nested and multiline block comment delimiters. The beginning delimiter is the two-character sequence “/*” and the ending delimiter is “*/”.
Sample text for stripping:
  /**
   * Some comments
   * longer comments here that we can parse.
   *
   * Rahoo 
   */
   function subroutine() {
    a = /* inline comment */ b + c ;
   }
   /*/ <-- tricky comments */

   /**
    * Another comment.
    */
    function something() {
    }
Extra credit: Ensure that the stripping code is not hard-coded to the particular delimiters described above, but instead allows the caller to specify them. (If your language supports them, optional parameters may be useful for this.)
Cf.: Strip comments from a string
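The task, including the extra credit, can be sketched in a few lines of Python. This is only an illustration of the requirements, not a reference solution: the function name `strip_block_comments` and its parameter names are made up for this example, and it assumes that an unterminated comment runs to the end of the text.

```python
def strip_block_comments(text, start="/*", end="*/"):
    """Remove non-nested block comments delimited by start and end."""
    out = []
    pos = 0
    while True:
        begin = text.find(start, pos)
        if begin == -1:
            # No further comment: keep the rest of the text.
            out.append(text[pos:])
            break
        out.append(text[pos:begin])
        # Search for the closing delimiter only after the opening one,
        # so that a sequence such as "/*/" does not close itself.
        close = text.find(end, begin + len(start))
        if close == -1:
            # Assumption: an unterminated comment swallows the rest.
            break
        pos = close + len(end)
    return "".join(out)
```

The delimiters are optional parameters, so `strip_block_comments(source)` handles the C-style case and `strip_block_comments(source, "<!--", "-->")` handles another pair.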
Ada
strip.adb: <lang Ada>with Ada.Strings.Fixed; with Ada.Strings.Unbounded; with Ada.Text_IO; with Ada.Command_Line;
procedure Strip is
use Ada.Strings.Unbounded; procedure Print_Usage is begin Ada.Text_IO.Put_Line ("Usage:"); Ada.Text_IO.New_Line; Ada.Text_IO.Put_Line (" strip <file> [<opening> [<closing>]]"); Ada.Text_IO.New_Line; Ada.Text_IO.Put_Line (" file: file to strip"); Ada.Text_IO.Put_Line (" opening: string for opening comment"); Ada.Text_IO.Put_Line (" closing: string for closing comment"); Ada.Text_IO.New_Line; end Print_Usage;
Opening_Pattern : Unbounded_String := To_Unbounded_String ("/*"); Closing_Pattern : Unbounded_String := To_Unbounded_String ("*/"); Inside_Comment : Boolean := False;
function Strip_Comments (From : String) return String is use Ada.Strings.Fixed; Opening_Index : Natural; Closing_Index : Natural; Start_Index : Natural := From'First; begin if Inside_Comment then Start_Index := Index (Source => From, Pattern => To_String (Closing_Pattern)); if Start_Index < From'First then return ""; end if; Inside_Comment := False; Start_Index := Start_Index + Length (Closing_Pattern); end if; Opening_Index := Index (Source => From, Pattern => To_String (Opening_Pattern), From => Start_Index); if Opening_Index < From'First then return From (Start_Index .. From'Last); else Closing_Index := Index (Source => From, Pattern => To_String (Closing_Pattern), From => Opening_Index + Length (Opening_Pattern)); if Closing_Index > 0 then return From (Start_Index .. Opening_Index - 1) & Strip_Comments (From ( Closing_Index + Length (Closing_Pattern) .. From'Last)); else Inside_Comment := True; return From (Start_Index .. Opening_Index - 1); end if; end if; end Strip_Comments;
File : Ada.Text_IO.File_Type;
begin
if Ada.Command_Line.Argument_Count < 1 or else Ada.Command_Line.Argument_Count > 3 then Print_Usage; return; end if; if Ada.Command_Line.Argument_Count > 1 then Opening_Pattern := To_Unbounded_String (Ada.Command_Line.Argument (2)); if Ada.Command_Line.Argument_Count > 2 then Closing_Pattern := To_Unbounded_String (Ada.Command_Line.Argument (3)); else Closing_Pattern := Opening_Pattern; end if; end if; Ada.Text_IO.Open (File => File, Mode => Ada.Text_IO.In_File, Name => Ada.Command_Line.Argument (1)); while not Ada.Text_IO.End_Of_File (File => File) loop declare Line : constant String := Ada.Text_IO.Get_Line (File); begin Ada.Text_IO.Put_Line (Strip_Comments (Line)); end; end loop; Ada.Text_IO.Close (File => File);
end Strip;</lang> output:
function subroutine() { a = b + c ; } function something() { }
AutoHotkey
<lang AutoHotkey>code = (
/** * Some comments * longer comments here that we can parse. * * Rahoo */ function subroutine() { a = /* inline comment */ b + c ; } /*/ <-- tricky comments */
/** * Another comment. */ function something() { }
)
;Open-Close Comment delimiters
openC:="/*"
closeC:="*/"
;Make it "Regex-Safe"
openC:=RegExReplace(openC,"(\*|\^|\?|\\|\+|\.|\!|\{|\}|\[|\]|\$|\|)","\$0")
closeC:=RegExReplace(closeC,"(\*|\^|\?|\\|\+|\.|\!|\{|\}|\[|\]|\$|\|)","\$0")
;Display final result
MsgBox % sCode := RegExReplace(code,"s)(" . openC . ").*?(" . closeC . ")")</lang>
function subroutine() { a = b + c ; } function something() { }
BBC BASIC
<lang bbcbasic> infile$ = "C:\sample.c"
outfile$ = "C:\stripped.c" PROCstripblockcomments(infile$, outfile$, "/*", "*/") END DEF PROCstripblockcomments(infile$, outfile$, start$, finish$) LOCAL infile%, outfile%, comment%, test%, A$ infile% = OPENIN(infile$) IF infile%=0 ERROR 100, "Could not open input file" outfile% = OPENOUT(outfile$) IF outfile%=0 ERROR 100, "Could not open output file" WHILE NOT EOF#infile% A$ = GET$#infile% TO 10 REPEAT IF comment% THEN test% = INSTR(A$, finish$) IF test% THEN A$ = MID$(A$, test% + LEN(finish$)) comment% = FALSE ENDIF ELSE test% = INSTR(A$, start$) IF test% THEN BPUT#outfile%, LEFT$(A$, test%-1); A$ = MID$(A$, test% + LEN(start$)) comment% = TRUE ENDIF ENDIF UNTIL test%=0 IF NOT comment% BPUT#outfile%, A$ ENDWHILE CLOSE #infile% CLOSE #outfile% ENDPROC</lang>
Output file:
function subroutine() { a = b + c ; } function something() { }
C
<lang C>#include <stdio.h>
#include <string.h>
#include <stdlib.h>
const char *ca = "/*", *cb = "*/"; int al = 2, bl = 2;
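char *loadfile(const char *fn) {
    FILE *f = fopen(fn, "rb");
    long l;
    char *s = NULL;    /* stays NULL if the file cannot be read */

    if (f != NULL) {
        fseek(f, 0, SEEK_END);
        l = ftell(f);
        rewind(f);
        if (l >= 0)
            s = malloc(l + 1);
        if (s) {
            l = (long)fread(s, 1, l, f);
            s[l] = '\0';    /* terminate so the string functions can be used */
        }
        fclose(f);
    }
    return s;
}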
void stripcomments(char *s) {
    char *a, *b;

    while ((a = strstr(s, ca)) != NULL) {
        b = strstr(a + al, cb);
        if (b == NULL)
            break;
        b += bl;
        /* shift the remaining text, including the terminating NUL */
        memmove(a, b, strlen(b) + 1);
    }
}
int main(int argc, char **argv) {
    const char *fn = "input.txt";
    char *s;

    if (argc >= 2)
        fn = argv[1];
    s = loadfile(fn);
    if (s == NULL) {
        fprintf(stderr, "Could not read %s\n", fn);
        return 1;
    }
    if (argc == 4) {
        al = strlen(ca = argv[2]);
        bl = strlen(cb = argv[3]);
    }
    stripcomments(s);
    puts(s);
    free(s);
    return 0;
}</lang>
- Usage
Specify an input file via the first command line argument, and optionally specify comment opening and closing delimiters with the next two args, or defaults of /* and */ are assumed.
- Output
function subroutine() { a = b + c ; } function something() { }
C++
<lang cpp>#include <string>
#include <iostream>
#include <iterator>
#include <fstream>
#include <boost/regex.hpp>

int main( ) {
   std::ifstream codeFile( "samplecode.txt" ) ;
   if ( codeFile ) {
      boost::regex commentre( "/\\*.*?\\*/" ) ; //comment start and end, and as few characters in between as possible
      std::string my_erase( "" ) ;              //erase them
      std::string stripped ;
      std::string code( (std::istreambuf_iterator<char>( codeFile ) ) ,
                        std::istreambuf_iterator<char>( ) ) ;
      codeFile.close( ) ;
      stripped = boost::regex_replace( code , commentre , my_erase ) ;
      std::cout << "Code unstripped:\n" << stripped << std::endl ;
      return 0 ;
   }
   else {
      std::cout << "Could not find code file!" << std::endl ;
      return 1 ;
   }
}</lang> Output:
Code unstripped: function subroutine() { a = b + c ; } function something() { }
C#
<lang Csharp>using System;

class Program
{
    private static string BlockCommentStrip(string commentStart, string commentEnd, string sampleText)
    {
        while (sampleText.IndexOf(commentStart) > -1 && sampleText.IndexOf(commentEnd, sampleText.IndexOf(commentStart) + commentStart.Length) > -1)
        {
            int start = sampleText.IndexOf(commentStart);
            int end = sampleText.IndexOf(commentEnd, start + commentStart.Length);
            sampleText = sampleText.Remove(
                start,
                (end + commentEnd.Length) - start
                );
        }
        return sampleText;
    }
}</lang>
Clojure
<lang Clojure>(defn comment-strip [txt & args]
  (let [args (conj {:delim ["/*" "*/"]} (apply hash-map args)) ; This is the standard way of doing keyword/optional arguments in Clojure
        [opener closer] (:delim args)]
    (loop [out "", txt txt, delim-count 0] ; delim-count is needed to handle nested comments
      (let [[hdtxt resttxt] (split-at (count opener) txt)] ; This splits "/* blah blah */" into hdtxt="/*" and restxt="blah blah */"
        (printf "hdtxt=%8s resttxt=%8s out=%8s txt=%16s delim-count=%s\n" (apply str hdtxt) (apply str resttxt) out (apply str txt) delim-count)
        (cond
         (empty? hdtxt)    (str out (apply str txt))
         (= (apply str hdtxt) opener) (recur out resttxt (inc delim-count))
         (= (apply str hdtxt) closer) (recur out resttxt (dec delim-count))
         (= delim-count 0) (recur (str out (first txt)) (rest txt) delim-count)
         true (recur out (rest txt) delim-count))))))</lang>
user> (comment-strip "This /* is */ some /* /* /* */ funny */ */ text") hdtxt= Th resttxt=is /* is */ some /* /* /* */ funny */ */ text out= txt=This /* is */ some /* /* /* */ funny */ */ text delim-count=0 hdtxt= hi resttxt=s /* is */ some /* /* /* */ funny */ */ text out= T txt=his /* is */ some /* /* /* */ funny */ */ text delim-count=0 hdtxt= is resttxt= /* is */ some /* /* /* */ funny */ */ text out= Th txt=is /* is */ some /* /* /* */ funny */ */ text delim-count=0 hdtxt= s resttxt=/* is */ some /* /* /* */ funny */ */ text out= Thi txt=s /* is */ some /* /* /* */ funny */ */ text delim-count=0 hdtxt= / resttxt=* is */ some /* /* /* */ funny */ */ text out= This txt= /* is */ some /* /* /* */ funny */ */ text delim-count=0 hdtxt= /* resttxt= is */ some /* /* /* */ funny */ */ text out= This txt=/* is */ some /* /* /* */ funny */ */ text delim-count=0 hdtxt= i resttxt=s */ some /* /* /* */ funny */ */ text out= This txt= is */ some /* /* /* */ funny */ */ text delim-count=1 hdtxt= is resttxt= */ some /* /* /* */ funny */ */ text out= This txt=is */ some /* /* /* */ funny */ */ text delim-count=1 hdtxt= s resttxt=*/ some /* /* /* */ funny */ */ text out= This txt=s */ some /* /* /* */ funny */ */ text delim-count=1 hdtxt= * resttxt=/ some /* /* /* */ funny */ */ text out= This txt= */ some /* /* /* */ funny */ */ text delim-count=1 hdtxt= */ resttxt= some /* /* /* */ funny */ */ text out= This txt=*/ some /* /* /* */ funny */ */ text delim-count=1 hdtxt= s resttxt=ome /* /* /* */ funny */ */ text out= This txt= some /* /* /* */ funny */ */ text delim-count=0 hdtxt= so resttxt=me /* /* /* */ funny */ */ text out= This txt=some /* /* /* */ funny */ */ text delim-count=0 hdtxt= om resttxt=e /* /* /* */ funny */ */ text out= This s txt=ome /* /* /* */ funny */ */ text delim-count=0 hdtxt= me resttxt= /* /* /* */ funny */ */ text out=This so txt=me /* /* /* */ funny */ */ text delim-count=0 hdtxt= e resttxt=/* /* /* */ funny */ */ text out=This som txt=e /* /* /* */ 
funny */ */ text delim-count=0 hdtxt= / resttxt=* /* /* */ funny */ */ text out=This some txt= /* /* /* */ funny */ */ text delim-count=0 hdtxt= /* resttxt= /* /* */ funny */ */ text out=This some txt=/* /* /* */ funny */ */ text delim-count=0 hdtxt= / resttxt=* /* */ funny */ */ text out=This some txt= /* /* */ funny */ */ text delim-count=1 hdtxt= /* resttxt= /* */ funny */ */ text out=This some txt=/* /* */ funny */ */ text delim-count=1 hdtxt= / resttxt=* */ funny */ */ text out=This some txt= /* */ funny */ */ text delim-count=2 hdtxt= /* resttxt= */ funny */ */ text out=This some txt=/* */ funny */ */ text delim-count=2 hdtxt= * resttxt=/ funny */ */ text out=This some txt= */ funny */ */ text delim-count=3 hdtxt= */ resttxt= funny */ */ text out=This some txt=*/ funny */ */ text delim-count=3 hdtxt= f resttxt=unny */ */ text out=This some txt= funny */ */ text delim-count=2 hdtxt= fu resttxt=nny */ */ text out=This some txt=funny */ */ text delim-count=2 hdtxt= un resttxt=ny */ */ text out=This some txt= unny */ */ text delim-count=2 hdtxt= nn resttxt=y */ */ text out=This some txt= nny */ */ text delim-count=2 hdtxt= ny resttxt= */ */ text out=This some txt= ny */ */ text delim-count=2 hdtxt= y resttxt=*/ */ text out=This some txt= y */ */ text delim-count=2 hdtxt= * resttxt=/ */ text out=This some txt= */ */ text delim-count=2 hdtxt= */ resttxt= */ text out=This some txt= */ */ text delim-count=2 hdtxt= * resttxt= / text out=This some txt= */ text delim-count=1 hdtxt= */ resttxt= text out=This some txt= */ text delim-count=1 hdtxt= t resttxt= ext out=This some txt= text delim-count=0 hdtxt= te resttxt= xt out=This some txt= text delim-count=0 hdtxt= ex resttxt= t out=This some t txt= ext delim-count=0 hdtxt= xt resttxt= out=This some te txt= xt delim-count=0 hdtxt= t resttxt= out=This some tex txt= t delim-count=0 hdtxt= resttxt= out=This some text txt= delim-count=0 "This some text"
D
<lang d>import std.algorithm, std.regex;
string[2] separateComments(in string txt,
                           in string cpat0, in string cpat1) {
    int[2] plen; // to handle /*/
    int i, j; // cursors
    bool inside; // is inside comment?

    // pre-compute regex here if desired
    //auto r0 = regex(cpat0);
    //auto r1 = regex(cpat1);
    //enum rct = ctRegex!(r"\n|\r");

    bool advCursor() {
        auto mo = match(txt[i .. $], inside ? cpat1 : cpat0);
        if (mo.empty)
            return false;
        plen[inside] = max(0, plen[inside], mo.front[0].length);
        j = i + mo.pre.length; // got comment head
        if (inside)
            j += mo.front[0].length; // or comment tail

        // special adjust for \n\r
        if (!match(mo.front[0], r"\n|\r").empty)
            j--;
        return true;
    }

    string[2] result;
    while (true) {
        if (!advCursor())
            break;
        result[inside] ~= txt[i .. j]; // save slice of result

        // handle /*/ pattern
        if (inside && (j - i < plen[0] + plen[1])) {
            i = j;
            if (!advCursor())
                break;
            result[inside] ~= txt[i .. j]; // save result again
        }

        i = j; // advance cursor
        inside = !inside; // toggle search type
    }

    if (inside)
        throw new Exception("Mismatched Comment");
    result[inside] ~= txt[i .. $]; // save rest(non-comment)
    return result;
}
void main() {
import std.stdio;

    static void showResults(in string e, in string[2] pair) {
        writeln("===Original text:\n", e);
        writeln("\n\n===Text without comments:\n", pair[0]);
        writeln("\n\n===The stripped comments:\n", pair[1]);
    }

    // First example ------------------------------
    immutable ex1 = `  /**
   * Some comments
   * longer comments here that we can parse.
   *
   * Rahoo
   */
   function subroutine() {
    a = /* inline comment */ b + c ;
   }
   /*/ <-- tricky comments */

   /**
    * Another comment.
    */
    function something() {
    }`;

    showResults(ex1, separateComments(ex1, `/\*`, `\*/`));

    // Second example ------------------------------
    writeln("\n");
    immutable ex2 = "apples, pears # and bananas
apples, pears; and bananas ";  // test for line comment

    showResults(ex2, separateComments(ex2, `#|;`, `[\n\r]|$`));
}</lang>
- Output:
===Original text: /** * Some comments * longer comments here that we can parse. * * Rahoo */ function subroutine() { a = /* inline comment */ b + c ; } /*/ <-- tricky comments */ /** * Another comment. */ function something() { } ===Text without comments: function subroutine() { a = b + c ; } function something() { } ===The stripped comments: /** * Some comments * longer comments here that we can parse. * * Rahoo *//* inline comment *//*/ <-- tricky comments *//** * Another comment. */ ===Original text: apples, pears # and bananas apples, pears; and bananas ===Text without comments: apples, pears apples, pears ===The stripped comments: # and bananas; and bananas
F#
Using .NET's regex counter feature to match nested comments. If comments here are nested, they have to be correctly balanced.
<lang fsharp>open System
open System.Text.RegularExpressions

let balancedComments opening closing =
  new Regex(
    String.Format("""
{0}                     # An outer opening delimiter
  (?>                   # efficiency: no backtracking here
    {0} (?<LEVEL>)      # An opening delimiter, one level down
    |
    {1} (?<-LEVEL>)     # A closing delimiter, one level up
    |
    (?! {0} | {1} ) .   # With negative lookahead: Anything but delimiters
  )*                    # As many times as we see these
  (?(LEVEL)(?!))        # Fail, unless on level 0 here
{1}                     # Outer closing delimiter
""", Regex.Escape(opening), Regex.Escape(closing)),
    RegexOptions.IgnorePatternWhitespace ||| RegexOptions.Singleline)

[<EntryPoint>]
let main args =
  let sample = """
  /**
   * Some comments
   * longer comments here that we can parse.
   *
   * Rahoo
   */
   function subroutine() {
    a = /* inline comment */ b + c ;
   }
   /*/ <-- tricky comments */

   /**
    * Another comment.
    * /* nested balanced
    */ */
    function something() {
    }
  """
  let balancedC = balancedComments "/*" "*/"
  printfn "%s" (balancedC.Replace(sample, ""))
  0</lang>
Output
function subroutine() { a = b + c ; } function something() { }
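The level counting that the regex above delegates to .NET's balancing groups can also be written as an explicit depth counter. The following Python sketch shows that idea only; it is not a translation of the F# code. The name `strip_nested_comments` is invented for this example, and unlike the regex (which leaves unbalanced text unmatched) it raises an error when a comment is never closed.

```python
def strip_nested_comments(text, start="/*", end="*/"):
    """Remove block comments, allowing properly balanced nesting."""
    out = []
    depth = 0  # current comment nesting level
    i = 0
    while i < len(text):
        if text.startswith(start, i):
            depth += 1                 # one level down
            i += len(start)
        elif depth > 0 and text.startswith(end, i):
            depth -= 1                 # one level up
            i += len(end)
        else:
            if depth == 0:             # only keep text outside comments
                out.append(text[i])
            i += 1
    if depth != 0:
        # Assumption for this sketch: unbalanced input is an error.
        raise ValueError("unbalanced comment delimiters")
    return "".join(out)
```

A closing delimiter that appears outside any comment is kept as ordinary text.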
Go
For the extra credit: No optional parameters in Go, but documented below is an efficient technique for letting the caller specify the delimiters. <lang go>package main
import (
    "fmt"
    "strings"
)
// idiomatic to name a function newX that allocates an object, initializes it,
// and returns it ready to use.  the object in this case is a closure.
func newStripper(start, end string) func(string) string {
    // default to c-style block comments
    if start == "" || end == "" {
        start, end = "/*", "*/"
    }
    // closes on variables start, end.
    return func(source string) string {
        for {
            cs := strings.Index(source, start)
            if cs < 0 {
                break
            }
            // use the delimiter lengths rather than a hard-coded 2,
            // so that caller-specified delimiters of any length work
            ce := strings.Index(source[cs+len(start):], end)
            if ce < 0 {
                break
            }
            source = source[:cs] + source[cs+len(start)+ce+len(end):]
        }
        return source
    }
}
func main() {
    // idiomatic is that zero values indicate to use meaningful defaults
    stripC := newStripper("", "")

    // strip function now defined and can be called any number of times
    // without respecifying delimiters
    fmt.Println(stripC(`  /**
   * Some comments
   * longer comments here that we can parse.
   *
   * Rahoo
   */
   function subroutine() {
    a = /* inline comment */ b + c ;
   }
   /*/ <-- tricky comments */

   /**
    * Another comment.
    */
    function something() {
    }`))
}</lang>
Groovy
<lang groovy>def code = """
/** * Some comments * longer comments here that we can parse. * * Rahoo */ function subroutine() { a = /* inline comment */ b + c ; } /*/ <-- tricky comments */
/** * Another comment. */ function something() { }
"""
println ((code =~ "(?:/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(?://.*)").replaceAll(''))</lang>
Haskell
Comment delimiters can be changed by calling stripComments with different start and end parameters. <lang Haskell>import Data.List

stripComments :: String -> String -> String -> String
stripComments start end = notComment
    where notComment :: String -> String
          notComment "" = ""
          notComment xs
            | start `isPrefixOf` xs = inComment $ drop (length start) xs
            | otherwise             = head xs:(notComment $ tail xs)
          inComment :: String -> String
          inComment "" = ""
          inComment xs
            | end `isPrefixOf` xs = notComment $ drop (length end) xs
            | otherwise           = inComment $ tail xs

main = interact (stripComments "/*" "*/")</lang> Output:
function subroutine() { a = b + c ; } function something() { }
Icon and Unicon
If one is willing to concede that the program file will fit in memory, then the following code works: <lang Icon>procedure main()
   every (unstripped := "") ||:= !&input || "\n"   # Load file as one string
   write(stripBlockComment(unstripped,"/*","*/"))
end
procedure stripBlockComment(s1,s2,s3) #: strip comments between s2-s3 from s1
   result := ""
   s1 ? {
      while result ||:= tab(find(s2)) do {
         move(*s2)
         tab(find(s3)|0)   # or end of string
         move(*s3)
         }
      return result || tab(0)
      }
end</lang> Otherwise, the following handles an arbitrary length input: <lang Icon>procedure main()
every writes(stripBlockComment(!&input,"/*","*/"))
end
procedure stripBlockComment(s,s2,s3)
   static inC          # non-null when inside comment
   (s||"\n") ? while not pos(0) do {
      if /inC then
         if inC := 1(tab(find(s2))\1, move(*s2)) then suspend inC
         else return tab(0)
      else if (tab(find(s3))\1,move(*s3)) then inC := &null
      else fail
      }
end</lang>
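The second Icon version works line by line and remembers between lines whether it is inside a comment. The same idea, sketched in Python as a generator (the name `strip_lines` is invented for this example, and the sketch assumes that no delimiter contains a newline, so a delimiter is never split across two lines):

```python
def strip_lines(lines, start="/*", end="*/"):
    """Yield each input line with block comments removed.

    The in_comment flag carries over from one line to the next, so the
    whole input never has to be held in memory at once.
    """
    in_comment = False
    for line in lines:
        kept = []
        pos = 0
        while pos < len(line):
            if in_comment:
                close = line.find(end, pos)
                if close == -1:
                    pos = len(line)          # comment continues on next line
                else:
                    pos = close + len(end)
                    in_comment = False
            else:
                begin = line.find(start, pos)
                if begin == -1:
                    kept.append(line[pos:])  # rest of the line is code
                    pos = len(line)
                else:
                    kept.append(line[pos:begin])
                    pos = begin + len(start)
                    in_comment = True
        yield "".join(kept)
```

Because a file object iterates over its lines, `for out in strip_lines(open(name)): ...` would process a file of arbitrary length.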
J
<lang j>strip=:#~1 0 _1*./@:(|."0 1)2>4{"1(5;(0,"0~".;._2]0 :0);'/*'i.a.)&;:
  1 0 0
  0 2 0
  2 3 2
  0 2 2
)</lang> Example data: <lang j>example=: 0 :0
/** * Some comments * longer comments here that we can parse. * * Rahoo */ function subroutine() { a = /* inline comment */ b + c ; } /*/ <-- tricky comments */
/** * Another comment. */ function something() { }
)</lang> Example use: <lang j> strip example
function subroutine() { a = b + c ; }
function something() { }</lang>
Here is a version which allows the delimiters to be passed as an optional left argument as a pair of strings: <lang j>stripp=:3 :0
  ('/*';'*/') stripp y
:
  'open close'=. x
  marks=. (+./(-i._1+#open,close)|."0 1 open E. y) - close E.&.|. y
  y #~  -. (+._1&|.) (1 <. 0 >. +)/\.&.|. marks
)</lang>
Java
<lang java>import java.io.*;
public class StripBlockComments{
    public static String readFile(String filename) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(filename));
        try {
            StringBuilder fileContents = new StringBuilder();
            char[] buffer = new char[4096];
            int count;
            // append only the characters actually read on each pass
            while ((count = reader.read(buffer, 0, buffer.length)) > 0) {
                fileContents.append(buffer, 0, count);
            }
            return fileContents.toString();
        } finally {
            reader.close();
        }
}
public static String stripComments(String beginToken, String endToken,
String input) { StringBuilder output = new StringBuilder(); while (true) { int begin = input.indexOf(beginToken); int end = input.indexOf(endToken, begin+beginToken.length()); if (begin == -1 || end == -1) { output.append(input); return output.toString(); } output.append(input.substring(0, begin)); input = input.substring(end + endToken.length()); }
}
public static void main(String[] args) {
if (args.length < 3) { System.out.println("Usage: BeginToken EndToken FileToProcess"); System.exit(1); }
String begin = args[0]; String end = args[1]; String input = args[2];
try { System.out.println(stripComments(begin, end, readFile(input))); } catch (Exception e) { e.printStackTrace(); System.exit(1); }
}
}</lang>
Liberty BASIC
<lang lb>global CRLF$ CRLF$ =chr$( 13) +chr$( 10)
sample$ =" /**"+CRLF$+_ " * Some comments"+CRLF$+_ " * longer comments here that we can parse."+CRLF$+_ " *"+CRLF$+_ " * Rahoo "+CRLF$+_ " */"+CRLF$+_ " function subroutine() {"+CRLF$+_ " a = /* inline comment */ b + c ;"+CRLF$+_ " }"+CRLF$+_ " /*/ <-- tricky comments */"+CRLF$+_ ""+CRLF$+_ " /**"+CRLF$+_ " * Another comment."+CRLF$+_ " */"+CRLF$+_ " function something() {"+CRLF$+_ " }"+CRLF$
startDelim$ ="/*" finishDelim$ ="*/"
print "________________________________" print sample$ print "________________________________" print blockStripped$( sample$, startDelim$, finishDelim$) print "________________________________"
end
function blockStripped$( in$, s$, f$)
for i =1 to len( in$) -len( s$) if mid$( in$, i, len( s$)) =s$ then i =i +len( s$) do if mid$( in$, i, 2) =CRLF$ then blockStripped$ =blockStripped$ +CRLF$ i =i +1 loop until ( mid$( in$, i, len( f$)) =f$) or ( i =len( in$) -len( f$)) i =i +len( f$) -1 else blockStripped$ =blockStripped$ +mid$( in$, i, 1) end if next i
end function</lang>
function subroutine() { a = b + c ; } function something() { }
Lua
It is assumed that the code is in the file "Text1.txt". <lang lua>filename = "Text1.txt"
fp = io.open( filename, "r" ) str = fp:read( "*all" ) fp:close()
stripped = string.gsub( str, "/%*.-%*/", "" ) print( stripped )</lang>
Mathematica
<lang Mathematica>StringReplace[a,"/*"~~Shortest[___]~~"*/" -> ""]
->
function subroutine() { a = b + c ; }
function something() { }</lang>
MATLAB / Octave
<lang Matlab>function str = stripblockcomment(str,startmarker,endmarker)
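   % default to C-style delimiters when none are given
   if nargin < 2, startmarker = '/*'; end;
   if nargin < 3, endmarker = '*/'; end;
   while(1)
      ix1 = strfind(str, startmarker);
      if isempty(ix1) return; end;
      % search for the end marker after the first start marker
      ix2 = strfind(str(ix1(1)+length(startmarker):end), endmarker);
      if isempty(ix2)
         str = str(1:ix1(1)-1);
         return;
      else
         str = [str(1:ix1(1)-1), str(ix1(1)+length(startmarker)+ix2(1)+length(endmarker)-1:end)];
      end;
   end;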
end;</lang> Output:
>>t = ' /**\n * Some comments\n * longer comments here that we can parse.\n *\n * Rahoo \n */\n function subroutine() {\n a = /* inline comment */ b + c ;\n }\n /*/ <-- tricky comments */\n\n /**\n * Another comment.\n */\n function something() {\n }\n' >>printf(t); >>printf('=============\n'); >>printf(stripblockcomment(t)); /** * Some comments * longer comments here that we can parse. * * Rahoo */ function subroutine() { a = /* inline comment */ b + c ; } /*/ <-- tricky comments */ /** * Another comment. */ function something() { } =============== function subroutine() { a = b + c ; } function something() { }
Perl
<lang Perl>#!/usr/bin/perl -w
use strict ;
use warnings ;

open( FH , "<" , "samplecode.txt" ) or die "Can't open file!$!\n" ;
my $code = "" ;
{
   local $/ ;
   $code = <FH> ; #slurp mode
}
close FH ;
$code =~ s,/\*.*?\*/,,sg ;
print $code . "\n" ;</lang>
Output:
function subroutine() { a = b + c ; } function something() { }
Perl 6
<lang perl6>sample().split(/ '/*' .+? '*/' /).print;
sub sample { ' /**
* Some comments * longer comments here that we can parse. * * Rahoo */ function subroutine() { a = /* inline comment */ b + c ; } /*/ <-- tricky comments */
/** * Another comment. */ function something() { }
'}</lang> Output:
function subroutine() { a = b + c ; } function something() { }
PicoLisp
<lang PicoLisp>(in "sample.txt"
(while (echo "/*") (out "/dev/null" (echo "*/")) ) )</lang>
Output:
function subroutine() { a = b + c ; } function something() { }
PL/I
<lang PL/I>/* A program to remove comments from text. */ strip: procedure options (main); /* 8/1/2011 */
declare text character (80) varying; declare (j, k) fixed binary;
on endfile (sysin) stop;
do forever; get edit (text) (L); do until (k = 0); k = index(text, '/*'); if k > 0 then /* we have a start of comment. */ do; /* Look for end of comment. */ j = index(text, '*/', k+2); if j > 0 then do; text = substr(text, 1, k-1) || substr(text, j+2, length(text)-(j+2)+1); end; else do; /* The comment continues onto the next line. */ put skip list ( substr(text, 1, k-1) );
more: get edit (text) (L);
j = index(text, '*/'); if j = 0 then do; put skip; go to more; end; text = substr(text, j+2, length(text) - (j+2) + 1); end; end; end; put skip list (text); end;
end strip;</lang>
PureBasic
Solution using regular expressions. A stripBlocks() procedure is defined that strips comments between any two delimiters. <lang PureBasic>Procedure.s escapeChars(text.s)
Static specialChars.s = "[\^$.|?*+()" Protected output.s, nextChar.s, i, countChar = Len(text) For i = 1 To countChar nextChar = Mid(text, i, 1) If FindString(specialChars, nextChar, 1) output + "\" + nextChar Else output + nextChar EndIf Next ProcedureReturn output
EndProcedure
Procedure.s stripBlocks(text.s, first.s, last.s)
Protected delimter_1.s = escapeChars(first), delimter_2.s = escapeChars(last) Protected expNum = CreateRegularExpression(#PB_Any, delimter_1 + ".*?" + delimter_2, #PB_RegularExpression_DotAll) Protected output.s = ReplaceRegularExpression(expNum, text, "") FreeRegularExpression(expNum) ProcedureReturn output
EndProcedure
Define source.s source.s = " /**" + #CRLF$ source.s + " * Some comments" + #CRLF$ source.s + " * longer comments here that we can parse." + #CRLF$ source.s + " *" + #CRLF$ source.s + " * Rahoo " + #CRLF$ source.s + " */" + #CRLF$ source.s + " function subroutine() {" + #CRLF$ source.s + " a = /* inline comment */ b + c ;" + #CRLF$ source.s + " }" + #CRLF$ source.s + " /*/ <-- tricky comments */" + #CRLF$ source.s + "" + #CRLF$ source.s + " /**" + #CRLF$ source.s + " * Another comment." + #CRLF$ source.s + " */" + #CRLF$ source.s + " function something() {" + #CRLF$ source.s + " }" + #CRLF$
If OpenConsole()
PrintN("--- source ---") PrintN(source) PrintN("--- source with block comments between '/*' and '*/' removed ---") PrintN(stripBlocks(source, "/*", "*/")) PrintN("--- source with block comments between '*' and '*' removed ---") PrintN(stripBlocks(source, "*", "*")) Print(#CRLF$ + #CRLF$ + "Press ENTER to exit"): Input() CloseConsole()
EndIf</lang> Sample output:
--- source --- /** * Some comments * longer comments here that we can parse. * * Rahoo */ function subroutine() { a = /* inline comment */ b + c ; } /*/ <-- tricky comments */ /** * Another comment. */ function something() { } --- source with block comments between '/*' and '*/' removed --- function subroutine() { a = b + c ; } function something() { } --- source with block comments between '*' and '*' removed --- / longer comments here that we can parse. Rahoo inline comment / <-- tricky comments Another comment. */ function something() { }
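The regular-expression approach used here, and in several other entries on this page, comes down to two steps: escape the delimiters so that they are matched literally, then remove the shortest match between them with "dot matches newline" enabled. A minimal Python sketch of that technique, using the standard `re` module (the function name `strip_with_regex` is invented for this example):

```python
import re

def strip_with_regex(text, start="/*", end="*/"):
    """Remove non-nested block comments using a non-greedy regex."""
    # re.escape makes characters such as * and / match literally;
    # re.DOTALL lets . match newlines so multi-line comments are removed.
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    return pattern.sub("", text)
```

As in the PureBasic output above, identical opening and closing delimiters work too: `strip_with_regex("*a* b *c*", "*", "*")` leaves only the text between the two starred spans.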
Python
The code takes the comment delimiters as an argument and will also strip nested block comments. <lang python>def _commentstripper(txt, delim):
    'Strips first nest of block comments'

    deliml, delimr = delim
    out = ''
    if deliml in txt:
        indx = txt.index(deliml)
        out += txt[:indx]
        txt = txt[indx+len(deliml):]
        txt = _commentstripper(txt, delim)
        assert delimr in txt, 'Cannot find closing comment delimiter in ' + txt
        indx = txt.index(delimr)
        out += txt[(indx+len(delimr)):]
    else:
        out = txt
    return out

def commentstripper(txt, delim=('/*', '*/')):
    'Strips nests of block comments'

    deliml, delimr = delim
    while deliml in txt:
        txt = _commentstripper(txt, delim)
    return txt</lang>
- Tests and sample output
<lang python>def test():
    print('\nNON-NESTED BLOCK COMMENT EXAMPLE:')
    sample = '''  /**
   * Some comments
   * longer comments here that we can parse.
   *
   * Rahoo
   */
   function subroutine() {
    a = /* inline comment */ b + c ;
   }
   /*/ <-- tricky comments */

   /**
    * Another comment.
    */
    function something() {
    }'''
    print(commentstripper(sample))

    print('\nNESTED BLOCK COMMENT EXAMPLE:')
    sample = '''  /**
   * Some comments
   * longer comments here that we can parse.
   *
   * Rahoo
   *//*
   function subroutine() {
    a = /* inline comment */ b + c ;
   }
   /*/ <-- tricky comments */
   */
   /**
    * Another comment.
    */
    function something() {
    }'''
    print(commentstripper(sample))

if __name__ == '__main__':
    test()</lang>
NON-NESTED BLOCK COMMENT EXAMPLE: function subroutine() { a = b + c ; } function something() { } NESTED BLOCK COMMENT EXAMPLE: function something() { }
Racket
<lang Racket>
#lang at-exp racket

;; default delimiters (strings -- not regexps)
(define comment-start-str "/*") (define comment-end-str "*/")
(define (strip-comments text [rx1 comment-start-str] [rx2 comment-end-str])
(regexp-replace* (~a (regexp-quote rx1) ".*?" (regexp-quote rx2)) text ""))
((compose1 displayln strip-comments)
@~a{/** * Some comments * longer comments here that we can parse. * * Rahoo */ function subroutine() { a = /* inline comment */ b + c ; } /*/ <-- tricky comments */
/** * Another comment. */ function something() { } })
</lang>
(Outputs the expected text...)
REXX
<lang rexx>/* REXX ***************************************************************
* Split comments
* This program ignores comment delimiters within literal strings
* such as, e.g., in b = "--' O'Connor's widow --";
* it does not (yet) take care of -- comments (ignore rest of line)
* also it does not take care of say 667/*yuppers*/77 (REXX specialty)
*   courtesy GS discussion!
* 12.07.2013 Walter Pachl
**********************************************************************/
fid='in.txt'                           /* input text                 */
oic='oc.txt'; 'erase' oic              /* will contain comments      */
oip='op.txt'; 'erase' oip              /* will contain program parts */
oim='om.txt'; 'erase' oim              /* oc.txt merged with op.txt  */
cmt=0                                  /* comment nesting            */
str=''                                 /* ' or " when in a string    */
Do ri=1 By 1 While lines(fid)>0        /* loop over input            */
  l=linein(fid)                        /* an input line              */
  oc=''                                /* initialize line for oc.txt */
  op=''                                /* initialize line for op.txt */
  i=1                                  /* start at first character   */
  Do While i<=length(l)                /* loop through input line    */
    If cmt=0 Then Do                   /* we are not in a comment    */
      If str<>'' Then Do               /* we are in a string         */
        If substr(l,i,1)=str Then Do   /* string character           */
          If substr(l,i+1,1)=str Then Do /* another one              */
            Call app 'P',substr(l,i,2) /* add '' or "" to op         */
            i=i+2                      /* increase input pointer     */
            Iterate                    /* proceed in input line      */
            End
          Else Do                      /* end of literal string      */
            Call app 'P',substr(l,i,1) /* add ' or " to op           */
            str=' '                    /* no longer in string        */
            i=i+1                      /* increase input pointer     */
            Iterate                    /* proceed in input line      */
            End
          End
        End
      End
    Select
      When str='' &,                   /* not in a string            */
           substr(l,i,2)='/*' Then Do  /* start of comment           */
        cmt=cmt+1                      /* increase comment nesting   */
        Call app 'C','/*'              /* copy to oc                 */
        i=i+2                          /* increase input pointer     */
        End
      When cmt=0 Then Do               /* not in a comment           */
        If str=' ' Then Do             /* not in a string            */
          If pos(substr(l,i,1),'''"')>0 Then /* string delimiter     */
            str=substr(l,i,1)          /* remember that              */
          End
        Call app 'P',substr(l,i,1)     /* copy to op                 */
        i=i+1                          /* increase input pointer     */
        End
      When substr(l,i,2)='*/' Then Do  /* end of comment             */
        cmt=cmt-1                      /* decrement nesting depth    */
        Call app 'C','*/'              /* copy to oc                 */
        i=i+2                          /* increase input pointer     */
        End
      Otherwise Do                     /* any other character        */
        Call app 'C',substr(l,i,1)     /* copy to oc                 */
        i=i+1                          /* increase input pointer     */
        End
      End
    End
  Call oc                              /* Write line oc              */
  Call op                              /* Write line op              */
  End
Call lineout oic                       /* Close File oic             */
Call lineout oip                       /* Close File oip             */

Do ri=1 To ri-1                        /* merge program with comments*/
  op=linein(oip)
  oc=linein(oic)
  Do i=1 To length(oc)
    If substr(oc,i,1)<>'' Then
      op=overlay(substr(oc,i,1),op,i,1)
    End
  Call lineout oim,op
  End
Call lineout oic
Call lineout oip
Call lineout oim
Exit

app: Parse Arg which,string
/* add str to oc or op                                               */
/* and corresponding blanks to the other (op or oc)                  */
If which='C' Then Do
  oc=oc||string
  op=op||copies(' ',length(string))
  End
Else Do
  op=op||string
  oc=oc||copies(' ',length(string))
  End
Return

oc: Return lineout(oic,oc)
op: Return lineout(oip,op)</lang>
Input:
/** * Some comments * longer comments here that we can parse. * * Rahoo */ function subroutine() { a = /* inline comment */ b + c ; b = "*/' O'Connor's widow /*"; } /*/ <-- tricky comments */ /** * Another comment. */ function something() { }
Program:
function subroutine() { a = b + c ; b = "*/' O'Connor's widow /*"; } function something() { }
Comments:
/** * Some comments * longer comments here that we can parse. * * Rahoo */ /* inline comment */ /*/ <-- tricky comments */ /** * Another comment. */
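The REXX program above tracks two pieces of state for every character: the comment nesting depth and the quote character of any open string literal. The same scan can be sketched in Python for comparison. This is a simplified illustration and not a port: the name `split_comments` is hypothetical, it works on a single string instead of files, it does not pad the two outputs with blanks to keep columns aligned, and it assumes REXX-style strings in which a doubled quote stands for the quote itself.

```python
def split_comments(text, open_delim="/*", close_delim="*/"):
    """Return (program, comments); quoted strings are treated as opaque.

    Assumptions: strings are delimited by ' or ", a doubled quote inside
    a string is an escaped quote, and block comments may nest.
    """
    program, comments = [], []
    depth = 0    # current comment nesting depth
    quote = ""   # active quote character, "" when outside a string
    i = 0
    while i < len(text):
        ch = text[i]
        if depth == 0 and quote:
            # Inside a string literal: copy verbatim until the closing quote.
            if ch == quote:
                if text[i + 1:i + 2] == quote:   # doubled quote, still inside
                    program.append(ch * 2)
                    i += 2
                    continue
                quote = ""                       # string ends here
            program.append(ch)
            i += 1
        elif not quote and text.startswith(open_delim, i):
            depth += 1                           # comment starts (or nests)
            comments.append(open_delim)
            i += len(open_delim)
        elif depth == 0:
            if ch in "'\"":
                quote = ch                       # a string literal starts
            program.append(ch)
            i += 1
        elif text.startswith(close_delim, i):
            depth -= 1                           # one comment level closes
            comments.append(close_delim)
            i += len(close_delim)
        else:
            comments.append(ch)                  # ordinary comment character
            i += 1
    return "".join(program), "".join(comments)


program, comments = split_comments('a = /* inline */ b; s = "*/\' /*";')
print(program)
print(comments)
```

As in the REXX output, the delimiters inside the double-quoted string stay in the program part, and only `/* inline */` ends up in the comments part.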
Ruby
<lang ruby>def remove_comments!(str, comment_start='/*', comment_end='*/')
  while start_idx = str.index(comment_start)
    end_idx = str.index(comment_end, start_idx + comment_start.length) + comment_end.length - 1
    str[start_idx .. end_idx] = ""
  end
  str
end
def remove_comments(str, comment_start='/*', comment_end='*/')
remove_comments!(str.dup, comment_start, comment_end)
end
example = <<END_OF_STRING
  /**
   * Some comments
   * longer comments here that we can parse.
   *
   * Rahoo
   */
   function subroutine() {
    a = /* inline comment */ b + c ;
   }
   /*/ <-- tricky comments */

   /**
    * Another comment.
    */
    function something() {
    }
END_OF_STRING
puts remove_comments example</lang>
Output:
function subroutine() { a = b + c ; } function something() { }
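The Ruby loop above assumes every opening delimiter has a matching closing delimiter: if `str.index(comment_end, ...)` finds nothing it returns `nil`, and the following `+` raises an error. It also rescans from the start of the string after each deletion. A single-pass variant with an explicit guard can be sketched as follows, in Python for comparison. The choice to drop everything after an unterminated opening delimiter is an assumption of this sketch, not something the task prescribes.

```python
def remove_comments(text, comment_start="/*", comment_end="*/"):
    """Strip non-nested block comments in one left-to-right pass."""
    kept = []
    pos = 0
    while True:
        start = text.find(comment_start, pos)
        if start == -1:
            kept.append(text[pos:])      # no more comments: keep the tail
            break
        kept.append(text[pos:start])     # keep the text before the comment
        end = text.find(comment_end, start + len(comment_start))
        if end == -1:
            break                        # unterminated: drop the rest (assumption)
        pos = end + len(comment_end)     # resume after the closing delimiter
    return "".join(kept)


print(remove_comments("a = /* inline comment */ b + c ;"))
print(remove_comments("x (* note *) y", "(*", "*)"))
```

Because the scan never revisits text it has already emitted, characters on either side of a removed comment cannot combine into a new delimiter.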
Seed7
The function replace2 can be used to replace unnested comments.
<lang seed7>$ include "seed7_05.s7i";
const proc: main is func
  local
    const string: stri is "\
        \  /**\n\
        \   * Some comments\n\
        \   * longer comments here that we can parse.\n\
        \   *\n\
        \   * Rahoo\n\
        \   */\n\
        \   function subroutine() {\n\
        \    a = /* inline comment */ b + c ;\n\
        \   }\n\
        \   /*/ <-- tricky comments */\n\
        \\n\
        \   /**\n\
        \    * Another comment.\n\
        \    */\n\
        \    function something() {\n\
        \    }";
  begin
    writeln(replace2(stri, "/*", "*/", " "));
  end func;</lang>
Output:
function subroutine() { a = b + c ; } function something() { }
Tcl
<lang tcl>proc stripBlockComment {string {openDelimiter "/*"} {closeDelimiter "*/"}} {
    # Convert the delimiters to REs by backslashing all non-alnum characters
    set openAsRE [regsub -all {\W} $openDelimiter {\\&}]
    set closeAsRE [regsub -all {\W} $closeDelimiter {\\&}]

    # Now remove the blocks using a dynamic non-greedy regular expression
    regsub -all "$openAsRE.*?$closeAsRE" $string ""
}</lang>
Demonstration code:
<lang tcl>puts [stripBlockComment "
  /**
   * Some comments
   * longer comments here that we can parse.
   *
   * Rahoo
   */
   function subroutine() {
    a = /* inline comment */ b + c ;
   }
   /*/ <-- tricky comments */

   /**
    * Another comment.
    */
    function something() {
    }
"]</lang>
Output:
function subroutine() { a = b + c ; } function something() { }
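The Tcl solution escapes the caller's delimiters and joins them with a non-greedy `.*?`. The same idea can be sketched in Python for comparison, with `re.escape` doing the escaping. One difference to note: in Tcl's regular expressions `.` matches a newline by default, whereas Python needs the `re.DOTALL` flag for multiline comments. The function name below is chosen for illustration only.

```python
import re


def strip_block_comments(text, open_delim="/*", close_delim="*/"):
    """Remove non-nested block comments using a non-greedy regex."""
    # Escape the delimiters so characters such as * are matched literally.
    pattern = re.escape(open_delim) + r".*?" + re.escape(close_delim)
    # DOTALL lets . match newlines, so comments may span several lines.
    return re.sub(pattern, "", text, flags=re.DOTALL)


sample = "a = /* inline\n   comment */ b + c ;"
print(strip_block_comments(sample))
print(strip_block_comments("1 (+ note +) 2", "(+", "+)"))
```

The non-greedy quantifier stops at the first closing delimiter, so this approach, like the Tcl original, does not handle nested comments.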
TUSCRIPT
<lang tuscript>
$$ MODE DATA
$$ script=*
  /**
   * Some comments
   * longer comments here that we can parse.
   *
   * Rahoo
   */
   function subroutine() {
    a = /* inline comment */ b + c ;
   }
   /*/ <-- tricky comments */

   /**
    * Another comment.
    */
    function something() {
    }
$$ MODE TUSCRIPT
ERROR/STOP CREATE ("testfile",SEQ-E,-std-)
ERROR/STOP CREATE ("destfile",SEQ-E,-std-)
FILE "testfile" = script
BUILD S_TABLE commentbeg=":/*:"
BUILD S_TABLE commentend=":*/:"

ACCESS t: READ/STREAM "testfile" s.z/u,a/commentbeg+t+e/commentend,typ
ACCESS d: WRITE/STREAM "destfile" s.z/u,a+t+e
LOOP
READ/EXIT t
IF (typ==3) CYCLE
t=SQUEEZE(t)
WRITE/ADJUST d
ENDLOOP
ENDACCESS/PRINT t
ENDACCESS/PRINT d
d=FILE("destfile")
TRACE *d
</lang>
Output:
TRACE *    38 -*TUSTEP.EDT
d            = *
           1 =
           2 = function subroutine() { a =
           3 = b + c ; }
           4 =
           5 = function something() { }