Strip comments from a string

From Rosetta Code
This task has been flagged for clarification. Code on this page in its current state may be flagged incorrect once this task has been clarified. See this page's Talk page for discussion.
Task

The task is to remove text that follows any of a set of comment markers (in these examples either a hash or a semicolon) from a string or input line.

Whitespace debacle: There is some confusion about whether to remove any whitespace from the input line. As of 2 September 2011, at least 8 languages (C, C++, Java, Perl, Python, Ruby, sed, UNIX Shell) were incorrect, out of 36 total languages, because they did not trim whitespace by 29 March 2011 rules. Some other languages might be incorrect for the same reason. Please discuss this issue at Talk:Strip comments from a string.

  • From 29 March 2011, this task required that: "The comment marker and any whitespace at the beginning or ends of the resultant line should be removed. A line without comments should be trimmed of any leading or trailing whitespace before being produced as a result." The task had 28 languages, which did not all meet this new requirement.
  • From 28 March 2011, this task required that: "Whitespace before the comment marker should be removed."
  • From 30 October 2010, this task did not specify whether or not to remove whitespace.

The following examples will be truncated to either "apples, pears " or "apples, pears". (This example has flipped between "apples, pears " and "apples, pears" in the past.)

apples, pears # and bananas
apples, pears ; and bananas
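
The three historical whitespace rules listed above can be compared side by side. The following is only a sketch: the function name `strip_comment` and the `trim` mode names are made up for illustration and are not part of the task description.

```python
def strip_comment(line, markers="#;", trim="both"):
    """Cut `line` at the first comment marker (hypothetical helper).

    trim="none"          : leave whitespace alone (30 October 2010 wording)
    trim="before_marker" : drop whitespace before the marker (28 March 2011 wording)
    trim="both"          : drop leading and trailing whitespace (29 March 2011 wording)
    """
    # Positions of every marker character that occurs in the line.
    positions = [line.find(m) for m in markers if m in line]
    found = bool(positions)
    if found:
        line = line[:min(positions)]  # cut at the earliest marker
    if trim == "both":
        return line.strip()
    if trim == "before_marker" and found:
        return line.rstrip()
    return line


for text in ("apples, pears # and bananas", "apples, pears ; and bananas"):
    print(repr(strip_comment(text, trim="none")), repr(strip_comment(text)))
```

With `trim="none"` the result keeps the space before the marker ("apples, pears "); with the default `trim="both"` it does not ("apples, pears").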

Cf. Strip block comments


Ada

with Ada.Text_IO;
procedure Program is
   Comment_Characters : constant String := "#;";
begin
   loop
      declare
         Line : constant String := Ada.Text_IO.Get_Line;
         Last : Natural := Line'Last;  --  end of the text to keep
      begin
         exit when Line'Length = 0;
         Outer_Loop : for I in Line'Range loop
            for J in Comment_Characters'Range loop
               if Comment_Characters(J) = Line(I) then
                  Last := I - 1;
                  exit Outer_Loop;
               end if;
            end loop;
         end loop Outer_Loop;
         --  Lines without a comment marker are printed unchanged
         Ada.Text_IO.Put_Line(Line(Line'First .. Last));
      end;
   end loop;
end Program;

ALGOL 68

Works with: ALGOL 68 version Revision 1 - no extensions to language used.
Works with: ALGOL 68G version Any - tested with release 1.18.0-9h.tiny.
Works with: ELLA ALGOL 68 version Any (with appropriate job cards) - tested with release 1.8-8d
#!/usr/local/bin/a68g --script #
 
PROC trim comment = (STRING line, CHAR marker)STRING:(
INT index := UPB line+1;
char in string(marker, index, line);
FOR i FROM index-1 BY -1 TO LWB line
WHILE line[i]=" " DO index := i OD;
line[:index-1]
);
 
CHAR q = """";
 
print((
q, trim comment("apples, pears # and bananas", "#"), q, new line,
q, trim comment("apples, pears ; and bananas", ";"), q, new line,
q, trim comment("apples, pears and bananas ", ";"), q, new line,
q, trim comment(" ", ";"), q, new line, # blank string #
q, trim comment("", ";"), q, new line # empty string #
))
 
CO Alternatively Algol68g has available "grep"
;STRING re marker := " *#", line := "apples, pears # and bananas";
INT index := UPB line;
grep in string(re marker, line, index, NIL);
print((q, line[:index-1], q, new line))
END CO

Output:

"apples, pears"
"apples, pears"
"apples, pears and bananas"
""
""

AutoHotkey

Delims := "#;"
str := "apples, pears # and bananas"
str2:= "apples, pears, `; and bananas" ; needed to escape the ; since that is AHK's comment marker
msgbox % StripComments(Str,Delims)
msgbox % StripComments(Str2,Delims)
; The % forces expression mode.
 
 
StripComments(String1,Delims){
Loop, parse, delims
{
If Instr(String1,A_LoopField)
EndPosition := InStr(String1,A_LoopField) - 1
Else
EndPosition := StrLen(String1)
StringLeft, String1, String1, EndPosition
}
return String1
}

Output:

apples, pears 
apples, pears, 


AutoIt

As was noted in the discussion, this task is not really about stripping comments; it is only truncation.

 
Dim $Line1 = "apples, pears # and bananas"
Dim $Line2 = "apples, pears ; and bananas"
 
_StripAtMarker($Line1)
_StripAtMarker($Line2)
 
Func _StripAtMarker($_Line, $sMarker='# ;')
Local $aMarker = StringSplit($sMarker, ' ')
Local $iPos
For $i = 1 To $aMarker[0]
$iPos = StringInStr($_Line, $aMarker[$i])
If $iPos Then
ConsoleWrite($_Line & @CRLF)
ConsoleWrite( StringStripWS( StringLeft($_Line, $iPos -1), 2) & @CRLF)
EndIf
Next
EndFunc ;==>_StripAtMarker
 

Output:

apples, pears # and bananas
apples, pears
apples, pears ; and bananas
apples, pears

Here is a more language-specific solution that parses script lines and deletes comments. In AutoIt, a line comment starts with a semicolon. However, a semicolon may also appear inside a string, for example in a parameter of a function call or function header, or in an assignment. The comment therefore starts at the first semicolon outside a string.

 
Dim $aLines[4] = _
[ _
"$a = $b + $c ; Comment line 1", _
"Dim $s1 = 'some text; tiled with semicolon', $s2 = 'another text; also tiled with semicolon' ; Comment line 2 - semicolon as part of assignment", _
"_SomeFunctionCall('string parameter with ;', $anotherParam) ; Comment line 3 - semicolon as part parameter in an function call", _
"Func _AnotherFunction($param1=';', $param2=';', $param3=';') ; Comment line 4 - semicolon as default value in parameter of a function headline" _
]
 
For $i = 0 To 3
ConsoleWrite('+> Line ' & $i+1 & ' full:' & @CRLF & '+>' & $aLines[$i] & @CRLF)
ConsoleWrite('>> without comment:' & @CRLF & '>>' & _LineStripComment($aLines[$i]) & @CRLF & @CRLF)
Next
 
 
Func _LineStripComment($_Line)
; == tile line by all included comment marker
Local $aPartsWithMarker = StringSplit($_Line, ';')
Local $sNoComment
 
; == if no comment marker: return full line
If $aPartsWithMarker[0] = 0 Then Return $_Line
 
; == check whether the part contains a string; if it doesn't, the following part(s) are comment
For $i = 1 To $aPartsWithMarker[0]
If Not StringRegExp($aPartsWithMarker[$i], "('|\x22)") Then
If $i = 1 Then
Return StringStripWS($aPartsWithMarker[$i], 2)
Else
Return StringStripWS($sNoComment & $aPartsWithMarker[$i], 2)
EndIf
Else
; == check if next leftside string delimiter has uneven count
Local $iLen = StringLen($aPartsWithMarker[$i])
Local $fDetectDelim = False, $sStringDelim, $iDelimCount, $sCurr
For $j = $iLen To 1 Step -1
$sCurr = StringMid($aPartsWithMarker[$i], $j, 1)
If Not $fDetectDelim Then
If $sCurr = "'" Or $sCurr = '"' Then
$sStringDelim = $sCurr
$iDelimCount += 1
$fDetectDelim = True
EndIf
Else
If $sCurr = $sStringDelim Then $iDelimCount += 1
EndIf
Next
If Mod($iDelimCount, 2) Then
; == uneven count: so it masks the comment marker
$sNoComment &= $aPartsWithMarker[$i] & ';'
Else
; == even count: all following is comment
Return StringStripWS($sNoComment & $aPartsWithMarker[$i], 2)
EndIf
EndIf
Next
EndFunc ;==>_LineStripComment
 

Output:

+> Line 1 full:
+>$a = $b + $c ; Comment line 1
>> without comment:
>>$a = $b + $c

+> Line 2 full:
+>Dim $s1 = 'some text; tiled with semicolon', $s2 = 'another text; also tiled with semicolon' ; Comment line 2 - semicolon as part of assignment
>> without comment:
>>Dim $s1 = 'some text; tiled with semicolon', $s2 = 'another text; also tiled with semicolon'

+> Line 3 full:
+>_SomeFunctionCall('string parameter with ;', $anotherParam) ; Comment line 3 - semicolon as part parameter in an function call
>> without comment:
>>_SomeFunctionCall('string parameter with ;', $anotherParam)

+> Line 4 full:
+>Func _AnotherFunction($param1=';', $param2=';', $param3=';') ; Comment line 4 - semicolon as default value in parameter of a function headline
>> without comment:
>>Func _AnotherFunction($param1=';', $param2=';', $param3=';')
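
The idea described above (the comment starts at the first semicolon outside a string) can also be sketched as a single left-to-right scan. This is an illustrative Python version, not a translation of the AutoIt code; the function name `strip_comment_outside_quotes` is hypothetical, and it assumes strings are delimited by single or double quotes with no backslash escapes.

```python
def strip_comment_outside_quotes(line, marker=";"):
    """Return `line` up to the first `marker` that is not inside a quoted string."""
    quote = None  # the quote character currently open, or None
    for i, ch in enumerate(line):
        if quote is None:
            if ch in "'\"":
                quote = ch                # a string literal starts here
            elif ch == marker:
                return line[:i].rstrip()  # first marker outside any string
        elif ch == quote:
            quote = None                  # the open string literal ends here
    return line.rstrip()                  # no comment found


print(strip_comment_outside_quotes(
    "_SomeFunctionCall('string parameter with ;', $anotherParam) ; Comment line 3"))
```

Tracking only the currently open quote character keeps the scan in one pass, at the cost of not understanding any escape syntax beyond quote doubling.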

AWK

#!/usr/local/bin/awk -f
{
sub("[ \t]*[#;].*$","",$0);
print;
}

BBC BASIC

      marker$ = "#;"
PRINT FNstripcomment("apples, pears # and bananas", marker$)
PRINT FNstripcomment("apples, pears ; and bananas", marker$)
PRINT FNstripcomment(" apples, pears ", marker$)
END
 
DEF FNstripcomment(text$, delim$)
LOCAL I%, D%
FOR I% = 1 TO LEN(delim$)
D% = INSTR(text$, MID$(delim$, I%, 1))
IF D% text$ = LEFT$(text$, D%-1)
NEXT I%
WHILE ASC(text$) = 32 text$ = MID$(text$,2) : ENDWHILE
WHILE RIGHT$(text$) = " " text$ = LEFT$(text$) : ENDWHILE
= text$

C

#include<stdio.h>
 
int main()
{
char ch, str[100];
int i;
 
do{
printf("\nEnter the string :");
fgets(str,100,stdin);
for(i=0;str[i]!=00;i++)
{
if(str[i]=='#'||str[i]==';')
{
str[i]=00;
break;
}
}
printf("\nThe modified string is : %s",str);
printf("\nDo you want to repeat (y/n): ");
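scanf("%c",&ch);
{ int c; while((c=getchar())!='\n' && c!=EOF); } /* discard the rest of the line; fflush(stdin) is undefined behaviour */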
}while(ch=='y'||ch=='Y');
 
return 0;
}

Output:

Enter the string :apples, pears # and bananas

The modified string is : apples, pears
Do you want to repeat (y/n): y

Enter the string :apples, pears ; and bananas

The modified string is : apples, pears
Do you want to repeat (y/n): n

C++

#include <iostream>
#include <string>
 
std::string strip_white(const std::string& input)
{
size_t b = input.find_first_not_of(' ');
if (b == std::string::npos) b = 0;
return input.substr(b, input.find_last_not_of(' ') + 1 - b);
}
 
std::string strip_comments(const std::string& input, const std::string& delimiters)
{
return strip_white(input.substr(0, input.find_first_of(delimiters)));
}
 
int main( ) {
std::string input;
std::string delimiters("#;");
while ( getline(std::cin, input) && !input.empty() ) {
std::cout << strip_comments(input, delimiters) << std::endl ;
}
return 0;
}

Sample output:

apples, pears # and bananas
apples, pears
apples, pears ; and bananas
apples, pears

C#

 
using System.Text.RegularExpressions;
 
string RemoveComments(string str, string delimiter)
{
//regular expression to find the delimiter (escaped so that it is matched literally)
//and replace it and everything following it with an empty string.
//.Trim() will remove all beginning and ending white space.
return Regex.Replace(str, Regex.Escape(delimiter) + ".*", string.Empty).Trim();
}
 

Sample output:

Console.WriteLine(RemoveComments("apples, pears # and bananas", "#"));
Console.WriteLine(RemoveComments("apples, pears ; and bananas", ";"));
apples, pears
apples, pears

Clojure

> (apply str (take-while #(not (#{\# \;} %)) "apples # comment"))
"apples "

D

import std.stdio, std.regex;
 
string remove1LineComment(in string s, in string pat=";#") {
const re = "([^" ~ pat ~ "]*)([" ~ pat ~ `])[^\n\r]*([\n\r]|$)`;
return s.replace(regex(re, "gm"), "$1$3");
}
 
void main() {
const s = "apples, pears # and bananas
apples, pears ; and bananas "
;
 
writeln(s, "\n====>\n", s.remove1LineComment());
}
Output:
apples, pears # and bananas
apples, pears ; and bananas 
====>
apples, pears 
apples, pears 

Delphi

program StripComments;
 
{$APPTYPE CONSOLE}
 
uses
SysUtils;
 
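function DoStripComments(const InString: string; const CommentMarker: Char): string;
var
  MarkerPos: Integer;
begin
  MarkerPos := Pos(CommentMarker, InString);
  // Pos returns 0 when the marker is absent; keep the whole line in that case
  if MarkerPos > 0 then
    Result := Trim(Copy(InString, 1, MarkerPos - 1))
  else
    Result := Trim(InString);
end;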
 
begin
Writeln('apples, pears # and bananas --> ' + DoStripComments('apples, pears # and bananas','#'));
Writeln('');
Writeln('apples, pears ; and bananas --> ' + DoStripComments('apples, pears ; and bananas',';'));
Readln;
end.

DWScript

function StripComments(s : String) : String;
begin
var p := FindDelimiter('#;', s);
if p>0 then
Result := Trim(Copy(s, 1, p-1))
else Result := Trim(s);
end;
 
PrintLn(StripComments('apples, pears # and bananas'));
PrintLn(StripComments('apples, pears ; and bananas'));

F#

let stripComments s =
s
|> Seq.takeWhile (fun c -> c <> '#' && c <> ';')
|> Seq.map System.Char.ToString
|> Seq.fold (+) ""

Fantom

Using a regular expression:

class Main
{
static Str removeComment (Str str)
{
regex := Regex <|(;|#)|>
matcher := regex.matcher (str)
if (matcher.find)
return str[0..<matcher.start]
else
return str
}
 
public static Void main ()
{
echo (removeComment ("String with comment here"))
echo (removeComment ("String with comment # here"))
echo (removeComment ("String with comment ; here"))
}
}

Fortran

!****************************************************
module string_routines
!****************************************************
implicit none
private
public :: strip_comments
contains
!****************************************************
 
function strip_comments(str,c) result(str2)
implicit none
character(len=*),intent(in) :: str
character(len=1),intent(in) :: c !comment character
character(len=len(str)) :: str2
 
integer :: i
 
i = index(str,c)
if (i>0) then
str2 = str(1:i-1)
else
str2 = str
end if
 
end function strip_comments
 
!****************************************************
end module string_routines
!****************************************************
 
!****************************************************
program main
!****************************************************
! Example use of strip_comments function
!****************************************************
use string_routines, only: strip_comments
implicit none
 
write(*,*) strip_comments('apples, pears # and bananas', '#')
write(*,*) strip_comments('apples, pears ; and bananas', ';')
 
!****************************************************
end program main
!****************************************************

output:

apples, pears
apples, pears

Go

package main
 
import (
"fmt"
"strings"
)
 
const commentChars = "#;"
 
func stripComment(source string) string {
if cut := strings.IndexAny(source, commentChars); cut >= 0 {
return strings.TrimRight(source[:cut], " \t") // also drop whitespace before the marker
}
return source
}
 
func main() {
for _, s := range []string{
"apples, pears # and bananas",
"apples, pears ; and bananas",
"no bananas",
} {
fmt.Printf("source:   %q\n", s)
fmt.Printf("stripped: %q\n", stripComment(s))
}
}

Output:

source:   "apples, pears # and bananas"
stripped: "apples, pears"
source:   "apples, pears ; and bananas"
stripped: "apples, pears"
source:   "no bananas"
stripped: "no bananas"

Haskell

ms = ";#"
 
main = getContents >>=
    mapM_ (putStrLn . takeWhile (`notElem` ms)) . lines

Icon and Unicon

# strip_comments: 
# return part of string up to first character in 'markers',
# or else the whole string if no comment marker is present
procedure strip_comments (str, markers)
return str ? tab(upto(markers) | 0)
end
 
procedure main ()
write (strip_comments ("apples, pears and bananas", cset ("#;")))
write (strip_comments ("apples, pears # and bananas", cset ("#;")))
write (strip_comments ("apples, pears ; and bananas", cset ("#;")))
end

Output:

apples, pears   and bananas
apples, pears 
apples, pears 

Inform 7

Home is a room.
 
When play begins:
strip comments from "apples, pears # and bananas";
strip comments from "apples, pears ; and bananas";
end the story.
 
To strip comments from (T - indexed text):
say "[T] -> ";
replace the regular expression "<#;>.*$" in T with "";
say "[T][line break]".

Since square brackets have a special meaning in strings, Inform's regular expression syntax uses angle brackets for character grouping.

J

Solution 1 (mask & filter):
strip=: #~  *./\@:-.@e.&';#'
Solution 2 (index & cut):
strip=: {.~  <./@i.&';#'
Example:
   strip 'apples, pears # and bananas'
apples, pears
strip 'apples, pears ; and bananas'
apples, pears

Java

import java.io.*;
 
public class StripLineComments{
public static void main( String[] args ){
if( args.length < 1 ){
System.out.println("Usage: java StripLineComments InputFile");
}
else{
String inputFile = args[0];
String input = "";
try{
BufferedReader reader = new BufferedReader( new FileReader( inputFile ) );
String line = "";
while((line = reader.readLine()) != null){
System.out.println( line.split("[#;]", 2)[0] ); // limit 2 keeps the empty first field when a line consists only of a marker
}
}
catch( Exception e ){
e.printStackTrace();
}
}
}
}


JavaScript

function stripComments(s) {
var re1 = /^\s+|\s+$/g; // Strip leading and trailing spaces
var re2 = /\s*[#;].*$/g; // Strip # or ; and everything after it to the end of the line, including preceding spaces
return s.replace(re1,'').replace(re2,'');
}
 
 
var s1 = 'apples, pears # and bananas';
var s2 = 'apples, pears ; and bananas';
 
alert(stripComments(s1) + '\n' + stripComments(s2));
 

A more efficient version that caches the regular expressions in a closure:

var stripComments = (function () {
var re1 = /^\s+|\s+$/g;
var re2 = /\s*[#;].*$/g;
return function (s) {
return s.replace(re1,'').replace(re2,'');
};
}());
 

One difference between the two versions: in the first, function declarations are processed before any code is executed, so the declaration can appear after the code that calls it. In the second, the expression that creates the function must be evaluated before the function is available, so it must come before the code that calls it.

Liberty BASIC

string1$ = "apples, pears # and bananas"
string2$ = "pears;, " + chr$(34) + "apples ; " + chr$(34) + " an;d bananas"
commentMarker$ = ";#"
Print parse$(string2$, commentMarker$)
End
 
Function parse$(string$, commentMarker$)
For i = 1 To Len(string$)
charIn$ = Mid$(string$, i, 1)
If charIn$ = Chr$(34) Then
inQuotes = Not(inQuotes)
End If
If Instr(commentMarker$, charIn$) And (inQuotes = 0) Then Exit For
next i
parse$ = Left$(string$, (i - 1))
End Function

Lua

comment_symbols = ";#"
 
s1 = "apples, pears # and bananas"
s2 = "apples, pears ; and bananas"
 
-- anchor the pattern at the start so a line beginning with a marker yields an empty string
print ( string.match( s1, "^[^"..comment_symbols.."]*" ) )
print ( string.match( s2, "^[^"..comment_symbols.."]*" ) )

Mathematica

a = "apples, pears # and bananas
apples, pears ; and bananas";
b = StringReplace[a, RegularExpression["[ ]+[#;].+[\n]"] -> "\n"];
StringReplace[b, RegularExpression["[ ]+[#;].+$"] -> ""] // FullForm

Output:

"apples, pears\napples, pears"

MATLAB / Octave

function line = stripcomment(line) 
e = min([find(line=='#',1),find(line==';',1)]);
if ~isempty(e)
e = e-1;
while e > 0 && isspace(line(e)) e = e - 1; end;
line = line(1:e);
end;
end;
 

Output:

>> stripcomment('apples, pears # and bananas\n')
ans = apples, pears
>> stripcomment('apples, pears ; and bananas\n')
ans = apples, pears

OCaml

let strip_comments str =
let len = String.length str in
let rec aux print i =
if i >= len then () else
match str.[i] with
| '#' | ';' ->
aux false (succ i)
| '\n' ->
print_char '\n';
aux true (succ i)
| c ->
if print then print_char c;
aux print (succ i)
in
aux true 0
 
let () =
strip_comments "apples, pears # and bananas\n";
strip_comments "apples, pears ; and bananas\n";
;;

or with an imperative style:

let strip_comments =
let print = ref true in
String.iter (function
| ';' | '#' -> print := false
| '\n' -> print_char '\n'; print := true
| c -> if !print then print_char c)

Pascal

See Delphi

Perl

while (<>)
{
s/[#;].*$//s; # remove comment
s/^\s+//; # remove leading whitespace
s/\s+$//; # remove trailing whitespace
print "$_\n"; # the trailing newline was removed with the whitespace, so add it back
}

Perl 6

$*IN.slurp.subst(/ \h* <[ # ; ]> \N* /, '', :g).print

PicoLisp

(for Str '("apples, pears # and bananas" "apples, pears ; and bananas")
(prinl (car (split (chop Str) "#" ";"))) )

Output:

apples, pears 
apples, pears 

PL/I

k = search(text, '#;');
if k = 0 then put skip list (text);
else put skip list (substr(text, 1, k-1));

Prolog

Works with: SWI Prolog

This version is implemented as a state automaton to strip multiple lines of comments.

stripcomment(A,B) :- stripcomment(A,B,a).
stripcomment([A|AL],[A|BL],a) :- \+ A=0';, \+ A=0'# , \+ A=10, \+ A=13 , stripcomment(AL,BL,a).
stripcomment([A|AL], BL ,a) :- ( A=0';; A=0'#), \+ A=10, \+ A=13 , stripcomment(AL,BL,b).
stripcomment([A|AL], BL ,b) :- \+ A=10, \+ A=13 , stripcomment(AL,BL,b).
stripcomment([A|AL],[A|BL],_M):- ( A=10; A=13), stripcomment(AL,BL,a).
stripcomment([],[],_M).
start :-
In = "apples, pears ; and bananas
apples, pears # and bananas",
stripcomment(In,Out),
format("~s~n",[Out]).

Output:

?- start.
apples, pears 
apples, pears 

This version uses Prolog's pattern matching with two append/3 calls to strip one line.

strip_1comment(A,D) :- ((S1=0'#;S1=0';),append(B,[S1|C],A)), \+ ((S2=0'#;S2=0';),append(_X,[S2|_Y],B)) -> B=D; A=D.

At the query console:

?- strip_1comment("apples, pears ; and bananas",O1),format("~s~n",[O1]).
apples, pears 
O1 = [97, 112, 112, 108, 101, 115, 44, 32, 112|...] .
?- strip_1comment("apples, pears # and bananas",O1),format("~s~n",[O1]).
apples, pears 
O1 = [97, 112, 112, 108, 101, 115, 44, 32, 112|...] .

PureBasic

Procedure.s Strip_comments(Str$)
Protected result$=Str$, l, l1, l2
l1 =FindString(Str$,"#",1)
l2 =FindString(Str$,";",1)
;
; Use whichever comment sign comes first in the string
If l1 And (l2 = 0 Or l1 < l2)
l=l1
Else
l=l2
EndIf
; l is 0 when no comment sign was found; otherwise keep the text before it
If l>0
result$=Left(Str$,l-1)
EndIf
ProcedureReturn result$
EndProcedure

Implementation

#instring1 ="apples, pears # and bananas"
#instring2 ="apples, pears ; and bananas"
 
PrintN(Strip_comments(#instring1))
PrintN(Strip_comments(#instring2))
Output:
apples, pears
apples, pears

Python

>>> marker, line = '#', 'apples, pears # and bananas'
>>> line[:line.index(marker)].strip()
'apples, pears'
>>>
>>> marker, line = ';', ' apples, pears ; and bananas'
>>> line[:line.index(marker)].strip()
'apples, pears'

The code above does not handle multiple markers at once; moreover, it raises a ValueError if a line doesn't contain the marker. Another implementation:

def remove_comments(line, sep):
    for s in sep:
        line = line.split(s)[0]
    return line.strip()

# test
print(remove_comments('apples ; pears # and bananas', ';#'))
print(remove_comments('apples ; pears # and bananas', '!'))
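
For the single-marker case, the error on lines without a marker can also be avoided with `str.partition`, which returns the whole line as its first item when the separator is absent. A minimal sketch; the name `strip_one_marker` is made up for illustration:

```python
def strip_one_marker(line, marker):
    # partition() never raises: without the marker, the first item is the whole line.
    return line.partition(marker)[0].strip()


print(strip_one_marker('apples, pears # and bananas', '#'))
print(strip_one_marker('apples, pears and bananas', '#'))
```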
 

R

This is most cleanly accomplished using the stringr package.

strip_comments <- function(str)
{
if(!require(stringr)) stop("you need to install the stringr package")
str_trim(str_split_fixed(str, "#|;", 2)[, 1])
}

Example usage:

x <-c(
"apples, pears # and bananas", # the requested hash test
"apples, pears ; and bananas", # the requested semicolon test
"apples, pears and bananas", # without a comment
" apples, pears # and bananas" # with preceding spaces
)
strip_comments(x)

Racket

 
#lang at-exp racket
 
(define comment-start-rx "[;#]")
 
(define text
@~a{apples, pears # and bananas
apples, pears ; and bananas
})
 
(define (strip-comments text [rx comment-start-rx])
(string-join
(for/list ([line (string-split text "\n")])
(string-trim line (pregexp (~a "\\s*" rx ".*")) #:left? #f))
"\n"))
 
;; Alternatively, do it in a single regexp operation
(define (strip-comments2 text [rx comment-start-rx])
(regexp-replace* (pregexp (~a "(?m:\\s*" rx ".*)")) text ""))
 
(strip-comments2 text) ; -> "apples, pears\napples, pears"
 

REXX

The first REXX subroutine takes advantage of the fact that there are only two single-character delimiters:

  • # (hash or pound sign),
  • ; (a semicolon).

The remaining subroutines take a general approach to the (number of) delimiters;
the third version is more straightforward and reads better.

All four versions trim leading and trailing blanks after stripping the "comments".

/*REXX program strips a string delimited by a hash (#) or semicolon (;).*/
old1=' apples, pears # and bananas' ; say ' old ───►'old1"◄───"
new1=stripCom1(old1)  ; say '1st version new ───►'new1"◄───"
new2=stripCom2(old1)  ; say '2nd version new ───►'new2"◄───"
new3=stripCom3(old1)  ; say '3rd version new ───►'new3"◄───"
new4=stripCom4(old1)  ; say '4th version new ───►'new4"◄───"
say copies('═',55)
old2=' apples, pears ; and bananas' ; say ' old ───►'old2"◄───"
new1=stripCom1(old2)  ; say '1st version new ───►'new1"◄───"
new2=stripCom2(old2)  ; say '2nd version new ───►'new2"◄───"
new3=stripCom3(old2)  ; say '3rd version new ───►'new3"◄───"
new4=stripCom4(old2)  ; say '4th version new ───►'new4"◄───"
exit /*stick a fork in it, we're done.*/
/*──────────────────────────────────STRIPCOM1 subroutine────────────────*/
stripCom1: procedure; parse arg x /*get the argument (string). */
x=translate(x,'#',";") /*translate semicolons to hash. */
parse var x x '#' /*parse string, ending in hash. */
return strip(x) /*return stripped shortened string*/
/*──────────────────────────────────STRIPCOM2 subroutine────────────────*/
stripCom2: procedure; parse arg x /*get the argument (string). */
d='#;' /*delimiter list to be used. */
d1=left(d,1) /*get the 1st character in delim.*/
x=translate(x,copies(d1,length(d)),d) /*trans all delims ──► 1st delim.*/
parse var x x (d1) /*parse string, ending in hash. */
return strip(x) /*return stripped shortened string*/
/*──────────────────────────────────STRIPCOM3 subroutine────────────────*/
stripCom3: procedure; parse arg x /*get the argument (string). */
d=';#' /*delimiter list to be used. */
do j=1 for length(d) /*process each delimiter singly. */
_=substr(d,j,1) /*use one delimiter at a time. */
parse var x x (_) /*parse X string for each delim. */
end /*j*/
return strip(x) /*return stripped shortened string*/
/*──────────────────────────────────STRIPCOM4 subroutine────────────────*/
stripCom4: procedure; parse arg x /*get the argument (string). */
d=';#' /*delimiter list to be used. */
do k=1 for length(d) /*process each delimiter singly. */
p=pos(substr(d,k,1),x) /*see if a delimiter is in X. */
if p\==0 then x=left(x,p-1) /*shorten the X string.*/
end /*k*/
return strip(x) /*return stripped shortened string*/

output

            old ───► apples, pears # and bananas◄───
1st version new ───►apples, pears◄───
2nd version new ───►apples, pears◄───
3rd version new ───►apples, pears◄───
4th version new ───►apples, pears◄───
═══════════════════════════════════════════════════════
            old ───► apples, pears ; and bananas◄───
1st version new ───►apples, pears◄───
2nd version new ───►apples, pears◄───
3rd version new ───►apples, pears◄───
4th version new ───►apples, pears◄───
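
The translate-then-split idea used by the second subroutine (map every delimiter to the first one, then cut at that single character) can be sketched in Python as well. This is only an illustration of the technique, assuming Python 3; the name `strip_by_translation` is hypothetical.

```python
def strip_by_translation(line, delimiters="#;"):
    first = delimiters[0]
    # Map every delimiter to the first one, much as TRANSLATE does in stripCom2.
    table = str.maketrans(delimiters, first * len(delimiters))
    # Cut at the first occurrence of that single remaining delimiter.
    head, _, _ = line.translate(table).partition(first)
    return head.strip()


print(strip_by_translation(' apples, pears ; and bananas'))
```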

Ruby

class String
def strip_comment( markers = ['#',';'] )
re = Regexp.union( markers ) # construct a regular expression which will match any of the markers
self[0, (self =~ re) || length] # slice the string up to where the regular expression matches (or keep it whole if there is no match), and return it.
end
end
 
puts 'apples, pears # and bananas'.strip_comment
str = 'apples, pears ; and bananas'
puts str.strip_comment

Output:

apples, pears 
apples, pears 

Scala

object StripComments {
def stripComments1(s:String, markers:String =";#")=s takeWhile (!markers.contains(_)) trim
 
// using regex and pattern matching
def stripComments2(s:String, markers:String =";#")={
val R=("(.*?)[" + markers + "].*").r
(s match {
case R(line) => line
case _ => s
}) trim
}
 
def print(s:String)={
println("'"+s+"' =>")
println(" '"+stripComments1(s)+"'")
println(" '"+stripComments2(s)+"'")
}
 
def main(args: Array[String]): Unit = {
print("apples, pears # and bananas")
print("apples, pears ; and bananas")
}
}

Output:

'apples, pears # and bananas' =>
   'apples, pears'
   'apples, pears'
'apples, pears ; and bananas' =>
   'apples, pears'
   'apples, pears'

Scheme

Works with: Guile
(use-modules (ice-9 regex))
 
(define (strip-comments s)
(regexp-substitute #f
(string-match "[ \t\r\n\v\f]*[#;].*" s) 'pre "" 'post))
 
(display (strip-comments "apples, pears # and bananas"))(newline)
(display (strip-comments "apples, pears ; and bananas"))(newline)
Output:
apples, pears
apples, pears

sed

#!/bin/sh
# Strip comments
echo "$1" | sed 's/ *[#;].*$//g' | sed 's/^ *//'

Seed7

$ include "seed7_05.s7i";
 
const func string: stripComment (in string: line) is func
result
var string: lineWithoutComment is "";
local
var integer: lineEnd is 0;
var integer: pos is 0;
begin
lineEnd := length(line);
for pos range 1 to length(line) do
if line[pos] in {'#', ';'} then
lineEnd := pred(pos);
pos := length(line);
end if;
end for;
lineWithoutComment := line[.. lineEnd];
end func;
 
const proc: main is func
local
var string: stri is "apples, pears # and bananas\n\
\apples, pears ; and bananas";
var string: line is "";
begin
writeln(stri);
writeln("====>");
for line range split(stri, '\n') do
writeln(stripComment(line));
end for;
end func;

Output:

apples, pears # and bananas
apples, pears ; and bananas
====>
apples, pears 
apples, pears 

Tcl

proc stripLineComments {inputString {commentChars ";#"}} {
# Switch the RE engine into line-respecting mode instead of the default whole-string mode
regsub -all -line "\[$commentChars\].*$" $inputString "" commentStripped
# Now strip the whitespace
regsub -all -line {^[ \t\r]*(.*\S)?[ \t\r]*$} $commentStripped {\1}
}

Demonstration:

# Multi-line string constant
set input "apples, pears # and bananas
apples, pears ; and bananas"

# Do the stripping
puts [stripLineComments $input]

Output:

apples, pears
apples, pears

The above code has one issue, though: its notion of a set of characters is very much that of the RE engine. That is possibly desirable, but handling an arbitrary sequence of characters as a set of separators requires a bit more cleverness.

proc stripLineComments {inputString {commentChars ";#"}} {
# Convert the character set into a transformation
foreach c [split $commentChars ""] {lappend map $c "\uFFFF"}; # *very* rare character!
# Apply transformation and then use a simpler constant RE to strip
regsub -all -line {\uFFFF.*$} [string map $map $inputString] "" commentStripped
# Now strip the whitespace
regsub -all -line {^[ \t\r]*(.*\S)?[ \t\r]*$} $commentStripped {\1}
}

Output in the example is the same as above.
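
The same concern (marker characters being read as regular-expression syntax) arises in other languages. As a hedged comparison, here is a Python sketch that escapes the marker characters before building the character class instead of mapping them to a rare character; the function name `strip_line_comments` mirrors the Tcl procedure but is otherwise an assumption.

```python
import re


def strip_line_comments(text, comment_chars=";#"):
    # re.escape keeps characters such as ']' or '^' from being read as regex syntax.
    char_class = "[" + re.escape(comment_chars) + "]"
    # Remove each marker and the rest of its line, plus blanks just before the marker.
    # '.' does not match a newline by default, so each line is handled separately.
    stripped = re.sub(r"[ \t]*" + char_class + r".*", "", text)
    # Trim remaining leading and trailing whitespace on every line.
    return "\n".join(line.strip() for line in stripped.split("\n"))


print(strip_line_comments("apples, pears # and bananas\napples, pears ; and bananas"))
```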

TUSCRIPT

$$ MODE TUSCRIPT
strngcomment=*
DATA apples, pears # and bananas
DATA apples, pears ; and bananas
 
BUILD S_TABLE comment_char="|#|;|"
 
LOOP s=strngcomment
x=SPLIT (s,comment_char,string,comment)
PRINT string
ENDLOOP

Output:

apples, pears
apples, pears

UNIX Shell

Works with: bash
Works with: pdksh

Adapted from the Advanced Bash-Scripting Guide, section 10.1 Manipulating Strings.

bash$ a='apples, pears ; and bananas'
bash$ b='apples, pears # and bananas'
bash$ echo ${a%%;*}
apples, pears
bash$ echo ${b%%#*}
apples, pears
bash$