Strip whitespace from a string/Top and tail

From Rosetta Code
Revision as of 20:59, 8 August 2011 by rosettacode>IanOsgood (Forth)
Task
You are encouraged to solve this task according to the task description, using any language you may know.

The task is to demonstrate how to strip leading and trailing whitespace from a string. The solution should demonstrate how to achieve the following three results:

  • String with leading whitespace removed
  • String with trailing whitespace removed
  • String with both leading and trailing whitespace removed

For the purposes of this task, whitespace includes non-printable characters such as the space character, the tab character, and other characters that have no corresponding graphical representation.
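As a rough illustration of the three results and of the broad whitespace definition above, here is a minimal Python sketch. The helper names `is_strippable`, `strip_leading`, `strip_trailing` and `strip_both` are made up for this example, and reading "no graphical representation" as "`str.isspace()` or not `str.isprintable()`" is an assumption, not part of the task.

```python
def is_strippable(ch):
    """Assumed reading of the task: whitespace or any non-printable character."""
    return ch.isspace() or not ch.isprintable()


def strip_leading(s):
    # Advance past every strippable character at the start.
    i = 0
    while i < len(s) and is_strippable(s[i]):
        i += 1
    return s[i:]


def strip_trailing(s):
    # Walk back from the end while the last kept character is strippable.
    j = len(s)
    while j > 0 and is_strippable(s[j - 1]):
        j -= 1
    return s[:j]


def strip_both(s):
    return strip_trailing(strip_leading(s))


if __name__ == "__main__":
    sample = " \t\x07 text \x00\n "
    for func in (strip_leading, strip_trailing, strip_both):
        print(repr(func(sample)))
```

The bounds checks come before the character test in both loops, so empty and all-whitespace strings are handled without reading past either end.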

C

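<lang c>#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/* Non-zero for whitespace and other non-printable characters, but not for
   the terminating NUL. The cast avoids passing a negative char to <ctype.h>. */
static int is_ws(char ch) {
  unsigned char c = (unsigned char)ch;
  return c != 0 && (isspace(c) || !isprint(c));
}

/* Returns a newly allocated copy of s without leading whitespace. */
char *ltrim(const char *s) {
  while (is_ws(*s)) ++s;
  return strdup(s);
}

/* Returns a newly allocated copy of s without trailing whitespace. */
char *rtrim(const char *s) {
  char *r = strdup(s);
  if (r != NULL) {
    size_t n = strlen(r);
    while (n > 0 && is_ws(r[n - 1])) --n;
    r[n] = 0;
  }
  return r;
}

/* Returns a newly allocated copy of s with both ends trimmed. */
char *trim(const char *s) {
  char *r = ltrim(s);
  char *f = (r != NULL) ? rtrim(r) : NULL;
  free(r);
  return f;
}

const char *a = " this is a string ";

int main(void) {
  char *b = ltrim(a);
  char *c = rtrim(a);
  char *d = trim(a);
  if (b != NULL && c != NULL && d != NULL)
    printf("'%s'\n'%s'\n'%s'\n", b, c, d);

  free(b);
  free(c);
  free(d);
  return 0;
}</lang>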

C++

<lang cpp>#include <boost/algorithm/string.hpp>
#include <string>
#include <iostream>

int main( ) {

  std::string testphrase( "    There are unwanted blanks here!    " ) ;
  std::string lefttrimmed = boost::trim_left_copy( testphrase ) ;
  std::string righttrimmed = boost::trim_right_copy( testphrase ) ;
  std::cout << "The test phrase is :" << testphrase << "\n" ;
  std::cout << "Trimmed on the left side :" << lefttrimmed << "\n" ;
  std::cout << "Trimmed on the right side :" << righttrimmed << "\n" ;
  boost::trim( testphrase ) ;
  std::cout << "Trimmed on both sides :" <<  testphrase  << "\n" ;
  return 0 ;

}</lang>
Output:

The test phrase is :    There are unwanted blanks here!    
Trimmed on the left side :There are unwanted blanks here!    
Trimmed on the right side :    There are unwanted blanks here!
Trimmed on both sides :There are unwanted blanks here!

D

<lang d>import std.stdio, std.string;

void main() {

   auto s = " \t \r \n String with spaces  \t  \r  \n  ";
   assert(s.stripl() == "String with spaces  \t  \r  \n  ");
   assert(s.stripr() == " \t \r \n String with spaces");
   assert(s.strip() == "String with spaces");

}</lang>

Delphi

<lang Delphi>program StripWhitespace;

{$APPTYPE CONSOLE}

uses SysUtils;

const

 TEST_STRING = '     String with spaces     ';

begin

 Writeln('"' + TEST_STRING + '"');
 Writeln('"' + TrimLeft(TEST_STRING) + '"');
 Writeln('"' + TrimRight(TEST_STRING) + '"');
 Writeln('"' + Trim(TEST_STRING) + '"');

end.</lang>

Forth

<lang forth>: -leading ( addr len -- addr' len' )

 begin over c@ bl = while 1 /string repeat ;

\ -trailing is built in

s" test " 2dup -leading cr type 2dup -trailing cr type

    -leading -trailing cr type</lang>

Go

<lang go>package main

import (

   "fmt"
   "strings"
   "unicode"

)

var simple = "\n\tsimple   "

func main() {

   show("original", simple)
   show("leading ws removed", strings.TrimLeftFunc(simple, unicode.IsSpace))
   show("trailing ws removed", strings.TrimRightFunc(simple, unicode.IsSpace))
   // equivalent to strings.TrimFunc(simple, unicode.IsSpace)
   show("both removed", strings.TrimSpace(simple))

}

func show(label, str string) {

   fmt.Printf("%s: |%s| %v\n", label, str, []rune(str))

}</lang>
The example text has a leading linefeed and tab, and three trailing spaces. The code uses the Unicode definition of whitespace. Other definitions could be implemented with a custom function given to TrimXFunc.

Output below shows the text surrounded by vertical bars to show the extent of whitespace, followed by a list of the character values in the string, to show exactly what whitespace is present.

original: |
        simple   | [10 9 115 105 109 112 108 101 32 32 32]
leading ws removed: |simple   | [115 105 109 112 108 101 32 32 32]
trailing ws removed: |
        simple| [10 9 115 105 109 112 108 101]
both removed: |simple| [115 105 109 112 108 101]
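The same way of making whitespace visible, with bars around the text plus a list of code points, can be sketched in Python. This is an illustration only: `show` here is a hypothetical helper modelled on the Go function above, and it returns the formatted line rather than printing it.

```python
def show(label, text):
    """Format text between bars, followed by its code points."""
    codes = " ".join(str(ord(ch)) for ch in text)
    return f"{label}: |{text}| [{codes}]"


if __name__ == "__main__":
    simple = "\n\tsimple   "
    print(show("original", simple))
    print(show("leading ws removed", simple.lstrip()))
    print(show("trailing ws removed", simple.rstrip()))
    print(show("both removed", simple.strip()))
```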

Icon and Unicon

This solution takes the phrase "other such characters that have no corresponding graphical representation" quite literally.
<lang Unicon>procedure main()

   unp := &cset[1+:32]++' \t'++&cset[127:0]   # all 'unprintable' chars
   s := " Hello, people of earth!  	"
   write("Original:      '",s,"'")
   write("leading trim:  '",reverse(trim(reverse(s),unp)),"'")
   write("trailing trim: '",trim(s,unp),"'")
   write("full trim:     '",reverse(trim(reverse(trim(s,unp)),unp)),"'")

end</lang>
A sample run:

->trim
Original:      ' Hello, people of earth!        '
leading trim:  'Hello, people of earth!         '
trailing trim: ' Hello, people of earth!'
full trim:     'Hello, people of earth!'
->

J

Note: The quote verb is only used to enclose the resulting string in single quotes so the beginning and end of the new string are visible.
<lang j>   require 'strings' NB. the strings library is automatically loaded in versions from J7 on
   quote dlb '  String with spaces   '    NB. delete leading blanks
'String with spaces   '
   quote dtb '  String with spaces   '    NB. delete trailing blanks
'  String with spaces'
   quote dltb '  String with spaces   '   NB. delete leading and trailing blanks
'String with spaces'</lang>
In addition, deb (delete extraneous blanks) will trim both leading and trailing blanks as well as replace consecutive spaces within the string with a single space.
<lang j>   quote deb ' String with spaces '   NB. delete extraneous blanks
'String with spaces'</lang>
These existing definitions can be easily amended to include whitespace other than spaces if desired.
<lang j>whpsc=: ' ',TAB                                  NB. define whitespace as desired
dlws=: }.~ (e.&whpsc i. 0:)                      NB. delete leading whitespace (spaces and tabs)
dtws=: #~ ([: +./\. -.@:e.&whpsc)                NB. delete trailing whitespace
dltws=: #~ ([: (+./\ *. +./\.) -.@:e.&whpsc)     NB. delete leading & trailing whitespace
dews=: #~ (+. (1: |. (> </\)))@(-.@:e.&whpsc)    NB. delete extraneous whitespace</lang>
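For comparison, the effect of deb (trim both ends and squeeze interior runs of spaces) can be sketched in Python. The function name `delete_extra_blanks` is invented for this example, and treating only the space character as a blank is an assumption based on the description above.

```python
def delete_extra_blanks(s):
    """Drop leading/trailing spaces and squeeze interior runs of spaces to one."""
    # split(" ") yields empty strings between consecutive spaces; discard those.
    return " ".join(part for part in s.split(" ") if part)


if __name__ == "__main__":
    print(repr(delete_extra_blanks("  String   with spaces   ")))
```

Tabs and newlines are left untouched here; passing no argument to `split()` instead would squeeze every kind of whitespace.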

Java

Left trim and right trim taken from here.

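Character.isWhitespace() returns true if the character is a Unicode space, line, or paragraph separator other than the non-breaking spaces '\u00A0', '\u2007', and '\u202F', or if it is one of '\u0009', '\u000A', '\u000B', '\u000C', '\u000D', '\u001C', '\u001D', '\u001E', or '\u001F'. Note that String.trim() uses a different rule: it removes every leading and trailing character up to and including '\u0020'.
<lang java>public class Trims{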

  public static String ltrim(String s){
     int i = 0;
     while (i < s.length() && Character.isWhitespace(s.charAt(i))){
        i++;
     }
     return s.substring(i);
  }
  public static String rtrim(String s){
     int i = s.length() - 1;
     while (i >= 0 && Character.isWhitespace(s.charAt(i))){
        i--;
     }
     return s.substring(0, i + 1);
  }
  public static void main(String[] args){
     String s = " \t \r \n String with spaces  \t  \r  \n  ";
     System.out.println(ltrim(s));
     System.out.println(rtrim(s));
     System.out.println(s.trim()); //trims both ends
  }

}</lang>
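How much a trim removes depends on the whitespace predicate behind it. Java documents Character.isWhitespace() as true for Unicode space, line and paragraph separators other than the non-breaking spaces U+00A0, U+2007 and U+202F, plus the control characters U+0009 to U+000D and U+001C to U+001F. As a hedged illustration, that rule can be approximated in Python and compared with Python's own `str.isspace()`. The function name `java_style_is_whitespace` is hypothetical, and the code is a sketch of the documented rule, not a verified equivalent.

```python
import unicodedata

NON_BREAKING = "\u00a0\u2007\u202f"
CONTROL_WS = "\t\n\x0b\x0c\r\x1c\x1d\x1e\x1f"


def java_style_is_whitespace(ch):
    """Approximate the documented rule for a single character (assumption)."""
    if ch in CONTROL_WS:
        return True
    is_separator = unicodedata.category(ch) in ("Zs", "Zl", "Zp")
    return is_separator and ch not in NON_BREAKING


if __name__ == "__main__":
    for ch in (" ", "\t", "\u00a0", "\u2028", "x"):
        print(f"U+{ord(ch):04X}", java_style_is_whitespace(ch), ch.isspace())
```

The non-breaking space is where the two predicates disagree: Python's `str.isspace()` accepts it, while the rule sketched here does not.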

Lua

<lang lua>str = " \t \r \n String with spaces \t \r \n "

print( string.format( "Leading whitespace removed: %s", str:match( "^%s*(.+)" ) ) )
print( string.format( "Trailing whitespace removed: %s", str:match( "(.-)%s*$" ) ) )
print( string.format( "Leading and trailing whitespace removed: %s", str:match( "^%s*(.-)%s*$" ) ) )</lang>

Nemerle

<lang Nemerle>def str = "\t\n\t A string with\nwhitespace\n\n\t ";
WriteLine(str.TrimStart());
WriteLine(str.TrimEnd());
WriteLine(str.Trim());   // both ends at once, of course, internal whitespace is preserved in all 3</lang>

Objective-C

Works with: Cocoa
Works with: GNUstep

<lang objc>#import <Foundation/Foundation.h>

@interface NSString (RCExt)
-(NSString *) ltrim;
-(NSString *) rtrim;
-(NSString *) trim;
@end

@implementation NSString (RCExt)
-(NSString *) ltrim {

 NSInteger i;
 NSCharacterSet *cs = [NSCharacterSet whitespaceAndNewlineCharacterSet];
 for(i = 0; i < [self length]; i++)
 {
   if ( ![cs characterIsMember: [self characterAtIndex: i]] ) break;
 }
 return [self substringFromIndex: i];

}

-(NSString *) rtrim {

 NSInteger i;
 NSCharacterSet *cs = [NSCharacterSet whitespaceAndNewlineCharacterSet];
 for(i = [self length] -1; i >= 0; i--)
 {
   if ( ![cs characterIsMember: [self characterAtIndex: i]] ) break;    
 }
 return [self substringToIndex: (i+1)];

}

-(NSString *) trim {

 return [self stringByTrimmingCharactersInSet:
           [NSCharacterSet whitespaceAndNewlineCharacterSet]];

}
@end

int main() {

 NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
 NSString *s = @"     this is a string     ";
 NSLog(@"'%@'", s);
 NSLog(@"'%@'", [s ltrim]);
 NSLog(@"'%@'", [s rtrim]);
 NSLog(@"'%@'", [s trim]);


 [pool release];
 return 0;

}</lang>

OCaml

<lang ocaml>let left_pos s len =

 let rec aux i =
   if i >= len then None
   else match s.[i] with
   | ' ' | '\n' | '\t' | '\r' -> aux (succ i)
   | _ -> Some i
 in
 aux 0

let right_pos s len =

 let rec aux i =
   if i < 0 then None
   else match s.[i] with
   | ' ' | '\n' | '\t' | '\r' -> aux (pred i)
   | _ -> Some i
 in
 aux (pred len)

let trim s =

 let len = String.length s in
 match left_pos s len, right_pos s len with
 | Some i, Some j -> String.sub s i (j - i + 1)
 | None, None -> ""
 | _ -> assert false

let ltrim s =

 let len = String.length s in
 match left_pos s len with
 | Some i -> String.sub s i (len - i)
 | None -> ""

let rtrim s =

 let len = String.length s in
 match right_pos s len with
 | Some i -> String.sub s 0 (i + 1)
 | None -> ""</lang>

We put the previous code in a file called "trim.ml", and then test these functions in the toplevel:

$ ocaml
# #use "trim.ml" ;;
val left_pos : string -> int -> int option = <fun>
val right_pos : string -> int -> int option = <fun>
val trim : string -> string = <fun>
val ltrim : string -> string = <fun>
val rtrim : string -> string = <fun>
# let s = " \t \r \n String with spaces \t \r \n " ;;
val s : string = " \t \r \n String with spaces \t \r \n "
# trim s ;;
- : string = "String with spaces"
# ltrim s ;;
- : string = "String with spaces \t \r \n "
# rtrim s ;;
- : string = " \t \r \n String with spaces"
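The index-based approach used here (find the first and last non-whitespace positions, then take a substring) carries over directly to other languages. A minimal Python sketch follows, with the names `left_pos` and `right_pos` and the four-character whitespace set taken from the OCaml code above, and `None` standing in for the option type. It is an illustrative translation only.

```python
WS = " \n\t\r"


def left_pos(s):
    """Index of the first non-whitespace character, or None."""
    for i, ch in enumerate(s):
        if ch not in WS:
            return i
    return None


def right_pos(s):
    """Index of the last non-whitespace character, or None."""
    for i in range(len(s) - 1, -1, -1):
        if s[i] not in WS:
            return i
    return None


def trim(s):
    i, j = left_pos(s), right_pos(s)
    # Both positions are None together, exactly when s has no other characters.
    return "" if i is None else s[i:j + 1]


def ltrim(s):
    i = left_pos(s)
    return "" if i is None else s[i:]


def rtrim(s):
    j = right_pos(s)
    return "" if j is None else s[:j + 1]
```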

OpenEdge/Progress

<lang progress>DEF VAR cc AS CHAR INIT " string with spaces ".

MESSAGE

  "|" + LEFT-TRIM( cc )  + "|" SKIP
  "|" + RIGHT-TRIM( cc ) + "|" SKIP
  "|" + TRIM( cc )       + "|"

VIEW-AS ALERT-BOX.</lang>
Output:

---------------------------
Message
---------------------------
|string with spaces   | 
|   string with spaces| 
|string with spaces|
---------------------------
OK   
---------------------------

Perl

<lang perl>use strict;

sub ltrim {

   my $c = shift;
   $c =~ s/^\s+//;
   return $c;

}

sub rtrim {

   my $c = shift;
   $c =~ s/\s+$//;
   return $c;

}

sub trim {

   my $c = shift;
   return ltrim(rtrim($c));

}

my $p = " this is a string ";

print "'", $p, "'\n";
print "'", trim($p), "'\n";
print "'", ltrim($p), "'\n";
print "'", rtrim($p), "'\n";</lang>
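The same anchored substitutions can be written with Python's `re` module. This is a sketch for comparison: the function names are reused from the Perl code, `\A` and `\Z` are used as unambiguous start and end anchors, and for `str` patterns `\s` in Python matches Unicode whitespace, which may not coincide exactly with Perl's default.

```python
import re


def ltrim(s):
    # Remove one run of whitespace anchored at the very start.
    return re.sub(r"\A\s+", "", s)


def rtrim(s):
    # Remove one run of whitespace anchored at the very end.
    return re.sub(r"\s+\Z", "", s)


def trim(s):
    return ltrim(rtrim(s))


if __name__ == "__main__":
    p = " this is a string "
    for text in (p, trim(p), ltrim(p), rtrim(p)):
        print(f"'{text}'")
```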

Perl 6

<lang perl6>my $s = "\r\n \t\x2029 Good Stuff \x2028\n";
say $s.trim.perl;
say $s.trim-leading.perl;
say $s.trim-trailing.perl;</lang>

QBasic

<lang qbasic>mystring$ = LTRIM$(mystring$)           ' remove leading whitespace
mystring$ = RTRIM$(mystring$)           ' remove trailing whitespace
mystring$ = LTRIM$(RTRIM$(mystring$))   ' remove both leading and trailing whitespace
</lang>

PL/I

<lang PL/I>put ( trim(text, ' ', '') );   /* trims leading blanks.              */
put ( trim(text, '', ' ') );   /* trims trailing blanks.             */
put ( trim(text) );            /* trims leading and trailing blanks. */
</lang>
To remove any white-space character(s) in a portable way:
<lang PL/I>declare whitespace character(33) value
   ((substr(collate(), 1, 32) || ' '));

put ( trim(text, whitespace) );             /* trims leading white space.  */
put ( trim(text, '', whitespace) );         /* trims trailing white space. */
put ( trim(text, whitespace, whitespace) ); /* trims leading and trailing  */
                                            /* white space.                */
</lang>
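The idea of passing an explicit set of characters to strip has a close analogue in Python, where `str.lstrip`, `str.rstrip` and `str.strip` accept such a set. A minimal sketch, assuming the 33-character set corresponds to code points 0 to 31 plus the space; the names `WHITESPACE`, `strip_leading`, `strip_trailing` and `strip_both` are illustrative only.

```python
# Assumed set: the 32 control characters U+0000..U+001F followed by a space.
WHITESPACE = "".join(chr(c) for c in range(32)) + " "


def strip_leading(text):
    return text.lstrip(WHITESPACE)


def strip_trailing(text):
    return text.rstrip(WHITESPACE)


def strip_both(text):
    return text.strip(WHITESPACE)


if __name__ == "__main__":
    sample = "\x00\t text \x1f "
    print(repr(strip_leading(sample)))
    print(repr(strip_trailing(sample)))
    print(repr(strip_both(sample)))
```

Because the set is spelled out, characters outside it (for example U+007F) are kept even though they have no graphical form.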

PicoLisp

<lang PicoLisp>(de trimLeft (Str)

  (pack (flip (trim (flip (chop Str))))) )

(de trimRight (Str)

  (pack (trim (chop Str))) )

(de trimBoth (Str)

  (pack (clip (chop Str))) )</lang>

Test:

: (trimLeft " ^G ^I trimmed left ^L ")
-> "trimmed left ^L "

: (trimRight " ^G ^I trimmed right ^L ")
-> " ^G ^I trimmed right"

: (trimBoth " ^G ^I trimmed both ^L ")
-> "trimmed both"
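The flip/trim/flip idiom (reverse, trim the right side, reverse back) gives a left trim in any language that only offers a right trim. A Python sketch purely to show the technique, since Python already has `lstrip`; `trim_left_via_reverse` is a made-up name.

```python
def trim_left_via_reverse(s):
    """Left-trim by reversing, right-trimming, and reversing back."""
    return s[::-1].rstrip()[::-1]


if __name__ == "__main__":
    print(repr(trim_left_via_reverse("  \t trimmed left \n ")))
```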

PureBasic

Note: if only spaces need to be removed, PureBasic provides commands that do this: LTrim(), RTrim(), and Trim(). To handle a larger selection of whitespace, the following functions meet the task.
<lang PureBasic>;define the whitespace as desired
#whitespace$ = " " + Chr($9) + Chr($A) + Chr($B) + Chr($C) + Chr($D) + Chr($1C) + Chr($1D) + Chr($1E) + Chr($1F)

Procedure.s myLTrim(source.s)

 Protected i, *ptrChar.Character, length = Len(source)
 *ptrChar = @source
 For i = 1 To length
   If Not FindString(#whitespace$, Chr(*ptrChar\c))
     ProcedureReturn Right(source, length + 1 - i)
   EndIf
   *ptrChar + SizeOf(Character)
 Next

EndProcedure

Procedure.s myRTrim(source.s)

 Protected i, *ptrChar.Character, length = Len(source)
 *ptrChar = @source + (length - 1) * SizeOf(Character)
 For i = length To 1 Step - 1
   If Not FindString(#whitespace$, Chr(*ptrChar\c))
     ProcedureReturn Left(source, i)
   EndIf
   *ptrChar - SizeOf(Character)
 Next

EndProcedure

Procedure.s myTrim(source.s)

 ProcedureReturn myRTrim(myLTrim(source))

EndProcedure

If OpenConsole()

 PrintN(#DQUOTE$ + myLTrim("  Top  ") + #DQUOTE$)
 PrintN(#DQUOTE$ + myRTrim("  Tail  ") + #DQUOTE$)
 PrintN(#DQUOTE$ +  myTrim("  Both  ") + #DQUOTE$)
 
 Print(#CRLF$ + #CRLF$ + "Press ENTER to exit"): Input()
 CloseConsole()

EndIf</lang>
Sample output:

"Top  "
"  Tail"
"Both"

Python

<lang python>>>> s = ' \t \r \n String with spaces \t \r \n '
>>> s
' \t \r \n String with spaces \t \r \n '
>>> s.lstrip()
'String with spaces \t \r \n '
>>> s.rstrip()
' \t \r \n String with spaces'
>>> s.strip()
'String with spaces'
>>> </lang>

Sather

<lang sather>class MAIN is

   ltrim(s :STR) :STR is
     i ::= 0;
     loop while!(i < s.size);
       if " \t\f\v\n".contains(s[i]) then
          i := i + 1;
       else
          break!;
       end;
     end;
     return s.tail(s.size - i);
   end;
   rtrim(s :STR) :STR is
     i ::= s.size-1;
     loop while!(i >= 0);
       if " \t\f\v\n".contains(s[i]) then
          i := i - 1;
       else
          break!;
       end;
     end;
     return s.head(i+1);
   end;
   trim(s :STR) :STR is
      return ltrim(rtrim(s));
   end;


   main is
     p ::= "     this is a string     ";
     #OUT + ltrim(p).pretty + "\n";
     #OUT + rtrim(p).pretty + "\n";
     #OUT + trim(p).pretty + "\n";
   end;

end;</lang>

Smalltalk

Works with: GNU Smalltalk

<lang smalltalk>String extend [

  ltrim [
     ^self replacingRegex: '^\s+' with: ''.
  ]
  rtrim [
     ^self replacingRegex: '\s+$' with: ''.
  ]
  trim [
     ^self ltrim rtrim.
  ]

]

|a|
a := ' this is a string '.

('"%1"' % {a}) displayNl.
('"%1"' % {a ltrim}) displayNl.
('"%1"' % {a rtrim}) displayNl.
('"%1"' % {a trim}) displayNl.</lang>

Tcl

Whitespace stripping is done with string trim and related commands:
<lang tcl>set str "      hello world      "
puts "original: >$str<"
puts "trimmed head: >[string trimleft $str]<"
puts "trimmed tail: >[string trimright $str]<"
puts "trimmed both: >[string trim $str]<"</lang>
Output:

original: >      hello world      <
trimmed head: >hello world      <
trimmed tail: >      hello world<
trimmed both: >hello world<

TUSCRIPT

<lang tuscript>
$$ MODE TUSCRIPT
str= " sentence w/whitespace before and after "
trimmedtop=EXTRACT (str,":<|<> :"|,0)
trimmedtail=EXTRACT (str,0,":<> >|:")
trimmedboth=SQUEEZE(str)
PRINT "string <|", str," >|"
PRINT "trimmed on top <|",trimmedtop,">|"
PRINT "trimmed on tail <|", trimmedtail,">|"
PRINT "trimmed on both <|", trimmedboth,">|"
</lang>
Output:

string           <|      sentence w/whitespace before and after     >|
trimmed on top   <|sentence w/whitespace before and after    >|
trimmed on tail  <|      sentence w/whitespace before and after>|
trimmed on both  <|sentence w/whitespace before and after>|