Substring/Top and tail

From Rosetta Code
Revision as of 16:37, 12 October 2012 by Loren (talk | contribs) (Add XPL0)
You are encouraged to solve this task according to the task description, using any language you may know.

The task is to demonstrate how to remove the first and last characters from a string. The solution should demonstrate how to obtain the following results:

  • String with first character removed
  • String with last character removed
  • String with both the first and last characters removed

If the program uses UTF-8 or UTF-16, it must work on any valid Unicode code point, whether in the Basic Multilingual Plane or above it. The program must reference logical characters (code points), not 8-bit code units for UTF-8 or 16-bit code units for UTF-16. Programs for other encodings (such as 8-bit ASCII, or EUC-JP) are not required to handle all Unicode characters.
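
As a minimal sketch of the code point requirement, assuming Python 3 (where `str` is indexed by code point rather than by code unit), the three results can be checked on a string whose first and last characters lie above the Basic Multilingual Plane. The helper names are illustrative only and are not part of the task.

```python
def without_first(s: str) -> str:
    """Return s with its first code point removed."""
    return s[1:]


def without_last(s: str) -> str:
    """Return s with its last code point removed."""
    return s[:-1]


def without_both(s: str) -> str:
    """Return s with its first and last code points removed."""
    return s[1:-1]


# U+1D11E and U+1F600 lie above the BMP: each takes four bytes in UTF-8
# and a surrogate pair in UTF-16, but counts as one character here.
sample = "\U0001D11Eabc\U0001F600"

print(without_first(sample))
print(without_last(sample))
print(without_both(sample))  # abc
```

Because slicing counts code points, no surrogate pair or multibyte sequence is split.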


<lang Lisp>(defun str-rest (str)
  (coerce (rest (coerce str 'list)) 'string))

(defun rdc (xs)
  (if (endp (rest xs))
      nil
      (cons (first xs)
            (rdc (rest xs)))))

(defun str-rdc (str)
  (coerce (rdc (coerce str 'list)) 'string))

(str-rdc "string")
(str-rest "string")
(str-rest (str-rdc "string"))</lang>


<lang Ada>with Ada.Text_IO;

procedure Remove_Characters is

  S: String := "upraisers";
  use Ada.Text_IO;

begin
  Put_Line("Full String:   """ & S & """");
  Put_Line("Without_First: """ & S(S'First+1 .. S'Last) & """");
  Put_Line("Without_Last:  """ & S(S'First   .. S'Last-1) & """");
  Put_Line("Without_Both:  """ & S(S'First+1 .. S'Last-1) & """");

end Remove_Characters;</lang>


Full String:   "upraisers"
Without_First: "praisers"
Without_Last:  "upraiser"
Without_Both:  "praiser"


Translation of: AWK
Works with: ALGOL 68 version Revision 1 - no extensions to language used.
Works with: ALGOL 68G version Any - tested with release 1.18.0-9h.tiny.

<lang algol68>#!/usr/local/bin/a68g --script #

STRING str="upraisers"; printf(($gl$,

 str,                      # remove no characters #
 str[LWB str+1:         ], # remove the first character #
 str[         :UPB str-1], # remove the last character #
 str[LWB str+1:UPB str-1], # remove both the first and last character #
 str[LWB str+2:         ], # remove the first 2 characters #
 str[         :UPB str-2], # remove the last 2 characters #
 str[LWB str+1:UPB str-2], # remove 1 before and 2 after #
 str[LWB str+2:UPB str-1], # remove 2 before and one after #
 str[LWB str+2:UPB str-2]  # remove both the first and last 2 characters #

))</lang>

Output:

upraisers
praisers
upraiser
praiser
raisers
upraise
praise
raiser
raise


<lang AutoHotkey>myString := "knights"
MsgBox % SubStr(MyString, 2)
MsgBox % SubStr(MyString, 1, StrLen(MyString)-1)
MsgBox % SubStr(MyString, 2, StrLen(MyString)-2)</lang>


<lang awk>BEGIN {
  mystring="knights"
  print substr(mystring,2)                       # remove the first letter
  print substr(mystring,1,length(mystring)-1)    # remove the last character
  print substr(mystring,2,length(mystring)-2)    # remove both the first and last character
}</lang>


Bracmat uses UTF-8 internally. The function utf fails if its argument isn't a valid UTF-8 multibyte string, but in two slightly different ways: indefinitely or definitely. If the argument does not have the required number of bytes but otherwise seems to be OK, Bracmat's backtracking mechanism lengthens the argument and then calls utf again. This is repeated until utf either succeeds or definitely fails. The code is far from efficient.

<lang bracmat>(substringUTF-8=
  @( Δημοτική
   : (%?a&utf$!a) ?"String with first character removed"
   )
& @( Δημοτική
   : ?"String with last character removed" (?z&utf$!z)
   )
& @( Δημοτική
   :   (%?a&utf$!a)
       ?"String with both the first and last characters removed"
       (?z&utf$!z)
   )
& out
$ ("String with first character removed:" !"String with first character removed")
& out
$ ("String with last character removed:" !"String with last character removed")
& out
$ ( "String with both the first and last characters removed:"
    !"String with both the first and last characters removed"
  )
)</lang>

Output:

String with first character removed: ημοτική
String with last character removed: Δημοτικ
String with both the first and last characters removed: ημοτικ

If the string is known to consist of 8-bit characters, a simpler method suffices. The % and @ prefixes are essential: % matches one or more elements (bytes, in the case of string pattern matching), while @ matches zero or one element. Combined, these prefixes match exactly one byte.

<lang bracmat>(substring-8-bit=
  @("8-bit string":%@ ?"String with first character removed")
& @("8-bit string":?"String with last character removed" @)
& @( "8-bit string"
   : %@ ?"String with both the first and last characters removed" @
   )
& out
$ ("String with first character removed:" !"String with first character removed")
& out
$ ("String with last character removed:" !"String with last character removed")
& out
$ ( "String with both the first and last characters removed:"
    !"String with both the first and last characters removed"
  )
)</lang>

Output:

String with first character removed: -bit string
String with last character removed: 8-bit strin
String with both the first and last characters removed: -bit strin


<lang c>#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int main( int argc, char ** argv ){
  const char * str_a = "knight";
  const char * str_b = "socks";
  const char * str_c = "brooms";

  /* each buffer holds the shortened string plus the terminating NUL */
  char * new_a = malloc( strlen( str_a ) );      /* len-1 chars + NUL */
  char * new_b = malloc( strlen( str_b ) );      /* len-1 chars + NUL */
  char * new_c = malloc( strlen( str_c ) - 1 );  /* len-2 chars + NUL */

  strcpy( new_a, str_a + 1 );
  strncpy( new_b, str_b, strlen( str_b ) - 1 );
  new_b[ strlen( str_b ) - 1 ] = '\0';  /* strncpy does not terminate here */
  strncpy( new_c, str_c + 1, strlen( str_c ) - 2 );
  new_c[ strlen( str_c ) - 2 ] = '\0';

  printf( "%s\n%s\n%s\n", new_a, new_b, new_c );

  free( new_a );
  free( new_b );
  free( new_c );

  return 0;
}</lang>


ANSI C provides little functionality for text manipulation outside of string.h. While a number of libraries for this purpose have been written, this example uses only ANSI C.


<lang cpp>#include <string>
#include <iostream>

int main( ) {

  std::string word( "Premier League" ) ;
  std::cout << "Without first letter: " << word.substr( 1 ) << " !\n" ;
  std::cout << "Without last letter: " << word.substr( 0 , word.length( ) - 1 ) << " !\n" ;
  std::cout << "Without first and last letter: " << word.substr( 1 , word.length( ) - 2 ) << " !\n" ;
  return 0 ;

}</lang> Output:

Without first letter: remier League !
Without last letter: Premier Leagu !
Without first and last letter: remier Leagu !


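<lang C sharp>
using System;

class Program
{
    static void Main(string[] args)
    {
        string testString = "test";
        Console.WriteLine(testString.Substring(1));                         // first character removed
        Console.WriteLine(testString.Substring(0, testString.Length - 1));  // last character removed
        Console.WriteLine(testString.Substring(1, testString.Length - 2));  // first and last characters removed
    }
}
</lang>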




Version for ASCII strings or Unicode dstrings: <lang d>import std.stdio;

void main() {
    // strip first character
    writeln("knight"[1 .. $]);
    // strip last character
    writeln("socks"[0 .. $ - 1]);
    // strip both first and last characters
    writeln("brooms"[1 .. $ - 1]);
}</lang>



<lang Delphi>program TopAndTail;

{$APPTYPE CONSOLE}

const
  TEST_STRING = '1234567890';

begin
  Writeln(TEST_STRING);                                    // full string
  Writeln(Copy(TEST_STRING, 2, Length(TEST_STRING)));      // first character removed
  Writeln(Copy(TEST_STRING, 1, Length(TEST_STRING) - 1));  // last character removed
  Writeln(Copy(TEST_STRING, 2, Length(TEST_STRING) - 2));  // first and last characters removed
end.</lang>



<lang euphoria>function strip_first(sequence s)

   return s[2..$]

end function

function strip_last(sequence s)

   return s[1..$-1]

end function

function strip_both(sequence s)

   return s[2..$-1]

end function

puts(1, strip_first("knight"))  -- strip first character
puts(1, strip_last("write"))    -- strip last character
puts(1, strip_both("brooms"))   -- strip both first and last characters</lang>


In Forth, strings typically take up two cells on the stack, diagrammed ( c-addr u ), with C-ADDR the address of the string and U its length. Dropping leading and trailing characters then involves simple mathematical operations on the address or length, without mutating or copying the string.

<lang forth>: hello ( -- c-addr u )

 s" Hello" ;  

hello 1 /string type \ => ello

hello 1- type \ => hell

hello 1 /string 1- type \ => ell</lang>

This works for ASCII, and a slight variation (2 instead of 1 per character) will suffice for BIG5, GB2312, and the like, but Unicode-general code can use +X/STRING and X\STRING- from Forth-200x's XCHAR wordset.


<lang Fortran>program substring

 character(len=5) :: string
 string = "Hello"
 write (*,*) string
 write (*,*) string(2:)
 write (*,*) string( :len(string)-1)
 write (*,*) string(2:len(string)-1)

end program substring</lang>


Go strings are immutable byte sequences that can hold whatever you want them to hold. Common contents are ASCII and UTF-8. You use different techniques depending on how you are interpreting the string. The utf8 package functions shown here allow efficient extraction of the first and last runes without decoding the entire string.

<lang go>package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {

   // ASCII contents:  Interpreting "characters" as bytes.
   s := "ASCII"
   fmt.Println("String:                ", s)
   fmt.Println("First byte removed:    ", s[1:])
   fmt.Println("Last byte removed:     ", s[:len(s)-1])
   fmt.Println("First and last removed:", s[1:len(s)-1])
   // UTF-8 contents:  "Characters" as runes (unicode code points)
   u := "Δημοτική"
   fmt.Println("String:                ", u)
   _, sizeFirst := utf8.DecodeRuneInString(u)
   fmt.Println("First rune removed:    ", u[sizeFirst:])
   _, sizeLast := utf8.DecodeLastRuneInString(u)
   fmt.Println("Last rune removed:     ", u[:len(u)-sizeLast])
   fmt.Println("First and last removed:", u[sizeFirst:len(u)-sizeLast])

}</lang> Output:

String:                 ASCII
First byte removed:     SCII
Last byte removed:      ASCI
First and last removed: SCI
String:                 Δημοτική
First rune removed:     ημοτική
Last rune removed:      Δημοτικ
First and last removed: ημοτικ
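
The same idea, finding the byte width of only the first and last code points instead of decoding the whole string, can be sketched in Python on raw UTF-8 bytes. This is an illustration rather than a translation of the Go code: the two helper names are hypothetical, and the sketch assumes the input is valid, non-empty UTF-8.

```python
def first_rune_size(b: bytes) -> int:
    """Byte length of the first code point (assumes valid, non-empty UTF-8)."""
    size = 1
    # Continuation bytes have the bit pattern 10xxxxxx.
    while size < len(b) and (b[size] & 0xC0) == 0x80:
        size += 1
    return size


def last_rune_size(b: bytes) -> int:
    """Byte length of the last code point (assumes valid, non-empty UTF-8)."""
    size = 1
    # Walk backwards over continuation bytes until the lead byte is reached.
    while size < len(b) and (b[-size] & 0xC0) == 0x80:
        size += 1
    return size


u = "Δημοτική".encode("utf-8")
first, last = first_rune_size(u), last_rune_size(u)

print(u[first:].decode("utf-8"))               # ημοτική
print(u[:len(u) - last].decode("utf-8"))       # Δημοτικ
print(u[first:len(u) - last].decode("utf-8"))  # ημοτικ
```

Only the bytes at either end are inspected, so the cost does not depend on the length of the string.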


<lang qbasic>10 A$="knight":B$="socks":C$="brooms"
20 PRINT MID$(A$,2)
30 PRINT LEFT$(B$,LEN(B$)-1)
40 PRINT MID$(C$,2,LEN(C$)-2)</lang>


Solution: <lang groovy>def top  = { it.size() > 1 ? it[0..-2] : '' }
def tail = { it.size() > 1 ? it[1..-1] : '' }</lang>

Test: <lang groovy>def testVal = 'upraisers'
println """
original: ${testVal}
top:      ${top(testVal)}
tail:     ${tail(testVal)}
top&tail: ${tail(top(testVal))}
"""</lang>


original: upraisers
top:      upraiser
tail:     praisers
top&tail: praiser


<lang Haskell>-- We define the functions to return an empty string if the argument is too
-- short for the particular operation.

remFirst, remLast, remBoth :: String -> String

remFirst "" = ""
remFirst cs = tail cs

remLast "" = ""
remLast cs = init cs

remBoth (c:cs) = remLast cs
remBoth _      = ""

main :: IO ()
main = do
  let s = "Some string."
  mapM_ (\f -> putStrLn . f $ s) [remFirst, remLast, remBoth]</lang>

Icon and Unicon

The task is accomplished by sub-stringing. <lang Icon>procedure main()
write(s := "knight"," --> ", s[2:0])   # drop 1st char
write(s := "sock"," --> ", s[1:-1])    # drop last
write(s := "brooms"," --> ", s[2:-1])  # drop both
end</lang>

It could also be accomplished (less clearly) by assigning into the string as below. Very awkward for both front and back. <lang Icon>write(s := "knight"," --> ", s[1] := "", s) # drop 1st char</lang>


The monadic primitives }. (Behead) and }: (Curtail) are useful for this task.

Example use:
<lang j>   }. 'knight'      NB. drop first item
night
   }: 'socks'       NB. drop last item
sock
   }: }. 'brooms'   NB. drop first and last items
room</lang>



This solution takes two approaches. The first uses substring, which is relatively fast for small strings, since it simply grabs the characters within a set of given bounds. The second uses regular expressions, which have a higher overhead for such short strings.

<lang Java>public class RM_chars {
  public static void main( String[] args ){
    // first, do this by selecting a specific substring
    // to exclude the first and last characters
    System.out.println( "knight".substring( 1 ) );
    System.out.println( "socks".substring( 0, 4 ) );
    System.out.println( "brooms".substring( 1, 5 ) );
    // then do this using regular expressions
    System.out.println( "knight".replaceAll( "^.", "" ) );
    System.out.println( "socks".replaceAll( ".$", "" ) );
    System.out.println( "brooms".replaceAll( "^.|.$", "" ) );
  }
}</lang>
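
For comparison, the same two approaches (bounds-based extraction and a regular expression) can be sketched in Python. This is not the Java code above: the function names are hypothetical, and the pattern uses `\A` and `\Z` with `re.DOTALL` as an assumption, so that a leading or trailing newline is treated like any other character.

```python
import re


def strip_by_slice(s: str) -> str:
    """Drop the first and last characters using slice bounds."""
    return s[1:-1]


def strip_by_regex(s: str) -> str:
    """Drop the first and last characters using a regular expression."""
    # \A. matches the first character, .\Z the last one.
    return re.sub(r"\A.|.\Z", "", s, flags=re.DOTALL)


for word in ("brooms", "ab", "a", ""):
    # Both approaches are expected to agree, including on short inputs.
    print(repr(word), repr(strip_by_slice(word)), repr(strip_by_regex(word)))
```

The slice version does no pattern compilation or matching, which is why it is the lighter choice for short strings.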





<lang javascript>alert("knight".slice(1));       // strip first character
alert("socks".slice(0, -1));    // strip last character
alert("brooms".slice(1, -1));   // strip both first and last characters</lang>

Liberty BASIC

<lang lb>string$ = "Rosetta Code"
Print Mid$(string$, 2)
Print Left$(string$, (Len(string$) - 1))
Print Mid$(string$, 2, (Len(string$) - 2))</lang>

Locomotive Basic

<lang locobasic>10 a$="knight":b$="socks":c$="brooms"
20 PRINT MID$(a$,2)
30 PRINT LEFT$(b$,LEN(b$)-1)
40 PRINT MID$(c$,2,LEN(c$)-2)</lang>


<lang lua>print (string.sub("knights",2))    -- remove the first character
print (string.sub("knights",1,-2))    -- remove the last character
print (string.sub("knights",2,-2))    -- remove the first and last characters</lang>


<lang Mathematica>StringDrop["input string",1]
StringDrop["input string",-1]
StringTake["input string",{2,-2}]
</lang>

MATLAB / Octave

The following case will not handle UTF-8. However, MATLAB supports conversion of UTF-8 to UTF-16 using native2unicode(). <lang MATLAB>
   % String with first character removed
str(2:end)

   % String with last character removed
str(1:end-1)

   % String with both the first and last characters removed
str(2:end-1) </lang>


<lang Nemerle>using System;
using System.Console;

module RemoveChars
{
    Main() : void
    {
        def str = "*A string*";
        def end = str.Remove(str.Length - 1);  // from pos to end
        def beg = str.Remove(0, 1);            // start pos, # of chars to remove
        def both = str.Trim(array['*']);       // with Trim() you need to know which chars you're removing
        WriteLine($"$str -> $beg -> $end -> $both");
    }
}</lang>


<lang objeck>
bundle Default {
   class TopTail {
      function : Main(args : System.String[]) ~ Nil {
         string := "test";
         string->SubString(1, string->Size() - 1)->PrintLine();
         string->SubString(string->Size() - 1)->PrintLine();
         string->SubString(1, string->Size() - 2)->PrintLine();
      }
   }
}
</lang>


<lang ocaml>let strip_first_char str =

 if str = "" then "" else
 String.sub str 1 ((String.length str) - 1)

let strip_last_char str =

 if str = "" then "" else
 String.sub str 0 ((String.length str) - 1)

let strip_both_chars str =

 match String.length str with
 | 0 | 1 | 2 -> ""
 | len -> String.sub str 1 (len - 2)

let () =

 print_endline (strip_first_char "knight");
 print_endline (strip_last_char "socks");
 print_endline (strip_both_chars "brooms");
;;</lang>


<lang parigp>df(s)=concat(vecextract(Vec(s),1<<#s-2));
dl(s)=concat(vecextract(Vec(s),1<<(#s-1)-1));
db(s)=concat(vecextract(Vec(s),1<<(#s-1)-2));</lang>


See Delphi


<lang perl>print substr("knight",1), "\n";        # strip first character
print substr("socks", 0, -1), "\n";    # strip last character
print substr("brooms", 1, -1), "\n";   # strip both first and last characters</lang>

In Perl, we can also remove the last character from a string variable with the chop function:

<lang perl>$string = 'ouch';
$bits = chop($string);       # The last letter is returned by the chop function
print $bits;        # h
print $string;      # ouc    # See we really did chop the last letter off</lang>

Perl 6

Perl 6 has a substr routine similar to that of Perl. The only real difference is that it may be called as a subroutine or as a method.

<lang perl6>say substr('knight', 1);        # strip first character - sub
say 'knight'.substr(1);         # strip first character - method

say substr('socks', 0, -1);     # strip last character - sub
say 'socks'.substr( 0, -1);     # strip last character - method

say substr('brooms', 1, -1);    # strip both first and last characters - sub
say 'brooms'.substr(1, -1);     # strip both first and last characters - method</lang>

Perl 6 also has chop, though it works differently from Perl's. There is also p5chop, which works like Perl's chop.

<lang perl6>my $string = 'ouch';
say $string.chop;       # ouc - does not modify original $string
say $string;            # ouch
say $string.p5chop;     # h - returns the character chopped off and modifies $string
say $string;            # ouc</lang>


<lang php><?php
echo substr("knight", 1), "\n";       // strip first character
echo substr("socks", 0, -1), "\n";    // strip last character
echo substr("brooms", 1, -1), "\n";   // strip both first and last characters
?></lang>


<lang PicoLisp>: (pack (cdr (chop "knight")))          # Remove first character
-> "night"

: (pack (head -1 (chop "socks")))       # Remove last character
-> "sock"

: (pack (cddr (rot (chop "brooms"))))   # Remove first and last characters
-> "room"</lang>


<lang PL/I>
declare s character (100) varying;
s = 'now is the time to come to the aid of the party';
if length(s) <= 2 then stop;
put skip list ('First character removed=' || substr(s,2) );
put skip list ('Last character removed='  || substr(s, 1, length(s)-1) );
put skip list ('One character from each end removed=' ||
   substr(s, 2, length(s)-2) );
</lang> OUTPUT:

First character removed=ow is the time to come to the aid of the party
Last character removed=now is the time to come to the aid of the part 
One character from each end removed=ow is the time to come to the aid of the part 


Works with SWI-Prolog.

<lang Prolog>remove_first_last_chars :-
	L = "Rosetta",
	L = [_|L1],
	remove_last(L, L2),
	remove_last(L1, L3),
	writef('Original string          : %s\n', [L]),
	writef('Without first char       : %s\n', [L1]),
	writef('Without last char        : %s\n', [L2]),
	writef('Without first/last chars : %s\n', [L3]).

remove_last(L, LR) :-
	reverse(L, [_ | L1]),
	reverse(L1, LR).</lang> Output :

 ?- remove_first_last_chars.
Original string          : Rosetta
Without first char       : osetta
Without last char        : Rosett
Without first/last chars : osett


<lang PureBasic>If OpenConsole()

 PrintN(Right("knight", Len("knight") - 1))  ;strip the first letter
 PrintN(Left("socks", Len("socks")- 1))      ;strip the last letter
 PrintN(Mid("brooms", 2, Len("brooms") - 2)) ;strip both the first and last letter
 Print(#CRLF$ + #CRLF$ + "Press ENTER to exit"): Input()

EndIf</lang> Sample output:



<lang python>print "knight"[1:]     # strip first character
print "socks"[:-1]     # strip last character
print "brooms"[1:-1]   # strip both first and last characters</lang>


<lang rexx>/*REXX program to show removal of 1st/last/1st&last chars from a string.*/

z = 'abcdefghijk'

say '                  the original string =' z
say 'string first        character removed =' substr(z,2)
say 'string         last character removed =' left(z,length(z)-1)
say 'string first & last character removed =' substr(z,2,length(z)-2)
exit

       /* ┌───────────────────────────────────────────────┐
          │ however, the original string may be null,     │
          │ or of insufficient length which may cause the │
          │ BIFs to fail  (because of negative length).   │
          └───────────────────────────────────────────────┘ */

say '                  the original string =' z
say 'string first        character removed =' substr(z,2)
say 'string         last character removed =' left(z,max(0,length(z)-1))
say 'string first & last character removed =' substr(z,2,max(0,length(z)-2))</lang> output

                  the original string = abcdefghijk
string first        character removed = bcdefghijk
string         last character removed = abcdefghij
string first & last character removed = bcdefghij
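
The short-string concern raised in the REXX comment can be illustrated with a small sketch in Python, where slice bounds are clamped rather than rejected, so no explicit guard against a negative length is needed. The function name is hypothetical.

```python
def top_and_tail(s: str) -> tuple:
    """Return (first removed, last removed, both removed) for s."""
    return s[1:], s[:-1], s[1:-1]


# Null and very short strings yield empty results instead of an error.
for z in ("abcdefghijk", "ab", "a", ""):
    print(repr(z), top_and_tail(z))
```

For the one-character and empty inputs all three results are the empty string.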


<lang ruby>puts "knight"[1..-1]   # strip first character
puts "socks"[0..-2]    # strip last character
puts "socks".chop      # alternate way to strip last character
puts "brooms"[1..-2]   # strip both first and last characters</lang>


<lang runbasic>s$  = "Run BASIC"
print mid$(s$,2)             'strip first
print left$(s$,len(s$) -1)   'strip last
print mid$(s$,2,len(s$) -2)  'strip first and last</lang>


<lang scala>println("knight" tail)               // strip first character
println("socks" dropRight 1)         // strip last character
println("brooms".tail dropRight 1)   // strip both first and last characters</lang>


<lang seed7>$ include "seed7_05.s7i";

const proc: main is func
  local
    const string: stri is "upraisers";
  begin
    writeln("Full string:   " <& stri);
    writeln("Without first: " <& stri[2 ..]);
    writeln("Without last:  " <& stri[.. pred(length(stri))]);
    writeln("Without both:  " <& stri[2 .. pred(length(stri))]);
  end func;</lang>


Full string:   upraisers
Without first: praisers
Without last:  upraiser
Without both:  praiser


<lang snobol4> "knight" len(1) rem . output  ;* strip first character

    "socks" rtab(1) . output           ;* strip last character
    "brooms" len(1) rtab(1) . output   ;* strip both first and last characters</lang>


<lang tcl>puts [string range "knight" 1 end];      # strip first character
puts [string range "write" 0 end-1];     # strip last character
puts [string range "brooms" 1 end-1];    # strip both first and last characters</lang>


<lang tuscript>
$$ MODE TUSCRIPT
str="upraisers"
str1=EXTRACT (str,2,0)
str2=EXTRACT (str,0,-1)
str3=EXTRACT (str,2,-1)
PRINT str
PRINT str1
PRINT str2
PRINT str3
</lang> Output:



<lang vala>
// declare test string
string s = "Hello, world!";
// remove first letter
string s_first = s[1:s.length];
// remove last letter
string s_last = s[0:s.length - 1];
// remove first and last letters
string s_first_last = s[1:s.length - 1];
</lang>


<lang XPL0>include c:\cxpl\stdlib;
char S, P;
[S:= "Smiles";
Text(0, S+1);           \first character removed
CrLf(0);
P:= S + StrLen(S) - 2;  \point to last character in string
P(0):= P(0) ! $80;      \set the MSb on the last character
Text(0, S);             \last character removed
CrLf(0);
Text(0, S+1);           \first and last characters removed
CrLf(0);
]</lang>



ZX Spectrum Basic

<lang zxbasic>10 PRINT FN f$("knight"): REM strip the first letter
20 PRINT FN l$("socks"): REM strip the last letter
30 PRINT FN b$("brooms"): REM strip both the first and last letter
100 STOP

9000 DEF FN f$(a$)=a$(2 TO LEN(a$))
9010 DEF FN l$(a$)=a$(1 TO LEN(a$)-(1 AND (LEN(a$)>=1)))
9020 DEF FN b$(a$)=FN l$(FN f$(a$))
</lang>