Substring

From Rosetta Code
Task
Substring
You are encouraged to solve this task according to the task description, using any language you may know.

In this task display a substring:

  • starting from n characters in and of m length;
  • starting from n characters in, up to the end of the string;
  • whole string minus last character;
  • starting from a known character within the string and of m length;
  • starting from a known substring within the string and of m length.

If the program uses UTF-8 or UTF-16, it must work on any valid Unicode code point, whether in the Basic Multilingual Plane or above it. The program must reference logical characters (code points), not 8-bit code units for UTF-8 or 16-bit code units for UTF-16. Programs for other encodings (such as 8-bit ASCII, or EUC-JP) are not required to handle all Unicode characters.
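As a point of reference, the five cases can be sketched in Python, whose str type is indexed by code point rather than by code unit. The function name substrings, the sample string, and the values of n and m are illustrative assumptions, not part of the task.

```python
def substrings(s: str, n: int, m: int, ch: str, sub: str) -> list:
    """Return the five substrings requested by the task (0-based offsets)."""
    i = s.index(ch)    # raises ValueError if ch is absent
    j = s.index(sub)   # raises ValueError if sub is absent
    return [
        s[n:n + m],    # n characters in, length m
        s[n:],         # n characters in, up to the end
        s[:-1],        # whole string minus the last character
        s[i:i + m],    # from a known character, length m
        s[j:j + m],    # from a known substring, length m
    ]


if __name__ == "__main__":
    # U+1D11E (musical symbol G clef) lies above the Basic Multilingual Plane.
    for part in substrings("ab\U0001D11Ecdefgh", 2, 3, "d", "ef"):
        print(part)
```

Because slicing counts code points, the character above the Basic Multilingual Plane occupies exactly one position in every slice.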

Ada

String in Ada is an array of Character elements indexed by Positive:

type String is array (Positive range <>) of Character;

A substring is a first-class object in Ada: an anonymous subtype of String, which the language calls a slice. Slices can be retrieved, assigned, and passed to subprograms as parameters in mutable or immutable mode. A slice is specified as:

A (<first-index>..<last-index>)

A String in Ada can start at any positive index. The implementation below therefore uses Str'First in every slice. In this concrete case Str'First is 1, but it is left in the code intentionally because the task treats N as an offset from the beginning of the string rather than as an index into it. Dealing with slices this way is unusual in Ada; one normally uses plain string indices instead.

with Ada.Text_IO;        use Ada.Text_IO;
with Ada.Strings.Fixed; use Ada.Strings.Fixed;
 
procedure Test_Slices is
Str : constant String := "abcdefgh";
N : constant := 2;
M : constant := 3;
begin
Put_Line (Str (Str'First + N - 1..Str'First + N + M - 2));
Put_Line (Str (Str'First + N - 1..Str'Last));
Put_Line (Str (Str'First..Str'Last - 1));
Put_Line (Head (Tail (Str, Str'Last - Index (Str, "d", 1)), M));
Put_Line (Head (Tail (Str, Str'Last - Index (Str, "de", 1) - 1), M));
end Test_Slices;
Output:
bcd
bcdefgh
abcdefg
efg
fgh
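
The distinction drawn above, between an offset from the start of the string and an index into it, can be illustrated outside Ada. The following Python sketch models a string whose lowest index is arbitrary; the parameter first and the helper name slice_by_offset are hypothetical and only mirror the Str'First arithmetic of the first slice.

```python
def slice_by_offset(s: str, first: int, n: int, m: int):
    """Translate a 1-based offset n and a length m into the inclusive index
    range of an array whose lowest index is `first`, then take that slice."""
    lo = first + n - 1        # index of the first character of the slice
    hi = first + n + m - 2    # index of the last character (inclusive)
    # Python strings always start at 0, so shift back by `first` to slice.
    return lo, hi, s[lo - first:hi - first + 1]


if __name__ == "__main__":
    print(slice_by_offset("abcdefgh", 1, 2, 3))   # indices 2..4
    print(slice_by_offset("abcdefgh", 10, 2, 3))  # indices 11..13, same text
```

The index range changes with the lower bound, while the selected text does not.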

Aikido

Aikido uses square brackets for slices, with the syntax [start:end]. To slice by length, add the length to the start index. Shifting a string left or right removes characters from its ends.

 
const str = "abcdefg"
var n = 2
var m = 3
 
println (str[n:n+m-1]) // pos 2 length 3
println (str[n:]) // pos 2 to end
println (str >> 1) // remove last character
var p = find (str, 'c')
println (str[p:p+m-1]) // from pos of p length 3
 
var s = find (str, "bc")
println (str[s:s+m-1]) // pos of bc length 3
 

Aime

text s;
data b, d;
 
s = "The quick brown fox jumps over the lazy dog.";
 
o_text(cut(s, 4, 15));
o_newline();
o_text(cut(s, 4, length(s)));
o_newline();
o_text(delete(s, -1));
o_newline();
o_text(cut(s, index(s, 'q'), 5));
o_newline();
 
b_cast(b, s);
b_cast(d, "brown");
o_text(cut(s, b_find(b, d), 15));
o_newline();
Output:
quick brown fox
quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog
quick
brown fox jumps

ALGOL 68

Translation of: Python
Works with: ALGOL 68 version Standard - no extensions to language used
Works with: ALGOL 68G version Any - tested with release 1.18.0-9h.tiny
main: (
STRING s = "abcdefgh";
INT n = 2, m = 3;
CHAR char = "d";
STRING chars = "cd";
 
printf(($gl$, s[n:n+m-1]));
printf(($gl$, s[n:]));
printf(($gl$, s[:UPB s-1]));
 
INT pos;
char in string("d", pos, s);
printf(($gl$, s[pos:pos+m-1]));
string in string("de", pos, s);
printf(($gl$, s[pos:pos+m-1]))
)
Output:
bcd
bcdefgh
abcdefg
def
def

AutoHotkey

The code contains some alternatives.

String := "abcdefghijklmnopqrstuvwxyz"
; also: String = abcdefghijklmnopqrstuvwxyz
n := 12
m := 5
 
; starting from n characters in and of m length;
subString := SubStr(String, n, m)
; alternative: StringMid, subString, String, n, m
MsgBox % subString
 
; starting from n characters in, up to the end of the string;
subString := SubStr(String, n)
; alternative: StringMid, subString, String, n
MsgBox % subString
 
; whole string minus last character;
StringTrimRight, subString, String, 1
; alternatives: subString := SubStr(String, 1, StrLen(String) - 1)
; StringMid, subString, String, 1, StrLen(String) - 1
MsgBox % subString
 
; starting from a known character within the string and of m length;
findChar := "q"
subString := SubStr(String, InStr(String, findChar), m)
; alternatives: RegExMatch(String, findChar . ".{" . m - 1 . "}", subString)
; StringMid, subString, String, InStr(String, findChar), m
MsgBox % subString
 
; starting from a known substring within the string and of m length.
findString := "pq"
subString := SubStr(String, InStr(String, findString), m)
; alternatives: RegExMatch(String, findString . ".{" . m - StrLen(findString) . "}", subString)
; StringMid, subString, String, InStr(String, findString), m
MsgBox % subString
 
Output:
 lmnop
 lmnopqrstuvwxyz
 abcdefghijklmnopqrstuvwxy
 qrstu
 pqrst

AWK

Translation of: AutoHotkey
BEGIN {
str = "abcdefghijklmnopqrstuvwxyz"
n = 12
m = 5
 
print substr(str, n, m)
print substr(str, n)
print substr(str, 1, length(str) - 1)
print substr(str, index(str, "q"), m)
print substr(str, index(str, "pq"), m)
}
Output:
$ awk -f substring.awk  
lmnop
lmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
qrstu
pqrst

BASIC

DIM baseString AS STRING, subString AS STRING, findString AS STRING
DIM m AS INTEGER, n AS INTEGER
 
baseString = "abcdefghijklmnopqrstuvwxyz"
n = 12
m = 5
 
' starting from n characters in and of m length;
subString = MID$(baseString, n, m)
PRINT subString
 
' starting from n characters in, up to the end of the string;
subString = MID$(baseString, n)
PRINT subString
 
' whole string minus last character;
subString = LEFT$(baseString, LEN(baseString) - 1)
PRINT subString
 
' starting from a known character within the string and of m length;
' starting from a known substring within the string and of m length.
findString = "pq"
subString = MID$(baseString, INSTR(baseString, findString), m)
PRINT subString
 
Output:
 lmnop
 lmnopqrstuvwxyz
 abcdefghijklmnopqrstuvwxy
 pqrst

ZX Spectrum Basic

ZX Spectrum Basic unfortunately has no direct way to find a substring within a string; a similar effect can be achieved by searching with a FOR loop:

10 LET A$="abcdefghijklmnopqrstuvwxyz"
15 LET n=10: LET m=7
20 PRINT A$(n TO n+m-1)
30 PRINT A$(n TO )
40 PRINT A$( TO LEN (A$)-1)
50 FOR i=1 TO LEN (A$)
60 IF A$(i)="g" THEN PRINT A$(i TO i+m-1): LET i=LEN (A$): GO TO 70
70 NEXT i
80 LET B$="ijk"
90 FOR i=1 TO LEN (A$)-LEN (B$)+1
100 IF A$(i TO i+LEN (B$)-1)=B$ THEN PRINT A$(i TO i+m-1): LET i=LEN (A$)-LEN (B$)+1: GO TO 110
110 NEXT i
120 STOP
 
Output:
jklmnop
jklmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
ghijklm
ijklmno
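
The loop-based search used in the listing above can be sketched in Python as follows. This is a minimal illustration of the same naive left-to-right scan, using 0-based indices; the function name find_from is hypothetical.

```python
from typing import Optional


def find_from(haystack: str, needle: str, m: int) -> Optional[str]:
    """Scan haystack left to right and return m characters starting at the
    first occurrence of needle, or None when needle does not occur."""
    last_start = len(haystack) - len(needle)
    for i in range(last_start + 1):
        # Compare the window of len(needle) characters starting at i.
        if haystack[i:i + len(needle)] == needle:
            return haystack[i:i + m]
    return None


if __name__ == "__main__":
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    print(find_from(alphabet, "g", 7))
    print(find_from(alphabet, "ijk", 7))
```

Returning from inside the loop plays the role of the LET i=LEN (A$) trick that ends the FOR loop early.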

BBC BASIC

basestring$ = "The five boxing wizards jump quickly"
n% = 10
m% = 5
 
REM starting from n characters in and of m length:
substring$ = MID$(basestring$, n%, m%)
PRINT substring$
 
REM starting from n characters in, up to the end of the string:
substring$ = MID$(basestring$, n%)
PRINT substring$
 
REM whole string minus last character:
substring$ = LEFT$(basestring$)
PRINT substring$
 
REM starting from a known character within the string and of m length:
char$ = "w"
substring$ = MID$(basestring$, INSTR(basestring$, char$), m%)
PRINT substring$
 
REM starting from a known substring within the string and of m length:
find$ = "iz"
substring$ = MID$(basestring$, INSTR(basestring$, find$), m%)
PRINT substring$
Output:
boxin
boxing wizards jump quickly
The five boxing wizards jump quickl
wizar
izard

Bracmat

Translation of: BBC BASIC
( (basestring = "The five boxing wizards jump quickly")
& (n = 10)
& (m = 5)
 
{ starting from n characters in and of m length: }
& @(!basestring:? [(!n+-1) ?substring [(!n+!m+-1) ?)
& out$!substring
 
{ starting from n characters in, up to the end of the string: }
& @(!basestring:? [(!n+-1) ?substring)
& out$!substring
 
{ whole string minus last character: }
& @(!basestring:?substring [-2 ?)
& out$!substring
 
{ starting from a known character within the string and of m length: }
& (char = "w")
& @(!basestring:? ([?p !char ?: ?substring [(!p+!m) ?))
& out$!substring
 
{ starting from a known substring within the string and of m length: }
& (find = "iz")
& @(!basestring:? ([?p !find ?: ?substring [(!p+!m) ?))
& out$!substring
&
)
Output:
boxin
boxing wizards jump quickly
The five boxing wizards jump quickl
wizar
izard

C

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
 
char *substring(const char *s, size_t n, ptrdiff_t m)
{
char *result;
/* check for null s */
if (NULL == s)
return NULL;
/* negative m to mean 'up to the mth char from right' */
if (m < 0)
m = strlen(s) + m - n + 1;
 
/* m < 0 is invalid; n is a size_t and therefore cannot be negative */
if (m < 0)
return NULL;
 
/* make sure string does not end before n
* and advance the "s" pointer to beginning of substring */

for ( ; n > 0; s++, n--)
if (*s == '\0')
/* string ends before n: invalid */
return NULL;
 
result = malloc(m+1);
if (NULL == result)
/* memory allocation failed */
return NULL;
result[0]=0;
strncat(result, s, m); /* strncat() will automatically add null terminator
* if string ends early or after reading m characters */

return result;
}
 
char *str_wholeless1(const char *s)
{
return substring(s, 0, strlen(s) - 1);
}
 
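char *str_fromch(const char *s, int ch, ptrdiff_t m)
{
const char *p = strchr(s, ch);
/* character not found: invalid */
return p ? substring(s, p - s, m) : NULL;
}
 
char *str_fromstr(const char *s, char *in, ptrdiff_t m)
{
const char *p = strstr(s, in);
/* substring not found: invalid */
return p ? substring(s, p - s, m) : NULL;
}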
 
 
#define TEST(A) do { \
char *r = (A); \
if (NULL == r) \
puts("--error--"); \
else { \
puts(r); \
free(r); \
} \
} while(0)

 
int main()
{
const char *s = "hello world shortest program";
 
TEST( substring(s, 12, 5) ); // get "short"
TEST( substring(s, 6, -1) ); // get "world shortest program"
TEST( str_wholeless1(s) ); // "... progra"
TEST( str_fromch(s, 'w', 5) ); // "world"
TEST( str_fromstr(s, "ro", 3) ); // "rog"
 
return 0;
}

C++

#include <iostream>
#include <string>
 
int main()
{
std::string s = "0123456789";
 
int const n = 3;
int const m = 4;
char const c = '2';
std::string const sub = "456";
 
std::cout << s.substr(n, m)<< "\n";
std::cout << s.substr(n) << "\n";
std::cout << s.substr(0, s.size()-1) << "\n";
std::cout << s.substr(s.find(c), m) << "\n";
std::cout << s.substr(s.find(sub), m) << "\n";
}

C#

using System;
namespace SubString
{
class Program
{
static void Main(string[] args)
{
string s = "0123456789";
const int n = 3;
const int m = 2;
const char c = '3';
const string z = "345";
 
Console.WriteLine(s.Substring(n, m));
Console.WriteLine(s.Substring(n, s.Length - n));
Console.WriteLine(s.Substring(0, s.Length - 1));
Console.WriteLine(s.Substring(s.IndexOf(c,0,s.Length), m));
Console.WriteLine(s.Substring(s.IndexOf(z, 0, s.Length), m));
}
}
}
 

Clojure

 
 
(def string "alphabet")
(def n 2)
(def m 4)
(def len (count string))
 
;starting from n characters in and of m length;
(println
(subs string n (+ n m))) ;phab
;starting from n characters in, up to the end of the string;
(println
(subs string n)) ;phabet
;whole string minus last character;
(println
(subs string 0 (dec len))) ;alphabe
;starting from a known character within the string and of m length;
(let [pos (.indexOf string (int \l))]
(println
(subs string pos (+ pos m)))) ;lpha
;starting from a known substring within the string and of m length.
(let [pos (.indexOf string "ph")]
(println
(subs string pos (+ pos m)))) ;phab
 


Common Lisp

(let ((string "0123456789")
(n 2)
(m 3)
(start #\5)
(substring "34"))
(list (subseq string n (+ n m))
(subseq string n)
(subseq string 0 (1- (length string)))
(let ((pos (position start string)))
(subseq string pos (+ pos m)))
(let ((pos (search substring string)))
(subseq string pos (+ pos m)))))

Component Pascal

BlackBox Component Builder

 
MODULE Substrings;
IMPORT StdLog,Strings;
 
PROCEDURE Do*;
CONST
aStr = "abcdefghijklmnopqrstuvwxyz";
VAR
str: ARRAY 128 OF CHAR;
pos: INTEGER;
BEGIN
Strings.Extract(aStr,3,10,str);
StdLog.String("from 3, 10 characters:> ");StdLog.String(str);StdLog.Ln;
Strings.Extract(aStr,3,LEN(aStr) - 3,str);
StdLog.String("from 3, until the end:> ");StdLog.String(str);StdLog.Ln;
Strings.Extract(aStr,0,LEN(aStr) - 1,str);
StdLog.String("whole string but last:> ");StdLog.String(str);StdLog.Ln;
Strings.Find(aStr,'d',0,pos);
Strings.Extract(aStr,pos + 1,10,str);
StdLog.String("from 'd', 10 characters:> ");StdLog.String(str);StdLog.Ln;
Strings.Find(aStr,"de",0,pos);
Strings.Extract(aStr,pos + LEN("de"),10,str);
StdLog.String("from 'de', 10 characters:> ");StdLog.String(str);StdLog.Ln;
END Do;
 
END Substrings.
 

Execute: ^Q Substrings.Do

Output:
from 3, 10 characters:> defghijklm
from 3, until the end:> defghijklmnopqrstuvwxyz
whole string but last:> abcdefghijklmnopqrstuvwxy
from 'd', 10 characters:> efghijklmn
from 'de', 10 characters:> fghijklmno

D

Works with: D version 2
import std.stdio, std.string;
 
void main() {
const s = "The quick brown fox jumps over the lazy dog.";
enum n = 5, m = 3;
 
writeln(s[n .. n + m]);
 
writeln(s[n .. $]);
 
writeln(s[0 .. $ - 1]);
 
const i = s.indexOf("q");
writeln(s[i .. i + m]);
 
const j = s.indexOf("qu");
writeln(s[j .. j + m]);
}
Output:
uic
uick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog
qui
qui

Delphi

program ShowSubstring;
 
{$APPTYPE CONSOLE}
 
uses SysUtils;
 
const
s = '0123456789';
n = 3;
m = 4;
c = '2';
sub = '456';
begin
Writeln(Copy(s, n, m)); // starting from n characters in and of m length;
Writeln(Copy(s, n, Length(s))); // starting from n characters in, up to the end of the string;
Writeln(Copy(s, 1, Length(s) - 1)); // whole string minus last character;
Writeln(Copy(s, Pos(c, s), m)); // starting from a known character within the string and of m length;
Writeln(Copy(s, Pos(sub, s), m)); // starting from a known substring within the string and of m length.
end.
Output:
2345
23456789
012345678
2345
4567

E

def string := "aardvarks"
def n := 4
def m := 4
println(string(n, n + m))
println(string(n))
println(string(0, string.size() - 1))
println({string(def i := string.indexOf1('d'), i + m)})
println({string(def i := string.startOf("ard"), i + m)})
Output:
vark
varks
aardvark
dvar
ardv

ECL

 
/* In this task display a substring:
 
1. starting from n characters in and of m length;
2. starting from n characters in, up to the end of the string;
3. whole string minus last character;
4. starting from a known character within the string and of m length;
5. starting from a known substring within the string and of m length.
*/
 
IMPORT STD; //imports a standard string library
 
TheString := 'abcdefghij';
CharIn  := 3; //n
StrLength := 4; //m
KnownChar := 'f';
KnownSub  := 'def';
FindKnownChar := STD.Str.Find(TheString, KnownChar,1);
FindKnownSub  := STD.Str.Find(TheString, KnownSub,1);
 
OUTPUT(TheString[Charin..CharIn+StrLength-1]); //task1
OUTPUT(TheString[Charin..]); //task2
OUTPUT(TheString[1..LENGTH(TheString)-1]); //task3
OUTPUT(TheString[FindKnownChar..FindKnownChar+StrLength-1]);//task4
OUTPUT(TheString[FindKnownSub..FindKnownSub+StrLength-1]); //task5
 
/* OUTPUTS:
cdef
cdefghij
abcdefghi
fghi
defg
*/
 

Eero

#import <Foundation/Foundation.h>
 
int main()
  autoreleasepool
    str := 'abcdefgh'
    n := 2
    m := 3
    Log( '%@', str[0 .. str.length-1] ) // abcdefgh
    Log( '%@', str[n .. m] ) // cd
    Log( '%@', str[n .. str.length-1] ) // cdefgh
    Log( '%@', str.substringFromIndex: n ) // cdefgh
    Log( '%@', str[(str.rangeOfString:'b').location .. m] ) // bcd
  return 0

Erlang

Interactive session in Erlang shell showing built in functions doing the task.

1> N = 3.            
2> M = 5.
3> string:sub_string( "abcdefghijklm", N ).
"cdefghijklm"
4> string:sub_string( "abcdefghijklm", N, N + M - 1 ).
"cdefg"
6> string:sub_string( "abcdefghijklm", 1, string:len("abcdefghijklm") - 1 ).
"abcdefghijkl"
7> Start_character = string:chr( "abcdefghijklm", $e ).
8> string:sub_string( "abcdefghijklm", Start_character, Start_character + M - 1 ).
"efghi"
9> Start_string = string:str( "abcdefghijklm", "efg" ).
10> string:sub_string( "abcdefghijklm", Start_string, Start_string + M - 1 ).
"efghi"

Euphoria

sequence baseString, subString, findString
integer findChar
integer m, n
 
baseString = "abcdefghijklmnopqrstuvwxyz"
 
-- starting from n characters in and of m length;
n = 12
m = 5
subString = baseString[n..n+m-1]
puts(1, subString )
puts(1,'\n')
 
-- starting from n characters in, up to the end of the string;
n = 12
subString = baseString[n..$]
puts(1, subString )
puts(1,'\n')
 
-- whole string minus last character;
subString = baseString[1..$-1]
puts(1, subString )
puts(1,'\n')
 
-- starting from a known character within the string and of m length;
findChar = 'o'
m = 5
n = find(findChar,baseString)
subString = baseString[n..n+m-1]
puts(1, subString )
puts(1,'\n')
 
-- starting from a known substring within the string and of m length.
findString = "pq"
m = 5
n = match(findString,baseString)
subString = baseString[n..n+m-1]
puts(1, subString )
puts(1,'\n')
Output:
lmnop
lmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
opqrs
pqrst


F#

[<EntryPoint>]
let main args =
    let s = "一二三四五六七八九十"
    let n, m = 3, 2
    let c = '六'
    let z = "六七八"
 
    printfn "%s" (s.Substring(n, m))
    printfn "%s" (s.Substring(n))
    printfn "%s" (s.Substring(0, s.Length - 1))
    printfn "%s" (s.Substring(s.IndexOf(c), m))
    printfn "%s" (s.Substring(s.IndexOf(z), m))
    0
Output:
四五
四五六七八九十
一二三四五六七八九
六七
六七

Factor

USING: math sequences kernel ;
 
! starting from n characters in and of m length
: subseq* ( from length seq -- newseq ) [ over + ] dip subseq ;
 
! starting from n characters in, up to the end of the string
: dummy ( seq n -- tailseq ) tail ;
 
! whole string minus last character
: dummy1 ( seq -- headseq ) but-last ;
 
USING: fry sequences kernel ;
! helper word
: subseq-from-* ( subseq len seq quot -- seq ) [ nip ] prepose 2keep subseq* ; inline
 
! starting from a known character within the string and of m length;
: subseq-from-char ( char len seq -- seq ) [ index ] subseq-from-* ;
 
! starting from a known substring within the string and of m length.
: subseq-from-seq ( subseq len seq -- seq ) [ start ] subseq-from-* ;

Forth

/STRING and SEARCH are standard words. SCAN is widely implemented. Substrings represented by address/length pairs require neither mutation nor allocation.

2 constant Pos
3 constant Len
: Str ( -- c-addr u ) s" abcdefgh" ;
 
Str Pos /string drop Len type \ cde
Str Pos /string type \ cdefgh
Str 1- type \ abcdefg
Str char d scan drop Len type \ def
Str s" de" search 2drop Len type \ def
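
The address/length representation has a loose analogue in Python's built-in memoryview, which exposes a window onto a bytes object without copying it. The sketch below only illustrates that idea and is not a translation of the Forth code; it works on bytes rather than code points, and the helper name window is hypothetical.

```python
def window(buf: bytes, pos: int, length: int) -> memoryview:
    """Return a zero-copy view of `length` bytes of buf starting at pos."""
    return memoryview(buf)[pos:pos + length]


if __name__ == "__main__":
    data = b"abcdefgh"
    sub = window(data, 2, 3)
    # Bytes are copied only when the view is materialized explicitly.
    print(sub.tobytes())  # b'cde'
```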

Fortran

Works with: Fortran version 90 and later
program test_substring
 
character (*), parameter :: string = 'The quick brown fox jumps over the lazy dog.'
character (*), parameter :: substring = 'brown'
character , parameter :: c = 'q'
integer , parameter :: n = 5
integer , parameter :: m = 15
integer :: i
 
! Display the substring starting from n characters in and of length m.
write (*, '(a)') string (n : n + m - 1)
! Display the substring starting from n characters in, up to the end of the string.
write (*, '(a)') string (n :)
! Display the whole string minus the last character.
i = len (string) - 1
write (*, '(a)') string (: i)
! Display the substring starting from a known character and of length m.
i = index (string, c)
write (*, '(a)') string (i : i + m - 1)
! Display the substring starting from a known substring and of length m.
i = index (string, substring)
write (*, '(a)') string (i : i + m - 1)
 
end program test_substring
Output:
quick brown fox
quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog
quick brown fox
brown fox jumps

Note that in Fortran, positions inside character strings are one-based, i.e. the first character is at position one.

GAP

LETTERS;
# "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
LETTERS{[5 .. 10]};
# "EFGHIJ"

Go

package main
import "fmt"
import "strings"
 
func main() {
s := "ABCDEFGH"
n, m := 2, 3
 
fmt.Println(s[n:n+m]) // "CDE"
fmt.Println(s[n:]) // "CDEFGH"
fmt.Println(s[0:len(s)-1]) // "ABCDEFG"
fmt.Println(s[strings.Index(s, "D"):strings.Index(s, "D")+m]) // "DEF"
fmt.Println(s[strings.Index(s, "DE"):strings.Index(s, "DE")+m]) // "DEF"
}

Groovy

Strings in Groovy are 0-indexed.

def str = 'abcdefgh'
def n = 2
def m = 3
println str[n..n+m-1]
println str[n..-1]
println str[0..-2]
def index1 = str.indexOf('d')
println str[index1..index1+m-1]
def index2 = str.indexOf('de')
println str[index2..index2+m-1]

Haskell

Works with: Haskell version 6.10.4

A string in Haskell is a list of chars: [Char]

  • The first three tasks are simply:
*Main> take 3 $ drop 2 "1234567890"
"345"

*Main> drop 2 "1234567890"
"34567890"

*Main> init "1234567890"
"123456789"
  • The last two can be formulated with the following function:
import Data.List (isPrefixOf, tails)

t45 n c s | null sub  = []
          | otherwise = take n . head $ sub
  where sub = filter (isPrefixOf c) $ tails s
*Main> t45 3 "4" "1234567890"
"456"

*Main> t45 3 "45" "1234567890"
"456"

*Main> t45 3 "31" "1234567890"
""

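The suffix-filtering idea behind t45 (take every suffix of the string, keep those that start with the search text, and cut the first one to length n) can be sketched in Python as well. This is an illustration under the same argument order as the Haskell function, not a claim about how either language's library searches internally.

```python
def t45(n: int, c: str, s: str) -> str:
    """Return the first n characters of the first suffix of s that starts
    with c, or the empty string when no suffix does."""
    # Generate the suffixes lazily, like Data.List.tails.
    suffixes = (s[i:] for i in range(len(s) + 1))
    for suffix in suffixes:
        if suffix.startswith(c):
            return suffix[:n]
    return ""


if __name__ == "__main__":
    print(t45(3, "4", "1234567890"))
    print(t45(3, "45", "1234567890"))
```
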
HicEst

CHARACTER :: string = 'ABCDEFGHIJK', known = 'B',  substring = 'CDE'
REAL, PARAMETER :: n = 5, m = 8
 
WRITE(Messagebox) string(n : n + m - 1), "| substring starting from n, length m"
WRITE(Messagebox) string(n :), "| substring starting from n, to end of string"
WRITE(Messagebox) string(1: LEN(string)-1), "| whole string minus last character"
 
pos_known = INDEX(string, known)
WRITE(Messagebox) string(pos_known : pos_known+m-1), "| substring starting from pos_known, length m"
 
pos_substring = INDEX(string, substring)
WRITE(Messagebox) string(pos_substring : pos_substring+m-1), "| substring starting from pos_substring, length m"

Icon and Unicon

procedure main(arglist)
write("Usage: substring <string> <first position> <second position> <single character> <substring>")
s := \arglist[1] | "aardvarks"
n := \arglist[2] | 5
m := \arglist[3] | 4
c := \arglist[4] | "d"
ss := \arglist[5] | "ard"
 
write( s[n+:m] )
write( s[n:0] )
write( s[1:-1] )
write( s[find(c,s)+:m] )
write( s[find(ss,s)+:m] )
end

J

   5{.3}.'Marshmallow'
shmal
   3}.'Marshmallow'
shmallow
   }.'Marshmallow'
arshmallow
   }:'Marshmallow'
Marshmallo
   5{.(}.~ i.&'m')'Marshmallow'
mallo
   5{.(}.~ I.@E.~&'sh')'Marshmallow'
shmal

Note that there are other, sometimes better, ways of accomplishing this task.

   'Marshmallow'{~(+i.)/3 5
shmal

The taketo / takeafter and dropto / dropafter utilities from the strings script further simplify these types of tasks.

   require 'strings'
   'sh' dropto 'Marshmallow'
shmallow
   5{. 'sh' dropto 'Marshmallow'
shmal
   'sh' takeafter 'Marshmallow'
mallow

Note also that these operations work the same way on lists of numbers that they do on this example list of characters.

   3}. 2 3 5 7 11 13 17 19
7 11 13 17 19
   7 11 dropafter 2 3 5 7 11 13 17 19
2 3 5 7 11

Java

Strings in Java are 0-indexed.

String x = "testing123";
int n = 2, m = 3; // example values; n and m are otherwise undeclared in this listing
System.out.println(x.substring(n, n + m));
System.out.println(x.substring(n));
System.out.println(x.substring(0, x.length() - 1));
int index1 = x.indexOf('i');
System.out.println(x.substring(index1, index1 + m));
int index2 = x.indexOf("ing");
System.out.println(x.substring(index2, index2 + m));
//indexOf methods also have an optional "from index" argument which will
//make indexOf ignore characters before that index

JavaScript

The String object has two similar methods: substr and substring.

  • substr(start, [len]) returns a substring beginning at a specified location and having a specified length.
  • substring(start, [end]) returns a string containing the substring from start up to, but not including, end.
var str = "abcdefgh";
 
var n = 2;
var m = 3;
 
// * starting from n characters in and of m length;
str.substr(n, m); // => "cde"
 
// * starting from n characters in, up to the end of the string;
str.substr(n); // => "cdefgh"
str.substring(n); // => "cdefgh"
 
// * whole string minus last character;
str.substring(0, str.length - 1); // => "abcdefg"
 
// * starting from a known character within the string and of m length;
str.substr(str.indexOf('b'), m); // => "bcd"
 
// * starting from a known substring within the string and of m length.
str.substr(str.indexOf('bc'), m); // => "bcd"
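
The difference between the two conventions described above, (start, length) versus (start, end), can be sketched in Python. The helper names by_length and by_end are hypothetical, and the sketch covers only non-negative arguments; it does not reproduce how the JavaScript methods treat negative or swapped arguments.

```python
from typing import Optional


def by_length(s: str, start: int, length: Optional[int] = None) -> str:
    """(start, length) convention: take `length` characters from start."""
    end = len(s) if length is None else start + length
    return s[start:end]


def by_end(s: str, start: int, end: Optional[int] = None) -> str:
    """(start, end) convention: take from start up to, but not including, end."""
    return s[start:len(s) if end is None else end]


if __name__ == "__main__":
    text = "abcdefgh"
    print(by_length(text, 2, 3))  # same characters as by_end(text, 2, 5)
    print(by_end(text, 2, 5))
```

Converting between the two is a matter of adding or subtracting the start offset: end = start + length.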

jq

Works with: jq version 1.4

For this exercise we use the Chinese characters for 1 to 10, the character for "10" being "十":

def s: "一二三四五六七八九十";

jq strings are UTF-8 strings, and array-based string indexing and most string functions, such as length/0, are based on Unicode code points. However, the function index/1 currently uses byte offsets when its input is a string, so the following uses ix/1, defined as:

def ix(s): explode | index(s|explode);

(Users who have access to the regex function match/1 can use it, as illustrated in the comments below.)

Since jq arrays and strings have an index origin of 0, "n characters in" is interpreted to require an index of (n+1).

# starting from n characters in and of m length:  .[n+1: n+m+1]
"s[1:2] => \( s[1:2] )",
 
# starting from n characters in, up to the end of the string: .[n+1:]
"s[9:] => \( s[9:] )",
 
# whole string minus last character: .[0:length-1]
"s|.[0:length-1] => \(s | .[0:length-1] )",
 
# starting from a known character within the string and of m length:
# jq 1.4: ix(c) as $i | .[ $i: $i + m]
# jq>1.4: match(c).offset as $i | .[ $i: $i + m]
"s | ix(\"五\") as $i | .[$i: $i + 1] => \(s | ix("五") as $i | .[$i: $i + 1] )",
 
 
# starting from a known substring within the string and of m length:
# jq 1.4: ix(sub) as $i | .[ $i: $i + m]
# jq>1.4: match(sub).offset as $i | .[ $i: $i + m]
"s | ix(\"五六\") as $i | .[$i: $i + 2] => " +
"\( s | ix("五六") as $i | .[$i: $i + 2] )"
Output:
$ jq -M -n -r -f Substring.jq
s[1:2] => 二
s[9:] => 十
s|.[0:length-1] => 一二三四五六七八九
s | ix("五") as $i | .[$i: $i + 1] => 五
s | ix("五六") as $i | .[$i: $i + 2] => 五六
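
Why a code-point-based search such as ix/1 matters can be illustrated in Python: a position counted in UTF-8 bytes does not line up with code-point slicing once the text leaves ASCII. The function names below are hypothetical; the sample string is the one used in this section.

```python
def code_point_index(s: str, sub: str) -> int:
    """Position of sub in s, counted in Unicode code points."""
    return s.find(sub)


def utf8_byte_offset(s: str, sub: str) -> int:
    """Position of sub in s, counted in bytes of the UTF-8 encoding."""
    return s.encode("utf-8").find(sub.encode("utf-8"))


if __name__ == "__main__":
    s = "一二三四五六七八九十"
    i = code_point_index(s, "五")   # 4: four characters precede the match
    b = utf8_byte_offset(s, "五")   # 12: each of them takes 3 bytes
    print(i, s[i:i + 1])
    print(b, repr(s[b:b + 1]))      # slicing at the byte offset overshoots
```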

Julia

By default, the type of a string is inferred from its elements. In the example below, the string s is an ASCII string. To interpret a string as a UTF-8 string with logical access to its characters, one should use CharString("/\ʕ•ᴥ•ʔ/\"...). Without the CharString declaration, such a string is interpreted as a UTF-8 string that is accessed through its byte representation.

julia> s = "abcdefg"
"abcdefg"
 
julia> n = 3
3
 
julia> s[n:end]
"cdefg"
 
julia> m=2
2
 
julia> s[n:n+m]
"cde"
 
julia> s[1:end-1]
"abcdef"
 
julia> s[search(s,'c')]
'c'
 
julia> s[search(s,'c'):search(s,'c')+m]
"cde"

LabVIEW

To enhance readability, this task was split into two separate GUIs. In the second, note that "Known Substring" can be a single character.
1: LabVIEW Substring1.png
2: LabVIEW Substring2.png

Lang5

: cr "\n". ; [] '__A set : dip swap __A swap 1 compress append '__A set execute __A
-1 extract nip ; : nip swap drop ; : tuck swap over ; : -rot rot rot ; : 0= 0 == ; : 1+ 1 + ;
: 2dip swap 'dip dip ; : 2drop drop drop ; : |a,b> over - iota + ; : bi* 'dip dip execute ; : bi@ dup bi* ;
: comb "" split ; : concat "" join ; : empty? length 0= ; : tail over lensize |a,b> subscript ;
: lensize length nip ; : while do 'dup dip 'execute 2dip rot if dup 2dip else break then loop 2drop ;
 
: <substr> comb -rot over + |a,b> subscript concat ;
: str-tail tail concat ;
: str-index
 : 2streq 2dup over lensize iota subscript eq '* reduce ;
swap 'comb bi@ length -rot 0 -rot
"2dup 'lensize bi@ <="
"2streq if 0 reshape else '1+ 2dip 0 extract drop then"
while empty? if 2drop tuck == if drop -1 then else 4 ndrop -1 then ;
 
'abcdefgh 'str set 2 'n set 3 'm set
n m str <substr>
str comb n str-tail
str "d" str-index m str <substr>
str "de" str-index m str <substr>


Lasso

local(str = 'The quick grey rhino jumped over the lazy green fox.')
 
//starting from n characters in and of m length;
#str->substring(16,5) //rhino
 
//starting from n characters in, up to the end of the string
#str->substring(16) //rhino jumped over the lazy green fox.
 
//whole string minus last character
#str->substring(1,#str->size - 1) //The quick grey rhino jumped over the lazy green fox
 
//starting from a known character within the string and of m length;
#str->substring(#str->find('g'),10) //grey rhino
 
//starting from a known substring within the string and of m length
#str->substring(#str->find('rhino'),12) //rhino jumped

Liberty BASIC

'These tasks can be completed with various combinations of Liberty Basic's
'built in Mid$()/ Instr()/ Left$()/ Right$()/ and Len() functions, but these
'examples only use the Mid$()/ Instr()/ and Len() functions.
 
baseString$ = "Thequickbrownfoxjumpsoverthelazydog."
n = 12
m = 5
 
'starting from n characters in and of m length
Print Mid$(baseString$, n, m)
 
'starting from n characters in, up to the end of the string
Print Mid$(baseString$, n)
 
'whole string minus last character
Print Mid$(baseString$, 1, (Len(baseString$) - 1))
 
'starting from a known character within the string and of m length
Print Mid$(baseString$, Instr(baseString$, "f", 1), m)
 
'starting from a known substring within the string and of m length
Print Mid$(baseString$, Instr(baseString$, "jump", 1), m)

Logo

Works with: UCB Logo

The following are defined to behave similarly to the built-in index operator ITEM. As with most Logo list operators, these are designed to work for both words (strings) and lists.

to items :n :thing
if :n >= count :thing [output :thing]
output items :n butlast :thing
end
 
to butitems :n :thing
if or :n <= 0 empty? :thing [output :thing]
output butitems :n-1 butfirst :thing
end
 
to middle :n :m :thing
output items :m-(:n-1) butitems :n-1 :thing
end
 
to lastitems :n :thing
if :n >= count :thing [output :thing]
output lastitems :n butfirst :thing
end
 
to starts.with :sub :thing
if empty? :sub [output "true]
if empty? :thing [output "false]
if not equal? first :sub first :thing [output "false]
output starts.with butfirst :sub butfirst :thing
end
 
to members :sub :thing
output cascade [starts.with :sub ?] [bf ?] :thing
end
 
; note: Logo indices start at one
make "s "abcdefgh
print items 3 butitems 2 :s ; cde
print middle 3 5  :s  ; cde
print butitems 2  :s  ; cdefgh
print butlast  :s  ; abcdefg
print items 3 member "d  :s ; def
print items 3 members "de :s ; def

Logtalk

Using atoms to represent strings, and using the same sample data as, e.g., the JavaScript solution:

 
:- object(substring).
 
:- public(test/5).
 
test(String, N, M, Character, Substring) :-
sub_atom(String, N, M, _, Substring1),
write(Substring1), nl,
sub_atom(String, N, _, 0, Substring2),
write(Substring2), nl,
sub_atom(String, 0, _, 1, Substring3),
write(Substring3), nl,
% there can be multiple occurrences of the character
once(sub_atom(String, Before4, 1, _, Character)),
sub_atom(String, Before4, M, _, Substring4),
write(Substring4), nl,
% there can be multiple occurrences of the substring
once(sub_atom(String, Before5, _, _, Substring)),
sub_atom(String, Before5, M, _, Substring5),
write(Substring5), nl.
 
:- end_object.
 
Output:
 
| ?- substring::test('abcdefgh', 2, 3, 'b', 'bc').
cde
cdefgh
abcdefg
bcd
bcd
yes
 

Lua

str = "abcdefghijklmnopqrstuvwxyz"
n, m = 5, 15

-- string.sub takes start and end positions (1-based, inclusive),
-- so a substring of length m starting at n ends at n+m-1
print( string.sub( str, n, n+m-1 ) ) -- efghijklmnopqrs
print( string.sub( str, n, -1 ) ) -- efghijklmnopqrstuvwxyz
print( string.sub( str, 1, -2 ) ) -- abcdefghijklmnopqrstuvwxy

pos = string.find( str, "i" )
if pos ~= nil then print( string.sub( str, pos, pos+m-1 ) ) end -- ijklmnopqrstuvw

pos = string.find( str, "ijk" )
if pos ~= nil then print( string.sub( str, pos, pos+m-1 ) ) end -- ijklmnopqrstuvw

-- Alternative (more modern) notation

print ( str:sub(n,n+m-1) ) -- efghijklmnopqrs
print ( str:sub(n) ) -- efghijklmnopqrstuvwxyz
print ( str:sub(1,-2) ) -- abcdefghijklmnopqrstuvwxy

pos = str:find "i"
if pos then print (str:sub(pos,pos+m-1)) end -- ijklmnopqrstuvw

pos = str:find "ijk"
if pos then print (str:sub(pos,pos+m-1)) end -- ijklmnopqrstuvw
 
 

[edit] Maple

 
> n, m := 3, 5:
> s := "The Higher, The Fewer!":
> s[ n .. n + m - 1 ];
"e Hig"
 

There are a few ways to get everything from the n-th character on.

 
> s[ n .. -1 ] = s[ n .. ];
"e Higher, The Fewer!" = "e Higher, The Fewer!"
 
> StringTools:-Drop( s, n - 1 );
"e Higher, The Fewer!"
 

There are a few ways to get all but the last character.

 
> s[ 1 .. -2 ] = s[ .. -2 ];
"The Higher, The Fewer" = "The Higher, The Fewer"
 
> StringTools:-Chop( s );
"The Higher, The Fewer"
 

The searchtext command returns the position of a matching substring.

 
> pos := searchtext( ",", s ):
> s[ pos .. pos + m - 1 ];
", The"
 
> pos := searchtext( "Higher", s ):
> s[ pos .. pos + m - 1 ];
"Highe"
 

But, note that searchtext returns 0 when there is no match, and 0 is not a valid index into a string.
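
The same pitfall exists in other languages. As a minimal sketch in Python rather than Maple (the helper name substring_from is hypothetical, not part of any library), str.find returns -1 when there is no match, which Python would silently accept as an index, so the result should be checked before slicing:

```python
def substring_from(s, needle, m):
    """Return m characters of s starting at the first occurrence of needle.

    Hypothetical helper for illustration; returns None when needle is absent.
    """
    pos = s.find(needle)  # -1 when there is no match
    if pos == -1:
        return None
    return s[pos:pos + m]


s = "The Higher, The Fewer!"
print(substring_from(s, ",", 5))       # , The
print(substring_from(s, "Higher", 5))  # Highe
print(substring_from(s, "xyz", 5))     # None
```

Returning None is only one possible design; raising an exception (as str.index does) would work equally well.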

[edit] Mathematica

The StringTake and StringDrop functions are relevant for this exercise.

 
n = 2
m = 3
(* starting from n characters in and of m length *)
StringTake["Mathematica", {n+1, n+m}]

(* starting from n characters in, up to the end of the string *)
StringDrop["Mathematica", n]

(* whole string minus last character *)
StringDrop["Mathematica", -1]

(* StringPosition returns a list of {start, end} character positions, one pair per match *)
pos = StringPosition["Mathematica", "e"][[1]][[1]]
StringTake["Mathematica", {pos, pos+m-1}]

(* Similar to above *)
pos = StringPosition["Mathematica", "the"][[1]][[1]]
StringTake["Mathematica", {pos, pos+m-1}]
 

[edit] MATLAB / Octave

Unicode (UTF-8, UTF-16) is only partially supported. In some cases, a conversion with unicode2native() or native2unicode() is necessary.

 
% starting from n characters in and of m length;
s(n+(1:m))
s(n+1:n+m)
% starting from n characters in, up to the end of the string;
s(n+1:end)
% whole string minus last character;
s(1:end-1)
% starting from a known character within the string and of m length;
s(find(s==c,1)+[0:m-1])
% starting from a known substring within the string and of m length.
s(strfind(s,pattern)+[0:m-1])
 


[edit] Maxima

s: "the quick brown fox jumps over the lazy dog";
substring(s, 17);
/* "fox jumps over the lazy dog" */
substring(s, 17, 20);
/* "fox" */


[edit] MUMPS

MUMPS has the first position in a string numbered as 1.

 
SUBSTR(S,N,M,C,K)
 ;show substring operations
 ;S is the string
 ;N is a position within the string (that is, n<length(string))
 ;M is an integer of positions to show
 ;C is a character within the string S
 ;K is a substring within the string S
 ;$Find returns the position after the substring
NEW X
WRITE !,"The base string is:",!,?5,"'",S,"'"
WRITE !,"From position ",N," for ",M," characters:"
WRITE !,?5,$EXTRACT(S,N,N+M-1)
WRITE !,"From position ",N," to the end of the string:"
WRITE !,?5,$EXTRACT(S,N,$LENGTH(S))
WRITE !,"Whole string minus last character:"
WRITE !,?5,$EXTRACT(S,1,$LENGTH(S)-1)
WRITE !,"Starting from character '",C,"' for ",M," characters:"
SET X=$FIND(S,C)-$LENGTH(C)
WRITE !,?5,$EXTRACT(S,X,X+M-1)
WRITE !,"Starting from string '",K,"' for ",M," characters:"
SET X=$FIND(S,K)-$LENGTH(K)
W !,?5,$EXTRACT(S,X,X+M-1)
QUIT
 

Usage:

USER>D SUBSTR^ROSETTA("ABCD1234efgh",3,4,"D","23")
 
The base string is:
     'ABCD1234efgh'
From position 3 for 4 characters:
     CD12
From position 3 to the end of the string:
     CD1234efgh
Whole string minus last character:
     ABCD1234efg
Starting from character 'D' for 4 characters:
     D123
Starting from string '23' for 4 characters:
     234e
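
The arithmetic around $FIND can be easy to misread, so here is a rough Python sketch of it. The functions mumps_find and mumps_extract are hypothetical emulations written for this illustration (they are not MUMPS APIs), and they assume the behaviour described in the routine's comments: positions are 1-based, $FIND yields the position just after the match (0 if there is none), and $EXTRACT takes inclusive start and end positions.

```python
def mumps_find(s, sub):
    """Emulate $FIND: 1-based position just after sub, or 0 if absent."""
    i = s.find(sub)  # zero-based start, or -1
    return 0 if i == -1 else i + len(sub) + 1


def mumps_extract(s, first, last):
    """Emulate $EXTRACT with 1-based, inclusive positions."""
    return s[first - 1:last]


s, m = "ABCD1234efgh", 4
x = mumps_find(s, "23") - len("23")    # back up to the start of the match
print(x)                               # 6
print(mumps_extract(s, x, x + m - 1))  # 234e
```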

[edit] Nemerle

using System;
using System.Console;
 
module Substrings
{
Main() : void
{
def s = "0123456789";
def n = 3;
def m = 2;
def c = '3';
def z = "345";
 
WriteLine(s.Substring(n, m));
WriteLine(s.Substring(n, s.Length - n));
WriteLine(s.Substring(0, s.Length - 1));
WriteLine(s.Substring(s.IndexOf(c,0,s.Length), m));
WriteLine(s.Substring(s.IndexOf(z, 0, s.Length), m));
}
}

[edit] NetRexx

Translation of: REXX
/* NetRexx */
 
options replace format comments java crossref savelog symbols
 
s = 'abcdefghijk'
n = 4
m = 3
 
say s
say s.substr(n, m)
say s.substr(n)
say s.substr(1, s.length - 1)
say s.substr(s.pos('def'), m)
say s.substr(s.pos('g'), m)
 
return
 
Output:
abcdefghijk
def
defghijk
abcdefghij
def
ghi

[edit] newLISP

> (set 'str "alphabet" 'n 2 'm 4)
4
> ; starting from n characters in and of m length
> (slice str n m)
"phab"
> ; starting from n characters in, up to the end of the string
> (slice str n)
"phabet"
> ; whole string minus last character
> (chop str)
"alphabe"
> ; starting from a known character within the string and of m length
> (slice str (find "l" str) m)
"lpha"
> ; starting from a known substring within the string and of m length
> (slice str (find "ph" str) m)
"phab"
 

[edit] Nimrod

import strutils
 
let
s = "abcdefgh"
n = 2
m = 3
c = 'd'
cs = "cd"
var i = 0
 
# starting from n=2 characters in and m=3 in length
echo s[n-1 .. n+m-2]
 
# starting from n characters in, up to the end of the string
echo s[n-1 .. s.high]
 
# whole string minus last character:
echo s[0 .. <s.high]
 
# starting from a known character c='d'within the string and of m length
i = s.find(c)
echo s[i .. <i+m]
 
# starting from a known substring cs="cd" within the string and of m length
i = s.find(cs)
echo s[i .. <i+m]

[edit] Niue

( based on the JavaScript code )
'abcdefgh 's ;
s str-len 'len ;
2 'n ;
3 'm ;
 
( starting from n characters in and of m length )
s n n m + substring . ( => cde ) newline
 
( starting from n characters in, up to the end of the string )
s n len substring . ( => cdefgh ) newline
 
( whole string minus last character )
s 0 len 1 - substring . ( => abcdefg ) newline
 
( starting from a known character within the string and of m length )
s s 'b str-find dup m + substring . ( => bcd ) newline
 
( starting from a known substring within the string and of m length )
s s 'bc str-find dup m + substring . ( => bcd ) newline
 

[edit] Objeck

 
bundle Default {
class SubString {
function : Main(args : String[]) ~ Nil {
s := "0123456789";
 
n := 3;
m := 4;
c := '2';
sub := "456";
 
s->SubString(n, m)->PrintLine();
s->SubString(n)->PrintLine();
s->SubString(0, s->Size() - 1)->PrintLine();
s->SubString(s->Find(c), m)->PrintLine();
s->SubString(s->Find(sub), m)->PrintLine();
}
}
}
 

[edit] OCaml

# let s = "ABCDEFGH" ;;
val s : string = "ABCDEFGH"
 
# let n, m = 2, 3 ;;
val n : int = 2
val m : int = 3
 
# String.sub s n m ;;
- : string = "CDE"
 
# String.sub s n (String.length s - n) ;;
- : string = "CDEFGH"
 
# String.sub s 0 (String.length s - 1) ;;
- : string = "ABCDEFG"
 
# String.sub s (String.index s 'D') m ;;
- : string = "DEF"
 
# #load "str.cma";;
# let n = Str.search_forward (Str.regexp_string "DE") s 0 in
String.sub s n m ;;
- : string = "DEF"

[edit] Oz

declare
fun {DropUntil Xs Prefix}
case Xs of nil then nil
[] _|Xr then
if {List.isPrefix Prefix Xs} then Xs
else {DropUntil Xr Prefix}
end
end
end
 
Digits = "1234567890"
in
{ForAll
[{List.take {List.drop Digits 2} 3} = "345"
{List.drop Digits 2} = "34567890"
{List.take Digits {Length Digits}-1} = "123456789"
{List.take {DropUntil Digits "4"} 3} = "456"
{List.take {DropUntil Digits "56"} 3} = "567"
{List.take {DropUntil Digits "31"} 3} = ""
]
System.showInfo}

[edit] Pascal

See Delphi

[edit] Perl

my $str = 'abcdefgh';
my $n = 2;
my $m = 3;
print substr($str, $n, $m), "\n";
print substr($str, $n), "\n";
print substr($str, 0, -1), "\n";
print substr($str, index($str, 'd'), $m), "\n";
print substr($str, index($str, 'de'), $m), "\n";

[edit] Perl 6

my $str = 'abcdefgh';
my $n = 2;
my $m = 3;
say $str.substr($n, $m);
say $str.substr($n);
say $str.substr(0, *-1);
say $str.substr($str.index('d'), $m);
say $str.substr($str.index('de'), $m);

[edit] PHP

<?php
$str = 'abcdefgh';
$n = 2;
$m = 3;
echo substr($str, $n, $m), "\n"; //cde
echo substr($str, $n), "\n"; //cdefgh
echo substr($str, 0, -1), "\n"; //abcdefg
echo substr($str, strpos($str, 'd'), $m), "\n"; //def
echo substr($str, strpos($str, 'de'), $m), "\n"; //def
?>

[edit] PicoLisp

(let Str (chop "This is a string")
(prinl (head 4 (nth Str 6))) # From 6 of 4 length
(prinl (nth Str 6)) # From 6 up to the end
(prinl (head -1 Str)) # Minus last character
(prinl (head 8 (member "s" Str))) # From character "s" of length 8
(prinl # From "is a" of length 8
(head 8
(seek '((S) (pre? "is a" S)) Str) ) ) )
Output:
is a
is a string
This is a strin
s is a s
is a str

[edit] PL/I

 
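s='abcdefghijk';
n=4; m=3;
u=substr(s,n,m);              /* starting from n characters in and of m length */
u=substr(s,n);                /* starting from n characters in, up to the end  */
u=substr(s,1,length(s)-1);    /* whole string minus last character             */
u=left(s,length(s)-1);        /* alternative for the above                     */
u=substr(s,index(s,'g'),m);   /* starting from a known character, of m length  */
u=substr(s,index(s,'def'),m); /* starting from a known substring, of m length  */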
 

[edit] PowerShell

Since .NET and PowerShell use zero-based indexing, all character indexes have to be reduced by one.

# test string
$s = "abcdefgh"
# test parameters
$n, $m, $c, $s2 = 2, 3, [char]'d', 'cd'
 
# starting from n characters in and of m length
# n = 2, m = 3
$s.Substring($n-1, $m) # returns 'bcd'
 
# starting from n characters in, up to the end of the string
# n = 2
$s.Substring($n-1) # returns 'bcdefgh'
 
# whole string minus last character
$s.Substring(0, $s.Length - 1) # returns 'abcdefg'
 
# starting from a known character within the string and of m length
# c = 'd', m =3
$s.Substring($s.IndexOf($c), $m) # returns 'def'
 
# starting from a known substring within the string and of m length
# s2 = 'cd', m = 3
$s.Substring($s.IndexOf($s2), $m) # returns 'cde'

[edit] PureBasic

If OpenConsole()
 
Define baseString.s, m, n
 
baseString = "Thequickbrownfoxjumpsoverthelazydog."
n = 12
m = 5
 
;Display the substring starting from n characters in and of m length.
PrintN(Mid(baseString, n, m))
 
;Display the substring starting from n characters in, up to the end of the string.
PrintN(Mid(baseString, n)) ;or PrintN(Right(baseString, Len(baseString) - n + 1))
 
;Display the substring whole string minus last character
PrintN(Left(baseString, Len(baseString) - 1))
 
;Display the substring starting from a known character within the string and of m length.
PrintN(Mid(baseString, FindString(baseString, "b", 1), m))
 
;Display the substring starting from a known substring within the string and of m length.
PrintN(Mid(baseString, FindString(baseString, "ju", 1), m))
 
Print(#CRLF$ + #CRLF$ + "Press ENTER to exit")
Input()
CloseConsole()
EndIf
Output:
wnfox
wnfoxjumpsoverthelazydog.
Thequickbrownfoxjumpsoverthelazydog
brown
jumps

[edit] Python

Python uses zero-based indexing, so the n'th character is at index n-1.

>>> s = 'abcdefgh'
>>> n, m, char, chars = 2, 3, 'd', 'cd'
>>> # starting from n=2 characters in and m=3 in length;
>>> s[n-1:n+m-1]
'bcd'
>>> # starting from n characters in, up to the end of the string;
>>> s[n-1:]
'bcdefgh'
>>> # whole string minus last character;
>>> s[:-1]
'abcdefg'
>>> # starting from a known character char="d" within the string and of m length;
>>> indx = s.index(char)
>>> s[indx:indx+m]
'def'
>>> # starting from a known substring chars="cd" within the string and of m length.
>>> indx = s.index(chars)
>>> s[indx:indx+m]
'cde'
>>>
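
The task requires indexing by code point rather than by code unit. As a small illustrative sketch, assuming Python 3.3 or later (where str is a sequence of code points), slicing behaves the same for characters outside the Basic Multilingual Plane. The sample string is chosen for this illustration and is not part of the original solution:

```python
s = "ab\U0001D11Ecdefgh"  # U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP
n, m = 2, 3

print(len(s))                               # 9 code points
print(s[n - 1:n + m - 1] == "b\U0001D11Ec") # True: the clef counts as one character
print(len(s.encode("utf-8")))               # 12 UTF-8 code units (bytes)
print(len(s.encode("utf-16-le")) // 2)      # 10 UTF-16 code units
```

Note that code points are not the same as user-perceived characters: a base letter followed by a combining mark is still two code points.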

[edit] R

s <- "abcdefgh"
n <- 2; m <- 3; char <- 'd'; chars <- 'cd'
# substring(text, first, last) takes 1-based, inclusive positions
substring(s, n, n + m - 1)
substring(s, n)
substring(s, 1, nchar(s) - 1)
# regexpr() with fixed = TRUE returns the position of the first literal match
indx <- regexpr(char, s, fixed = TRUE)
substring(s, indx, indx + m - 1)
indx <- regexpr(chars, s, fixed = TRUE)
substring(s, indx, indx + m - 1)

[edit] Racket

 
#lang racket
 
(define str "abcdefghijklmnopqrstuvwxyz")
 
(define n 10)
(define m 2)
(define start-char #\x)
(define start-str "xy")
 
;; starting from n characters in and of m length;
(substring str n (+ n m)) ; -> "kl"
 
;; starting from n characters in, up to the end of the string;
(substring str n) ; -> "klmnopqrstuvwxyz"
 
;; whole string minus last character;
(substring str 0 (sub1 (string-length str))) ; -> "abcdefghijklmnopqrstuvwxy"
 
;; starting from a known character within the string and of m length;
(let ([i (caar (regexp-match-positions (regexp-quote (string start-char))
str))])
(substring str i (+ i m))) ; -> "xy"

;; starting from a known substring within the string and of m length.
(let ([i (caar (regexp-match-positions (regexp-quote start-str)
str))])
(substring str i (+ i m))) ; -> "xy"
 

[edit] Raven

define println use $s
$s print "\n" print
 
"0123456789" as $str
 
$str 3 2 extract println # at 4th pos get 2 chars
$str 8 4 extract println # at 9th pos get 4 chars (when only 2 chars available)
 
 
$str 3 $str length extract println # at 4th pos get all chars to end of str
$str 3 0x7FFFFFFF extract println # at 4th pos get all chars to end of str
 
$str 3 -1 extract println # at 4th pos get rest of chars but last one
$str 0 -1 extract println # all chars but last one
 
"3" as $matchChr # starting chr for extraction
4 as $subLen # Nr chars after found starting char
$str $matchChr split as $l
"" $l 0 set $l $matchChr join
0 $subLen extract println
 
"345" as $matchChrs # starting chrs for extraction
6 as $subLen # Nr chars after found starting chars
$str $matchChrs split as $l
"" $l 0 set $l $matchChrs join
0 $subLen extract println
Output:
34
89
3456789
3456789
345678
012345678
3456
345678

[edit] REBOL

rebol [
Title: "Retrieve Substring"
Author: oofoe
Date: 2009-12-06
URL: http://rosettacode.org/wiki/Retrieve_a_substring
]

 
s: "abcdefgh" n: 2 m: 3 char: #"d" chars: "cd"
 
; Note that REBOL uses base-1 indexing. Strings are series values,
; just like blocks or lists so I can use the same words to manipulate
; them. All these examples use the 'copy' function against the 's'
; string with a particular offset as needed.
 
; For the fragment "copy/part skip s n - 1 m", read from right to
; left. First you have 'm', which we ignore for now. Then evaluate
; 'n - 1' (makes 1), to adjust the offset. Then 'skip' jumps from the
; start of the string by that offset. 'copy' starts copying from the
; new start position and the '/part' refinement limits the copy by 'm'
; characters.
 
print ["Starting from n, length m:"
copy/part skip s n - 1 m]
 
; It may be helpful to see the expression with optional parentheses:
 
print ["Starting from n, length m (parens):"
(copy/part (skip s (n - 1)) m)]
 
; This example is much simpler, so hopefully it's easier to see how
; the string start is positioned for the copy:
 
print ["Starting from n to end of string:"
copy skip s n - 1]
 
print ["Whole string minus last character:"
copy/part s (length? s) - 1]
 
print ["Starting from known character, length m:"
copy/part find s char m]
 
print ["Starting from substring, length m:"
copy/part find s chars m]
Output:
Script: "Retrieve Substring" (6-Dec-2009)
Starting from n, length m: bcd
Starting from n, length m (parens): bcd
Starting from n to end of string: bcdefgh
Whole string minus last character: abcdefg
Starting from known character, length m: def
Starting from substring, length m: cde


[edit] REXX

Note:   in REXX,   the 1st character index of a string is   1,   not   0.

/*REXX program demonstrates various ways to extract substrings from a string of characters.  */
s='abcdefghijk'; n=4; m=3 /*define some REXX values (string, start index, substring length).*/
say 'original string='s /* [↑] M can be zero (which indicates a null string). */
say '──────────────────────────────────────────────────────────1'
 
u=substr(s,n,m) /*starting from N characters in and of M length. */
say u
parse var s =(n) a +(m) /*another way of doing the above by using the PARSE instruction*/
say a
say '──────────────────────────────────────────────────────────2'
 
u=substr(s,n) /*starting from N characters in, up to the end-of-string. */
say u
parse var s =(n) a /*another way of doing the above by using the PARSE instruction*/
say a
say '──────────────────────────────────────────────────────────3'
 
u=substr(s,1,length(s)-1) /*OK: the whole string except the last character. */
say u
v=substr(s,1,max(0,length(s)-1)) /*better: this version handles the case of a null string. */
say v
L=length(s) - 1
parse var s a +(L) /*another way of doing the above by using the PARSE instruction*/
say a
say '──────────────────────────────────────────────────────────4'
 
u=substr(s,pos('g',s),m) /*starting from a known char within the string & of M length.*/
say u
parse var s 'g' a +(m) /*another way of doing the above by using the PARSE instruction*/
say a
say '──────────────────────────────────────────────────────────5'
 
u=substr(s,pos('def',s),m) /*starting from a known substr within the string & of M length.*/
say u
parse var s 'def' a +(m) /*another way of doing the above by using the PARSE instruction*/
say a
/*stick a fork in it sir, we're all done and Bob's your uncle. */
Output:
original string=abcdefghijk
──────────────────────────────────────────────────────────1
def
def
──────────────────────────────────────────────────────────2
defghijk
defghijk
──────────────────────────────────────────────────────────3
abcdefghij
abcdefghij
abcdefghij
──────────────────────────────────────────────────────────4
ghi
ghi
──────────────────────────────────────────────────────────5
def
def

Programming note:   generally, the REXX   parse   statement is faster than using an assignment statement with a BIF (built-in function), but the use of   parse   is more obscure.

[edit] Ruby

str = 'abcdefgh'
n = 2
m = 3
puts str[n, m] #=> cde
puts str[n..m] #=> cd
puts str[n..-1] #=> cdefgh
puts str[0..-2] #=> abcdefg
puts str[str.index('d'), m] #=> def
puts str[str.index('de'), m] #=> def
puts str[/a.*d/] #=> abcd

[edit] Run BASIC

n  = 2
m  = 3
s$ = "abcdefgh"
c$ = "d"  ' known character
k$ = "cd" ' known substring
print mid$(s$, n, m) ' starting from n characters in and of m length
print mid$(s$, n) ' starting from n characters in, up to the end of the string
print mid$(s$, 1, len(s$) - 1) ' whole string minus last character
print mid$(s$, instr(s$, c$), m) ' starting from a known character within the string and of m length
print mid$(s$, instr(s$, k$), m) ' starting from a known substring within the string and of m length

[edit] SAS

data _null_;
a="abracadabra";
b=substr(a,2,3); /* first number is position, starting at 1,
second number is length */

put _all_;
run;

[edit] Sather

class MAIN is
main is
s ::= "hello world shortest program";
#OUT + s.substring(12, 5) + "\n";
#OUT + s.substring(6) + "\n";
#OUT + s.head( s.size - 1) + "\n";
#OUT + s.substring(s.search('w'), 5) + "\n";
#OUT + s.substring(s.search("ro"), 3) + "\n";
end;
end;

[edit] Scala

Library: Scala
object Substring {
// Ruler 1 2 3 4 5 6
// 012345678901234567890123456789012345678901234567890123456789012
val str = "The good life is one inspired by love and guided by knowledge."
val (n, m) = (21, 16) // A one-liner to set n = 21, m = 16
 
// Starting from n characters in and of m length
assert("inspired by love" == str.slice(n, n + m))
 
// Starting from n characters in, up to the end of the string
assert("inspired by love and guided by knowledge." == str.drop(n))
 
// Whole string minus last character
assert("The good life is one inspired by love and guided by knowledge" == str.init)
 
// Starting from a known character within the string and of m length
assert("life is one insp" == str.dropWhile(_ != 'l').take(m) )
 
// Starting from a known substring within the string and of m length
assert("good life is one" == { val i = str.indexOf("good"); str.slice(i, i + m) })
// Alternatively
assert("good life is one" == str.drop(str.indexOf("good")).take(m))
}

[edit] Scheme

Works with: Guile
(define s "Hello, world!")
(define n 5)
(define m (+ n 6))
 
(display (substring s n m))
(newline)
 
(display (substring s n))
(newline)
 
(display (substring s 0 (- (string-length s) 1)))
(newline)
 
(display (substring s (string-index s #\o) m))
(newline)
 
(display (substring s (string-contains s "lo") m))
(newline)

[edit] Sed

 
# 2 chars after skipping the first 3
$ echo string | sed -r 's/.{3}(.{2}).*/\1/'
in
# remove first 3 chars
$ echo string | sed -r 's/^.{3}//'
ing
# delete last char
$ echo string | sed -r 's/.$//'
strin
# `r' with two following chars
$ echo string | sed -r 's/.*(r.{2}).*/\1/'
rin
 

[edit] Seed7

$ include "seed7_05.s7i";
 
const proc: main is func
local
const string: stri is "abcdefgh";
const integer: N is 2;
const integer: M is 3;
begin
writeln(stri[N len M]);
writeln(stri[N ..]);
writeln(stri[.. pred(length(stri))]);
writeln(stri[pos(stri, 'c') len M]);
writeln(stri[pos(stri, "de") len M]);
end func;
Output:
bcd
bcdefgh
abcdefg
cde
def

[edit] Slate

 
#s := 'hello world shortest program'.
#n := 13.
#m := 4.
inform: (s copyFrom: n to: n + m).
inform: (s copyFrom: n).
inform: s allButLast.
inform: (s copyFrom: (s indexOf: $w) to: (s indexOf: $w) + m).
inform: (s copyFrom: (s indexOfSubSeq: 'ro') to: (s indexOfSubSeq: 'ro') + m).
 

[edit] Smalltalk

The distinction between searching for a single character and searching for a string within another string is rather blurred. In the following code, instead of using 'w' (a string) we could use $w (a character), but it makes no difference.

|s|
s := 'hello world shortest program'.
 
(s copyFrom: 13 to: (13+4)) displayNl.
"4 is the length (5) - 1, since we need the index of the
last char we want, which is included"

 
(s copyFrom: 7) displayNl.
(s allButLast) displayNl.
 
(s copyFrom: ((s indexOfRegex: 'w') first)
to: ( ((s indexOfRegex: 'w') first) + 4) ) displayNl.
(s copyFrom: ((s indexOfRegex: 'ro') first)
to: ( ((s indexOfRegex: 'ro') first) + 2) ) displayNl.

These last two examples in particular seem rather complex, so we can extend the string class.

Works with: GNU Smalltalk
String extend [
copyFrom: index length: nChar [
^ self copyFrom: index to: ( index + nChar - 1 )
]
copyFromRegex: regEx length: nChar [
|i|
i := self indexOfRegex: regEx.
^ self copyFrom: (i first) length: nChar
]
].
 
"and show it simpler..."
 
(s copyFrom: 13 length: 5) displayNl.
(s copyFromRegex: 'w' length: 5) displayNl.
(s copyFromRegex: 'ro' length: 3) displayNl.

[edit] SNOBOL4

	string = "abcdefghijklmnopqrstuvwxyz"
n = 12
m = 5
known_char = "q"
known_str = "pq"
* starting from n characters in and of m length;
string len(n - 1) len(m) . output
* starting from n characters in, up to the end of the string;
string len(n - 1) rem . output
* whole string minus last character;
string rtab(1) . output
* starting from a known character within the string and of m length;
string break(known_char) len(m) . output
* starting from a known substring <= m within the string and of m length.
string (known_str len(m - size(known_str))) . output
end
Output:
 lmnop
 lmnopqrstuvwxyz
 abcdefghijklmnopqrstuvwxy
 qrstu
 pqrst

[edit] Tcl

set str "abcdefgh"
set n 2
set m 3
 
puts [string range $str $n [expr {$n+$m-1}]]
puts [string range $str $n end]
puts [string range $str 0 end-1]
# Because Tcl does substrings with a pair of indices, it is easier to express
# the last two parts of the task as a chained pair of [string range] operations.
# A maximally efficient solution would calculate the indices in full first.
puts [string range [string range $str [string first "d" $str] end] 0 [expr {$m-1}]]
puts [string range [string range $str [string first "de" $str] end] 0 [expr {$m-1}]]

# From Tcl 8.5 onwards, these can be contracted somewhat.
puts [string range [string range $str [string first "d" $str] end] 0 $m-1]
puts [string range [string range $str [string first "de" $str] end] 0 $m-1]

Of course, if you were doing 'position-plus-length' a lot, it would be easier to add another subcommand to string, like this:

Works with: Tcl version 8.5
# Define the substring operation, efficiently
proc ::substring {string start length} {
string range $string $start [expr {$start + $length - 1}]
}
# Plumb it into the language
set ops [namespace ensemble configure string -map]
dict set ops substr ::substring
namespace ensemble configure string -map $ops
 
# Now show off by repeating the challenge!
set str "abcdefgh"
set n 2
set m 3
 
puts [string substr $str $n $m]
puts [string range $str $n end]
puts [string range $str 0 end-1]
puts [string substr $str [string first "d" $str] $m]
puts [string substr $str [string first "de" $str] $m]
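
For comparison, the same position-plus-length idea can be sketched in Python, where slices also take a pair of indices (here with an exclusive end). The function name substr is an assumption made for this illustration, and indices are zero-based as in the Tcl code above:

```python
def substr(s, start, length):
    """Return `length` characters of s starting at zero-based index `start`."""
    return s[start:start + length]


s = "abcdefgh"
n, m = 2, 3

print(substr(s, n, m))              # cde
print(s[n:])                        # cdefgh
print(s[:-1])                       # abcdefg
print(substr(s, s.index("d"), m))   # def
print(substr(s, s.index("de"), m))  # def
```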

[edit] TUSCRIPT

 
$$ MODE TUSCRIPT
string="abcdefgh", n=4,m=n+2
substring=EXTRACT (string,#n,#m)
PRINT substring
substring=Extract (string,#n,0)
PRINT substring
substring=EXTRACT (string,0,-1)
PRINT substring
n=SEARCH (string,":d:"),m=n+2
substring=EXTRACT (string,#n,#m)
PRINT substring
substring=EXTRACT (string,":{substring}:"|,0)
PRINT substring
 
Output:
de
defgh
abcdefg
de
fgh

[edit] UNIX Shell

[edit] POSIX shells

Works with: Almquist Shell
str="abc qrdef qrghi"
n=6
m=3
 
expr "x$str" : "x.\{$n\}\(.\{1,$m\}\)"
expr "x$str" : "x.\{$n\}\(.*\)"
printf '%s\n' "${str%?}"
expr "r${str#*r}" : "\(.\{1,$m\}\)"
expr "qr${str#*qr}" : "\(.\{1,$m\}\)"
Output:
def
def qrghi
abc qrdef qrgh
rde
qrd

This program uses expr(1) to capture a substring.

[edit] Bourne Shell

Works with: Bourne Shell
str="abc qrdef qrghi"
n=6
m=3
 
expr "x$str" : "x.\{$n\}\(.\{1,$m\}\)"
expr "x$str" : "x.\{$n\}\(.*\)"
expr "x$str" : "x\(.*\)."
 
index() {
i=0 s=$1
until test "x$s" = x || expr "x$s" : "x$2" >/dev/null; do
i=`expr $i + 1` s=`expr "x$s" : "x.\(.*\)"`
done
echo $i
}
expr "x$str" : "x.\{`index "$str" r`\}\(.\{1,$m\}\)"
expr "x$str" : "x.\{`index "$str" qr`\}\(.\{1,$m\}\)"
Output:
def
def qrghi
abc qrdef qrgh
rde
qrd

[edit] zsh

Works with: zsh

Note that the last two constructs won't work with bash as only zsh supports nested string manipulation.

 
#!/bin/zsh
string='abcdefghijk'
echo ${string:2:3} # Display 3 chars starting 2 chars in, i.e. 'cde'
echo ${string:2} # Starting 2 chars in, display to end of string
echo ${string:0:${#string}-1} # Whole string minus last character
echo ${string%?} # Shorter variant of the above
echo ${${string/*c/c}:0:3} # Display 3 chars starting with 'c'
echo ${${string/*cde/cde}:0:3} # Display 3 chars starting with 'cde'
 

[edit] Pipe

This example shows how to cut(1) a substring from a string.

Translation of: AWK
Works with: Almquist Shell
#!/bin/sh
str=abcdefghijklmnopqrstuvwxyz
n=12
m=5
 
printf %s "$str" | cut -c $n-`expr $n + $m - 1`
printf %s "$str" | cut -c $n-
printf '%s\n' "${str%?}"
printf q%s "${str#*q}" | cut -c 1-$m
printf pq%s "${str#*pq}" | cut -c 1-$m
Output:
$ sh substring.sh                                                              
lmnop
lmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
qrstu
pqrst

[edit] Vala

 
string s = "Hello, world!";
int n = 1;
int m = 3;
// start at n and go m letters
string s_n_to_m = s[n:n+m];
// start at n and go to end
string s_n_to_end = s[n:s.length];
// start at beginning and show all but last
string s_notlast = s[0:s.length - 1];
// start from known letter and then go m letters
int index_of_l = s.index_of("l");
string s_froml_for_m = s[index_of_l:index_of_l + m];
// start from known substring then go m letters
int index_of_lo = s.index_of("lo");
string s_fromlo_for_m = s[index_of_lo:index_of_lo + m];
 

[edit] Wart

s <- "abcdefgh"
s.0
=> "a"
 
# starting from n characters in and of m length;
def (substr s start len)
(s start start+len)
(substr s 3 2)
=> "de"
 
# starting from n characters in, up to the end of the string
(s 3 nil)
=> "defgh"
 
# whole string minus last character;
(s 0 -1)
=> "abcdefg"

# starting from a known character within the string and of m length;
# starting from a known substring within the string and of m length.
let start (pos s pat)
(s start start+m)

[edit] Yorick

str = "abcdefgh";
n = 2;
m = 3;
 
// starting from n character in and of m length
write, strpart(str, n:n+m-1);
// starting from n character in, up to the end of the string
write, strpart(str, n:);
// whole string minus last character
write, strpart(str, :-1);
// starting from a known character within the string and of m length
match = strfind("d", str);
write, strpart(str, [match(1), match(1)+m]);
// starting from a known substring within the string and of m length
match = strfind("cd", str);
write, strpart(str, [match(1), match(1)+m]);

[edit] zkl

8 bit ASCII

var str = "abcdefgh", n = 2, m = 3;
str[n,m] //-->"cde"
str[n,*] //-->"cdefgh"
str[0,-1] //-->"abcdefg"
str[str.find("d"),m] //-->"def"
str[str.find("de"),m] //-->"def"
Retrieved from "http://rosettacode.org/mw/index.php?title=Substring&oldid=192754"