Binary strings

From Rosetta Code
You are encouraged to solve this task according to the task description, using any language you may know.

Many languages have powerful and useful (binary safe) string manipulation functions, while others don't, making it harder for these languages to accomplish some tasks. This task is about creating functions to handle binary strings (strings made of arbitrary bytes, i.e. byte strings according to Wikipedia) for those languages that don't have built-in support for them. If your language of choice does have this built-in support, show a possible alternative implementation for the functions or abilities already provided by the language. In particular the functions you need to create are:

  • String creation and destruction (when needed and if there's no garbage collection or similar mechanism)
  • String assignment
  • String comparison
  • String cloning and copying
  • Check if a string is empty
  • Append a byte to a string
  • Extract a substring from a string
  • Replace every occurrence of a byte (or a string) in a string with another string
  • Join strings

Possible contexts of use: compression algorithms (like LZW compression), L-systems (manipulation of symbols), many more.
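As a quick orientation before the per-language entries, the listed operations can be sketched with Python 3's built-in `bytes` (immutable) and `bytearray` (mutable) types. This is only an illustration of what each operation means, not a reference solution; the variable names are made up for the example.

```python
# Minimal sketch of the task's operations, assuming Python 3.
s = bytes([0x68, 0x69, 0x00, 0xFF])   # creation: arbitrary bytes, NUL included
t = s                                 # assignment: another name for the same object
clone = bytes(bytearray(s))           # cloning: a new object with the same bytes

is_equal = (s == clone)               # comparison is byte by byte
is_less = (b"ab" < b"ac")             # ordering is lexicographic
is_empty = (len(b"") == 0)            # emptiness check

buf = bytearray(s)                    # mutable copy
buf.append(0x21)                      # append a single byte

sub = s[1:3]                          # substring: the bytes at index 1 and 2
replaced = s.replace(b"\x00", b"--")  # replace every occurrence of a byte
joined = b"".join([s, b"!", sub])     # join strings

print(bytes(buf), sub, replaced, joined)
```

Destruction is left to the garbage collector here, which is why it does not appear in the sketch.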

Ada

Ada has native support for one-dimensional arrays, which provide all the specified operations. String is a special case of an array. The array of bytes is predefined in Ada in the package System.Storage_Elements (LRM 13.7.1). Storage_Element is a substitute for byte.

<lang Ada>declare

  Data : Storage_Array (1..20); -- Data created

begin

  Data := (others => 0); -- Assign all zeros
  if Data = (1..10 => 0) then -- Compare with 10 zeros
     declare
        Copy : Storage_Array := Data; -- Copy Data
     begin
        if Data'Length = 0 then -- If empty
           ...
        end if;
     end;
  end if;
  ... Data & 1 ...         -- The result is Data with byte 1 appended
  ... Data & (1,2,3,4) ... -- The result is Data with bytes 1,2,3,4 appended
  ... Data (3..5) ...      -- The result is the substring of Data from 3 to 5

end; -- Data destructed</lang>

Storage_Array is a "binary string" used for memory representation. For stream-oriented I/O communication Ada provides an alternative "binary string" called Stream_Element_Array (LRM 13.13.1). When dealing with octets of bits, programmers are encouraged to provide a data type of their own to ensure that the byte is exactly 8 bits long. For example:

<lang Ada>type Octet is mod 2**8;
for Octet'Size use 8;
type Octet_String is array (Positive range <>) of Octet;</lang>

Alternatively:

<lang Ada>with Interfaces; use Interfaces;
...
type Octet is new Interfaces.Unsigned_8;
type Octet_String is array (Positive range <>) of Octet;</lang>

Note that all of these types will have all operations described above.

ALGOL 68

Translation of: Tcl
Works with: ALGOL 68 version Standard - no extensions to language used
Works with: ALGOL 68G version Any - tested with release mk15-0.8b.fc9.i386

<lang algol68># String creation #
STRING a,b,c,d,e,f,g,h,i,j,l,r;
a := "hello world";
print((a, new line));

# String destruction (for garbage collection) #
b := ();
BEGIN
  LOC STRING lb := "hello earth";  # allocate off the LOC stack  #
  HEAP STRING hb := "hello moon"; # allocate out of the HEAP space #
  ~
END; # local variable "lb" has LOC stack space recovered at END #

# String assignment #
c := "a"+REPR 0+"b";
print (("string length c:", UPB c, new line));# ==> 3 #

# String comparison #
l := "ab"; r := "CD";

BOOL result;
FORMAT summary = $""""g""" is "b("","NOT ")"lexicographically "g" """g""""l$ ;

result := l < r OR l LT r; printf((summary, l, result, "less than", r));
result := l <= r OR l LE r # OR l ≤ r #; printf((summary, l, result, "less than or equal to", r));
result := l = r OR l EQ r; printf((summary, l, result, "equal to", r));
result := l /= r OR l NE r # OR l ≠ r #; printf((summary, l, result, "not equal to", r));
result := l >= r OR l GE r # OR l ≥ r #; printf((summary, l, result, "greater than or equal to", r));
result := l > r OR l GT r; printf((summary, l, result, "greater than", r));

# String cloning and copying #
e := f;

# Check if a string is empty #
IF g = "" THEN print(("g is empty", new line)) FI;
IF UPB g = 0 THEN print(("g is empty", new line)) FI;

# Append a byte to a string #
h +:= "A";

# Append a string to a string #
h +:= "BCD";
h PLUSAB "EFG";

# Prepend a string to a string - because STRING addition isn't commutative #
"789" +=: h;
"456" PLUSTO h;
print(("The result of prepends and appends: ", h, new line));

# Extract a substring from a string #
i := h[2:3];
print(("Substring 2:3 of ",h," is ",i, new line));

# Replace every occurrence of a byte (or a string) in a string with another string #
PROC replace = (STRING string, old, new, INT count)STRING: (
  INT pos;
  STRING tail := string, out;
  TO count WHILE string in string(old, pos, tail) DO
    out +:= tail[:pos-1]+new;
    tail := tail[pos+UPB old:]
  OD;
  out+tail
);

j := replace("hello world", "world", "planet", max int);
print(("After replace string: ", j, new line));

INT offset = 7;

# Replace a character at an offset in the string #
j[offset] := "P";
print(("After replace 7th character: ", j, new line));

# Replace a substring at an offset in the string #
j[offset:offset+3] := "PlAN";
print(("After replace 7:10th characters: ", j, new line));

# Insert a string before an offset in the string #
j := j[:offset-1]+"INSERTED "+j[offset:];
print(("Insert string before 7th character: ", j, new line));

# Join strings #
a := "hel"; b := "lo w"; c := "orld";
d := a+b+c;

print(("a+b+c is ",d, new line));

# Pack a string into the target CPU's word #
BYTES word := bytes pack(d);

# Extract a CHAR from a CPU word #
print(("7th byte in CPU word is: ", offset ELEM word, new line))</lang>

Output:

hello world
string length c:         +3
"ab" is NOT lexicographically less than "CD"
"ab" is NOT lexicographically less than or equal to "CD"
"ab" is NOT lexicographically equal to "CD"
"ab" is lexicographically not equal to "CD"
"ab" is lexicographically greater than or equal to "CD"
"ab" is lexicographically greater than "CD"
g is empty
g is empty
The result of prepends and appends: 456789ABCDEFG
Substring 2:3 of 456789ABCDEFG is 56
After replace string: hello planet
After replace 7th character: hello Planet
After replace 7:10th characters: hello PlANet
Insert string before 7th character: hello INSERTED PlANet
a+b+c is hello world
7th byte in CPU word is: w

C

estrings.h <lang c>#ifndef ESTRINGS_H_
#define ESTRINGS_H_

#include <string.h>
#include <stdlib.h>
#include <stdbool.h>

#define BYTES_PER_BLOCK 128

struct StringStruct {
  char *bstring;
  size_t length;
  size_t blocks;
};
typedef struct StringStruct *String;

String newString();
String setString(String s, const char *p, size_t len);
String appendChar(String s, char c);
int compareStrings(String s1, String s2);
void destroyString(String s);
String copyString(String to, String from);
String cloneString(String s);
bool stringIsEmpty(String s);
String subString(String s, size_t from, size_t to);
String replaceWith(String str, String ch, String repl);
String joinStrings(String s1, String s2);

#endif</lang>

estrings.c <lang c>#include "estrings.h"
#include <stdio.h>

#define NOT_IMPLEMENTED_YET fprintf(stderr, "not implemented yet\n")</lang>

<lang c>String newString() {

 String t;
 t = malloc(sizeof(struct StringStruct));
 if ( t == NULL ) return NULL;
 t->length = 0;
 t->blocks = 1;
 t->bstring = malloc(BYTES_PER_BLOCK * t->blocks);
 if ( t->bstring == NULL ) { free(t); return NULL; }
 return t;

}</lang>

<lang c>static bool _fitString(String s, size_t len) {

 /* Resize the buffer to hold at least len bytes.
    On failure the old buffer is left untouched. */
 size_t blocks = len/BYTES_PER_BLOCK + 1;
 char *p = realloc(s->bstring, blocks * BYTES_PER_BLOCK);
 if ( p == NULL ) return false;
 s->bstring = p;
 s->blocks = blocks;
 return true;

}</lang>

<lang c>String setString(String s, const char *p, size_t len) {

   if ( s != NULL )
   {
      if ( p == NULL ) { s->length = 0; return s; }
      /* copy only if the buffer could be resized */
      if ( _fitString(s, len) )
      {
        memcpy(s->bstring, p, len);
        s->length = len;
      }
   }
   return s;

}</lang>

<lang c>String appendChar(String s, char c) {

   if ( s == NULL ) return NULL;
   /* grow the length only once the buffer is known to be large enough */
   if ( _fitString(s, s->length + 1) )
   {
      s->bstring[s->length] = c;
      s->length++;
   }
   return s;

}</lang>

<lang c>int compareStrings(String s1, String s2) {

  if ( s1->length < s2->length ) return -1;
  if ( s1->length > s2->length ) return 1;
  return memcmp(s1->bstring, s2->bstring, s1->length);

}</lang>

<lang c>void destroyString(String s) {

  if ( s != NULL )
  {
     if ( s->bstring != NULL ) free(s->bstring);
     free(s);
  }

}</lang>

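<lang c>String copyString(String to, String from) {

   if ( (to->bstring != NULL) && (from->bstring != NULL) )
   {
      /* resize through a temporary so that a failed realloc
         does not lose the old buffer; NULL signals the failure */
      char *p = realloc(to->bstring, from->blocks * BYTES_PER_BLOCK);
      if ( p == NULL ) return NULL;
      to->bstring = p;
      to->blocks = from->blocks;
      to->length = from->length;
      memcpy(to->bstring, from->bstring, to->length);
   }
   return to;

}</lang>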

<lang c>String cloneString(String s) {

  String ps = malloc(sizeof(struct StringStruct));
  if ( ps != NULL )
  {
     ps->length = s->length;
     ps->blocks = s->blocks;
     ps->bstring = malloc(s->blocks * BYTES_PER_BLOCK);
     if ( ps->bstring != NULL )
     {
        memcpy(ps->bstring, s->bstring, s->length);
     } else {
        free(ps); return NULL;
     }
  }
  return ps;

}</lang>

<lang c>bool stringIsEmpty(String s) {

 if ( s == NULL ) return true;
 if ( s->length == 0 ) return true;
 return false;

}</lang>

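<lang c>String subString(String s, size_t from, size_t to) {

 /* Returns a new string holding bytes from..to of s, both ends inclusive
    (the reading implied by the whole-string shortcut below);
    to is clamped to the last byte so the copy never runs past the buffer. */
 String ss;
 if ( stringIsEmpty(s) || (to < from) || ( from >= s->length )) return newString();
 if ( to >= s->length ) to = s->length - 1;
 if ( (from == 0) && (to == s->length - 1) ) return cloneString(s);
 ss = newString();
 if ( ss == NULL ) return NULL;
 if ( _fitString(ss, to - from + 1) ) {
   ss->length = to - from + 1;
   memcpy(ss->bstring, s->bstring+from, ss->length);
 }
 return ss;

}</lang>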

<lang c>String replaceWith(String str, String ch, String repl) {

 String d = NULL;
 int occ = 0, i, j;
 if ( stringIsEmpty(str) ) return NULL;
 if ( stringIsEmpty(ch) ) return cloneString(str);
 if ( ch->length > 1 ) {
   NOT_IMPLEMENTED_YET;
   return str;
 }
 for(i=0; i < str->length; i++) {
   if ( str->bstring[i] == ch->bstring[0] ) occ++;
 }
 if ( occ == 0 ) return cloneString(str);
 d = newString();
 if ( d == NULL ) return NULL;
 if ( _fitString(d, str->length + occ * (repl->length - 1)) ) {
   d->length = str->length + occ * (repl->length - 1);
   for(i=0, j=0; i < str->length; i++) {
     if ( str->bstring[i] != ch->bstring[0] ) {
       d->bstring[j] = str->bstring[i];
       j++;
     } else {
       memcpy(d->bstring + j, repl->bstring, repl->length);
       j += repl->length;
     }
   }
 }
 return d;

}</lang>
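
replaceWith above handles only single-byte patterns and reports longer ones as not implemented. The usual scan-and-copy approach for multi-byte patterns can be sketched as follows. The sketch is in Python 3 rather than C to keep it short, and `replace_all` is a hypothetical helper name, not part of estrings.h.

```python
def replace_all(data: bytes, old: bytes, new: bytes) -> bytes:
    """Return a copy of data with every non-overlapping occurrence
    of old replaced by new, scanning left to right."""
    if not old:
        return bytes(data)        # nothing to search for: unchanged copy
    out = bytearray()
    i = 0
    while True:
        j = data.find(old, i)     # next occurrence at or after i, or -1
        if j < 0:
            break
        out += data[i:j]          # copy the bytes before the match
        out += new                # emit the replacement
        i = j + len(old)          # resume scanning after the match
    out += data[i:]               # copy whatever follows the last match
    return bytes(out)


print(replace_all(b"a\x00\x00b\x00\x00", b"\x00\x00", b"-"))
```

A C version would follow the same two phases as the single-byte code: count the matches to size the destination, then copy segments with memcpy.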

<lang c>String joinStrings(String s1, String s2) {

 String d;
 
 if ( stringIsEmpty(s1) ) return cloneString(s2);
 if ( stringIsEmpty(s2) ) return cloneString(s1);
 d = newString();
 if ( d == NULL ) return NULL;
 if ( _fitString(d, s1->length + s2->length) ) {
   memcpy(d->bstring, s1->bstring, s1->length);
   memcpy(d->bstring + s1->length, s2->bstring, s2->length);
   d->length = s1->length + s2->length;
 }
 return d;

}</lang>

<lang c>#undef NOT_IMPLEMENTED_YET</lang>

Common Lisp

String creation (garbage collection will handle its destruction), either by writing the string as a literal or by coercing a character list to a string <lang lisp> "string"
(coerce '(#\s #\t #\r #\i #\n #\g) 'string) </lang>

String assignment <lang lisp> (defvar *string* "string") </lang>

Comparing two strings <lang lisp> (equal "string" "string") </lang>

copy a string <lang lisp> (copy-seq "string") </lang>

Check if a string is empty <lang lisp> (defun string-empty-p (string)
  ;; true when the string has no characters
  (zerop (length string)))
</lang>

Append a byte to a string <lang lisp> (concatenate 'string "string" "b") </lang>

Extract a substring from a string <lang lisp> (subseq "string" 1 6) ; => "tring" </lang>

String replacement isn't covered by the ANSI standard; it is probably best to use a hand-written helper such as (replace-all) or the cl-ppcre library.

Joining strings works in the same way as appending bytes.

D

String creation (destruction is handled by the garbage collector) <lang d>byte[]str;</lang>

String assignment <lang d>byte[]str = cast(byte[])"blah";</lang>

String comparison <lang d>byte[]str1;
byte[]str2;
if (str1 == str2) // strings equal</lang>

String cloning and copying <lang d>byte[]str;
byte[]str2 = str.dup; // copy entire array</lang>

Check if a string is empty <lang d>byte[]str = cast(byte[])"blah";
if (str.length) // string not empty
if (!str.length) // string empty</lang>

Append a byte to a string <lang d>byte[]str;
str ~= 'a';</lang>

Extract a substring from a string <lang d>byte[]str = cast(byte[])"blork";
byte[]substr = str[1..$-1]; // this takes off the first and last bytes and assigns the rest to the new byte string</lang>

Replace every occurrence of a byte (or a string) in a string with another string <lang d>byte[]str = cast(byte[])"blah";
// replace is a library function (std.array in current Phobos); it returns a new array
str = cast(byte[])replace(cast(char[])str,"la","al");</lang>

Join strings <lang d>byte[]b1;
byte[]b2;
byte[]b3 = b1~b2;</lang>

E

(Since the task is not a specific program, the code here consists of example REPL sessions, not a whole program.)

In E, binary data is represented as ELists (implemented as arrays or ropes) of integers; a String is strictly a character string. ELists come in Flex (mutable) and Const (immutable) varieties.

To work with binary strings we must first have a byte type; this is a place where E shows its Java roots (to be fixed).

<lang e>? def int8 := <type:java.lang.Byte>
# value: int8</lang>

  1. There are several ways to create a FlexList; perhaps the simplest is: <lang e>? def bstr := [].diverge(int8)
# value: [].diverge()

? def bstr1 := [1,2,3].diverge(int8)
# value: [1, 2, 3].diverge()

? def bstr2 := [-0x7F,0x2,0x3].diverge(int8)
# value: [-127, 2, 3].diverge()</lang>
     As E is a memory-safe garbage-collected language there is no explicit destruction. It is good practice to work with immutable ConstLists when reasonable, however, especially when passing strings around.
  2. There is no specific assignment between FlexLists; a reference may be passed in the usual manner, or the contents of one could be copied to another as shown below.
  3. There is no comparison operation between FlexLists (since it would not be a stable ordering), but there is between ConstLists. <lang e>? bstr1.snapshot() < bstr2.snapshot()
# value: false</lang>
  4. To make an independent copy of a FlexList, simply .diverge() it again.
  5. Checking for emptiness: <lang e>? bstr1.size().isZero()
# value: false

? bstr.size().isZero()
# value: true</lang>
  6. Appending a single element to a FlexList is done by .push(x): <lang e>? bstr.push(0)
? bstr
# value: [0].diverge()</lang>
  7. Substrings, or runs, are always immutable and specified as start-end indexes (as opposed to first-last or start-count). Or, one can copy an arbitrary portion of one list into another using replace(target range, source list, source range). <lang e>? bstr1(1, 2)
# value: [2]

? bstr.replace(0, bstr.size(), bstr2, 1, 3)
? bstr
# value: [2, 3].diverge()</lang>
  8. Replacing must be written as an explicit loop; there is no built-in operation (though there is for character strings). <lang e>? for i => byte ? (byte == 2) in bstr2 { bstr2[i] := -1 }
? bstr2
# value: [-127, -1, 3].diverge()</lang>
  9. Two lists can be concatenated into a ConstList by +: bstr1 + bstr2. append appends on the end of a FlexList, and replace can be used to insert at the beginning or anywhere inside. <lang e>? bstr1.append(bstr2)
? bstr1
# value: [1, 2, 3, -127, 2, 3].diverge()</lang>

Factor

Factor has a byte-array type which works exactly like other arrays, except only bytes can be stored in it. Comparisons on byte-arrays (like comparisons on arrays) are lexicographic.

To convert a string to a byte-array: <lang factor>"Hello, byte-array!" utf8 encode .</lang>

B{
    72 101 108 108 111 44 32 98 121 116 101 45 97 114 114 97 121 33
}

Reverse: <lang factor>B{ 147 250 150 123 } shift-jis decode .</lang>

"日本"

Haskell

Note that any of the following functions can be assigned to 'variables' in a working program or could just as easily be written as one-off expressions. They are given here as they are to elucidate the workings of Haskell's type system. Hopefully the type declarations will help beginners understand what's going on. Also note that there are likely more concise ways to express many of the below functions. However, I have opted for clarity here as Haskell can be somewhat intimidating to the (currently) non-functional programmer. <lang haskell>import Text.Regex
{- The above import is needed only for the last function.
It is used there purely for readability and conciseness -}

{- Assigning a string to a 'variable'. We're being explicit about it just for show.
Haskell would be able to figure out the type of "world" -}
string = "world" :: String</lang>

<lang haskell>{- Comparing two given strings and returning a boolean result using a simple conditional -}
strCompare :: String -> String -> Bool
strCompare x y =

   if x == y
       then True
       else False</lang>

<lang haskell>{- As strings are equivalent to lists of characters in Haskell,
test and see if the given string is an empty list -}
strIsEmpty :: String -> Bool
strIsEmpty x =

   if x == []
       then True
       else False</lang>

<lang haskell>{- This is the most obvious way to append strings, using the built-in (++) concatenation operator.
Note the same would work to join any two strings (as 'variables' or as typed strings) -}
strAppend :: String -> String -> String
strAppend x y = x ++ y</lang>

<lang haskell>{- Take the specified number of characters from the given string -}
strExtract :: Int -> String -> String
strExtract x s = take x s</lang>

<lang haskell>{- Take a certain substring, specified by two integers, from the given string -}
strPull :: Int -> Int -> String -> String
strPull x y s = take (y-x+1) (drop x s)</lang>

<lang haskell>{- Much thanks to brool.com for this nice and elegant solution.
Using an imported standard library (Text.Regex), replace a given substring with another -}
strReplace :: String -> String -> String -> String
strReplace old new orig = subRegex (mkRegex old) orig new</lang>

J

J's literal data type supports arbitrary binary data. J's semantics are pass by value (with garbage collection) with a minor exception (mapped files).

  • String creation and destruction is not needed

<lang j> FIXME: show mapped file example</lang>

  • String assignment

<lang j> name=: 'value'</lang>

  • String comparison

<lang j> name1 -: name2</lang>

  • String cloning and copying is not needed

<lang j> FIXME: show mapped file example</lang>

  • Check if a string is empty

<lang j> 0=#string</lang>

  • Append a byte to a string

<lang j> string,byte</lang>

  • Extract a substring from a string

<lang j> 3{.5}.'The quick brown fox runs...'</lang>

  • Replace every occurrence of a byte (or a string) in a string with another string

<lang j>require 'strings'
'The quick brown fox runs...' rplc ' ';' !!! '</lang>

  • Join strings

<lang j> 'string1','string2'</lang>

Lua

<lang lua>foo = 'foo'             -- Dynamic typing: foo now holds the string 'foo'
bar = 'bar'
assert (foo == "foo")   -- Comparing string var to string literal
assert (foo ~= bar)
str = foo               -- Copy foo contents to str
if #str == 0 then       -- # operator returns string length
    print 'str is empty'
end
str = str .. string.char(50) -- Char concatenated with .. operator
substr = str:sub(1,3)   -- Extract substring from index 1 to 3, inclusively

str = "string string string string"
-- This function will replace all occurrences of 'replaced' in a string with 'replacement'
function replaceAll(str,replaced,replacement)
    -- plain (non-pattern) search, so arbitrary bytes are matched literally
    local a,b = str:find(replaced, 1, true)
    while a do
        str = str:sub(1,a-1) .. replacement .. str:sub(b+1,#str)
        -- continue after the inserted text so the replacement is never rescanned
        a,b = str:find(replaced, a + #replacement, true)
    end
    return str
end
str = replaceAll (str, 'ing', 'ong')
print (str)

str = foo .. bar -- Strings concatenate with .. operator</lang>

OCaml

  • String creation and destruction

String.create n returns a fresh string of length n, which initially contains arbitrary characters: <lang ocaml># String.create 10 ;;
- : string = "\000\023\000\000\001\000\000\000\000\000"</lang>

No explicit destruction is needed: OCaml features a garbage collector.

OCaml strings can contain any of the 256 possible bytes, including the null character '\000'.

  • String assignment

<lang ocaml># let str = "some text" ;;
val str : string = "some text"

(* modifying a character, OCaml strings are mutable *)
# str.[0] <- 'S' ;;
- : unit = ()</lang>

  • String comparison

<lang ocaml># str = "Some text" ;;
- : bool = true

# "Hello" > "Ciao" ;;
- : bool = true</lang>

  • String cloning and copying

<lang ocaml># String.copy str ;;
- : string = "Some text"</lang>

  • Check if a string is empty

<lang ocaml># let string_is_empty s = (s = "") ;;
val string_is_empty : string -> bool = <fun>

# string_is_empty str ;;
- : bool = false

# string_is_empty "" ;;
- : bool = true</lang>

  • Append a byte to a string

It is not possible to append a byte to a string in the sense of modifying the length of a given string, but we can use the concatenation operator to append a byte and return the result as a new string:

<lang ocaml># str ^ "!" ;;
- : string = "Some text!"</lang>

But OCaml has a module named Buffer for string buffers. This module implements string buffers that automatically expand as necessary. It provides accumulative concatenation of strings in quasi-linear time (instead of quadratic time when strings are concatenated pairwise).

<lang ocaml>(* buf is a Buffer.t, e.g. created with Buffer.create 16; c is a char *)
Buffer.add_char buf c</lang>

  • Extract a substring from a string

<lang ocaml># String.sub str 5 4 ;;
- : string = "text"</lang>

  • Replace every occurrence of a byte (or a string) in a string with another string

using the Str module <lang ocaml># #load "str.cma";;
# let replace str occ by =
    Str.global_replace (Str.regexp_string occ) by str
  ;;
val replace : string -> string -> string -> string = <fun>

# replace "The white dog let out a single, loud bark." "white" "black" ;;
- : string = "The black dog let out a single, loud bark."</lang>

  • Join strings

<lang ocaml># "Now just remind me" ^ " how the horse moves again?" ;;
- : string = "Now just remind me how the horse moves again?"</lang>

PL/I

<lang PL/I>
/* PL/I has immediate facilities for all those operations except for */
/* replace. */
s = t;                    /* assignment */
s = t || u;               /* catenation - append one or more bytes. */
if length(s) = 0 then ... /* test for an empty string. */
if s = t then ...         /* compare strings. */
u = substr(t, i, j);      /* take a substring of t beginning at the */
                          /* i-th character and continuing for j    */
                          /* characters.                            */
substr(u, i, j) = t;      /* replace j characters in u, beginning   */
                          /* with the i-th character.               */

/* In string t, replace every occurrence of string u with string v. */
replace: procedure (t, u, v);
   declare (t, u, v) character (*) varying;
   do until (k = 0);
      k = index(t, u);
      if k > 0 then
         t = substr(t, 1, k-1) || v || substr(t, k+length(u));
   end;
end replace;
</lang>

Python

Python does have a native byte string type. They can contain any byte sequence - they're not zero-terminated. There is a separate type for Unicode data.
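
The examples in this section use Python 2 syntax, where the plain string type is the byte string. As a hedged aside for readers on Python 3, the same split exists there under different names: `bytes` is the byte string and `str` holds Unicode text. A small sketch of the distinction, with illustrative variable names:

```python
# Assumes Python 3: str is Unicode text, bytes is the byte string type.
text = "caf\u00e9"                  # four Unicode characters
raw = text.encode("utf-8")          # the UTF-8 encoding of that text
binary = bytes([0, 255, 10])        # bytes need not be valid text at all

print(len(text), len(raw))          # character count vs byte count
print(raw)
print(raw.decode("utf-8") == text)  # decoding recovers the original text
```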

  • String creation and destruction

<lang python>s1 = "A 'string' literal \n"
s2 = 'You may use any of \' or " as delimiter'
s3 = """This text
   goes over several lines
       up to the closing triple quote"""</lang>

Strings are normal objects, and are destroyed once they are no longer reachable from any other live object.

  • String assignment

There is nothing special about assignments:

<lang python>s = "Hello "
t = "world!"
u = s + t   # + concatenates</lang>

  • String comparison

They're compared byte by byte, lexicographically:

<lang python>assert "Hello" == 'Hello'
assert '\t' == '\x09'
assert "one" < "two"
assert "two" >= "three"</lang>

  • String cloning and copying

Strings are immutable, so there is no need to clone/copy them. If you want to modify a string, you must create a new one with the desired contents. (There is another type, array, that provides a mutable buffer; there is also bytearray in Python 3)
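
To illustrate the mutable alternative mentioned above, here is a minimal sketch using bytearray. It assumes Python 3 (the rest of this section is written in Python 2 syntax), and the variable names are only for the example.

```python
# Assumes Python 3. bytes is immutable; bytearray is its mutable counterpart.
data = b"Some text"          # immutable byte string
buf = bytearray(data)        # mutable copy of the same bytes
buf[0] = ord("s")            # change one byte in place
buf += b"\x00\x07"           # extend in place with arbitrary bytes
frozen = bytes(buf)          # convert back to an immutable byte string

print(data, frozen)          # the original is unaffected by the edits
```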

  • Check if a string is empty

<lang python>if x == '': print "Empty string"
if not x: print "Empty string, provided you know x is a string"</lang>

  • Append a byte to a string

<lang python>txt = "Some text"
txt += '\x07'
# txt refers now to a new string having "Some text\x07"</lang>

  • Extract a substring from a string

Strings are sequences, they can be indexed with s[index] (index is 0-based) and sliced s[start:stop] (all characters from s[start] up to, but not including, s[stop])

<lang python>txt = "Some more text"
assert txt[4] == " "
assert txt[0:4] == "Some"
assert txt[:4] == "Some"      # you can omit the starting index if 0
assert txt[5:9] == "more"
assert txt[5:] == "more text" # omitting the second index means "to the end"</lang>

Negative indexes count from the end: -1 is the last byte, and so on:

<lang python>txt = "Some more text"
assert txt[-1] == "t"
assert txt[-4:] == "text"</lang>

  • Replace every occurrence of a byte (or a string) in a string with another string

Strings are objects and have methods, like replace:

<lang python>v1 = "hello world"
v2 = v1.replace("l", "L")
print v2 # prints heLLo worLd</lang>

  • Join strings

If they're separate variables, use the + operator:

<lang python>v1 = "hello"
v2 = "world"
msg = v1 + " " + v2</lang>

If the elements to join are contained inside any iterable container (e.g. a list)

<lang python>items = ["Smith", "John", "417 Evergreen Av", "Chimichurri", "481-3172"]
joined = ",".join(items)
print joined
# output:
# Smith,John,417 Evergreen Av,Chimichurri,481-3172</lang>

The reverse operation (split) is also possible:

<lang python>line = "Smith,John,417 Evergreen Av,Chimichurri,481-3172"
fields = line.split(',')
print fields
# output:
# ['Smith', 'John', '417 Evergreen Av', 'Chimichurri', '481-3172']</lang>

Ruby

A String object holds and manipulates an arbitrary sequence of bytes. There are also the Array#pack and String#unpack methods to convert data to binary strings. <lang ruby># string creation
x = "hello world"

# string destruction
x = nil

# string assignment with a null byte
x = "a\0b"
x.length # ==> 3

# string comparison
if x == "hello"
  puts "equal"
else
  puts "not equal"
end
y = 'bc'
if x < y
  puts "#{x} is lexicographically less than #{y}"
end

# string cloning
xx = x.dup
x == xx       # true, same length and content
x.equal?(xx)  # false, different objects

# check if empty
if x.empty?
  puts "is empty"
end

# append a byte
x << "\07"

# substring
xx = x[0..-2]

# replace bytes
y = "hello world".tr("l", "L")

# join strings
a = "hel"
b = "lo w"
c = "orld"
d = a + b + c</lang>

Tcl

Tcl strings are binary safe, and a binary string is any string that only contains UNICODE characters in the range \u0000 to \u00FF. <lang tcl># string creation
set x "hello world"

# string destruction
unset x

# string assignment with a null byte
set x a\0b
string length $x ;# ==> 3

# string comparison
if {$x eq "hello"} {puts equal} else {puts "not equal"}
set y bc
if {$x < $y} {puts "$x is lexicographically less than $y"}

# string copying; cloning happens automatically behind the scenes
set xx $x

# check if empty
if {$x eq ""} {puts "is empty"}
if {[string length $x] == 0} {puts "is empty"}

# append a byte
append x \07

# substring
set xx [string range $x 0 end-1]

# replace bytes
set y [string map {l L} "hello world"]

# join strings
set a "hel"
set b "lo w"
set c "orld"
set d $a$b$c</lang>