String Character Length

{{task}}
{{Template:split-review}}
In this task, the goal is to find the <em>character</em> length of a string. This means encodings like [[UTF-8]] need to be handled properly, as there is not necessarily a one-to-one relationship between bytes and characters.
 
For byte length, see [[String Byte Length]].
 
=={{header|ActionScript}}==
myStrVar.length
 
=={{header|Ada}}==
'''Compiler:''' GCC 4.1.2
 
Str : String := "Hello World";
Length : constant Natural := Str'Length;
 
=={{header|AppleScript}}==
count of "Hello World"
Or:
count "Hello World"
 
=={{header|AWK}}==
From within any code block:
w=length("Hello, world!") # static string example
x=length("Hello," s " world!") # dynamic string example
y=length($1) # input field example
z=length(s) # variable name example
Ad hoc program from command line:
echo "Hello, world!" | awk '{print length($0)}'
From executable script: (prints for every line arriving on stdin)
#!/usr/bin/awk -f
{print"The length of this line is "length($0)}
 
=={{header|C}}==
'''Standard:''' [[ANSI C]] (AKA [[C89]]):
 
'''Compiler:''' GCC 3.3.3
 
#include <string.h>
int main(void)
{
const char *string = "Hello, world!";
size_t length = strlen(string);
return 0;
}
 
or by hand:
 
int main(void)
{
const char *string = "Hello, world!";
size_t length = 0;
const char *p = string; /* keep the const qualifier; no cast is needed */
while (*p++ != '\0') length++;
return 0;
}
 
or (for arrays of char only)
 
#include <stdlib.h>
int main(void)
{
char const s[] = "Hello, world!";
size_t length = sizeof s - 1;
return 0;
}
 
For wide character strings (usually Unicode):
 
#include <stdio.h>
#include <wchar.h>
int main(void)
{
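const wchar_t *s = L"\x304A\x306F\x3088\x3046"; /* Japanese hiragana ohayou */
size_t length;
length = wcslen(s);
/* size_t is unsigned, so cast to unsigned long and print with %lu */
printf("Length in characters = %lu\n", (unsigned long) length);
/* sizeof(s) is only the size of the pointer; multiply the character count instead */
printf("Length in bytes = %lu\n", (unsigned long) (length * sizeof(wchar_t)));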
return 0;
}
 
=={{header|Objective-C}}==
// Return the length in UTF-16 code units, which is how NSString measures length
unsigned length = [@"Hello World!" length];
 
=={{header|C++}}==
 
'''Standard:''' [[ISO C plus plus|ISO C++]] (AKA [[C plus plus 98|C++98]]):
 
'''Compiler:''' g++ 4.0.2
 
#include <string> // note: '''not''' <string.h>
int main()
{
std::string s = "Hello, world!";
// Always in characters == bytes since sizeof(char) == 1
std::string::size_type length = s.length(); // option 1: In Characters/Bytes
std::string::size_type size = s.size(); // option 2: In Characters/Bytes
}
 
For wide character strings:
 
#include <string>
int main()
{
std::wstring s = L"\u304A\u306F\u3088\u3046";
std::wstring::size_type length = s.length();
}
 
=={{header|C sharp|C#}}==
'''Platform:''' [[.NET]]
'''Language Version:''' 1.0+
 
string s = "Hello, world!";
int clength = s.Length; // In characters (UTF-16 code units)
int blength = System.Text.Encoding.UTF8.GetByteCount(s); // In bytes, assuming a UTF-8 encoding
 
=={{header|Clean}}==
 
Clean Strings are unboxed arrays of characters. Characters are always a single byte. The function size returns the number of elements in an array.
 
import StdEnv
strlen :: String -> Int
strlen string = size string
Start = strlen "Hello, world!"
 
=={{header|ColdFusion}}==
#len("Hello World")#
 
=={{header|Common Lisp}}==
(length "Hello World")
 
=={{header|Component Pascal}}==
LEN("Hello, World!")
 
=={{header|E}}==
"Hello World".size()
 
=={{header|Forth}}==
The 1994 ANS standard does not have any notion of a particular character encoding, although it distinguishes between character and machine-word addresses. (There is some ongoing work on standardizing an "XCHAR" wordset for dealing with strings in particular encodings such as UTF-8.)
 
'''Interpreter:''' ANS Forth
 
The following code counts the number of UTF-8 characters in a null-terminated string. It relies on the fact that every byte of a UTF-8 character except the first has the binary bit pattern "10xxxxxx".
 
2 base !
: utf8+ ( str -- str )
begin
char+
dup c@
11000000 and
10000000 <>
until ;
decimal
: count-utf8 ( zstr -- n )
0
begin
swap dup c@
while
utf8+
swap 1+
repeat drop ;
 
=={{header|Haskell}}==
'''Interpreter:''' [[GHC | GHCi]] 6.6, [[Hugs]]
 
'''Compiler:''' [[GHC]] 6.6
 
strlen = length "Hello, world!"
 
=={{header|IDL}}==
'''Compiler:''' any IDL compiler should do
 
length = strlen("Hello, world!")
 
=={{header|Java}}==
 
Java encodes strings in UTF-16, which represents each character with one or two 16-bit values. The most commonly used characters are represented by one 16-bit value, while rarer ones like some mathematical symbols are represented by two.
 
The length method of String objects gives the number of 16-bit values used to encode a string.
String s = "Hello, world!";
int length = s.length();
 
Since Java 1.5, the actual number of characters can be determined by calling the codePointCount method.
String str = "\uD834\uDD2A"; //U+1D12A
int length1 = str.length(); //2
int length2 = str.codePointCount(0, str.length()); //1
 
=={{header|JavaScript}}==
JavaScript encodes strings in UTF-16, which represents each character with one or two 16-bit values. The most commonly used characters are represented by one 16-bit value, while rarer ones like some mathematical symbols are represented by two.
 
JavaScript has no built-in way to determine how many characters are in a string. However, if the string only contains commonly used characters, the number of characters will be equal to the number of 16-bit values used to represent the characters.
var str1 = "Hello, world!";
var len1 = str1.length; //13
var str2 = "\uD834\uDD2A"; //U+1D12A represented by a UTF-16 surrogate pair
var len2 = str2.length; //2
 
=={{header|JudoScript}}==
//Store length of hello world in length and print it
. length = "Hello World".length();
 
=={{header|LSE64}}==
LSE uses counted strings: arrays of characters, where the first cell contains the number of characters in the string.
" Hello world" @ , # 11
 
=={{header|Lua}}==
 
'''Interpreter:''' [[Lua]] 5.1 or later (the # length operator is not available in 5.0).
 
str="Hello world" -- named str so the global string library is not overwritten
length=#str
 
=={{header|MAXScript}}==
"Hello world".count
 
=={{header|mIRC Scripting Language}}==
'''Interpreter:''' [[mIRC]]
 
alias stringlength { echo -a Your Name is: $len($$?="Whats your name") letters long! }
 
=={{header|OCaml}}==
'''Interpreter'''/'''Compiler:''' [[Ocaml]] 3.09
 
String.length "Hello world";;
 
 
=={{header|Perl}}==
'''Interpreter:''' [[Perl]] any 5.X
 
my $length = length "Hello, world!";
 
=={{header|PHP}}==
$length = strlen('Hello, world!');
 
=={{header|PL/SQL|PL/SQL}}==
DECLARE
string VARCHAR2( 50 ) := 'Hello, world!';
stringlength NUMBER;
BEGIN
stringlength := length( string );
END;
 
=={{header|Python}}==
'''Interpreter:''' [[Python]] 2.4
 
len() returns the length of a Unicode string or a plain ASCII string. To get the character length of an encoded string, decode it first:
<pre>
>>> len('ascii')
5
>>> len(u'\u05d0') # the letter Alef as unicode literal
1
>>> len('\xd7\x90'.decode('utf-8')) # Same encoded as utf-8 string
1
</pre>
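
The same distinction can be sketched for Python 3, where str is always a sequence of Unicode code points and encoded data is a separate bytes type. This is a minimal sketch that assumes Python 3.3 or later rather than the 2.4 interpreter listed above; the variable names are illustrative only.

```python
# Python 3 sketch (assumes 3.3+): str holds code points, bytes holds encoded data.
text = "\u05d0"                     # the letter Alef as a str
encoded = text.encode("utf-8")      # the same letter as UTF-8 bytes

char_length = len(text)             # counts code points
byte_length = len(encoded)          # counts bytes
decoded_length = len(encoded.decode("utf-8"))  # decode first, then count

# A character outside the Basic Multilingual Plane is still one code point,
# although UTF-16 needs two 16-bit values (a surrogate pair) for it.
astral = "\U0001D12A"
astral_length = len(astral)
astral_utf16_units = len(astral.encode("utf-16-le")) // 2

print(char_length, byte_length, decoded_length)
print(astral_length, astral_utf16_units)
```

Calling len() on the bytes object gives the byte length, so the character length of encoded input is only available after decoding it with the right encoding.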
 
=={{header|Ruby}}==
'''Library:''' [[active_support]]
 
require 'active_support'
puts "Hello World".chars.length
 
=={{header|Scheme}}==
(string-length "Hello world")
 
=={{header|Seed7}}==
length("Hello, world!")
 
=={{header|Smalltalk}}==
string := 'Hello, world!'.
string size.
 
=={{header|Standard ML}}==
'''Interpreter:''' [[Standard ML of New Jersey | SML/NJ]] 110.60, [[Moscow ML]] 2.01 (January 2004)
 
'''Compiler:''' [[MLton]] 20061107
 
val strlen = size "Hello, world!";
 
=={{header|Tcl}}==
Basic version:
 
string length "Hello, world!"
 
A more elaborate version, which needs '''Interpreter''' any 8.X (tested on 8.4.12):
 
fconfigure stdout -encoding utf-8; #So that Unicode string will print correctly
set s1 "hello, world"
set s2 "\u304A\u306F\u3088\u3046"
puts [format "length of \"%s\" in characters is %d" $s1 [string length $s1]]
puts [format "length of \"%s\" in characters is %d" $s2 [string length $s2]]
 
=={{header|UNIX Shell}}==
With external utilities:
 
'''Interpreter:''' any [[Bourne Shell]]
 
string='Hello, world!'
# wc -m counts characters (wc -c would count bytes); printf avoids the non-portable echo -n
length=`printf '%s' "$string" | wc -m | tr -dc '0-9'`
echo $length # if you want it printed to the terminal
 
With SUSv3 parameter expansion modifier:
 
'''Interpreter:''' [[Almquist SHell]] (NetBSD 3.0), [[Bourne Again SHell]] 3.2, [[Korn SHell]] (5.2.14 99/07/13.2), [[Z SHell]]
 
string='Hello, world!'
length="${#string}"
echo $length # if you want it printed to the terminal
 
 
=={{header|VBScript}}==
Len(string|varname)
 
Returns the length of the string|varname
Returns null if string|varname is null
 
=={{header|xTalk}}==
'''Interpreter:''' HyperCard
 
put the length of "Hello World"
 
or
 
put the number of characters in "Hello World"