Character codes
From Rosetta Code
You are encouraged to solve this task according to the task description, using any language you may know.
Given a character value in your language, print its code (could be ASCII code, Unicode code, or whatever your language uses). For example, the character 'a' (lowercase letter A) has a code of 97 in ASCII (as well as Unicode, as ASCII forms the beginning of Unicode). Conversely, given a code, print out the corresponding character.
[edit] ActionScript
In ActionScript, you cannot take the character code of a character directly. Instead you must create a string and call charCodeAt with the character's position in the string as a parameter.
trace(String.fromCharCode(97)); //prints 'a'
trace("a".charCodeAt(0));//prints '97'
[edit] Ada
with Ada.Text_IO; use Ada.Text_IO;
procedure Char_Code is
begin
Put_Line (Character'Val (97) & " =" & Integer'Image (Character'Pos ('a')));
end Char_Code;
The predefined attributes S'Pos and S'Val are defined for every discrete subtype S, and Character is one such type. S'Pos yields the position number of a value, and S'Val yields the value at a given position. Sample output:
a = 97
[edit] ALGOL 68
In ALGOL 68 the format $g$ is type aware, hence the type conversion operators abs & repr are used to set the type.
main:(
printf(($gl$, ABS "a")); # for ASCII this prints "+97" EBCDIC prints "+129" #
printf(($gl$, REPR 97)) # for ASCII this prints "a"; EBCDIC prints "/" #
)
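The ASCII and EBCDIC results quoted in the comments above can be cross-checked outside ALGOL 68. The following is a minimal Python sketch, not part of the original solution; it assumes that the cp037 codec (one of several EBCDIC code pages) is representative of the EBCDIC variant meant, and the helper names code_of and char_of are hypothetical.

```python
# Compare the code of "a" and the character for code 97 under two encodings.
# Assumption: cp037 stands in for "EBCDIC"; other EBCDIC code pages exist.

def code_of(ch, encoding):
    """Return the single-byte code of ch in the given encoding."""
    return ch.encode(encoding)[0]

def char_of(code, encoding):
    """Return the character that a single-byte code represents."""
    return bytes([code]).decode(encoding)

print(code_of("a", "ascii"), char_of(97, "ascii"))  # 97 a
print(code_of("a", "cp037"), char_of(97, "cp037"))  # 129 /
```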
Character conversions may be available in the standard prelude so that when a foreign tape is mounted, the characters will be converted transparently as the tape's records are read.
FILE tape;
INT errno = open(tape, "/dev/tape1", stand out channel);
make conv(tape, ebcdic conv);
FOR record DO getf(tape, ( ~ )) OD; ~ # etc ... #
Every channel has an associated standard character conversion, which can be determined using the stand conv query routine and then applied to a particular file or tape. For example:
make conv(tape, stand conv(stand out channel))
[edit] AutoHotkey
MsgBox % Chr(97)
MsgBox % Asc("a")
[edit] AWK
AWK has no built-in way to convert a character into its ASCII (or other) code, but a function that does so can easily be built using an associative array whose keys are the characters. The opposite conversion can be done using printf (or sprintf) with %c.
function ord(c)
{
return chmap[c]
}
BEGIN {
for(i=0; i < 256; i++) {
chmap[sprintf("%c", i)] = i
}
print ord("a"), ord("b")
printf "%c %c\n", 97, 98
s = sprintf("%c%c", 97, 98)
print s
}
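The same lookup-table idea carries over to other languages. As a rough illustration only, here is a Python sketch that builds the table from the code-to-character direction, just as the AWK loop does with sprintf. The names chmap and ord_lookup are hypothetical, and the 0 to 255 range simply mirrors the loop above.

```python
# Build the table once: key = one-character string, value = its code.
# chr() plays the role of sprintf("%c", i) in the AWK version.
chmap = {chr(i): i for i in range(256)}

def ord_lookup(c):
    """Look the code up in the table instead of computing it."""
    return chmap[c]

print(ord_lookup("a"), ord_lookup("b"))  # 97 98
```

A character outside the table raises KeyError here, whereas the AWK version returns an empty value.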
[edit] BASIC
Works with: QuickBasic version 4.5
charCode = 97
char$ = "a" 'string variables need the $ suffix
PRINT CHR$(charCode) 'prints a
PRINT ASC(char$) 'prints 97
[edit] Befunge
"A". 88*1+,
[edit] C
char is already an integer type in C, and it gets automatically promoted to int. So you can use a character where you would otherwise use an integer. Conversely, you can use an integer where you would normally use a character, except you may need to cast it, as char is smaller.
#include <stdio.h>
int main() {
printf("%d\n", 'a'); /* prints "97" */
printf("%c\n", 97); /* prints "a"; we don't have to cast because printf is type agnostic */
return 0;
}
[edit] C++
char is already an integer type in C++, and it gets automatically promoted to int. So you can use a character where you would otherwise use an integer. Conversely, you can use an integer where you would normally use a character, except you may need to cast it, as char is smaller.
In this case, the output operator << is overloaded to handle integer (outputs the decimal representation) and character (outputs just the character) types differently, so we need to cast it in both cases.
#include <iostream>
int main() {
std::cout << (int)'a' << std::endl; // prints "97"
std::cout << (char)97 << std::endl; // prints "a"
return 0;
}
[edit] C#
C# represents strings and characters internally as UTF-16, so casting a char to an int returns its UTF-16 code unit, which equals the Unicode code point for characters in the Basic Multilingual Plane.
using System;
namespace RosettaCode.CharacterCode
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine((int) 'a'); //Prints "97"
Console.WriteLine((char) 97); //Prints "a"
}
}
}
[edit] Clojure
(print (int \a)) ; prints "97"
(print (char 97)) ; prints "a"
; Unicode is also available, as Clojure uses the underlying Java Strings & chars
(print (int \π)) ; prints 960
(print (char 960)) ; prints "π"
[edit] Common Lisp
(princ (char-code #\a)) ; prints "97"
(princ (code-char 97)) ; prints "a"
[edit] D
Could be treated like C, but since the standard string type is UTF-8, let's be verbose.
import std.stdio, std.utf;
void main() {
string test = "a";
size_t index = 0;
// get four-byte utf32 value for index 0
// this returns dchar, so cast it to numeric
writeln(cast(uint) test.decode(index));
// index has moved to next character position in input
assert(index == 1);
}
[edit] E
? 'a'.asInteger()
# value: 97
? <import:java.lang.makeCharacter>.asChar(97)
# value: 'a'
[edit] Erlang
In Erlang, a string is simply a list of integer character codes; only the printed representation differs. Thus:
1> F = fun([X]) -> X end.
#Fun<erl_eval.6.13229925>
2> F("a").
97
If entered manually, one can also get ASCII codes by prefixing characters with $:
3> $a.
97
Unicode is fully supported since release R13A only.
[edit] F#
let c = 'A'
let n = 65
printfn "%d" (int c)
printfn "%c" (char n)
Output:
65
A
[edit] Factor
CHAR: katakana-letter-a .
"ア" first .
12450 1string print
[edit] FALSE
'A."
"65,
[edit] Forth
As with C, characters are just integers on the stack which are treated as ASCII.
char a
dup . \ 97
emit \ a
[edit] Fortran
Functions ACHAR and IACHAR specifically work with the ASCII character set, while the results of CHAR and ICHAR will depend on the default character set being used.
WRITE(*,*) ACHAR(97), IACHAR("a")
WRITE(*,*) CHAR(97), ICHAR("a")
[edit] Go
In Go, a character literal is simply an integer constant of the character code:
fmt.Println('a') // prints "97"
fmt.Println('π') // prints "960"
To obtain the character code of the first Unicode character in a UTF-8 string:
r, _ := utf8.DecodeRuneInString("π") // r avoids shadowing the predeclared type rune
fmt.Println(r) // prints "960"
To print the character represented by a character code, we can create a string of that character (by converting the character code to string, it encodes the character in UTF-8):
fmt.Println(string(rune(97))) // prints "a"
fmt.Println(string(rune(960))) // prints "π"
[edit] Groovy
Groovy does not have a character literal at all, so one-character strings have to be coerced to char. Groovy printf (like Java, but unlike C) is not type-agnostic, so the cast or coercion from char to int is also required. The reverse direction is considerably simpler.
printf ("%d\n", ('a' as char) as int)
printf ("%c\n", 97)
Output:
97
a
[edit] Haskell
import Data.Char
main = do
print (ord 'a') -- prints "97"
print (chr 97) -- prints "'a'"
print (ord 'π') -- prints "960"
print (chr 960) -- prints "'\960'"
[edit] HicEst
WRITE(Messagebox) ICHAR('a'), CHAR(97)
[edit] Icon and Unicon
[edit] Icon
procedure main(arglist)
if *arglist > 0 then L := arglist else L := [97, "a"]
every x := !L do
write(x, " ==> ", char(integer(x)) | ord(x) ) # char produces a character, ord produces a number
end
Icon and Unicon do not currently support double byte character sets. Sample output:
97 ==> a
a ==> 97
[edit] Unicon
This Icon solution works in Unicon.
[edit] J
4 u: 97 98 99 9786
abc☺
3 u: 7 u: 'abc☺'
97 98 99 9786
[edit] Java
char is already an integer type in Java, and it gets automatically promoted to int. So you can use a character where you would otherwise use an integer. Conversely, you can use an integer where you would normally use a character, except you may need to cast it, as char is smaller.
In this case, the println method is overloaded to handle integer (outputs the decimal representation) and character (outputs just the character) types differently, so we need to cast it in both cases.
public class Foo {
public static void main(String[] args) {
System.out.println((int)'a'); // prints "97"
System.out.println((char)97); // prints "a"
}
}
Java characters support Unicode:
public class Bar {
public static void main(String[] args) {
System.out.println((int)'π'); // prints "960"
System.out.println((char)960); // prints "π"
}
}
[edit] JavaScript
Here a character is just a string of length 1.
document.write('a'.charCodeAt(0)); // prints "97"
document.write(String.fromCharCode(97)); // prints "a"
[edit] Joy
'a ord.
97 chr.
[edit] Logo
Logo characters are words of length 1.
print ascii "a ; 97
print char 97 ; a
[edit] Lua
print(string.byte(io.read())) -- prints the code of the first character of the input line
print(string.char(97)) -- prints "a"
[edit] Mathematica
Use the FromCharacterCode and ToCharacterCode functions:
ToCharacterCode["abcd"]
FromCharacterCode[{97}]
Result:
{97, 98, 99, 100}
"a"
[edit] MATLAB
There are two built-in functions that perform these tasks. To convert from a number to a character use:
character = char(asciiNumber)
To convert from a character to its corresponding ASCII code use:
asciiNumber = double(character)
or if you need this number as an integer not a double use:
asciiNumber = uint16(character)
asciiNumber = uint32(character)
asciiNumber = uint64(character)
Sample Usage:
>> char(87)
ans =
W
>> double('W')
ans =
87
>> uint16('W')
ans =
87
[edit] Metafont
Metafont handles only ASCII (even though codes beyond 127 can be given and used as real ASCII codes)
message "enter a letter: ";
string a;
a := readstring;
message decimal (ASCII a); % writes the decimal number of the first character
% of the string a
message "enter a number: ";
num := scantokens readstring;
message char num; % num can be anything between 0 and 255; what will be seen
% on output depends on the encoding used by the "terminal"; e.g.
% any code beyond 127 when UTF-8 encoding is in use will give
% a bad encoding; e.g. to see correctly an "è", we should write
message char10; % (this adds a newline...)
message char hex"c3" & char hex"a8"; % since C3 A8 is the UTF-8 encoding for "è"
end
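The claim in the comments above, that C3 A8 is the UTF-8 encoding of "è", can be checked independently. A minimal Python sketch follows, not part of the Metafont solution; the variable names are chosen for illustration only.

```python
# "è" is U+00E8; written as an escape so the source file encoding is irrelevant.
e_grave = "\u00e8"

# Encode to UTF-8 and list the individual byte values in hexadecimal.
encoded = e_grave.encode("utf-8")
print([hex(b) for b in encoded])  # ['0xc3', '0xa8']

# Going the other way: the two bytes decode back to the single character.
decoded = bytes([0xC3, 0xA8]).decode("utf-8")
print(decoded == e_grave)  # True
```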
[edit] Modula-3
The built in functions ORD and VAL work on characters, among other things.
ORD('a') (* Returns 97 *)
VAL(97, CHAR); (* Returns 'a' *)
[edit] MUMPS
WRITE $ASCII("M")
WRITE $CHAR(77)
[edit] Objeck
'a'->As(Int)->PrintLine();
97->As(Char)->PrintLine();
[edit] OCaml
Printf.printf "%d\n" (int_of_char 'a'); (* prints "97" *)
Printf.printf "%c\n" (char_of_int 97); (* prints "a" *)
[edit] Oz
Characters in Oz are the same as integers in the range 0-255 (ISO 8859-1 encoding). To print a number as a character, we need to use it as a string (i.e. a list of integers from 0 to 255):
{System.show &a} %% prints "97"
{System.showInfo [97]} %% prints "a"
[edit] Pascal
writeln(ord('a'));
writeln(chr(97));
[edit] Perl
Here a character is just a string of length 1.
print ord('a'), "\n"; # prints "97"
print chr(97), "\n"; # prints "a"
[edit] Perl 6
As Perl 5.
[edit] PHP
Here a character is just a string of length 1.
echo ord('a'), "\n"; // prints "97"
echo chr(97), "\n"; // prints "a"
[edit] PicoLisp
: (char "a")
-> 97
: (char "字")
-> 23383
: (char 23383)
-> "字"
: (chop "文字")
-> ("文" "字")
: (mapcar char @)
-> (25991 23383)
[edit] PL/I
declare 1 u union,
2 c character (1),
2 i fixed binary (8) unsigned;
c = 'a'; put skip list (i); /* prints 97 */
i = 97; put skip list (c); /* prints 'a' */
[edit] PowerShell
PowerShell does not allow for character literals directly, so to get a character one first needs to convert a single-character string to a char:
$char = [char] 'a'
Then a simple cast to int yields the character code:
$charcode = [int] $char # => 97
This also works with Unicode:
[int] [char] '☺' # => 9786
For converting an integral character code into the actual character, a cast to char suffices:
[char] 97 # a
[char] 9786 # ☺
[edit] PureBasic
PureBasic allows compiling code so that it will use either Ascii or a Unicode (UCS-2) encoding for representing its string content. It also allows for the source code that is being compiled to be in either Ascii or UTF-8 encoding. A one-character string is used here to hold the character and a numerical character type is used to hold the character code. The character type is either one or two bytes in size, depending on whether compiling for Ascii or Unicode respectively.
If OpenConsole()
;Results are the same when compiled for Ascii or Unicode
charCode.c = 97
Char.s = "a"
PrintN(Chr(charCode)) ;prints a
PrintN(Str(Asc(Char))) ;prints 97
Print(#CRLF$ + #CRLF$ + "Press ENTER to exit")
Input()
CloseConsole()
EndIf
This version should be compiled with the Unicode setting, and the source code should be encoded as UTF-8.
If OpenConsole()
;UTF-8 encoding compiled for Unicode (UCS-2)
charCode.c = 960
Char.s = "π"
PrintN(Chr(charCode)) ;prints π
PrintN(Str(Asc(Char))) ;prints 960
Print(#CRLF$ + #CRLF$ + "Press ENTER to exit")
Input()
CloseConsole()
EndIf
[edit] Python
[edit] 2.x
Here a character is just a string of length 1.
8-bit characters:
print ord('a') # prints "97"
print chr(97) # prints "a"
Unicode characters:
print ord(u'π') # prints "960"
print unichr(960) # prints "π"
[edit] 3.x
Here a character is just a string of length 1.
print(ord('a')) # prints "97"
print(ord('π')) # prints "960"
print(chr(97)) # prints "a"
print(chr(960)) # prints "π"
[edit] R
ascii <- as.integer(charToRaw("hello world")); ascii
text <- rawToChar(as.raw(ascii)); text
[edit] Ruby
[edit] 1.8
In Ruby 1.8 characters are usually represented directly as their integer character code. Ruby has a syntax for "character literal" which evaluates directly to the integer code: ?a evaluates to the integer 97. Subscripting a string also gives just the integer code for the character.
> ?a
=> 97
> "a"[0]
=> 97
> 97.chr
=> "a"
[edit] 1.9
In Ruby 1.9 characters are represented as length-1 strings; same as in Python. The previous "character literal" syntax ?a is now the same as "a". Subscripting a string also gives a length-1 string. There is now an "ord" method of strings to convert a character into its integer code.
> "a".ord
=> 97
> 97.chr
=> "a"
[edit] Sather
class MAIN is
main is
#OUT + 'a'.int + "\n"; -- or
#OUT + 'a'.ascii_int + "\n";
#OUT + CHAR::from_ascii_int(97) + "\n";
end;
end;
[edit] Scala
Scala supports Unicode characters, but each Char is a 16-bit UTF-16 code unit, so supplementary characters (those outside the Basic Multilingual Plane) do not map one-to-one onto Char values.
Without worrying about supplementary characters:
scala> 'a' toInt
res9: Int = 97
scala> 97 toChar
res10: Char = a
To handle supplementary characters, we need to test the "next" character as well:
def charToInt(c: Char, next: Char): Option[Int] = (c, next) match {
case _ if (c.isHighSurrogate && next.isLowSurrogate) => Some(java.lang.Character.toCodePoint(c, next))
case _ if (c.isLowSurrogate) => None
case _ => Some(c.toInt)
}
def intToChars(n: Int): Array[Char] = java.lang.Character.toChars(n)
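For readers unfamiliar with surrogate pairs, the arithmetic behind combining a high and a low surrogate can be sketched as follows. This is an illustrative Python version of the standard UTF-16 formula, not a description of how the Java library implements it; the function names are hypothetical.

```python
def to_surrogates(cp):
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000                      # 20-bit value
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

def from_surrogates(high, low):
    """Combine a (high, low) surrogate pair back into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

high, low = to_surrogates(0x1D11E)             # U+1D11E MUSICAL SYMBOL G CLEF
print(hex(high), hex(low))                     # 0xd834 0xdd1e
print(hex(from_surrogates(high, low)))         # 0x1d11e
```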
[edit] Scheme
(display (char->integer #\a)) (newline) ; prints "97"
(display (integer->char 97)) (newline) ; prints "a"
[edit] Seed7
writeln(ord('a'));
writeln(chr(97));
[edit] Slate
$a code.
97 as: String Character.
[edit] Smalltalk
($a asInteger) displayNl. "output 97"
(Character value: 97) displayNl. "output a"
[edit] SNOBOL4
Snobol implementations may or may not have built-in char( ) and ord ( ) or asc( ). These are based on examples in the Snobol4+ tutorial and work with the native (1-byte) charset.
define('chr(n)') :(chr_end)
chr &alphabet tab(n) len(1) . chr :s(return)f(freturn)
chr_end
define('asc(str)c') :(asc_end)
asc str len(1) . c
&alphabet break(c) @asc :s(return)f(freturn)
asc_end
* # Test and display
output = char(65) ;* Built-in
output = chr(65)
output = asc('A')
end
Output:
A
A
65
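The user-defined chr and asc above both work by indexing into &alphabet. The same idea can be sketched in Python, assuming (as the SNOBOL4 code does) a 256-character alphabet ordered by code starting at 0. The names alphabet, chr_from_alphabet and asc_from_alphabet are hypothetical.

```python
# Stand-in for &alphabet: every 1-byte character, in code order.
alphabet = "".join(chr(i) for i in range(256))

def chr_from_alphabet(n):
    """Skip n characters, then take one (like tab(n) len(1))."""
    return alphabet[n]

def asc_from_alphabet(s):
    """Find where the first character of s sits in the alphabet (like break(c) @asc)."""
    return alphabet.index(s[0])

print(chr_from_alphabet(65))   # A
print(asc_from_alphabet("A"))  # 65
```

Out-of-range input raises an exception here, where the SNOBOL4 functions would fail via freturn.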
[edit] Standard ML
print (Int.toString (ord #"a") ^ "\n"); (* prints "97" *)
print (Char.toString (chr 97) ^ "\n"); (* prints "a" *)
[edit] Tcl
# ASCII
puts [scan "a" %c] ;# ==> 97
puts [format %c 97] ;# ==> a
# Unicode is the same
puts [scan "π" %c] ;# ==> 960
puts [format %c 960] ;# ==> π
[edit] TI-89 BASIC
The TI-89 uses an 8-bit charset/encoding which is similar to ISO-8859-1, but with more mathematical symbols and Greek letters. At least codes 14-31, 128-160, 180 differ. The ASCII region is unmodified. (TODO: Give a complete list.)
The TI Connect X desktop software converts between this unique character set and Unicode characters, though sometimes in a consistent but inappropriate fashion.
The below program will display the character and code for any key pressed. Some keys do not correspond to characters and have codes greater than 255. The portion of the program actually implementing the task is marked with a line of “©”s.
Prgm
Local k, s
ClrIO
Loop
Disp "Press a key, or ON to exit."
getKey() © clear buffer
0 → k : While k = 0 : getKey() → k : EndWhile
ClrIO
If k ≥ 256 Then
Disp "Not a character."
Disp "Code: " & string(k)
Else
char(k) → s ©
© char() and ord() are inverses. ©
Disp "Character: " & s ©
Disp "Code: " & string(ord(s)) ©
EndIf
EndLoop
EndPrgm
[edit] Trith
Characters are Unicode code points, so the solution is the same for Unicode characters as it is for ASCII characters:
"a" ord print
97 chr print
"π" ord print
960 chr print
[edit] Visual Basic .NET
Console.WriteLine(Chr(97)) 'Prints a
Console.WriteLine(Asc("a")) 'Prints 97
[edit] Ursala
Character code functions are not built in but easily defined as reifications of the character table.
#import std
#import nat
chr = -: num characters
asc = -:@rlXS num characters
#cast %cnX
test = (chr97,asc`a)
output:
(`a,97)

