Character codes

From Rosetta Code

Given a character value in your language, print its code (could be ASCII code, Unicode code, or whatever your language uses). For example, the character 'a' (lowercase letter A) has a code of 97 in ASCII (as well as Unicode, as ASCII forms the beginning of Unicode). Conversely, given a code, print out the corresponding character.

[edit] ABAP

In ABAP you must first cast the character to a byte field and back to a number in order to get its ASCII value.

report zcharcode
data: c value 'A', n type i.
field-symbols <n> type x.
 
assign c to <n> casting.
move <n> to n.
write: c, '=', n left-justified.
Output:
A = 65

[edit] ACL2

Similar to Common Lisp:

(cw "~x0" (char-code #\a))
(cw "~x0" (code-char 97))

[edit] ActionScript

In ActionScript, you cannot take the character code of a character directly. Instead you must create a string and call charCodeAt with the character's position in the string as a parameter.

trace(String.fromCharCode(97)); //prints 'a' 
trace("a".charCodeAt(0));//prints '97'

[edit] Ada

with Ada.Text_IO;  use Ada.Text_IO;
 
procedure Char_Code is
begin
Put_Line (Character'Val (97) & " =" & Integer'Image (Character'Pos ('a')));
end Char_Code;
The predefined attributes S'Pos and S'Val are available for every discrete subtype S, and Character is such a type. S'Pos yields the position number of a value, and S'Val yields the value at a given position. Sample output:
a = 97

[edit] Aime

# prints "97"
o_integer('a');
o_byte('\n');
# prints "a"
o_byte(97);
o_byte('\n');

[edit] ALGOL 68

In ALGOL 68 the format $g$ is type aware, hence the type conversion operators abs & repr are used to set the type.

main:(
printf(($gl$, ABS "a")); # for ASCII this prints "+97" EBCDIC prints "+129" #
printf(($gl$, REPR 97)) # for ASCII this prints "a"; EBCDIC prints "/" #
)
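
The ASCII and EBCDIC values quoted in the comments above can be cross-checked outside ALGOL 68. The following is a minimal Python sketch, not part of the original entry. It assumes Python's cp037 codec as a stand-in for EBCDIC; real systems use several EBCDIC variants, so other code pages may differ.

```python
# Assumption: cp037 (EBCDIC US/Canada) stands in for "EBCDIC" here.
ascii_code = "a".encode("ascii")[0]         # code of 'a' in ASCII
ebcdic_code = "a".encode("cp037")[0]        # code of 'a' in EBCDIC (cp037)
ebcdic_char = bytes([97]).decode("cp037")   # character whose EBCDIC code is 97

print(ascii_code, ebcdic_code, ebcdic_char)  # prints "97 129 /"
```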

Character conversions may be available in the standard prelude so that when a foreign tape is mounted, the characters will be converted transparently as the tape's records are read.

FILE tape;
INT errno = open(tape, "/dev/tape1", stand out channel);
make conv(tape, ebcdic conv);
FOR record DO getf(tape, ( ~ )) OD; ~ # etc ... #

Every channel has an associated standard character conversion that can be determined using the stand conv query routine and then applied to a particular file or tape. For example:

 make conv(tape, stand conv(stand out channel))

[edit] AutoHotkey

MsgBox % Chr(97)
MsgBox % Asc("a")

[edit] AWK

AWK has no built-in way to convert a character into its ASCII (or other) code, but a function that does so is easily built using an associative array whose keys are the characters. The opposite conversion can be done using printf (or sprintf) with %c.

function ord(c)
{
return chmap[c]
}
BEGIN {
for(i=0; i < 256; i++) {
chmap[sprintf("%c", i)] = i
}
print ord("a"), ord("b")
printf "%c %c\n", 97, 98
s = sprintf("%c%c", 97, 98)
print s
}
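
The lookup-table technique above does not depend on AWK. As a rough illustration only, here is the same idea in Python; the names chmap and table_ord are chosen for this sketch, and Python's built-in ord makes such a table unnecessary in practice.

```python
# Build a table from each one-character string to its code (0..255),
# mirroring the chmap array in the AWK example.
chmap = {chr(i): i for i in range(256)}

def table_ord(c):
    """Look up the code of a single character in the prebuilt table."""
    return chmap[c]

print(table_ord("a"), table_ord("b"))  # prints "97 98"
print("%c %c" % (97, 98))              # prints "a b"
```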

[edit] Babel

'abcdefg' str2ar
{%d nl <<} eachar

Output: 97 98 99 100 101 102 103

(98 97 98 101 108) ls2lf ar2str nl <<
 

Output: babel

[edit] BASIC

Works with: QuickBasic version 4.5
charCode = 97
char$ = "a"
PRINT CHR$(charCode) 'prints a
PRINT ASC(char$) 'prints 97

On the ZX Spectrum string variable names must be a single letter but numeric variables can be multiple characters:

Works with: ZX Spectrum Basic
10 LET c = 97: REM c is a character code
20 LET d$ = "b": REM d$ holds the character
30 PRINT CHR$(c): REM this prints a
40 PRINT CODE(d$): REM this prints 98

[edit] Applesoft BASIC

CHR$(97) is used in place of "a" because on the older model Apple II, lower case is difficult to input.

?CHR$(97)"="ASC(CHR$(97))
Output:
a=97

Output as it appears on the text display on the Apple II and Apple II plus, with the original text character ROM:

!=97

[edit] BBC BASIC

charCode = 97
char$ = "a"
PRINT CHR$(charCode) : REM prints a
PRINT ASC(char$) : REM prints 97

[edit] Befunge

The instruction . pops a value and outputs it as an integer; the instruction , pops a value and outputs it as an ASCII character.
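
The Befunge program for this task builds the code 97 on the stack as 9*9 + 4*4, using the instruction sequence 99*44*+. The arithmetic can be traced with a tiny stack evaluator. This is a hypothetical Python sketch that handles only single-digit pushes and the two operators involved; it is not a Befunge interpreter.

```python
def run(tokens):
    """Evaluate single-digit pushes and the '*' / '+' operators on a stack."""
    stack = []
    for t in tokens:
        if t.isdigit():
            stack.append(int(t))
        elif t == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif t == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack

code = run("99*44*+")[-1]   # 9*9 + 4*4
print(code, chr(code))      # prints "97 a"
```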

"a". 99*44*+, @

[edit] Bracmat

( put
$ ( str
$ ( "\nLatin a
ISO-8859-1: "
asc$a
" = "
chr$97
"
UTF-8: "
utf$a
" = "
chu$97
\n
"Cyrillic а (UTF-8): "
utf$а
" = "
chu$1072
\n
)
)
)
Output:
Latin a
       ISO-8859-1: 97 = a
            UTF-8: 97 = a
Cyrillic а (UTF-8): 1072 = а

[edit] C

char is already an integer type in C, and it gets automatically promoted to int. So you can use a character where you would otherwise use an integer. Conversely, you can use an integer where you would normally use a character, except you may need to cast it, as char is smaller.

#include <stdio.h>
 
int main() {
printf("%d\n", 'a'); /* prints "97" */
printf("%c\n", 97); /* prints "a"; no cast is needed because %c takes an int argument */
return 0;
}

[edit] C++

char is already an integer type in C++, and it gets automatically promoted to int. So you can use a character where you would otherwise use an integer. Conversely, you can use an integer where you would normally use a character, except you may need to cast it, as char is smaller.

In this case, the output operator << is overloaded to handle integer (outputs the decimal representation) and character (outputs just the character) types differently, so we need to cast it in both cases.

#include <iostream>
 
int main() {
std::cout << (int)'a' << std::endl; // prints "97"
std::cout << (char)97 << std::endl; // prints "a"
return 0;
}

[edit] C#

C# represents strings and characters internally as UTF-16, so casting a char to an int returns its UTF-16 code unit, which equals the Unicode code point for every character in the Basic Multilingual Plane.

using System;
 
namespace RosettaCode.CharacterCode
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine((int) 'a'); //Prints "97"
Console.WriteLine((char) 97); //Prints "a"
}
}
}

[edit] Clojure

(print (int \a)) ; prints "97"
(print (char 97)) ; prints "a"
 
; Unicode is also available, as Clojure uses the underlying Java Strings & chars
(print (int \π)) ; prints 960
(print (char 960)) ; prints π
 
; use String because char in Java can't represent characters outside Basic Multilingual Plane
(print (.codePointAt "𝅘𝅥𝅮" 0)) ; prints 119136
(print (String. (int-array 1 119136) 0 1)) ; prints 𝅘𝅥𝅮

[edit] CoffeeScript

CoffeeScript transcompiles to JavaScript, so it uses the JS standard library.

console.log 'a'.charCodeAt 0 # 97
console.log String.fromCharCode 97 # a

[edit] Common Lisp

(princ (char-code #\a)) ; prints "97"
(princ (code-char 97)) ; prints "a"

[edit] Component Pascal

Works with: BlackBox Component Builder

PROCEDURE CharCodes*;
VAR
c : CHAR;
BEGIN
c := 'A';
StdLog.Char(c);StdLog.String(":> ");StdLog.Int(ORD(c));StdLog.Ln;
c := CHR(3A9H);
StdLog.Char(c);StdLog.String(":> ");StdLog.Int(ORD(c));StdLog.Ln
END CharCodes;
Output:
A:>  65
Ω:>  937

[edit] D

void main() {
import std.stdio, std.utf;
 
string test = "a";
size_t index = 0;
 
// Get four-byte utf32 value for index 0.
writefln("%d", test.decode(index));
 
// 'index' has moved to next character input position.
assert(index == 1);
}
Output:
97

[edit] Delphi

Example from Studio 2006.

program Project1;
 
{$APPTYPE CONSOLE}
 
uses
SysUtils;
var
aChar:Char;
aCode:Byte;
uChar:WideChar;
uCode:Word;
begin
aChar := Chr(97); Writeln(aChar);
aCode := Ord(aChar); Writeln(aCode);
uChar := WideChar(97); Writeln(uChar);
uCode := Ord(uChar); Writeln(uCode);
 
Readln;
end.

[edit] DWScript

PrintLn(Ord('a'));
PrintLn(Chr(97));

[edit] E

? 'a'.asInteger()
# value: 97
 
? <import:java.lang.makeCharacter>.asChar(97)
# value: 'a'

[edit] Elena

#define system.
 
#symbol Program =>
[
console write:("a" getAt:0 Number).
console write:(CharValue new:97).
].

[edit] Erlang

In Erlang, lists and strings are the same, only the representation changes. Thus:

1> F = fun([X]) -> X end. 
#Fun<erl_eval.6.13229925>
2> F("a").
97

If entered manually, one can also get ASCII codes by prefixing characters with $:

3> $a.
97

Unicode is fully supported only since release R13A.

[edit] Euphoria

printf(1,"%d\n", 'a') -- prints "97"
printf(1,"%s\n", 97) -- prints "a"

[edit] F#

let c = 'A'
let n = 65
printfn "%d" (int c)
printfn "%c" (char n)
Output:
65
A

[edit] Factor

CHAR: katakana-letter-a .
"ア" first .
 
12450 1string print

[edit] FALSE

'A."
"65,

[edit] Fantom

A character is represented in single quotes: the 'toInt' method returns the code for the character. The 'toChar' method converts an integer into its respective character.

fansh> 97.toChar
a
fansh> 'a'.toInt
97

[edit] Forth

As with C, characters are just integers on the stack which are treated as ASCII.

char a
dup . \ 97
emit \ a

[edit] Fortran

Functions ACHAR and IACHAR specifically work with the ASCII character set, while the results of CHAR and ICHAR will depend on the default character set being used.

WRITE(*,*) ACHAR(97), IACHAR("a")   
WRITE(*,*) CHAR(97), ICHAR("a")

[edit] Frink

The function char[x] in Frink returns the numerical Unicode codepoints for a string or character, or returns the Unicode string for an integer value or array of integer values.

println[char["a"]] // prints 97
println[char[97]] // prints a
println[char["Frink rules!"]] // prints [70, 114, 105, 110, 107, 32, 114, 117, 108, 101, 115, 33]
println[char[[70, 114, 105, 110, 107, 32, 114, 117, 108, 101, 115, 33]]] // prints "Frink rules!"
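
For comparison, the same string-to-code-points mapping and its inverse can be sketched in Python. This is an illustration only: Python has no single function that works in both directions like Frink's char, so ord and chr are applied element by element.

```python
text = "Frink rules!"
codes = [ord(ch) for ch in text]             # string -> list of code points
restored = "".join(chr(n) for n in codes)    # list of code points -> string

print(codes)     # prints [70, 114, 105, 110, 107, 32, 114, 117, 108, 101, 115, 33]
print(restored)  # prints Frink rules!
```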

[edit] GAP

# Code must be in 0 .. 255.
CharInt(65);
# 'A'
IntChar('Z');
# 90

[edit] Go

In Go, a character literal is simply an integer constant of the character code:

fmt.Println('a') // prints "97"
fmt.Println('π') // prints "960"

Go constants are not fully typed however. A character stored in a variable has a data type, and the types most commonly used for character data are byte, rune, and string. This example program shows character codes (as literals) stored in typed variables, and printed out with default formatting. Note that since byte and rune are integer types, the default formatting is a printable base 10 number. String is not numeric, and a little extra work must be done to print the character codes.

package main
 
import "fmt"
 
func main() {
// yes, there is more concise syntax, but this makes
// the data types very clear.
var b byte = 'a'
var r rune = 'π'
var s string = "aπ"
 
fmt.Println(b, r, s)
for _, c := range s { // this gives c the type rune
fmt.Println(c)
}
}
Output:

97 960 aπ
97
960

For the second part of the task, printing the character of a given code, the %c verb of fmt.Printf will do this directly from integer values, emitting the UTF-8 encoding of the code (which will typically print the character, depending on your hardware and operating system configuration).

b := byte(97)
r := rune(960)
fmt.Printf("%c %c\n%c %c\n", 97, 960, b, r)
Output:

a π
a π

You can think of the default formatting of strings as being the printable characters of the string. In fact, however, it is even simpler. Since we expect our output device to interpret UTF-8, and we expect our string to contain UTF-8, the default formatting simply dumps the bytes of the string to the output.

Examples showing strings constructed from integer constants and then printed:

fmt.Println(string(rune(97))) // prints "a"
fmt.Println(string(rune(960))) // prints "π"
fmt.Println(string([]rune{97, 960})) // prints "aπ"
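
The distinction drawn above between character codes and the UTF-8 bytes that are actually written to the output can be made visible in a few lines. The following is a Python sketch rather than Go, offered only as an illustration of the two views of the same string.

```python
s = "aπ"
code_points = [ord(ch) for ch in s]    # one entry per character
utf8_bytes = list(s.encode("utf-8"))   # one entry per encoded byte

print(code_points)  # prints [97, 960]
print(utf8_bytes)   # prints [97, 207, 128]
```

The code 960 does not fit in one byte, so UTF-8 spreads it over the two bytes 207 and 128, while 97 is stored unchanged.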

[edit] Golfscript

To convert a number to a string, we use the array to string coercion.

97[]+''+p

To convert a string to a number, we have many options, of which the simplest and shortest are:

'a')\;p
'a'(\;p
'a'0=p
'a'{}/p

[edit] Groovy

Groovy does not have a character literal at all, so one-character strings have to be coerced to char. Groovy printf (like Java, but unlike C) is not type-agnostic, so the cast or coercion from char to int is also required. The reverse direction is considerably simpler.

printf ("%d\n", ('a' as char) as int)
printf ("%c\n", 97)
Output:
97
a

[edit] Haskell

import Data.Char
 
main = do
  print (ord 'a') -- prints "97"
  print (chr 97) -- prints "'a'"
  print (ord 'π') -- prints "960"
  print (chr 960) -- prints "'\960'"

[edit] HicEst

WRITE(Messagebox) ICHAR('a'), CHAR(97)

[edit] Icon and Unicon

procedure main(arglist)
if *arglist > 0 then L := arglist else L := [97, "a"]
 
every x := !L do
write(x, " ==> ", char(integer(x)) | ord(x) ) # char produces a character, ord produces a number
end

Icon and Unicon do not currently support double byte character sets.

Output:
97 ==> a
a ==> 97

[edit] J

   4 u: 97 98 99 9786
abc☺
 
   3 u: 7 u: 'abc☺'
97 98 99 9786

[edit] Java

char is already an integer type in Java, and it gets automatically promoted to int. So you can use a character where you would otherwise use an integer. Conversely, you can use an integer where you would normally use a character, except you may need to cast it, as char is smaller.

In this case, the println method is overloaded to handle integer (outputs the decimal representation) and character (outputs just the character) types differently, so we need to cast it in both cases.

public class Foo {
public static void main(String[] args) {
System.out.println((int)'a'); // prints "97"
System.out.println((char)97); // prints "a"
}
}

Java characters support Unicode:

public class Bar {
public static void main(String[] args) {
System.out.println((int)'π'); // prints "960"
System.out.println((char)960); // prints "π"
}
}

[edit] JavaScript

Here a character is just a string of length 1.

document.write('a'.charCodeAt(0)); // prints "97"
document.write(String.fromCharCode(97)); // prints "a"

[edit] Joy

'a ord.
97 chr.

[edit] Julia

Julia character constants (of type Char) represent Unicode code points and can easily be converted to and from integer types.

println(Int('a'))
println(Char(97))
Output:
97
a

[edit] K

  _ic "abcABC"
97 98 99 65 66 67
 
  _ci 97 98 99 65 66 67
"abcABC"

[edit] LabVIEW

This image is a VI Snippet, an executable image of LabVIEW code. The LabVIEW version is shown on the top-right hand corner. You can download it, then drag-and-drop it onto the LabVIEW block diagram from a file browser, and it will appear as runnable, editable code.
LabVIEW Character codes.png

[edit] Lang5

: CHAR  "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[" comb
'\\ comb -1 remove append "]^_`abcdefghijklmnopqrstuvwxyz{|}~" comb append ;
: CODE 95 iota 33 + ;  : comb "" split ;
: extract' rot 1 compress index subscript expand drop ;
: chr CHAR CODE extract' ;
: ord CODE CHAR extract' ;
 
'a ord . # 97
97 chr . # a

[edit] Lasso

'a'->integer
'A'->integer
97->bytes
65->bytes
Output:
97
65
a
A

[edit] LFE

In LFE/Erlang, lists and strings are the same, only the representation changes. For example:

> (list 68 111 110 39 116 32 80 97 110 105 99 46)
"Don't Panic."

As for this exercise, here's how you could print out the ASCII code for a letter, and a letter from the ASCII code:

> (: io format '"~w~n" '"a")
97
ok
> (: io format '"~p~n" (list '(97)))
"a"
ok

[edit] Liberty BASIC

charCode = 97
char$ = "a"
print chr$(charCode) 'prints a
print asc(char$) 'prints 97

[edit] Logo

Logo characters are words of length 1.

print ascii "a    ; 97
print char 97  ; a

[edit] Logtalk

|?- char_code(Char, 97), write(Char).
a
Char = a
yes
|?- char_code(a, Code), write(Code).
97
Code = 97
yes

[edit] Lua

print(string.byte("a")) -- prints "97"
print(string.char(97)) -- prints "a"

[edit] Maple

There are two ways to do this in Maple. First, there are procedures in StringTools for this purpose.

> use StringTools in Ord( "A" ); Char( 65 ) end;
65
 
"A"
 

Second, the procedure convert handles conversions to and from byte values.

> convert( "A", bytes );
[65]
 
> convert( [65], bytes );
"A"
 

[edit] Mathematica

Use the FromCharacterCode and ToCharacterCode functions:

ToCharacterCode["abcd"]
FromCharacterCode[{97}]
Output:
{97, 98, 99, 100}
"a"

[edit] MATLAB / Octave

There are two built-in functions that perform these tasks. To convert from a number to a character use:

character = char(asciiNumber)

To convert from a character to its corresponding ASCII code use:

asciiNumber = double(character)

Or, if you need this number as an integer rather than a double, use:

asciiNumber = uint16(character)
asciiNumber = uint32(character)
asciiNumber = uint64(character)

Sample Usage:

>> char(87)
 
ans =
 
W
 
>> double('W')
 
ans =
 
87
 
>> uint16('W')
 
ans =
 
87

[edit] Maxima

ascii(65);
"A"
 
cint("A");
65

[edit] Metafont

Metafont handles only ASCII (even though codes beyond 127 can be given and used as real ASCII codes)

message "enter a letter: ";
string a;
a := readstring;
message decimal (ASCII a); % writes the decimal number of the first character
 % of the string a
message "enter a number: ";
num := scantokens readstring;
message char num;  % num can be anything between 0 and 255; what will be seen
 % on output depends on the encoding used by the "terminal"; e.g.
 % any code beyond 127 when UTF-8 encoding is in use will give
 % a bad encoding; e.g. to see correctly an "è", we should write
message char10;  % (this adds a newline...)
message char hex"c3" & char hex"a8";  % since C3 A8 is the UTF-8 encoding for "è"
end

[edit] Modula-2

MODULE asc;
 
IMPORT InOut;
 
VAR letter : CHAR;
ascii : CARDINAL;
 
BEGIN
letter := 'a';
InOut.Write (letter);
ascii := ORD (letter);
InOut.Write (11C); (* ASCII TAB *)
InOut.WriteCard (ascii, 8);
ascii := ascii - ORD ('0');
InOut.Write (11C); (* ASCII TAB *)
InOut.Write (CHR (ascii));
InOut.WriteLn
END asc.

Producing the output:

jan@Beryllium:~/modula/rosetta$ ./asc
a 97 1

[edit] Modula-3

The built-in functions ORD and VAL work on characters, among other things.

ORD('a') (* Returns 97 *)
VAL(97, CHAR); (* Returns 'a' *)

[edit] MUMPS

WRITE $ASCII("M")
WRITE $CHAR(77)

[edit] NetRexx

NetRexx provides built-in functions to convert between character and decimal/hexadecimal.

/* NetRexx */
options replace format comments java crossref symbols nobinary
 
runSample(arg)
return
 
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
method runSample(arg) private static
-- create some sample data: character, hex and unicode
samp = ' ' || 'a'.sequence('e') || '$' || '\xa2'.sequence('\xa5') || '\u20a0'.sequence('\u20b5')
-- use the C2D C2X D2C and X2C built-in functions
say "'"samp"'"
say ' | Chr C2D C2X D2C X2C'
say '---+ --- ------ ---- --- ---'
loop ci = 1 to samp.length
cc = samp.substr(ci, 1)
cd = cc.c2d -- char to decimal
cx = cc.c2x -- char to hexadecimal
dc = cd.d2c -- decimal to char
xc = cx.x2c -- hexadecimal to char
say ci.right(3)"| '"cc"'" cd.right(6) cx.right(4, 0) "'"dc"' '"xc"'"
end ci
return
Output:
' abcde$¢£¤¥₠₡₢₣₤₥₦₧₨₩₪₫€₭₮₯₰₱₲₳₴₵'
   | Chr    C2D  C2X D2C X2C
---+ --- ------ ---- --- ---
  1| ' '     32 0020 ' ' ' '
  2| 'a'     97 0061 'a' 'a'
  3| 'b'     98 0062 'b' 'b'
  4| 'c'     99 0063 'c' 'c'
  5| 'd'    100 0064 'd' 'd'
  6| 'e'    101 0065 'e' 'e'
  7| '$'     36 0024 '$' '$'
  8| '¢'    162 00A2 '¢' '¢'
  9| '£'    163 00A3 '£' '£'
 10| '¤'    164 00A4 '¤' '¤'
 11| '¥'    165 00A5 '¥' '¥'
 12| '₠'   8352 20A0 '₠' '₠'
 13| '₡'   8353 20A1 '₡' '₡'
 14| '₢'   8354 20A2 '₢' '₢'
 15| '₣'   8355 20A3 '₣' '₣'
 16| '₤'   8356 20A4 '₤' '₤'
 17| '₥'   8357 20A5 '₥' '₥'
 18| '₦'   8358 20A6 '₦' '₦'
 19| '₧'   8359 20A7 '₧' '₧'
 20| '₨'   8360 20A8 '₨' '₨'
 21| '₩'   8361 20A9 '₩' '₩'
 22| '₪'   8362 20AA '₪' '₪'
 23| '₫'   8363 20AB '₫' '₫'
 24| '€'   8364 20AC '€' '€'
 25| '₭'   8365 20AD '₭' '₭'
 26| '₮'   8366 20AE '₮' '₮'
 27| '₯'   8367 20AF '₯' '₯'
 28| '₰'   8368 20B0 '₰' '₰'
 29| '₱'   8369 20B1 '₱' '₱'
 30| '₲'   8370 20B2 '₲' '₲'
 31| '₳'   8371 20B3 '₳' '₳'
 32| '₴'   8372 20B4 '₴' '₴'
 33| '₵'   8373 20B5 '₵' '₵'

[edit] Oberon-2

MODULE Ascii;
IMPORT Out;
VAR
c: CHAR;
d: INTEGER;
BEGIN
c := CHR(97);
d := ORD("a");
Out.Int(d,3);Out.Ln;
Out.Char(c);Out.Ln
END Ascii.
Output:

97
a

[edit] Objeck

'a'->As(Int)->PrintLine();
97->As(Char)->PrintLine();

[edit] OCaml

Printf.printf "%d\n" (int_of_char 'a'); (* prints "97" *)
Printf.printf "%c\n" (char_of_int 97); (* prints "a" *)

The following are aliases for the above functions:

# Char.code ;;
- : char -> int = <fun>
# Char.chr;;
- : int -> char = <fun>

[edit] OpenEdge/Progress

MESSAGE
CHR(97) SKIP
ASC("a")
VIEW-AS ALERT-BOX.

[edit] Oz

Characters in Oz are the same as integers in the range 0-255 (ISO 8859-1 encoding). To print a number as a character, we need to use it as a string (i.e. a list of integers from 0 to 255):

{System.show &a}  %% prints "97"
{System.showInfo [97]} %% prints "a"

[edit] PARI/GP

print(Vecsmall("a")[1]);
print(Strchr([72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]))

[edit] Pascal

writeln(ord('a'));
writeln(chr(97));

[edit] Perl

Here a character is just a string of length 1.

print ord('a'), "\n"; # prints "97"
print chr(97), "\n"; # prints "a"

[edit] Perl 6

Both Perl 5 and Perl 6 have good Unicode support. Note that even characters outside the BMP are considered single characters, not a surrogate pair. Here we use the character "four dragons" (with 64 strokes!) to demonstrate that.

say ord('𪚥').fmt('0x%04x');
say chr(0x2a6a5);
Output:
0x2a6a5
𪚥

[edit] PHP

Here a character is just a string of length 1.

echo ord('a'), "\n"; // prints "97"
echo chr(97), "\n"; // prints "a"

[edit] PicoLisp

: (char "a")
-> 97
: (char "字")
-> 23383
: (char 23383)
-> "字"
: (chop "文字")
-> ("文" "字")
: (mapcar char @)
-> (25991 23383)

[edit] PL/I

declare 1 u union,
2 c character (1),
2 i fixed binary (8) unsigned;
c = 'a'; put skip list (i); /* prints 97 */
i = 97; put skip list (c); /* prints 'a' */

[edit] PowerShell

PowerShell does not allow for character literals directly, so to get a character one first needs to convert a single-character string to a char:

$char = [char] 'a'

Then a simple cast to int yields the character code:

$charcode = [int] $char   # => 97

This also works with Unicode:

[int] [char] '☺'          # => 9786

For converting an integral character code into the actual character, a cast to char suffices:

[char] 97    # a
[char] 9786 # ☺

[edit] Prolog

SWI-Prolog has predefined predicate char_code/2.

?- char_code(a, X).
X = 97.

?- char_code(X, 97).
X = a.

[edit] PureBasic

PureBasic allows compiling code so that it will use either Ascii or a Unicode (UCS-2) encoding for representing its string content. It also allows for the source code that is being compiled to be in either Ascii or UTF-8 encoding. A one-character string is used here to hold the character and a numerical character type is used to hold the character code. The character type is either one or two bytes in size, depending on whether compiling for Ascii or Unicode respectively.

If OpenConsole()
;Results are the same when compiled for Ascii or Unicode
charCode.c = 97
Char.s = "a"
PrintN(Chr(charCode)) ;prints a
PrintN(Str(Asc(Char))) ;prints 97
 
Print(#CRLF$ + #CRLF$ + "Press ENTER to exit")
Input()
CloseConsole()
EndIf

This version should be compiled with the Unicode setting, with the source code encoded as UTF-8.

If OpenConsole()
;UTF-8 encoding compiled for Unicode (UCS-2)
charCode.c = 960
Char.s = "π"
PrintN(Chr(charCode)) ;prints π
PrintN(Str(Asc(Char))) ;prints 960
 
Print(#CRLF$ + #CRLF$ + "Press ENTER to exit")
Input()
CloseConsole()
EndIf
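
The one-byte versus two-byte character sizes described above can be illustrated outside PureBasic. The sketch below is Python and rests on two assumptions: the latin-1 codec stands in for a one-byte "Ascii" build, and utf-16-le stands in for UCS-2 (the two agree for characters in the Basic Multilingual Plane).

```python
for ch in ("a", "π"):
    two_bytes = ch.encode("utf-16-le")      # 2 bytes per BMP character
    try:
        one_byte = ch.encode("latin-1")     # 1 byte, only for codes 0..255
    except UnicodeEncodeError:
        one_byte = None                     # code does not fit in one byte
    print(ch, ord(ch), one_byte, two_bytes)
```

The code 97 fits either representation, while 960 can only be stored in the two-byte form, which is why the second PureBasic program needs the Unicode setting.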

[edit] Python

Works with: Python version 2.x

Here a character is just a string of length 1.

8-bit characters:

print ord('a') # prints "97"
print chr(97) # prints "a"

Unicode characters:

print ord(u'π') # prints "960"
print unichr(960) # prints "π"
Works with: Python version 3.x

Here a character is just a string of length 1.

print(ord('a')) # prints "97"
print(ord('π')) # prints "960"
print(chr(97)) # prints "a"
print(chr(960)) # prints "π"

[edit] R

ascii <- as.integer(charToRaw("hello world")); ascii
text <- rawToChar(as.raw(ascii)); text

[edit] Racket

#lang racket
 
(define (code ch)
(printf "The unicode number for ~s is ~a\n" ch (char->integer ch)))
(code #\a)
(code #\λ)
 
(define (char n)
(printf "The unicode number ~a is the character ~s\n" n (integer->char n)))
(char 97)
(char 955)

[edit] Retro

'c putc

[edit] REXX

REXX supports handling of characters with built-in functions, whether it be hexadecimal, binary (bits), or decimal codes.

yyy='c'               /*assign a lowercase   c to  YYY.*/
yyy='34'x /*assign hexadecimal 34 to YYY.*/
/*the X can be upper/lowercase.*/
yyy=x2c(34) /* (same as above) */
yyy='00110100'b /* (same as above) */
yyy='0011 0100'b /* (same as above) */
/*the B can be upper/lowercase.*/
yyy=d2c(97) /*assign decimal code 97 to YYY.*/
 
say yyy /*displays the value of YYY. */
say c2x(yyy) /*displays the value of YYY in hexadecimal. */
say c2d(yyy) /*displays the value of YYY in decimal. */
say x2b(c2x(yyy)) /*displays the value of YYY in binary (bit string). */
/*Note: some REXXes support the c2b bif */
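
For readers more familiar with other languages, the hexadecimal, binary, and decimal views used above have close counterparts elsewhere. The following Python sketch is an illustration only and makes no claim about how REXX implements its built-in functions; the variable names are chosen to mirror the example.

```python
yyy = chr(0x34)          # character with hexadecimal code 34
same = chr(0b00110100)   # the same code written in binary
yyy = chr(97)            # character with decimal code 97

code = ord(yyy)
print(yyy)                    # prints a
print(format(code, "X"))      # prints 61 (hexadecimal)
print(code)                   # prints 97 (decimal)
print(format(code, "08b"))    # prints 01100001 (binary, padded to 8 bits)
```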

[edit] Ruby

[edit] 1.8

In Ruby 1.8 characters are usually represented directly as their integer character code. Ruby has a syntax for "character literal" which evaluates directly to the integer code: ?a evaluates to the integer 97. Subscripting a string also gives just the integer code for the character.

> ?a
=> 97
> "a"[0]
=> 97
> 97.chr
=> "a"

[edit] 1.9

In Ruby 1.9 characters are represented as length-1 strings; same as in Python. The previous "character literal" syntax ?a is now the same as "a". Subscripting a string also gives a length-1 string. There is now an "ord" method of strings to convert a character into its integer code.

> "a".ord
=> 97
> 97.chr
=> "a"

[edit] Run BASIC

print chr$(97) 'prints a
print asc("a") 'prints 97

[edit] Sather

class MAIN is
main is
#OUT + 'a'.int + "\n"; -- or
#OUT + 'a'.ascii_int + "\n";
#OUT + CHAR::from_ascii_int(97) + "\n";
end;
end;

[edit] Scala

Library: Scala

Scala supports Unicode characters, but each Char is a 16-bit UTF-16 code unit, so characters in the supplementary planes do not map 1-to-1 to a Char and need a surrogate pair.

[edit] In a REPL session

scala> 'a' toInt
res2: Int = 97
 
scala> 97 toChar
res3: Char = a
 
scala> '\u0061'
res4: Char = a
 
scala> "\uD869\uDEA5"
res5: String = 𪚥
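
The surrogate pair in the last REPL line follows from the standard UTF-16 arithmetic. The Python sketch below is an illustration only; the helper names to_surrogates and from_surrogates are made up for this example. It derives the pair for code point 173733 (U+2A6A5) and reverses the step.

```python
def to_surrogates(code_point):
    """Split a supplementary code point (>= 0x10000) into a UTF-16 pair."""
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)     # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits of the offset
    return high, low

def from_surrogates(high, low):
    """Combine a UTF-16 surrogate pair back into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

high, low = to_surrogates(173733)
print(f"\\u{high:04X}\\u{low:04X}")   # prints \uD869\uDEA5
print(from_surrogates(high, low))     # prints 173733
```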

[edit] Full swing workout

Taking the supplementary character sets into account.

import java.lang.Character._; import scala.annotation.tailrec
 
object CharacterCode extends App {
def intToChars(n: Int): Array[Char] = java.lang.Character.toChars(n)
 
def UnicodeToList(UTFstring: String) = {
@tailrec
def inner(str: List[Char], acc: List[String], surrogateHalf: Option[Char]): List[String] = {
(str, surrogateHalf) match {
case (Nil, _) => acc
case (ch :: rest, None) => if (ch.isSurrogate) inner(rest, acc, Some(ch))
else inner(rest, acc :+ ch.toString, None)
case (ch :: rest, Some(f)) => inner(rest, (acc :+ (f.toString + ch)), None)
}
}
inner(UTFstring.toList, Nil, None)
}
 
def UnicodeToInt(utf: String) = {
def charToInt(high: Char, low: Char) =
{ if (isSurrogatePair(high, low)) toCodePoint(high, low) else high.toInt }
charToInt(utf(0), if (utf.size > 1) utf(1) else 0)
}
 
def UTFtoHexString(utf: String) = { utf.map(ch => f"${ch.toInt}%04X").mkString("\"\\u", "\\u", "\"") }
 
def flags(ch: String) = { // Testing Unicode character properties
(if (ch matches "\\p{M}") "Y" else "N") + (if (ch matches "\\p{Mn}") "Y" else "N")
}
 
val str = '\uFEFF' /*big-endian BOM*/ + "\u0301a" +
"$áabcde¢£¤¥©ÇßIJijŁłʒλπक्तु•₠₡₢₣₤₥₦₧₨₩₪₫€₭₮₯₰₱₲₳₴₵℃←→⇒∙⌘☃☹☺☻ア字文𠀀" + intToChars(173733).mkString
 
println(s"Example string: $str")
println("""    | Chr C/C++/Java source  Code Point Hex      Dec Mn Name
!----+ --- ------------------------- ------- -------- -- """.stripMargin('!') + "-" * 27)
 
(UnicodeToList(str)).zipWithIndex.map {
case (coll, nr) =>
f"$nr%4d: $coll\t${UTFtoHexString(coll)}%27s U+${UnicodeToInt(coll)}%05X" +
f"${"(" + UnicodeToInt(coll).toString}%8s) ${flags(coll)} ${getName(coll(0).toInt)} "
}.foreach(println)
}
Output:
Example string: ́a$áabcde¢£¤¥©ÇßIJijŁłʒλπक्तु•₠₡₢₣₤₥₦₧₨₩₪₫€₭₮₯₰₱₲₳₴₵℃←→⇒∙⌘☃☹☺☻ア字文𠀀𪚥
    | Chr C/C++/Java source  Code Point Hex      Dec Mn Name
----+ --- ------------------------- ------- -------- -- ---------------------------
   0: 	                   "\uFEFF" U+0FEFF  (65279) NN  ZERO WIDTH NO-BREAK SPACE 
   1: ́	                   "\u0301" U+00301    (769) YY  COMBINING ACUTE ACCENT 
   2: a	                   "\u0061" U+00061     (97) NN  LATIN SMALL LETTER A 
   3: $	                   "\u0024" U+00024     (36) NN  DOLLAR SIGN 
   4: á	                   "\u00E1" U+000E1    (225) NN  LATIN SMALL LETTER A WITH ACUTE 
   5: a	                   "\u0061" U+00061     (97) NN  LATIN SMALL LETTER A 
   6: b	                   "\u0062" U+00062     (98) NN  LATIN SMALL LETTER B 
   7: c	                   "\u0063" U+00063     (99) NN  LATIN SMALL LETTER C 
   8: d	                   "\u0064" U+00064    (100) NN  LATIN SMALL LETTER D 
   9: e	                   "\u0065" U+00065    (101) NN  LATIN SMALL LETTER E 
  10: ¢	                   "\u00A2" U+000A2    (162) NN  CENT SIGN 
  11: £	                   "\u00A3" U+000A3    (163) NN  POUND SIGN 
  12: ¤	                   "\u00A4" U+000A4    (164) NN  CURRENCY SIGN 
  13: ¥	                   "\u00A5" U+000A5    (165) NN  YEN SIGN 
  14: ©	                   "\u00A9" U+000A9    (169) NN  COPYRIGHT SIGN 
  15: Ç	                   "\u00C7" U+000C7    (199) NN  LATIN CAPITAL LETTER C WITH CEDILLA 
  16: ß	                   "\u00DF" U+000DF    (223) NN  LATIN SMALL LETTER SHARP S 
  17: IJ	                   "\u0132" U+00132    (306) NN  LATIN CAPITAL LIGATURE IJ 
  18: ij	                   "\u0133" U+00133    (307) NN  LATIN SMALL LIGATURE IJ 
  19: Ł	                   "\u0141" U+00141    (321) NN  LATIN CAPITAL LETTER L WITH STROKE 
  20: ł	                   "\u0142" U+00142    (322) NN  LATIN SMALL LETTER L WITH STROKE 
  21: ʒ	                   "\u0292" U+00292    (658) NN  LATIN SMALL LETTER EZH 
  22: λ	                   "\u03BB" U+003BB    (955) NN  GREEK SMALL LETTER LAMDA 
  23: π	                   "\u03C0" U+003C0    (960) NN  GREEK SMALL LETTER PI 
  24: क	                   "\u0915" U+00915   (2325) NN  DEVANAGARI LETTER KA 
  25: ्	                   "\u094D" U+0094D   (2381) YY  DEVANAGARI SIGN VIRAMA 
  26: त	                   "\u0924" U+00924   (2340) NN  DEVANAGARI LETTER TA 
  27: ु	                   "\u0941" U+00941   (2369) YY  DEVANAGARI VOWEL SIGN U 
  28: •	                   "\u2022" U+02022   (8226) NN  BULLET 
  29: ₠	                   "\u20A0" U+020A0   (8352) NN  EURO-CURRENCY SIGN 
  30: ₡	                   "\u20A1" U+020A1   (8353) NN  COLON SIGN 
  31: ₢	                   "\u20A2" U+020A2   (8354) NN  CRUZEIRO SIGN 
  32: ₣	                   "\u20A3" U+020A3   (8355) NN  FRENCH FRANC SIGN 
  33: ₤	                   "\u20A4" U+020A4   (8356) NN  LIRA SIGN 
  34: ₥	                   "\u20A5" U+020A5   (8357) NN  MILL SIGN 
  35: ₦	                   "\u20A6" U+020A6   (8358) NN  NAIRA SIGN 
  36: ₧	                   "\u20A7" U+020A7   (8359) NN  PESETA SIGN 
  37: ₨	                   "\u20A8" U+020A8   (8360) NN  RUPEE SIGN 
  38: ₩	                   "\u20A9" U+020A9   (8361) NN  WON SIGN 
  39: ₪	                   "\u20AA" U+020AA   (8362) NN  NEW SHEQEL SIGN 
  40: ₫	                   "\u20AB" U+020AB   (8363) NN  DONG SIGN 
  41: €	                   "\u20AC" U+020AC   (8364) NN  EURO SIGN 
  42: ₭	                   "\u20AD" U+020AD   (8365) NN  KIP SIGN 
  43: ₮	                   "\u20AE" U+020AE   (8366) NN  TUGRIK SIGN 
  44: ₯	                   "\u20AF" U+020AF   (8367) NN  DRACHMA SIGN 
  45: ₰	                   "\u20B0" U+020B0   (8368) NN  GERMAN PENNY SIGN 
  46: ₱	                   "\u20B1" U+020B1   (8369) NN  PESO SIGN 
  47: ₲	                   "\u20B2" U+020B2   (8370) NN  GUARANI SIGN 
  48: ₳	                   "\u20B3" U+020B3   (8371) NN  AUSTRAL SIGN 
  49: ₴	                   "\u20B4" U+020B4   (8372) NN  HRYVNIA SIGN 
  50: ₵	                   "\u20B5" U+020B5   (8373) NN  CEDI SIGN 
  51: ℃	                   "\u2103" U+02103   (8451) NN  DEGREE CELSIUS 
  52: ←	                   "\u2190" U+02190   (8592) NN  LEFTWARDS ARROW 
  53: →	                   "\u2192" U+02192   (8594) NN  RIGHTWARDS ARROW 
  54: ⇒	                   "\u21D2" U+021D2   (8658) NN  RIGHTWARDS DOUBLE ARROW 
  55: ∙	                   "\u2219" U+02219   (8729) NN  BULLET OPERATOR 
  56: ⌘	                   "\u2318" U+02318   (8984) NN  PLACE OF INTEREST SIGN 
  57: ☃	                   "\u2603" U+02603   (9731) NN  SNOWMAN 
  58: ☹	                   "\u2639" U+02639   (9785) NN  WHITE FROWNING FACE 
  59: ☺	                   "\u263A" U+0263A   (9786) NN  WHITE SMILING FACE 
  60: ☻	                   "\u263B" U+0263B   (9787) NN  BLACK SMILING FACE 
  61: ア	                   "\u30A2" U+030A2  (12450) NN  KATAKANA LETTER A 
  62: 字	                   "\u5B57" U+05B57  (23383) NN  CJK UNIFIED IDEOGRAPHS 5B57 
  63: 文	                   "\u6587" U+06587  (25991) NN  CJK UNIFIED IDEOGRAPHS 6587 
  64: 	                   "\uF8FF" U+0F8FF  (63743) NN  PRIVATE USE AREA F8FF 
  65: 𠀀	             "\uD840\uDC00" U+20000 (131072) NN  HIGH SURROGATES D840 
  66: 𪚥	             "\uD869\uDEA5" U+2A6A5 (173733) NN  HIGH SURROGATES D869
More background info: "Java: a rough guide to character encoding"
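
The last two rows of the table show code points above U+FFFF written as two \u escapes (a UTF-16 surrogate pair). As a minimal sketch of the arithmetic behind those pairs, written in Python rather than the language of this entry, and using the hypothetical helper names to_surrogates and from_surrogates:

```python
def to_surrogates(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Recombine a surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


for cp in (0x20000, 0x2A6A5):
    high, low = to_surrogates(cp)
    print("U+%05X -> \\u%04X\\u%04X" % (cp, high, low))
```

For the two code points in the table this reproduces the escapes shown there: \uD840\uDC00 and \uD869\uDEA5.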

[edit] Scheme

(display (char->integer #\a)) (newline) ; prints "97"
(display (integer->char 97)) (newline) ; prints "a"

[edit] Seed7

writeln(ord('a'));
writeln(chr(97));

[edit] Slate

$a code.
97 as: String Character.

[edit] Smalltalk

($a asInteger) displayNl. "output 97"
(Character value: 97) displayNl. "output a"

[edit] SNOBOL4

Snobol implementations may or may not have built-in char(), ord(), or asc() functions. The definitions below are based on examples in the Snobol4+ tutorial and work with the native (1-byte) charset.

        define('chr(n)') :(chr_end)
chr     &alphabet tab(n) len(1) . chr :s(return)f(freturn)
chr_end

        define('asc(str)c') :(asc_end)
asc     str len(1) . c
        &alphabet break(c) @asc :s(return)f(freturn)
asc_end

*       # Test and display
        output = char(65) ;* Built-in
        output = chr(65)
        output = asc('A')
end
Output:
A
A
65
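
The user-defined chr and asc above both work by position in &alphabet, the string of all characters in code order. The same idea can be sketched in Python; ALPHABET, chr_lookup, and asc_lookup are hypothetical names, and ALPHABET is only a stand-in for &alphabet under the assumption of a 256-character, 1-byte charset:

```python
# Stand-in for SNOBOL4's &alphabet: all 256 one-byte characters in code order.
ALPHABET = bytes(range(256)).decode("latin-1")


def chr_lookup(n):
    """Return the character at position n, or None if n is out of range."""
    if not 0 <= n < len(ALPHABET):
        return None
    return ALPHABET[n]


def asc_lookup(s):
    """Return the position of the first character of s, or None on failure."""
    if not s:
        return None
    pos = ALPHABET.find(s[0])
    return pos if pos >= 0 else None


print(chr_lookup(65))
print(asc_lookup("A"))
```

Returning None here plays the role of the failure return (freturn) in the SNOBOL4 version.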

[edit] Standard ML

print (Int.toString (ord #"a") ^ "\n"); (* prints "97" *)
print (Char.toString (chr 97) ^ "\n"); (* prints "a" *)

[edit] Tcl

# ASCII
puts [scan "a" %c] ;# ==> 97
puts [format %c 97] ;# ==> a
# Unicode is the same
puts [scan "π" %c] ;# ==> 960
puts [format %c 960] ;# ==> π

[edit] TI-83 BASIC

TI-83 BASIC provides no built-in way to do this, so String<-->List routines and anything else that requires character codes use a workaround based on inString( and sub(. The codes are therefore positions in a lookup string, not ASCII values. In this example, the code of 'A' is displayed, and then the character matching a user-supplied code is displayed.

"ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789→Str1
Disp inString(Str1,"A
Input "CODE? ",A
Disp sub(Str1,A,1
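
The lookup-string workaround can also be sketched in Python, assuming the same 36-character table as Str1; TABLE, code_of, and char_of are hypothetical names. As with inString(, codes are 1-based positions and 0 means "not found":

```python
TABLE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"  # same contents as Str1


def code_of(ch):
    """Mimic inString(Str1, ch): 1-based position, or 0 if ch is absent."""
    return TABLE.find(ch) + 1


def char_of(code):
    """Mimic sub(Str1, code, 1): the single character at a 1-based position."""
    if not 1 <= code <= len(TABLE):
        raise ValueError("code outside the lookup table")
    return TABLE[code - 1]


print(code_of("A"))
print(char_of(27))
```

With this table 'A' has code 1 and '0' has code 27, so the scheme round-trips only for characters that appear in the lookup string.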

[edit] TI-89 BASIC

The TI-89 uses an 8-bit charset/encoding which is similar to ISO-8859-1, but with more mathematical symbols and Greek letters. At least codes 14-31, 128-160, and 180 differ. The ASCII region is unmodified. (TODO: Give a complete list.)

The TI Connect X desktop software converts between this unique character set and Unicode characters, though sometimes in a consistent but inappropriate fashion.

The program below displays the character and code for any key pressed. Some keys do not correspond to characters and have codes greater than 255. The portion of the program that actually implements the task is marked with a line of “©”s.

Prgm
Local k, s
ClrIO
Loop
Disp "Press a key, or ON to exit."
getKey() © clear buffer
0 → k : While k = 0 : getKey() → k : EndWhile
ClrIO
If k ≥ 256 Then
Disp "Not a character."
Disp "Code: " & string(k)
Else
 
char(k) → s ©
© char() and ord() are inverses. ©
Disp "Character: " & s ©
Disp "Code: " & string(ord(s)) ©
 
EndIf
EndLoop
EndPrgm

[edit] Trith

Characters are Unicode code points, so the solution is the same for Unicode characters as it is for ASCII characters:

"a" ord print
97 chr print
"π" ord print
960 chr print

[edit] TUSCRIPT

$$ MODE TUSCRIPT
SET character ="a", code=DECODE (character,byte)
PRINT character,"=",code
Output:
a=97

[edit] Ursala

Character code functions are not built in, but they are easily defined as reifications of the character table.

#import std
#import nat
 
chr = -: num characters
asc = -:@rlXS num characters
 
#cast %cnX
 
test = (chr97,asc`a)
Output:
(`a,97)
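
As a rough analogue in Python, the two functions can be read as a pair of lookup tables built from a numbered character list. This is a sketch under the assumption that the character table holds the 256 one-byte characters in code order; characters, chr_table, and asc_table are hypothetical names:

```python
# Assumed stand-in for the character table: 256 one-byte characters in order.
characters = [bytes([i]).decode("latin-1") for i in range(256)]

# Number the characters and turn the (number, character) pairs into a mapping.
chr_table = dict(enumerate(characters))

# Swap each pair to get the inverse mapping from character to number.
asc_table = {c: n for n, c in chr_table.items()}

test = (chr_table[97], asc_table["a"])
print(test)
```

Building the inverse table by swapping pairs mirrors how asc is derived from the same numbered list as chr.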

[edit] VBA

Debug.Print Chr(97) 'Prints a
Debug.Print [Code("a")] ' Prints 97 (Excel only: the brackets evaluate the CODE worksheet function)

[edit] Visual Basic .NET

Console.WriteLine(Chr(97)) 'Prints a
Console.WriteLine(Asc("a")) 'Prints 97

[edit] XPL0

A character is represented by an integer value equal to its ASCII code. The up-arrow character is used to convert the immediately following character to an integer equal to its ASCII code.

IntOut(0, ^a);  \(Integer Out)  displays "97" on the console (device 0)
ChOut(0, 97); \(Character Out) displays "a" on the console (device 0)

[edit] zkl

The character set is 8-bit ASCII, but zkl does not care if you use UTF-8 or Unicode characters.

 "a".toAsc()  //-->97
(97).toChar() //-->"a"