Idiomatically determine all the characters that can be used for symbols

Unicode Identifier start: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzªµºÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐ...
Unicode Identifier part: [0][1][2][3][4][5][6][7][8][14][15][16][17][18][19][20][21][22][23][24][25][26][27][48][49]...</pre>
 
=={{header|jq}}==
===jq identifiers===
Excluding key names from consideration, in jq 1.4 the set of characters that can be
used in jq identifiers corresponds to the regex: [A-Za-z0-9$_].
Thus, assuming the availability of test/1 as a builtin (it was introduced in jq 1.5),
the test in jq for a valid identifier character is: test("[A-Za-z0-9$_]").
 
To generate a string of such characters idiomatically:
<lang jq>[range(0;128) | [.] | implode | select(test("[A-Za-z0-9$_]"))] | add</lang>
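
Note that test/1 succeeds if the regex matches anywhere in its input, so a longer string containing one valid character also passes. A minimal sketch of a stricter check, assuming the character class given above; the name is_identifier_char is a hypothetical helper, not a jq builtin:
<lang jq># Hypothetical helper (not part of jq): true only if the input is
# exactly one character drawn from the identifier set.
def is_identifier_char:
  length == 1 and test("[A-Za-z0-9$_]");

["a", "_", "$", "-", " ", "ab"] | map(is_identifier_char)</lang>
This should yield [true,true,true,false,false,false].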
 
jq 1.5 also allows ":" as a joining character in the form "module::name".
 
 
===JSON key names===
Any JSON string can be used as a key. Some characters can only be entered
as escaped character sequences, e.g. \u0000 for NUL and \\ for backslash,
but once escaped, any Unicode character can appear in a key.
The characters that cannot appear literally are the control characters
U+0000 to U+001F, the double quote and the backslash.
Therefore, assuming the availability in jq of the test/1 builtin, the test
in jq for whether a character can appear literally (that is, unescaped) in a key is:
<lang jq>test("[^\u0000-\u001F\"\\\\]")</lang>
 
===Symbols===
The following function screens for characters by "\p" class:
<lang jq>def is_character(class):
  test( "\\p{" + class + "}" );</lang>
For example, to test whether a character is a Unicode letter, symbol or numeric character:
<lang jq>is_character("L") or is_character("S") or is_character("N")</lang>
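
As a small illustration of the combined test, the following sketch applies it to a few sample characters. The definition of is_character is repeated so that the snippet runs on its own; the sample characters are arbitrary choices:
<lang jq>def is_character(class):
  test( "\\p{" + class + "}" );

# "a" is a letter, "+" a math symbol, "7" a decimal digit;
# the space is a separator and so belongs to none of the three classes.
["a", "+", "7", " "]
| map(is_character("L") or is_character("S") or is_character("N"))</lang>
This should yield [true,true,true,false].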
 
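To count the Unicode characters in a character class without first collecting them
in an array, reduce over the stream of matching characters, as in the following function,
which examines the codepoints from m up to but excluding n:
<lang jq>def count(class; m; n):
  # Add 1 for each codepoint in range(m;n) whose character matches \p{class}.
  reduce (range(m;n) | [.] | implode | select( test( "\\p{" + class + "}" ))) as $i
    (0; . + 1);</lang>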
 
For example, the number of Unicode "symbol" characters can be obtained by evaluating:
<lang jq>count("S"; 0; 1114112)</lang>
The result is 3958; the exact figure depends on the Unicode version known to the regex library used by jq.
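
A range small enough to check by hand makes the technique easier to verify. The sketch below restricts the count to printable ASCII (an arbitrary choice for illustration), repeats the definition of count so that it runs on its own, and also lists the matching characters:
<lang jq>def count(class; m; n):
  reduce (range(m;n) | [.] | implode | select( test( "\\p{" + class + "}" ))) as $i
    (0; . + 1);

# Printable ASCII only: codepoints 32 to 126 inclusive.
count("S"; 32; 127),
([range(32;127) | [.] | implode | select(test("\\p{S}"))] | add)</lang>
This should produce 9 followed by the string "$+<=>^`|~".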
 
=={{header|ooRexx}}==