Idiomatically determine all the characters that can be used for symbols
Unicode Identifier start: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzªµºÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐ...
Unicode Identifier part: [0][1][2][3][4][5][6][7][8][14][15][16][17][18][19][20][21][22][23][24][25][26][27][48][49]...</pre>

=={{header|jq}}==
===jq identifiers===
Excluding key names from consideration, in jq 1.4 the set of characters that can be
used in jq identifiers corresponds to the regex: [A-Za-z0-9$_].
Thus, assuming the availability of test/1 as a builtin (it was introduced in jq 1.5),
the test in jq for a valid identifier character is: test("[A-Za-z0-9$_]").

To generate a string of such characters idiomatically:
<lang jq>[range(0;128) | [.] | implode | select(test("[A-Za-z0-9$_]"))] | add</lang>
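
Where test/1 is not available, the same string can be built by comparing code points directly. The following is a minimal sketch rather than an established idiom; the name identifier_chars is introduced here purely for illustration, and the ranges simply restate the character class [A-Za-z0-9$_]:
<lang jq># Illustrative helper (not a jq builtin): the characters matched by
# [A-Za-z0-9$_], selected by code point instead of by regex.
def identifier_chars:
  [range(0;128)
   | select((. >= 65 and . <= 90)      # A-Z
         or (. >= 97 and . <= 122)     # a-z
         or (. >= 48 and . <= 57)      # 0-9
         or . == 36                    # $
         or . == 95)]                  # _
  | implode;</lang>
Evaluating identifier_chars yields the 64 characters in code point order, beginning with "$" and the digits and ending with the lowercase letters.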

jq 1.5 also allows ":" as a joining character in the form "module::name".


===JSON key names===
Any JSON string can be used as a key. Some characters must, however,
be entered as escape sequences,
e.g. \u0000 for NUL and \\ for backslash. Thus any Unicode character
except for the control characters U+0000 through U+001F (and the double quote
and backslash) can appear literally in a jq key.
Therefore, assuming the availability in jq of the test/1 builtin, a test
in jq for whether a character lies outside that control range is:
<lang jq>test("[^\u0000-\u001F]")</lang>
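
To illustrate that keys are not restricted to identifier characters, the following sketch builds an object whose keys contain a space, a non-ASCII letter and an escaped tab. The name sample and the keys themselves are made up for this example:
<lang jq># Illustrative object (not from the task): keys that are not valid identifiers.
def sample:
  {"a b": 1, "π": 2, "tab\there": 3};</lang>
Such keys are accessed with the quoted form of the path syntax:
<lang jq>sample | [."a b", ."π", ."tab\there"]</lang>
This produces [1,2,3].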

===Symbols===
The following function screens for characters by "\p" class:
<lang jq>def is_character(class):
test( "\\p{" + class + "}" );</lang>
For example, to test whether a character is a Unicode letter, symbol or numeric character:
<lang jq>is_character("L") or is_character("S") or is_character("N")</lang>
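
As a sketch of how this test might be applied to a whole string, the function below splits its input into one-character strings and keeps only the letters, symbols and numeric characters. The name letters_symbols_numbers is hypothetical, and is_character is repeated so that the snippet stands alone:
<lang jq>def is_character(class):
  test( "\\p{" + class + "}" );

# Illustrative helper: the characters of the input string that are
# Unicode letters, symbols or numeric characters, as an array.
def letters_symbols_numbers:
  explode
  | map([.] | implode
        | select(is_character("L") or is_character("S") or is_character("N")));</lang>
For example:
<lang jq>"aΩ+7 !" | letters_symbols_numbers</lang>
produces ["a","Ω","+","7"], since the space and the exclamation mark fall into the separator and punctuation classes respectively.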

An efficient way to count the number of Unicode characters within a character class is
to use the technique illustrated by the following function:
<lang jq>def count(class; m; n):
reduce (range(m;n) | [.] | implode | select( test( "\\p{" + class + "}" ))) as $i
(0; . + 1);</lang>

For example, the number of Unicode "symbol" characters can be obtained by evaluating:
<lang jq>count("S"; 0; 1114112)</lang>
The result is 3958, though the exact figure depends on the version of the Unicode database used by jq's regex library.
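
A quick sanity check is to restrict the count to the ASCII range, where the answers are easy to verify by hand. The wrapper ascii_count is a hypothetical name used only for this sketch, and count is repeated so that the snippet stands alone:
<lang jq>def count(class; m; n):
  reduce (range(m;n) | [.] | implode | select( test( "\\p{" + class + "}" ))) as $i
    (0; . + 1);

# Illustrative wrapper: count the characters of a class among code points 0..127.
def ascii_count(class): count(class; 0; 128);</lang>
Evaluating ascii_count("Lu") gives 26 (the uppercase letters A to Z), and ascii_count("N") gives 10 (the digits 0 to 9).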


=={{header|ooRexx}}==