Idiomatically determine all the characters that can be used for symbols
Revision as of 18:28, 23 March 2014
Idiomatically determine all the characters that can be used for symbols. The word symbols means things like names of variables, procedures (i.e., named fragments of programs, functions, subroutines, routines), statement labels, events or conditions, and in general, anything a computer programmer can choose to name, but it is not restricted to this list. Identifiers might be another name for symbols.
The method should find the characters regardless of the hardware architecture that is being used (ASCII, EBCDIC, or other).
- Task requirements
Display the set of all the characters that the computer program allows in symbols. You may want to mention what hardware architecture is being used, and if applicable, the operating system.
Note that most languages have additional restrictions on which characters can be used as the first character of a variable or statement label, for instance. These types of restrictions needn't be addressed here (but can be mentioned).
- See also
Idiomatically determine all the lowercase and uppercase letters.
ooRexx
<lang oorexx>/*REXX program determines what characters are valid for REXX symbols.*/
/* copied/adjusted from REXX */
a=                                     /*set symbol characters   "  "   */
   do j=0 for 2**8                     /*traipse through all the chars. */
   _=d2c(j)                            /*convert decimal number to char.*/
   if datatype(_,'S')  then a=a || _   /*Symbol char?  Then add to list.*/
   end   /*j*/                         /* [↑] put some chars into a list*/

say ' symbol characters: ' a           /*display all symbol characters.*/</lang>
- Output:
symbol characters: !.0123456789?ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
Perl 6
Any Unicode character or combination of characters can be used for symbols in Perl 6. Here are some counting rods and some cuneiform: <lang perl6>sub postfix:<𒋦>($n) { say "$n trilobites" }
sub term:<𝍧> { unival('𝍧') }
𝍧𒋦</lang>
- Output:
8 trilobites
Of course, as in other languages, most of the characters you'll typically see in names are going to be alphanumerics from ASCII (or maybe Unicode), but that's a convention, not a limitation, due to the syntactic category notation demonstrated above, which can introduce any sequence of characters as a term or operator.
Actually, the above is a slight prevarication. The syntactic category notation does not allow you to use whitespace in the definition of a new symbol. But that leaves many more characters allowed than not allowed. Hence, it is much easier to enumerate the characters that cannot be used in symbols: <lang perl6>say .fmt("%4x"),"\t", uniname($_)
if uniprop($_,'Z') for 0..0x1ffff;</lang>
- Output:
  20	SPACE
  a0	NO-BREAK SPACE
1680	OGHAM SPACE MARK
2000	EN QUAD
2001	EM QUAD
2002	EN SPACE
2003	EM SPACE
2004	THREE-PER-EM SPACE
2005	FOUR-PER-EM SPACE
2006	SIX-PER-EM SPACE
2007	FIGURE SPACE
2008	PUNCTUATION SPACE
2009	THIN SPACE
200a	HAIR SPACE
2028	LINE SEPARATOR
2029	PARAGRAPH SEPARATOR
202f	NARROW NO-BREAK SPACE
205f	MEDIUM MATHEMATICAL SPACE
3000	IDEOGRAPHIC SPACE
We enforce the whitespace restriction to prevent insanity in the readers of programs. That being said, even the whitespace restriction is arbitrary, and can be bypassed by deriving a new grammar and switching to it. We view all other languages as dialects of Perl 6, even the insane ones. :-)
Python
See String class isidentifier.
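A minimal sketch of how str.isidentifier could be used for this task, assuming Python 3. The helper name identifier_chars and its limit parameter are made up for illustration. A character counts as a valid first character if it is an identifier on its own, and as valid in later positions only if it becomes an identifier when prefixed with an underscore.

```python
import sys

def identifier_chars(limit=sys.maxunicode + 1):
    """Return (start, rest) for code points below `limit` (hypothetical helper).

    start: characters accepted as the first character of an identifier.
    rest:  characters accepted only after the first position.
    """
    start, rest = [], []
    for code in range(limit):
        ch = chr(code)
        if ch.isidentifier():            # valid on its own, so valid as a first character
            start.append(ch)
        elif ("_" + ch).isidentifier():  # valid only when something precedes it
            rest.append(ch)
    return start, rest

if __name__ == "__main__":
    # Restrict the display to ASCII to keep it short.
    start, rest = identifier_chars(128)
    print("first character:", "".join(start))
    print("later positions:", "".join(rest))
    # The full Unicode range yields far too many characters to list, so only count them.
    start, rest = identifier_chars()
    print(len(start), "start characters,", len(rest), "additional continuation characters")
```

Within ASCII this reports the letters and the underscore as first characters and the decimal digits as valid in later positions only. The counts for the full Unicode range depend on the Unicode database shipped with the Python version in use.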
REXX
<lang rexx>/*REXX program determines what characters are valid for REXX symbols.*/
@=                                     /*set symbol characters   "  "   */
   do j=0 for 2**8                     /*traipse through all the chars. */
   _=d2c(j)                            /*convert decimal number to char.*/
   if datatype(_,'S')  then @=@ || _   /*Symbol char?  Then add to list.*/
   end   /*j*/                         /* [↑] put some chars into a list*/

say ' symbol characters: ' @           /*display all symbol characters.*/
                                       /*stick a fork in it, we're done.*/</lang>
Programming note: REXX allows any symbol to begin a (statement) label, but variables can't begin with a period (.) or a numeric digit.
All examples below were executed on an (ASCII) PC using Windows/XP and Windows/7 with code page 437 in a DOS window.
Using PC/REXX, Personal REXX, and Regina (versions 3.2 ───► 3.7)
output
symbol characters: !#$.0123456789?@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
Using R4
output
symbol characters: !#$.0123456789?@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyzÇüéâäàåçêëèïîìÄÅÉæÆôöòûùÿÖÜ¢£áíóúñÑ╡╢╖─╞╟╨╤╥╙╘╒╓╫╪▐αßΓπΣσµτΦΘΩδ∞φ
Using ROO
output
symbol characters: !#$.0123456789?@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyzÇüéâäàåçêëèïîìÄÅÉæÆôöòûùÿÖÜ¢£áíóúñÑ╡╢╖╞╟╨╤╥╙╘╒╓╫╪▐αßΓπΣσµτΦΘΩδ∞φ