Idiomatically determine all the characters that can be used for symbols: Difference between revisions

@@ Line 44: / Line 44: @@
 nd..nth char: 193 NG, 63 OK 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
 </pre>
 =={{header|F_Sharp|F#}}==
 Well, if the purpose of this task is to determine what can be used as an identifier then in F# anything so long as you enclose it in double backticks so:
@@ Line 367: / Line 368: @@
     print $c if $c =~ /\p{Word}/;
 }</lang>
-=={{header|Perl 6}}==
-Any Unicode character or combination of characters can be used for symbols in Perl 6.  Here's some counting rods and some cuneiform:
-<lang perl6>sub postfix:<𒋦>($n) { say "$n trilobites" }
-sub term:<𝍧> { unival('𝍧') }
-𝍧𒋦</lang>
-{{out}}
-<pre>8 trilobites</pre>
-And here is a Zalgo-text symbol:
-<lang perl6>sub Z̧̔ͩ͌͑̉̎A̢̲̙̮̹̮͍̎L̔ͧ́͆G̰̬͎͔̱̅ͣͫO͙̔ͣ̈́̈̽̎ͣ ($n) { say "$n COMES" }
-Z̧̔ͩ͌͑̉̎A̢̲̙̮̹̮͍̎L̔ͧ́͆G̰̬͎͔̱̅ͣͫO͙̔ͣ̈́̈̽̎ͣ 'HE'</lang>
-{{out}}
-<pre>HE COMES</pre>
-Of course, as in other languages, most of the characters you'll typically see in names are going to be alphanumerics from ASCII (or maybe Unicode), but that's a convention, not a limitation, due to the syntactic category notation demonstrated above, which can introduce any sequence of characters as a term or operator.
-Actually, the above is a slight prevarication.  The syntactic category notation does not allow you to use whitespace in the definition of a new symbol. But that leaves many more characters allowed than not allowed.  Hence, it is much easier to enumerate the characters that <em>cannot</em> be used in symbols:
-<lang perl6>say .fmt("%4x"),"\t", uniname($_)
-    if uniprop($_,'Z')
-        for 0..0x1ffff;</lang>
-{{out}}
-<pre>  20	SPACE
-  a0	NO-BREAK SPACE
-1680	OGHAM SPACE MARK
-2000	EN QUAD
-2001	EM QUAD
-2002	EN SPACE
-2003	EM SPACE
-2004	THREE-PER-EM SPACE
-2005	FOUR-PER-EM SPACE
-2006	SIX-PER-EM SPACE
-2007	FIGURE SPACE
-2008	PUNCTUATION SPACE
-2009	THIN SPACE
-200a	HAIR SPACE
-2028	LINE SEPARATOR
-2029	PARAGRAPH SEPARATOR
-202f	NARROW NO-BREAK SPACE
-205f	MEDIUM MATHEMATICAL SPACE
-3000	IDEOGRAPHIC SPACE</pre>
-We enforce the whitespace restriction to prevent insanity in the readers of programs.
-That being said, even the whitespace restriction is arbitrary, and can be bypassed by deriving a new grammar and switching to it.  We view all other languages as dialects of Perl 6, even the insane ones.  <tt>:-)</tt>
 =={{header|Phix}}==
@@ Line 542: / Line 495: @@
 (3 i have a space i've got a quote in me i'm not a "dot on my own", but my neighbour is! . λ my characters aren't even mapped in unicode 􎑃)</pre>
 The output to <code>(main)</code> is massive, and probably not dissimilar to Tcl's (anyone want to compare?)
+=={{header|Raku}}==
+(formerly Perl 6)
+Any Unicode character or combination of characters can be used for symbols in Perl 6.  Here's some counting rods and some cuneiform:
+<lang perl6>sub postfix:<𒋦>($n) { say "$n trilobites" }
+sub term:<𝍧> { unival('𝍧') }
+𝍧𒋦</lang>
+{{out}}
+<pre>8 trilobites</pre>
+And here is a Zalgo-text symbol:
+<lang perl6>sub Z̧̔ͩ͌͑̉̎A̢̲̙̮̹̮͍̎L̔ͧ́͆G̰̬͎͔̱̅ͣͫO͙̔ͣ̈́̈̽̎ͣ ($n) { say "$n COMES" }
+Z̧̔ͩ͌͑̉̎A̢̲̙̮̹̮͍̎L̔ͧ́͆G̰̬͎͔̱̅ͣͫO͙̔ͣ̈́̈̽̎ͣ 'HE'</lang>
+{{out}}
+<pre>HE COMES</pre>
+Of course, as in other languages, most of the characters you'll typically see in names are going to be alphanumerics from ASCII (or maybe Unicode), but that's a convention, not a limitation, due to the syntactic category notation demonstrated above, which can introduce any sequence of characters as a term or operator.
+Actually, the above is a slight prevarication.  The syntactic category notation does not allow you to use whitespace in the definition of a new symbol. But that leaves many more characters allowed than not allowed.  Hence, it is much easier to enumerate the characters that <em>cannot</em> be used in symbols:
+<lang perl6>say .fmt("%4x"),"\t", uniname($_)
+    if uniprop($_,'Z')
+        for 0..0x1ffff;</lang>
+{{out}}
+<pre>  20	SPACE
+  a0	NO-BREAK SPACE
+1680	OGHAM SPACE MARK
+2000	EN QUAD
+2001	EM QUAD
+2002	EN SPACE
+2003	EM SPACE
+2004	THREE-PER-EM SPACE
+2005	FOUR-PER-EM SPACE
+2006	SIX-PER-EM SPACE
+2007	FIGURE SPACE
+2008	PUNCTUATION SPACE
+2009	THIN SPACE
+200a	HAIR SPACE
+2028	LINE SEPARATOR
+2029	PARAGRAPH SEPARATOR
+202f	NARROW NO-BREAK SPACE
+205f	MEDIUM MATHEMATICAL SPACE
+3000	IDEOGRAPHIC SPACE</pre>
+We enforce the whitespace restriction to prevent insanity in the readers of programs.
+That being said, even the whitespace restriction is arbitrary, and can be bypassed by deriving a new grammar and switching to it.  We view all other languages as dialects of Perl 6, even the insane ones.  <tt>:-)</tt>
 =={{header|REXX}}==
@@ Line 620: / Line 622: @@
 }</lang>
 =={{header|Tcl}}==
 Tcl permits ''any'' character to be used in a variable or command name (subject to the restriction that <code>::</code> is a namespace separator and, for variables only, a <code>(…)</code> sequence is an array reference). The set of characters that can be used after <code>$</code> is more restricted, excluding many non-letter-like symbols, but still large. It is ''recommended practice'' to only use ASCII characters for variable names as this makes scripts more resistant to the majority of encoding problems when transporting them between systems, but the language does not itself impose such a restriction.