Strip control codes and extended characters from a string: Difference between revisions

Rename Perl 6 -> Raku, alphabetize, minor clean-up
Line 91:
<pre>
<<�abc
ádef~
ûÿ!>> - without control characters: <<abcádef~ûÿ!>>
<<�abc
ádef~
ûÿ!>> - without control or extended characters: <<abcdef~!>>
</pre>
Line 125:
control and extended stripped: abz (length 3)
</pre>
 
=={{header|BASIC}}==
{{works with|QBasic}}
Line 462 ⟶ 463:
!"#$%&'()*+,-./0123456789:;<=>?@[\]^_`{|}~</lang>
 
=={{header|C sharp|C#}}==
Uses the test string from REXX.
Line 565 ⟶ 515:
Stripped of extended: string of , may include control characters and other ilk.
</pre>
 
=={{header|C++}}==
<lang Cpp>#include <string>
#include <iostream>
#include <algorithm>
#include <boost/lambda/lambda.hpp>
#include <boost/lambda/casts.hpp>
#include <ctime>
#include <cstdlib>
using namespace boost::lambda ;

//produces one random byte in the range 0..255
struct MyRandomizer {
    char operator( )( ) {
        return static_cast<char>( rand( ) % 256 ) ;
    }
} ;

std::string deleteControls ( std::string startstring ) {
    //creating space for the standard algorithm remove_copy_if
    std::string noControls( startstring.size( ) , ' ' ) ;
    //control codes are 0..31 and 127; compare as unsigned char so that
    //extended characters are not read as negative values
    noControls.erase( std::remove_copy_if( startstring.begin( ) , startstring.end( ) , noControls.begin( ) ,
            ll_static_cast<unsigned char>( _1 ) < 32 || ll_static_cast<unsigned char>( _1 ) == 127 ) ,
        noControls.end( ) ) ; //trim the unused tail
    return noControls ;
}

std::string deleteExtended( std::string startstring ) {
    std::string noExtended ( startstring.size( ) , ' ' ) ;//same as above
    //also drops 127 (DEL), which is a control code
    noExtended.erase( std::remove_copy_if( startstring.begin( ) , startstring.end( ) , noExtended.begin( ) ,
            ll_static_cast<unsigned char>( _1 ) >= 127 || ll_static_cast<unsigned char>( _1 ) < 32 ) ,
        noExtended.end( ) ) ;
    return noExtended ;
}

int main( ) {
    std::string my_extended_string ;
    for ( int i = 0 ; i < 40 ; i++ ) //we want the extended string to be 40 characters long
        my_extended_string.append( " " ) ;
    srand( time( 0 ) ) ;
    std::generate_n( my_extended_string.begin( ) , 40 , MyRandomizer( ) ) ;
    std::string no_controls( deleteControls( my_extended_string ) ) ;
    std::string no_extended ( deleteExtended( my_extended_string ) ) ;
    std::cout << "string with all characters: " << my_extended_string << std::endl ;
    std::cout << "string without control characters: " << no_controls << std::endl ;
    std::cout << "string without extended characters: " << no_extended << std::endl ;
    return 0 ;
}</lang>
Output:
<PRE>string with all characters: K�O:~���7�5����
���W��@>��ȓ�q�Q@���W-
string without control characters: K�O:~���7�5����
���W��@>��ȓ�q�Q@���W-
string without extended characters: KO:~75W@>qQ@W-
</PRE>
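
A Boost-free variant may be easier to check against a fixed input. The following is only a sketch, assuming a C++11 compiler and treating each <code>char</code> as one byte (so it is not UTF-8 aware); the names <code>isControl</code>, <code>stripControls</code>, <code>stripControlsAndExtended</code> and <code>selfCheck</code> are illustrative and do not come from the entry above.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>

// True for ASCII control codes: 0..31 and DEL (127).
bool isControl( unsigned char c ) {
    return c < 32 || c == 127 ;
}

// Copies every byte that is not a control code.
std::string stripControls( const std::string & in ) {
    std::string out ;
    std::copy_if( in.begin( ) , in.end( ) , std::back_inserter( out ) ,
        []( char c ) { return !isControl( static_cast<unsigned char>( c ) ) ; } ) ;
    return out ;
}

// Copies only printable ASCII, i.e. bytes in the range 32..126.
std::string stripControlsAndExtended( const std::string & in ) {
    std::string out ;
    std::copy_if( in.begin( ) , in.end( ) , std::back_inserter( out ) ,
        []( char c ) {
            const unsigned char u = static_cast<unsigned char>( c ) ;
            return !isControl( u ) && u < 128 ;
        } ) ;
    return out ;
}

// Hypothetical self-check on a fixed sample instead of random data.
// Adjacent literals are split so that hex escapes do not swallow letters.
void selfCheck( ) {
    const std::string sample( "\x01" "ab\tc\xE9" "d\x7F" ) ;
    assert( stripControls( sample ) == "abc\xE9" "d" ) ;
    assert( stripControlsAndExtended( sample ) == "abcd" ) ;
}
```

Casting to <code>unsigned char</code> before comparing matters on platforms where plain <code>char</code> is signed: bytes above 127 would otherwise compare as negative and be removed together with the control codes.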
 
=={{header|Clojure}}==
Line 620 ⟶ 622:
}</lang>
{{out}}
<pre> abcédefabcédef�
abcédef
abcdef</pre>
Line 650 ⟶ 652:
41> strip_control_codes:task().
String (256 characters): ^@^A^B^C^D^E^F^G^H
^N^O^P^Q^R^S^T^U^V^W^X^Y^Z^[^\^]^^^_ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~^€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ�������������������������������� ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ
String without control codes (223 characters): !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ�������������������������������� ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ
String without control codes nor extended characters (95 characters): !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
</pre>
Line 679 ⟶ 681:
Output:
<pre>
Original: string of ☺☻♥♦☺☻♥♦�, may include control characters and other ilk.♫☼§►↔◄
Stripped of controls: string of ☺☻♥♦☺☻♥♦�, may include control characters and other ilk.♫☼§►↔◄
Stripped of extended: string of , may include control characters and other ilk.
</pre>
Line 992 ⟶ 994:
source text:
déjà vu
� !~�����
as⃝df̅
 
Line 1,084 ⟶ 1,086:
 
[[Category:String manipulation]]
 
=={{header|Java}}==
{{works with|Java|8+}}
<lang java>import java.util.function.IntPredicate;

public class StripControlCodes {

    public static void main(String[] args) {
        String s = "\u0000\n abc\u00E9def\u007F";
        System.out.println(stripChars(s, c -> c > '\u001F' && c != '\u007F'));
        System.out.println(stripChars(s, c -> c > '\u001F' && c < '\u007F'));
    }

    static String stripChars(String s, IntPredicate include) {
        return s.codePoints().filter(include::test).collect(StringBuilder::new,
                StringBuilder::appendCodePoint, StringBuilder::append).toString();
    }
}</lang>
<pre> abcédef
abcdef</pre>
 
=={{header|JavaScript}}==
Line 1,108 ⟶ 1,130:
<lang JavaScript>"abcd"</lang>
 
=={{header|jq}}==
{{works with|jq|1.4}}
Line 1,199 ⟶ 1,202:
<pre>
Originally:
String = 123 abcDEFabcDEF�+-*/€æŧðłþ Length = 22
 
After stripping control characters:
Line 1,260 ⟶ 1,263:
end function
</lang>
 
=={{header|Lua}}==
<lang lua>function Strip_Control_Codes( str )
Line 1,287 ⟶ 1,290:
print( Strip_Control_Codes(q) )
print( Strip_Control_and_Extended_Codes(q) )</lang>
<pre> !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ€�‚ƒ„…†‡ˆ‰Š‹Œ�Ž��‘’“”•–—˜™š›œ�žŸ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~</pre>
 
Line 1,302 ⟶ 1,305:
\.15\.16\.17\.18\.19\.1a\[RawEscape]\.1c\.1d\.1e\.1f !"#$%&'()*+,-./
0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]
^_`abcdefghijklmnopqrstuvwxyz{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ��������������������������������� ¡¢£¤¥¦§¨©ª«\[Not]­®¯\[Degree]
\[PlusMinus]\.b2\.b3\.b4\[Micro]\[Paragraph]\[CenterDot]¸¹º»¼½¾¿
ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ*ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö/øùúûüýþÿ
Line 1,308 ⟶ 1,311:
stripCtrl[CompleteSet]
->!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]
^_`abcdefghijklmnopqrstuvwxyz{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ�������������������������������� ¡¢£¤¥¦§¨©ª«\[Not]­®¯\[Degree]
\[PlusMinus]\.b2\.b3\.b4\[Micro]\[Paragraph]\[CenterDot]
¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ*ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö
Line 1,316 ⟶ 1,319:
->!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]
^_`abcdefghijklmnopqrstuvwxyz{|}~</pre>
 
 
=={{header|MATLAB}} / {{header|Octave}}==
Line 1,460 ⟶ 1,462:
Without extended: L08&YHOn):OG$."zOQ?
</PRE>
 
=={{header|Phix}}==
Line 1,500 ⟶ 1,490:
{{out}}
<pre>
The full string: " abc+®defdef�", Length:11
No Control Chars: " abc+®def", Length:9
" and no Extended: " abcdef", Length:7
Line 1,747 ⟶ 1,737:
(regexp-replace* #rx"[^\040-\176]+" str ""))
</lang>
 
=={{header|Raku}}==
(formerly Perl 6)
{{works with|Rakudo|2018.03}}
 
<lang perl6>my $str = (0..400).roll(80)».chr.join;
 
say $str;
say $str.subst(/<:Cc>/, '', :g); # Unicode property: control character
say $str.subst(/<-[\ ..~]>/, '', :g);</lang>
<pre>kşaNĹĭŗ�|Ęw���"ÄlĄWł8iCƁę��Ż�¬�5ĎĶ'óü¸'ÍŸ;ŢƐ¦�´ŷQċűÒŴ$ÃŅ�Đįð+=ĥƂ+Ōĭħ¼ŕc¤H~ìïēÕ
kşaNĹĭŗ|Ęw"ÄlĄWł8iCƁ꯬5ĎĶ'óü¸'ÍŸ;ŢƐ¦´ŷQċűÒŴ$ÃŅĐįð+=ĥƂ+Ōĭħ¼ŕc¤H~ìïēÕ
kaN|w"lW8iC5'';Q$+=+cH~</pre>
 
=={{header|REXX}}==
Line 1,852 ⟶ 1,855:
input : chr$(31)+"abc"+chr$(13)+"def"+chr$(11)+"ghi"+chr$(10)
output : abcdefghi</pre>
 
 
=={{header|Scala}}==
Line 1,998 ⟶ 2,000:
source text:
déjà vu
� !~€ÿ��ÿ
as⃝df̅
Stripped of control codes:
déjà vu !~€ÿas⃝df̅�ÿas⃝df̅
Stripped of control codes and extended characters:
dj vu !~asdf
Line 2,088 ⟶ 2,090:
End Function
 
WScript.StdOut.Write "ab�cd�ef�gh€ab�cd�ef�gh�€" & " = " & StripCtrlCodes("ab�cd�ef�gh€ab�cd�ef�gh�€")
WScript.StdOut.WriteLine
WScript.StdOut.Write "ab�cd�ef�ghij†klð€ab�cd�ef�gh�ij†klð€" & " = " & StripCtrlCodesExtChrs("ab�cd�ef�ghij†klð€ab�cd�ef�gh�ij†klð€")
WScript.StdOut.WriteLine
</lang>
Line 2,096 ⟶ 2,098:
{{Out}}
<pre>
ab�cd�ef�gh€ab�cd�ef�gh�€ = abcdefgh€
ab�cd�ef�ghij†klð€ab�cd�ef�gh�ij†klð€ = abcdefghijkl
</pre>
 