Strip control codes and extended characters from a string: Difference between revisions

(add BQN)
m (syntax highlighting fixup automation)
Line 22:
<syntaxhighlight lang="11l">F stripped(s)
R s.filter(i -> Int(i.code) C 32..126).join(‘’)
Line 33:
=={{header|8086 Assembly}}==
<syntaxhighlight lang="asm"> .model small
.stack 1024
Line 104:
int 10h
end start</syntaxhighlight>
Line 113:
<syntaxhighlight lang="action!">BYTE FUNC IsAscii(CHAR c)
IF c<32 OR c>124 OR c=96 OR c=123 THEN
Line 142:
PrintF("Stripped string: ""%S""%E",dst)
Screenshot from Atari 8-bit computer
Line 152:
<syntaxhighlight lang="ada">with Ada.Text_IO;
procedure Strip_ASCII is
Line 187:
Put_Line("Neither_Extended:", Filter(Full, Above => Character'Last)); -- defaults for From and To
end Strip_ASCII;
Line 197:
=={{header|ALGOL 68}}==
<syntaxhighlight lang="algol68"># remove control characters and optionally extended characters from the string text #
# assumes ASCII is the character set #
PROC strip characters = ( STRING text, BOOL strip extended )STRING:
Line 226:
STRING t = REPR 2 + "abc" + REPR 10 + REPR 160 + "def~" + REPR 127 + REPR 10 + REPR 150 + REPR 152 + "!";
print( ( "<<" + t + ">> - without control characters: <<" + strip characters( t, FALSE ) + ">>", newline ) );
print( ( "<<" + t + ">> - without control or extended characters: <<" + strip characters( t, TRUE ) + ">>", newline ) )</syntaxhighlight>
Line 238:
<syntaxhighlight lang="rebol">str: {string of ☺☻♥♦⌂, may include control characters and other ♫☼§►↔◄░▒▓█┌┴┐±÷²¬└┬┘ilk.}
print "with extended characters"
Line 247:
print join select split str 'x ->
and? ascii? x
not? in? to :integer to :char x (0..31)++127</syntaxhighlight>
Line 258:
<syntaxhighlight lang="ahk">Stripped(x){
Loop Parse, x
if Asc(A_LoopField) > 31 and Asc(A_LoopField) < 128
Line 264:
return r
MsgBox % stripped("`ba" Chr(00) "b`n`rc`fd" Chr(0xc3))</syntaxhighlight>
<syntaxhighlight lang="awk">
Line 276:
Line 289:
While DOS does support ''some'' extended characters, they aren't entirely standardized, and shouldn't be relied upon.
<syntaxhighlight lang="qbasic">DECLARE FUNCTION strip$ (what AS STRING)
Line 326:
strip2$ = outP
END FUNCTION</syntaxhighlight>
Line 336:
=={{header|BBC BASIC}}==
<syntaxhighlight lang="bbcbasic"> test$ = CHR$(9) + "Fran" + CHR$(231) + "ais." + CHR$(127)
PRINT "Original ISO-8859-1 string: " test$ " (length " ; LEN(test$) ")"
test$ = FNstripcontrol(test$)
Line 362:
= A$</syntaxhighlight>
Line 373:
Using BQN's character arithmetic and comparison, characters are binned using <code>⍋</code> and removed if they are inside the range.
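The binning step can also be sketched outside BQN. The Python fragment below is only an illustration, not a translation of the solution: <code>bisect_right</code> stands in for <code>⍋</code>, and the names <code>BOUNDS</code> and <code>strip_ct</code> are hypothetical.

<syntaxhighlight lang="python">from bisect import bisect_right

# Hypothetical bin boundaries, mirroring the code points 0 and 32.
BOUNDS = [0, 32]


def strip_ct(s: str) -> str:
    """Drop characters that fall in bin 1 (0 <= code < 32) and DEL (127)."""
    return "".join(
        c for c in s
        if bisect_right(BOUNDS, ord(c)) != 1 and ord(c) != 127
    )


print(strip_ct("\ta\x00b\x7fc d"))  # prints: abc d</syntaxhighlight>

With boundaries 0 and 32, a bin index of 1 marks the C0 control range. DEL lies outside that bin, so it needs its own comparison.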
<syntaxhighlight lang="bqn">StripCt←((1≠(@+0‿32)⊸⍋)∧(@+127)⊸≠)⊸/
<syntaxhighlight lang="bqn"> RP←•rand.Deal∘≠⊸⊏ # Random Permutation
ascii←RP @+↕256
Line 400:
≠StripCtEx unicode
<syntaxhighlight lang="bracmat">( "string of ☺☻♥♦⌂, may include control
characters and other ilk.\L\D§►↔◄
Rødgrød med fløde"
Line 429:
& );</syntaxhighlight>
<pre>Control characters stripped:
Line 447:
A true/false function checks if the character is in the valid range.<br>
<syntaxhighlight lang="c">#include <stdlib.h>
#include <stdio.h>
#include <string.h>
Line 573:
return 0;
Line 585:
===<tt>apply mask from a table</tt>===
<syntaxhighlight lang="c">#include <stdio.h>
#include <stdlib.h>
Line 648:
return 0;
}</syntaxhighlight>output:<syntaxhighlight lang="text"> !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ <odd stuff my xterm thinks are bad unicode hence can't be properly shown>
=={{header|C sharp|C#}}==
Uses the test string from REXX.
<syntaxhighlight lang="csharp">
using System;
using System.Collections.Generic;
Line 697:
Line 706:
<syntaxhighlight lang="cpp">#include <string>
#include <iostream>
#include <algorithm>
Line 748:
std::cout << "string without extended characters: " << no_extended << std::endl ;
return 0 ;
<PRE>string with all characters: K�O:~���7�5����
Line 758:
<syntaxhighlight lang="clojure">; generate our test string of characters with control and extended characters
(def range-of-chars (apply str (map char (range 256))))
Line 765:
; filter to return String of characters that are between 32 - 126:
(apply str (filter #(<= 32 (int %) 126) range-of-chars))</syntaxhighlight>
=={{header|Common Lisp}}==
Line 786:
<syntaxhighlight lang="d">import std.traits;
S stripChars(S)(S s, bool function(dchar) pure nothrow mustStrip)
Line 804:
writeln(s.stripChars( c => isControl(c) || c == '\u007F' ));
writeln(s.stripChars( c => isControl(c) || c >= '\u007F' ));
<pre> abcédef�
Line 812:
Exported functions to be used by [[Update_a_configuration_file]]
<syntaxhighlight lang="erlang">
-module( strip_control_codes ).
Line 831:
String_without_cc_nor_ec = lists:filter( fun is_not_control_code_nor_extended_character/1, String ),
io:fwrite( "String without control codes nor extended characters (~p characters): ~s~n", [erlang:length(String_without_cc_nor_ec), String_without_cc_nor_ec] ).
Line 843:
=={{header|F Sharp|F#}}==
Uses test string from REXX.
<syntaxhighlight lang="fsharp">
open System
Line 862:
printfn "Stripped of extended: %s" (stripExtended test)
0//main must return integer, much like in C/C++
Line 871:
<syntaxhighlight lang="text">USING: ascii kernel sequences ;
: strip-control-codes ( str -- str' ) [ control? not ] filter ;
: strip-control-codes-and-extended ( str -- str' )
strip-control-codes [ ascii? ] filter ;</syntaxhighlight>
<syntaxhighlight lang="forth">: strip ( buf len -- buf len' ) \ repacks buffer, so len' <= len
over + over swap over ( buf dst limit src )
Line 886:
over - ;</syntaxhighlight>
<syntaxhighlight lang="fortran">module stripcharacters
implicit none
Line 943:
write (*,*) strip(string,not_extended)
end program test
<syntaxhighlight lang="freebasic">' FB 1.05.0 Win64
Function stripControlChars(s As Const String) As String
Line 1,023:
Print "Press any key to quit"
Line 1,039:
<syntaxhighlight lang="frink">stripExtended[str] := str =~ %s/[^\u0020-\u007e]//g
stripControl[str] := str =~ %s/[\u0000-\u001F\u007f]//g
println[stripExtended[char[0 to 127]]]
println[stripControl[char[0 to 127]]]</syntaxhighlight>
Line 1,053:
<syntaxhighlight lang="gambas">Public Sub Main()
Dim sString As String = "The\t \equick\n \fbrownfox \vcost £125.00 or €145.00 or $160.00 \bto \ncapture ©®"
Dim sStd, sExtend As String
Line 1,083:
Return sResult
Line 1,097:
Go works for ASCII and non-ASCII systems. The first pair of functions below interpret strings as byte strings, presumably useful for strings consisting of ASCII and 8-bit extended ASCII data. The second pair of functions interpret strings as UTF-8.
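The difference between the two interpretations can be illustrated in Python (not Go), as a minimal sketch. All four function names are hypothetical, and treating Unicode general category <code>Cc</code> as "control" is an assumption of this sketch rather than something taken from the Go code.

<syntaxhighlight lang="python">import unicodedata


def strip_ctl_bytes(data: bytes) -> bytes:
    """Byte view: drop bytes below 32 and DEL, keep 8-bit extended bytes."""
    return bytes(b for b in data if b >= 32 and b != 127)


def strip_ctl_ext_bytes(data: bytes) -> bytes:
    """Byte view: keep only printable ASCII bytes (32..126)."""
    return bytes(b for b in data if 32 <= b < 127)


def strip_ctl_utf8(data: bytes) -> str:
    """UTF-8 view: decode first, then drop code points of category Cc."""
    return "".join(
        c for c in data.decode("utf-8") if unicodedata.category(c) != "Cc"
    )


def strip_ctl_ext_utf8(data: bytes) -> str:
    """UTF-8 view: decode first, then keep only code points 32..126."""
    return "".join(c for c in data.decode("utf-8") if 32 <= ord(c) < 127)


sample = "a\tb\u00e9\x7f".encode("utf-8")
print(strip_ctl_bytes(sample))      # the two bytes of "é" survive as raw bytes
print(strip_ctl_utf8(sample))       # "é" survives as one character</syntaxhighlight>

In the byte view a multi-byte character is just a run of extended bytes, so stripping extended bytes removes it piece by piece. In the UTF-8 view it is a single code point that is either kept or dropped whole.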
<syntaxhighlight lang="go">package main
import (
Line 1,186:
fmt.Println("\nas decomposed and stripped Unicode:")
Output: (varies with display configuration)
Line 1,211:
<syntaxhighlight lang="groovy">def stripControl = { it.replaceAll(/\p{Cntrl}/, '') }
def stripControlAndExtended = { it.replaceAll(/[^\p{Print}]/, '') }</syntaxhighlight>
<syntaxhighlight lang="groovy">def text = (0..255).collect { (char) it }.join('')
def textMinusControl = text.findAll { int v = (char)it; v > 31 && v != 127 }.join('')
def textMinusControlAndExtended = textMinusControl.findAll {((char)it) < 128 }.join('')
assert stripControl(text) == textMinusControl
assert stripControlAndExtended(text) == textMinusControlAndExtended</syntaxhighlight>
<syntaxhighlight lang="haskell">import Control.Applicative (liftA2)
strip, strip2 :: String -> String
Line 1,233:
main =
(putStrLn . unlines) $
[strip, strip2] <*> ["alphabetic 字母 with some less parochial parts"]</syntaxhighlight>
<pre>alphabetic with some less parochial parts
Line 1,240:
=={{header|Icon}} and {{header|Unicon}}==
We'll use ''deletec'' to remove unwanted characters (2nd argument) from a string (1st argument). The procedure below coerces types back and forth between string and cset. The character set of unwanted characters is the difference of all ASCII characters and the ASCII characters from 33 to 126.
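The set difference described above can be sketched in Python, used here purely for illustration. The names <code>ALL_ASCII</code>, <code>WANTED</code> and <code>UNWANTED</code> are hypothetical, the range 33 to 126 is taken literally from the description (so the space character is removed too), and <code>deletec</code> only mimics the IPL procedure of the same name.

<syntaxhighlight lang="python"># All 128 ASCII characters, and the ones to keep (33..126, as described).
ALL_ASCII = {chr(code) for code in range(128)}
WANTED = {chr(code) for code in range(33, 127)}

# The unwanted set is the difference of the two.
UNWANTED = ALL_ASCII - WANTED


def deletec(s: str, c: set) -> str:
    """Return s with every character that is a member of c removed."""
    return "".join(ch for ch in s if ch not in c)


# Characters above 127 are not in ALL_ASCII, so this sketch leaves them alone.
print(deletec("a b\tc\x7f", UNWANTED))  # prints: abc</syntaxhighlight>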
<syntaxhighlight lang="icon">procedure main(A)
link strings
{{libheader|Icon Programming Library}}
Line 1,250:
The IPL procedure ''deletec'' is equivalent to this:
<syntaxhighlight lang="icon">procedure deletec(s, c) #: delete characters
result := ""
s ? {
Line 1,256:
return result ||:= tab(0)
Line 1,263:
<syntaxhighlight lang="j">stripControlCodes=: -.&(DEL,32{.a.)
stripControlExtCodes=: ([ -. -.)&(32}.127{.a.)</syntaxhighlight>
<syntaxhighlight lang="j"> mystring=: a. {~ ?~256 NB. ascii chars 0-255 in random order
#mystring NB. length of string
Line 1,280:
stripControlExtCodes myunicodestring
k}w:]U3xEh9"GZdr/#^B.Sn%\uFOo[(`t2-J6*IA=Vf&N;lQ8,${XLz5?D0~s)'Y7Kq|ip4<WRCaM!b@cgv_T +mH>1ejPy</syntaxhighlight>
Generally speaking, <code>([-.-.)</code> gives us the contents from the sequence on the left, restricted to only the items which appear in the sequence on the right.
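The same "remove what is left after removing" trick can be sketched in Python, purely as an illustration. The helper names <code>less</code> and <code>keep_only</code> are hypothetical; <code>less</code> imitates <code>-.</code> on strings.

<syntaxhighlight lang="python">def less(left: str, right: str) -> str:
    """Imitate J's -. : the items of left that do not occur in right."""
    return "".join(c for c in left if c not in right)


def keep_only(left: str, right: str) -> str:
    """left -. (left -. right): the items of left that do occur in right."""
    return less(left, less(left, right))


# Printable ASCII, playing the role of 32}.127{.a.
PRINTABLE = "".join(map(chr, range(32, 127)))

print(keep_only("a\x00b\u00e9 c~\x7f", PRINTABLE))  # prints: ab c~</syntaxhighlight>

The inner difference collects everything that is not wanted; removing that from the original leaves the wanted items in their original order, duplicates included.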
Line 1,288:
{{works with|Java|8+}}
<syntaxhighlight lang="java">import java.util.function.IntPredicate;
public class StripControlCodes {
Line 1,302:
StringBuilder::appendCodePoint, StringBuilder::append).toString();
<pre> abcédef
Line 1,310:
===ES 5===
<syntaxhighlight lang="javascript">(function (strTest) {
// s -> s
Line 1,323:
return strip(strTest);
<syntaxhighlight lang="javascript">"abcd"</syntaxhighlight>
{{works with|jq|1.4}}
<syntaxhighlight lang="jq">def strip_control_codes:
explode | map(select(. > 31 and . != 127)) | implode;
def strip_extended_characters:
explode | map(select(31 < . and . < 127)) | implode;</syntaxhighlight>
<syntaxhighlight lang="jq">def string: "string of ☺☻♥♦⌂, may include control characters such as null(\u0000) and other ilk.\n§►↔◄\nRødgrød med fløde";
"string | strip_control_codes\n => \(string | strip_control_codes)",
"string | strip_extended_characters\n => \(string | strip_extended_characters)"</syntaxhighlight>
<syntaxhighlight lang="sh">$ jq -n -r -f Strip_control_codes_and_extended_characters.jq
string | strip_control_codes
=> string of ☺☻♥♦⌂, may include control characters such as null() and other ilk.§►↔◄Rødgrød med fløde
string | strip_extended_characters
=> string of , may include control characters such as null() and other ilk.Rdgrd med flde</syntaxhighlight>
<syntaxhighlight lang="julia">
stripc0{T<:String}(a::T) = replace(a, r"[\x00-\x1f\x7f]", "")
stripc0x{T<:String}(a::T) = replace(a, r"[^\x20-\x7e]", "")
Line 1,359:
println("\nWith C0 control characters removed:\n ", stripc0(a))
println("\nWith C0 and extended characters removed:\n ", stripc0x(a))
Line 1,375:
<syntaxhighlight lang="scala">// version 1.1.2
fun String.strip(extendedChars: Boolean = false): String {
Line 1,396:
val u = s.strip(true)
println("String = $u Length = ${u.length}")
Line 1,411:
<syntaxhighlight lang="langur">val .str = "()\x15abcd\uFFFF123\uBBBB!@#$%^&*\x01"
writeln "original : ", .str
writeln "without ctrl chars: ", replace(.str, RE/\p{Cc}/, ZLS)
writeln "print ASCII only : ", replace(.str, re/[^ -~]/, ZLS)</syntaxhighlight>
Line 1,423:
=={{header|Liberty BASIC}}==
<syntaxhighlight lang="lb">
all$ =""
for i =0 to 255
Line 1,461:
extendedStripped$ =r$
end function
<syntaxhighlight lang="lua">function Strip_Control_Codes( str )
local s = ""
for i in str:gmatch( "%C+" ) do
Line 1,488:
print( Strip_Control_Codes(q) )
print( Strip_Control_and_Extended_Codes(q) )</syntaxhighlight>
<pre> !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~€�‚ƒ„…†‡ˆ‰Š‹Œ�Ž��‘’“”•–—˜™š›œ�žŸ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ
=={{header|Mathematica}}/{{header|Wolfram Language}}==
<syntaxhighlight lang="mathematica">stripCtrl[x_]:=StringJoin[Select[Characters[x],
Line 1,521:
=={{header|MATLAB}} / {{header|Octave}}==
<syntaxhighlight lang="matlab"> function str = stripped(str)
str = str(31<str & str<127);
end; </syntaxhighlight>
<syntaxhighlight lang="nim">proc stripped(str: string): string =
result = ""
for c in str:
Line 1,539:
echo strippedControl "\ba\x00b\n\rc\fdÄ"
echo stripped "\ba\x00b\n\rc\fd\xc3"</syntaxhighlight>
Line 1,546:
<syntaxhighlight lang="ocaml">let is_control_code c =
let d = int_of_char c in
d < 32 || d = 127
Line 1,578:
print_endline (strip is_control_code s);
print_endline (strip (fun c -> (is_control_code c) || (is_extended_char c)) s);
{{works with|Free_Pascal}}
<syntaxhighlight lang="pascal">program StripCharacters(output);
function Strip (s: string; control, extended: boolean): string;
Line 1,607:
writeln ('No extnd: ', Strip(test, false, true));
writeln ('ASCII: ', Strip(test, true, true));
<pre>% ./StripCharacters
Line 1,620:
Peloton has a native instruction for removing control codes from a string, SAL, the Low ASCII Strip. From the manual:
<syntaxhighlight lang="sgml">Create variable with control characters: <@ SAYLETVARLIT>i|This string has control characters
- - - - - -
Line 1,627:
Assign infix <@ LETVARSALVAR>j|i</@> <@ SAYVAR>j</@>
Assign prepend <@ LETSALVARVAR>k|i</@> <@ SAYVAR>k</@>
Reflexive assign <@ ACTSALVAR>i</@> <@ SAYVAR>i</@></syntaxhighlight>
Peloton also has SAH, High ASCII Strip. Again, from the manual:
<syntaxhighlight lang="sgml">Create variable with high and low ANSI: <@ SAYLETVARLIT>i|This string has both low ansi and high ansi characters - il doit d'être prévenu</@>
Strip high ANSI <@ SAYSAHVAR>i</@>
Assign infix <@ LETVARSAHVAR>j|i</@> <@ SAYVAR>j</@>
Assign prepend <@ LETSAHVARVAR>k|i</@> <@ SAYVAR>k</@>
Reflexive assign <@ ACTSAHVAR>i</@> <@ SAYVAR>i</@></syntaxhighlight>
<syntaxhighlight lang="perl">#!/usr/bin/perl -w
use strict ;
Line 1,655:
print "\nWithout extended: " ;
print join( '' , map { chr( $_ ) } @noextended ) ;
print "\n" ;</syntaxhighlight>
<PRE>before sanitation : �L08&YH�O��n)�:���O�G$���.���"zO���Q�?��
Line 1,667:
to build a new one character-by-character.<br>
I credited Ada solely for the sensible fromch / toch / abovech idea.
<!--<syntaxhighlight lang="phix">(phixonline)-->
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span>
<span style="color: #7060A8;">requires</span><span style="color: #0000FF;">(</span><span style="color: #008000;">"1.0.2"</span><span style="color: #0000FF;">)</span> <span style="color: #000080;font-style:italic;">-- (param default fixes in pwa/p2js)</span>
Line 1,690:
<span style="color: #000000;">put_line</span><span style="color: #0000FF;">(</span><span style="color: #008000;">"No Control Chars:"</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">filter_it</span><span style="color: #0000FF;">(</span><span style="color: #000000;">full</span><span style="color: #0000FF;">))</span> <span style="color: #000080;font-style:italic;">-- default values for fromch, toch, and abovech</span>
<span style="color: #000000;">put_line</span><span style="color: #0000FF;">(</span><span style="color: #008000;">"\" and no Extended:"</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">filter_it</span><span style="color: #0000FF;">(</span><span style="color: #000000;">full</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">abovech</span><span style="color: #0000FF;">:=</span><span style="color: #000000;">#FF</span><span style="color: #0000FF;">))</span> <span style="color: #000080;font-style:italic;">-- defaults for fromch and toch</span>
(desktop/Phix, in a grubby Windows console)
Line 1,700:
The full string: " abcédef abcédef�", Length:10
No Control Chars: " abcédef", Length:8
" and no Extended: " abcdef", Length:7
Line 1,709:
Control characters in strings are written with a hat (^) in PicoLisp. ^? is the DEL character.
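Assuming the hat notation follows conventional caret notation, where the control character is obtained by flipping bit 6 (0x40) of the printed symbol, the boundary characters ^? and ^_ work out to code points 127 and 31. The Python sketch below is illustrative only; <code>caret</code> and <code>strip_ctrl_ext</code> are hypothetical names.

<syntaxhighlight lang="python">def caret(symbol: str) -> str:
    """Return the character written as ^symbol in conventional caret notation."""
    return chr(ord(symbol) ^ 0x40)


def strip_ctrl_ext(s: str) -> str:
    """Keep characters strictly between ^_ (31) and ^? (127)."""
    low, high = caret("_"), caret("?")
    return "".join(c for c in s if low < c < high)


print(ord(caret("?")), ord(caret("_")))          # prints: 127 31
print(strip_ctrl_ext("\ba\x00b\n\rc\fd\xc3"))    # prints: abcd</syntaxhighlight>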
<syntaxhighlight lang="picolisp">(de stripCtrl (Str)
Line 1,720:
'((C) (> "^?" C "^_"))
(chop Str) ) ) )</syntaxhighlight>
<pre>: (char "^?")
Line 1,735:
<syntaxhighlight lang="pike">> string input = random_string(100);
> (string)((array)input-enumerate(32)-enumerate(255-126,1,127));
Result: "p_xx08M]cK<FHgR3\\I.x>)Tm<VgakYddy&P7"</syntaxhighlight>
<syntaxhighlight lang="pl/i">
stripper: proc options (main);
declare s character (100) varying;
Line 1,806:
end stripper;
Line 1,816:
<syntaxhighlight lang="powershell">
function Remove-Character
Line 1,864:
<syntaxhighlight lang="powershell">
$test = "$([char]9)Français."
Line 1,873:
"Extended characters stripped : `"$($test | Remove-Character -Extended)`""
"Control & extended stripped : `"$($test | Remove-Character)`""
Line 1,881:
Control & extended stripped : "Franais."
<syntaxhighlight lang="powershell">
"Français", "Čeština" | Remove-Character -Extended
Line 1,891:
<syntaxhighlight lang="purebasic">Procedure.s stripControlCodes(source.s)
Protected i, *ptrChar.Character, length = Len(source), result.s
*ptrChar = @source
Line 1,928:
Print(#CRLF$ + #CRLF$ + "Press ENTER to exit"): Input()
Sample output:
<pre>»╫=┐C─≡G(═ç╤â√╝÷╔¬ÿ▌x  è4∞|)ï└⌐ƒ9²òτ┌ºáj)▓<~-vPÿφQ╨ù¿╖îFh"[ü╗dÉ₧q#óé├p╫■
Line 1,935:
<syntaxhighlight lang="python">stripped = lambda s: "".join(i for i in s if 31 < ord(i) < 127)
print(stripped("\ba\x00b\n\rc\fd\xc3"))</syntaxhighlight>Output:<syntaxhighlight lang="text">abcd</syntaxhighlight>
<syntaxhighlight lang="racket">
#lang racket
;; Works on both strings (Unicode) and byte strings (raw/ASCII)
Line 1,948:
(define (strip-controls-and-extended str)
(regexp-replace* #rx"[^\040-\176]+" str ""))
Line 1,954:
{{works with|Rakudo|2018.03}}
<syntaxhighlight lang="raku" line>my $str = (0..400).roll(80)».chr.join;
say $str;
say $str.subst(/<:Cc>/, '', :g); # unicode property: control character
say $str.subst(/<-[\ ..~]>/, '', :g);</syntaxhighlight>
Line 1,967:
===idiomatic version===
This REXX version processes each character in an idiomatic way &nbsp; (if it's a wanted character, then keep it).
<syntaxhighlight lang="rexx">/*REXX program strips all "control codes" from a character string (ASCII or EBCDIC). */
z= 'string of ☺☻♥♦⌂, may include control characters and other ♫☼§►↔◄░▒▓█┌┴┐±÷²¬└┬┘ilk.'
@=' !"#$%&''()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~'
Line 1,976:
say 'old = »»»'z"«««" /*add ««fence»» before & after old text*/
say 'new = »»»'$"«««" /* " " " " " new " */</syntaxhighlight>
Line 1,989:
Because there are &nbsp; (or should be) &nbsp; fewer unwanted characters than wanted characters, this version is faster.
<syntaxhighlight lang="rexx">/*REXX program strips all "control codes" from a character string (ASCII or EBCDIC). */
x= 'string of ☺☻♥♦⌂, may include control characters and other ♫☼§►↔◄░▒▓█┌┴┐±÷²¬└┬┘ilk.'
@=' !"#$%&''()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghij' || ,
Line 1,999:
say 'old = »»»' || x || "«««" /*add ««fence»» before & after old text*/
say 'new = »»»' || $ || "«««" /* " " " " " new " */</syntaxhighlight>
{{out|output|text=&nbsp; is identical to the 1<sup>st</sup> REXX version.}} <br><br>
<syntaxhighlight lang="ring">
s = char(31) + "abc" + char(13) + "def" + char(11) + "ghi" + char(10)
see strip(s) + nl
Line 2,016:
return strip
<syntaxhighlight lang="ruby">class String
def strip_control_characters()
chars.each_with_object("") do |char, str|
Line 2,035:
p s = "\ba\x00b\n\rc\fd\xc3\x7ffoo"
p s.strip_control_characters
p s.strip_control_and_extended_characters</syntaxhighlight>
Line 2,043:
=={{header|Run BASIC}}==
<syntaxhighlight lang="runbasic">s$ = chr$(31) + "abc" + chr$(13) + "def" + chr$(11) + "ghi" + chr$(10)
print strip$(s$)
Line 2,063:
end if
next i
END FUNCTION</syntaxhighlight>
input : chr$(31)+"abc"+chr$(13)+"def"+chr$(11)+"ghi"+chr$(10)
Line 2,070:
===ASCII: Using StringOps Class===
<syntaxhighlight lang="scala">val controlCode : (Char) => Boolean = (c:Char) => (c <= 32 || c == 127)
val extendedCode : (Char) => Boolean = (c:Char) => (c <= 32 || c > 127)
Line 2,081:
println( "ctrl and extended filtered out: \n\n" +
teststring.filterNot(controlCode).filterNot(extendedCode) + "\n" )</syntaxhighlight>
<pre>ctrl filtered out:
Line 2,097:
===Unicode: Using Regular Expressions===
<syntaxhighlight lang="scala">//
// A Unicode test string
Line 2,116:
val htmlNoExtCode = for( i <- sNoExtCode.indices ) yield
"&#" + sNoExtCode(i).toInt + ";" + (if( (i+1) % 10 == 0 ) "\n" else "")
println( "ctrl and extended filtered out: <br/><br/>\n\n" + htmlNoExtCode.mkString + "<br/><br/>\n" )</syntaxhighlight>
<pre>ctrl filtered out:
Line 2,156:
Unicode characters with UTF-8 encoding to the console.
<syntaxhighlight lang="seed7">$ include "seed7_05.s7i";
include "utf8.s7i";
Line 2,206:
writeln("Stripped of control codes and extended characters:");
end func;</syntaxhighlight>
Line 2,221:
<syntaxhighlight lang="ruby">var str = "\ba\x00b\n\rc\fd\xc3\x7ffoo"
var letters ={.ord}
Line 2,230:
var noextended = nocontrols.grep{ _ < 127 }
Line 2,239:
=={{header|Standard ML}}==
<syntaxhighlight lang="sml">(* string -> string *)
val stripCntrl = concat o String.tokens Char.isCntrl
(* string -> string *)
val stripCntrlAndExt = concat o String.tokens (not o Char.isPrint)</syntaxhighlight>
<syntaxhighlight lang="tcl">proc stripAsciiCC str {
regsub -all {[\u0000-\u001f\u007f]+} $str ""
proc stripCC str {
regsub -all {[^\u0020-\u007e]+} $str ""
=={{header|TI-83 BASIC}}==
Line 2,258:
The following "normal characters" do exist, but can't be typed on the calculator and a hex editor must be used to enter them:
<syntaxhighlight lang="ti83b">#$&@;_`abcdefghijklmnopqrstuvwxyz|~</syntaxhighlight>
The double quote character (ASCII decimal 34) can be entered, but cannot be escaped and thus cannot be stored to strings without the use of hex editors. The following program will remove double quotes from the input string if they were hacked in simply because having one stored to the "check" string is syntactically invalid.
Line 2,264:
So, in sum, you have to hack the calculator to enter in this program, but once it's entered you can transfer it to unhacked calculators and it will work.
<syntaxhighlight lang="ti83b">:" !#$%&'()*+,-./0123456789:;<=>?ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~"→Str0
:Input ">",Str1
Line 2,272:
:Pause Str1</syntaxhighlight>
Line 2,278:
<syntaxhighlight lang="txrlisp">(defun strip-controls (str)
(regsub #/[\x0-\x1F\x7F]+/ "" str))
(defun strip-controls-and-extended (str)
(regsub #/[^\x20-\x7F]+/ "" str))</syntaxhighlight>
Derived from the BASIC version.
<syntaxhighlight lang="vb">
Function StripCtrlCodes(s)
tmp = ""
Line 2,313:
WScript.StdOut.Write "ab�cd�ef�gh�ij†klð€" & " = " & StripCtrlCodesExtChrs("ab�cd�ef�gh�ij†klð€")
Line 2,323:
<syntaxhighlight lang="vlang">fn main() {
Line 2,331:
for value in source {if value > 31 && value < 128 {result += value.ascii_str()}}
return result
Line 2,340:
<syntaxhighlight lang="ecmascript">import "/pattern" for Pattern
var s = "\t\n\r\x01\0\fabc\v\v\b\a\x1f\x7f🌇Páez😃É"
Line 2,353:
r = p.replaceAll(s, "")
System.print("%(r) -> length %(r.count)")
Line 2,362:
<syntaxhighlight lang="xpl0">include c:\cxpl\codes; \intrinsic 'code' declarations
string 0; \use zero-terminated string convention
Line 2,390:
Strip(String, true);
Text(0, String); CrLf(0);
Line 2,402:
<syntaxhighlight lang="zkl">var ctlCodes=([1..31].pump(String,"toChar") +(127).toChar());
var extdChars=[127..255].pump(String,"toChar");
Line 2,410:
(test-extdChars).println("<< no extended chars");
(test-extdChars-ctlCodes).println("<< text");