XXXX redacted: Difference between revisions
m
→{{header|Phix}}: added syntax colouring and online link, with \t handling removed for that.
Alpha bravo (talk | contribs) (Added AutoHotkey) |
m (→{{header|Phix}}: added syntax colouring and online link, with \t handling removed for that.) |
||
Line 1,021:
Written on the assumption that overkill implies partial (see talk page).<br>
utf32_length() fashioned after [[Reverse_a_string#Phix]] with added ZWJ - I do not expect it to be entirely complete.
<!--<lang Phix>
<span style="color: #008080;">enum</span> <span style="color: #000000;">WHOLE</span><span style="color: #0000FF;">,</span><span style="color: #000000;">PARTIAL</span><span style="color: #0000FF;">,</span><span style="color: #000000;">OVERKILL</span><span style="color: #0000FF;">,</span><span style="color: #000000;">INSENSITIVE</span>
<span style="color: #008080;">constant</span> <span style="color: #000000;">spunc</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">" \r\n.?\""</span> <span style="color: #000080;font-style:italic;">-- spaces and punctuation</span>
<span style="color: #008080;">function</span> <span style="color: #000000;">utf32_length</span><span style="color: #0000FF;">(</span><span style="color: #004080;">sequence</span> <span style="color: #000000;">utf32</span><span style="color: #0000FF;">)</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">l</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">utf32</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #000000;">l</span> <span style="color: #008080;">do</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">ch</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">utf32</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]</span>
<span style="color: #008080;">if</span> <span style="color: #0000FF;">(</span><span style="color: #000000;">ch</span><span style="color: #0000FF;">>=</span><span style="color: #000000;">0x300</span> <span style="color: #008080;">and</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;"><=</span><span style="color: #000000;">0x36f</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">or</span> <span style="color: #0000FF;">(</span><span style="color: #000000;">ch</span><span style="color: #0000FF;">>=</span><span style="color: #000000;">0x1dc0</span> <span style="color: #008080;">and</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;"><=</span><span style="color: #000000;">0x1dff</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">or</span> <span style="color: #0000FF;">(</span><span style="color: #000000;">ch</span><span style="color: #0000FF;">>=</span><span style="color: #000000;">0x20d0</span> <span style="color: #008080;">and</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;"><=</span><span style="color: #000000;">0x20ff</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">or</span> <span style="color: #0000FF;">(</span><span style="color: #000000;">ch</span><span style="color: #0000FF;">>=</span><span style="color: #000000;">0xfe20</span> <span style="color: #008080;">and</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;"><=</span><span style="color: #000000;">0xfe2f</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">l</span> <span style="color: #0000FF;">-=</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">elsif</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;">=</span><span style="color: #000000;">0x200D</span> <span style="color: #008080;">then</span> <span style="color: #000080;font-style:italic;">-- ZERO WIDTH JOINER</span>
<span style="color: #000000;">l</span> <span style="color: #0000FF;">-=</span> <span style="color: #000000;">2</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<span style="color: #008080;">return</span> <span style="color: #000000;">l</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
<span style="color: #008080;">function</span> <span style="color: #000000;">redact</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">text</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">word</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">options</span><span style="color: #0000FF;">)</span>
<span style="color: #004080;">sequence</span> <span style="color: #000000;">t_utf32</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">utf8_to_utf32</span><span style="color: #0000FF;">(</span><span style="color: #000000;">text</span><span style="color: #0000FF;">),</span>
<span style="color: #000000;">l_utf32</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">t_utf32</span><span style="color: #0000FF;">,</span>
<span style="color: #000000;">w_utf32</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">utf8_to_utf32</span><span style="color: #0000FF;">(</span><span style="color: #000000;">word</span><span style="color: #0000FF;">)</span>
<span style="color: #004080;">string</span> <span style="color: #000000;">opt</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">"[?|s]"</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">options</span><span style="color: #0000FF;">></span><span style="color: #000000;">INSENSITIVE</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">options</span> <span style="color: #0000FF;">-=</span> <span style="color: #000000;">INSENSITIVE</span>
<span style="color: #000000;">opt</span><span style="color: #0000FF;">[</span><span style="color: #000000;">4</span><span style="color: #0000FF;">]</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">'i'</span>
<span style="color: #000000;">l_utf32</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">lower</span><span style="color: #0000FF;">(</span><span style="color: #000000;">t_utf32</span><span style="color: #0000FF;">)</span>
<span style="color: #000000;">w_utf32</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">lower</span><span style="color: #0000FF;">(</span><span style="color: #000000;">w_utf32</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #000000;">opt</span><span style="color: #0000FF;">[</span><span style="color: #000000;">2</span><span style="color: #0000FF;">]</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">"wpo"</span><span style="color: #0000FF;">[</span><span style="color: #000000;">options</span><span style="color: #0000FF;">]</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">idx</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">while</span> <span style="color: #004600;">true</span> <span style="color: #008080;">do</span>
<span style="color: #000000;">idx</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">match</span><span style="color: #0000FF;">(</span><span style="color: #000000;">w_utf32</span><span style="color: #0000FF;">,</span><span style="color: #000000;">l_utf32</span><span style="color: #0000FF;">,</span><span style="color: #000000;">idx</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">idx</span><span style="color: #0000FF;">=</span><span style="color: #000000;">0</span> <span style="color: #008080;">then</span> <span style="color: #008080;">exit</span> <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">edx</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">idx</span><span style="color: #0000FF;">+</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">w_utf32</span><span style="color: #0000FF;">)-</span><span style="color: #000000;">1</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">options</span><span style="color: #0000FF;">=</span><span style="color: #000000;">WHOLE</span> <span style="color: #008080;">then</span>
<span style="color: #008080;">if</span> <span style="color: #0000FF;">(</span><span style="color: #000000;">idx</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">or</span> <span style="color: #7060A8;">find</span><span style="color: #0000FF;">(</span><span style="color: #000000;">l_utf32</span><span style="color: #0000FF;">[</span><span style="color: #000000;">idx</span><span style="color: #0000FF;">-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">],</span><span style="color: #000000;">spunc</span><span style="color: #0000FF;">))</span>
<span style="color: #008080;">and</span> <span style="color: #0000FF;">(</span><span style="color: #000000;">edx</span><span style="color: #0000FF;">=</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">l_utf32</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">or</span> <span style="color: #7060A8;">find</span><span style="color: #0000FF;">(</span><span style="color: #000000;">l_utf32</span><span style="color: #0000FF;">[</span><span style="color: #000000;">edx</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">],</span><span style="color: #000000;">spunc</span><span style="color: #0000FF;">))</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">t_utf32</span><span style="color: #0000FF;">[</span><span style="color: #000000;">idx</span><span style="color: #0000FF;">..</span><span style="color: #000000;">edx</span><span style="color: #0000FF;">]</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">repeat</span><span style="color: #0000FF;">(</span><span style="color: #008000;">'X'</span><span style="color: #0000FF;">,</span><span style="color: #000000;">utf32_length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">t_utf32</span><span style="color: #0000FF;">[</span><span style="color: #000000;">idx</span><span style="color: #0000FF;">..</span><span style="color: #000000;">edx</span><span style="color: #0000FF;">]))</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">elsif</span> <span style="color: #000000;">options</span><span style="color: #0000FF;">=</span><span style="color: #000000;">PARTIAL</span>
<span style="color: #008080;">or</span> <span style="color: #000000;">options</span><span style="color: #0000FF;">=</span><span style="color: #000000;">OVERKILL</span> <span style="color: #008080;">then</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">options</span><span style="color: #0000FF;">=</span><span style="color: #000000;">OVERKILL</span> <span style="color: #008080;">then</span>
<span style="color: #008080;">while</span> <span style="color: #000000;">idx</span><span style="color: #0000FF;">></span><span style="color: #000000;">1</span> <span style="color: #008080;">and</span> <span style="color: #008080;">not</span> <span style="color: #7060A8;">find</span><span style="color: #0000FF;">(</span><span style="color: #000000;">l_utf32</span><span style="color: #0000FF;">[</span><span style="color: #000000;">idx</span><span style="color: #0000FF;">-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">],</span><span style="color: #000000;">spunc</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span> <span style="color: #000000;">idx</span> <span style="color: #0000FF;">-=</span> <span style="color: #000000;">1</span> <span style="color: #008080;">end</span> <span style="color: #008080;">while</span>
<span style="color: #008080;">while</span> <span style="color: #000000;">edx</span><span style="color: #0000FF;"><</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">l_utf32</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">and</span> <span style="color: #008080;">not</span> <span style="color: #7060A8;">find</span><span style="color: #0000FF;">(</span><span style="color: #000000;">l_utf32</span><span style="color: #0000FF;">[</span><span style="color: #000000;">edx</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">],</span><span style="color: #000000;">spunc</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span> <span style="color: #000000;">edx</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span> <span style="color: #008080;">end</span> <span style="color: #008080;">while</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #000000;">t_utf32</span><span style="color: #0000FF;">[</span><span style="color: #000000;">idx</span><span style="color: #0000FF;">..</span><span style="color: #000000;">edx</span><span style="color: #0000FF;">]</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">repeat</span><span style="color: #0000FF;">(</span><span style="color: #008000;">'X'</span><span style="color: #0000FF;">,</span><span style="color: #000000;">utf32_length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">t_utf32</span><span style="color: #0000FF;">[</span><span style="color: #000000;">idx</span><span style="color: #0000FF;">..</span><span style="color: #000000;">edx</span><span style="color: #0000FF;">]))</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #000000;">idx</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">edx</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">while</span>
<span style="color: #000000;">text</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">utf32_to_utf8</span><span style="color: #0000FF;">(</span><span style="color: #000000;">t_utf32</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">return</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">opt</span><span style="color: #0000FF;">,</span><span style="color: #000000;">text</span><span style="color: #0000FF;">}</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
<span style="color: #008080;">constant</span> <span style="color: #000000;">test</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">`
Tom? Toms bottom tomato is in his stomach while playing the "Tom-tom" brand tom-toms. That's so tom.`</span><span style="color: #0000FF;">,</span>
<span style="color: #000000;">tests</span> <span style="color: #0000FF;">=</span> <span style="color: #0000FF;">{</span><span style="color: #008000;">"Tom"</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"tom"</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"t"</span><span style="color: #0000FF;">}</span>
<span style="color: #008080;">for</span> <span style="color: #000000;">t</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">tests</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span>
<span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"Redact %s:\n"</span><span style="color: #0000FF;">,{</span><span style="color: #000000;">tests</span><span style="color: #0000FF;">[</span><span style="color: #000000;">t</span><span style="color: #0000FF;">]})</span>
<span style="color: #008080;">for</span> <span style="color: #000000;">o</span><span style="color: #0000FF;">=</span><span style="color: #000000;">WHOLE</span> <span style="color: #008080;">to</span> <span style="color: #000000;">OVERKILL</span> <span style="color: #008080;">do</span>
<span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"%s:%s\n"</span><span style="color: #0000FF;">,</span><span style="color: #000000;">redact</span><span style="color: #0000FF;">(</span><span style="color: #000000;">test</span><span style="color: #0000FF;">,</span><span style="color: #000000;">tests</span><span style="color: #0000FF;">[</span><span style="color: #000000;">t</span><span style="color: #0000FF;">],</span><span style="color: #000000;">o</span><span style="color: #0000FF;">))</span>
<span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"%s:%s\n"</span><span style="color: #0000FF;">,</span><span style="color: #000000;">redact</span><span style="color: #0000FF;">(</span><span style="color: #000000;">test</span><span style="color: #0000FF;">,</span><span style="color: #000000;">tests</span><span style="color: #0000FF;">[</span><span style="color: #000000;">t</span><span style="color: #0000FF;">],</span><span style="color: #000000;">o</span><span style="color: #0000FF;">+</span><span style="color: #000000;">INSENSITIVE</span><span style="color: #0000FF;">))</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<span style="color: #008080;">constant</span> <span style="color: #000000;">ut</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">"🧑 👨 🧔 👨👩👦"</span><span style="color: #0000FF;">,</span>
<span style="color: #000000;">fmt</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">"""
%s
Redact
Redact 👨👩👦 %s %s
"""</span>
<span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #000000;">fmt</span><span style="color: #0000FF;">,{</span><span style="color: #000000;">ut</span><span style="color: #0000FF;">}&</span><span style="color: #000000;">redact</span><span style="color: #0000FF;">(</span><span style="color: #000000;">ut</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"👨"</span><span style="color: #0000FF;">,</span><span style="color: #000000;">WHOLE</span><span style="color: #0000FF;">)&</span><span style="color: #000000;">redact</span><span style="color: #0000FF;">(</span><span style="color: #000000;">ut</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"👨👩👦"</span><span style="color: #0000FF;">,</span><span style="color: #000000;">WHOLE</span><span style="color: #0000FF;">))</span>
<!--</lang>-->
{{out}}
<pre>
Redact Tom:
Line 1,118 ⟶ 1,119:
[o|s]:Tom? Toms XXXXXX XXXXXX is in his XXXXXXX while playing XXX "XXXXXXX" brand XXXXXXXX. XXXXXX so XXX.
[o|i]:XXX? XXXX XXXXXX XXXXXX is in his XXXXXXX while playing XXX "XXXXXXX" brand XXXXXXXX. XXXXXX so XXX.
🧑 👨 🧔 👨👩👦
Redact 👨 [w|s] 🧑 XX 🧔 👨👩👦
Redact 👨👩👦 [w|s] 🧑 👨 🧔 XXXX
</pre>
You can run this online [http://phix.x10.mx/p2js/redact.htm here]. Note the windows console makes a complete mockery of those unicode characters.
=={{header|Raku}}==
|