Determine if a string is squeezable

From Rosetta Code
Revision as of 22:15, 22 November 2019 by rosettacode>Craigd (→‎{{header|zkl}}: UTF-8 ize)
Determine if a string is squeezable is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Determine if a character string is   squeezable.

And if so,   squeeze the string   (by removing any number of a   specified   immediately repeated   character).


This task is very similar to the task     Determine if a character string is collapsible     except that only a specified character is   squeezed   instead of any character that is immediately repeated.


If a character string has a specified   immediately repeated   character(s),   the repeated characters are to be deleted (removed),   but not the primary (1st) character(s).


A specified   immediately repeated   character is any specified character that is   immediately   followed by an identical character (or characters).   Another word choice could've been   duplicated character,   but that might have ruled out   (to some readers)   triplicated characters   ···   or more.
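The definition above reduces to a simple test: a string is squeezable on a character exactly when that character occurs twice in a row somewhere in the string. A minimal Go sketch of that test follows; the helper name `isSqueezable` is an assumption made for illustration and does not come from the task text.

```go
package main

import (
	"fmt"
	"strings"
)

// isSqueezable (hypothetical helper name) reports whether the rune c
// is immediately followed by another c anywhere in s.
// It assumes s is valid UTF-8 and c is a valid code point.
func isSqueezable(s string, c rune) bool {
	pair := string([]rune{c, c}) // the two-character sequence "cc"
	return strings.Contains(s, pair)
}

func main() {
	fmt.Println(isSqueezable("headmistressship", 's')) // true
	fmt.Println(isSqueezable("headmistressship", 'e')) // false
	fmt.Println(isSqueezable("", ' '))                 // false
}
```

Triplicated (or longer) runs need no special handling here, since any run of three or more characters also contains a run of two.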


{This Rosetta Code task was inspired by a newly introduced   (as of around November 2019)   PL/I   BIF:   squeeze.}


Examples

In the following character string with a specified   immediately repeated   character of   e:


 The better the 4-wheel drive, the further you'll be from help when ya get stuck! 


Only the 2nd   e   in   wheel   is a specified repeated character,   even though the character   e   also appears elsewhere in the character string.


So, after squeezing the string, the result would be:

 The better the 4-whel drive, the further you'll be from help when ya get stuck! 



Another example: In the following character string,   using a specified immediately repeated character   s:

 headmistressship 


The "squeezed" string would be:

 headmistreship 
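The two examples above can be reproduced with a single left-to-right pass that copies each character unless it is the specified character and the previously seen character was also the specified character. The following Go sketch is one way to do that under the assumption that the input is valid UTF-8; the function name `squeeze` and its signature are chosen for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// squeeze returns s with every run of the rune c reduced to a single c.
// All other characters, repeated or not, are copied unchanged.
func squeeze(s string, c rune) string {
	var b strings.Builder
	prevWasC := false // true when the previously seen rune was c
	for _, r := range s {
		if r == c && prevWasC {
			continue // immediately repeated specified character: drop it
		}
		b.WriteRune(r)
		prevWasC = r == c
	}
	return b.String()
}

func main() {
	fmt.Println(squeeze("headmistressship", 's')) // headmistreship
	fmt.Println(squeeze("The better the 4-wheel drive, the further you'll be from help when ya get stuck!", 'e'))
}
```

Because only the specified character is examined, the doubled `t` in "better" and the doubled `l` in "you'll" survive when squeezing on `e`.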


Task

Write a subroutine/function/procedure/routine···   to locate any   specified immediately repeated   characters and   squeeze   (delete)   them from the character string.   The character string can be processed from either direction.


Show all output here, on this page:

  •   the   specified repeated character   (to be searched for and possibly squeezed):
  •   the   original string and its length
  •   the resultant string and its length
  •   the above strings should be "bracketed" with   <<<   and   >>>   (to delineate blanks)
  •   «««Guillemets may be used instead for "bracketing" for the more artistic programmers,   shown used here»»»
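The reporting requirements listed above can be sketched in Go as a small formatting helper. The name `bracket`, the labels, and the exact line layout are assumptions made for this illustration; the task only requires that the character, both strings, and both lengths are shown, with the strings bracketed. Lengths are counted in Unicode code points rather than bytes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// bracket (hypothetical helper) formats one result line: a label, the
// length of s in code points, and s itself between <<< and >>> so that
// leading and trailing blanks stay visible.
func bracket(label, s string) string {
	return fmt.Sprintf("%s: length = %2d, string = <<<%s>>>",
		label, utf8.RuneCountInString(s), s)
}

func main() {
	// The squeezed value is written out literally here; a real solution
	// would compute it from the original string.
	fmt.Printf("specified character = %q\n", ' ')
	fmt.Println(bracket("original", "a  b"))
	fmt.Println(bracket("squeezed", "a b"))
}
```

Swapping `<<<` and `>>>` for guillemets only requires changing the format string.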


Use (at least) the following five strings;   all of them are seventy-two characters long (including blanks),   except the 1st string:

                                                                                  immediately
 string                                                                            repeated
 number                                                                            character
                                                                                     ( ↓   a blank,  a minus,  a seven,  a period)
        ╔╗
   1    ║╚═══════════════════════════════════════════════════════════════════════╗    ' '    ◄■■■■■■  a null string  (length zero)
   2    ║"If I were two-faced, would I be wearing this one?" --- Abraham Lincoln ║    '-'
   3    ║..1111111111111111111111111111111111111111111111111111111111111117777888║    '7'
   4    ║I never give 'em hell, I just tell the truth, and they think it's hell. ║    '.'
   5    ║                                                    --- Harry S Truman  ║  (below)  ◄■■■■■■  has many repeated blanks
        ╚════════════════════════════════════════════════════════════════════════╝     ↑
                                                                                       │
                                                                                       │
        For the 5th string  (Truman's signature line),  use each of these  specified immediately  repeated characters:
                                  •  a blank
                                  •  a minus
                                  •  a lowercase  r


Note:   there should be seven results shown,   one each for the 1st four strings,   and three results for the 5th string.


Related tasks




Go

<lang go>package main

import "fmt"

// Returns squeezed string, original and new lengths in
// unicode code points (not normalized).
func squeeze(s string, c rune) (string, int, int) {

   r := []rune(s)
   le, del := len(r), 0
   for i := le - 2; i >= 0; i-- {
       if r[i] == c && r[i] == r[i+1] {
           copy(r[i:], r[i+1:])
           del++
       }
   }
   if del == 0 {
       return s, le, le
   }
   r = r[:le-del]
   return string(r), le, len(r)

}

func main() {

   strings := []string{
       "",
       `"If I were two-faced, would I be wearing this one?" --- Abraham Lincoln `,
       "..1111111111111111111111111111111111111111111111111111111111111117777888",
       "I never give 'em hell, I just tell the truth, and they think it's hell. ",
       "                                                   ---  Harry S Truman  ",
       "The better the 4-wheel drive, the further you'll be from help when ya get stuck!",
       "headmistressship",
       "aardvark",
       "😍😀🙌💃😍😍😍🙌",
   }
   chars := [][]rune{{' '}, {'-'}, {'7'}, {'.'}, {' ', '-', 'r'}, {'e'}, {'s'}, {'a'}, {'😍'}}
   for i, s := range strings {
       for _, c := range chars[i] {
           ss, olen, slen := squeeze(s, c)
           fmt.Printf("specified character = %q\n", c)
           fmt.Printf("original : length = %2d, string = «««%s»»»\n", olen, s)
           fmt.Printf("squeezed : length = %2d, string = «««%s»»»\n\n", slen, ss)
       }
   }

}</lang>

Output:
specified character = ' '
original : length =  0, string = «««»»»
squeezed : length =  0, string = «««»»»

specified character = '-'
original : length = 72, string = «««"If I were two-faced, would I be wearing this one?" --- Abraham Lincoln »»»
squeezed : length = 70, string = «««"If I were two-faced, would I be wearing this one?" - Abraham Lincoln »»»

specified character = '7'
original : length = 72, string = «««..1111111111111111111111111111111111111111111111111111111111111117777888»»»
squeezed : length = 69, string = «««..1111111111111111111111111111111111111111111111111111111111111117888»»»

specified character = '.'
original : length = 72, string = «««I never give 'em hell, I just tell the truth, and they think it's hell. »»»
squeezed : length = 72, string = «««I never give 'em hell, I just tell the truth, and they think it's hell. »»»

specified character = ' '
original : length = 72, string = «««                                                   ---  Harry S Truman  »»»
squeezed : length = 20, string = ««« --- Harry S Truman »»»

specified character = '-'
original : length = 72, string = «««                                                   ---  Harry S Truman  »»»
squeezed : length = 70, string = «««                                                   -  Harry S Truman  »»»

specified character = 'r'
original : length = 72, string = «««                                                   ---  Harry S Truman  »»»
squeezed : length = 71, string = «««                                                   ---  Hary S Truman  »»»

specified character = 'e'
original : length = 80, string = «««The better the 4-wheel drive, the further you'll be from help when ya get stuck!»»»
squeezed : length = 79, string = «««The better the 4-whel drive, the further you'll be from help when ya get stuck!»»»

specified character = 's'
original : length = 16, string = «««headmistressship»»»
squeezed : length = 14, string = «««headmistreship»»»

specified character = 'a'
original : length =  8, string = «««aardvark»»»
squeezed : length =  7, string = «««ardvark»»»

specified character = '😍'
original : length =  8, string = «««😍😀🙌💃😍😍😍🙌»»»
squeezed : length =  6, string = «««😍😀🙌💃😍🙌»»»

Julia

<lang julia>const teststringpairs = [

   ("", ' '),
   (""""If I were two-faced, would I be wearing this one?" --- Abraham Lincoln """, '-'),
   ("..1111111111111111111111111111111111111111111111111111111111111117777888", '7'),
   ("""I never give 'em hell, I just tell the truth, and they think it's hell. """, '.'),
   ("                                                    --- Harry S Truman  ", ' '),
   ("                                                    --- Harry S Truman  ", '-'),
   ("                                                    --- Harry S Truman  ", 'r')]

function squeezed(s, c)

   t = isempty(s) ? "" : s[1:1]
   for x in s[2:end]
       if x != t[end] || x != c
           t *= x
       end
   end
   t

end

for (s, c) in teststringpairs

   n, t = length(s), squeezed(s, c)
   println("«««$s»»» (length $n)\n",
       s == t ? "is not squeezed, so remains" : "squeezes to", 
       ":\n«««$t»»» (length $(length(t))).\n")

end

</lang>

Output:
«««»»» (length 0)
is not squeezed, so remains:
«««»»» (length 0).

«««"If I were two-faced, would I be wearing this one?" --- Abraham Lincoln »»» (length 72)
squeezes to:
«««"If I were two-faced, would I be wearing this one?" - Abraham Lincoln »»» (length 70).

«««..1111111111111111111111111111111111111111111111111111111111111117777888»»» (length 72)
squeezes to:
«««..1111111111111111111111111111111111111111111111111111111111111117888»»» (length 69).

«««I never give 'em hell, I just tell the truth, and they think it's hell. »»» (length 72)
is not squeezed, so remains:
«««I never give 'em hell, I just tell the truth, and they think it's hell. »»» (length 72).

«««                                                    --- Harry S Truman  »»» (length 72)
squeezes to:
««« --- Harry S Truman »»» (length 20).

«««                                                    --- Harry S Truman  »»» (length 72)
squeezes to:
«««                                                    - Harry S Truman  »»» (length 70).

«««                                                    --- Harry S Truman  »»» (length 72)
squeezes to:
«««                                                    --- Hary S Truman  »»» (length 71).

Perl 6

Works with: Rakudo version 2019.07.1

<lang perl6>map {

   my $squeeze = $^phrase;
   sink $^reg;
   $squeeze ~~ s:g/($reg)$0+/$0/;
   printf "\nOriginal length: %d <<<%s>>>\nSqueezable on \"%s\": %s\nSqueezed length: %d <<<%s>>>\n",
     $phrase.chars, $phrase, $reg.uniname, $phrase ne $squeeze, $squeeze.chars, $squeeze

},

 '', ' ',
 '"If I were two-faced, would I be wearing this one?" --- Abraham Lincoln ', '-',
 '..1111111111111111111111111111111111111111111111111111111111111117777888', '7',
 "I never give 'em hell, I just tell the truth, and they think it's hell. ", '.',
 '                                                    --- Harry S Truman  ', ' ',
 '                                                    --- Harry S Truman  ', '-',
 '                                                    --- Harry S Truman  ', 'r'</lang>
Output:
Original length: 0 <<<>>>
Squeezable on "SPACE": False
Squeezed length: 0 <<<>>>

Original length: 72 <<<"If I were two-faced, would I be wearing this one?" --- Abraham Lincoln >>>
Squeezable on "HYPHEN-MINUS": True
Squeezed length: 70 <<<"If I were two-faced, would I be wearing this one?" - Abraham Lincoln >>>

Original length: 72 <<<..1111111111111111111111111111111111111111111111111111111111111117777888>>>
Squeezable on "DIGIT SEVEN": True
Squeezed length: 69 <<<..1111111111111111111111111111111111111111111111111111111111111117888>>>

Original length: 72 <<<I never give 'em hell, I just tell the truth, and they think it's hell. >>>
Squeezable on "FULL STOP": False
Squeezed length: 72 <<<I never give 'em hell, I just tell the truth, and they think it's hell. >>>

Original length: 72 <<<                                                    --- Harry S Truman  >>>
Squeezable on "SPACE": True
Squeezed length: 20 <<< --- Harry S Truman >>>

Original length: 72 <<<                                                    --- Harry S Truman  >>>
Squeezable on "HYPHEN-MINUS": True
Squeezed length: 70 <<<                                                    - Harry S Truman  >>>

Original length: 72 <<<                                                    --- Harry S Truman  >>>
Squeezable on "LATIN SMALL LETTER R": True
Squeezed length: 71 <<<                                                    --- Hary S Truman  >>>

REXX

<lang rexx>/*REXX program "squeezes" all immediately repeated characters in a string (or strings). */
@.=                                              /*define a default for the  @.  array. */
#.1= ' ';   @.1=
#.2= '-';   @.2= '"If I were two-faced, would I be wearing this one?" --- Abraham Lincoln '
#.3= '7';   @.3=  ..1111111111111111111111111111111111111111111111111111111111111111177788
#.4=  . ;   @.4= "I never give 'em hell, I just tell the truth, and they think it's hell. "
#.5= ' ';   @.5= '                                                   ---  Harry S Truman  '
#.6= '-';   @.6= @.5
#.7= 'r';   @.7= @.5
    do j=1;    L= length(@.j)                   /*obtain the length of an array element*/
    say copies('═', 105)                        /*show a separator line between outputs*/
    if j>1  &  L==0     then leave              /*if arg is null and  J>1, then leave. */
    say '    specified immediate repeatable chararacter='    #.j     "   ('"c2x(#.j)"'x)"
    say '    length='right(L, 3)     "   input=«««" || @.j || '»»»'
    new= squeeze(@.j, #.j)
      w= length(new)
    say '    length='right(w, 3)     "  output=«««" || new || '»»»'
    end   /*j*/

exit                                             /*stick a fork in it,  we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
squeeze: procedure; parse arg y 1 $ 2,z          /*get string; get immed. repeated char.*/

        if pos(z || z, y)==0  then return y     /*No repeated immediate char?  Return Y*/
                                                /* [↑]  Not really needed;  a speed─up.*/
                    do k=2  to length(y)        /*traipse through almost all the chars.*/
                    _= substr(y, k, 1)                      /*pick a character from  Y */
                    if _==right($, 1) & _==z then iterate   /*Same character?  Skip it.*/
                    $= $ || _                               /*append char., it's diff. */
                    end     /*k*/
        return $</lang>
output   when using the internal default inputs:
═════════════════════════════════════════════════════════════════════════════════════════════════════════
    specified immediate repeatable chararacter=      ('20'x)
    length=  0    input=«««»»»
    length=  0   output=«««»»»
═════════════════════════════════════════════════════════════════════════════════════════════════════════
    specified immediate repeatable chararacter= -    ('2D'x)
    length= 72    input=«««"If I were two-faced, would I be wearing this one?" --- Abraham Lincoln »»»
    length= 70   output=«««"If I were two-faced, would I be wearing this one?" - Abraham Lincoln »»»
═════════════════════════════════════════════════════════════════════════════════════════════════════════
    specified immediate repeatable chararacter= 7    ('37'x)
    length= 72    input=«««..1111111111111111111111111111111111111111111111111111111111111111177788»»»
    length= 70   output=«««..11111111111111111111111111111111111111111111111111111111111111111788»»»
═════════════════════════════════════════════════════════════════════════════════════════════════════════
    specified immediate repeatable chararacter= .    ('2E'x)
    length= 72    input=«««I never give 'em hell, I just tell the truth, and they think it's hell. »»»
    length= 72   output=«««I never give 'em hell, I just tell the truth, and they think it's hell. »»»
═════════════════════════════════════════════════════════════════════════════════════════════════════════
    specified immediate repeatable chararacter=      ('20'x)
    length= 72    input=«««                                                   ---  Harry S Truman  »»»
    length= 20   output=««« --- Harry S Truman »»»
═════════════════════════════════════════════════════════════════════════════════════════════════════════
    specified immediate repeatable chararacter= -    ('2D'x)
    length= 72    input=«««                                                   ---  Harry S Truman  »»»
    length= 70   output=«««                                                   -  Harry S Truman  »»»
═════════════════════════════════════════════════════════════════════════════════════════════════════════
    specified immediate repeatable chararacter= r    ('72'x)
    length= 72    input=«««                                                   ---  Harry S Truman  »»»
    length= 71   output=«««                                                   ---  Hary S Truman  »»»
═════════════════════════════════════════════════════════════════════════════════════════════════════════

zkl

<lang zkl>fcn squeeze(c,str){ // Works with UTF-8

  s,cc,sz,n := Data(Void,str), String(c,c), c.len(), 0; // byte buffer in case of LOTs of deletes
  while(Void != (n=s.find(cc,n))){ str=s.del(n,sz) }  // and searching is faster for big strings
  s.text

}</lang>
<lang zkl>strings:=T(
   T("",""),
   T("-","\"If I were two-faced, would I be wearing this one?\" --- Abraham Lincoln "),
   T("7","..1111111111111111111111111111111111111111111111111111111111111117777888"),
   T(" ","I never give 'em hell, I just tell the truth, and they think it's hell. "),
   T(" ","                                                    --- Harry S Truman  "),
   T("-","                                                    --- Harry S Truman  "),
   T("r","                                                   ---  Harry S Truman  "),
   T("e","The better the 4-wheel drive, the further you'll be from help when ya get stuck!"),
   T("s","headmistressship"),
   T("\Ubd;","\Ubc;\Ubd;\Ubd;\Ube;"),
);

foreach c,s in (strings){

  println("Squeeze: \"",c,"\"");
  println("Before: %2d <<<%s>>>".fmt(s.len(-8),s));
  sstr:=squeeze(c,s);
  println("After:  %2d <<<%s>>>\n".fmt(sstr.len(-8),sstr));

}</lang>

Output:
Squeeze: ""
Before:  0 <<<>>>
After:   0 <<<>>>

Squeeze: "-"
Before: 72 <<<"If I were two-faced, would I be wearing this one?" --- Abraham Lincoln >>>
After:  70 <<<"If I were two-faced, would I be wearing this one?" - Abraham Lincoln >>>

Squeeze: "7"
Before: 72 <<<..1111111111111111111111111111111111111111111111111111111111111117777888>>>
After:  69 <<<..1111111111111111111111111111111111111111111111111111111111111117888>>>

Squeeze: " "
Before: 72 <<<I never give 'em hell, I just tell the truth, and they think it's hell. >>>
After:  72 <<<I never give 'em hell, I just tell the truth, and they think it's hell. >>>

Squeeze: " "
Before: 72 <<<                                                    --- Harry S Truman  >>>
After:  20 <<< --- Harry S Truman >>>

Squeeze: "-"
Before: 72 <<<                                                    --- Harry S Truman  >>>
After:  70 <<<                                                    - Harry S Truman  >>>

Squeeze: "r"
Before: 72 <<<                                                   ---  Harry S Truman  >>>
After:  71 <<<                                                   ---  Hary S Truman  >>>

Squeeze: "e"
Before: 80 <<<The better the 4-wheel drive, the further you'll be from help when ya get stuck!>>>
After:  79 <<<The better the 4-whel drive, the further you'll be from help when ya get stuck!>>>

Squeeze: "s"
Before: 16 <<<headmistressship>>>
After:  14 <<<headmistreship>>>

Squeeze: "½"
Before:  4 <<<¼½½¾>>>
After:   3 <<<¼½¾>>>