Word break problem

From Rosetta Code
Revision as of 17:24, 20 June 2017 by rosettacode>Craigd (→‎{{header|Rust}}: added zkl header)
Word break problem is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.
Task
Given an input string and a dictionary of words, segment the input string into a space-separated sequence of dictionary words if possible.

Perl 6

Works with: Rakudo version 2017.04

This implementation does not necessarily find every combination, it returns the one with the longest matching tokens.

<lang perl6>my @words = <a bc abc cd b>; my $regex = @words.join('|');

put "$_: ", word-break($_) for <abcd abbc abcbcd acdbc abcdd>;

sub word-break (Str $word) { ($word ~~ / ^ (<$regex>)+ $ /)[0] // "Not possible" }</lang>

Output:
abcd: a b cd
abbc: a b bc
abcbcd: abc b cd
acdbc: a cd bc
abcdd: Not possible

REXX

This REXX version allows the words to be tested (and the dictionary words) to be specified on the command line. <lang rexx>/*REXX program breaks up a word (or string) into a list of words from a dictionary. */ parse arg a '/' x; $=space($); x=space(x) /*get optional args; elide extra blanks*/ if a== | a=="," then a='abcd abbc abcbcd acdbc abcdd' /*maybe use the defaults ? */ if x== | x=="," then x='a bc abc cd b' /* " " " " */ na=words(a) /*the number of words to be tested. */ nx=words(x) /* " " " " " the dictionary*/ say nx 'dictionary words:' x /*display the words in the dictionary. */ say /*display a blank line to the terminal.*/ aw=0; do i=1 for na; _=word(a,i) /*obtain a word that will be tested. */

      aw=max(aw,length(_))                      /*find widest width word being tested. */
      end   /*i*/                               /* [↑]  AW  is used to align the output*/

@.=0 /*initialize the dictionary to "null". */ xw=0; do i=1 for nx; _=word(x,i) /*obtain a word from the dictionary. */

      xw=max(xw,length(_));    @._=1            /*find widest width dictionary word.   */
      end   /*i*/                               /* [↑]  define a dictionary word.      */

p=0 /* [↓] process a word in the A list.*/

      do j=1  for na;          yy=word(a,j)     /*YY:   test a word  from the  A  list.*/
        do t=(nx+1)**(xw+1) by -1 to 1  until y==;  y=yy    /*try a word possibility.*/
        $=                                      /*nullify the (possible) result list.  */
            do try=t  while y\=               /*keep testing until  Y  is exhausted. */
            p=(try+p)//xw+1                     /*use a possible width for this attempt*/
            p=fw(y,p);  if p==0  then iterate t /*is this part of the word not found ? */
            $=$ ?                               /*It was found. Add partial to the list*/
            y=substr(y,p+1)                     /*now, use and test the rest of word.  */
            end   /*try*/
        end       /*t*/
      if t==0  then $= '(not possible)'         /*indicate that the word isn't possible*/
      say right(yy,aw)  '───►'  strip($)        /*display the result to the terminal.  */
      end   /*j*/

exit /*stick a fork in it, we're all done. */ /*──────────────────────────────────────────────────────────────────────────────────────*/ fw: parse arg z,L; do k=L by -1 for L; ?=left(z,k); if @.? then leave; end; return k</lang>

output   when using the default inputs:
5 dictionary words: a bc abc cd b

  abcd ───► a b cd
  abbc ───► a b bc
abcbcd ───► abc b cd
 acdbc ───► a cd bc
 abcdd ───► (not possible)

Rust

Dynamic programming

<lang rust>use std::collections::HashSet; fn create_string(s: &str, v: Vec<Option<usize>>) -> String {

   let mut idx = s.len();
   let mut slice_vec = vec![];
   while let Some(prev) = v[idx] {
       slice_vec.push(&s[prev..idx]); 
       idx = prev;
   }
   slice_vec.reverse();
   slice_vec.join(" ")


}

fn word_break(s: &str, dict: HashSet<&str>) -> Option<String> {

   let size = s.len() + 1;
   let mut possible = vec![None; size];
   let check_word = |i,j| dict.get(&s[i..j]).map(|_| i);
   for i in 1..size {
       possible[i] = possible[i].or_else(|| check_word(0,i));
       if possible[i].is_some() {
           for j in i+1..size {
               possible[j] = possible[j].or_else(|| check_word(i,j));
           }
           if possible[s.len()].is_some() {
               return Some(create_string(s, possible));
           }
       };
   }
   None

}

fn main() {

   let mut set = HashSet::new();
   set.insert("a");
   set.insert("bc");
   set.insert("abc");
   set.insert("cd");
   set.insert("b");
   println!("{:?}", word_break("abcd", set).unwrap());

}</lang>

Output:
"a b cd"

zkl

<lang zkl></lang> <lang zkl></lang>

Output: