Teacup rim text: Difference between revisions

From Rosetta Code
Content added Content deleted
(→‎Functional Python: Updated word source, formatted output.)
Line 436: Line 436:




# Perhaps add one variant as groupByComparing ? (across JS, AS too ?)
# groupBy :: (a -> b) -> [a] -> [[a]]
# groupBy :: (a -> b) -> [a] -> [[a]]
def groupBy(f):
def groupBy(f):

Revision as of 11:32, 5 August 2019

Teacup rim text is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

On a set of coasters we have, there's a picture of a teacup. On the rim of the teacup the word "TEA" appears a number of times separated by bullet characters. It occurred to me that if the bullet were removed and the words run together, you could start at any letter and still end up with a meaningful three-letter word. So start at the "T" and read "TEA". Start at the "E" and read "EAT", or start at the "A" and read "ATE".

That got me thinking that maybe there are other words that could be used rather that "TEA". And that's just English. What about Italian or Greek or ... um ... Telugu. For English, use the MIT 10000 word list located at https://www.mit.edu/~ecprice/wordlist.10000

So here's the task: from a web accessible or locally accessible word source, iterate through each word of length 3 or more. With each word, peel off the first letter and put it at the end. Check if the word exists. If it does, keep going with the next letter, repeating the process for as many letters as there are minus one. If all of the words exist store the original word. List each word that survives the filtration process along with all its variants. Having listed a set, for example [ate tea eat], resist displaying permutations of that set, e.g. [eat tea ate] etc.

Factor

<lang factor>USING: fry hash-sets http.client kernel math prettyprint sequences sequences.extras sets sorting splitting ;

"https://www.mit.edu/~ecprice/wordlist.10000" http-get nip "\n" split [ length 2 > ] filter [ [ all-rotations ] map ] [ >hash-set ] bi '[ [ _ in? ] all? ] filter [ natural-sort ] map members .</lang>

Output:
{
    { "aaa" "aaa" "aaa" }
    { "aim" "ima" "mai" }
    { "arc" "car" "rca" }
    { "asp" "pas" "spa" }
    { "ate" "eat" "tea" }
    { "iii" "iii" "iii" }
    { "ips" "psi" "sip" }
    { "ooo" "ooo" "ooo" }
    { "www" "www" "www" }
    { "xxx" "xxx" "xxx" }
}

Go

<lang go>package main

import (

   "bufio"
   "fmt"
   "log"
   "os"
   "sort"
   "strings"

)

func check(err error) {

   if err != nil {
       log.Fatal(err)
   }

}

func readWords(fileName string) []string {

   file, err := os.Open(fileName)
   check(err)
   defer file.Close()
   var words []string
   scanner := bufio.NewScanner(file)
   for scanner.Scan() {
       word := strings.ToLower(strings.TrimSpace(scanner.Text()))
       if len(word) >= 3 {
           words = append(words, word)
       }
   }
   check(scanner.Err())
   return words

}

func rotate(runes []rune) {

   first := runes[0]
   copy(runes, runes[1:])
   runes[len(runes)-1] = first

}

func main() {

   words := readWords("mit_10000.txt") // using local copy
   n := len(words)
   used := make(map[string]bool)

outer:

   for _, word := range words {
       runes := []rune(word)
       variants := []string{word}
       for i := 0; i < len(runes)-1; i++ {
           rotate(runes)
           word2 := string(runes)
           if used[word2] {
               continue outer
           }
           ix := sort.SearchStrings(words, word2)
           if ix == n || words[ix] != word2 {
               continue outer
           }
           variants = append(variants, word2)
       }
       for _, variant := range variants {
           used[variant] = true
       }
       fmt.Println(variants)
   }

}</lang>

Output:
[aaa aaa aaa]
[aim ima mai]
[arc rca car]
[asp spa pas]
[ate tea eat]
[iii iii iii]
[ips psi sip]
[ooo ooo ooo]
[www www www]
[xxx xxx xxx]

Haskell

Circular words of more than 2 characters in a local copy of unixdict.txt <lang haskell>import Control.Applicative (liftA2) import qualified Data.Set as S

main :: IO () main = readFile "unixdict.txt" >>= (print . circularWords . lines)

circularWords :: [String] -> [String] circularWords ws =

 let lexicon = S.fromList ws
 in filter (isCircular lexicon) ws

isCircular :: S.Set String -> String -> Bool isCircular lex w =

  2 < length w && all (`S.member` lex) (rotations w)
 

rotations :: [a] -> a rotations = liftA2 fmap rotated (enumFromTo 0 . pred . length)

rotated :: [a] -> Int -> [a] rotated [] _ = [] rotated xs n = zipWith const (drop n (cycle xs)) xs</lang>

Output:
["aaa","apt","arc","ate","car","eat","iii","pta","rca","tap","tea"]

JavaScript

Reading a local dictionary with a macOS JS for Automation library:

Works with: JXA

<lang javascript>(() => {

   'use strict';
   // main :: IO ()
   const main = () =>
       showLog(
           circularWords(
               lines(readFile('~/unixdict.txt'))
           )
       );
   // circularWords :: [String] -> [String]
   const circularWords = ws =>
       ws.filter(isCircular(new Set(ws)), ws);
   // isCircular :: Set String -> String -> Bool
   const isCircular = lexicon => w => {
       const iLast = w.length - 1;
       return 1 < iLast && until(
           ([i, bln, s]) => iLast < i || !bln,
           ([i, bln, s]) => [1 + i, lexicon.has(s), rotated(s)],
           [0, true, w]
       )[1];
   }


   // MAC OS JS FOR AUTOMATION ---------------------------
   // readFile :: FilePath -> IO String
   const readFile = fp => {
       const
           e = $(),
           uw = ObjC.unwrap,
           s = uw(
               $.NSString.stringWithContentsOfFileEncodingError(
                   $(fp)
                   .stringByStandardizingPath,
                   $.NSUTF8StringEncoding,
                   e
               )
           );
       return undefined !== s ? (
           s
       ) : uw(e.localizedDescription);
   };
   // GENERIC FUNCTIONS ----------------------------------
   // lines :: String -> [String]
   const lines = s => s.split(/[\r\n]/);
   // rotated :: String -> String
   const rotated = xs =>
       xs.slice(1) + xs[0];
   // showLog :: a -> IO ()
   const showLog = (...args) =>
       console.log(
           args
           .map(JSON.stringify)
           .join(' -> ')
       );
   // until :: (a -> Bool) -> (a -> a) -> a -> a
   const until = (p, f, x) => {
       let v = x;
       while (!p(v)) v = f(v);
       return v;
   };
   // MAIN ---
   return main();

})();</lang>

Output:
["aaa","apt","arc","ate","car","eat","iii","pta","rca","tap","tea"]

Julia

Using the MIT 10000 word list, and excluding words of less than three letters, to reduce output length. <lang julia>using HTTP

function getwords()

   req = HTTP.request("GET", "https://www.mit.edu/~ecprice/wordlist.10000")
   Dict{String, Int}((string(x), 1) for x in split(String(req.body), r"\s+"))

end

rotate(s, n) = String(circshift(Vector{UInt8}(s), n))

isliketea(w, d) = (n = length(w); n > 2 && all(i -> haskey(d, rotate(w, i)), 1:n-1))

function getteawords()

   wordlistdict = getwords()
   for word in collect(keys(wordlistdict))
       if isliketea(word, wordlistdict)
           println(word, ": ", [rotate(word, i) for i in 1:length(word)-1])
       end
   end

end

getteawords()

</lang>

Output:
pas: ["spa", "asp"]
xxx: ["xxx", "xxx"]
iii: ["iii", "iii"]
asp: ["pas", "spa"]
tea: ["ate", "eat"]
spa: ["asp", "pas"]
ate: ["eat", "tea"]
aim: ["mai", "ima"]
aaa: ["aaa", "aaa"]
car: ["rca", "arc"]
ooo: ["ooo", "ooo"]
sip: ["psi", "ips"]
arc: ["car", "rca"]
ips: ["sip", "psi"]
www: ["www", "www"]
mai: ["ima", "aim"]
rca: ["arc", "car"]
eat: ["tea", "ate"]
psi: ["ips", "sip"]
ima: ["aim", "mai"]

Lychen

Lychen is V8 JavaScript wrapped in C#, exposing C# into JavaScript.

Using https://www.mit.edu/~ecprice/wordlist.10000 as per the Julia example.

<lang javascript> const wc = new CS.System.Net.WebClient(); const lines = wc.DownloadString("https://www.mit.edu/~ecprice/wordlist.10000"); const words = lines.split(/\n/g); const collection = {}; words.filter(word => word.length > 2).forEach(word => {

 let allok = true;
 let newword = word;
 for (let i = 0; i < word.length - 1; i++) {
   newword = newword.substr(1) + newword.substr(0, 1);
   if (!words.includes(newword)) {
     allok = false;
     break;
   }
 }
 if (allok) {
   const key = word.split("").sort().join("");
   if (!collection[key]) {
     collection[key] = [word];
   } else {
     if (!collection[key].includes(word)) {
       collection[key].push(word);
     }
   }
 }

}); Object.keys(collection) .filter(key => collection[key].length > 1) .forEach(key => console.log("%s", collection[key].join(", "))); </lang>

aim, ima, mai
arc, car, rca
asp, pas, spa
ate, eat, tea
ips, psi, sip

Perl 6

Works with: Rakudo version 2019.07.1

Much of the previous exposition here is superfluous as the reference word list has changed.

There doesn't seem to be any restriction that the word needs to consist only of lowercase letters, so words of any case are included. Since the example code specifically shows the example words (TEA, EAT, ATE) in uppercase, I elected to uppercase the found words.

<lang perl6>my %words; './wordlist.10000'.IO.slurp.words.map: { .chars < 3 ?? (next) !! %words{.uc.comb.sort.join}.push: .uc };

my @teacups; my %seen;

for %words.values -> @these {

   MAYBE: for @these {
       my $maybe = $_;
       next if %seen{$_};
       my @print;
       for ^$maybe.chars {
           if $maybe ∈ @these {
               @print.push: $maybe;
               $maybe = $maybe.comb.list.rotate.join;
           } else {
               @print = ();
               last MAYBE
           }
       }
       if @print.elems {
           @teacups.push: @print;
           %seen{$_}++ for @print;
       }
   }

}

say .join(", ") for sort @teacups;</lang>

Output:
AAA, AAA, AAA
AIM, IMA, MAI
ARC, RCA, CAR
ASP, SPA, PAS
ATE, TEA, EAT
III, III, III
IPS, PSI, SIP
OOO, OOO, OOO
WWW, WWW, WWW
XXX, XXX, XXX

Python

Functional

Composing generic functions, and taking only circular words of more than two characters. <lang python>Teacup rim text

from itertools import groupby from os.path import expanduser


  1. main :: IO ()

def main():

   Circular words of more than two characters
      in a local copy of unixdict.txt
   
   print(
       showGroups(
           circularWords(
               # Local copy of
               # https://www.mit.edu/~ecprice/wordlist.10000
               lines(readFile('~/mitWords.txt'))
           )
       )
   )


  1. circularWords :: [String] -> [String]

def circularWords(ws):

   The subset of those words in the given list
      which are circular.
   
   lexicon = set(ws)
   return list(filter(isCircular(lexicon), ws))


  1. isCircular :: Set String -> String -> Bool

def isCircular(lexicon):

   True if the given word contains more than
      two characters, and all of its rotations
      are found in the lexicon.
   
   def go(w):
       def f(tpl):
           (i, _, x) = tpl
           return (1 + i, x in lexicon, rotated(x))
       iLast = len(w) - 1
       return 1 < iLast and until(
           lambda tpl: iLast < tpl[0] or (not tpl[1])
       )(f)(
           (0, True, w)
       )[1]
   return lambda s: go(s)


  1. DISPLAY -------------------------------------------------
  1. showGroups :: [String] -> String

def showGroups(xs):

   List of groups of circular words.
   return unlines([
       gp[0][0] + (
           ' -> ' + ', '.join(
               [snd(x) for x in gp[1:]]
           ) if 1 < len(gp) else 
       )
       for gp in groupBy(fst)(
           sorted(
               [(.join(sorted(x)), x) for x in xs],
               key=fst
           )
       )
   ])


  1. GENERIC -------------------------------------------------
  1. fst :: (a, b) -> a

def fst(tpl):

   First member of a pair.
   return tpl[0]


  1. groupBy :: (a -> b) -> [a] -> a

def groupBy(f):

   The elements of xs grouped,
      preserving order, by equality
      in terms of the key function f.
   
   return lambda xs: [
       list(x[1]) for x in groupby(xs, key=f)
   ]


  1. lines :: String -> [String]

def lines(s):

   A list of strings,
      (containing no newline characters)
      derived from a single new-line delimited string.
   
   return s.splitlines()


  1. readFile :: FilePath -> IO String

def readFile(fp):

   The contents of any file at the path
      derived by expanding any ~ in fp.
   
   with open(expanduser(fp), 'r', encoding='utf-8') as f:
       return f.read()


  1. rotated :: String -> String

def rotated(s):

   A list rotated 1 character to the right.
   return s[1:] + s[0]


  1. snd :: (a, b) -> b

def snd(tpl):

   Second member of a pair.
   return tpl[1]


  1. unlines :: [String] -> String

def unlines(xs):

   A single string formed by the intercalation
      of a list of strings with the newline character.
   
   return '\n'.join(xs)


  1. until :: (a -> Bool) -> (a -> a) -> a -> a

def until(p):

   The result of repeatedly applying f until p holds.
      The initial seed value is x.
   
   def go(f, x):
       v = x
       while not p(v):
           v = f(v)
       return v
   return lambda f: lambda x: go(f, x)


  1. MAIN ---

if __name__ == '__main__':

   main()</lang>
Output:
aaa
acr -> car, rca
aet -> eat, tea
aim -> ima, mai
aps -> pas, spa
iii
ips -> psi, sip
ooo
www
xxx

REXX

All words that contained non─letter (Latin) characters   (periods, decimal digits, minus signs, underbars, or embedded blanks)   weren't considered as candidates for circular words.

The dictionary wasn't assumed to be sorted in any way. <lang rexx>/*REXX pgm finds circular words (length>2), using a dictionary, suppress permutations.*/ parse arg iFID L . /*obtain optional arguments from the CL*/ if iFID==|iFID=="," then iFID= 'wordlist.10k' /*Not specified? Then use the default.*/ if L==| L=="," then L= 3 /* " " " " " " */

  1. = 0 /*number of words in dictionary, Len>L.*/

@.= /*stemmed array of non─duplicated words*/

     do r=0  while lines(iFID)\==0              /*read all lines (words) in dictionary.*/
     z= space( linein(iFID) )                   /*obtain get a word from the dictionary*/
     if length(z)<L | @.z\==  then iterate    /*length must be  L  or more,  no dups.*/
     if \datatype(z, 'M')       then iterate    /*Word contains non-letters?  Then skip*/
     @.z = z                                    /*assign a word from the dictionary.   */
     #= # + 1;    $.#=  z                       /*bump word count; append word to list.*/
     end   /*r*/                                /* [↑]  dictionary need not be sorted. */

say "There're " r ' entries in the dictionary (of all types): ' iFID say "There're " # ' words in the dictionary of at least length ' L say cw= 0 /*the number of circular words (so far)*/

     do j=1  for #;      x= $.j;      y= x      /*obtain the  Jth  word in the list.   */
     if x==  then iterate                     /*if a null, don't show variants.      */
     yy= y                                      /*the start of a list of the variants. */
                   do k=1  for length(x)-1      /*"circulate" the litters in the word. */
                   y= substr(y, 2)left(y, 1)    /*add the left letter to the right end.*/
                   if @.y==  then iterate j   /*if not a word,  then skip this word. */
                   yy= yy',' y                  /*append to the list of the variants.  */
                   if y\==x  then @.y=          /*nullify word to suppress permutations*/
                   end   /*k*/                  /* [↓] ··· except for monolithic words.*/
     cw= cw + 1                                 /*bump counter of circular words found.*/
     say 'circular word: '     yy               /*display a circular word to the term. */
     end   /*j*/

say say cw ' circular words were found.' /*stick a fork in it, we're all done. */</lang>

output   when using the default inputs:
There're  10000  entries in the dictionary (of all types):   wordlist.10k
There're  9578  words in the dictionary of at least length  3

circular word:  aaa, aaa, aaa
circular word:  aim, ima, mai
circular word:  arc, rca, car
circular word:  asp, spa, pas
circular word:  ate, tea, eat
circular word:  iii, iii, iii
circular word:  ips, psi, sip
circular word:  ooo, ooo, ooo
circular word:  www, www, www
circular word:  xxx, xxx, xxx

10  circular words were found.

zkl

<lang zkl>// this is limited to the max items a Dictionary can hold words:=File("mit_wordlist_10000.txt").pump(Dictionary().add.fp1(True),"strip"); seen :=Dictionary(); foreach word in (words.keys){

  rots,w,sz := List(), word, word.len();
  if(sz>2 and not seen.holds(word)){
     do(sz-1){ 

w=String(w[-1],w[0,-1]); // rotate one character if(not words.holds(w)) continue(2); // not a word, do next word rots.append(w); // I'd like to see all the rotations

     }
     println(rots.append(word).sort().concat(" ")); 
     rots.pump(seen.add.fp1(True));	// we've seen these rotations
  }

}</lang>

Output:
www www www
asp pas spa
ips psi sip
iii iii iii
ate eat tea
aaa aaa aaa
aim ima mai
arc car rca
xxx xxx xxx
ooo ooo ooo