ABC words

From Rosetta Code
Revision as of 11:51, 7 December 2020 by Petelomax (→‎{{header|Phix}}: use GT_LF_STRIPPED)
ABC words is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.
Definition

A word is an   ABC word   if the letters  "a",  "b"  and  "c"  appear in the word in alphabetical order.

If any or all of these letters occur more than once in a word, then only the first occurrence of each letter should be used to determine whether a word is an  ABC word  or not.
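
The definition amounts to comparing the positions of the first occurrences of the three letters. The following is a minimal Python sketch of that reading; the helper name `is_abc_word` is an assumption made for illustration and is not part of the task description.

```python
def is_abc_word(word):
    """Return True if the first 'a', 'b' and 'c' in word appear in that order."""
    a = word.find("a")  # index of the first "a", or -1 if absent
    b = word.find("b")  # index of the first "b", or -1 if absent
    c = word.find("c")  # index of the first "c", or -1 if absent
    # All three letters must be present and their first occurrences must ascend.
    return 0 <= a < b < c


if __name__ == "__main__":
    for w in ("aback", "cab", "back", "sabbatical"):
        print(w, is_abc_word(w))
```

Only `str.find` is used, so later repeats of a letter (for example the second "a" in "aback") never influence the result.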


Task

Show here   (on this page)   every  ABC word  in unixdict.txt.





ALGOL 68

<lang algol68># find words that have "a", "b" and "c" in order in them #
IF  FILE input file;

   STRING file name = "unixdict.txt";
   open( input file, file name, stand in channel ) /= 0

THEN

   # failed to open the file #
   print( ( "Unable to open """ + file name + """", newline ) )

ELSE

   # file opened OK #
   BOOL at eof := FALSE;
   # set the EOF handler for the file #
   on logical file end( input file, ( REF FILE f )BOOL:
                                    BEGIN
                                        # note that we reached EOF on the #
                                        # latest read #
                                        at eof := TRUE;
                                        # return TRUE so processing can continue #
                                        TRUE
                                    END
                      );
   INT abc count := 0;
   WHILE STRING word;
         get( input file, ( word, newline ) );
         NOT at eof
   DO
       IF  INT w pos := LWB word;
           INT w max  = UPB word;
           INT a pos := w max + 1;
           INT b pos := w max + 1;
           INT c pos := w max + 1;
           char in string( "a", a pos, word );
           char in string( "b", b pos, word );
           char in string( "c", c pos, word );
           a pos <  b pos
       AND b pos <  c pos
       AND c pos <= w max
       THEN
           abc count +:= 1;
           print( ( whole( abc count, -5 ), ": ", word, newline ) )
       FI
   OD;
   close( input file )

FI</lang>

Output:
    1: aback
    2: abacus
    3: abc
    4: abdicate
    5: abduct
    6: abeyance
    7: abject
    8: abreact
    9: abscess
   10: abscissa
   11: abscissae
   12: absence
   13: abstract
   14: abstracter
   15: abstractor
   16: adiabatic
   17: aerobacter
   18: aerobic
   19: albacore
   20: alberich
   21: albrecht
   22: algebraic
   23: alphabetic
   24: ambiance
   25: ambuscade
   26: aminobenzoic
   27: anaerobic
   28: arabic
   29: athabascan
   30: auerbach
   31: diabetic
   32: diabolic
   33: drawback
   34: fabric
   35: fabricate
   36: flashback
   37: halfback
   38: iambic
   39: lampblack
   40: leatherback
   41: metabolic
   42: nabisco
   43: paperback
   44: parabolic
   45: playback
   46: prefabricate
   47: quarterback
   48: razorback
   49: roadblock
   50: sabbatical
   51: snapback
   52: strabismic
   53: syllabic
   54: tabernacle
   55: tablecloth

AWK

The following one-liner, entered into a POSIX shell, returns the same 55 words as other entries.

<lang awk>awk '/^[^bc]*a[^c]*b.*c/' unixdict.txt</lang>
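
The pattern can be read in three parts: no "b" or "c" may precede the first "a", no "c" may precede a following "b", and a "c" must appear somewhere after that "b". As a rough cross-check, the Python sketch below applies the same pattern with the standard `re` module and compares it with a first-occurrence index test on a few sample words. The function names and the sample list are assumptions for illustration only.

```python
import re

# Same pattern as the AWK one-liner above.
ABC_PATTERN = re.compile(r"^[^bc]*a[^c]*b.*c")


def matches_pattern(word):
    """True if the word matches the regular expression."""
    return ABC_PATTERN.search(word) is not None


def matches_by_index(word):
    """True if the first 'a', 'b', 'c' occur in ascending positions."""
    a, b, c = (word.find(ch) for ch in "abc")
    return 0 <= a < b < c


if __name__ == "__main__":
    samples = ["aback", "cab", "bac", "abba", "sabbatical", "acb", "abc"]
    for w in samples:
        # Both formulations should agree on every sample word.
        assert matches_pattern(w) == matches_by_index(w)
        print(w, matches_pattern(w))
```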

C#

Takes an optional command-line argument for other character combinations. The user can specify any reasonable number of unique characters. Caveat: see the discussion page for an issue with specifying repeated characters.

<lang csharp>class Program {

   static void Main(string[] args) { int bi, i = 0; string chars = args.Length < 1 ? "abc" : args[0];
       foreach (var item in System.IO.File.ReadAllLines("unixdict.txt")) {
           int ai = -1; foreach (var ch in chars)
               if ((bi = item.IndexOf(ch)) > ai) ai = bi; else goto skip;
           System.Console.Write("{0,3} {1,-18} {2}", ++i, item, i % 5 == 0 ? "\n" : "");
       skip: ; } }

} </lang>

Output:

Without command line arguments:

  1 aback                2 abacus               3 abc                  4 abdicate             5 abduct
  6 abeyance             7 abject               8 abreact              9 abscess             10 abscissa
 11 abscissae           12 absence             13 abstract            14 abstracter          15 abstractor
 16 adiabatic           17 aerobacter          18 aerobic             19 albacore            20 alberich
 21 albrecht            22 algebraic           23 alphabetic          24 ambiance            25 ambuscade
 26 aminobenzoic        27 anaerobic           28 arabic              29 athabascan          30 auerbach
 31 diabetic            32 diabolic            33 drawback            34 fabric              35 fabricate
 36 flashback           37 halfback            38 iambic              39 lampblack           40 leatherback
 41 metabolic           42 nabisco             43 paperback           44 parabolic           45 playback
 46 prefabricate        47 quarterback         48 razorback           49 roadblock           50 sabbatical
 51 snapback            52 strabismic          53 syllabic            54 tabernacle          55 tablecloth

With command line argument "alw":

  1 afterglow            2 airflow              3 alewife              4 allentown            5 alleyway
  6 allow                7 allowance            8 alway                9 always              10 baldwin
 11 barlow              12 bartholomew         13 bungalow            14 caldwell            15 candlewick
 16 cauliflower         17 fallow              18 foamflower          19 galloway            20 gallows
 21 galway              22 halfway             23 hallow              24 halloween           25 hallway
 26 malawi              27 mallow              28 marlowe             29 marshmallow         30 mayflower
 31 metalwork           32 railway             33 sallow              34 saltwater           35 sandalwood
 36 shadflower          37 shallow             38 stalwart            39 tailwind            40 tallow
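
The generalisation to an arbitrary letter sequence can be sketched in Python as follows. This is an illustrative sketch rather than a translation of the C# entry, and the function name `letters_in_order` is an assumption. It requires each first-occurrence index to be strictly greater than the previous one, which is one way to see why a letter repeated in the argument can never match: both copies share the same first-occurrence index.

```python
def letters_in_order(word, letters="abc"):
    """True if the first occurrence of each letter appears in the given order."""
    previous = -1
    for ch in letters:
        pos = word.find(ch)   # -1 when the letter is absent
        if pos <= previous:   # absent, or not strictly after the previous letter
            return False
        previous = pos
    return True


if __name__ == "__main__":
    print(letters_in_order("aback"))             # default letters "abc"
    print(letters_in_order("afterglow", "alw"))  # letters from the second run
    print(letters_in_order("allow", "ll"))       # a repeated letter never matches
```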

FreeBASIC

<lang freebasic>

#define NOTINSTRING 9999

function first_occ( s as string, letter as string ) as uinteger

   for i as ubyte = 1 to len(s)
       if mid(s,i,1) = letter then return i
   next i
   return NOTINSTRING - asc(letter)

end function

function is_abc( s as string ) as boolean

   if first_occ( s, "a" ) > first_occ( s, "b" ) then return false
   if first_occ( s, "b" ) > first_occ( s, "c" ) then return false
   if first_occ( s, "c" ) > len(s) then return false
   return true

end function

dim as string word
dim as uinteger c = 0

open "unixdict.txt" for input as #1
while true

   line input #1, word
   if word="" then exit while
   if is_abc( word ) then
       c+=1
       print c;".   ";word
   end if

wend
close #1</lang>

Output:
1.   aback
2.   abacus
3.   abc
4.   abdicate
5.   abduct
6.   abeyance
7.   abject
8.   abreact
9.   abscess
10.   abscissa
11.   abscissae
12.   absence
13.   abstract
14.   abstracter
15.   abstractor
16.   adiabatic
17.   aerobacter
18.   aerobic
19.   albacore
20.   alberich
21.   albrecht
22.   algebraic
23.   alphabetic
24.   ambiance
25.   ambuscade
26.   aminobenzoic
27.   anaerobic
28.   arabic
29.   athabascan
30.   auerbach
31.   diabetic
32.   diabolic
33.   drawback
34.   fabric
35.   fabricate
36.   flashback
37.   halfback
38.   iambic
39.   lampblack
40.   leatherback
41.   metabolic
42.   nabisco
43.   paperback
44.   parabolic
45.   playback
46.   prefabricate
47.   quarterback
48.   razorback
49.   roadblock
50.   sabbatical
51.   snapback
52.   strabismic
53.   syllabic
54.   tabernacle
55.   tablecloth

Go

<lang go>package main

import (

   "bytes"
   "fmt"
   "io/ioutil"
   "log"

)

func main() {

   wordList := "unixdict.txt"
   b, err := ioutil.ReadFile(wordList)
   if err != nil {
       log.Fatal("Error reading file")
   }
   bwords := bytes.Fields(b)
   count := 0
   fmt.Println("Based on first occurrences only, the ABC words in", wordList, "are:")
   for _, bword := range bwords {
       a := bytes.IndexRune(bword, 'a')
       b := bytes.IndexRune(bword, 'b')
       c := bytes.IndexRune(bword, 'c')
       if a >= 0 && b > a && c > b {
           count++
           fmt.Printf("%2d: %s\n", count, string(bword))
       }
   }

}</lang>

Output:
Based on first occurrences only, the ABC words in unixdict.txt are:
 1: aback
 2: abacus
 3: abc
 4: abdicate
 5: abduct
 6: abeyance
 7: abject
 8: abreact
 9: abscess
10: abscissa
11: abscissae
12: absence
13: abstract
14: abstracter
15: abstractor
16: adiabatic
17: aerobacter
18: aerobic
19: albacore
20: alberich
21: albrecht
22: algebraic
23: alphabetic
24: ambiance
25: ambuscade
26: aminobenzoic
27: anaerobic
28: arabic
29: athabascan
30: auerbach
31: diabetic
32: diabolic
33: drawback
34: fabric
35: fabricate
36: flashback
37: halfback
38: iambic
39: lampblack
40: leatherback
41: metabolic
42: nabisco
43: paperback
44: parabolic
45: playback
46: prefabricate
47: quarterback
48: razorback
49: roadblock
50: sabbatical
51: snapback
52: strabismic
53: syllabic
54: tabernacle
55: tablecloth

Julia

<lang julia>function lettersinorder(dictfile, letters)

   chars = sort(collect(letters))
   for word in split(read(dictfile, String))
       positions = [findfirst(c -> c == ch, word) for ch in chars]
       all(!isnothing, positions) && issorted(positions) && println(word)
   end

end

lettersinorder("unixdict.txt", "abc")

</lang>

Output:
aback
abacus
abc
abdicate
abduct
abeyance
abject
abreact
abscess
abscissa
abscissae
absence
abstract
abstracter
abstractor
adiabatic
aerobacter
aerobic
albacore
alberich
albrecht
algebraic
alphabetic
ambiance
ambuscade
aminobenzoic
anaerobic
arabic
athabascan
auerbach
diabetic
diabolic
drawback
fabric
fabricate
flashback
halfback
iambic
lampblack
leatherback
metabolic
nabisco
paperback
parabolic
playback
prefabricate
quarterback
razorback
roadblock
sabbatical
snapback
strabismic
syllabic
tabernacle
tablecloth

Perl

Outputs the same 55 words as the other entries.

<lang perl>#!/usr/bin/perl

@ARGV = 'unixdict.txt';
print grep /^[^bc]*a[^c]*b.*c/, <>;</lang>

Phix

<lang Phix>function abc(string word)

   sequence idii = apply(true,find,{"abc",{word}})
   return find(0,idii)==0 and idii==sort(idii)

end function
sequence words = filter(get_text("demo/unixdict.txt",GT_LF_STRIPPED),abc)
printf(1,"%d abc words found: %s\n",{length(words),join(shorten(words,"",3),", ")})</lang>

Output:
55 abc words found: aback, abacus, abc, ..., syllabic, tabernacle, tablecloth

Python

Outputs the same 55 words as the other examples when entered in a POSIX terminal shell.

<lang python>python -c '
import sys
for ln in sys.stdin:
    if "a" in ln and ln.find("a") < ln.find("b") < ln.find("c"):
        print(ln.rstrip())
' < unixdict.txt</lang>

Raku

<lang perl6>put 'unixdict.txt'.IO.words».fc.grep({ (.index('a')//next) < (.index('b')//next) < (.index('c')//next) })\

   .&{"{+$_} words:\n  " ~ .batch(11)».fmt('%-12s').join: "\n  "};</lang>
Output:
55 words:
  aback        abacus       abc          abdicate     abduct       abeyance     abject       abreact      abscess      abscissa     abscissae   
  absence      abstract     abstracter   abstractor   adiabatic    aerobacter   aerobic      albacore     alberich     albrecht     algebraic   
  alphabetic   ambiance     ambuscade    aminobenzoic anaerobic    arabic       athabascan   auerbach     diabetic     diabolic     drawback    
  fabric       fabricate    flashback    halfback     iambic       lampblack    leatherback  metabolic    nabisco      paperback    parabolic   
  playback     prefabricate quarterback  razorback    roadblock    sabbatical   snapback     strabismic   syllabic     tabernacle   tablecloth 

REXX

This REXX version doesn't care what order the words in the dictionary are in,   nor does it care what
case  (lower/upper/mixed)  the words are in,   the search for the   ABC   words is   caseless.

It also allows the   (ABC)   characters to be specified on the command line (CL) as well as the dictionary file identifier.

<lang rexx>/*REXX pgm finds "ABC" words (within an identified dict.) where ABC are found in order.*/
parse arg chrs iFID .                            /*obtain optional arguments from the CL*/
if chrs=='' | chrs==","  then chrs= 'abc'        /*Not specified?  Then use the default.*/
if iFID=='' | iFID==","  then iFID='unixdict.txt' /* "      "         "   "   "     "    */
@.=                                              /*default value of any dictionary word.*/

       do #=1  while lines(iFID)\==0            /*read each word in the file  (word=X).*/
       x= strip( linein( iFID) )                /*pick off a word from the input line. */
       $.#= x;         upper x;     @.#= x      /*save: original case.                 */
       end   /*#*/                              /* [↑]   semaphore name is uppercased. */

say copies('─', 30)  #  "words in the dictionary file: "  iFID
L= length(chrs)                                  /*obtain the length of the  ABC  chars.*/
chrsU= chrs;    upper chrsU                      /*obtain an uppercase version of chrs.*/
ABCs= 0                                          /*count of the  "ABC"  words found.    */

       do j=1  for #-1                          /*process all the words that were found*/
       if verify(chrsU, @.j)>0  then iterate    /*All characters found?  No, then skip.*/
       p= 0                                     /*initialize the position location.    */
              do k=1  for L                     /*examine each letter of the ABC chars.*/
              _= pos( substr(chrsU, k, 1), @.j) /*find the position of the  Kth letter.*/
              if _<p  then iterate j            /*Less than the previous?  Then skip it*/
              p= _                              /*save the position of the last letter.*/
              end   /*k*/
       ABCs= ABCs + 1                           /*bump the count of "ABC" words found. */
       say right(left($.j, 30), 40)             /*indent original word for readability.*/
       end        /*j*/

say copies('─', 30) ABCs ' "ABC" words found using the characters: ' chrs</lang>

output   when using the default input:
────────────────────────────── 25105 words in the dictionary file:  unixdict.txt
          aback
          abacus
          abc
          abdicate
          abduct
          abeyance
          abject
          abreact
          abscess
          abscissa
          abscissae
          absence
          abstract
          abstracter
          abstractor
          adiabatic
          aerobacter
          aerobic
          albacore
          alberich
          albrecht
          algebraic
          alphabetic
          ambiance
          ambuscade
          aminobenzoic
          anaerobic
          arabic
          athabascan
          auerbach
          diabetic
          diabolic
          drawback
          fabric
          fabricate
          flashback
          halfback
          iambic
          lampblack
          leatherback
          metabolic
          nabisco
          paperback
          parabolic
          playback
          prefabricate
          quarterback
          razorback
          roadblock
          sabbatical
          snapback
          strabismic
          syllabic
          tabernacle
          tablecloth
────────────────────────────── 55  "ABC" words found using the characters:  abc
output   when using the  (vowels in order)  input:     aeiou
────────────────────────────── 25105 words in the dictionary file:  unixdict.txt
          adventitious
          facetious
────────────────────────────── 2  "ABC" words found using the characters:  aeiou
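
A caseless search of this kind can be sketched in Python by folding both the word and the letters before comparing positions. This is a minimal sketch under stated assumptions: the function name `caseless_in_order` is hypothetical, and the comparison is non-strict (equal positions are accepted), mirroring the `_<p` test in the REXX loop above rather than the strict ordering used by some other entries.

```python
def caseless_in_order(word, letters="abc"):
    """True if the first occurrences of letters appear in order, ignoring case."""
    folded = word.casefold()
    positions = [folded.find(ch) for ch in letters.casefold()]
    # Every letter must be present, and positions must not decrease.
    return -1 not in positions and positions == sorted(positions)


if __name__ == "__main__":
    print(caseless_in_order("Facetious", "AEIOU"))
    print(caseless_in_order("ABDUCT"))
    print(caseless_in_order("cab"))
```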

Ring

<lang ring>
cStr = read("unixdict.txt")
wordList = str2list(cStr)
num = 0

see "ABC words are:" + nl

for n = 1 to len(wordList)

   bool1 = substr(wordList[n],"a")
   bool2 = substr(wordList[n],"b")
   bool3 = substr(wordList[n],"c")
   bool4 = bool1 > 0 and bool2 > 0 and bool3 > 0
   bool5 = bool2 > bool1 and bool3 > bool2
   if bool4 = 1 and bool5 = 1
      num = num + 1
      see "" + num + ". " + wordList[n] + nl
   ok

next
</lang>

Output:

ABC words are:
1. aback
2. abacus
3. abc
4. abdicate
5. abduct
6. abeyance
7. abject
8. abreact
9. abscess
10. abscissa
11. abscissae
12. absence
13. abstract
14. abstracter
15. abstractor
16. adiabatic
17. aerobacter
18. aerobic
19. albacore
20. alberich
21. albrecht
22. algebraic
23. alphabetic
24. ambiance
25. ambuscade
26. aminobenzoic
27. anaerobic
28. arabic
29. athabascan
30. auerbach
31. diabetic
32. diabolic
33. drawback
34. fabric
35. fabricate
36. flashback
37. halfback
38. iambic
39. lampblack
40. leatherback
41. metabolic
42. nabisco
43. paperback
44. parabolic
45. playback
46. prefabricate
47. quarterback
48. razorback
49. roadblock
50. sabbatical
51. snapback
52. strabismic
53. syllabic
54. tabernacle
55. tablecloth

Wren

Library: Wren-fmt

<lang ecmascript>import "io" for File
import "/fmt" for Fmt

var wordList = "unixdict.txt" // local copy
var words = File.read(wordList).trimEnd().split("\n")
var count = 0
System.print("Based on first occurrences only, the ABC words in %(wordList) are:")
for (word in words) {

   var a = word.indexOf("a")
   var b = word.indexOf("b")
   var c = word.indexOf("c")
   if (a >= 0 && b > a && c > b) {
       count = count + 1
       Fmt.print("$2d: $s", count, word)
   }

}</lang>

Output:
Based on first occurrences only, the ABC words in unixdict.txt are:
 1: aback
 2: abacus
 3: abc
 4: abdicate
 5: abduct
 6: abeyance
 7: abject
 8: abreact
 9: abscess
10: abscissa
11: abscissae
12: absence
13: abstract
14: abstracter
15: abstractor
16: adiabatic
17: aerobacter
18: aerobic
19: albacore
20: alberich
21: albrecht
22: algebraic
23: alphabetic
24: ambiance
25: ambuscade
26: aminobenzoic
27: anaerobic
28: arabic
29: athabascan
30: auerbach
31: diabetic
32: diabolic
33: drawback
34: fabric
35: fabricate
36: flashback
37: halfback
38: iambic
39: lampblack
40: leatherback
41: metabolic
42: nabisco
43: paperback
44: parabolic
45: playback
46: prefabricate
47: quarterback
48: razorback
49: roadblock
50: sabbatical
51: snapback
52: strabismic
53: syllabic
54: tabernacle
55: tablecloth