Find words whose first and last three letters are equal: Difference between revisions
Added Python implementation
Revision as of 21:19, 3 October 2021
- Task
Using the dictionary unixdict.txt, find the words whose first and last three letters are equal.
Each word shown should have a length > 5.
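The rule stated in the task can be sketched as a small predicate. This is a minimal Python sketch, not one of the page's submitted solutions; the function name `first_last_three_equal` and the sample word list are assumptions made for illustration.

```python
def first_last_three_equal(word: str) -> bool:
    """Return True for words longer than 5 letters whose first and last three letters match."""
    return len(word) > 5 and word[:3] == word[-3:]


# Hypothetical sample; the task itself reads the words from unixdict.txt.
sample = ["murmur", "tartar", "banana", "abcab", "hotshot"]
matches = [w for w in sample if first_last_three_equal(w)]
print(matches)
```

The length test comes first so that short words are rejected before any slicing is compared.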
Ada
<lang Ada>with Ada.Text_Io;
with Ada.Strings.Fixed;

procedure Find_Three_Equals is
   use Ada.Text_Io;
   use Ada.Strings.Fixed;

   Filename : constant String := "unixdict.txt";
   File     : File_Type;
begin
   Open (File, In_File, Filename);
   while not End_Of_File (File) loop
      declare
         Word  : constant String := Get_Line (File);
         First : String renames Head (Word, 3);
         Last  : String renames Tail (Word, 3);
      begin
         if First = Last and Word'Length > 5 then
            Put_Line (Word);
         end if;
      end;
   end loop;
   Close (File);
end Find_Three_Equals;</lang>
ALGOL 68
<lang algol68># find 6 (or more) character words with the same first and last 3 letters #
IF  FILE input file;
    STRING file name = "unixdict.txt";
    open( input file, file name, stand in channel ) /= 0
THEN
    # failed to open the file #
    print( ( "Unable to open """ + file name + """", newline ) )
ELSE
    # file opened OK #
    BOOL at eof := FALSE;
    # set the EOF handler for the file #
    on logical file end( input file, ( REF FILE f )BOOL:
                                     BEGIN
                                         # note that we reached EOF on the #
                                         # latest read #
                                         at eof := TRUE;
                                         # return TRUE so processing can continue #
                                         TRUE
                                     END
                       );
    INT count := 0;
    WHILE STRING word;
          get( input file, ( word, newline ) );
          NOT at eof
    DO
        IF INT w len = ( UPB word + 1 ) - LWB word;
           w len > 5
        THEN
            IF word[ 1 : 3 ] = word[ w len - 2 : ]
            THEN
                count +:= 1;
                print( ( word, " " ) );
                IF count MOD 5 = 0
                THEN print( ( newline ) )
                ELSE FROM w len + 1 TO 14 DO print( ( " " ) ) OD
                FI
            FI
        FI
    OD;
    print( ( newline, "found ", whole( count, 0 ), " words with the same first and last 3 characters", newline ) );
    close( input file )
FI</lang>
- Output:
antiperspirant calendrical    einstein       hotshot        murmur
oshkosh        tartar         testes
found 8 words with the same first and last 3 characters
Arturo
<lang rebol>words: read.lines relative "unixdict.txt"

equalHeadTail?: function [w][
    equal? first.n: 3 w last.n: 3 w
]

loop words 'word [
    if 5 < size word [
        if equalHeadTail? word ->
            print word
    ]
]</lang>
- Output:
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
AWK
<lang AWK>
# syntax: GAWK -f FIND_WORDS_WHICH_FIRST_AND_LAST_THREE_LETTERS_ARE_EQUALS.AWK unixdict.txt
(length($0) >= 6 && substr($0,1,3) == substr($0,length($0)-2,3))
END {
    exit(0)
}
</lang>
- Output:
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
C++
<lang cpp>#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char** argv) {
    const char* filename(argc < 2 ? "unixdict.txt" : argv[1]);
    std::ifstream in(filename);
    if (!in) {
        std::cerr << "Cannot open file '" << filename << "'.\n";
        return EXIT_FAILURE;
    }
    std::string word;
    int n = 0;
    while (getline(in, word)) {
        const size_t len = word.size();
        if (len > 5 && word.compare(0, 3, word, len - 3) == 0)
            std::cout << ++n << ": " << word << '\n';
    }
    return EXIT_SUCCESS;
}</lang>
- Output:
1. antiperspirant 2. calendrical 3. einstein 4. hotshot 5. murmur 6. oshkosh 7. tartar 8. testes
F#
<lang fsharp>
// First and last three letters are equal. Nigel Galloway: February 18th., 2021
let fN g=if String.length g<6 then false else g.[..2]=g.[g.Length-3..]
seq{use n=System.IO.File.OpenText("unixdict.txt") in while not n.EndOfStream do yield n.ReadLine()}|>Seq.filter fN|>Seq.iter(printfn "%s")
</lang>
- Output:
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
Factor
Read entire file
This version reads the entire dictionary file into memory and filters it. This is the fastest version by far. Factor is optimized for making multiple passes over data; it actually takes longer if we combine the two filters into one, either with short-circuiting or non-short-circuiting and.
<lang factor>USING: io io.encodings.ascii io.files kernel math sequences ;

"unixdict.txt" ascii file-lines
[ length 5 > ] filter
[ [ 3 head-slice ] [ 3 tail-slice* ] bi = ] filter
[ print ] each</lang>
- Output:
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
Read file line by line
This version reads the dictionary file line by line and prints out words that fit the criteria. This ends up being a bit more imperative and deeply nested, but unlike the version above, we only load one word at a time, saving quite a bit of memory.
<lang factor>USING: combinators.short-circuit io io.encodings.ascii io.files
kernel math sequences ;

"unixdict.txt" ascii [
    [
        readln dup
        [
            dup
            {
                [ length 5 > ]
                [ [ 3 head-slice ] [ 3 tail-slice* ] bi = ]
            } 1&&
            [ print ] [ drop ] if
        ] when*
    ] loop
] with-file-reader</lang>
- Output:
As above.
Lazy file I/O
This version lazily reads the input file by treating a stream like a lazy list with the llines word. This allows us the nice style of the first example with the memory benefits of the second example. Unlike in the first example, combining the filters would buy us some time here, as lazy lists aren't as efficient as sequences.
<lang factor>USING: io io.encodings.ascii io.files kernel lists lists.lazy
math sequences ;

"unixdict.txt" ascii <file-reader> llines
[ length 5 > ] lfilter
[ [ 3 head-slice ] [ 3 tail-slice* ] bi = ] lfilter
[ print ] leach</lang>
- Output:
As above.
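The trade-off among the three Factor versions (load everything, read line by line, or filter lazily) is not specific to Factor. As a rough illustration in Python, with the helper name `matching_words` and the temporary sample file being assumptions of this sketch, a generator keeps the filter-pipeline style while holding only one line in memory at a time:

```python
import os
import tempfile
from typing import Iterator


def matching_words(path: str) -> Iterator[str]:
    """Lazily yield words longer than 5 letters whose first and last three letters match."""
    with open(path, "r") as handle:
        for line in handle:            # the file object yields one line at a time
            word = line.rstrip("\n")
            if len(word) > 5 and word[:3] == word[-3:]:
                yield word


# Hypothetical stand-in for unixdict.txt so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("einstein\nzebra\noshkosh\nabcabc\n")

found = list(matching_words(tmp.name))
os.remove(tmp.name)
print(found)
```

Nothing is read until the generator is consumed, here by `list(...)`; iterating over it with a `for` loop instead would avoid building the result list as well.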
Forth
<lang forth>: first-last-three-equal { addr len -- ? }
  len 5 <= if false exit then
  addr 3 addr len 3 - + 3 compare 0= ;

256 constant max-line

: main
  0 0 { count fd-in }
  s" unixdict.txt" r/o open-file throw to fd-in
  begin
    here max-line fd-in read-line throw
  while
    here swap 2dup first-last-three-equal if
      count 1+ to count
      count 1 .r ." . " type cr
    else
      2drop
    then
  repeat
  drop
  fd-in close-file throw ;

main
bye</lang>
- Output:
1. antiperspirant 2. calendrical 3. einstein 4. hotshot 5. murmur 6. oshkosh 7. tartar 8. testes
FreeBASIC
<lang freebasic>#define NULL 0

type node
    word as string*32   'enough space to store any word in the dictionary
    nxt as node ptr
end type

function addword( tail as node ptr, word as string ) as node ptr
    'allocates memory for a new node, links the previous tail to it,
    'and returns the address of the new node
    dim as node ptr newnode = allocate(sizeof(node))
    tail->nxt = newnode
    newnode->nxt = NULL
    newnode->word = word
    return newnode
end function

function length( word as string ) as uinteger
    'necessary replacement for the built-in len function, which in this
    'case would always return 32
    for i as uinteger = 1 to 32
        if asc(mid(word,i,1)) = 0 then return i-1
    next i
    return 999
end function

dim as string word
dim as node ptr tail = allocate( sizeof(node) )
dim as node ptr head = tail, curr = head, currj
dim as uinteger ln
tail->nxt = NULL
tail->word = "XXXXHEADER"

open "unixdict.txt" for input as #1
while true
    line input #1, word
    if word = "" then exit while
    if length(word)>5 then tail = addword( tail, word )
wend
close #1

while curr->nxt <> NULL
    word = curr->word
    ln = length(word)
    for i as uinteger = 1 to 3
        if mid(word,i,1) <> mid(word,ln-3+i,1) then goto nextword
    next i
    print word
    nextword:
    curr = curr->nxt
wend</lang>
- Output:
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
Go
<lang go>package main

import (
    "bytes"
    "fmt"
    "io/ioutil"
    "log"
    "unicode/utf8"
)

func main() {
    wordList := "unixdict.txt"
    b, err := ioutil.ReadFile(wordList)
    if err != nil {
        log.Fatal("Error reading file")
    }
    bwords := bytes.Fields(b)
    count := 0
    for _, bword := range bwords {
        s := string(bword)
        if utf8.RuneCountInString(s) > 5 && (s[0:3] == s[len(s)-3:]) {
            count++
            fmt.Printf("%d: %s\n", count, s)
        }
    }
}</lang>
- Output:
1: antiperspirant 2: calendrical 3: einstein 4: hotshot 5: murmur 6: oshkosh 7: tartar 8: testes
jq
Works with gojq, the Go implementation of jq
<lang jq>select(length > 5 and .[:3] == .[-3:])</lang>
- Output:
Invocation example: jq -rRM -f program.jq unixdict.txt
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
Julia
See Alternade_words#Julia for the foreachword function.
<lang julia>matchfirstlast3(word, _) = length(word) > 5 && word[1:3] == word[end-2:end] ? word : ""
foreachword("unixdict.txt", matchfirstlast3, numcols=4)</lang>
- Output:
Word source: unixdict.txt antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
Ksh
<lang ksh>#!/bin/ksh

# Find list of words (> 5 chars) where 1st 3 and last 3 letters are the same

#	# Variables:
dict='../unixdict.txt'
integer MIN_LEN=5
integer MATCH_NO=3

 ######
# main #
 ######

while read word; do
	(( ${#word} <= MIN_LEN )) && continue

	first=${word:0:${MATCH_NO}}
	last=${word:$((${#word}-MATCH_NO)):${#word}}

	[[ ${first} == ${last} ]] && print ${word}
done < ${dict}</lang>
- Output:
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
Mathematica /Wolfram Language
<lang Mathematica>dict = Once[Import["https://web.archive.org/web/20180611003215/http://www.puzzlers.org/pub/wordlists/unixdict.txt"]];
dict //= StringSplit[#, "\n"] &;
dict //= Select[StringLength /* GreaterThan[5]];
Select[dict, StringTake[#, 3] === StringTake[#, -3] &]</lang>
- Output:
{"antiperspirant", "calendrical", "einstein", "hotshot", "murmur", "oshkosh", "tartar", "testes"}
Nim
<lang Nim>for word in "unixdict.txt".lines:
  if word.len > 5:
    if word[0..2] == word[^3..^1]:
      echo word</lang>
- Output:
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
Perl
as one-liner ..
<lang perl>// 20210212 Perl programming solution

perl -ne '/(?=^(.{3}).*\1$)^.{6,}$/&&print' unixdict.txt

# minor variation

perl -ne 's/(?=^(.{3}).*\1$)^.{6,}$/print/e' unixdict.txt</lang>
Phix
with javascript_semantics
function flaste(string word)
    return length(word)>5 and word[1..3]=word[-3..-1]
end function
sequence flastes = filter(unix_dict(),flaste)
printf(1,"%d words: %s\n",{length(flastes),join(shorten(flastes,"",3))})
- Output:
8 words: antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
PL/I
<lang pli>firstAndLast3Equal: procedure options(main);
    declare dict file;
    open file(dict) title('unixdict.txt');
    on endfile(dict) stop;

    declare word char(32) varying, (first3, last3) char(3);
    do while('1'b);
        get file(dict) list(word);
        first3 = substr(word, 1, 3);
        last3 = substr(word, length(word)-2, 3);
        if length(word) > 5 & first3 = last3 then
            put skip list(word);
    end;
end firstAndLast3Equal;</lang>
- Output:
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
Python
Tested on Python 3+. The file download works only while the link is still active. You may be able to fetch the file in a browser even when the download via code fails; if so, check whether you are connected to a VPN, since the download works on open networks.
<lang Python>
import urllib.request

# Download the word list (works only while the link is still active)
urllib.request.urlretrieve("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt", "unixdict.txt")

# Read all words; the with-block closes the file when done
with open("unixdict.txt", "r") as dictionary:
    wordList = dictionary.read().split('\n')

for word in wordList:
    # Keep words longer than 5 letters whose first and last three letters match, ignoring case
    if len(word)>5 and word[:3].lower()==word[-3:].lower():
        print(word)
</lang>
- Output:
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
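Because the download can fail, one option is to fetch the file only when no local copy exists. The following is a sketch under that assumption; the helper name `load_words` is hypothetical, and the demonstration uses a temporary file rather than the network.

```python
import os
import tempfile
import urllib.request


def load_words(path: str, url: str) -> list:
    """Return the words in `path`, downloading from `url` first only if `path` is missing."""
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)   # may raise if the link is dead or blocked
    with open(path, "r") as handle:
        return handle.read().split()


# Demonstration with a local stand-in file, so no network access is needed.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("testes\nTesTES\nshort\n")

words = load_words(tmp.name, "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
os.remove(tmp.name)
hits = [w for w in words if len(w) > 5 and w[:3].lower() == w[-3:].lower()]
print(hits)
```

With a copy of unixdict.txt already in the working directory, the same call skips the network entirely.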
Quackery
<lang Quackery>  [ [] swap ]'[ swap
    witheach [
      dup nested
      unrot over do
      iff [ dip join ]
      else nip
    ] drop ]                  is filter ( [ --> [ )

  $ "unixdict.txt" sharefile drop nest$
  filter [ size 5 > ]
  filter [ 3 split -3 split nip = ]
  witheach [ echo$ cr ]</lang>
- Output:
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
Racket
<lang racket>#lang racket

(define ((prefix-and-suffix-match? len) str)
  (let ((l (string-length str)))
    (and (>= l (* 2 len))
         (string=? (substring str 0 len)
                   (substring str (- l len))))))

(module+ main
  (filter (prefix-and-suffix-match? 3) (file->lines "../../data/unixdict.txt")))</lang>
- Output:
'("antiperspirant" "calendrical" "einstein" "hotshot" "murmur" "oshkosh" "tartar" "testes")
Raku
<lang perl6># 20210210 Raku programming solution
my ( \L, \N, \IN ) = 5, 3, 'unixdict.txt';
for IN.IO.lines { .say if .chars > L and .substr(0,N) eq .substr(*-N,*) } </lang>
- Output:
antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes
REXX
This REXX version doesn't care what order the words in the dictionary are in, nor does it care what
case (lower/upper/mixed) the words are in; the search for the words is caseless.
The program verifies that the first and last three characters are, indeed, letters.
It also allows the number (3) of leading and trailing letters to be specified on the command line (CL), along with the minimum length of the
words to be searched and the dictionary file identifier.
<lang rexx>/*REXX pgm finds words in a specified dict. which have the same 1st and last 3 letters.*/
parse arg minL many iFID .                       /*obtain optional arguments from the CL*/
if minL=='' | minL==","  then minL= 6            /*   "      "         "       "   "   "  */
if many=='' | many==","  then many= 3            /*   "      "         "       "   "   "  */
if iFID=='' | iFID==","  then iFID='unixdict.txt' /*  "      "         "       "   "   "  */

              do #=1  while lines(iFID)\==0      /*read each word in the file  (word=X).*/
              x= strip( linein( iFID) )          /*pick off a word from the input line. */
              @.#= x                             /*save:  the original case of the word.*/
              end   /*#*/
#= # - 1                                         /*adjust word count because of DO loop.*/
say copies('─', 30)     #     "words in the dictionary file: "       iFID
finds= 0                                         /*word count which have matching end.  */
                                                 /*process all the words that were found*/
     do j=1  for #;          $= @.j;    upper $  /*obtain dictionary word; uppercase it.*/
     if length($)<minL  then iterate             /*Word not long enough?   Then skip it.*/
     lhs= left($, many);     rhs= right($, many) /*obtain the left & right side of word.*/
     if \datatype(lhs || rhs, 'U')  then iterate /*are the left and right side letters? */
     if lhs \== rhs                 then iterate /*Left side match right side?  No, skip*/
     finds= finds + 1                            /*bump count of matching words found.  */
     say right( left(@.j, 30),  40)              /*indent original word for readability.*/
     end        /*j*/
                                                 /*stick a fork in it,  we're all done. */
say copies('─', 30)    finds     " words found that the left "    many    ' letters match the' ,
                       "right letters which a word has a minimal length of "     minL</lang>
- output when using the default inputs:
────────────────────────────── 25104 words in the dictionary file: unixdict.txt antiperspirant calendrical einstein hotshot murmur oshkosh tartar testes ────────────────────────────── 8 words found that the left 3 letters match the right letters which a word has a minimal length of 6
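The options described for the REXX version (a configurable letter count, a minimum word length, a caseless comparison, and a check that the compared characters are letters) can be sketched in Python as follows. The function name `ends_match` and its parameter names are assumptions of this sketch, not part of the REXX program.

```python
def ends_match(word: str, min_len: int = 6, many: int = 3) -> bool:
    """Caseless test that the first and last `many` characters are equal letters."""
    if len(word) < min_len:
        return False
    upper = word.upper()                      # compare without regard to case
    lhs, rhs = upper[:many], upper[-many:]
    # Note: str.isalpha also accepts non-ASCII letters, unlike REXX's datatype(..., 'U').
    return (lhs + rhs).isalpha() and lhs == rhs


print(ends_match("Einstein"))                 # mixed-case input
print(ends_match("123x123"))                  # leading and trailing digits
print(ends_match("abab", min_len=4, many=2))  # non-default lengths
```

The defaults mirror the REXX defaults of a minimum length of 6 and 3 matching letters.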
Ring
<lang ring>
load "stdlib.ring"

cStr = read("unixdict.txt")
wordList = str2list(cStr)
num = 0

see "working..." + nl
see "Words are:" + nl

ln = len(wordList)
for n = ln to 1 step -1
    if len(wordList[n]) < 6
       del(wordList,n)
    ok
next

for n = 1 to len(wordList)
    if left(wordList[n],3) = right(wordList[n],3)
       num = num + 1
       see "" + num + ". " + wordList[n] + nl
    ok
next

see "done..." + nl
</lang>
- Output:
working... Words are: 1. antiperspirant 2. calendrical 3. einstein 4. hotshot 5. murmur 6. oshkosh 7. tartar 8. testes done...
Swift
<lang swift>import Foundation

do {
    try String(contentsOfFile: "unixdict.txt", encoding: String.Encoding.ascii)
        .components(separatedBy: "\n")
        .filter{$0.count > 5 && $0.prefix(3) == $0.suffix(3)}
        .enumerated()
        .forEach{print("\($0.0 + 1). \($0.1)")}
} catch {
    print(error.localizedDescription)
}</lang>
- Output:
1. antiperspirant 2. calendrical 3. einstein 4. hotshot 5. murmur 6. oshkosh 7. tartar 8. testes
Wren
<lang ecmascript>import "io" for File
import "/fmt" for Fmt

var wordList = "unixdict.txt" // local copy
var count = 0
File.read(wordList).trimEnd().split("\n").
    where { |w|
        return w.count > 5 && (w[0..2] == w[-3..-1])
    }.
    each { |w|
        count = count + 1
        Fmt.print("$d: $s", count, w)
    }</lang>
- Output:
1: antiperspirant 2: calendrical 3: einstein 4: hotshot 5: murmur 6: oshkosh 7: tartar 8: testes