Words containing "the" substring
- Task
Using the dictionary unixdict.txt, search for words containing the substring "the",
then display the words found (on this page).
Only words with a length > 11 should be shown.
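The task comes down to two tests per word: a length check and a substring check. A minimal Python sketch follows, assuming unixdict.txt holds one word per line in the working directory; the function names `the_words` and `the_words_in_file` are hypothetical and used only for illustration.

```python
from pathlib import Path


def the_words(words, substring="the", min_length=12):
    """Return the words that contain `substring` and have at least `min_length` characters."""
    return [w for w in words if len(w) >= min_length and substring in w]


def the_words_in_file(path="unixdict.txt"):
    """Read one word per line from `path` (an assumed location) and filter the lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return the_words(lines)


# Small in-memory sample so the sketch runs without the dictionary file.
sample = ["the", "mathematician", "authenticate", "encyclopedia", "weatherproof"]
print(the_words(sample))
```

Calling `the_words_in_file()` applies the same filter to the real dictionary, provided the file is present.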
11l
L(word) File(‘unixdict.txt’).read().split("\n")
   I ‘the’ C word & word.len > 11
      print(word)
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Action!
In the following solution the input file unixdict.txt is loaded from the H6 drive. The Altirra emulator automatically converts the ASCII CR/LF characters into character 155 of the ATASCII charset used by Atari 8-bit computers when one of the H6-H10 hard drives is used under DOS 2.5.
BYTE FUNC FindS(CHAR ARRAY text,sub)
BYTE i,j,found
i=1
WHILE i<=text(0)-sub(0)+1
DO
found=0
FOR j=1 TO sub(0)
DO
IF text(i+j-1)#sub(j) THEN
found=0 EXIT
ELSE
found=1
FI
OD
IF found THEN
RETURN (i)
FI
i==+1
OD
RETURN (0)
BYTE FUNC IsValidWord(CHAR ARRAY word)
IF word(0)<=11 THEN RETURN (0) FI
IF FindS(word,"the")=0 THEN RETURN(0) FI
RETURN (1)
PROC FindWords(CHAR ARRAY fname)
CHAR ARRAY line(256)
CHAR ARRAY tmp(256)
BYTE pos,dev=[1]
pos=2
Close(dev)
Open(dev,fname,4)
WHILE Eof(dev)=0
DO
InputSD(dev,line)
IF IsValidWord(line) THEN
IF pos+line(0)>=39 THEN
PutE() pos=2
FI
Print(line) Put(32)
pos==+line(0)+1
FI
OD
Close(dev)
RETURN
PROC Main()
CHAR ARRAY fname="H6:UNIXDICT.TXT"
FindWords(fname)
RETURN
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
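The line-ending conversion mentioned in the note for this solution can be illustrated with a short Python sketch. It only mimics the effect described there (CR/LF becoming the single ATASCII code 155) and is not taken from the emulator; the function name `to_atascii_lines` is hypothetical, and treating a bare LF the same way is an assumption.

```python
ATASCII_EOL = 155  # end-of-line code named in the note for this solution


def to_atascii_lines(data: bytes) -> bytes:
    """Replace CR/LF (and, by assumption, bare LF) with the single ATASCII EOL byte."""
    eol = bytes([ATASCII_EOL])
    return data.replace(b"\r\n", eol).replace(b"\n", eol)


converted = to_atascii_lines(b"weatherproof\r\nweatherstrip\r\n")
# Splitting on the EOL byte recovers the words, as InputSD does line by line.
print(converted.split(bytes([ATASCII_EOL])))
```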
Ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Strings.Fixed; use Ada.Strings.Fixed;
with Ada.Characters.Latin_1; use Ada.Characters.Latin_1;
procedure Main is
type col_count is mod 6;
package AF renames Ada.Strings.Fixed;
file_name : String := "unixdict.txt";
The_File : File_Type;
Inpt_Str : String (1 .. 40);
Length : Natural;
pattern : String := "the";
Columns : col_count := 0;
Tally : Natural := 0;
sep : constant Character := HT;
begin
Open (File => The_File, Mode => In_File, Name => file_name);
while not End_Of_File (The_File) loop
Get_Line (File => The_File, Item => Inpt_Str, Last => Length);
if Length > 11
and then
AF.Count (Source => Inpt_Str (1 .. Length), Pattern => pattern) > 0
then
Tally := Tally + 1;
Columns := Columns + 1;
Put (Inpt_Str (1 .. Length) & sep);
if Columns = 0 then
New_Line;
end if;
end if;
end loop;
New_Line;
Put_Line ("Found" & Tally'Image & " ""the"" words");
Close (The_File);
end Main;
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping Found 32 "the" words
ALGOL 68
Note: the source of files.incl.a68 is on a separate page on Rosetta Code.
If you are using a compiler/interpreter other than ALGOL 68G, a string in string
procedure is also available on Rosetta Code.
BEGIN # find 12 character (or more) words that contain "the" #
PR read "files.incl.a68" PR # include file utilities #
# prints word if it contains "the", returns TRUE if it does, FALSE if not #
PROC show the word = ( STRING word, INT count so far )BOOL:
IF INT w len = ( UPB word + 1 ) - LWB word;
w len < 12
THEN FALSE
ELIF NOT string in string( "the", NIL, word )
THEN FALSE
ELSE print( ( word, " " ) );
IF ( count so far + 1 ) MOD 6 = 0
THEN print( ( newline ) )
ELSE FROM w len + 1 TO 18 DO print( ( " " ) ) OD
FI;
TRUE
FI # show the word # ;
IF INT count = "unixdict.txt" EACHLINE show the word;
count < 0
THEN print( ( "Unable to open unixdict.txt", newline ) )
ELSE print( ( newline, "found ", whole( count, 0 ) ) );
print( ( " ""the"" words", newline ) )
FI
END
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping found 32 "the" words
AppleScript
AppleScripters can tackle this task in a variety of ways. The example handlers below are listed in order of increasing speed but all complete the task in under 0.2 seconds on my current machine. They all take a file specifier, search string, and minimum length as parameters and return identical results for the same input.
Using just the core language — 'words':
on wordsContaining(textfile, searchText, minLength)
script o
property wordList : missing value
property output : {}
end script
-- Extract the text's 'words' and return any that meet both the search text and minimum length requirements.
set o's wordList to words of (read (textfile as alias) as «class utf8»)
repeat with thisWord in o's wordList
if ((thisWord contains searchText) and (thisWord's length ≥ minLength)) then
set end of o's output to thisWord's contents
end if
end repeat
return o's output
end wordsContaining
Using just the core language — 'text items':
on wordsContaining(textFile, searchText, minLength)
script o
property textItems : missing value
property output : {}
end script
-- Extract the text's search-text-delimited sections.
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to searchText
set o's textItems to text items of (read (textFile as alias) as «class utf8»)
set AppleScript's text item delimiters to astid
-- Reconstitute any words containing the search text from the stubs at the section ends and
-- the search text itself, returning any results which meet the minimum length requirement.
set thisSection to beginning of o's textItems
set sectionHasWords to ((count thisSection's words) > 0)
considering white space
repeat with i from 2 to (count o's textItems)
set foundWord to searchText
if (sectionHasWords) then
set thisStub to thisSection's last word
if (thisSection ends with thisStub) then set foundWord to thisStub & foundWord
end if
set thisSection to item i of o's textItems
set sectionHasWords to ((count thisSection's words) > 0)
if (sectionHasWords) then
set thisStub to thisSection's first word
if (thisSection begins with thisStub) then set foundWord to foundWord & thisStub
end if
if (foundWord's length ≥ minLength) then set end of o's output to foundWord
end repeat
end considering
return o's output
end wordsContaining
Using a shell script:
on wordsContaining(textFile, searchText, minLength)
-- Set up and execute a shell script which uses grep to find words containing the search text
-- (matching AppleScript's current case-sensitivity setting) and awk to pass those which
-- satisfy the minimum length requirement.
if ("A" = "a") then
set part1 to "grep -io "
else
set part1 to "grep -o "
end if
set shellCode to part1 & quoted form of ("\\b\\w*" & searchText & "\\w*\\b") & ¬
(" <" & quoted form of textFile's POSIX path) & ¬
(" | awk " & quoted form of ("// && length($0) >= " & minLength))
return paragraphs of (do shell script shellCode)
end wordsContaining
Using Foundation methods (AppleScriptObjC):
use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions
on wordsContaining(textFile, searchText, minLength)
set theText to current application's class "NSMutableString"'s ¬
stringWithContentsOfFile:(textFile's POSIX path) usedEncoding:(missing value) |error|:(missing value)
-- Replace every run of non AppleScript 'word' characters with a linefeed.
tell theText to replaceOccurrencesOfString:("(?:[\\W--[.'’]]|(?<!\\w)[.'’]|[.'’](?!\\w))++") withString:(linefeed) ¬
options:(current application's NSRegularExpressionSearch) range:({0, its |length|()})
-- Split the text at the linefeeds.
set theWords to theText's componentsSeparatedByString:(linefeed)
-- Filter the resulting array for strings which meet the search text and minimum length requirements,
-- matching AppleScript's current case-sensitivity setting. NSString lengths are measured in 16-bit
-- code units so use regex to check the lengths in characters.
if ("A" = "a") then
set filterTemplate to "((self CONTAINS[c] %@) && (self MATCHES %@))"
else
set filterTemplate to "((self CONTAINS %@) && (self MATCHES %@))"
end if
set filter to current application's class "NSPredicate"'s ¬
predicateWithFormat_(filterTemplate, searchText, ".{" & minLength & ",}+")
return (theWords's filteredArrayUsingPredicate:(filter)) as list
end wordsContaining
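The comment in the Foundation handler about NSString lengths being measured in 16-bit code units can be seen with a small Python sketch. It is not part of the AppleScript solution; the helper name `utf16_units` is hypothetical, and the sample word is made up to include a character outside the Basic Multilingual Plane.

```python
def utf16_units(s: str) -> int:
    """Number of 16-bit code units in the UTF-16 encoding of `s`."""
    return len(s.encode("utf-16-le")) // 2


word = "weather\U0001F600"  # 7 letters plus one character outside the BMP
# Python's len() counts code points; a UTF-16 length counts code units.
print(len(word), utf16_units(word))
```

For plain ASCII words such as those in unixdict.txt the two counts agree, so the distinction only matters for other input.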
Test code for the task with any of the above:
local textFile, output
set textFile to ((path to desktop as text) & "unixdict.txt") as «class furl»
-- considering case -- Uncomment this and the corresponding 'end' line for case-sensitive searches.
set output to wordsContaining(textFile, "the", 12)
-- end considering
return {count output, output}
- Output:
{32, {"authenticate", "chemotherapy", "chrysanthemum", "clothesbrush", "clotheshorse", "eratosthenes", "featherbedding", "featherbrain", "featherweight", "gaithersburg", "hydrothermal", "lighthearted", "mathematician", "neurasthenic", "nevertheless", "northeastern", "northernmost", "otherworldly", "parasympathetic", "physiotherapist", "physiotherapy", "psychotherapeutic", "psychotherapist", "psychotherapy", "radiotherapy", "southeastern", "southernmost", "theoretician", "weatherbeaten", "weatherproof", "weatherstrip", "weatherstripping"}}
Arturo
print.lines
select read.lines relative "unixdict.txt" 'l ->
and? [11 < size l]
[contains? l "the"]
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
AutoHotkey
FileRead, wList, % A_Desktop "\unixdict.txt"
SubString := "the"
list := ContainSubStr(wList, SubString)
for i, v in list
result .= i "- " v "`n"
MsgBox, 262144, , % result
return
ContainSubStr(wList, SubString){
oRes := []
for i, w in StrSplit(wList, "`n", "`r")
{
if (StrLen(w) < 12 || !InStr(w, SubString))
continue
oRes.Push(w)
}
return oRes
}
- Output:
1- authenticate 2- chemotherapy 3- chrysanthemum 4- clothesbrush 5- clotheshorse 6- eratosthenes 7- featherbedding 8- featherbrain 9- featherweight 10- gaithersburg 11- hydrothermal 12- lighthearted 13- mathematician 14- neurasthenic 15- nevertheless 16- northeastern 17- northernmost 18- otherworldly 19- parasympathetic 20- physiotherapist 21- physiotherapy 22- psychotherapeutic 23- psychotherapist 24- psychotherapy 25- radiotherapy 26- southeastern 27- southernmost 28- theoretician 29- weatherbeaten 30- weatherproof 31- weatherstrip 32- weatherstripping
AutoIt
; Includes not needed if you don't want to use the constants
#include <FileConstants.au3>
#include <StringConstants.au3>
#include <MsgBoxConstants.au3>
;Initialise some variables and constants
Local Const $sFileName = "unixdict.txt"
Local Const $sStrToFind = "the"
Local $iFoundResults = 0
; Open the file for reading and store the handle to a variable.
Local $hFileOpen = FileOpen($sFileName, $FO_READ)
If $hFileOpen = -1 Then
MsgBox($MB_SYSTEMMODAL, "", "An error occurred when reading the file.")
Return False
EndIf
; Read the contents of the file using the handle returned by FileOpen.
Local $sFileRead = FileRead($hFileOpen)
; Close the handle returned by FileOpen.
FileClose($hFileOpen)
; Get each "word" that's on a new line
Local $aArray = StringSplit($sFileRead, @CRLF)
; Loop through the array returned by StringSplit to check the length and whether it contains the "the" substring.
For $i = 1 To $aArray[0]
If StringLen($aArray[$i]) > 11 Then
If StringInStr($aArray[$i], $sStrToFind) <> 0 Then
; Increment the found results counter
$iFoundResults += 1
; Log the output
ConsoleWrite($aArray[$i])
ConsoleWrite(@CRLF)
EndIf
EndIf
Next
ConsoleWrite("Found " & $iFoundResults & " words containing '" & $sStrToFind & "'")
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping Found 32 words containing 'the'
AWK
The following is an awk one-liner entered at a Posix shell.
/Code$ awk '/the/ && length($1) > 11' unixdict.txt
authenticate
chemotherapy
chrysanthemum
clothesbrush
clotheshorse
eratosthenes
featherbedding
featherbrain
featherweight
gaithersburg
hydrothermal
lighthearted
mathematician
neurasthenic
nevertheless
northeastern
northernmost
otherworldly
parasympathetic
physiotherapist
physiotherapy
psychotherapeutic
psychotherapist
psychotherapy
radiotherapy
southeastern
southernmost
theoretician
weatherbeaten
weatherproof
weatherstrip
weatherstripping
/Code$
BASIC
10 OPEN "I",1,"unixdict.txt"
20 IF EOF(1) THEN CLOSE #1: END
30 LINE INPUT #1,W$
40 IF LEN(W$)>11 AND INSTR(W$,"the") THEN PRINT W$
50 GOTO 20
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
BASIC256
f = freefile
open f, "i:\unixdict.txt"
while not eof(f)
a$ = read (f)
if length(a$) > 11 and instr(a$, "the") then print a$
end while
close f
- Output:
Same as BASIC entry.
GW-BASIC
10 OPEN "unixdict.txt" FOR INPUT AS #1
20 WHILE NOT EOF(1)
30 LINE INPUT #1, A$
40 IF LEN(A$) > 11 AND INSTR(A$,"the") THEN PRINT A$
50 WEND
60 CLOSE #1
70 END
- Output:
Same as BASIC entry.
QBasic
OPEN "unixdict.txt" FOR INPUT AS #1
WHILE NOT EOF(1)
LINE INPUT #1, W$
IF LEN(W$) > 11 AND INSTR(W$, "the") THEN PRINT W$
WEND
CLOSE #1
END
- Output:
Same as BASIC entry.
BCPL
get "libhdr"
let read(word) = valof
$( let ch = ?
word%0 := 0
$( ch := rdch()
if ch = endstreamch then resultis false
word%0 := word%0 + 1
word%(word%0) := ch
$) repeatuntil ch = '*N'
resultis true
$)
let contains(s1,s2) = valof
$( for i=1 to s1%0-s2%0+1
$( for j=1 to s2%0
unless s1%(i+j-1)=s2%j goto next
resultis true
next: loop
$)
resultis false
$)
// We need to test for a length of 12 rather than 11,
// because the newline character is included.
let match(word) = word%0 > 12 & contains(word,"the")
let start() be
$( let word = vec 63
let file = findinput("unixdict.txt")
test file=0 do
writes("Cannot open unixdict.txt*N")
or
$( selectinput(file)
while read(word) if match(word) do writes(word)
endread()
$)
$)
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
BQN
•Show ∘‿4⥊ (1∊"the"⍷>)˘⊸/ (11<≠¨)⊸/ •Flines "unixdict.txt"
- Output:
┌─ ╵ "authenticate" "chemotherapy" "chrysanthemum" "clothesbrush" "clotheshorse" "eratosthenes" "featherbedding" "featherbrain" "featherweight" "gaithersburg" "hydrothermal" "lighthearted" "mathematician" "neurasthenic" "nevertheless" "northeastern" "northernmost" "otherworldly" "parasympathetic" "physiotherapist" "physiotherapy" "psychotherapeutic" "psychotherapist" "psychotherapy" "radiotherapy" "southeastern" "southernmost" "theoretician" "weatherbeaten" "weatherproof" "weatherstrip" "weatherstripping" ┘
C
#include <stdio.h>
#include <string.h>
int main() {
char word[128];
FILE *f = fopen("unixdict.txt","r");
if (!f) {
fprintf(stderr, "Cannot open unixdict.txt\n");
return -1;
}
// Loop on fgets() itself: testing feof() first would process the last line twice
while (fgets(word, sizeof(word), f)) {
// fgets() includes the \n character, so we need to test
// for a length of 12 (11 letters plus the newline)
if (strlen(word) > 12 && strstr(word,"the"))
printf("%s",word);
}
fclose(f);
return 0;
}
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
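The C entry (like the BCPL one) compares against 12 rather than 11 because the line it measures still carries its trailing newline. A Python sketch of the same effect, using an in-memory stream in place of the file; the function name `long_the_lines` and the two sample words are illustrative assumptions.

```python
import io


def long_the_lines(stream, min_letters=12):
    """Collect (word, raw_length) for lines with at least `min_letters` letters containing "the"."""
    kept = []
    for line in stream:            # like fgets(), iteration keeps the trailing "\n"
        raw_length = len(line)     # letters plus the newline
        word = line.rstrip("\n")   # letters only
        if len(word) >= min_letters and "the" in word:
            kept.append((word, raw_length))
    return kept


# "weatherproof" has 12 letters (raw length 13); "weathercock" has 11 (raw length 12).
demo = io.StringIO("weatherproof\nweathercock\n")
print(long_the_lines(demo))
```

An 11-letter word has a raw length of 12, so a test of raw length > 12 is what excludes it.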
C++
#include <iostream>
#include <fstream>
#include <string>
int main() {
std::string word;
std::ifstream file("unixdict.txt");
if (!file) {
std::cerr << "Cannot open unixdict.txt" << std::endl;
return -1;
}
while (file >> word) {
if (word.length() > 11 && word.find("the") != std::string::npos)
std::cout << word << std::endl;
}
return 0;
}
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Common Lisp
(defun print-words-containing-substring (str len path)
(with-open-file (s path :direction :input)
(do ((line (read-line s nil :eof) (read-line s nil :eof)))
((eql line :eof)) (when (and (> (length line) len)
(search str line))
(format t "~a~%" line)))))
(print-words-containing-substring "the" 11 "unixdict.txt")
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping NIL
Delphi
program Words_containing_the_substring;
{$APPTYPE CONSOLE}
uses
System.SysUtils,
System.IOUtils;
var
Words, WordsFound: TArray<string>;
begin
Words := TFile.ReadAllLines('unixdict.txt');
for var w in Words do
begin
if (w.Length > 11) and (w.IndexOf('the') > -1) then
begin
SetLength(WordsFound, Length(WordsFound) + 1);
WordsFound[High(WordsFound)] := w;
end;
end;
writeln('Words containing "the" having a length > 11 in unixdict.txt:');
for var i := 0 to High(WordsFound) do
writeln(i + 1: 2, ': ', WordsFound[i]);
readln;
end.
Draco
\util.g
proc theword(*char line) bool:
CharsLen(line) > 11
and CharsIndex(line, "the") ~= -1
corp
proc nonrec main() void:
file(1024) dictfile;
[32] char buf;
*char line;
channel input text dict;
open(dict, dictfile, "unixdict.txt");
line := &buf[0];
while readln(dict; line) do
if theword(line) then writeln(line) fi
od;
close(dict)
corp
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
DuckDB
In this entry, the fact that an on-the-fly table can be referenced within a function is used to avoid having to create a named table while still allowing a certain measure of functional abstraction. The alternative would involve using query_table(), which was only introduced in DuckDB 1.1.
-- Use an in-scope dictionary, dictionary(word), as the dictionary
create or replace function the_words() as table (
FROM dictionary
WHERE contains(word, 'the')
ORDER BY word
);
with dictionary as (
SELECT word
FROM read_csv('unixdict.txt', header=false, sep='',
columns={'word': VARCHAR}, auto_detect=false)
where length(word)>11
)
from the_words();
- Output:
(elided)
┌───────────────────┐ │ word │ │ varchar │ ├───────────────────┤ │ authenticate │ │ chemotherapy │ .... │ weatherstrip │ │ weatherstripping │ ├───────────────────┤ │ 32 rows │ └───────────────────┘
EasyLang
repeat
s$ = input
until s$ = ""
if len s$ > 11
if strpos s$ "the" <> 0
print s$
.
.
.
# the content of unixdict.txt
input_data
10th
.
mathematician
.
ed
First remove the short lines, then remove the ones without "the".
# by Artyom Bologov
H
v/.\{12,\}/d
v/the/d
,p
Q
- Output:
$ ed -s unixdict.txt < the-substring.ed authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
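The two deletion passes of the ed script can be mirrored in Python. This is only a sketch of the same idea, not a translation of ed itself; the function name `ed_style_filter` is hypothetical.

```python
import re


def ed_style_filter(lines):
    """Apply the two passes of the ed script to a list of lines."""
    # v/.\{12,\}/d : delete every line that does not contain 12 or more characters
    lines = [ln for ln in lines if re.search(r".{12,}", ln)]
    # v/the/d : delete every remaining line that does not contain "the"
    lines = [ln for ln in lines if re.search(r"the", ln)]
    return lines


print(ed_style_filter(["10th", "mathematician", "encyclopedia", "weather"]))
```

The order of the passes does not change the result; removing the short lines first just leaves fewer lines for the second pass.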
Factor
USING: io io.encodings.ascii io.files kernel math sequences ;
"unixdict.txt" ascii file-lines
[ length 11 > ] filter
[ "the" swap subseq? ] filter
[ print ] each
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Forth
Developed with Gforth 0.7.9
11 constant WordLen
128 constant max-line
create SearchSub 80 allot
Create SrcFile 256 allot
Variable fhin
variable Cnt
: SrcOpen Srcfile count r/o open-file throw Fhin ! ;
: SrcClose fhin @ close-file throw ;
: third >r over r> swap ;
: cnt++ cnt 1 swap +! ;
: SubStrFound SearchSub count Search ;
: read-lines fhin @
begin pad max-line third read-line throw
while pad swap dup WordLen >
if 2dup SubStrFound -rot 2drop
if cnt++ cr type else 2drop then
else 2DROP
then
repeat 2drop ;
: Test 0 cnt !
s" ./unixdict.txt" SrcFile place
s" the" SearchSub place
SrcOpen
read-lines
cr ." =============="
cr ." Found " cnt @ . ." Words" cr
SrcClose ;
Test
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping ============== Found 32 Words
Fortran
program main
implicit none
integer :: lun
character(len=256) :: line
integer :: ios
open(file='unixdict.txt',newunit=lun)
do
read(lun,'(a)',iostat=ios)line
if(ios /= 0)exit
if( index(line,'the') /= 0 .and. len_trim(line) > 11 ) then
write(*,'(a)')trim(line)
endif
enddo
end program main
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
FreeBASIC
Reuses some code from Odd words#FreeBASIC
#define NULL 0
type node
word as string*32 'enough space to store any word in the dictionary
nxt as node ptr
end type
function addword( tail as node ptr, word as string ) as node ptr
'allocates memory for a new node, links the previous tail to it,
'and returns the address of the new node
dim as node ptr newnode = allocate(sizeof(node))
tail->nxt = newnode
newnode->nxt = NULL
newnode->word = word
return newnode
end function
function length( word as string ) as uinteger
'necessary replacement for the built-in len function, which in this
'case would always return 32
for i as uinteger = 1 to 32
if asc(mid(word,i,1)) = 0 then return i-1
next i
return 999
end function
dim as string word
dim as node ptr tail = allocate( sizeof(node) )
dim as node ptr head = tail, curr = head, currj
tail->nxt = NULL
tail->word = "XXXXHEADER"
open "unixdict.txt" for input as #1
while true
line input #1, word
if word = "" then exit while
if length(word)>11 then tail = addword( tail, word )
wend
close #1
dim as string tempword
while curr->nxt <> NULL
for i as uinteger = 1 to length(curr->word)-2 'last position where a 3-letter substring can start
if mid(curr->word,i,3) = "the" then print curr->word
next i
curr = curr->nxt
wend
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
FutureBasic
include "NSLog.incl"
#plist NSAppTransportSecurity @{NSAllowsArbitraryLoads:YES}
void local fn DoIt
CFURLRef url
CFStringRef string, wd
ErrorRef err = NULL
CFArrayRef array
CFMutableArrayRef mutArray
url = fn URLWithString( @"https://web.archive.org/web/20180611003215/http://www.puzzlers.org/pub/wordlists/unixdict.txt" )
string = fn StringWithContentsOfURL( url, NSUTF8StringEncoding, @err )
if ( string )
array = fn StringComponentsSeparatedByCharactersInSet( string, fn CharacterSetNewlineSet )
mutArray = fn MutableArrayWithCapacity(0)
for wd in array
if ( len(wd) > 11 and fn StringContainsString( wd, @"the" ) )
MutableArrayAddObject( mutArray, wd )
end if
next
string = fn ArrayComponentsJoinedByString( mutArray, @"\n" )
NSLog(@"%@",string)
else
NSLog(@"%@",err)
end if
end fn
fn DoIt
HandleEvents
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Go
package main
import (
"bytes"
"fmt"
"io/ioutil"
"log"
"strings"
"unicode/utf8"
)
func main() {
wordList := "unixdict.txt"
b, err := ioutil.ReadFile(wordList)
if err != nil {
log.Fatal("Error reading file")
}
bwords := bytes.Fields(b)
var words []string
for _, bword := range bwords {
s := string(bword)
if utf8.RuneCountInString(s) > 11 {
words = append(words, s)
}
}
count := 0
fmt.Println("Words containing 'the' having a length > 11 in", wordList, "\b:")
for _, word := range words {
if strings.Contains(word, "the") {
count++
fmt.Printf("%2d: %s\n", count, word)
}
}
}
- Output:
Words containing 'the' having a length > 11 in unixdict.txt: 1: authenticate 2: chemotherapy 3: chrysanthemum 4: clothesbrush 5: clotheshorse 6: eratosthenes 7: featherbedding 8: featherbrain 9: featherweight 10: gaithersburg 11: hydrothermal 12: lighthearted 13: mathematician 14: neurasthenic 15: nevertheless 16: northeastern 17: northernmost 18: otherworldly 19: parasympathetic 20: physiotherapist 21: physiotherapy 22: psychotherapeutic 23: psychotherapist 24: psychotherapy 25: radiotherapy 26: southeastern 27: southernmost 28: theoretician 29: weatherbeaten 30: weatherproof 31: weatherstrip 32: weatherstripping
Haskell
import System.IO (readFile)
import Data.List (isInfixOf)
main = do
txt <- readFile "unixdict.txt"
let res = [ w | w <- lines txt, isInfixOf "the" w, length w > 11 ]
putStrLn $ show (length res) ++ " words were found:"
mapM_ putStrLn res
- Output:
λ> main
32 words were found:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
J
>(#~ (+./@E.~&'the'*11<#)@>) cutLF fread'unixdict.txt'
authenticate
chemotherapy
chrysanthemum
clothesbrush
clotheshorse
eratosthenes
featherbedding
featherbrain
featherweight
gaithersburg
hydrothermal
lighthearted
mathematician
neurasthenic
nevertheless
northeastern
northernmost
otherworldly
parasympathetic
physiotherapist
physiotherapy
psychotherapeutic
psychotherapist
psychotherapy
radiotherapy
southeastern
southernmost
theoretician
weatherbeaten
weatherproof
weatherstrip
weatherstripping
Java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
public final class WordsContainingTheSubstring {
public static void main(String[] args) throws IOException {
Files.lines(Path.of("unixdict.txt"))
.filter( word -> word.length() > 11 && word.contains("the") )
.forEach(System.out::println);
}
}
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
JavaScript
document.write(`
<p>Select a file: <input type="file" id="file"></p>
<p>Get words containing: <input value="THE" type="text" id="cont"></p>
<p>Min. word length: <input type="number" value="12" id="len"></p>
<div id="info"></div><div id="out"></div>
`);
function search(inp) {
let cont = document.getElementById('cont').value.toUpperCase(),
len = parseInt(document.getElementById('len').value),
out = document.getElementById('out'),
info = document.getElementById('info'),
result = [], i;
inp = inp.replace(/\n|\r/g, '_');
inp = inp.replace(/__/g, ' ').split(' ');
for (i = 0; i < inp.length; i++)
if (inp[i].length >= len && inp[i].toUpperCase().indexOf(cont) != -1)
result.push(inp[i]);
info.innerHTML = `<h2>${result.length} matches found for ${cont}, min. length ${len}:</h2>`;
out.innerText = result.join(', ');
}
document.getElementById('file').onchange = function() {
let fr = new FileReader(),
f = document.getElementById('file').files[0];
fr.onload = function() { search(fr.result); }
fr.readAsText(f);
}
- Output:
32 matches found for THE, min. length 12: authenticate, chemotherapy, chrysanthemum, clothesbrush, clotheshorse, eratosthenes, featherbedding, featherbrain, featherweight, gaithersburg, hydrothermal, lighthearted, mathematician, neurasthenic, nevertheless, northeastern, northernmost, otherworldly, parasympathetic, physiotherapist, physiotherapy, psychotherapeutic, psychotherapist, psychotherapy, radiotherapy, southeastern, southernmost, theoretician, weatherbeaten, weatherproof, weatherstrip, weatherstripping
jq
jq -nrR 'inputs|select(length>11 and index("the"))' unixdict.txt
One could also use `test("the")` here instead, the difference being that the argument of `test` is a JSON string interpreted as a regular expression.
- Output:
As for 11l et al.
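The difference noted above between a literal substring test (`index`) and a regular-expression test (`test`) can be sketched in Python. This is only an illustration: the helper names `contains_literal` and `contains_regex` are hypothetical, and the sample word is chosen by hand.

```python
import re

def contains_literal(word, needle):
    """Literal substring test: every character of needle is matched as-is."""
    return needle in word

def contains_regex(word, pattern):
    """Regex test: metacharacters in pattern (such as '.') have special meaning."""
    return re.search(pattern, word) is not None

# For a plain needle like "the", both approaches agree.
print(contains_literal("nevertheless", "the"), contains_regex("nevertheless", "the"))

# With a metacharacter they differ: as a regex, "t.e" matches "the",
# while as a literal it would need an actual '.' in the word.
print(contains_literal("nevertheless", "t.e"), contains_regex("nevertheless", "t.e"))
```

For this task the needle contains no metacharacters, so either test selects the same words.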
Julia
See Alternade_words for the foreachword function.
containsthe(w, d) = occursin("the", w) ? w : ""
foreachword("unixdict.txt", containsthe, minlen = 12)
- Output:
Word source: unixdict.txt authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Lua
for word in io.open("unixdict.txt", "r"):lines() do
if #word > 11 and word:find("the") then
print(word)
end
end
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Mathematica/Wolfram Language
dict = Once[Import["https://web.archive.org/web/20180611003215/http://www.puzzlers.org/pub/wordlists/unixdict.txt"]];
dict //= StringSplit[#, "\n"] &;
dict //= Select[StringLength /* GreaterThan[11]];
Select[dict, StringContainsQ["the"]]
- Output:
{authenticate, chemotherapy, chrysanthemum, clothesbrush, clotheshorse, eratosthenes, featherbedding, featherbrain, featherweight, gaithersburg, hydrothermal, lighthearted, mathematician, neurasthenic, nevertheless, northeastern, northernmost, otherworldly, parasympathetic, physiotherapist, physiotherapy, psychotherapeutic, psychotherapist, psychotherapy, radiotherapy, southeastern, southernmost, theoretician, weatherbeaten, weatherproof, weatherstrip, weatherstripping}
min
"unixdict.txt" fread "\n" split
(length 11 >) filter
("the" indexof -1 !=) filter
(puts!) foreach
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Nanoquery
words = split(new(Nanoquery.IO.File).open("unixdict.txt").readAll(),"\n")
for word in words
if (word .contains. "the") and (len(word) > 11)
println word
end if
end for
Nim
import strutils
var count = 0
for word in "unixdict.txt".lines:
if word.len > 11 and word.contains("the"):
inc count
echo ($count).align(2), ' ', word
- Output:
1 authenticate 2 chemotherapy 3 chrysanthemum 4 clothesbrush 5 clotheshorse 6 eratosthenes 7 featherbedding 8 featherbrain 9 featherweight 10 gaithersburg 11 hydrothermal 12 lighthearted 13 mathematician 14 neurasthenic 15 nevertheless 16 northeastern 17 northernmost 18 otherworldly 19 parasympathetic 20 physiotherapist 21 physiotherapy 22 psychotherapeutic 23 psychotherapist 24 psychotherapy 25 radiotherapy 26 southeastern 27 southernmost 28 theoretician 29 weatherbeaten 30 weatherproof 31 weatherstrip 32 weatherstripping
Nu
open 'unixdict.txt' | split words -l 12 | where ('the' in $it)
- Output:
╭────┬───────────────────╮ │ 0 │ authenticate │ │ 1 │ chemotherapy │ │ 2 │ chrysanthemum │ │ 3 │ clothesbrush │ │ 4 │ clotheshorse │ │ 5 │ eratosthenes │ │ 6 │ featherbedding │ │ 7 │ featherbrain │ │ 8 │ featherweight │ │ 9 │ gaithersburg │ │ 10 │ hydrothermal │ │ 11 │ lighthearted │ │ 12 │ mathematician │ │ 13 │ neurasthenic │ │ 14 │ nevertheless │ │ 15 │ northeastern │ │ 16 │ northernmost │ │ 17 │ otherworldly │ │ 18 │ parasympathetic │ │ 19 │ physiotherapist │ │ 20 │ physiotherapy │ │ 21 │ psychotherapeutic │ │ 22 │ psychotherapist │ │ 23 │ psychotherapy │ │ 24 │ radiotherapy │ │ 25 │ southeastern │ │ 26 │ southernmost │ │ 27 │ theoretician │ │ 28 │ weatherbeaten │ │ 29 │ weatherproof │ │ 30 │ weatherstrip │ │ 31 │ weatherstripping │ ╰────┴───────────────────╯
Objeck
class Thes {
function : Main(args : String[]) ~ Nil {
if(args->Size() = 1) {
reader := System.IO.File.FileReader->New(args[0]);
words := Collection.Generic.Vector->New()<String>;
line := reader->ReadLine();
while(line <> Nil) {
if(line->Size() > 11 & line->Has("the")) {
words->AddBack(line);
};
line := reader->ReadLine();
};
reader->Close();
found := words->Size();
"Found {$found} word(s):"->PrintLine();
each(i : words) {
word := words->Get(i);
"{$word} "->Print();
if(i > 0 & i % 5 = 0) {
'\n'->Print();
};
};
};
}
}
- Output:
Found 32 word(s): authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Pascal
program wordsContainingTheSubstring(input, output);
var
word: string(22);
begin
while not EOF do
begin
readLn(word);
if (length(word) > 11) and_then (index(word, 'the') > 0) then
begin
writeLn(word)
end
end
end.
If unixdict.txt is fed to stdin, the standard input file, you will get the following output:
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Perl
Perl one-liner entered from a POSIX shell:
/Code$ perl -n -e '/(\w*the\w*)/ && length($1)>11 && print' unixdict.txt
authenticate
chemotherapy
chrysanthemum
clothesbrush
clotheshorse
eratosthenes
featherbedding
featherbrain
featherweight
gaithersburg
hydrothermal
lighthearted
mathematician
neurasthenic
nevertheless
northeastern
northernmost
otherworldly
parasympathetic
physiotherapist
physiotherapy
psychotherapeutic
psychotherapist
psychotherapy
radiotherapy
southeastern
southernmost
theoretician
weatherbeaten
weatherproof
weatherstrip
weatherstripping
/Code$
Phix
with javascript_semantics
function the(string word)
return length(word)>11 and match("the",word)
end function
sequence words = filter(unix_dict(),the)
printf(1,"found %d 'the' words:\n%s\n",{length(words),join(shorten(words,"",3),", ")})
- Output:
found 32 'the' words: authenticate, chemotherapy, chrysanthemum, ..., weatherproof, weatherstrip, weatherstripping
PHP
<?php foreach(file("unixdict.txt") as $w) echo (strstr($w, "the") && strlen(trim($w)) > 11) ? $w : "";
Plain English
To run:
Start up.
Put "c:\unixdict.txt" into a path.
Read the path into a buffer.
Slap a rider on the buffer.
Loop.
Move the rider (text file rules).
Subtract 1 from the rider's token's last. \newline
Put the rider's token into a word string.
If the word is blank, break.
If the word's length is less than 12, repeat.
If "the" is in the word, write the word on the console.
Repeat.
Wait for the escape key.
Shut down.
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
PL/I
the: procedure options(main);
declare dict file;
open file(dict) title('unixdict.txt');
on endfile(dict) stop;
declare word char(32) varying;
do while('1'b);
get file(dict) list(word);
if length(word) > 11 & index(word,'the') ^= 0 then
put skip list(word);
end;
close file(dict);
end the;
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Processing
String[] words = loadStrings("unixdict.txt");
for (String word : words) {
if (word.contains("the") && word.length() > 11) {
println(word);
}
}
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Python
import urllib.request as request
with request.urlopen("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt") as f:
a = f.read().decode("ASCII").split()
for s in a:
if len(s) > 11 and "the" in s:
print(s)
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Quackery
Uses a finite state machine to search efficiently for a substring. (The fsm to search for "the" is built only once, during compilation.) Presented as a dialogue in the Quackery shell (REPL).
/O> [ $ 'sundry/fsm.qky' loadfile ] now!
... [ dup
... [ $ 'the' buildfsm ] constant
... usefsm over found ] is contains-"the"
... []
... $ 'unixdict.txt' sharefile drop
... nest$ witheach
... [ dup size 12 < iff drop done
... contains-"the" iff [ nested join ]
... else drop ]
... 60 wrap$ cr
...
authenticate chemotherapy chrysanthemum clothesbrush
clotheshorse eratosthenes featherbedding featherbrain
featherweight gaithersburg hydrothermal lighthearted
mathematician neurasthenic nevertheless northeastern
northernmost otherworldly parasympathetic physiotherapist
physiotherapy psychotherapeutic psychotherapist
psychotherapy radiotherapy southeastern southernmost
theoretician weatherbeaten weatherproof weatherstrip
weatherstripping
Stack empty.
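The finite-state-machine idea described for the Quackery entry can be sketched in Python. This is not the `fsm.qky` implementation; it is a generic substring-matching automaton (the kind used in Knuth-Morris-Pratt style matching), and the names `build_fsm` and `fsm_contains` are hypothetical. It assumes a non-empty pattern.

```python
def build_fsm(pattern):
    """Return a transition table; state i means the first i chars of pattern are matched."""
    alphabet = set(pattern)
    table = []
    for state in range(len(pattern)):
        row = {}
        for ch in alphabet:
            # Next state = length of the longest prefix of pattern
            # that is a suffix of the text matched so far plus ch.
            seen = pattern[:state] + ch
            k = len(seen)
            while k > 0 and seen[-k:] != pattern[:k]:
                k -= 1
            row[ch] = k
        table.append(row)
    return table

def fsm_contains(table, text):
    """Run the machine over text; True once the accepting state is reached."""
    accept = len(table)
    state = 0
    for ch in text:
        # Characters outside the pattern's alphabet reset the machine to state 0.
        state = table[state].get(ch, 0)
        if state == accept:
            return True
    return False

THE_FSM = build_fsm("the")  # built once, then reused for every word

for word in ["weatherproof", "mathematician", "algorithm", "thte"]:
    print(word, fsm_contains(THE_FSM, word))
```

Each character of a word is examined exactly once, which is the efficiency the entry refers to; for a three-letter needle the practical gain over a naive scan is small.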
R
words <- readLines("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
grep("the", words[nchar(words) > 11], value = TRUE)
- Output:
[1] "authenticate" "chemotherapy" "chrysanthemum" "clothesbrush" [5] "clotheshorse" "eratosthenes" "featherbedding" "featherbrain" [9] "featherweight" "gaithersburg" "hydrothermal" "lighthearted" [13] "mathematician" "neurasthenic" "nevertheless" "northeastern" [17] "northernmost" "otherworldly" "parasympathetic" "physiotherapist" [21] "physiotherapy" "psychotherapeutic" "psychotherapist" "psychotherapy" [25] "radiotherapy" "southeastern" "southernmost" "theoretician" [29] "weatherbeaten" "weatherproof" "weatherstrip" "weatherstripping"
Raku
A trivial modification of the ABC words task.
put 'unixdict.txt'.IO.words».fc.grep({ (.chars > 11) && (.contains: 'the') })\
.&{"{+$_} words:\n " ~ .batch(8)».fmt('%-17s').join: "\n "};
- Output:
32 words: authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Red
Red[]
foreach word read/lines %unixdict.txt [
if all [11 < length? word find word "the"] [print word]
]
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Refal
$ENTRY Go {
, <ReadFile 1 'unixdict.txt'>: e.Dict
= <Each Show <Filter TheWord e.Dict>>;
};
TheWord {
(e.Word), e.Word: e.X 'the' e.Y,
<Lenw e.Word>: s.Len e.Word,
<Compare s.Len 11>: '+' = T;
(e.Word) = F;
};
ReadFile {
s.Chan e.Filename =
<Open 'r' s.Chan e.Filename>
<ReadFile (s.Chan)>;
(s.Chan), <Get s.Chan>: {
0 = <Close s.Chan>;
e.Line = (e.Line) <ReadFile (s.Chan)>;
};
};
Each {
s.F = ;
s.F t.I e.X = <Mu s.F t.I> <Each s.F e.X>;
};
Filter {
s.F = ;
s.F t.I e.X, <Mu s.F t.I>: {
T = t.I <Filter s.F e.X>;
F = <Filter s.F e.X>;
};
};
Show {
(e.X) = <Prout e.X>;
};
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
REXX
This REXX version doesn't care what order the words in the dictionary are in, nor does it care what
case (lower/upper/mixed) the words are in; the search for the substring "the" is caseless.
It also allows the substring to be specified on the command line (CL), as well as the minimum length and the dictionary file identifier.
Programming note: if the minimum length is negative, the program finds the words but does not display them, and
displays only the count of found words.
/*REXX program finds words that contain the substring "the" (within an identified dict.)*/
parse arg $ minL iFID . /*obtain optional arguments from the CL*/
if $=='' | $=="," then $= 'the' /*Not specified? Then use the default.*/
if minL=='' | minL=="," then minL= 12 /* " " " " " " */
if iFID=='' | iFID=="," then iFID='unixdict.txt' /* " " " " " " */
tell= minL>0; minL= abs(minL) /*use absolute value of minimum length.*/
@.= /*default value of any dictionary word.*/
do #=1 while lines(iFID)\==0 /*read each word in the file (word=X).*/
@.#= strip( linein( iFID) ) /*pick off a word from the input line. */
end /*#*/
#= # - 1 /*adjust word count because of DO loop.*/
$u= $; upper $u /*obtain an uppercase version of $. */
say copies('─', 25) # "words in the dictionary file: " iFID
say
finds= 0 /*count of the substring found in dict.*/
do j=1 for #; z= @.j; upper z /*process all the words that were found*/
if length(z)<minL then iterate /*Is word too short? Yes, then skip.*/
if pos($u, z)==0 then iterate /*Found the substring? No, " " */
finds= finds + 1 /*bump count of substring words found. */
if tell then say right(left(@.j, 20), 25) /*Show it? Indent original word.*/
end /*j*/
/*stick a fork in it, we're all done. */
say copies('─', 25) finds " words (with a min. length of" ,
minL') that contains the substring: ' $
- output when using the default inputs:
───────────────────────── 25104 words in the dictionary file: unixdict.txt authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping ───────────────────────── 32 words (with a min. length of 12) that contain the substring: the
- output when using the input of: , -3
───────────────────────── 25105 words in the dictionary file: unixdict.txt ───────────────────────── 287 words (with a min. length of 3) that contains the substring: the
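The two behaviours described for the REXX version, a caseless search and a count-only mode selected by a negative minimum length, can be sketched in Python. The function name `find_words` and the sample list are hypothetical; this mirrors the idea, not the REXX program line by line.

```python
def find_words(words, substring="the", min_len=12):
    """Caseless search; a negative min_len means: count matches but show none."""
    show = min_len > 0
    min_len = abs(min_len)
    needle = substring.upper()
    found = [w for w in words
             if len(w) >= min_len and needle in w.upper()]
    return (found if show else []), len(found)

sample = ["Nevertheless", "THEORETICIAN", "other", "weatherproof", "the"]

shown, count = find_words(sample)              # defaults: "the", length >= 12
print(shown, count)

shown, count = find_words(sample, min_len=-3)  # count only, length >= 3
print(shown, count)
```

Comparing uppercase copies leaves the original spelling of each word intact for display, which is the same choice the REXX code makes.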
Ring
cStr = read("unixdict.txt")
wordList = str2list(cStr)
num = 0
the = "the"
see "working..." + nl
ln = len(wordList)
for n = ln to 1 step -1
if len(wordList[n]) < 12
del(wordList,n)
ok
next
see 'Words containing "the" substring:' + nl
for n = 1 to len(wordList)
ind = substr(wordList[n],the)
if ind > 0
num = num +1
see "" + num + ". " + wordList[n] + nl
ok
next
see "done..." + nl
- Output:
working... Founded "the" words are: 1. authenticate 2. chemotherapy 3. chrysanthemum 4. clothesbrush 5. clotheshorse 6. eratosthenes 7. featherbedding 8. featherbrain 9. featherweight 10. gaithersburg 11. hydrothermal 12. lighthearted 13. mathematician 14. neurasthenic 15. nevertheless 16. northeastern 17. northernmost 18. otherworldly 19. parasympathetic 20. physiotherapist 21. physiotherapy 22. psychotherapeutic 23. psychotherapist 24. psychotherapy 25. radiotherapy 26. southeastern 27. southernmost 28. theoretician 29. weatherbeaten 30. weatherproof 31. weatherstrip 32. weatherstripping done...
RPL
As RPL does not manage files, the content of unixdict.txt has been converted into a list of 25,104 strings stored in the DICT global variable.
≪ { } 1 DICT SIZE FOR j 'DICT' j GET IF DUP SIZE 11 > OVER "the" POS AND THEN + ELSE DROP END NEXT ≫ 'TASK' STO
- Output:
1: { "authenticate" "chemotherapy" "chrysanthemum" "clothesbrush" "clotheshorse" "eratosthenes" "featherbedding" "featherbrain" "featherweight" "gaithersburg" "hydrothermal" "lighthearted" "mathematician" "neurasthenic" "nevertheless" "northeastern" "northernmost" "otherworldly" "parasympathetic" "physiotherapist" "physiotherapy" "psychotherapeutic" "psychotherapist" "psychotherapy" "radiotherapy" "southeastern" "southernmost" "theoretician" "weatherbeaten" "weatherproof" "weatherstrip" "weatherstripping" }
Ruby
File.foreach("unixdict.txt"){|w| puts w if w.size > 11 && w.match?("the") }
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Rust
use std::fs;
const WORD_LENGTH: usize = 12;
fn main() {
let wordsfile = fs::read_to_string("unixdict.txt").unwrap().to_lowercase();
let words = wordsfile
.split_whitespace()
.filter(|w| w.len() >= WORD_LENGTH && w.contains("the"));
for (i, w) in words.enumerate() {
print!("{:<18}{}", w, if (i + 1) % 5 == 0 { "\n" } else { "" });
}
}
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Smalltalk
'unixdict.txt' asFilename contents
select:[:word | (word size > 11) and:[word includesString:'the']]
thenDo:#transcribeCR
If counting per word is required (which is overkill here, as there are no duplicates in the file), keep the words in a bag:
bagOfWords := Bag new.
'unixdict.txt' asFilename contents
select:[:word | (word size > 11) and:[word includesString:'the']]
thenDo:[:word | bagOfWords add:word. word transcribeCR].
bagOfWords transcribeCR.
bagOfWords size transcribeCR
Note: #transcribeCR is a method in Object which says: "Transcript showCR:self".
Variant (as script file). Save to file: "filter.st":
#! /usr/bin/env stx --script
[Stdin atEnd] whileFalse:[
|word|
((word := Stdin nextLine) size > 11
and:[word includesString:'the']
) ifTrue:[
Stdout nextPutLine: word
]
]
Execute with:
chmod +x filter.st
./filter.st < unixdict.txt
The output from the above counting snippet:
- Output:
authenticate chemotherapy chrysanthemum ... weatherproof weatherstrip weatherstripping Bag(chrysanthemum(*1) hydrothermal(*1) nevertheless(*1) chemotherapy(*1) eratosthenes(*1) mathematician(*1) ... theoretician(*1) weatherbeaten(*1) weatherstripping(*1)) 32
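The bag-of-words idea used in the Smalltalk counting snippet can be sketched in Python, where `collections.Counter` plays the role of a bag. The function name `count_matches` is hypothetical, and the sample list deliberately contains a duplicate, which unixdict.txt itself does not.

```python
from collections import Counter

def count_matches(words, substring="the", min_len=12):
    """Collect matching words in a bag (multiset) that tracks occurrences."""
    return Counter(w for w in words
                   if len(w) >= min_len and substring in w)

sample = ["nevertheless", "weatherproof", "nevertheless", "other"]
bag = count_matches(sample)

print(bag)                # per-word occurrence counts
print(sum(bag.values()))  # total number of matches, duplicates included
```

With the real dictionary every count would be 1, which is why the entry calls the bag overkill.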
sed
#!/bin/sed -f
/^.\{12\}/!d
/the/!d
SETL
program the_words;
dict := open("unixdict.txt", "r");
loop doing geta(dict, word); until eof(dict) do
word ?:= "";
if #word > 11 and "the" in word then
print(word);
end if;
end loop;
close(dict);
end program;
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Standard ML
val hasThe = String.isSubstring "the"
fun isThe12 s = size s > 11 andalso hasThe s
val () = print
((String.concatWith " "
o List.filter isThe12
o String.tokens Char.isSpace
o TextIO.inputAll) TextIO.stdIn ^ "\n")
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
Swift
import Foundation
let minLength = 12
let substring = "the"
do {
try String(contentsOfFile: "unixdict.txt", encoding: String.Encoding.ascii)
.components(separatedBy: "\n")
.filter{$0.count >= minLength && $0.contains(substring)}
.enumerated()
.forEach{print(String(format: "%2d. %@", $0.0 + 1, $0.1))}
} catch {
print(error.localizedDescription)
}
- Output:
1. authenticate 2. chemotherapy 3. chrysanthemum 4. clothesbrush 5. clotheshorse 6. eratosthenes 7. featherbedding 8. featherbrain 9. featherweight 10. gaithersburg 11. hydrothermal 12. lighthearted 13. mathematician 14. neurasthenic 15. nevertheless 16. northeastern 17. northernmost 18. otherworldly 19. parasympathetic 20. physiotherapist 21. physiotherapy 22. psychotherapeutic 23. psychotherapist 24. psychotherapy 25. radiotherapy 26. southeastern 27. southernmost 28. theoretician 29. weatherbeaten 30. weatherproof 31. weatherstrip 32. weatherstripping
Tcl
foreach w [read [open unixdict.txt]] {
if {[string first the $w] != -1 && [string length $w] > 11} {
puts $w
}
}
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
VBA
Sub Main_Contain()
Dim ListeWords() As String, Book As String, i As Long, out() As String, count As Integer
Book = Read_File("C:\Users\" & Environ("Username") & "\Desktop\unixdict.txt")
ListeWords = Split(Book, vbNewLine)
For i = LBound(ListeWords) To UBound(ListeWords)
If Len(ListeWords(i)) > 11 Then
If InStr(ListeWords(i), "the") > 0 Then
ReDim Preserve out(count)
out(count) = ListeWords(i)
count = count + 1
End If
End If
Next
Debug.Print "Found : " & count & " words : " & Join(out, ", ")
End Sub
Private Function Read_File(Fic As String) As String
Dim Nb As Integer
Nb = FreeFile
Open Fic For Input As #Nb
Read_File = Input(LOF(Nb), #Nb)
Close #Nb
End Function
- Output:
Found : 32 words : authenticate, chemotherapy, chrysanthemum, clothesbrush, clotheshorse, eratosthenes, featherbedding, featherbrain, featherweight, gaithersburg, hydrothermal, lighthearted, mathematician, neurasthenic, nevertheless, northeastern, northernmost, otherworldly, parasympathetic, physiotherapist, physiotherapy, psychotherapeutic, psychotherapist, psychotherapy, radiotherapy, southeastern, southernmost, theoretician, weatherbeaten, weatherproof, weatherstrip, weatherstripping
VBScript
Run it with Cscript
with createobject("ADODB.Stream")
.charset ="UTF-8"
.open
.loadfromfile("unixdict.txt")
s=.readtext
end with
a=split (s,vblf)
with new regexp
.pattern=".*?the.*"
for each i in a
if len(trim(i))>11 then
if .test(i) then wscript.echo i
end if
next
end with
- Output:
authenticate chemotherapy chrysanthemum clothesbrush clotheshorse eratosthenes featherbedding featherbrain featherweight gaithersburg hydrothermal lighthearted mathematician neurasthenic nevertheless northeastern northernmost otherworldly parasympathetic physiotherapist physiotherapy psychotherapeutic psychotherapist psychotherapy radiotherapy southeastern southernmost theoretician weatherbeaten weatherproof weatherstrip weatherstripping
V (Vlang)
import os
fn main() {
mut count := 1
mut text :=''
unixdict := os.read_file('./unixdict.txt') or {panic('file not found')}
for word in unixdict.split_into_lines() {
if word.contains('the') && word.len > 11 {text += count++.str() + ': $word \n'}
}
println(text)
}
- Output:
1: authenticate 2: chemotherapy 3: chrysanthemum 4: clothesbrush 5: clotheshorse 6: eratosthenes 7: featherbedding 8: featherbrain 9: featherweight 10: gaithersburg 11: hydrothermal 12: lighthearted 13: mathematician 14: neurasthenic 15: nevertheless 16: northeastern 17: northernmost 18: otherworldly 19: parasympathetic 20: physiotherapist 21: physiotherapy 22: psychotherapeutic 23: psychotherapist 24: psychotherapy 25: radiotherapy 26: southeastern 27: southernmost 28: theoretician 29: weatherbeaten 30: weatherproof 31: weatherstrip 32: weatherstripping
Wren
import "io" for File
import "./fmt" for Fmt
var wordList = "unixdict.txt" // local copy
var words = File.read(wordList).trimEnd().split("\n").where { |w| w.count > 11 }.toList
var count = 0
System.print("Words containing 'the' having a length > 11 in %(wordList):")
for (word in words) {
if (word.contains("the")) {
count = count + 1
Fmt.print("$2d: $s", count, word)
}
}
- Output:
Words containing 'the' having a length > 11 in unixdict.txt: 1: authenticate 2: chemotherapy 3: chrysanthemum 4: clothesbrush 5: clotheshorse 6: eratosthenes 7: featherbedding 8: featherbrain 9: featherweight 10: gaithersburg 11: hydrothermal 12: lighthearted 13: mathematician 14: neurasthenic 15: nevertheless 16: northeastern 17: northernmost 18: otherworldly 19: parasympathetic 20: physiotherapist 21: physiotherapy 22: psychotherapeutic 23: psychotherapist 24: psychotherapy 25: radiotherapy 26: southeastern 27: southernmost 28: theoretician 29: weatherbeaten 30: weatherproof 31: weatherstrip 32: weatherstripping
XPL0
string 0; \use zero-terminated strings
int I, Ch, Len;
char Word(100); \(longest word in unixdict.txt is 22 chars)
def LF=$0A, CR=$0D, EOF=$1A;
[FSet(FOpen("unixdict.txt", 0), ^I); \open dictionary and set it to device 3
OpenI(3);
repeat  I:= 0;
        loop    [repeat Ch:= ChIn(3) until Ch # CR;     \remove possible CR
                if Ch=LF or Ch=EOF then quit;
                Word(I):= Ch;
                I:= I+1;
                ];
        Word(I):= 0;                    \terminate string
        Len:= I;
        if Len >= 12 then
            for I:= 0 to Len-3 do       \scan for "the" (assume lowercase)
                if Word(I)=^t & Word(I+1)=^h & Word(I+2)=^e then
                    [Text(0, Word);  CrLf(0)];
until   Ch = EOF;
]
- Output:
authenticate
chemotherapy
chrysanthemum
clothesbrush
clotheshorse
eratosthenes
featherbedding
featherbrain
featherweight
gaithersburg
hydrothermal
lighthearted
mathematician
neurasthenic
nevertheless
northeastern
northernmost
otherworldly
parasympathetic
physiotherapist
physiotherapy
psychotherapeutic
psychotherapist
psychotherapy
radiotherapy
southeastern
southernmost
theoretician
weatherbeaten
weatherproof
weatherstrip
weatherstripping
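The XPL0 entry above finds "the" by comparing characters position by position instead of calling a library search routine. As a rough illustration of that scan, here is a minimal Python sketch. The names `contains_at`, `long_words_with` and `sample` are hypothetical and not part of any entry on this page, and the inline sample list stands in for unixdict.txt. Unlike the XPL0 loop, this sketch stops at the first match, so a word containing "the" more than once would still be reported only once.

```python
def contains_at(word: str, needle: str = "the") -> bool:
    """Return True if needle occurs in word, checking each start position in turn."""
    n = len(needle)
    for i in range(len(word) - n + 1):
        if word[i:i + n] == needle:
            return True  # stop at the first match so each word is reported once
    return False


def long_words_with(words, needle: str = "the", min_len: int = 12):
    """Keep words of at least min_len characters that contain needle."""
    return [w for w in words if len(w) >= min_len and contains_at(w, needle)]


# Hypothetical sample standing in for the dictionary file.
sample = ["authenticate", "the", "nevertheless", "weather",
          "mathematician", "abcdefghijkl"]
for w in long_words_with(sample):
    print(w)
```

With the real dictionary, the sample list could be replaced by the lines of a local copy of unixdict.txt, assuming one word per line.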
Yabasic
// Rosetta Code problem: http://rosettacode.org/wiki/Words_containing_"the"_substring
// by Galileo, 02/2022
a = open("unixdict.txt")
while not eof(a)
    line input #a a$
    if len(a$) > 11 and instr(a$, "the") print a$
wend
close a
- Output:
authenticate
chemotherapy
chrysanthemum
clothesbrush
clotheshorse
eratosthenes
featherbedding
featherbrain
featherweight
gaithersburg
hydrothermal
lighthearted
mathematician
neurasthenic
nevertheless
northeastern
northernmost
otherworldly
parasympathetic
physiotherapist
physiotherapy
psychotherapeutic
psychotherapist
psychotherapy
radiotherapy
southeastern
southernmost
theoretician
weatherbeaten
weatherproof
weatherstrip
weatherstripping
---Program done, press RETURN---