Isograms and heterograms
- Definitions
For the purposes of this task, an isogram means a string where each character present is used the same number of times and an n-isogram means an isogram where each character present is used exactly n times.
A heterogram means a string in which no character occurs more than once. It follows that a heterogram is the same thing as a 1-isogram.
- Examples
caucasus is a 2-isogram because the letters c, a, u and s all occur twice.
atmospheric is a heterogram because all its letters are used once only.
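The definitions and examples above can be checked with a short sketch. This Python is illustrative only; the function name `isogram_order` is an assumption of this sketch and does not come from the task description.

```python
from collections import Counter


def isogram_order(word: str) -> int:
    """Return n if `word` is an n-isogram, otherwise 0.

    `isogram_order` is a hypothetical helper name used for illustration.
    Case is ignored, as the task requires.
    """
    counts = Counter(word.lower())
    distinct = set(counts.values())
    # Exactly one distinct count means every character occurs equally often.
    return distinct.pop() if len(distinct) == 1 else 0


print(isogram_order("caucasus"))     # c, a, u and s each occur twice
print(isogram_order("atmospheric"))  # every letter occurs once
print(isogram_order("banana"))       # counts differ, so not an isogram
```

Returning 0 for non-isograms (and for the empty string) keeps the result usable directly as a filter: `n > 1` selects the n-isograms of part 1 and `n == 1` selects heterograms.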
- Task
Using unixdict.txt and ignoring capitalization:
1) Find and display here all words which are n-isograms where n > 1.
Present the results as a single list but sorted as follows:
a. By decreasing order of n;
b. Then by decreasing order of word length;
c. Then by ascending lexicographic order.
2) Secondly, find and display here all words which are heterograms and have more than 10 characters.
Again present the results as a single list but sorted as per b. and c. above.
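The three sort criteria can be folded into a single composite key. The following Python sketch assumes the isogram order n of each word is already known; the sample data and the names `part1_key` and `part2_key` are hypothetical.

```python
def part1_key(pair):
    """Sort key for (word, n) pairs: a. decreasing n, b. decreasing length, c. ascending word."""
    word, n = pair
    return (-n, -len(word), word)


def part2_key(word):
    """Sort key for heterograms: b. decreasing length, then c. ascending word."""
    return (-len(word), word)


# Hypothetical sample data; n is assumed to have been computed beforehand.
n_isograms = [("ii", 2), ("anna", 2), ("aaa", 3), ("beriberi", 2), ("coco", 2), ("iii", 3)]
heterograms = ["atmospheric", "bluestocking", "ambidextrous"]

part1 = [word for word, _ in sorted(n_isograms, key=part1_key)]
part2 = sorted(heterograms, key=part2_key)

print(part1)
print(part2)
```

Negating the numeric components turns Python's ascending tuple comparison into the required descending order for n and length while leaving the word itself ascending.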
ALGOL 68
Note: the source of files.incl.a68 and sort.incl.a68 is available on separate Rosetta Code pages.
BEGIN # find some isograms ( words where each letter occurs the same number #
# of times as the others ) and heterograms ( words where each letter #
# occurs once ). Note a heterogram is an isogram of order 1 #
PR read "files.incl.a68" PR # include file utilities #
PR read "sort.incl.a68" PR # include sort utilities #
# returns the length of s #
OP LENGTH = ( STRING s )INT: 1 + ( UPB s - LWB s );
# returns n if s is an isogram of order n, 0 if s is not an isogram #
OP ORDER = ( STRING s )INT:
BEGIN
# count the number of times each character occurs #
[ 0 : max abs char ]INT count;
FOR i FROM LWB count TO UPB count DO count[ i ] := 0 OD;
FOR i FROM LWB s TO UPB s DO
CHAR c = s[ i ];
IF c >= "A" AND c <= "Z" THEN # uppercase - treat as lower #
count[ ( ABS c - ABS "A" ) + ABS "a" ] +:= 1
ELSE # lowercase or non-letter #
count[ ABS c ] +:= 1
FI
OD;
INT order := -1;
# check the characters all occur the same number of times #
FOR i FROM LWB count TO UPB count WHILE order /= 0 DO
IF count[ i ] /= 0 THEN # have a character that appeared in s #
IF order = -1 THEN # first character #
order := count[ i ]
ELIF order /= count[ i ] THEN # character occurred a #
order := 0 # different number of times #
# to the previous one #
FI
FI
OD;
IF order < 0 THEN 0 ELSE order FI
END # ORDER # ;
[ 1 : 2 000 ]STRING words; # table of required isograms and heterograms #
# stores word in words if it is an isogram or heterogram of more than 10 #
# characters #
# returns TRUE if word was stored, FALSE otherwise #
# count so far will contain the number of preceding matching words #
PROC store grams = ( STRING word, INT count so far )BOOL:
IF INT order = ORDER word;
order < 1
THEN FALSE # not an isogram or heterogram #
ELIF INT w length = LENGTH word;
order = 1 AND w length <= 10
THEN FALSE # short heterogram #
ELSE # a long heterogram or an isogram #
# store the word prefixed by the max abs char complement of the #
# order and the length so when sorted, the words are #
# ordered as required by the task #
STRING s word = REPR ( max abs char - order )
+ REPR ( max abs char - w length )
+ word;
words[ count so far + 1 ] := s word;
TRUE
FI # store grams # ;
IF INT w count = "unixdict.txt" EACHLINE store grams;
w count < 0
THEN
print( ( "Unable to open unixdict.txt", newline ) )
ELSE
words QUICKSORT ELEMENTS( 1, w count ); # sort the words #
# display the words #
INT prev order := 0;
INT prev length := 999 999;
INT p count := 0;
FOR w TO w count DO
STRING gram = words[ w ];
INT order = max abs char - ABS gram[ 1 ];
INT length = max abs char - ABS gram[ 2 ];
STRING word = gram[ 3 : ];
IF order /= prev order THEN
print( ( newline
, IF order = 1
THEN "heterograms longer than 10 characters"
ELSE "isograms of order " + whole( order, 0 )
FI
)
);
prev order := order;
prev length := 999 999;
p count := 0
FI;
IF prev length > length OR p count > 5 THEN
print( ( newline ) );
prev length := length;
p count := 0
FI;
print( ( " " * IF length > 11 THEN 1 ELSE 13 - length FI, word ) );
p count +:= 1
OD
FI
END
- Output:
isograms of order 3
aaa iii
isograms of order 2
beriberi bilabial caucasus couscous teammate
appall emmett hannah murmur tartar testes
anna coco dada deed dodo gogo
isis juju lulu mimi noon otto
papa peep poop teet tete toot
tutu
ii
heterograms longer than 10 characters
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking
malnourished
atmospheric blameworthy centrifugal christendom consumptive countervail
countryside countrywide disturbance documentary earthmoving exculpatory
geophysical inscrutable misanthrope problematic selfadjoint stenography
sulfonamide switchblade switchboard switzerland thunderclap valedictory
voluntarism
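The ALGOL 68 sample stores each word behind a two-character prefix holding the complements of its order and its length, so that an ordinary ascending string sort produces the order the task asks for. A rough Python sketch of the same encoding follows; `MAX_CODE = 255` is an assumed stand-in for `max abs char`, and the names `encode` and `decode` are hypothetical.

```python
MAX_CODE = 255  # assumption: stand-in for ALGOL 68's "max abs char"


def encode(word: str, order: int) -> str:
    """Prefix the word so that larger order and length sort earlier."""
    return chr(MAX_CODE - order) + chr(MAX_CODE - len(word)) + word


def decode(gram: str):
    """Recover (order, length, word) from an encoded entry."""
    return MAX_CODE - ord(gram[0]), MAX_CODE - ord(gram[1]), gram[2:]


# Hypothetical sample of (word, order) pairs.
sample = [("coco", 2), ("anna", 2), ("aaa", 3), ("beriberi", 2), ("ii", 2)]

# A plain ascending sort of the encoded strings is all that is needed.
grams = sorted(encode(word, order) for word, order in sample)
words = [decode(gram)[2] for gram in grams]

print(words)
```

The trick only works while both the order and the word length stay below `MAX_CODE`, which holds comfortably for dictionary words.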
AppleScript
use AppleScript version "2.3.1" -- Mac OS X 10.9 (Mavericks) or later.
use sorter : script ¬
"Custom Iterative Ternary Merge Sort" -- <https://www.macscripter.net/t/timsort-and-nigsort/71383/3>
use scripting additions
-- Return the n number of an n-isogram or 0 for a non-isogram.
on isogramicity(wrd)
set chrCount to (count wrd)
if (chrCount < 2) then return chrCount
set chrs to wrd's characters
tell sorter to sort(chrs, 1, chrCount, {})
set i to 1
set currentChr to chrs's beginning
repeat with j from 2 to chrCount
set testChr to chrs's item j
if (testChr ≠ currentChr) then
if (i = 1) then
set n to j - i -- First character's instance count.
else if (j - i ≠ n) then
return 0 -- Instance count mismatch.
end if
set i to j
set currentChr to testChr
end if
end repeat
if (i = 1) then return chrCount -- All characters the same.
if (chrCount - i + 1 ≠ n) then return 0 -- Mismatch with last character.
return n
end isogramicity
on task()
script o
property wrds : paragraphs of ¬
(read file ((path to desktop as text) & "unixdict.txt") as «class utf8»)
property isograms : {{}, {}, {}, {}, {}} -- Allow for up to 5-isograms.
-- Sort customisation handler to order the words as required.
on isGreater(a, b)
set ca to (count a)
set cb to (count b)
if (ca = cb) then return (a > b)
return (ca < cb)
end isGreater
end script
ignoring case -- A mere formality. It's the default and unixdict.txt is single-cased anyway!
repeat with i from 1 to (count o's wrds)
set thisWord to o's wrds's item i
set n to isogramicity(thisWord)
if (n > 0) then set end of o's isograms's item n to thisWord
end repeat
repeat with thisList in o's isograms
tell sorter to sort(thisList, 1, -1, {comparer:o})
end repeat
end ignoring
set output to {"N-isograms where n > 1:"}
set n_isograms to {}
repeat with i from (count o's isograms) to 2 by -1
set n_isograms to n_isograms & o's isograms's item i
end repeat
set wpl to 6 -- Words per line.
repeat with i from 1 to (count n_isograms)
set n_isograms's item i to text 1 thru 10 of ((n_isograms's item i) & " ")
set wtg to i mod wpl -- Words accumulated so far for the current line.
if (wtg = 0) then set end of output to join(n_isograms's items (i - wpl + 1) thru i, "")
end repeat
if (wtg > 0) then set end of output to join(n_isograms's items -wtg thru i, "")
set end of output to linefeed & "Heterograms with more than 10 characters:"
set n_isograms to o's isograms's beginning
set wpl to 4
repeat with i from 1 to (count n_isograms)
set thisWord to n_isograms's item i
if ((count thisWord) < 11) then exit repeat
set n_isograms's item i to text 1 thru 15 of (thisWord & " ")
set wtg to i mod wpl
if (wtg = 0) then set end of output to join(n_isograms's items (i - wpl + 1) thru i, "")
end repeat
if (wtg > 0) then set end of output to join(n_isograms's items (i - wtg) thru (i - 1), "")
return join(output, linefeed)
end task
on join(lst, delim)
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to delim
set txt to lst as text
set AppleScript's text item delimiters to astid
return txt
end join
task()
- Output:
"N-isograms where n > 1:
aaa iii beriberi bilabial caucasus couscous
teammate appall emmett hannah murmur tartar
testes anna coco dada deed dodo
gogo isis juju lulu mimi noon
otto papa peep poop teet tete
toot tutu ii
Heterograms with more than 10 characters:
ambidextrous bluestocking exclusionary incomputable
lexicography loudspeaking malnourished atmospheric
blameworthy centrifugal christendom consumptive
countervail countryside countrywide disturbance
documentary earthmoving exculpatory geophysical
inscrutable misanthrope problematic selfadjoint
stenography sulfonamide switchblade switchboard
switzerland thunderclap valedictory voluntarism "
AutoHotkey
LenOrder(lista) {
loop,parse,lista,%A_Space%
if (StrLen(A_LoopField) > MaxLen)
MaxLen := StrLen(A_LoopField)
loop % MaxLen-1
{
loop,parse,lista,%A_Space%
if (StrLen(A_LoopField) = MaxLen)
devolve .= A_LoopField . " "
MaxLen -= 1
}
return devolve
}
loop,read,unixdict.txt
{
encounters := 0, started := false
loop % StrLen(A_LoopReadLine)
{
target := strreplace(A_LoopReadLine,SubStr(A_LoopReadLine,a_index,1),,xt)
if !started
{
started := true
encounters := xt
}
if (xt<>encounters)
{
encounters := 0
continue
}
target := A_LoopReadLine
}
if (encounters = 1) and (StrLen(target) > 10)
heterograms .= target " "
else if (encounters > 1)
isograms%encounters% .= target " "
}
Loop
{
if (A_Index = 1)
continue
if !isograms%A_Index%
break
isograms := LenOrder(isograms%A_Index%) . isograms
}
msgbox % isograms
msgbox % LenOrder(heterograms)
ExitApp
return
~Esc::
ExitApp
- Output:
---------------------------
Isograms and Heterograms.ahk
---------------------------
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii
---------------------------
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
---------------------------
C++
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <set>
#include <string>
#include <unordered_map>
struct Isogram_pair {
std::string word;
int32_t value;
};
std::string to_lower_case(const std::string& text) {
std::string result = text;
std::transform(result.begin(), result.end(), result.begin(),
[](unsigned char ch){ return static_cast<char>(std::tolower(ch)); });
return result;
}
int32_t isogram_value(const std::string& word) {
std::unordered_map<char, int32_t> char_counts;
for ( const char& ch : word ) {
if ( char_counts.find(ch) == char_counts.end() ) {
char_counts.emplace(ch, 1);
} else {
char_counts[ch]++;
}
}
const int32_t count = char_counts[word[0]];
const bool identical = std::all_of(char_counts.begin(), char_counts.end(),
[count](const std::pair<char, int32_t> pair){ return pair.second == count; });
return identical ? count : 0;
}
int main() {
auto compare = [](Isogram_pair a, Isogram_pair b) {
return ( a.value == b.value ) ?
( ( a.word.length() == b.word.length() ) ? a.word < b.word : a.word.length() > b.word.length() )
: a.value > b.value;
};
std::set<Isogram_pair, decltype(compare)> isograms;
std::fstream file_stream;
file_stream.open("../unixdict.txt");
std::string word;
while ( file_stream >> word ) {
const int32_t value = isogram_value(to_lower_case(word));
if ( value > 1 || ( word.length() > 10 && value == 1 ) ) {
isograms.insert(Isogram_pair(word, value));
}
}
std::cout << "n-isograms with n > 1:" << std::endl;
for ( const Isogram_pair& isogram_pair : isograms ) {
if ( isogram_pair.value > 1 ) {
std::cout << isogram_pair.word << std::endl;
}
}
std::cout << "\n" << "Heterograms with more than 10 letters:" << std::endl;
for ( const Isogram_pair& isogram_pair : isograms ) {
if ( isogram_pair.value == 1 ) {
std::cout << isogram_pair.word << std::endl;
}
}
}
- Output:
n-isograms with n > 1: aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii Heterograms with more than 10 letters: ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
EasyLang
repeat
s$ = input
until s$ = ""
if len s$ > 1
w$[] &= s$
.
.
func[] letters w$ .
len r[] 127
for c$ in strchars w$
h = strcode c$
r[h] += 1
.
return r[]
.
func cmp a b a$ b$ .
if a > b
return 1
elif a = b
if len a$ > len b$
return 1
elif len a$ = len b$ and strcmp a$ b$ < 0
return 1
.
.
return 0
.
proc sort . d$[] d[] .
n = len d$[]
for i = 1 to n - 1
for j = i + 1 to n
if cmp d[j] d[i] d$[j] d$[i] = 1
swap d$[j] d$[i]
swap d[j] d[i]
.
.
.
.
proc isograms . .
for w$ in w$[]
cnt[] = letters w$
n = 0
for i to 127
if cnt[i] = 1
break 1
.
if cnt[i] > 0
if n = 0
n = cnt[i]
elif cnt[i] <> n
break 1
.
.
.
if i > 127
r$[] &= w$
n[] &= n
.
.
sort r$[] n[]
for w$ in r$[]
print w$
.
.
proc heterogram lng . .
for w$ in w$[]
if len w$ > lng
cnt[] = letters w$
for i to 127
if cnt[i] > 0 and cnt[i] <> 1
break 1
.
.
if i > 127
r$[] &= w$
n[] &= 0
.
.
.
sort r$[] n[]
for w$ in r$[]
print w$
.
.
isograms
print ""
heterogram 10
#
# the content of unixdict.txt
input_data
aaa
anna
beriberi
coco
ii
iii
ambidextrous
atmospheric
bluestocking
Factor
USING: assocs combinators.short-circuit.smart grouping io
io.encodings.ascii io.files kernel literals math math.order
math.statistics sequences sets sorting ;
CONSTANT: words $[ "unixdict.txt" ascii file-lines ]
: isogram<=> ( a b -- <=> )
{ [ histogram values first ] [ length ] } compare-with ;
: isogram-sort ( seq -- seq' )
[ isogram<=> invert-comparison ] sort ;
: isogram? ( seq -- ? )
histogram values { [ first 1 > ] [ all-eq? ] } && ;
: .words-by ( quot -- )
words swap filter isogram-sort [ print ] each ; inline
"List of n-isograms where n > 1:" print
[ isogram? ] .words-by nl
"List of heterograms of length > 10:" print
[ { [ length 10 > ] [ all-unique? ] } && ] .words-by
- Output:
List of n-isograms where n > 1: aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii List of heterograms of length > 10: ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
FreeBASIC
Function Isogram(word As String) As String
Dim As Integer i, k
Dim As String ch, chars = ""
Dim As Integer counts(26) '= {0}
For i = 1 To Len(word)
ch = Mid(word, i, 1)
k = Instr(chars, ch)
If k = 0 Then
chars &= ch
counts(Len(chars)) = 1
Else
counts(k) += 1
End If
Next
Dim As Integer c1 = counts(1), lc = Len(chars), lw = Len(word)
Dim As Integer isEqual = 1
For i = 1 To lc
If counts(i) <> c1 Then
isEqual = 0
Exit For
End If
Next
Return Iif((c1 > 1 Or lw > 10) And isEqual, word & " " & Str(c1) & " " & Str(lw), "")
End Function
Dim As String res = ""
Dim As String word
Dim As Integer i, j
Dim As String results(1000) ' Assuming a maximum of 1000 words
Dim As Integer count = 0
Open "i:\unixdict.txt" For Input As #1
Do Until Eof(1)
Line Input #1, word
Dim As String result = Isogram(word)
If result <> "" Then
results(count) = result
count += 1
End If
Loop
Close #1
Print "word n length"
For i = 0 To count - 1
Dim As String result = results(i)
Dim As Integer space1 = Instr(result, " ")
Dim As Integer space2 = Instr(Mid(result, space1 + 1), " ") + space1
word = Left(result, space1 - 1)
Dim As Integer c1 = Val(Mid(result, space1 + 1, space2 - space1 - 1))
Dim As Integer lw = Val(Mid(result, space2 + 1))
Print Using "\ \ ## ######"; word; c1; lw
Next i
Sleep
- Output:
word n length aaa 3 3 ambidextrous 1 12 anna 2 4 appall 2 6 atmospheric 1 11 beriberi 2 8 bilabial 2 8 blameworthy 1 11 bluestocking 1 12 caucasus 2 8 centrifugal 1 11 christendom 1 11 coco 2 4 consumptive 1 11 countervail 1 11 countryside 1 11 countrywide 1 11 couscous 2 8 dada 2 4 deed 2 4 disturbance 1 11 documentary 1 11 dodo 2 4 earthmoving 1 11 emmett 2 6 exclusionary 1 12 exculpatory 1 11 geophysical 1 11 gogo 2 4 hannah 2 6 ii 2 2 iii 3 3 incomputable 1 12 inscrutable 1 11 isis 2 4 juju 2 4 lexicography 1 12 loudspeaking 1 12 lulu 2 4 malnourished 1 12 mimi 2 4 misanthrope 1 11 murmur 2 6 noon 2 4 otto 2 4 papa 2 4 peep 2 4 poop 2 4 problematic 1 11 selfadjoint 1 11 stenography 1 11 sulfonamide 1 11 switchblade 1 11 switchboard 1 11 switzerland 1 11 tartar 2 6 teammate 2 8 teet 2 4 testes 2 6 tete 2 4 thunderclap 1 11 toot 2 4 tutu 2 4 valedictory 1 11 voluntarism 1 11
J
For this task, we want to know the value of n for n-isograms. This value would be zero for words which are not n-isograms. We can implement this by counting how many times each character occurs and checking whether all of those counts are the same; if they are, n is the number of times the first character occurs:
isogram=: {{ {. (#~ 1= #@~.) #/.~ y }} S:0
Also, it's worth noting that unixdict.txt is already in sorted order, even after coercing its contents to lower case:
(-: /:~) cutLF tolower fread 'unixdict.txt'
1
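The sortedness check above can be mirrored in Python. This is a sketch only: the function name `is_sorted_ignoring_case` is hypothetical, and the sample lists stand in for the lines of unixdict.txt.

```python
def is_sorted_ignoring_case(lines):
    """Return True if the lines are in ascending order once lower-cased."""
    lowered = [line.lower() for line in lines]
    return lowered == sorted(lowered)


# Hypothetical stand-ins for the contents of unixdict.txt.
print(is_sorted_ignoring_case(["aaa", "anna", "beriberi", "coco"]))
print(is_sorted_ignoring_case(["coco", "anna"]))
```

When the input is already in lexicographic order, a stable sort on n and length alone leaves ties in that order, which is presumably why the check is worth making before sorting.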
With this tool and this knowledge, we are ready to tackle this task (the /: expression sorts, and the #~ expression selects):
> (/: -@isogram,.-@#@>) (#~ 1<isogram) cutLF tolower fread 'unixdict.txt'
aaa
iii
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii
> (/: -@#@>) (#~ (10 < #@>) * 1=isogram) cutLF tolower fread 'unixdict.txt'
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism
Java
import java.io.IOException;
import java.nio.file.Path;
import java.util.AbstractSet;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeSet;
public final class IsogramsAndHeterograms {
public static void main(String[] aArgs) throws IOException {
AbstractSet<IsogramPair> isograms = new TreeSet<IsogramPair>(comparatorIsogram);
Scanner scanner = new Scanner(Path.of("unixdict.txt"));
while ( scanner.hasNext() ) {
String word = scanner.next().toLowerCase();
final int value = isogramValue(word);
if ( value > 1 || ( word.length() > 10 && value == 1 ) ) {
isograms.add( new IsogramPair(word, value) );
}
}
scanner.close();
System.out.println("n-isograms with n > 1:");
isograms.stream().filter( pair -> pair.aValue > 1 ).map( pair -> pair.aWord ).forEach(System.out::println);
System.out.println(System.lineSeparator() + "Heterograms with more than 10 letters:");
isograms.stream().filter( pair -> pair.aValue == 1 ).map( pair -> pair.aWord ).forEach(System.out::println);
}
private static int isogramValue(String aWord) {
Map<Character, Integer> charCounts = new HashMap<Character, Integer>();
for ( char ch : aWord.toCharArray() ) {
charCounts.merge(ch, 1, Integer::sum);
}
final int count = charCounts.get(aWord.charAt(0));
final boolean identical = charCounts.values().stream().allMatch( i -> i == count );
return identical ? count : 0;
}
private static Comparator<IsogramPair> comparatorIsogram =
Comparator.comparing(IsogramPair::aValue, Comparator.reverseOrder())
.thenComparing(IsogramPair::getWordLength, Comparator.reverseOrder())
.thenComparing(IsogramPair::aWord, Comparator.naturalOrder());
private record IsogramPair(String aWord, int aValue) {
private int getWordLength() {
return aWord.length();
}
};
}
- Output:
n-isograms with n > 1: aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii Heterograms with more than 10 letters: ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
jq
This entry assumes that the external file of words does not contain duplicates.
# bag of words
def bow(stream):
reduce stream as $word ({}; .[($word|tostring)] += 1);
# If the input string is an n-isogram then return n, otherwise 0:
def isogram:
bow(ascii_downcase|explode[]|[.]|implode)
| .[keys_unsorted[0]] as $n
| if all(.[]; . == $n) then $n else 0 end ;
# Read the word list (inputs) and record the n-isogram value.
# Output: an array of [word, n] values
def words:
[inputs
| select(test("^[A-Za-z]+$"))
| sub("^ +";"") | sub(" +$";"")
| [., isogram] ];
# Input: an array of [word, n] values
# Sort by decreasing order of n;
# Then by decreasing order of word length;
# Then by ascending lexicographic order
def isograms:
map( select( .[1] > 1) )
| sort_by( .[0])
| sort_by( - (.[0]|length))
| sort_by( - .[1]);
# Input: an array of [word, n] values
# Sort as for isograms
def heterograms($minlength):
map(select (.[1] == 1 and (.[0]|length) >= $minlength))
| sort_by( .[0])
| sort_by( - (.[0]|length));
words
| (isograms
| "List of the \(length) n-isograms for which n > 1:",
foreach .[] as [$word, $n] ({};
.header = if $n != .group then "\nisograms of order \($n)" else null end
| .group = $n;
(.header | select(.)), $word ) ) ,
(heterograms(11)
| "\nList of the \(length) heterograms with length > 10:", .[][0])
Invocation
< unixdict.txt jq -Rrn -f isograms-and-heterograms.jq
- Output:
List of the 33 n-isograms for which n > 1: isograms of order 3 aaa iii isograms of order 2 beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii List of the 32 heterograms with length > 10: ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
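The jq entry orders its results with three successive `sort_by` passes, least significant key first, which depends on each pass preserving the order of ties. Python's `sorted` is guaranteed to be stable, so the same layering can be sketched as follows; the sample pairs and the name `layered_sort` are hypothetical.

```python
def layered_sort(pairs):
    """Sort (word, n) pairs with three stable passes, least significant key first."""
    result = sorted(pairs, key=lambda p: p[0])          # c. ascending word
    result = sorted(result, key=lambda p: -len(p[0]))   # b. decreasing length
    result = sorted(result, key=lambda p: -p[1])        # a. decreasing n
    return result


# Hypothetical sample of (word, n) pairs.
pairs = [("ii", 2), ("coco", 2), ("aaa", 3), ("beriberi", 2), ("anna", 2), ("iii", 3)]
layered = [word for word, _ in layered_sort(pairs)]

print(layered)
```

A single pass with the composite key `(-n, -len(word), word)` gives the same result; the layered form is shown only because it matches the structure of the jq code.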
Julia
function isogram(word)
wchars, uchars = collect(word), unique(collect(word))
ulen, wlen = length(uchars), length(wchars)
(wlen == 1 || ulen == wlen) && return 1
n = count(==(first(uchars)), wchars)
return all(i -> count(==(uchars[i]), wchars) == n, 2:ulen) ? n : 0
end
words = split(lowercase(read("documents/julia/unixdict.txt", String)), r"\s+")
orderlengthtuples = [(isogram(w), length(w), w) for w in words]
tcomp(x, y) = (x[1] != y[1] ? y[1] < x[1] : x[2] != y[2] ? y[2] < x[2] : x[3] < y[3])
nisograms = sort!(filter(t -> t[1] > 1, orderlengthtuples), lt = tcomp)
heterograms = sort!(filter(t -> t[1] == 1 && length(t[3]) > 10, orderlengthtuples), lt = tcomp)
println("N-Isogram N Length\n", "-"^24)
foreach(t -> println(rpad(t[3], 8), lpad(t[1], 5), lpad(t[2], 5)), nisograms)
println("\nHeterogram Length\n", "-"^20)
foreach(t -> println(rpad(t[3], 12), lpad(t[2], 5)), heterograms)
- Output:
N-Isogram N Length ------------------------ aaa 3 3 iii 3 3 beriberi 2 8 bilabial 2 8 caucasus 2 8 couscous 2 8 teammate 2 8 appall 2 6 emmett 2 6 hannah 2 6 murmur 2 6 tartar 2 6 testes 2 6 anna 2 4 coco 2 4 dada 2 4 deed 2 4 dodo 2 4 gogo 2 4 isis 2 4 juju 2 4 lulu 2 4 mimi 2 4 noon 2 4 otto 2 4 papa 2 4 peep 2 4 poop 2 4 teet 2 4 tete 2 4 toot 2 4 tutu 2 4 ii 2 2 Heterogram Length -------------------- ambidextrous 12 bluestocking 12 exclusionary 12 incomputable 12 lexicography 12 loudspeaking 12 malnourished 12 atmospheric 11 blameworthy 11 centrifugal 11 christendom 11 consumptive 11 countervail 11 countryside 11 countrywide 11 disturbance 11 documentary 11 earthmoving 11 exculpatory 11 geophysical 11 inscrutable 11 misanthrope 11 problematic 11 selfadjoint 11 stenography 11 sulfonamide 11 switchblade 11 switchboard 11 switzerland 11 thunderclap 11 valedictory 11 voluntarism 11
Nim
import std/[algorithm, strutils, tables]
type Item = tuple[word: string; n: int]
func isogramCount(word: string): Natural =
  ## Check if the word is an isogram and return the number
  ## of times each character is present. Return 1 for
  ## heterograms. Return 0 if the word is neither an isogram
  ## nor a heterogram.
  let counts = word.toCountTable
  result = 0
  for count in counts.values:
    if result == 0:
      result = count
    elif count != result:
      return 0
proc cmp1(item1, item2: Item): int =
  ## Comparison function for part 1.
  result = cmp(item2.n, item1.n)
  if result == 0:
    result = cmp(item2.word.len, item1.word.len)
    if result == 0:
      result = cmp(item1.word, item2.word)
proc cmp2(item1, item2: Item): int =
  ## Comparison function for part 2.
  result = cmp(item1.n, item2.n)
  if result == 0:
    result = cmp(item2.word.len, item1.word.len)
    if result == 0:
      result = cmp(item1.word, item2.word)
var isograms: seq[Item]
for line in lines("unixdict.txt"):
  let word = line.toLower
  let count = word.isogramCount
  if count != 0:
    isograms.add (word, count)
echo "N-isograms where N > 1:"
isograms.sort(cmp1)
var idx = 0
for item in isograms:
  if item.n == 1: break
  inc idx
  stdout.write item.word.alignLeft(12)
  if idx mod 6 == 0: stdout.write '\n'
echo()
echo "\nHeterograms with more than 10 characters:"
isograms.sort(cmp2)
idx = 0
for item in isograms:
  if item.n != 1: break
  if item.word.len > 10:
    inc idx
    stdout.write item.word.alignLeft(16)
    if idx mod 4 == 0: stdout.write '\n'
echo()
- Output:
N-isograms where N > 1:
aaa iii beriberi bilabial caucasus couscous
teammate appall emmett hannah murmur tartar
testes anna coco dada deed dodo
gogo isis juju lulu mimi noon
otto papa peep poop teet tete
toot tutu ii
Heterograms with more than 10 characters:
ambidextrous bluestocking exclusionary incomputable
lexicography loudspeaking malnourished atmospheric
blameworthy centrifugal christendom consumptive
countervail countryside countrywide disturbance
documentary earthmoving exculpatory geophysical
inscrutable misanthrope problematic selfadjoint
stenography sulfonamide switchblade switchboard
switzerland thunderclap valedictory voluntarism
PascalABC.NET
uses System.Net;
function isogram(word: string): integer;
begin
result := 0;
var letters := new Dictionary<char, integer>;
foreach var c in word do
letters[c] := letters.Get(c) + 1;
var counts: set of integer;
foreach var letter in letters do
counts += [letter.value];
if counts.Count = 1 then
result := letters.Get(word[1]);
end;
begin
var client := new WebClient();
var text := client.DownloadString('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt');
var words: sequence of string := text.ToWords(|#10, #13|);
words.Where(w -> isogram(w) > 1)
.OrderByDescending(w -> isogram(w))
.ThenByDescending(w -> w.Length)
.ThenBy(w -> w).println;
println;
words.where(w -> (isogram(w) = 1) and (w.Length > 10))
.OrderByDescending(w -> w.Length)
.ThenBy(w -> w).println;
end.
- Output:
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
Perl
use strict;
use warnings;
use feature 'say';
use Path::Tiny;
use List::Util 'uniq';
my @words = map { lc } path('unixdict.txt')->slurp =~ /^[A-Za-z]{2,}$/gm;
my(@heterogram, %isogram);
for my $w (@words) {
my %l;
$l{$_}++ for split '', $w;
next unless 1 == scalar (my @x = uniq values %l);
if ($x[0] == 1) { push @heterogram, $w if length $w > 10 }
else { push @{$isogram{$x[0]}}, $w }
}
for my $n (reverse sort keys %isogram) {
my @i = sort { length $b <=> length $a } @{$isogram{$n}};
say scalar @i . " $n-isograms:\n" . join("\n", @i) . "\n";
}
say scalar(@heterogram) . " heterograms with more than 10 characters:\n" . join "\n", sort { length $b <=> length $a } @heterogram;
- Output:
2 3-isograms:
aaa iii

31 2-isograms:
beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii

32 heterograms with more than 10 characters:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
Phix
with javascript_semantics
function isogram(string word)
    sequence chars = {}, counts = {}
    for ch in word do
        integer k = find(ch,chars)
        if k=0 then
            chars &= ch
            counts &= 1
        else
            counts[k] += 1
        end if
    end for
    integer c1 = counts[1],
            lc = length(counts),
            lw = length(word)
    return iff((c1>1 or lw>10) and counts=repeat(c1,lc)?{word,c1,lw}:0)
end function

sequence res = sort_columns(filter(apply(unix_dict(),isogram),"!=",0),{-2,-3,1})
printf(1,"word           n length\n%s\n",{join(res,'\n',fmt:="%-14s %d %6d")})
- Output:
word           n length
aaa            3      3
iii            3      3
beriberi       2      8
bilabial       2      8
caucasus       2      8
couscous       2      8
teammate       2      8
appall         2      6
emmett         2      6
hannah         2      6
murmur         2      6
tartar         2      6
testes         2      6
anna           2      4
coco           2      4
dada           2      4
deed           2      4
dodo           2      4
gogo           2      4
isis           2      4
juju           2      4
lulu           2      4
mimi           2      4
noon           2      4
otto           2      4
papa           2      4
peep           2      4
poop           2      4
teet           2      4
tete           2      4
toot           2      4
tutu           2      4
ii             2      2
ambidextrous   1     12
bluestocking   1     12
exclusionary   1     12
incomputable   1     12
lexicography   1     12
loudspeaking   1     12
malnourished   1     12
atmospheric    1     11
blameworthy    1     11
centrifugal    1     11
christendom    1     11
consumptive    1     11
countervail    1     11
countryside    1     11
countrywide    1     11
disturbance    1     11
documentary    1     11
earthmoving    1     11
exculpatory    1     11
geophysical    1     11
inscrutable    1     11
misanthrope    1     11
problematic    1     11
selfadjoint    1     11
stenography    1     11
sulfonamide    1     11
switchblade    1     11
switchboard    1     11
switzerland    1     11
thunderclap    1     11
valedictory    1     11
voluntarism    1     11
Python
from collections import Counter

def find_n_isograms(wordlist):
    """Return all n-isograms with n > 1, sorted by n and length (both descending), then alphabetically."""
    n_isograms = []
    for word in wordlist:
        word_lower = word.lower()
        freq = Counter(word_lower)
        frequencies = freq.values()
        # An isogram has exactly one distinct per-character count.
        if len(set(frequencies)) == 1 and next(iter(frequencies)) > 1:
            n = next(iter(frequencies))
            # Negated keys give descending order under an ascending sort.
            n_isograms.append((-n, -len(word), word))
    n_isograms.sort()
    return [word for _, _, word in n_isograms]

def find_heterograms(wordlist):
    """Return all heterograms longer than 10 characters, longest first, then alphabetically."""
    heterograms = []
    for word in wordlist:
        if len(word) > 10:
            word_lower = word.lower()
            # No repeated character: as many distinct characters as characters.
            if len(set(word_lower)) == len(word_lower):
                heterograms.append((-len(word), word))
    heterograms.sort()
    return [word for _, word in heterograms]

with open('unixdict.txt', 'r') as file:
    wordlist = [line.strip() for line in file]

n_isograms_result = find_n_isograms(wordlist)
heterograms_result = find_heterograms(wordlist)
print("n-isograms (n > 1):", n_isograms_result)
print("Heterograms with more than 10 characters:", heterograms_result)
- Output:
n-isograms (n > 1): ['aaa', 'iii', 'beriberi', 'bilabial', 'caucasus', 'couscous', 'teammate', 'appall', 'emmett', 'hannah', 'murmur', 'tartar', 'testes', 'anna', 'coco', 'dada', 'deed', 'dodo', 'gogo', 'isis', 'juju', 'lulu', 'mimi', 'noon', 'otto', 'papa', 'peep', 'poop', 'teet', 'tete', 'toot', 'tutu', 'ii']
Heterograms with more than 10 characters: ['ambidextrous', 'bluestocking', 'exclusionary', 'incomputable', 'lexicography', 'loudspeaking', 'malnourished', 'atmospheric', 'blameworthy', 'centrifugal', 'christendom', 'consumptive', 'countervail', 'countryside', 'countrywide', 'disturbance', 'documentary', 'earthmoving', 'exculpatory', 'geophysical', 'inscrutable', 'misanthrope', 'problematic', 'selfadjoint', 'stenography', 'sulfonamide', 'switchblade', 'switchboard', 'switzerland', 'thunderclap', 'valedictory', 'voluntarism']
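The classification can also be checked without the word list. The following is a minimal sketch rather than part of the solution above; `isogram_order` and `task_sort_key` are hypothetical helper names introduced here for illustration, and the sample words are chosen by hand. It returns n for an n-isogram (0 otherwise) and applies the task's sort order to the samples.

```python
from collections import Counter

def isogram_order(word):
    """Return n if word is an n-isogram (ignoring case), else 0."""
    counts = set(Counter(word.lower()).values())
    # An isogram has exactly one distinct per-character count.
    return counts.pop() if len(counts) == 1 else 0

def task_sort_key(word):
    """Decreasing n, then decreasing length, then ascending lexicographic order."""
    return (-isogram_order(word), -len(word), word)

# Hand-picked samples (an assumption for illustration, not read from unixdict.txt).
samples = ["anna", "caucasus", "aaa", "atmospheric", "banana", "ii"]
isograms = sorted((w for w in samples if isogram_order(w) > 1), key=task_sort_key)
print(isograms)
```

Here "atmospheric" is excluded because it is a heterogram (n = 1) and "banana" because its letters occur a different number of times.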
Quackery
[ [] ]'[
rot witheach
[ dup nested
unrot over do
iff [ dip join ]
else nip ]
drop ] is filter ( [ --> [ )
[ 0 127 of
swap witheach
[ upper 2dup peek
1+ unrot poke ]
[] swap witheach
[ dup iff join else drop ]
dup [] = iff [ drop 0 ] done
behead swap witheach
[ over != if
[ drop 0 conclude ] ] ] is isogram ( [ --> n )
$ "rosetta/unixdict.txt" sharefile
drop nest$ dup
filter [ isogram 1 > ]
sort$
sortwith [ size dip size < ]
sortwith [ isogram dip isogram < ]
60 wrap$
cr
filter [ size 10 > ]
filter [ isogram 1 = ]
sort$
sortwith [ size dip size < ]
60 wrap$
cr
- Output:
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii

ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
Raku
my $file = 'unixdict.txt';
my @words = $file.IO.slurp.words.race.map: { $_ => .comb.Bag };
.say for (6...2).map: -> $n {
next unless my @iso = @words.race.grep({.value.values.all == $n})».key;
"\n({+@iso}) {$n}-isograms:\n" ~ @iso.sort({[-.chars, ~$_]}).join: "\n";
}
my $minchars = 10;
say "\n({+$_}) heterograms with more than $minchars characters:\n" ~
.sort({[-.chars, ~$_]}).join: "\n" given
@words.race.grep({.key.chars >$minchars && .value.values.max == 1})».key;
- Output:
(2) 3-isograms:
aaa iii

(31) 2-isograms:
beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii

(32) heterograms with more than 10 characters:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
Ruby
Blameworthy exclusionary lexicography causes unixdict.txt to make it incomputable if the word isogram is itself an isogram.
words = File.readlines("unixdict.txt", chomp: true)
isograms = words.group_by do |word|
char_counts = word.downcase.chars.tally.values
char_counts.first if char_counts.uniq.size == 1
end
isograms.delete(nil)
isograms.transform_values!{|ar| ar.sort_by{|word| [-word.size, word]} }
keys = isograms.keys.sort.reverse
keys.each{|k| puts "(#{isograms[k].size}) #{k}-isograms: #{isograms[k]} " if k > 1 }
min_chars = 10
large_heterograms = isograms[1].select{|word| word.size > min_chars }
puts "" , "(#{large_heterograms.size}) heterograms with more than #{min_chars} chars:"
puts large_heterograms
- Output:
(2) 3-isograms: ["aaa", "iii"]
(31) 2-isograms: ["beriberi", "bilabial", "caucasus", "couscous", "teammate", "appall", "emmett", "hannah", "murmur", "tartar", "testes", "anna", "coco", "dada", "deed", "dodo", "gogo", "isis", "juju", "lulu", "mimi", "noon", "otto", "papa", "peep", "poop", "teet", "tete", "toot", "tutu", "ii"]

(32) heterograms with more than 10 chars:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
Wren
import "io" for File
import "./str" for Str
var isogram = Fn.new { |word|
if (word.count == 1) return 1
var map = {}
word = Str.lower(word)
for (c in word) {
if (map.containsKey(c)) {
map[c] = map[c] + 1
} else {
map[c] = 1
}
}
var chars = map.keys.toList
var n = map[chars[0]]
var iso = chars[1..-1].all { |c| map[c] == n }
return iso ? n : 0
}
var isoComparer = Fn.new { |i, j|
if (i[1] != j[1]) return i[1] > j[1]
if (i[0].count != j[0].count) return i[0].count > j[0].count
return Str.le(i[0], j[0])
}
var heteroComparer = Fn.new { |i, j|
if (i[0].count != j[0].count) return i[0].count > j[0].count
return Str.le(i[0], j[0])
}
var wordList = "unixdict.txt" // local copy
var words = File.read(wordList)
.trimEnd()
.split("\n")
.map { |word| [word, isogram.call(word)] }
var isograms = words.where { |t| t[1] > 1 }
.toList
.sort(isoComparer)
.map { |t| " " + t[0] }
.toList
System.print("List of n-isograms(%(isograms.count)) where n > 1:")
System.print(isograms.join("\n"))
var heterograms = words.where { |t| t[1] == 1 && t[0].count > 10 }
.toList
.sort(heteroComparer)
.map { |t| " " + t[0] }
.toList
System.print("\nList of heterograms(%(heterograms.count)) of length > 10:")
System.print(heterograms.join("\n"))
- Output:
List of n-isograms(33) where n > 1:
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii

List of heterograms(32) of length > 10:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism