Isograms and heterograms

- Definitions
For the purposes of this task, an isogram means a string where each character present is used the same number of times and an n-isogram means an isogram where each character present is used exactly n times.
A heterogram means a string in which no character occurs more than once. It follows that a heterogram is the same thing as a 1-isogram.
- Examples
caucasus is a 2-isogram because the letters c, a, u and s all occur twice.
atmospheric is a heterogram because all its letters are used once only.
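The definitions above can be sketched in Python as follows. This is a minimal illustration, not one of the task solutions; the helper name `isogram_order` is hypothetical. It returns n when the word is an n-isogram and 0 otherwise, so a heterogram yields 1.

```python
from collections import Counter

def isogram_order(word: str) -> int:
    """Return n if word is an n-isogram, otherwise 0 (hypothetical helper)."""
    # Count each character, ignoring capitalization, and keep the distinct counts.
    distinct_counts = set(Counter(word.lower()).values())
    # Exactly one distinct count means every character occurs equally often.
    return distinct_counts.pop() if len(distinct_counts) == 1 else 0

print(isogram_order("caucasus"))     # 2
print(isogram_order("atmospheric"))  # 1
print(isogram_order("banana"))       # 0
```

An empty string has no character counts at all, so this sketch reports 0 for it.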
- Task
Using unixdict.txt and ignoring capitalization:
1) Find and display here all words which are n-isograms where n > 1.
Present the results as a single list but sorted as follows:
a. By decreasing order of n;
b. Then by decreasing order of word length;
c. Then by ascending lexicographic order.
2) Secondly, find and display here all words which are heterograms and have more than 10 characters.
Again present the results as a single list but sorted as per b. and c. above.
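The sort orders requested above can be sketched in Python with tuple keys. This is only an illustration under stated assumptions: the names `isogram_order` and `task` are hypothetical, and a small in-memory sample stands in for unixdict.txt.

```python
from collections import Counter

def isogram_order(word):
    """Return n if word is an n-isogram, otherwise 0 (hypothetical helper)."""
    distinct_counts = set(Counter(word).values())
    return distinct_counts.pop() if len(distinct_counts) == 1 else 0

def task(words):
    """Return (n-isograms with n > 1, heterograms longer than 10 characters)."""
    scored = [(isogram_order(w), w) for w in (w.lower() for w in words)]
    # a. decreasing n, b. decreasing length, c. ascending lexicographic order
    isograms = sorted((p for p in scored if p[0] > 1),
                      key=lambda p: (-p[0], -len(p[1]), p[1]))
    # b. decreasing length, c. ascending lexicographic order
    heterograms = sorted((w for n, w in scored if n == 1 and len(w) > 10),
                         key=lambda w: (-len(w), w))
    return [w for _, w in isograms], heterograms

# Hypothetical sample standing in for the dictionary file.
sample = ["Tartar", "aaa", "anna", "ii", "caucasus",
          "atmospheric", "ambidextrous", "banana", "dodo"]
isograms, heterograms = task(sample)
print(isograms)     # ['aaa', 'caucasus', 'tartar', 'anna', 'dodo', 'ii']
print(heterograms)  # ['ambidextrous', 'atmospheric']
```

Negating the numeric parts of the key gives the decreasing orders while the word itself still sorts in ascending order.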
ALGOL 68
# find some isograms ( words where each letter occurs the same number of #
# times as the others ) and heterograms ( words where each letter occurs #
# once ). Note a heterogram is an isogram of order 1 #
IF FILE input file;
STRING file name = "unixdict.txt";
open( input file, file name, stand in channel ) /= 0
THEN
# failed to open the file #
print( ( "Unable to open """ + file name + """", newline ) )
ELSE
# file opened OK #
BOOL at eof := FALSE;
# set the EOF handler for the file - notes eof has been reached and #
# returns TRUE so processing can continue #
on logical file end( input file, ( REF FILE f )BOOL: at eof := TRUE );
# in-place quick sort an array of strings #
PROC s quicksort = ( REF[]STRING a, INT lb, ub )VOID:
IF ub > lb
THEN
# more than one element, so must sort #
INT left := lb;
INT right := ub;
# choosing the middle element of the array as the pivot #
STRING pivot := a[ left + ( ( right + 1 ) - left ) OVER 2 ];
WHILE
WHILE IF left <= ub THEN a[ left ] < pivot ELSE FALSE FI
DO
left +:= 1
OD;
WHILE IF right >= lb THEN a[ right ] > pivot ELSE FALSE FI
DO
right -:= 1
OD;
left <= right
DO
STRING t := a[ left ];
a[ left ] := a[ right ];
a[ right ] := t;
left +:= 1;
right -:= 1
OD;
s quicksort( a, lb, right );
s quicksort( a, left, ub )
FI # s quicksort # ;
# returns the length of s #
OP LENGTH = ( STRING s )INT: 1 + ( UPB s - LWB s );
# returns n if s is an isogram of order n, 0 if s is not an isogram #
OP ORDER = ( STRING s )INT:
BEGIN
# count the number of times each character occurs #
[ 0 : max abs char ]INT count;
FOR i FROM LWB count TO UPB count DO count[ i ] := 0 OD;
FOR i FROM LWB s TO UPB s DO
CHAR c = s[ i ];
IF c >= "A" AND c <= "Z" THEN
# uppercase - treat as lower #
count[ ( ABS c - ABS "A" ) + ABS "a" ] +:= 1
ELSE
# lowercase or non-letter #
count[ ABS c ] +:= 1
FI
OD;
INT order := -1;
# check the characters all occur the same number of times #
FOR i FROM LWB count TO UPB count WHILE order /= 0 DO
IF count[ i ] /= 0 THEN
# have a character that appeared in s #
IF order = -1 THEN
# first character #
order := count[ i ]
ELIF order /= count[ i ] THEN
# character occurred a different number of times to #
# the previous one #
order := 0
FI
FI
OD;
IF order < 0 THEN 0 ELSE order FI
END # ORDER # ;
[ 1 : 2 000 ]STRING words;
INT w count := 0;
WHILE
STRING word;
get( input file, ( word, newline ) );
NOT at eof
DO
# have another word #
INT order = ORDER word;
IF order > 0 THEN
INT w length = LENGTH word;
IF ( order = 1 AND w length > 10 ) OR order > 1 THEN
# a long heterogram or an isogram #
# store the word prefixed by the max abs char complement of #
# the order and the length so when sorted, the words are #
# ordered as required by the task #
STRING s word = REPR ( max abs char - order )
+ REPR ( max abs char - w length )
+ word;
words[ w count +:= 1 ] := s word
FI
FI
OD;
close( input file );
# sort the words #
s quicksort( words, 1, w count );
# display the words #
INT prev order := 0;
INT prev length := 999 999;
INT p count := 0;
FOR w TO w count DO
STRING gram = words[ w ];
INT order = max abs char - ABS gram[ 1 ];
INT length = max abs char - ABS gram[ 2 ];
STRING word = gram[ 3 : ];
IF order /= prev order THEN
IF order = 1 THEN
print( ( newline, "heterograms longer than 10 characters" ) )
ELSE
print( ( newline, "isograms of order ", whole( order, 0 ) ) )
FI;
prev order := order;
prev length := 999 999;
p count := 0
FI;
IF prev length > length OR p count > 5 THEN
print( ( newline ) );
prev length := length;
p count := 0
FI;
print( ( " " * IF length > 11 THEN 1 ELSE 13 - length FI, word ) );
p count +:= 1
OD
FI
- Output:
isograms of order 3
aaa iii
isograms of order 2
beriberi bilabial caucasus couscous teammate
appall emmett hannah murmur tartar testes
anna coco dada deed dodo gogo
isis juju lulu mimi noon otto
papa peep poop teet tete toot
tutu
ii
heterograms longer than 10 characters
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking
malnourished
atmospheric blameworthy centrifugal christendom consumptive countervail
countryside countrywide disturbance documentary earthmoving exculpatory
geophysical inscrutable misanthrope problematic selfadjoint stenography
sulfonamide switchblade switchboard switzerland thunderclap valedictory
voluntarism
AppleScript
use AppleScript version "2.3.1" -- Mac OS X 10.9 (Mavericks) or later.
use sorter : script ¬
"Custom Iterative Ternary Merge Sort" -- <https://www.macscripter.net/t/timsort-and-nigsort/71383/3>
use scripting additions
-- Return the n number of an n-isogram or 0 for a non-isogram.
on isogramicity(wrd)
set chrCount to (count wrd)
if (chrCount < 2) then return chrCount
set chrs to wrd's characters
tell sorter to sort(chrs, 1, chrCount, {})
set i to 1
set currentChr to chrs's beginning
repeat with j from 2 to chrCount
set testChr to chrs's item j
if (testChr ≠ currentChr) then
if (i = 1) then
set n to j - i -- First character's instance count.
else if (j - i ≠ n) then
return 0 -- Instance count mismatch.
end if
set i to j
set currentChr to testChr
end if
end repeat
if (i = 1) then return chrCount -- All characters the same.
if (chrCount - i + 1 ≠ n) then return 0 -- Mismatch with last character.
return n
end isogramicity
on task()
script o
property wrds : paragraphs of ¬
(read file ((path to desktop as text) & "unixdict.txt") as «class utf8»)
property isograms : {{}, {}, {}, {}, {}} -- Allow for up to 5-isograms.
-- Sort customisation handler to order the words as required.
on isGreater(a, b)
set ca to (count a)
set cb to (count b)
if (ca = cb) then return (a > b)
return (ca < cb)
end isGreater
end script
ignoring case -- A mere formality. It's the default and unixdict.txt is single-cased anyway!
repeat with i from 1 to (count o's wrds)
set thisWord to o's wrds's item i
set n to isogramicity(thisWord)
if (n > 0) then set end of o's isograms's item n to thisWord
end repeat
repeat with thisList in o's isograms
tell sorter to sort(thisList, 1, -1, {comparer:o})
end repeat
end ignoring
set output to {"N-isograms where n > 1:"}
set n_isograms to {}
repeat with i from (count o's isograms) to 2 by -1
set n_isograms to n_isograms & o's isograms's item i
end repeat
set wpl to 6 -- Words per line.
repeat with i from 1 to (count n_isograms)
set n_isograms's item i to text 1 thru 10 of ((n_isograms's item i) & " ")
set wtg to i mod wpl -- Words to go to in this line.
if (wtg = 0) then set end of output to join(n_isograms's items (i - wpl + 1) thru i, "")
end repeat
if (wtg > 0) then set end of output to join(n_isograms's items -wtg thru i, "")
set end of output to linefeed & "Heterograms with more than 10 characters:"
set n_isograms to o's isograms's beginning
set wpl to 4
repeat with i from 1 to (count n_isograms)
set thisWord to n_isograms's item i
if ((count thisWord) < 11) then exit repeat
set n_isograms's item i to text 1 thru 15 of (thisWord & " ")
set wtg to i mod wpl
if (wtg = 0) then set end of output to join(n_isograms's items (i - wpl + 1) thru i, "")
end repeat
if (wtg > 0) then set end of output to join(n_isograms's items (i - wtg) thru (i - 1), "")
return join(output, linefeed)
end task
on join(lst, delim)
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to delim
set txt to lst as text
set AppleScript's text item delimiters to astid
return txt
end join
task()
- Output:
"N-isograms where n > 1:
aaa iii beriberi bilabial caucasus couscous
teammate appall emmett hannah murmur tartar
testes anna coco dada deed dodo
gogo isis juju lulu mimi noon
otto papa peep poop teet tete
toot tutu ii
Heterograms with more than 10 characters:
ambidextrous bluestocking exclusionary incomputable
lexicography loudspeaking malnourished atmospheric
blameworthy centrifugal christendom consumptive
countervail countryside countrywide disturbance
documentary earthmoving exculpatory geophysical
inscrutable misanthrope problematic selfadjoint
stenography sulfonamide switchblade switchboard
switzerland thunderclap valedictory voluntarism "
AutoHotkey
LenOrder(lista) {
loop,parse,lista,%A_Space%
if (StrLen(A_LoopField) > MaxLen)
MaxLen := StrLen(A_LoopField)
loop % MaxLen-1
{
loop,parse,lista,%A_Space%
if (StrLen(A_LoopField) = MaxLen)
devolve .= A_LoopField . " "
MaxLen -= 1
}
return devolve
}
loop,read,unixdict.txt
{
encounters := 0, started := false
loop % StrLen(A_LoopReadLine)
{
target := strreplace(A_LoopReadLine,SubStr(A_LoopReadLine,a_index,1),,xt)
if !started
{
started := true
encounters := xt
}
if (xt<>encounters)
{
encounters := 0
continue
}
target := A_LoopReadLine
}
if (encounters = 1) and (StrLen(target) > 10)
heterograms .= target " "
else if (encounters > 1)
isograms%encounters% .= target " "
}
Loop
{
if (A_Index = 1)
continue
if !isograms%A_Index%
break
isograms := LenOrder(isograms%A_Index%) . isograms
}
msgbox % isograms
msgbox % LenOrder(heterograms)
ExitApp
return
~Esc::
ExitApp
- Output:
---------------------------
Isograms and Heterograms.ahk
---------------------------
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii
---------------------------
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
---------------------------
C++
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <set>
#include <string>
#include <unordered_map>
struct Isogram_pair {
std::string word;
int32_t value;
};
std::string to_lower_case(const std::string& text) {
std::string result = text;
std::transform(result.begin(), result.end(), result.begin(),
[](char ch){ return std::tolower(ch); });
return result;
}
int32_t isogram_value(const std::string& word) {
std::unordered_map<char, int32_t> char_counts;
for ( const char& ch : word ) {
if ( char_counts.find(ch) == char_counts.end() ) {
char_counts.emplace(ch, 1);
} else {
char_counts[ch]++;
}
}
const int32_t count = char_counts[word[0]];
const bool identical = std::all_of(char_counts.begin(), char_counts.end(),
[count](const std::pair<char, int32_t> pair){ return pair.second == count; });
return identical ? count : 0;
}
int main() {
auto compare = [](Isogram_pair a, Isogram_pair b) {
return ( a.value == b.value ) ?
( ( a.word.length() == b.word.length() ) ? a.word < b.word : a.word.length() > b.word.length() )
: a.value > b.value;
};
std::set<Isogram_pair, decltype(compare)> isograms;
std::fstream file_stream;
file_stream.open("../unixdict.txt");
std::string word;
while ( file_stream >> word ) {
const int32_t value = isogram_value(to_lower_case(word));
if ( value > 1 || ( word.length() > 10 && value == 1 ) ) {
isograms.insert(Isogram_pair(word, value));
}
}
std::cout << "n-isograms with n > 1:" << std::endl;
for ( const Isogram_pair& isogram_pair : isograms ) {
if ( isogram_pair.value > 1 ) {
std::cout << isogram_pair.word << std::endl;
}
}
std::cout << "\n" << "Heterograms with more than 10 letters:" << std::endl;
for ( const Isogram_pair& isogram_pair : isograms ) {
if ( isogram_pair.value == 1 ) {
std::cout << isogram_pair.word << std::endl;
}
}
}
- Output:
n-isograms with n > 1:
aaa
iii
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

Heterograms with more than 10 letters:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism
Factor
USING: assocs combinators.short-circuit.smart grouping io
io.encodings.ascii io.files kernel literals math math.order
math.statistics sequences sets sorting ;
CONSTANT: words $[ "unixdict.txt" ascii file-lines ]
: isogram<=> ( a b -- <=> )
{ [ histogram values first ] [ length ] } compare-with ;
: isogram-sort ( seq -- seq' )
[ isogram<=> invert-comparison ] sort ;
: isogram? ( seq -- ? )
histogram values { [ first 1 > ] [ all-eq? ] } && ;
: .words-by ( quot -- )
words swap filter isogram-sort [ print ] each ; inline
"List of n-isograms where n > 1:" print
[ isogram? ] .words-by nl
"List of heterograms of length > 10:" print
[ { [ length 10 > ] [ all-unique? ] } && ] .words-by
- Output:
List of n-isograms where n > 1:
aaa
iii
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

List of heterograms of length > 10:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism
J
For this task, we want to know the value of n for n-isograms. This value is zero for words which are not n-isograms. We can implement this by counting how many times each character occurs and checking whether all of those counts are the same. (If they are, n is that common count, which is also the number of times the first character occurs):
isogram=: {{ {. (#~ 1= #@~.) #/.~ y }} S:0
Also, it's worth noting that unixdict.txt is already in sorted order, even after coercing its contents to lower case:
(-: /:~) cutLF tolower fread 'unixdict.txt'
1
With this tool and this knowledge, we are ready to tackle this task (the /: expression sorts, and the #~ expression selects):
> (/: -@isogram,.-@#@>) (#~ 1<isogram) cutLF tolower fread 'unixdict.txt'
aaa
iii
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii
> (/: -@#@>) (#~ (10 < #@>) * 1=isogram) cutLF tolower fread 'unixdict.txt'
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism
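The J entry above sorts only by n and by length, relying on the word list already being in ascending order and on the sort being stable. The same idea can be sketched in Python, whose `sorted` is also stable. The function name `by_length_desc` and the sample list are hypothetical.

```python
# Hypothetical sample, already in ascending lexicographic order.
words = ["anna", "caucasus", "dodo", "ii", "tartar"]
assert words == sorted(words)

def by_length_desc(sorted_words):
    """Order by decreasing length; a stable sort keeps ties in input order."""
    return sorted(sorted_words, key=lambda w: -len(w))

print(by_length_desc(words))
# ['caucasus', 'tartar', 'anna', 'dodo', 'ii']
```

If the input were not already sorted, the word itself would have to be added to the key to guarantee the lexicographic tie-break.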
Java
import java.io.IOException;
import java.nio.file.Path;
import java.util.AbstractSet;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeSet;
public final class IsogramsAndHeterograms {
public static void main(String[] aArgs) throws IOException {
AbstractSet<IsogramPair> isograms = new TreeSet<IsogramPair>(comparatorIsogram);
Scanner scanner = new Scanner(Path.of("unixdict.txt"));
while ( scanner.hasNext() ) {
String word = scanner.next().toLowerCase();
final int value = isogramValue(word);
if ( value > 1 || ( word.length() > 10 && value == 1 ) ) {
isograms.add( new IsogramPair(word, value) );
}
}
scanner.close();
System.out.println("n-isograms with n > 1:");
isograms.stream().filter( pair -> pair.aValue > 1 ).map( pair -> pair.aWord ).forEach(System.out::println);
System.out.println(System.lineSeparator() + "Heterograms with more than 10 letters:");
isograms.stream().filter( pair -> pair.aValue == 1 ).map( pair -> pair.aWord ).forEach(System.out::println);
}
private static int isogramValue(String aWord) {
Map<Character, Integer> charCounts = new HashMap<Character, Integer>();
for ( char ch : aWord.toCharArray() ) {
charCounts.merge(ch, 1, Integer::sum);
}
final int count = charCounts.get(aWord.charAt(0));
final boolean identical = charCounts.values().stream().allMatch( i -> i == count );
return identical ? count : 0;
}
private static Comparator<IsogramPair> comparatorIsogram =
Comparator.comparing(IsogramPair::aValue, Comparator.reverseOrder())
.thenComparing(IsogramPair::getWordLength, Comparator.reverseOrder())
.thenComparing(IsogramPair::aWord, Comparator.naturalOrder());
private record IsogramPair(String aWord, int aValue) {
private int getWordLength() {
return aWord.length();
}
};
}
- Output:
n-isograms with n > 1:
aaa
iii
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

Heterograms with more than 10 letters:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism
jq
This entry assumes that the external file of words does not contain duplicates.
# bag of words
def bow(stream):
reduce stream as $word ({}; .[($word|tostring)] += 1);
# If the input string is an n-isogram then return n, otherwise 0:
def isogram:
bow(ascii_downcase|explode[]|[.]|implode)
| .[keys_unsorted[0]] as $n
| if all(.[]; . == $n) then $n else 0 end ;
# Read the word list (inputs) and record the n-isogram value.
# Output: an array of [word, n] values
def words:
[inputs
| select(test("^[A-Za-z]+$"))
| sub("^ +";"") | sub(" +$";"")
| [., isogram] ];
# Input: an array of [word, n] values
# Sort by decreasing order of n;
# Then by decreasing order of word length;
# Then by ascending lexicographic order
def isograms:
map( select( .[1] > 1) )
| sort_by( .[0])
| sort_by( - (.[0]|length))
| sort_by( - .[1]);
# Input: an array of [word, n] values
# Sort as for isograms
def heterograms($minlength):
map(select (.[1] == 1 and (.[0]|length) >= $minlength))
| sort_by( .[0])
| sort_by( - (.[0]|length));
words
| (isograms
| "List of the \(length) n-isograms for which n > 1:",
foreach .[] as [$word, $n] ({};
.header = if $n != .group then "\nisograms of order \($n)" else null end
| .group = $n;
(.header | select(.)), $word ) ) ,
(heterograms(11)
| "\nList of the \(length) heterograms with length > 10:", .[][0])
Invocation
< unixdict.txt jq -Rrn -f isograms-and-heterograms.jq
- Output:
List of the 33 n-isograms for which n > 1:

isograms of order 3
aaa
iii

isograms of order 2
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

List of the 32 heterograms with length > 10:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism
Julia
function isogram(word)
wchars, uchars = collect(word), unique(collect(word))
ulen, wlen = length(uchars), length(wchars)
(wlen == 1 || ulen == wlen) && return 1
n = count(==(first(uchars)), wchars)
return all(i -> count(==(uchars[i]), wchars) == n, 2:ulen) ? n : 0
end
words = split(lowercase(read("documents/julia/unixdict.txt", String)), r"\s+")
orderlengthtuples = [(isogram(w), length(w), w) for w in words]
tcomp(x, y) = (x[1] != y[1] ? y[1] < x[1] : x[2] != y[2] ? y[2] < x[2] : x[3] < y[3])
nisograms = sort!(filter(t -> t[1] > 1, orderlengthtuples), lt = tcomp)
heterograms = sort!(filter(t -> t[1] == 1 && length(t[3]) > 10, orderlengthtuples), lt = tcomp)
println("N-Isogram N Length\n", "-"^24)
foreach(t -> println(rpad(t[3], 8), lpad(t[1], 5), lpad(t[2], 5)), nisograms)
println("\nHeterogram Length\n", "-"^20)
foreach(t -> println(rpad(t[3], 12), lpad(t[2], 5)), heterograms)
- Output:
N-Isogram N Length
------------------------
aaa 3 3
iii 3 3
beriberi 2 8
bilabial 2 8
caucasus 2 8
couscous 2 8
teammate 2 8
appall 2 6
emmett 2 6
hannah 2 6
murmur 2 6
tartar 2 6
testes 2 6
anna 2 4
coco 2 4
dada 2 4
deed 2 4
dodo 2 4
gogo 2 4
isis 2 4
juju 2 4
lulu 2 4
mimi 2 4
noon 2 4
otto 2 4
papa 2 4
peep 2 4
poop 2 4
teet 2 4
tete 2 4
toot 2 4
tutu 2 4
ii 2 2

Heterogram Length
--------------------
ambidextrous 12
bluestocking 12
exclusionary 12
incomputable 12
lexicography 12
loudspeaking 12
malnourished 12
atmospheric 11
blameworthy 11
centrifugal 11
christendom 11
consumptive 11
countervail 11
countryside 11
countrywide 11
disturbance 11
documentary 11
earthmoving 11
exculpatory 11
geophysical 11
inscrutable 11
misanthrope 11
problematic 11
selfadjoint 11
stenography 11
sulfonamide 11
switchblade 11
switchboard 11
switzerland 11
thunderclap 11
valedictory 11
voluntarism 11
Nim
import std/[algorithm, strutils, tables]
type Item = tuple[word: string; n: int]
func isogramCount(word: string): Natural =
  ## Check if the word is an isogram and return the number
  ## of times each character is present. Return 1 for
  ## heterograms. Return 0 if the word is neither an isogram
  ## nor a heterogram.
  let counts = word.toCountTable
  result = 0
  for count in counts.values:
    if result == 0:
      result = count
    elif count != result:
      return 0
proc cmp1(item1, item2: Item): int =
  ## Comparison function for part 1.
  result = cmp(item2.n, item1.n)
  if result == 0:
    result = cmp(item2.word.len, item1.word.len)
    if result == 0:
      result = cmp(item1.word, item2.word)
proc cmp2(item1, item2: Item): int =
  ## Comparison function for part 2.
  result = cmp(item1.n, item2.n)
  if result == 0:
    result = cmp(item2.word.len, item1.word.len)
    if result == 0:
      result = cmp(item1.word, item2.word)
var isograms: seq[Item]
for line in lines("unixdict.txt"):
  let word = line.toLower
  let count = word.isogramCount
  if count != 0:
    isograms.add (word, count)
echo "N-isograms where N > 1:"
isograms.sort(cmp1)
var idx = 0
for item in isograms:
  if item.n == 1: break
  inc idx
  stdout.write item.word.alignLeft(12)
  if idx mod 6 == 0: stdout.write '\n'
echo()
echo "\nHeterograms with more than 10 characters:"
isograms.sort(cmp2)
idx = 0
for item in isograms:
  if item.n != 1: break
  if item.word.len > 10:
    inc idx
    stdout.write item.word.alignLeft(16)
    if idx mod 4 == 0: stdout.write '\n'
echo()
- Output:
N-isograms where N > 1:
aaa iii beriberi bilabial caucasus couscous
teammate appall emmett hannah murmur tartar
testes anna coco dada deed dodo
gogo isis juju lulu mimi noon
otto papa peep poop teet tete
toot tutu ii

Heterograms with more than 10 characters:
ambidextrous bluestocking exclusionary incomputable
lexicography loudspeaking malnourished atmospheric
blameworthy centrifugal christendom consumptive
countervail countryside countrywide disturbance
documentary earthmoving exculpatory geophysical
inscrutable misanthrope problematic selfadjoint
stenography sulfonamide switchblade switchboard
switzerland thunderclap valedictory voluntarism
Perl
use strict;
use warnings;
use feature 'say';
use Path::Tiny;
use List::Util 'uniq';
my @words = map { lc } path('unixdict.txt')->slurp =~ /^[A-Za-z]{2,}$/gm;
my(@heterogram, %isogram);
for my $w (@words) {
my %l;
$l{$_}++ for split '', $w;
next unless 1 == scalar (my @x = uniq values %l);
if ($x[0] == 1) { push @heterogram, $w if length $w > 10 }
else { push @{$isogram{$x[0]}}, $w }
}
for my $n (reverse sort keys %isogram) {
my @i = sort { length $b <=> length $a } @{$isogram{$n}};
say scalar @i . " $n-isograms:\n" . join("\n", @i) . "\n";
}
say scalar(@heterogram) . " heterograms with more than 10 characters:\n" . join "\n", sort { length $b <=> length $a } @heterogram;
- Output:
2 3-isograms:
aaa
iii

31 2-isograms:
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

32 heterograms with more than 10 characters:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism
Phix
with javascript_semantics
function isogram(string word)
    sequence chars = {}, counts = {}
    for ch in word do
        integer k = find(ch,chars)
        if k=0 then
            chars &= ch
            counts &= 1
        else
            counts[k] += 1
        end if
    end for
    integer c1 = counts[1],
            lc = length(counts),
            lw = length(word)
    return iff((c1>1 or lw>10) and counts=repeat(c1,lc)?{word,c1,lw}:0)
end function
sequence res = sort_columns(filter(apply(unix_dict(),isogram),"!=",0),{-2,-3,1})
printf(1,"word n length\n%s\n",{join(res,'\n',fmt:="%-14s %d %6d")})
- Output:
word n length
aaa 3 3
iii 3 3
beriberi 2 8
bilabial 2 8
caucasus 2 8
couscous 2 8
teammate 2 8
appall 2 6
emmett 2 6
hannah 2 6
murmur 2 6
tartar 2 6
testes 2 6
anna 2 4
coco 2 4
dada 2 4
deed 2 4
dodo 2 4
gogo 2 4
isis 2 4
juju 2 4
lulu 2 4
mimi 2 4
noon 2 4
otto 2 4
papa 2 4
peep 2 4
poop 2 4
teet 2 4
tete 2 4
toot 2 4
tutu 2 4
ii 2 2
ambidextrous 1 12
bluestocking 1 12
exclusionary 1 12
incomputable 1 12
lexicography 1 12
loudspeaking 1 12
malnourished 1 12
atmospheric 1 11
blameworthy 1 11
centrifugal 1 11
christendom 1 11
consumptive 1 11
countervail 1 11
countryside 1 11
countrywide 1 11
disturbance 1 11
documentary 1 11
earthmoving 1 11
exculpatory 1 11
geophysical 1 11
inscrutable 1 11
misanthrope 1 11
problematic 1 11
selfadjoint 1 11
stenography 1 11
sulfonamide 1 11
switchblade 1 11
switchboard 1 11
switzerland 1 11
thunderclap 1 11
valedictory 1 11
voluntarism 1 11
Quackery
[ [] ]'[
rot witheach
[ dup nested
unrot over do
iff [ dip join ]
else nip ]
drop ] is filter ( [ --> [ )
[ 0 127 of
swap witheach
[ upper 2dup peek
1+ unrot poke ]
[] swap witheach
[ dup iff join else drop ]
dup [] = iff [ drop 0 ] done
behead swap witheach
[ over != if
[ drop 0 conclude ] ] ] is isogram ( [ --> n )
$ "rosetta/unixdict.txt" sharefile
drop nest$ dup
filter [ isogram 1 > ]
sort$
sortwith [ size dip size < ]
sortwith [ isogram dip isogram < ]
60 wrap$
cr
filter [ size 10 > ]
filter [ isogram 1 = ]
sort$
sortwith [ size dip size < ]
60 wrap$
cr
- Output:
aaa iii beriberi bilabial caucasus couscous teammate appall
emmett hannah murmur tartar testes anna coco dada deed dodo
gogo isis juju lulu mimi noon otto papa peep poop teet tete
toot tutu ii

ambidextrous bluestocking exclusionary incomputable
lexicography loudspeaking malnourished atmospheric
blameworthy centrifugal christendom consumptive countervail
countryside countrywide disturbance documentary earthmoving
exculpatory geophysical inscrutable misanthrope problematic
selfadjoint stenography sulfonamide switchblade switchboard
switzerland thunderclap valedictory voluntarism
Raku
my $file = 'unixdict.txt';
my @words = $file.IO.slurp.words.race.map: { $_ => .comb.Bag };
.say for (6...2).map: -> $n {
next unless my @iso = @words.race.grep({.value.values.all == $n})».key;
"\n({+@iso}) {$n}-isograms:\n" ~ @iso.sort({[-.chars, ~$_]}).join: "\n";
}
my $minchars = 10;
say "\n({+$_}) heterograms with more than $minchars characters:\n" ~
.sort({[-.chars, ~$_]}).join: "\n" given
@words.race.grep({.key.chars >$minchars && .value.values.max == 1})».key;
- Output:
(2) 3-isograms:
aaa
iii

(31) 2-isograms:
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

(32) heterograms with more than 10 characters:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism
Ruby
Blameworthy exclusionary lexicography causes unixdict.txt to make it incomputable if the word isogram is itself an isogram.
words = File.readlines("unixdict.txt", chomp: true)
isograms = words.group_by do |word|
char_counts = word.downcase.chars.tally.values
char_counts.first if char_counts.uniq.size == 1
end
isograms.delete(nil)
isograms.transform_values!{|ar| ar.sort_by{|word| [-word.size, word]} }
keys = isograms.keys.sort.reverse
keys.each{|k| puts "(#{isograms[k].size}) #{k}-isograms: #{isograms[k]} " if k > 1 }
min_chars = 10
large_heterograms = isograms[1].select{|word| word.size > min_chars }
puts "" , "(#{large_heterograms.size}) heterograms with more than #{min_chars} chars:"
puts large_heterograms
- Output:
(2) 3-isograms: ["aaa", "iii"]
(31) 2-isograms: ["beriberi", "bilabial", "caucasus", "couscous", "teammate", "appall", "emmett", "hannah", "murmur", "tartar", "testes", "anna", "coco", "dada", "deed", "dodo", "gogo", "isis", "juju", "lulu", "mimi", "noon", "otto", "papa", "peep", "poop", "teet", "tete", "toot", "tutu", "ii"]

(32) heterograms with more than 10 chars:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism
Wren
import "io" for File
import "./str" for Str
var isogram = Fn.new { |word|
if (word.count == 1) return 1
var map = {}
word = Str.lower(word)
for (c in word) {
if (map.containsKey(c)) {
map[c] = map[c] + 1
} else {
map[c] = 1
}
}
var chars = map.keys.toList
var n = map[chars[0]]
var iso = chars[1..-1].all { |c| map[c] == n }
return iso ? n : 0
}
var isoComparer = Fn.new { |i, j|
if (i[1] != j[1]) return i[1] > j[1]
if (i[0].count != j[0].count) return i[0].count > j[0].count
return Str.le(i[0], j[0])
}
var heteroComparer = Fn.new { |i, j|
if (i[0].count != j[0].count) return i[0].count > j[0].count
return Str.le(i[0], j[0])
}
var wordList = "unixdict.txt" // local copy
var words = File.read(wordList)
.trimEnd()
.split("\n")
.map { |word| [word, isogram.call(word)] }
var isograms = words.where { |t| t[1] > 1 }
.toList
.sort(isoComparer)
.map { |t| " " + t[0] }
.toList
System.print("List of n-isograms(%(isograms.count)) where n > 1:")
System.print(isograms.join("\n"))
var heterograms = words.where { |t| t[1] == 1 && t[0].count > 10 }
.toList
.sort(heteroComparer)
.map { |t| " " + t[0] }
.toList
System.print("\nList of heterograms(%(heterograms.count)) of length > 10:")
System.print(heterograms.join("\n"))
- Output:
List of n-isograms(33) where n > 1:
aaa
iii
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

List of heterograms(32) of length > 10:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism