Isograms and heterograms
- Definitions
For the purposes of this task, an isogram means a string where each character present is used the same number of times and an n-isogram means an isogram where each character present is used exactly n times.
A heterogram means a string in which no character occurs more than once. It follows that a heterogram is the same thing as a 1-isogram.
- Examples
caucasus is a 2-isogram because the letters c, a, u and s all occur twice.
atmospheric is a heterogram because all its letters are used once only.
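The definitions above can be checked mechanically. The following Python sketch is illustrative only: the helper name `isogram_order` is an assumption, not part of the task. It returns n when a string is an n-isogram and 0 otherwise, folding case first.

```python
from collections import Counter

def isogram_order(word: str) -> int:
    """Return n if `word` is an n-isogram, otherwise 0 (hypothetical helper)."""
    # Collect the distinct per-character counts, ignoring capitalization.
    counts = set(Counter(word.lower()).values())
    # An isogram has exactly one distinct count; that count is n.
    return counts.pop() if len(counts) == 1 else 0

print(isogram_order("caucasus"))     # 2
print(isogram_order("atmospheric"))  # 1, i.e. a heterogram
print(isogram_order("banana"))       # 0, not an isogram
```

A heterogram is then simply a word for which this helper returns 1.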
- Task
Using unixdict.txt and ignoring capitalization:
1) Find and display here all words which are n-isograms where n > 1.
Present the results as a single list but sorted as follows:
a. By decreasing order of n;
b. Then by decreasing order of word length;
c. Then by ascending lexicographic order.
2) Find and display here all words which are heterograms and have more than 10 characters.
Again present the results as a single list but sorted as per b. and c. above.
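The sort orders described above map naturally onto a composite sort key. The sketch below is one possible reading, not a reference solution: the function names and the small in-memory word list are assumptions made for illustration, and a real run would read unixdict.txt instead.

```python
from collections import Counter

def isogram_order(word: str) -> int:
    """Return n if `word` is an n-isogram, otherwise 0 (hypothetical helper)."""
    counts = set(Counter(word.lower()).values())
    return counts.pop() if len(counts) == 1 else 0

def sort_isograms(words):
    """Keep n-isograms with n > 1; order by n (desc), length (desc), word (asc)."""
    tagged = [(isogram_order(w), w.lower()) for w in words]
    kept = [t for t in tagged if t[0] > 1]
    kept.sort(key=lambda t: (-t[0], -len(t[1]), t[1]))
    return [w for _, w in kept]

def sort_heterograms(words, min_len=11):
    """Keep heterograms of at least `min_len` characters; order by length (desc), word (asc)."""
    kept = [w.lower() for w in words
            if len(w) >= min_len and isogram_order(w) == 1]
    return sorted(kept, key=lambda w: (-len(w), w))

# Illustrative sample only; not the contents of unixdict.txt.
sample = ["noon", "aaa", "caucasus", "anna", "banana",
          "atmospheric", "ambidextrous", "cat"]
print(sort_isograms(sample))     # ['aaa', 'caucasus', 'anna', 'noon']
print(sort_heterograms(sample))  # ['ambidextrous', 'atmospheric']
```

Negating n and the length in the key gives the two descending orders while leaving the final tie-break ascending.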
Factor
<lang factor>USING: assocs combinators.short-circuit.smart grouping io io.encodings.ascii io.files kernel literals math math.order math.statistics sequences sets sorting ;

CONSTANT: words $[ "unixdict.txt" ascii file-lines ]

! Compare two words by a letter count (n for isograms), then by length.
: isogram<=> ( a b -- <=> )
    { [ histogram values first ] [ length ] } compare-with ;

: isogram-sort ( seq -- seq' )
    [ isogram<=> invert-comparison ] sort ;

! True when every letter occurs the same number of times and that number exceeds 1.
: isogram? ( seq -- ? )
    histogram values { [ first 1 > ] [ all-eq? ] } && ;

: .words-by ( quot -- )
    words swap filter isogram-sort [ print ] each ; inline

"List of n-isograms where n > 1:" print
[ isogram? ] .words-by nl

"List of heterograms of length > 10:" print
[ { [ length 10 > ] [ all-unique? ] } && ] .words-by</lang>
- Output:
List of n-isograms where n > 1:
aaa iii
beriberi bilabial caucasus couscous teammate
appall emmett hannah murmur tartar testes
anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu
ii

List of heterograms of length > 10:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished
atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
Raku
<lang perl6>my $file = 'unixdict.txt';

my @words = $file.IO.slurp.words.map: { $_ => .comb.Bag };

(6...2).map: -> $n {
    next unless my @iso = @words.grep({.value.values.all == $n}).map: *.key;
    say "\n({+@iso}) {$n}-isograms:\n" ~ @iso.sort({[-.chars, ~$_]}).join: "\n";
}

my $minchars = 10;

say "\n({+$_}) heterograms with more than $minchars characters:\n" ~
  .sort({[-.chars, ~$_]}).join: "\n" given
  @words.grep({.key.chars > $minchars && .value.values.max == 1})».key;</lang>
- Output:
(2) 3-isograms:
aaa iii

(31) 2-isograms:
beriberi bilabial caucasus couscous teammate
appall emmett hannah murmur tartar testes
anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu
ii

(32) heterograms with more than 10 characters:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished
atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
Wren
<lang ecmascript>import "io" for File
import "./str" for Str

var isogram = Fn.new { |word|
    if (word.count == 1) return 1
    var map = {}
    word = Str.lower(word)
    for (c in word) {
        if (map.containsKey(c)) {
            map[c] = map[c] + 1
        } else {
            map[c] = 1
        }
    }
    var chars = map.keys.toList
    var n = map[chars[0]]
    var iso = chars[1..-1].all { |c| map[c] == n }
    return iso ? n : 0
}

var isoComparer = Fn.new { |i, j|
    if (i[1] != j[1]) return i[1] > j[1]
    if (i[0].count != j[0].count) return i[0].count > j[0].count
    return Str.le(i[0], j[0])
}

var heteroComparer = Fn.new { |i, j|
    if (i[0].count != j[0].count) return i[0].count > j[0].count
    return Str.le(i[0], j[0])
}

var wordList = "unixdict.txt" // local copy
var words = File.read(wordList)
                .trimEnd()
                .split("\n")
                .map { |word| [word, isogram.call(word)] }

var isograms = words.where { |t| t[1] > 1 }
                    .toList
                    .sort(isoComparer)
                    .map { |t| " " + t[0] }
                    .toList

System.print("List of n-isograms(%(isograms.count)) where n > 1:")
System.print(isograms.join("\n"))

var heterograms = words.where { |t| t[1] == 1 && t[0].count > 10 }
                       .toList
                       .sort(heteroComparer)
                       .map { |t| " " + t[0] }
                       .toList

System.print("\nList of heterograms(%(heterograms.count)) of length > 10:")
System.print(heterograms.join("\n"))</lang>
- Output:
List of n-isograms(33) where n > 1:
aaa iii
beriberi bilabial caucasus couscous teammate
appall emmett hannah murmur tartar testes
anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu
ii

List of heterograms(32) of length > 10:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished
atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism