Isograms and heterograms

From Rosetta Code
You are encouraged to solve this task according to the task description, using any language you may know.
Definitions

For the purposes of this task, an isogram means a string where each character present is used the same number of times and an n-isogram means an isogram where each character present is used exactly n times.

A heterogram means a string in which no character occurs more than once. It follows that a heterogram is the same thing as a 1-isogram.


Examples

caucasus is a 2-isogram because the letters c, a, u and s all occur twice.

atmospheric is a heterogram because all its letters are used once only.
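
The definitions above can be sketched in a few lines of Python. This is only an illustration, not one of the task solutions: the helper names `isogram_order` and `is_heterogram` are hypothetical, and the convention of returning 0 for a non-isogram is an assumption borrowed from several of the entries below.

```python
from collections import Counter


def isogram_order(word: str) -> int:
    """Return n if word is an n-isogram, otherwise 0 (hypothetical helper)."""
    # Count each character, ignoring capitalization, and keep the distinct counts.
    counts = set(Counter(word.lower()).values())
    # An isogram has exactly one distinct per-character count.
    return counts.pop() if len(counts) == 1 else 0


def is_heterogram(word: str) -> bool:
    """A heterogram is the same thing as a 1-isogram."""
    return isogram_order(word) == 1


print(isogram_order("caucasus"))     # 2
print(isogram_order("atmospheric"))  # 1
print(isogram_order("banana"))       # 0
```

The empty string has no characters to count, so this sketch reports it as 0 (not an isogram).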


Task

Using unixdict.txt and ignoring capitalization:


1) Find and display here all words which are n-isograms where n > 1.

Present the results as a single list but sorted as follows:

a. By decreasing order of n;

b. Then by decreasing order of word length;

c. Then by ascending lexicographic order.

2) Secondly, find and display here all words which are heterograms and have more than 10 characters.

Again present the results as a single list but sorted as per b. and c. above.
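
One possible way to express the three sort criteria is a single tuple key, negating n and the word length so that they sort in decreasing order while the word itself sorts ascending. The sketch below assumes a small in-memory sample rather than unixdict.txt, and `isogram_order` and `sort_key` are hypothetical names used only for illustration.

```python
from collections import Counter


def isogram_order(word):
    """Return n if word is an n-isogram, otherwise 0 (hypothetical helper)."""
    counts = set(Counter(word).values())
    return counts.pop() if len(counts) == 1 else 0


def sort_key(word):
    # a. decreasing n, b. decreasing length, c. ascending lexicographic order.
    return (-isogram_order(word), -len(word), word)


# Stand-in for the contents of unixdict.txt (an assumption for this sketch).
sample = ["Anna", "iii", "beriberi", "aaa", "ii", "coco", "banana",
          "atmospheric", "ambidextrous", "switchboard", "word"]
words = [w.lower() for w in sample]  # ignore capitalization

n_isograms = sorted((w for w in words if isogram_order(w) > 1), key=sort_key)
heterograms = sorted(
    (w for w in words if isogram_order(w) == 1 and len(w) > 10),
    key=sort_key,
)

print(n_isograms)   # ['aaa', 'iii', 'beriberi', 'anna', 'coco', 'ii']
print(heterograms)  # ['ambidextrous', 'atmospheric', 'switchboard']
```

Because every heterogram has n = 1, the same key reduces to criteria b. and c. for the second list.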




ALGOL 68

Library: ALGOL 68-sort

Note, files.incl.a68 and sort.incl.a68 are available on separate pages on Rosetta Code, see the above links.

BEGIN # find some isograms ( words where each letter occurs the same number   #
      # of times as the others ) and heterograms ( words where each letter    #
      # occurs once ). Note a heterogram is an isogram of order 1             #

    PR read "files.incl.a68" PR                      # include file utilities #
    PR read  "sort.incl.a68" PR                      # include sort utilities #

    # returns the length of s                                                 #
    OP LENGTH = ( STRING s )INT: 1 + ( UPB s - LWB s );
    # returns n if s is an isogram of order n, 0 if s is not an isogram       #
    OP   ORDER  = ( STRING s )INT:
         BEGIN
            # count the number of times each character occurs                 #
            [ 0 : max abs char ]INT count;
            FOR i FROM LWB count TO UPB count DO count[ i ] := 0 OD;
            FOR i FROM LWB s TO UPB s DO
                CHAR c = s[ i ];
                IF c >= "A" AND c <= "Z" THEN    # uppercase - treat as lower #
                    count[ ( ABS c - ABS "A" ) + ABS "a" ] +:= 1
                ELSE                                # lowercase or non-letter #
                    count[ ABS c ] +:= 1
                FI
            OD;
            INT order := -1;
            # check the characters all occur the same number of times         #
            FOR i FROM LWB count TO UPB count WHILE order /= 0 DO
                IF count[ i ] /= 0 THEN # have a character that appeared in s #
                    IF   order = -1 THEN                    # first character #
                        order := count[ i ]
                    ELIF order /= count[ i ] THEN      # character occurred a #
                        order := 0                # different number of times #
                                                        # to the previous one #
                    FI
                FI
            OD;
            IF order < 0 THEN 0 ELSE order FI
         END # ORDER # ;

    [ 1 : 2 000 ]STRING words;   # table of required isograms and heterograms #

    # stores word in words if it is an isogram or heterogram of more than 10  #
    # characters                                                              #
    # returns TRUE if word was stored, FALSE otherwise                        #
    # count so far will contain the number of preceding matching words        #
    PROC store grams = ( STRING word, INT count so far )BOOL:
         IF   INT order = ORDER word;
              order < 1
         THEN FALSE                            # not an isogram or heterogram #
         ELIF INT w length = LENGTH word;
              order = 1 AND w length <= 10
         THEN FALSE                                        # short heterogram #
         ELSE # a long heterogram or an isogram                               #
              # store the word prefixed by the max abs char complement of the #
              # order and the length so when sorted, the words are            #
              # ordered as required by the task                               #
              STRING s word = REPR ( max abs char - order    )
                            + REPR ( max abs char - w length )
                            + word;
              words[ count so far + 1 ] := s word;
              TRUE
         FI # store grams # ;

    IF INT   w count = "unixdict.txt" EACHLINE store grams;
       w count < 0
    THEN
        print( ( "Unable to open unixdict.txt", newline ) )
    ELSE
        words QUICKSORT ELEMENTS( 1, w count );              # sort the words #
        # display the words                                                   #
        INT prev order  :=       0;
        INT prev length := 999 999;
        INT p count     :=       0;
        FOR w TO w count DO
            STRING gram   = words[ w ];
            INT    order  = max abs char - ABS gram[ 1 ];
            INT    length = max abs char - ABS gram[ 2 ];
            STRING word   = gram[ 3 : ];
            IF order /= prev order THEN
                print( ( newline
                       , IF order = 1
                         THEN "heterograms longer than 10 characters"
                         ELSE "isograms of order " + whole( order, 0 )
                         FI
                       )
                     );
                prev order  := order;
                prev length := 999 999;
                p count     := 0
            FI;
            IF prev length > length OR p count > 5 THEN
                print( ( newline ) );
                prev length := length;
                p count     := 0
            FI;
            print( ( " " * IF length > 11 THEN 1 ELSE 13 - length FI, word ) );
            p count +:= 1
        OD
    FI

END
Output:

isograms of order 3
          aaa          iii
isograms of order 2
     beriberi     bilabial     caucasus     couscous     teammate
       appall       emmett       hannah       murmur       tartar       testes
         anna         coco         dada         deed         dodo         gogo
         isis         juju         lulu         mimi         noon         otto
         papa         peep         poop         teet         tete         toot
         tutu
           ii
heterograms longer than 10 characters
 ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking
 malnourished
  atmospheric  blameworthy  centrifugal  christendom  consumptive  countervail
  countryside  countrywide  disturbance  documentary  earthmoving  exculpatory
  geophysical  inscrutable  misanthrope  problematic  selfadjoint  stenography
  sulfonamide  switchblade  switchboard  switzerland  thunderclap  valedictory
  voluntarism

AppleScript

use AppleScript version "2.3.1" -- Mac OS X 10.9 (Mavericks) or later.
use sorter : script ¬
	"Custom Iterative Ternary Merge Sort" -- <https://www.macscripter.net/t/timsort-and-nigsort/71383/3>
use scripting additions

-- Return the n number of an n-isogram or 0 for a non-isogram.
on isogramicity(wrd)
    set chrCount to (count wrd)
    if (chrCount < 2) then return chrCount
    set chrs to wrd's characters
    tell sorter to sort(chrs, 1, chrCount, {})
    
    set i to 1
    set currentChr to chrs's beginning
    repeat with j from 2 to chrCount
        set testChr to chrs's item j
        if (testChr ≠ currentChr) then
            if (i = 1) then
                set n to j - i -- First character's instance count.
            else if (j - i ≠ n) then
                return 0 -- Instance count mismatch.
            end if
            set i to j
            set currentChr to testChr
        end if
    end repeat
    if (i = 1) then return chrCount -- All characters the same.
    if (chrCount - i + 1 ≠ n) then return 0 -- Mismatch with last character.
    return n
end isogramicity

on task()
    script o
        property wrds : paragraphs of ¬
            (read file ((path to desktop as text) & "unixdict.txt") as «class utf8»)
        property isograms : {{}, {}, {}, {}, {}} -- Allow for up to 5-isograms.
        
        -- Sort customisation handler to order the words as required.
        on isGreater(a, b)
            set ca to (count a)
            set cb to (count b)
            if (ca = cb) then return (a > b)
            return (ca < cb)
        end isGreater
    end script
    
    ignoring case -- A mere formality. It's the default and unixdict.txt is single-cased anyway!
        repeat with i from 1 to (count o's wrds)
            set thisWord to o's wrds's item i
            set n to isogramicity(thisWord)
            if (n > 0) then set end of o's isograms's item n to thisWord
        end repeat
        repeat with thisList in o's isograms
            tell sorter to sort(thisList, 1, -1, {comparer:o})
        end repeat
    end ignoring
    
    set output to {"N-isograms where n > 1:"}
    set n_isograms to {}
    repeat with i from (count o's isograms) to 2 by -1
        set n_isograms to n_isograms & o's isograms's item i
    end repeat
    set wpl to 6 -- Words per line.
    repeat with i from 1 to (count n_isograms)
        set n_isograms's item i to text 1 thru 10 of ((n_isograms's item i) & "          ")
        set wtg to i mod wpl -- Words to go to in this line.
        if (wtg = 0) then set end of output to join(n_isograms's items (i - wpl + 1) thru i, "")
    end repeat
    if (wtg > 0) then set end of output to join(n_isograms's items -wtg thru i, "")
    
    set end of output to linefeed & "Heterograms with more than 10 characters:"
    set n_isograms to o's isograms's beginning
    set wpl to 4
    repeat with i from 1 to (count n_isograms)
        set thisWord to n_isograms's item i
        if ((count thisWord) < 11) then exit repeat
        set n_isograms's item i to text 1 thru 15 of (thisWord & "    ")
        set wtg to i mod wpl
        if (wtg = 0) then set end of output to join(n_isograms's items (i - wpl + 1) thru i, "")
    end repeat
    if (wtg > 0) then set end of output to join(n_isograms's items (i - wtg) thru (i - 1), "")
    
    return join(output, linefeed)
end task

on join(lst, delim)
    set astid to AppleScript's text item delimiters
    set AppleScript's text item delimiters to delim
    set txt to lst as text
    set AppleScript's text item delimiters to astid
    return txt
end join


task()
Output:
"N-isograms where n > 1:
aaa       iii       beriberi  bilabial  caucasus  couscous  
teammate  appall    emmett    hannah    murmur    tartar    
testes    anna      coco      dada      deed      dodo      
gogo      isis      juju      lulu      mimi      noon      
otto      papa      peep      poop      teet      tete      
toot      tutu      ii        

Heterograms with more than 10 characters:
ambidextrous   bluestocking   exclusionary   incomputable   
lexicography   loudspeaking   malnourished   atmospheric    
blameworthy    centrifugal    christendom    consumptive    
countervail    countryside    countrywide    disturbance    
documentary    earthmoving    exculpatory    geophysical    
inscrutable    misanthrope    problematic    selfadjoint    
stenography    sulfonamide    switchblade    switchboard    
switzerland    thunderclap    valedictory    voluntarism    "

AutoHotkey

LenOrder(lista) {
	loop,parse,lista,%A_Space%
		if (StrLen(A_LoopField) > MaxLen)
			MaxLen := StrLen(A_LoopField)
	loop % MaxLen-1
		{
			loop,parse,lista,%A_Space%
				if (StrLen(A_LoopField) = MaxLen)
					devolve .= A_LoopField . " "
			MaxLen -= 1
		}
	return devolve		
	}

loop,read,unixdict.txt
{
	encounters := 0, started := false
	loop % StrLen(A_LoopReadLine)
	{
		target := strreplace(A_LoopReadLine,SubStr(A_LoopReadLine,a_index,1),,xt)
		if !started
		{
			started := true
			encounters := xt
		}
		if (xt<>encounters)
			{
				encounters := 0
				continue
			}
		target := A_LoopReadLine
	}
	if (encounters = 1) and (StrLen(target) > 10)
		heterograms .= target " "
	else if (encounters > 1)
		isograms%encounters% .= target " "
}
Loop
{
	if (A_Index = 1)
		continue
	if !isograms%A_Index%
		break
	isograms := LenOrder(isograms%A_Index%) . isograms
}	
msgbox % isograms
msgbox % LenOrder(heterograms)
ExitApp
return

~Esc::
ExitApp
Output:
---------------------------
Isograms and Heterograms.ahk
---------------------------
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii 
---------------------------
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism 
---------------------------

C++

#include <algorithm>
#include <cctype>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <set>
#include <string>
#include <unordered_map>

struct Isogram_pair {
	std::string word;
	int32_t value;
};

std::string to_lower_case(const std::string& text) {
	std::string result = text;
	std::transform(result.begin(), result.end(), result.begin(),
		[](char ch){ return std::tolower(ch); });
	return result;
}

int32_t isogram_value(const std::string& word) {
	std::unordered_map<char, int32_t> char_counts;
	for ( const char& ch : word ) {
		if ( char_counts.find(ch) == char_counts.end() ) {
			char_counts.emplace(ch, 1);
		} else {
			char_counts[ch]++;
		}
	}

	const int32_t count = char_counts[word[0]];
	const bool identical = std::all_of(char_counts.begin(), char_counts.end(),
	                       	   [count](const std::pair<char, int32_t> pair){ return pair.second == count; });

	return identical ? count : 0;
}

int main() {
	auto compare = [](Isogram_pair a, Isogram_pair b) {
		return ( a.value == b.value ) ?
			( ( a.word.length() == b.word.length() ) ? a.word < b.word : a.word.length() > b.word.length() )
			: a.value > b.value;
	};
	std::set<Isogram_pair, decltype(compare)> isograms;

	std::fstream file_stream;
	file_stream.open("../unixdict.txt");
	std::string word;
	while ( file_stream >> word ) {
		const int32_t value = isogram_value(to_lower_case(word));
		if ( value > 1 || ( word.length() > 10 && value == 1 ) ) {
			isograms.insert(Isogram_pair(word, value));
		}
	}

	std::cout << "n-isograms with n > 1:" << std::endl;
	for ( const Isogram_pair& isogram_pair : isograms ) {
		if ( isogram_pair.value > 1 ) {
			std::cout << isogram_pair.word << std::endl;
		}
	}

	std::cout << "\n" << "Heterograms with more than 10 letters:" << std::endl;
	for ( const Isogram_pair& isogram_pair : isograms ) {
		if ( isogram_pair.value == 1 ) {
			std::cout << isogram_pair.word << std::endl;
		}
	}
}
Output:
n-isograms with n > 1:
aaa
iii
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

Heterograms with more than 10 letters:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism

EasyLang

repeat
   s$ = input
   until s$ = ""
   if len s$ > 1
      w$[] &= s$
   .
.
func[] letters w$ .
   len r[] 127
   for c$ in strchars w$
      h = strcode c$
      r[h] += 1
   .
   return r[]
.
func cmp a b a$ b$ .
   if a > b
      return 1
   elif a = b
      if len a$ > len b$
         return 1
      elif len a$ = len b$ and strcmp a$ b$ < 0
         return 1
      .
   .
   return 0
.
proc sort . d$[] d[] .
   n = len d$[]
   for i = 1 to n - 1
      for j = i + 1 to n
         if cmp d[j] d[i] d$[j] d$[i] = 1
            swap d$[j] d$[i]
            swap d[j] d[i]
         .
      .
   .
.
proc isograms . .
   for w$ in w$[]
      cnt[] = letters w$
      n = 0
      for i to 127
         if cnt[i] = 1
            break 1
         .
         if cnt[i] > 0
            if n = 0
               n = cnt[i]
            elif cnt[i] <> n
               break 1
            .
         .
      .
      if i > 127
         r$[] &= w$
         n[] &= n
      .
   .
   sort r$[] n[]
   for w$ in r$[]
      print w$
   .
.
proc heterogram lng . .
   for w$ in w$[]
      if len w$ > lng
         cnt[] = letters w$
         for i to 127
            if cnt[i] > 0 and cnt[i] <> 1
               break 1
            .
         .
         if i > 127
            r$[] &= w$
            n[] &= 0
         .
      .
   .
   sort r$[] n[]
   for w$ in r$[]
      print w$
   .
.
isograms
print ""
heterogram 10
# 
# the content of unixdict.txt 
input_data
aaa
anna
beriberi
coco
ii
iii
ambidextrous
atmospheric
bluestocking

Factor

Works with: Factor version 0.99 2022-04-03
USING: assocs combinators.short-circuit.smart grouping io
io.encodings.ascii io.files kernel literals math math.order
math.statistics sequences sets sorting ;

CONSTANT: words $[ "unixdict.txt" ascii file-lines ]

: isogram<=> ( a b -- <=> )
    { [ histogram values first ] [ length ] } compare-with ;

: isogram-sort ( seq -- seq' )
    [ isogram<=> invert-comparison ] sort ;

: isogram? ( seq -- ? )
    histogram values { [ first 1 > ] [ all-eq? ] } && ;

: .words-by ( quot -- )
    words swap filter isogram-sort [ print ] each ; inline

"List of n-isograms where n > 1:" print
[ isogram? ] .words-by nl

"List of heterograms of length > 10:" print
[ { [ length 10 > ] [ all-unique? ] } && ] .words-by
Output:
List of n-isograms where n > 1:
aaa
iii
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

List of heterograms of length > 10:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism

FreeBASIC

Function Isogram(word As String) As String
    Dim As Integer i, k
    Dim As String ch, chars = ""
    Dim As Integer counts(26) '= {0}
    
    For i = 1 To Len(word)
        ch = Mid(word, i, 1)
        k = Instr(chars, ch)
        If k = 0 Then
            chars &= ch
            counts(Len(chars)) = 1
        Else
            counts(k) += 1
        End If
    Next
    
    Dim As Integer c1 = counts(1), lc = Len(chars), lw = Len(word)
    Dim As Integer isEqual = 1
    For i = 1 To lc
        If counts(i) <> c1 Then
            isEqual = 0
            Exit For
        End If
    Next
    
    Return Iif((c1 > 1 Or lw > 10) And isEqual, word & " " & Str(c1) & " " & Str(lw), "")
End Function

Dim As String res = ""
Dim As String word
Dim As Integer i, j
Dim As String results(1000) ' Assuming a maximum of 1000 words
Dim As Integer count = 0

Open "i:\unixdict.txt" For Input As #1
Do Until Eof(1)
    Line Input #1, word
    Dim As String result = Isogram(word)
    If result <> "" Then
        results(count) = result
        count += 1
    End If
Loop
Close #1

Print "word            n length"
For i = 0 To count - 1
    Dim As String result = results(i)
    Dim As Integer space1 = Instr(result, " ")
    Dim As Integer space2 = Instr(Mid(result, space1 + 1), " ") + space1
    
    word = Left(result, space1 - 1)
    Dim As Integer c1 = Val(Mid(result, space1 + 1, space2 - space1 - 1))
    Dim As Integer lw = Val(Mid(result, space2 + 1))
    Print Using "\            \ ## ######"; word; c1; lw
Next i

Sleep
Output:
word            n length
aaa             3      3
ambidextrous    1     12
anna            2      4
appall          2      6
atmospheric     1     11
beriberi        2      8
bilabial        2      8
blameworthy     1     11
bluestocking    1     12
caucasus        2      8
centrifugal     1     11
christendom     1     11
coco            2      4
consumptive     1     11
countervail     1     11
countryside     1     11
countrywide     1     11
couscous        2      8
dada            2      4
deed            2      4
disturbance     1     11
documentary     1     11
dodo            2      4
earthmoving     1     11
emmett          2      6
exclusionary    1     12
exculpatory     1     11
geophysical     1     11
gogo            2      4
hannah          2      6
ii              2      2
iii             3      3
incomputable    1     12
inscrutable     1     11
isis            2      4
juju            2      4
lexicography    1     12
loudspeaking    1     12
lulu            2      4
malnourished    1     12
mimi            2      4
misanthrope     1     11
murmur          2      6
noon            2      4
otto            2      4
papa            2      4
peep            2      4
poop            2      4
problematic     1     11
selfadjoint     1     11
stenography     1     11
sulfonamide     1     11
switchblade     1     11
switchboard     1     11
switzerland     1     11
tartar          2      6
teammate        2      8
teet            2      4
testes          2      6
tete            2      4
thunderclap     1     11
toot            2      4
tutu            2      4
valedictory     1     11
voluntarism     1     11

J

For this task, we want to know the value of n for n-isograms, with zero for words which are not n-isograms. We can implement this by counting how many times each character occurs and checking whether all of those counts are the same. (If they are, n is the number of times the first character occurs):

isogram=: {{ {. (#~ 1= #@~.) #/.~ y }} S:0

Also, it's worth noting that unixdict.txt is already in sorted order, even after coercing its contents to lower case:

   (-: /:~) cutLF tolower fread 'unixdict.txt'
1
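
For readers less familiar with J, the reason the sortedness of the input matters can be illustrated in Python: when a sort is stable, words that compare equal on the sort key keep their original relative order, so an already-sorted word list needs no explicit lexicographic tie-break. The following is a minimal sketch of that idea only, using a made-up five-word list in place of unixdict.txt.

```python
# A tiny, already-sorted stand-in for unixdict.txt (an assumption for this sketch).
words = ["anna", "beriberi", "coco", "ii", "teammate"]
assert words == sorted(words)

# sorted() is stable: words of equal length stay in their original,
# that is ascending lexicographic, order.
by_length = sorted(words, key=lambda w: -len(w))
print(by_length)  # ['beriberi', 'teammate', 'anna', 'coco', 'ii']
```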

With this tool and this knowledge, we are ready to tackle this task (the /: expression sorts, and the #~ expression selects):

   > (/: -@isogram,.-@#@>) (#~ 1<isogram) cutLF tolower fread 'unixdict.txt'
aaa     
iii     
beriberi
bilabial
caucasus
couscous
teammate
appall  
emmett  
hannah  
murmur  
tartar  
testes  
anna    
coco    
dada    
deed    
dodo    
gogo    
isis    
juju    
lulu    
mimi    
noon    
otto    
papa    
peep    
poop    
teet    
tete    
toot    
tutu    
ii      
   > (/: -@#@>) (#~ (10 < #@>) * 1=isogram) cutLF tolower fread 'unixdict.txt'
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric 
blameworthy 
centrifugal 
christendom 
consumptive 
countervail 
countryside 
countrywide 
disturbance 
documentary 
earthmoving 
exculpatory 
geophysical 
inscrutable 
misanthrope 
problematic 
selfadjoint 
stenography 
sulfonamide 
switchblade 
switchboard 
switzerland 
thunderclap 
valedictory 
voluntarism

Java

import java.io.IOException;
import java.nio.file.Path;
import java.util.AbstractSet;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeSet;

public final class IsogramsAndHeterograms {

	public static void main(String[] aArgs) throws IOException {
		AbstractSet<IsogramPair> isograms = new TreeSet<IsogramPair>(comparatorIsogram);	
	
		Scanner scanner = new Scanner(Path.of("unixdict.txt"));
	    while ( scanner.hasNext() ) {
	    	String word = scanner.next().toLowerCase();
	    	final int value = isogramValue(word);
	    	if ( value > 1 || ( word.length() > 10 && value == 1 ) ) {
	    		isograms.add( new IsogramPair(word, value) );
	    	}  	
	    }
	    scanner.close();
	    
	    System.out.println("n-isograms with n > 1:");
	    isograms.stream().filter( pair -> pair.aValue > 1 ).map( pair -> pair.aWord ).forEach(System.out::println);
	    System.out.println(System.lineSeparator() + "Heterograms with more than 10 letters:");
	    isograms.stream().filter( pair -> pair.aValue == 1 ).map( pair -> pair.aWord ).forEach(System.out::println);	   
	} 
	
	private static int isogramValue(String aWord) {
	    Map<Character, Integer> charCounts = new HashMap<Character, Integer>(); 
	    for ( char ch : aWord.toCharArray() ) {
	    	charCounts.merge(ch, 1, Integer::sum);
	    }
	    
	    final int count = charCounts.get(aWord.charAt(0));
	    final boolean identical = charCounts.values().stream().allMatch( i -> i == count );
	    return identical ? count : 0;
	}
	
	private static Comparator<IsogramPair> comparatorIsogram =
		Comparator.comparing(IsogramPair::aValue, Comparator.reverseOrder())
		.thenComparing(IsogramPair::getWordLength, Comparator.reverseOrder())
		.thenComparing(IsogramPair::aWord, Comparator.naturalOrder());
	
	private record IsogramPair(String aWord, int aValue) {
		
		private int getWordLength() {
			return aWord.length();
		}
		
	};

}
Output:
n-isograms with n > 1:
aaa
iii
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

Heterograms with more than 10 letters:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism

jq

This entry assumes that the external file of words does not contain duplicates.

# bag of words
def bow(stream): 
  reduce stream as $word ({}; .[($word|tostring)] += 1);

# If the input string is an n-isogram then return n, otherwise 0:
def isogram:
  bow(ascii_downcase|explode[]|[.]|implode)
  | .[keys_unsorted[0]] as $n
  | if all(.[]; . == $n) then $n else 0 end ;

# Read the word list (inputs) and record the n-isogram value.
# Output: an array of [word, n] values
def words:
  [inputs
   | select(test("^[A-Za-z]+$"))
   | sub("^ +";"") | sub(" +$";"")
   | [., isogram] ];

# Input: an array of [word, n] values
# Sort by decreasing order of n;
# Then by decreasing order of word length;
# Then by ascending lexicographic order
def isograms:
  map( select( .[1] > 1) )
  | sort_by( .[0])
  | sort_by( - (.[0]|length))
  | sort_by( - .[1]);

# Input: an array of [word, n] values
# Sort as for isograms
def heterograms($minlength):
  map(select (.[1] == 1 and (.[0]|length) >= $minlength))
  | sort_by( .[0])
  | sort_by( - (.[0]|length));

words
| (isograms
   | "List of the \(length) n-isograms for which n > 1:",
     foreach .[] as [$word, $n] ({};
        .header = if $n != .group then "\nisograms of order \($n)" else null end
	| .group = $n;
	(.header | select(.)), $word ) ) ,

  (heterograms(11)
   | "\nList of the \(length) heterograms with length > 10:", .[][0])

Invocation

< unixdict.txt jq -Rrn -f isograms-and-heterograms.jq
Output:
List of the 33 n-isograms for which n > 1:

isograms of order 3
aaa
iii

isograms of order 2
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

List of the 32 heterograms with length > 10:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism


Julia

function isogram(word)
    wchars, uchars = collect(word), unique(collect(word))
    ulen, wlen = length(uchars), length(wchars)
    (wlen == 1 || ulen == wlen) && return 1
    n = count(==(first(uchars)), wchars)
    return all(i -> count(==(uchars[i]), wchars) == n, 2:ulen) ? n : 0
end

words = split(lowercase(read("documents/julia/unixdict.txt", String)), r"\s+")
orderlengthtuples = [(isogram(w), length(w), w) for w in words]

tcomp(x, y) = (x[1] != y[1] ? y[1] < x[1] : x[2] != y[2] ? y[2] < x[2] : x[3] < y[3])

nisograms = sort!(filter(t -> t[1] > 1, orderlengthtuples), lt = tcomp)
heterograms = sort!(filter(t -> t[1] == 1 && length(t[3]) > 10, orderlengthtuples), lt = tcomp)

println("N-Isogram   N  Length\n", "-"^24)
foreach(t -> println(rpad(t[3], 8), lpad(t[1], 5), lpad(t[2], 5)), nisograms)
println("\nHeterogram   Length\n", "-"^20)
foreach(t -> println(rpad(t[3], 12), lpad(t[2], 5)), heterograms)
Output:
N-Isogram   N  Length
------------------------
aaa         3    3
iii         3    3
beriberi    2    8
bilabial    2    8
caucasus    2    8
couscous    2    8
teammate    2    8
appall      2    6
emmett      2    6
hannah      2    6
murmur      2    6
tartar      2    6
testes      2    6
anna        2    4
coco        2    4
dada        2    4
deed        2    4
dodo        2    4
gogo        2    4
isis        2    4
juju        2    4
lulu        2    4
mimi        2    4
noon        2    4
otto        2    4
papa        2    4
peep        2    4
poop        2    4
teet        2    4
tete        2    4
toot        2    4
tutu        2    4
ii          2    2

Heterogram   Length
--------------------
ambidextrous   12
bluestocking   12
exclusionary   12
incomputable   12
lexicography   12
loudspeaking   12
malnourished   12
atmospheric    11
blameworthy    11
centrifugal    11
christendom    11
consumptive    11
countervail    11
countryside    11
countrywide    11
disturbance    11
documentary    11
earthmoving    11
exculpatory    11
geophysical    11
inscrutable    11
misanthrope    11
problematic    11
selfadjoint    11
stenography    11
sulfonamide    11
switchblade    11
switchboard    11
switzerland    11
thunderclap    11
valedictory    11
voluntarism    11

Nim

import std/[algorithm, strutils, tables]

type Item = tuple[word: string; n: int]

func isogramCount(word: string): Natural =
  ## Check if the word is an isogram and return the number
  ## of times each character is present. Return 1 for
  ## heterograms. Return 0 if the word is neither an isogram
  ## nor a heterogram.
  let counts = word.toCountTable
  result = 0
  for count in counts.values:
    if result == 0:
      result = count
    elif count != result:
      return 0

proc cmp1(item1, item2: Item): int =
  ## Comparison function for part 1.
  result = cmp(item2.n, item1.n)
  if result == 0:
    result = cmp(item2.word.len, item1.word.len)
    if result == 0:
      result = cmp(item1.word, item2.word)

proc cmp2(item1, item2: Item): int =
  ## Comparison function for part 2.
  result = cmp(item1.n, item2.n)
  if result == 0:
    result = cmp(item2.word.len, item1.word.len)
    if result == 0:
      result = cmp(item1.word, item2.word)


var isograms: seq[Item]

for line in lines("unixdict.txt"):
  let word = line.toLowerAscii
  let count = word.isogramCount
  if count != 0:
    isograms.add (word, count)

echo "N-isograms where N > 1:"
isograms.sort(cmp1)
var idx = 0
for item in isograms:
  if item.n == 1: break
  inc idx
  stdout.write item.word.alignLeft(12)
  if idx mod 6 == 0: stdout.write '\n'
echo()

echo "\nHeterograms with more than 10 characters:"
isograms.sort(cmp2)
idx = 0
for item in isograms:
  if item.n != 1: break
  if item.word.len > 10:
    inc idx
    stdout.write item.word.alignLeft(16)
    if idx mod 4 == 0: stdout.write '\n'
echo()
Output:
N-isograms where N > 1:
aaa         iii         beriberi    bilabial    caucasus    couscous    
teammate    appall      emmett      hannah      murmur      tartar      
testes      anna        coco        dada        deed        dodo        
gogo        isis        juju        lulu        mimi        noon        
otto        papa        peep        poop        teet        tete        
toot        tutu        ii          

Heterograms with more than 10 characters:
ambidextrous    bluestocking    exclusionary    incomputable    
lexicography    loudspeaking    malnourished    atmospheric     
blameworthy     centrifugal     christendom     consumptive     
countervail     countryside     countrywide     disturbance     
documentary     earthmoving     exculpatory     geophysical     
inscrutable     misanthrope     problematic     selfadjoint     
stenography     sulfonamide     switchblade     switchboard     
switzerland     thunderclap     valedictory     voluntarism   

PascalABC.NET

uses System.Net;

function isogram(word: string): integer;
begin
  result := 0;
  var letters := new Dictionary<char, integer>;
  foreach var c in word do 
    letters[c] := letters.Get(c) + 1;
  var counts: set of integer;
  foreach var letter in letters do 
    counts += [letter.value];
  if counts.Count = 1 then 
    result := letters.Get(word[1]);
end;

begin
  var client := new WebClient();
  var text := client.DownloadString('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt');
  var words: sequence of string := text.ToWords(|#10, #13|);
  words.Where(w -> isogram(w) > 1)
       .OrderByDescending(w -> isogram(w))
       .ThenByDescending(w -> w.Length)
       .ThenBy(w -> w).println;
  println;
  words.where(w -> (isogram(w) = 1) and (w.Length > 10))
       .OrderByDescending(w -> w.Length)
       .ThenBy(w -> w).println;
end.
Output:
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii

ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism

Perl

use strict;
use warnings;
use feature 'say';
use Path::Tiny;
use List::Util 'uniq';

my @words = map { lc } path('unixdict.txt')->slurp =~ /^[A-Za-z]{2,}$/gm;

my(@heterogram, %isogram);
for my $w (@words) {
    my %l;
    $l{$_}++ for split '', $w;
    next unless 1 == scalar (my @x = uniq values %l);
    if ($x[0] == 1) { push @heterogram,        $w if length $w > 10 }
    else            { push @{$isogram{$x[0]}}, $w                   }
}

for my $n (sort { $b <=> $a } keys %isogram) {
    my @i = sort { length($b) <=> length($a) or $a cmp $b } @{$isogram{$n}};
    say scalar @i . " $n-isograms:\n" . join("\n", @i) . "\n";
}

say scalar(@heterogram) . " heterograms with more than 10 characters:\n" . join "\n", sort { length($b) <=> length($a) or $a cmp $b } @heterogram;
Output:
2 3-isograms:
aaa
iii

31 2-isograms:
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

32 heterograms with more than 10 characters:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism

Phix

with javascript_semantics
function isogram(string word)
    sequence chars = {}, counts = {}
    for ch in word do
        integer k = find(ch,chars)
        if k=0 then
            chars &= ch
            counts &= 1
        else
            counts[k] += 1
        end if
    end for
    integer c1 = counts[1], lc = length(counts), lw = length(word)
    return iff((c1>1 or lw>10) and counts=repeat(c1,lc)?{word,c1,lw}:0)
end function
 
sequence res = sort_columns(filter(apply(unix_dict(),isogram),"!=",0),{-2,-3,1})
printf(1,"word           n length\n%s\n",{join(res,'\n',fmt:="%-14s %d %6d")})
Output:
word           n length
aaa            3      3
iii            3      3
beriberi       2      8
bilabial       2      8
caucasus       2      8
couscous       2      8
teammate       2      8
appall         2      6
emmett         2      6
hannah         2      6
murmur         2      6
tartar         2      6
testes         2      6
anna           2      4
coco           2      4
dada           2      4
deed           2      4
dodo           2      4
gogo           2      4
isis           2      4
juju           2      4
lulu           2      4
mimi           2      4
noon           2      4
otto           2      4
papa           2      4
peep           2      4
poop           2      4
teet           2      4
tete           2      4
toot           2      4
tutu           2      4
ii             2      2
ambidextrous   1     12
bluestocking   1     12
exclusionary   1     12
incomputable   1     12
lexicography   1     12
loudspeaking   1     12
malnourished   1     12
atmospheric    1     11
blameworthy    1     11
centrifugal    1     11
christendom    1     11
consumptive    1     11
countervail    1     11
countryside    1     11
countrywide    1     11
disturbance    1     11
documentary    1     11
earthmoving    1     11
exculpatory    1     11
geophysical    1     11
inscrutable    1     11
misanthrope    1     11
problematic    1     11
selfadjoint    1     11
stenography    1     11
sulfonamide    1     11
switchblade    1     11
switchboard    1     11
switzerland    1     11
thunderclap    1     11
valedictory    1     11
voluntarism    1     11

Python

from collections import Counter

def find_n_isograms(wordlist):
    n_isograms = []
    for word in wordlist:
        word_lower = word.lower()
        freq = Counter(word_lower)
        frequencies = freq.values()
        if len(set(frequencies)) == 1 and next(iter(frequencies)) > 1:
            n = next(iter(frequencies))
            n_isograms.append((-n, -len(word), word))
    n_isograms.sort()
    return [word for _, _, word in n_isograms]

def find_heterograms(wordlist):
    heterograms = []
    for word in wordlist:
        if len(word) > 10:
            word_lower = word.lower()
            if len(set(word_lower)) == len(word_lower):
                heterograms.append((-len(word), word))
    heterograms.sort()
    return [word for _, word in heterograms]

with open('unixdict.txt', 'r') as file:
    wordlist = [line.strip() for line in file]

n_isograms_result = find_n_isograms(wordlist)
heterograms_result = find_heterograms(wordlist)

print("n-isograms (n > 1):", n_isograms_result)
print("Heterograms with more than 10 characters:", heterograms_result)
Output:
n-isograms (n > 1): ['aaa', 'iii', 'beriberi', 'bilabial', 'caucasus', 'couscous', 'teammate', 'appall', 'emmett', 'hannah', 'murmur', 'tartar', 'testes', 'anna', 'coco', 'dada', 'deed', 'dodo', 'gogo', 'isis', 'juju', 'lulu', 'mimi', 'noon', 'otto', 'papa', 'peep', 'poop', 'teet', 'tete', 'toot', 'tutu', 'ii']
Heterograms with more than 10 characters: ['ambidextrous', 'bluestocking', 'exclusionary', 'incomputable', 'lexicography', 'loudspeaking', 'malnourished', 'atmospheric', 'blameworthy', 'centrifugal', 'christendom', 'consumptive', 'countervail', 'countryside', 'countrywide', 'disturbance', 'documentary', 'earthmoving', 'exculpatory', 'geophysical', 'inscrutable', 'misanthrope', 'problematic', 'selfadjoint', 'stenography', 'sulfonamide', 'switchblade', 'switchboard', 'switzerland', 'thunderclap', 'valedictory', 'voluntarism']
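
The classification and the sort order can also be checked without unixdict.txt. The following is a minimal sketch, not part of the solution above: the helper names isogram_order and sort_isograms are hypothetical, and the sample words are taken from the task description and the output shown here.

```python
from collections import Counter

def isogram_order(word):
    """Return n if word is an n-isogram (ignoring case), otherwise 0.

    A heterogram yields 1. An empty string yields 0.
    """
    counts = set(Counter(word.lower()).values())
    # Exactly one distinct count means every character occurs equally often.
    return counts.pop() if len(counts) == 1 else 0

def sort_isograms(words):
    """Sort by decreasing n, then decreasing length, then ascending word."""
    return sorted(words, key=lambda w: (-isogram_order(w), -len(w), w))

if __name__ == "__main__":
    for sample in ("caucasus", "atmospheric", "banana"):
        print(sample, isogram_order(sample))
    print(sort_isograms(["tutu", "aaa", "beriberi", "anna", "ii"]))
```

Negating n and the length in the sort key lets a single ascending sort satisfy all three ordering rules of the task at once.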

Quackery

  [ [] ]'[
    rot witheach
      [ dup nested
        unrot over do
        iff [ dip join ]
        else nip ]
    drop ]                        is filter  ( [ --> [ )

  [ 0 127 of
    swap witheach
      [ upper 2dup peek
        1+ unrot poke ]
    [] swap witheach
      [ dup iff join else drop ]
    dup [] = iff [ drop 0 ] done
    behead swap witheach
      [ over != if
          [ drop 0 conclude ] ] ] is isogram ( [ --> n )

  $ "rosetta/unixdict.txt" sharefile
  drop nest$ dup
  filter [ isogram 1 > ]
  sort$
  sortwith [ size dip size < ]
  sortwith [ isogram dip isogram < ]
  60 wrap$
  cr
  filter [ size 10  > ]
  filter [ isogram 1 = ]
  sort$
  sortwith [ size dip size < ]
  60 wrap$
  cr
Output:
aaa iii beriberi bilabial caucasus couscous teammate appall
emmett hannah murmur tartar testes anna coco dada deed dodo
gogo isis juju lulu mimi noon otto papa peep poop teet tete
toot tutu ii

ambidextrous bluestocking exclusionary incomputable
lexicography loudspeaking malnourished atmospheric
blameworthy centrifugal christendom consumptive countervail
countryside countrywide disturbance documentary earthmoving
exculpatory geophysical inscrutable misanthrope problematic
selfadjoint stenography sulfonamide switchblade switchboard
switzerland thunderclap valedictory voluntarism

Raku

my $file = 'unixdict.txt';

my @words = $file.IO.slurp.words.race.map: { $_ => .comb.Bag };

.say for (6...2).map: -> $n {
    next unless my @iso = @words.race.grep({.value.values.all == $n})».key;
    "\n({+@iso}) {$n}-isograms:\n" ~ @iso.sort({[-.chars, ~$_]}).join: "\n";
}

my $minchars = 10;

say "\n({+$_}) heterograms with more than $minchars characters:\n" ~
  .sort({[-.chars, ~$_]}).join: "\n" given
  @words.race.grep({.key.chars >$minchars && .value.values.max == 1})».key;
Output:
(2) 3-isograms:
aaa
iii

(31) 2-isograms:
beriberi
bilabial
caucasus
couscous
teammate
appall
emmett
hannah
murmur
tartar
testes
anna
coco
dada
deed
dodo
gogo
isis
juju
lulu
mimi
noon
otto
papa
peep
poop
teet
tete
toot
tutu
ii

(32) heterograms with more than 10 characters:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism

Ruby

Blameworthy exclusionary lexicography causes unixdict.txt to make it incomputable if the word isogram is itself an isogram.

words = File.readlines("unixdict.txt", chomp: true)

isograms = words.group_by do |word|
  char_counts = word.downcase.chars.tally.values
  char_counts.first if char_counts.uniq.size == 1
end
isograms.delete(nil)
isograms.transform_values!{|ar| ar.sort_by{|word| [-word.size, word]} }

keys = isograms.keys.sort.reverse
keys.each{|k| puts "(#{isograms[k].size}) #{k}-isograms: #{isograms[k]} " if k > 1 }

min_chars = 10
large_heterograms = isograms[1].select{|word| word.size > min_chars }
puts "" , "(#{large_heterograms.size}) heterograms with more than #{min_chars} chars:"
puts large_heterograms
Output:
(2) 3-isograms: ["aaa", "iii"] 
(31) 2-isograms: ["beriberi", "bilabial", "caucasus", "couscous", "teammate", "appall", "emmett", "hannah", "murmur", "tartar", "testes", "anna", "coco", "dada", "deed", "dodo", "gogo", "isis", "juju", "lulu", "mimi", "noon", "otto", "papa", "peep", "poop", "teet", "tete", "toot", "tutu", "ii"] 

(32) heterograms with more than 10 chars:
ambidextrous
bluestocking
exclusionary
incomputable
lexicography
loudspeaking
malnourished
atmospheric
blameworthy
centrifugal
christendom
consumptive
countervail
countryside
countrywide
disturbance
documentary
earthmoving
exculpatory
geophysical
inscrutable
misanthrope
problematic
selfadjoint
stenography
sulfonamide
switchblade
switchboard
switzerland
thunderclap
valedictory
voluntarism

Wren

Library: Wren-str
import "io" for File
import "./str" for Str

var isogram = Fn.new { |word|
    if (word.count == 1) return 1
    var map = {}
    word = Str.lower(word)
    for (c in word) {
        if (map.containsKey(c)) {
            map[c] = map[c] + 1
        } else {
            map[c] = 1
        }
    }
    var chars = map.keys.toList
    var n = map[chars[0]]
    var iso = chars[1..-1].all { |c| map[c] == n }
    return iso ? n : 0
}

var isoComparer = Fn.new { |i, j|
    if (i[1] != j[1]) return i[1] > j[1]
    if (i[0].count != j[0].count) return i[0].count > j[0].count
    return Str.le(i[0], j[0])
}

var heteroComparer = Fn.new { |i, j|
    if (i[0].count != j[0].count) return i[0].count > j[0].count
    return Str.le(i[0], j[0])
}

var wordList = "unixdict.txt" // local copy
var words = File.read(wordList)
                .trimEnd()
                .split("\n")
                .map { |word| [word, isogram.call(word)] }

var isograms = words.where { |t| t[1] > 1 }
                    .toList
                    .sort(isoComparer)
                    .map { |t| "  " + t[0] }
                    .toList
System.print("List of n-isograms(%(isograms.count)) where n > 1:")
System.print(isograms.join("\n"))

var heterograms = words.where { |t| t[1] == 1 && t[0].count > 10 }
                       .toList
                       .sort(heteroComparer)
                       .map { |t| "  " + t[0] }
                       .toList
System.print("\nList of heterograms(%(heterograms.count)) of length > 10:")
System.print(heterograms.join("\n"))
Output:
List of n-isograms(33) where n > 1:
  aaa
  iii
  beriberi
  bilabial
  caucasus
  couscous
  teammate
  appall
  emmett
  hannah
  murmur
  tartar
  testes
  anna
  coco
  dada
  deed
  dodo
  gogo
  isis
  juju
  lulu
  mimi
  noon
  otto
  papa
  peep
  poop
  teet
  tete
  toot
  tutu
  ii

List of heterograms(32) of length > 10:
  ambidextrous
  bluestocking
  exclusionary
  incomputable
  lexicography
  loudspeaking
  malnourished
  atmospheric
  blameworthy
  centrifugal
  christendom
  consumptive
  countervail
  countryside
  countrywide
  disturbance
  documentary
  earthmoving
  exculpatory
  geophysical
  inscrutable
  misanthrope
  problematic
  selfadjoint
  stenography
  sulfonamide
  switchblade
  switchboard
  switzerland
  thunderclap
  valedictory
  voluntarism