Determine if a string has all unique characters

From Rosetta Code
You are encouraged to solve this task according to the task description, using any language you may know.
Task

Given a character string   (which may be empty, or have a length of zero characters):

  •   create a function/procedure/routine to:
  •   determine if all the characters in the string are unique
  •   indicate if or which character is duplicated and where
  •   display each string and its length   (as the strings are being examined)
  •   a zero─length (empty) string shall be considered as unique
  •   process the strings from left─to─right
  •   if       unique,   display a message saying such
  •   if not unique,   then:
  •   display a message saying such
  •   display what character is duplicated
  •   only the 1st non─unique character need be displayed
  •   display where "both" duplicated characters are in the string
  •   the above messages can be part of a single message
  •   display the hexadecimal value of the duplicated character


Use (at least) these five test values   (strings):

  •   a string of length     0   (an empty string)
  •   a string of length     1   which is a single period   (.)
  •   a string of length     6   which contains:   abcABC
  •   a string of length     7   which contains a blank in the middle:   XYZ ZYX
  •   a string of length   36   which   doesn't   contain the letter "oh":
1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ


Show all output here on this page.
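
As an illustration of the requirements above, here is a minimal Python sketch. It is not one of the page's solutions, and the names first_duplicate and report are made up for this example. It assumes that "the 1st non-unique character" means the leftmost character that reappears later in the string, and it counts positions from 0 and compares code points rather than grapheme clusters.

```python
def first_duplicate(text):
    """Return (char, first_index, second_index) for the leftmost character
    that occurs again later in text, or None when every character is unique.
    Indices are 0-based; comparison is by code point."""
    positions = {}  # char -> indices where it occurs, keyed in first-seen order
    for index, char in enumerate(text):
        positions.setdefault(char, []).append(index)
    for char, where in positions.items():
        if len(where) > 1:
            return char, where[0], where[1]
    return None


def report(text):
    """Build a single message with the string, its length, and the verdict."""
    head = f"{text!r} (length {len(text)}): "
    found = first_duplicate(text)
    if found is None:
        return head + "all characters are unique"
    char, first, second = found
    return head + (f"{char!r} (0x{ord(char):x}) is duplicated "
                   f"at positions {first} and {second}")


# The five test values required by the task.
for sample in ["", ".", "abcABC", "XYZ ZYX",
               "1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ"]:
    print(report(sample))
```

An empty string never enters either loop, so it is reported as unique without a special case.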


Factor

USING: accessors formatting generalizations io kernel
math.parser regexp sequences sets strings ;

: >dup-char< ( str n -- char hex first-index second-index )
    1string tuck [ dup first >hex ] 2dip <regexp>
    all-matching-slices first2 [ from>> ] bi@ ;

: duplicate-info. ( str -- )
    dup duplicates
    [ >dup-char< "'%s' (0x%s) at indices %d and %d.\n" printf ]
    with each nl ;

: uniqueness-report. ( str -- )
    dup dup length "%u — length %d — contains " printf dup
    all-unique? [ drop "all unique characters." print nl ]
    [ "duplicate characters:" print duplicate-info. ] if ;

""
"."
"abcABC"
"XYZ ZYX"
"1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ"
[ uniqueness-report. ] 5 napply
Output:
"" — length 0 — contains all unique characters.

"." — length 1 — contains all unique characters.

"abcABC" — length 6 — contains all unique characters.

"XYZ ZYX" — length 7 — contains duplicate characters:
'Z' (0x5a) at indices 2 and 4.
'Y' (0x59) at indices 1 and 5.
'X' (0x58) at indices 0 and 6.

"1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ" — length 36 — contains duplicate characters:
'0' (0x30) at indices 9 and 24.

Go

package main

import "fmt"

// analyze reports the first rune (scanning left to right) that occurs
// again later in s, using 1-based positions.
func analyze(s string) {
	chars := []rune(s) // compare code points, not bytes
	le := len(chars)
	fmt.Printf("Analyzing %q which has a length of %d:\n", s, le)
	if le > 1 {
		for i := 0; i < le-1; i++ {
			for j := i + 1; j < le; j++ {
				if chars[j] == chars[i] {
					fmt.Println("  Not all characters in the string are unique.")
					fmt.Printf("  %q (%#[1]x) is duplicated at positions %d and %d.\n\n", chars[i], i+1, j+1)
					return
				}
			}
		}
	}
	fmt.Println("  All characters in the string are unique.")
	fmt.Println()
}

func main() {
	strings := []string{
		"",
		".",
		"abcABC",
		"XYZ ZYX",
		"1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ",
		"01234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ0X",
		"hétérogénéité",
		"🎆🎃🎇🎈",
		"😍😀🙌💃😍🙌",
		"🐠🐟🐡🦈🐬🐳🐋🐡",
	}
	for _, s := range strings {
		analyze(s)
	}
}
Output:
Analyzing "" which has a length of 0:
  All characters in the string are unique.

Analyzing "." which has a length of 1:
  All characters in the string are unique.

Analyzing "abcABC" which has a length of 6:
  All characters in the string are unique.

Analyzing "XYZ ZYX" which has a length of 7:
  Not all characters in the string are unique.
  'X' (0x58) is duplicated at positions 1 and 7.

Analyzing "1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ" which has a length of 36:
  Not all characters in the string are unique.
  '0' (0x30) is duplicated at positions 10 and 25.

Analyzing "01234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ0X" which has a length of 39:
  Not all characters in the string are unique.
  '0' (0x30) is duplicated at positions 1 and 11.

Analyzing "hétérogénéité" which has a length of 13:
  Not all characters in the string are unique.
  'é' (0xe9) is duplicated at positions 2 and 4.

Analyzing "🎆🎃🎇🎈" which has a length of 4:
  All characters in the string are unique.

Analyzing "😍😀🙌💃😍🙌" which has a length of 6:
  Not all characters in the string are unique.
  '😍' (0x1f60d) is duplicated at positions 1 and 5.

Analyzing "🐠🐟🐡🦈🐬🐳🐋🐡" which has a length of 8:
  Not all characters in the string are unique.
  '🐡' (0x1f421) is duplicated at positions 3 and 8.

Julia

arr(s) = [c for c in s]
alldup(a) = filter(x -> length(x) > 1, [findall(x -> x == a[i], a) for i in 1:length(a)])
firstduplicate(s) = (a = arr(s); d = alldup(a); isempty(d) ? nothing : first(d))

function testfunction(strings)
    println("String                            | Length | All Unique | First Duplicate (Hex) | Positions\n" *
            "-------------------------------------------------------------------------------------------")
    for s in strings
        n = firstduplicate(s)
        a = arr(s)
        # Print the duplicated character with its code point in hexadecimal.
        println(rpad(s, 38), rpad(length(s), 11), n === nothing ? "yes" :
                rpad("no             $(a[n[1]])  ($(string(UInt32(a[n[1]]), base = 16)))", 34) *
                rpad(n[1], 4) * "$(n[2])")
    end
end
 
testfunction([
"",
".",
"abcABC",
"XYZ ZYX",
"1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ",
"hétérogénéité",
"🎆🎃🎇🎈",
"😍😀🙌💃😍🙌",
"🐠🐟🐡🦈🐬🐳🐋🐡",
])
 
Output:
String                            | Length | All Unique | First Duplicate (Hex) | Positions
-------------------------------------------------------------------------------------------
                                      0          yes
.                                     1          yes
abcABC                                6          yes
XYZ ZYX                               7          no             X  (58)            1   7
1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ  36         no             0  (30)            10  25
hétérogénéité                         13         no             é  (e9)            2   4
🎆🎃🎇🎈                             4          yes
😍😀🙌💃😍🙌                        6          no           😍  (1f60d)         1   5
🐠🐟🐡🦈🐬🐳🐋🐡                   8          no           🐡  (1f421)         3   8

Perl

use strict;
use warnings;
use feature 'say';
use utf8;
binmode(STDOUT, ':utf8');
use List::AllUtils qw(uniq);
use Unicode::UCD 'charinfo';
 
for my $str (
'',
'.',
'abcABC',
'XYZ ZYX',
'1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ',
'01234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ0X',
'Δ👍👨👍Δ',
'ΔδΔ̂ΔΛ',
) {
    my @S;
    push @S, $1 while $str =~ /(\X)/g;
    printf qq{\n"$str" (length: %d) has }, scalar @S;
    if (@S != uniq @S ) {
        say "duplicated characters:";
        my %P;
        push @{ $P{$S[$_]} }, 1+$_ for 0..$#S;
        for my $k (sort keys %P) {
            next unless @{$P{$k}} > 1;
            printf "'%s' %s (0x%x) in positions: %s\n", $k, charinfo(ord $k)->{'name'}, ord($k), join ', ', @{$P{$k}};
        }
    } else {
        say "no duplicated characters."
    }
}
Output:
"" (length: 0) has no duplicated characters.

"." (length: 1) has no duplicated characters.

"abcABC" (length: 6) has no duplicated characters.

"XYZ ZYX" (length: 7) has duplicated characters:
'X' LATIN CAPITAL LETTER X (0x58) in positions: 1, 7
'Y' LATIN CAPITAL LETTER Y (0x59) in positions: 2, 6
'Z' LATIN CAPITAL LETTER Z (0x5a) in positions: 3, 5

"1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ" (length: 36) has duplicated characters:
'0' DIGIT ZERO (0x30) in positions: 10, 25

"01234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ0X" (length: 39) has duplicated characters:
'0' DIGIT ZERO (0x30) in positions: 1, 11, 26, 38
'X' LATIN CAPITAL LETTER X (0x58) in positions: 35, 39

"Δ👍👨👍Δ" (length: 5) has duplicated characters:
'Δ' GREEK CAPITAL LETTER DELTA (0x394) in positions: 1, 5
'👍' THUMBS UP SIGN (0x1f44d) in positions: 2, 4

"ΔδΔ̂ΔΛ" (length: 5) has duplicated characters:
'Δ' GREEK CAPITAL LETTER DELTA (0x394) in positions: 1, 4

Perl 6

Works with: Rakudo version 2019.07.1

Perl 6 works with Unicode natively and handles combining characters and multi-byte emoji correctly. In the last string, notice that the length is correctly shown as 11 characters and that the delta with a combining circumflex in position 6 is not the same as the plain deltas in positions 5 and 9.
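
To see why the distinction between code points and user-perceived characters matters, here is a rough Python sketch, given only as an illustration. It is not how Perl 6 segments text. The helper names rough_clusters and duplicate_positions are hypothetical, and the clustering rule is a deliberate simplification of Unicode grapheme segmentation: it only attaches combining marks and zero-width-joiner sequences to the preceding character.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER, glues several emoji into one symbol


def rough_clusters(text):
    """Split text into approximate user-perceived characters.

    Simplified rule (an assumption, not full UAX #29): a code point joins
    the previous cluster if it is a combining mark, is a ZWJ, or directly
    follows a ZWJ.
    """
    clusters = []
    for ch in text:
        joins_previous = clusters and (
            unicodedata.combining(ch) != 0
            or ch == ZWJ
            or clusters[-1].endswith(ZWJ)
        )
        if joins_previous:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def duplicate_positions(units):
    """Map each repeated unit to its 1-based positions, in first-seen order."""
    where = {}
    for position, unit in enumerate(units, start=1):
        where.setdefault(unit, []).append(position)
    return {unit: spots for unit, spots in where.items() if len(spots) > 1}


# The last test string of this section, spelled out with escapes.
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467\u200d\U0001F466"
sample = ("\U0001F98B\U0001F642" + family + "\U0001F644\u0394\u0394\u0302 "
          + "\U0001F98B\u0394\U0001F44D" + family)

print(len(sample))                  # code points: 24
print(len(rough_clusters(sample)))  # approximate clusters: 11
for unit, spots in duplicate_positions(rough_clusters(sample)).items():
    print(" ".join(f"U+{ord(c):04X}" for c in unit), spots)
```

Counted by code point, the same string has 24 elements, and the delta that carries the circumflex would be compared as a bare delta followed by a separate combining mark.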

-> $str {
    my $i = 0;
    print "\n{$str.perl} (length: {$str.chars}), has ";
    my %m;
    %m{$_}.push: ++$i for $str.comb;
    if any(%m.values) > 1 {
        say "duplicated characters:";
        say "'{.key}' ({.key.uninames}; hex ordinal: {(.key.ords).fmt: "0x%X"})" ~
        " in positions: {.value.join: ', '}" for %m.grep( *.value > 1 ).sort( *.value[0] );
    } else {
        say "no duplicated characters."
    }
} for
'',
'.',
'abcABC',
'XYZ ZYX',
'1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ',
'01234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ0X',
'🦋🙂👨‍👩‍👧‍👦🙄ΔΔ̂ 🦋Δ👍👨‍👩‍👧‍👦'
Output:
"" (length: 0), has no duplicated characters.

"." (length: 1), has no duplicated characters.

"abcABC" (length: 6), has no duplicated characters.

"XYZ ZYX" (length: 7), has duplicated characters:
'X' (LATIN CAPITAL LETTER X; hex ordinal: 0x58) in positions: 1, 7
'Y' (LATIN CAPITAL LETTER Y; hex ordinal: 0x59) in positions: 2, 6
'Z' (LATIN CAPITAL LETTER Z; hex ordinal: 0x5A) in positions: 3, 5

"1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ" (length: 36), has duplicated characters:
'0' (DIGIT ZERO; hex ordinal: 0x30) in positions: 10, 25

"01234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ0X" (length: 39), has duplicated characters:
'0' (DIGIT ZERO; hex ordinal: 0x30) in positions: 1, 11, 26, 38
'X' (LATIN CAPITAL LETTER X; hex ordinal: 0x58) in positions: 35, 39

"🦋🙂👨‍👩‍👧‍👦🙄ΔΔ̂ 🦋Δ👍👨‍👩‍👧‍👦" (length: 11), has duplicated characters:
'🦋' (BUTTERFLY; hex ordinal: 0x1F98B) in positions: 1, 8
'👨‍👩‍👧‍👦' (MAN ZERO WIDTH JOINER WOMAN ZERO WIDTH JOINER GIRL ZERO WIDTH JOINER BOY; hex ordinal: 0x1F468 0x200D 0x1F469 0x200D 0x1F467 0x200D 0x1F466) in positions: 3, 11
'Δ' (GREEK CAPITAL LETTER DELTA; hex ordinal: 0x394) in positions: 5, 9

REXX

/*REXX pgm determines if a string is comprised of all unique characters (no duplicates).*/
@.=                                    /*assign a default for the @. array. */
parse arg @.1                          /*obtain optional argument from the CL.*/
if @.1='' then do;  @.1=               /*Not specified? Then assume defaults.*/
                    @.2= .
                    @.3= 'abcABC'
                    @.4= 'XYZ ZYX'
                    @.5= '1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ'
               end

    do j=1; if j\==1 & @.j=='' then leave /*String is null & not j=1? We're done*/
    say copies('─', 79)                /*display a separator line (a fence). */
    say 'Testing for the string (length' length(@.j)"): " @.j
    say
    dup= isUnique(@.j)
    say 'The characters in the string' word("are aren't", 1 + (dup>0) ) 'all unique.'
    if dup==0 then iterate
    ?= substr(@.j, dup, 1)
    say 'The character '  ? " ('"c2x(?)"'x) at position " dup ,
        ' is repeated at position ' pos(?, @.j, dup+1)
    end /*j*/
exit                                   /*stick a fork in it, we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
isUnique: procedure; parse arg x       /*obtain the character string.*/
    do k=1 to length(x) - 1            /*examine all but the last. */
    p= pos( substr(x, k, 1), x, k + 1) /*see if the Kth char is a dup*/
    if p\==0 then return k             /*Find a dup? Return location.*/
    end /*k*/
return 0                               /*indicate all chars unique. */
output   when using the internal defaults
───────────────────────────────────────────────────────────────────────────────
Testing for the string (length 0):

The characters in the string are all unique.
───────────────────────────────────────────────────────────────────────────────
Testing for the string (length 1):  .

The characters in the string are all unique.
───────────────────────────────────────────────────────────────────────────────
Testing for the string (length 6):  abcABC

The characters in the string are all unique.
───────────────────────────────────────────────────────────────────────────────
Testing for the string (length 7):  XYZ ZYX

The characters in the string aren't all unique.
The character  X  ('58'x)  at position  1  is repeated at position  7
───────────────────────────────────────────────────────────────────────────────
Testing for the string (length 36):  1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ

The characters in the string aren't all unique.
The character  0  ('30'x)  at position  10  is repeated at position  25

zkl

fcn stringUniqueness(str){  // Does not handle Unicode
   sz,unique,uz,counts := str.len(), str.unique(), unique.len(), str.counts();
   println("Length %d: \"%s\"".fmt(sz,str));
   if(sz==uz or uz==1) println("\tAll characters are unique");
   else // counts is (char,count, char,count, ...)
      println("\tDuplicate: ",
         counts.pump(List,Void.Read,fcn(str,c,n){
            if(n>1){
               is,z:=List(),-1; do(n){ is.append(z=str.find(c,z+1)) }
               "'%s' (0x%x)[%s]".fmt(c,c.toAsc(),is.concat(","))
            }
            else Void.Skip
         }.fp(str)).concat(", "));
}
testStrings:=T("", ".", "abcABC", "XYZ ZYX",
   "1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ",
   "01234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ0X");
foreach s in (testStrings){ stringUniqueness(s) }
Output:
Length 0: ""
	All characters are unique
Length 1: "."
	All characters are unique
Length 6: "abcABC"
	All characters are unique
Length 7: "XYZ ZYX"
	Duplicate: 'X' (0x58)[0,6], 'Y' (0x59)[1,5], 'Z' (0x5a)[2,4]
Length 36: "1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ"
	Duplicate: '0' (0x30)[9,24]
Length 39: "01234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ0X"
	Duplicate: '0' (0x30)[0,10,25,37], 'X' (0x58)[34,38]