SEDOL

From Rosetta Code


Programming Task
This is a programming task. It lays out a problem which Rosetta Code users are encouraged to solve, using languages they know.

Code examples should be formatted along the lines of one of the existing prototypes.
For each entry in a list of six-character SEDOLs, calculate and append the checksum digit. That is, given this input:
710889
B0YBKJ
406566
B0YBLH
228276
B0YBKL
557910
B0YBKR
585284
B0YBKT
Produce this output:
7108899
B0YBKJ7
4065663
B0YBLH2
2282765
B0YBKL9
5579107
B0YBKR5
5852842
B0YBKT7
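
The task statement does not spell out the arithmetic, so here is a minimal Python sketch of it, assuming the position weights 1, 3, 1, 7, 3, 9 and the character values (digits as 0-9, letters as 10-35) that the solutions on this page use. The names `WEIGHTS`, `char_value` and `check_digit` are illustrative only, and the sketch does not reject vowels or strings of the wrong length.

```python
# Position weights, as used by the solutions on this page.
WEIGHTS = (1, 3, 1, 7, 3, 9)

def char_value(ch):
    """Map '0'-'9' to 0-9 and 'A'-'Z' to 10-35 (illustrative helper)."""
    return int(ch, 36)

def check_digit(sedol6):
    """Return the check digit (an int) for a six-character SEDOL."""
    total = sum(char_value(ch) * w for ch, w in zip(sedol6, WEIGHTS))
    return (10 - total % 10) % 10

# Worked example: B=11, 0=0, Y=34, B=11, K=20, J=19
# 11*1 + 0*3 + 34*1 + 11*7 + 20*3 + 19*9 = 353, and (10 - 3) % 10 = 7
print("B0YBKJ" + str(check_digit("B0YBKJ")))  # prints B0YBKJ7
```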


Ada

 
with Ada.Text_IO;  use Ada.Text_IO;
 
procedure Test_SEDOL is
 
   subtype SEDOL_String is String (1..6);
   type SEDOL_Sum is range 0..9;
 
   function Check (SEDOL : SEDOL_String) return SEDOL_Sum is
      Weight : constant array (SEDOL_String'Range) of Integer := (1,3,1,7,3,9);
      Sum    : Integer := 0;
      Item   : Integer;
   begin
      for Index in SEDOL'Range loop
         Item := Character'Pos (SEDOL (Index));
         case Item is
            when Character'Pos ('0')..Character'Pos ('9') =>
               Item := Item - Character'Pos ('0');
            when Character'Pos ('B')..Character'Pos ('D') |
                 Character'Pos ('F')..Character'Pos ('H') |
                 Character'Pos ('J')..Character'Pos ('N') |
                 Character'Pos ('P')..Character'Pos ('T') |
                 Character'Pos ('V')..Character'Pos ('Z') =>
               Item := Item - Character'Pos ('A') + 10;
            when others =>
               raise Constraint_Error;
         end case;
         Sum := Sum + Item * Weight (Index);
      end loop;
      return SEDOL_Sum ((-Sum) mod 10);
   end Check;
 
   Test : constant array (1..10) of SEDOL_String :=
             (  "710889", "B0YBKJ", "406566", "B0YBLH", "228276",
                "B0YBKL", "557910", "B0YBKR", "585284", "B0YBKT"
             );
begin
   for Index in Test'Range loop
      Put_Line (Test (Index) & Character'Val (Character'Pos ('0') + Check (Test (Index))));
   end loop;
end Test_SEDOL;
 

The function Check raises Constraint_Error upon an invalid input. The calculated sum is trimmed using (-sum) mod 10, which is mathematically equivalent to (10 - (sum mod 10)) mod 10.
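
The equivalence of the two trimming formulas can be checked exhaustively. The following is a small Python sketch, not part of the Ada solution; it relies on Python's `%` returning a non-negative result for a positive modulus, which matches Ada's `mod` here. The function names are made up for the illustration.

```python
def trim_negated(total):
    # Counterpart of the Ada expression (-Sum) mod 10.
    return (-total) % 10

def trim_subtracted(total):
    # The more common textbook form.
    return (10 - total % 10) % 10

# The largest possible weighted sum is 35 * (1+3+1+7+3+9) = 840,
# so checking 0..840 covers every case.
assert all(trim_negated(t) == trim_subtracted(t) for t in range(841))

# 710889 has the weighted sum 171; both forms give its check digit.
print(trim_negated(171), trim_subtracted(171))  # prints 9 9
```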

Sample output:

7108899
B0YBKJ7
4065663
B0YBLH2
2282765
B0YBKL9
5579107
B0YBKR5
5852842
B0YBKT7

BASIC

Works with: QuickBasic version 4.5

DECLARE FUNCTION getSedolCheckDigit! (str AS STRING)
DO
        INPUT a$
        ' STR$ prefixes non-negative numbers with a space; LTRIM$ removes it
        PRINT a$ + LTRIM$(STR$(getSedolCheckDigit(a$)))
LOOP WHILE a$ <> ""
 
FUNCTION getSedolCheckDigit (str AS STRING)
    IF LEN(str) <> 6 THEN
        PRINT "Six chars only please"
        EXIT FUNCTION
    END IF
    str = UCASE$(str)
    DIM mult(6) AS INTEGER
    mult(1) = 1: mult(2) = 3: mult(3) = 1
    mult(4) = 7: mult(5) = 3: mult(6) = 9
    total = 0
    FOR i = 1 TO 6
        s$ = MID$(str, i, 1)
        IF s$ = "A" OR s$ = "E" OR s$ = "I" OR s$ = "O" OR s$ = "U" THEN
                PRINT "No vowels"
                EXIT FUNCTION
        END IF
        IF ASC(s$) >= 48 AND ASC(s$) <= 57 THEN
                total = total + VAL(s$) * mult(i)
        ELSE
                total = total + (ASC(s$) - 55) * mult(i)
        END IF
 
    NEXT i
    getSedolCheckDigit = (10 - (total MOD 10)) MOD 10
END FUNCTION

Forth

create weight 1 , 3 , 1 , 7 , 3 , 9 ,

: char>num ( '0-9A-Z' -- 0..35 )
  dup [char] 9 > 7 and - [char] 0 - ;

: check+ ( sedol -- sedol' )
  6 <> abort" wrong SEDOL length"
  0 ( sum )
  6 0 do
    over I + c@ char>num
    weight I cells + @ *
    +
  loop
  10 mod   10 swap -  10 mod  [char] 0 +
  over 6 + c! 7 ;
: sedol"   [char] " parse check+ type ;

sedol" 710889" 7108899 ok
sedol" B0YBKJ" B0YBKJ7 ok
sedol" 406566" 4065663 ok
sedol" B0YBLH" B0YBLH2 ok
sedol" 228276" 2282765 ok
sedol" B0YBKL" B0YBKL9 ok
sedol" 557910" 5579107 ok
sedol" B0YBKR" B0YBKR5 ok
sedol" 585284" 5852842 ok
sedol" B0YBKT" B0YBKT7 ok

Fortran

Works with: Fortran version 90 and later

MODULE SEDOL_CHECK
  IMPLICIT NONE
  CONTAINS
 
  FUNCTION Checkdigit(c)
    CHARACTER :: Checkdigit
    CHARACTER(6), INTENT(IN) :: c
    CHARACTER(36) :: alpha = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    INTEGER, DIMENSION(6) :: weights = (/ 1, 3, 1, 7, 3, 9 /), temp
    INTEGER :: i, n

    DO i = 1, 6
      temp(i) = INDEX(alpha, c(i:i)) - 1
    END DO
    temp = temp * weights
    n = MOD(10 - (MOD(SUM(temp), 10)), 10)  
    Checkdigit = ACHAR(n + 48)
  END FUNCTION Checkdigit
 
END MODULE SEDOL_CHECK

PROGRAM SEDOLTEST
  USE SEDOL_CHECK
  IMPLICIT NONE
 
  CHARACTER(31) :: valid = "0123456789BCDFGHJKLMNPQRSTVWXYZ"
  CHARACTER(6) :: codes(10) = (/ "710889", "B0YBKJ", "406566", "B0YBLH", "228276" ,  &
                                 "B0YBKL", "557910", "B0YBKR", "585284", "B0YBKT" /)
  CHARACTER(7) :: sedol
  INTEGER :: i, invalid

  DO i = 1, 10
    invalid = VERIFY(codes(i), valid)
    IF (invalid == 0) THEN
      sedol = codes(i)
      sedol(7:7) = Checkdigit(codes(i))
    ELSE
      sedol = "INVALID"
    END IF
    WRITE(*, "(2A9)") codes(i), sedol
  END DO
   
END PROGRAM SEDOLTEST

Output

  710889  7108899
  B0YBKJ  B0YBKJ7
  406566  4065663
  B0YBLH  B0YBLH2
  228276  2282765
  B0YBKL  B0YBKL9
  557910  5579107
  B0YBKR  B0YBKR5
  585284  5852842
  B0YBKT  B0YBKT7

J

There are several ways to perform this in J. This most closely follows the algorithmic description at Wikipedia:

  sn   =.  '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'  
  ac0  =:  (, 10 | 1 3 1 7 3 9 +/@:* -)&.(sn i. |:)

However, because J is so concise, having written the above, it becomes clear that the negation (-) is unnecessary.

The fundamental operation is the linear combination (+/@:*) and neither argument is "special". In particular, the coefficients are just another array participating in the calculation, and there's no reason we can't modify them as easily as the input array. Having this insight, it is obvious that manipulating the coefficients, rather than the input array, will be more efficient (because the coefficients are fixed at small size, while the input array can be arbitrarily large).

Which leads us to this more efficient formulation:

  ac1  =:  (, 10 | (10 - 1 3 1 7 3 9) +/@:* ])&.(sn i. |:)

which reduces to:

  ac1  =:  (, 10 | 9 7 9 3 7 1 +/@:* ])&.(sn i. |:)

Which is just as concise as ac0, but faster.
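
The reasoning behind replacing the negation with complemented coefficients can be sketched in Python (used here instead of J purely for illustration; all names are hypothetical). Since the sum of v*(10 - w) equals 10 times the sum of v minus the sum of v*w, it is congruent to the negated weighted sum modulo 10.

```python
WEIGHTS = (1, 3, 1, 7, 3, 9)
COMPLEMENT = tuple(10 - w for w in WEIGHTS)  # (9, 7, 9, 3, 7, 1)

def values(sedol6):
    # Digits map to 0-9, letters to 10-35.
    return [int(ch, 36) for ch in sedol6]

def check_negated(sedol6):
    # Original weights, then negate the sum before reducing mod 10.
    total = sum(v * w for v, w in zip(values(sedol6), WEIGHTS))
    return (-total) % 10

def check_complement(sedol6):
    # Complemented weights, no negation needed.
    return sum(v * w for v, w in zip(values(sedol6), COMPLEMENT)) % 10

for code in ("710889", "B0YBKJ", "406566", "B0YBLH"):
    assert check_negated(code) == check_complement(code)
```

This sketch only shows that the two forms agree; it says nothing about their relative speed in J.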

Following this train of thought, our array thinking leads us to realize that even the modulus isn't necessary. The number of SEDOL characters is finite, as is the number of coefficients; therefore the number of possible linear combinations of these is finite. In fact, there are only 841 possible outcomes (the weighted sum ranges from 0 to 35 × 24 = 840). This is a small number, and can be efficiently stored as a lookup table (even better, since the outcomes will be mod 10, they are restricted to the digits 0-9, and they repeat).

Which leads us to:

  ac2  =.  (,"1 0 (841 $ '0987654321') {~ 1 3 1 7 3 9 +/ .*~ sn i. ])

Which is more than twice as fast as even the optimized formulation (ac1), though it is slightly longer.
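
For readers who do not know J, the lookup-table idea might look roughly like this in Python. This is a sketch under the same assumptions as ac2 (every character maps to 0-35, no validation); `TABLE` and `append_check` are names invented for the example.

```python
WEIGHTS = (1, 3, 1, 7, 3, 9)
MAX_SUM = 35 * sum(WEIGHTS)  # 840, so 841 possible weighted sums

# Repeat '0987654321' cyclically to length 841, like J's  841 $ '0987654321'.
# Entry t holds the check digit for weighted sum t.
TABLE = ("0987654321" * 85)[:MAX_SUM + 1]

def append_check(sedol6):
    """Return the SEDOL with its check digit appended, via table lookup."""
    total = sum(int(ch, 36) * w for ch, w in zip(sedol6, WEIGHTS))
    return sedol6 + TABLE[total]

print(append_check("710889"))  # prints 7108899
```

The table replaces both the negation and the final modulus with a single index operation.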

Java

import java.util.Scanner;
 
public class SEDOL{
	public static void main(String[] args){
		Scanner sc = new Scanner(System.in);
		while(sc.hasNext()){
			String sedol = sc.next();
			System.out.println(sedol + getSedolCheckDigit(sedol));
		}
	}
	public static int getSedolCheckDigit(String str){
	    if(str.length() != 6){
	    	System.err.println("Six chars only please");
	    	return -1;
	    }
	    str = str.toUpperCase();
	    int[] mult = {1, 3, 1, 7, 3, 9};
	    int total = 0;
	    for(int i = 0;i < 6; i++){
	        char s = str.charAt(i);
	        if(s == 'A' || s == 'E' || s == 'I' || s == 'O' ||s == 'U'){
	        	System.err.println("No vowels");
	        	return -1;
	        }
	        total += 
	        (Character.isDigit(s) ? Character.digit(s, 10) : s - 55) * mult[i];
	    }
	    return (10 - (total % 10)) % 10;
	}
}

Perl

This program reads from standard input.

sub sum
 {my $n = 0;
  $n += $_ foreach @_;
  return $n;}
 
sub zip
 {my $f = shift;
  my @a = @{shift()};
  my @b = @{shift()};
  my @result = ();
  push(@result, $f->(shift @a, shift @b)) while @a and @b;
  return @result;}
 
sub char_to_v
 {my $c = shift;
  $c =~ /[A-Z]/ and $c = ord($c) - ord('A') + 10;
  return $c;}
 
my @weights = (1, 3, 1, 7, 3, 9);
sub sedol
 {my $s = shift;
  my @vs = map {char_to_v $_} split //, $s;
  my $checksum = sum (zip sub {$_[0] * $_[1]}, \@vs, \@weights);
  my $check_digit = (10 - $checksum % 10) % 10;
  return $s . $check_digit;}
 
while (<>)
   {chomp;
    print sedol($_), "\n";}

Python

import string
 
# constants
sedolchars = string.digits + string.ascii_uppercase
sedol2value = dict((ch, n) for n,ch in enumerate(sedolchars))
for ch in 'AEIOU':
    del sedol2value[ch]
sedolchars = sorted(sedol2value.keys())
sedolweight = [1,3,1,7,3,9,1]
 
def checksum(sedol):
    tmp = sum(sedol2value[ch] * sedolweight[n]
               for n,ch in enumerate(sedol[:6])
               )
    return sedolchars[ (10 - (tmp % 10)) % 10]
 
for sedol in '''
    710889
    B0YBKJ
    406566
    B0YBLH
    228276
    B0YBKL
    557910
    B0YBKR
    585284
    B0YBKT
    '''.split():
    print(sedol + checksum(sedol))
 