Top rank per group: Difference between revisions

Content added Content deleted

Inline

Revision as of 05:11, 11 February 2009

Find the top N salaries in each group, where N is provided as a parameter.

Use this data as a formatted internal data structure(adapt it to your language-native idioms, rather than parse at runtime), or identify your external data source: <lang csv>Employee Name,Employee ID,Salary,Department Tyler Bennett,E10297,32000,D101 John Rappl,E21437,47000,D050 George Woltman,E00127,53500,D101 Adam Smith,E63535,18000,D202 Claire Buckman,E39876,27800,D202 David McClellan,E04242,41500,D101 Rich Holcomb,E01234,49500,D202 Nathan Adams,E41298,21900,D050 Richard Potter,E43128,15900,D101 David Motsinger,E27002,19250,D202 Tim Sampair,E03033,27000,D101 Kim Arlich,E10001,57000,D190 Timothy Grove,E16398,29900,D190</lang>

Haskell

<lang haskell> import Data.List import Control.Monad import Control.Arrow import Text.Printf

groupingOn f a b = f a == f b comparing f a b = compare (f a) (f b) comparingDwn f a b = compare (f b) (f a)

type ID = Int type DEP = [Char] type NAME = [Char] type SALARY = Double type Employee = (ID, DEP, NAME, SALARY)

employees :: [Employee] employees = [(1001,"AB","Janssen A.H.",41000), (101,"KA","'t Woud B.",45000),

            (1013,"AB","de Bont C.A.",65000), (1101,"CC","Modaal A.M.J.",30000),
            (1203,"AB","Anders H.",50000),    (100,"KA","Ezelbips P.J.",52000),
            (1102,"CC","Zagt A.",33000),     (1103,"CC","Ternood T.R.",21000),
            (1104,"CC","Lageln M.",23000),   (1105,"CC","Amperwat A.",19000),
            (1106,"CC","Boon T.J.",25000), (1107,"CC","Beloop L.O.",31000),
            (1009,"CD","Janszoon A.",38665), (1026,"CD","Janszen H.P.",41000),
            (1011,"CC","de Goeij J.",39000), (106,"KA","Pragtweik J.M.V.",42300),
            (111,"KA","Bakeuro S.",31000),  (105,"KA","Clubdrager C.",39800),
            (104,"KA","Karendijk F.",23000), (107,"KA","Centjes R.M.",34000),
            (119,"KA","Tegenstroom H.L.",39000), (1111,"CD","Telmans R.M.",55500),
            (1093,"AB","de Slegte S.",46987), (1199,"CC","Uitlaat G.A.S.",44500)
           ]

nr :: Employee -> ID nr (i,_,_,_) = i

dep :: Employee -> DEP dep (_,d,_,_) = d

name :: Employee -> NAME name (_,_,n,_) = n

sal :: Employee -> SALARY sal (_,_,_,s) = s

dorank :: Int ->

         (Employee -> DEP) ->
         (Employee -> SALARY) ->
         [Employee]-> Employee

dorank n o1 o2 = map (take n. sortBy (comparingDwn o2))

                . groupBy (groupingOn o1) . sortBy (comparing o1)

toprank :: IO () toprank = do

  printf "%-16s %3s %10s\n" "NAME" "DEP" "TIP" 
  printf "%s\n" $ replicate 31 '='
  mapM_ (mapM_ (ap (ap (printf "%-16s %3s %10.2g\n" . name) dep) sal)) $ dorank 3 dep sal employees

</lang> Output: top 3 per department

*Main> toprank
NAME             DEP        TIP
===============================
de Bont C.A.      AB   65000.00
Anders H.         AB   50000.00
de Slegte S.      AB   46987.00
Uitlaat G.A.S.    CC   44500.00
de Goeij J.       CC   39000.00
Zagt A.           CC   33000.00
Telmans R.M.      CD   55500.00
Janszen H.P.      CD   41000.00
Janszoon A.       CD   38665.00
Ezelbips P.J.     KA   52000.00
't Woud B.        KA   45000.00
Pragtweik J.M.V.  KA   42300.00

J

J has a rich set of primitive functions, which combine the power of an imperative language with the expressiveness of a declarative, SQL-like language:

   NB.  Dynamically generate convenience functions
   ('`',,;:^:_1: N=:{.Employees) =:, (_&{"1)`'' ([^:(_ -: ])L:0)"0 _~ i.# E =: {: Employees

   NB.  Show top ranked employees in each dept
   N , (<@:>"1@:|:@:((6 <. #) {. ] \: SALARY)/.~ DEPT) |: <"1&> E

</lang>

+-----+-----+-----------------+------+
|ID   |DEPT |NAME             |SALARY|
+-----+-----+-----------------+------+
|1013 |AB   |de Bont C.A.     |65000 |
|1203 |AB   |Anders H.        |50000 |
|1093 |AB   |de Slegte S.     |46987 |
|1001 |AB   |Janssen A.H.     |41000 |
+-----+-----+-----------------+------+
|100  |KA   |Ezelbips P.J.    |52000 |
|101  |KA   |'t Woud B.       |45000 |
|106  |KA   |Pragtweik J.M.V. |42300 |
|105  |KA   |Clubdrager C.    |39800 |
|119  |KA   |Tegenstroom H.L. |39000 |
|107  |KA   |Centjes R.M.     |34000 |
+-----+-----+-----------------+------+
|1199 |CC   |Uitlaat G.A.S.   |44500 |
|1011 |CC   |de Goeij J.      |39000 |
|1102 |CC   |Zagt A.          |33000 |
|1107 |CC   |Beloop L.O.      |31000 |
|1101 |CC   |Modaal A.M.J.    |30000 |
|1106 |CC   |Boon T.J.        |25000 |
+-----+-----+-----------------+------+
|1111 |CD   |Telmans R.M.     |55500 |
|1026 |CD   |Janszen H.P.     |41000 |
|1009 |CD   |Janszoon A.      |38665 |
+-----+-----+-----------------+------+

using the data set:

   Employees=: (<;.1~(1 1{.~#);+./@:(;:E.S:0])@:{.)];._2 noun define
   ID   DEPT NAME             SALARY
   1001 AB   Janssen A.H.     41000 
   101  KA   't Woud B.       45000 
   1013 AB   de Bont C.A.     65000 
   1101 CC   Modaal A.M.J.    30000 
   1203 AB   Anders H.        50000 
   100  KA   Ezelbips P.J.    52000 
   1102 CC   Zagt A.          33000 
   1103 CC   Ternood T.R.     21000 
   1104 CC   Lageln M.        23000 
   1105 CC   Amperwat A.      19000 
   1106 CC   Boon T.J.        25000 
   1107 CC   Beloop L.O.      31000 
   1009 CD   Janszoon A.      38665 
   1026 CD   Janszen H.P.     41000 
   1011 CC   de Goeij J.      39000 
   106  KA   Pragtweik J.M.V. 42300 
   111  KA   Bakeuro S.       31000 
   105  KA   Clubdrager C.    39800 
   104  KA   Karendijk F.     23000 
   107  KA   Centjes R.M.     34000 
   119  KA   Tegenstroom H.L. 39000 
   1111 CD   Telmans R.M.     55500 
   1093 AB   de Slegte S.     46987 
   1199 CC   Uitlaat G.A.S.   44500 
   )

SMEQL

The following SMEQL example returns the top 6 earners in each department based on this table schema:

 table: Employees
 ----------------
 empID
 dept
 empName
 salary

Source Code:

 srt = orderBy(Employees, (dept, salary), order)
 top = group(srt, [(dept) dept2, max(order) order])
 join(srt, top, a.dept=b.dept2 and b.order - a.order < 6)