Bin given limits
You are encouraged to solve this task according to the task description, using any language you may know.
You are given a list of n ascending, unique numbers which are to form limits for n+1 bins which count how many of a large set of input numbers fall in the range of each bin.
(Assuming zero-based indexing)
bin[0] counts how many inputs are < limit[0]
bin[1] counts how many inputs are >= limit[0] and < limit[1]
..
bin[n-1] counts how many inputs are >= limit[n-2] and < limit[n-1]
bin[n] counts how many inputs are >= limit[n-1]
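As a quick illustration of the rule above, the bin index of a value is the number of limits that are less than or equal to it. The following Python sketch applies that definition with a plain linear scan; the names `check_limits` and `bin_index` are hypothetical and not part of the task.

```python
def check_limits(limits):
    """Raise ValueError unless limits are strictly ascending (hence unique)."""
    for lo, hi in zip(limits, limits[1:]):
        if not lo < hi:
            raise ValueError(f"limits must be strictly ascending: {lo} is not < {hi}")


def bin_index(limits, x):
    """Zero-based bin for x: the count of limits that x is greater than or equal to."""
    return sum(1 for limit in limits if x >= limit)


limits = [23, 37, 43, 53, 67, 83]
check_limits(limits)
for x in (22, 23, 36, 37, 82, 83, 99):
    print(x, "-> bin", bin_index(limits, x))
```

A value equal to a limit always lands in the bin above that limit, which is why 23 goes to bin 1 and 83 goes to the last bin.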
- Task
The task is to create a function that, given the ascending limits and a stream or list of numbers, returns the bins, together with another function that, given the same list of limits and the binning, prints the limit of each bin together with the count of items that fell in its range.
Assume the set of numbers to bin is too large to practically sort.
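Because the input may be too large to sort, or even to hold in memory, one option is to consume it lazily and update the counts one value at a time. The sketch below assumes Python's standard `bisect.bisect_right` for the lookup; `bin_stream` and `numbers` are hypothetical names used only for illustration.

```python
from bisect import bisect_right


def bin_stream(limits, stream):
    """Count items from any iterable into len(limits) + 1 bins, one at a time."""
    counts = [0] * (len(limits) + 1)
    for x in stream:
        # bisect_right returns the number of limits <= x, which is the bin index.
        counts[bisect_right(limits, x)] += 1
    return counts


def numbers():
    """Stand-in for a large source: yields values lazily, never as a list."""
    for i in range(100_000):
        yield (i * 37) % 100


print(bin_stream([23, 37, 43, 53, 67, 83], numbers()))
```

Memory use depends only on the number of bins, and each lookup is a binary search over the limits rather than a pass over the data.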
- Task examples
Part 1: Bin the given input data using the following limits
limits = [23, 37, 43, 53, 67, 83] data = [95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47, 16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55]
Part 2: Bin the given input data using the following limits
limits = [14, 18, 249, 312, 389, 392, 513, 591, 634, 720] data = [445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933, 416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749]
Show output here, on this page.
Ada
This example works with Ada 2012. The definition of the subtype Limits_Array employs a dynamic predicate to ensure that the limits array is sorted. The solution defines the binning types and operations within an Ada package, providing modularity and simplifying the code in the main procedure.
package specification:
<lang Ada>package binning is
   type Nums_Array is array (Natural range <>) of Integer;
   function Is_Sorted (Item : Nums_Array) return Boolean;
   subtype Limits_Array is Nums_Array with
      Dynamic_Predicate => Is_Sorted (Limits_Array);
   function Bins (Limits : Limits_Array; Data : Nums_Array) return Nums_Array;
   procedure Print (Limits : Limits_Array; Bin_Result : Nums_Array);
end binning; </lang>
package body:
<lang Ada>pragma Ada_2012;
with Ada.Text_IO;         use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;
package body binning is
   ---------------
   -- Is_Sorted --
   ---------------

   function Is_Sorted (Item : Nums_Array) return Boolean is
   begin
      return
        (for all i in Item'First .. Item'Last - 1 => Item (i) < Item (i + 1));
   end Is_Sorted;

   ----------
   -- Bins --
   ----------

   function Bins (Limits : Limits_Array; Data : Nums_Array) return Nums_Array
   is
      Result    : Nums_Array (Limits'First .. Limits'Last + 1) := (others => 0);
      Bin_Index : Natural;
   begin
      for value of Data loop
         Bin_Index := Result'First;
         for I in reverse Limits'Range loop
            if value >= Limits (I) then
               Bin_Index := I + 1;
               exit;
            end if;
         end loop;
         Result (Bin_Index) := Result (Bin_Index) + 1;
      end loop;
      return Result;
   end Bins;

   -----------
   -- Print --
   -----------

   procedure Print (Limits : Limits_Array; Bin_Result : Nums_Array) is
   begin
      if Limits'Length = 0 then
         return;
      end if;
      Put (" < ");
      Put (Item => Limits (Limits'First), Width => 3);
      Put (": ");
      Put (Item => Bin_Result (Bin_Result'First), Width => 2);
      New_Line;
      for i in Limits'First + 1 .. Limits'Last loop
         Put (">= ");
         Put (Item => Limits (i - 1), Width => 3);
         Put (" and < ");
         Put (Item => Limits (i), Width => 3);
         Put (": ");
         Put (Item => Bin_Result (i), Width => 2);
         New_Line;
      end loop;
      Put (">= ");
      Put (Item => Limits (Limits'Last), Width => 3);
      Put (" : ");
      Put (Item => Bin_Result (Bin_Result'Last), Width => 2);
      New_Line;
   end Print;
end binning; </lang>
main procedure:
<lang Ada>with Ada.Text_IO; use Ada.Text_IO;
with binning;     use binning;
procedure Main is
Limits_1 : Limits_Array := (23, 37, 43, 53, 67, 83); Data_1 : Nums_Array := (95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47, 16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55); Limits_2 : Limits_Array := (14, 18, 249, 312, 389, 392, 513, 591, 634, 720); Data_2 : Nums_Array := (445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77, 323, 525, 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47, 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267, 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391, 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213, 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917, 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137, 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981, 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898, 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692, 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40, 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427, 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23, 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374, 101, 684, 727, 749); Bin_1 : Nums_Array := Bins (Limits => Limits_1, Data => Data_1); Bin_2 : Nums_Array := Bins (Limits => Limits_2, Data => Data_2);
begin
   Put_Line ("Example 1:");
   Print (Limits => Limits_1, Bin_Result => Bin_1);
   New_Line;
   Put_Line ("Example 2:");
   Print (Limits => Limits_2, Bin_Result => Bin_2);
end Main; </lang>
- Output:
Example 1:
< 23: 11
>= 23 and < 37: 4
>= 37 and < 43: 2
>= 43 and < 53: 6
>= 53 and < 67: 9
>= 67 and < 83: 5
>= 83 : 13

Example 2:
< 14: 3
>= 14 and < 18: 0
>= 18 and < 249: 44
>= 249 and < 312: 10
>= 312 and < 389: 16
>= 389 and < 392: 2
>= 392 and < 513: 28
>= 513 and < 591: 16
>= 591 and < 634: 6
>= 634 and < 720: 16
>= 720 : 59
C
<lang c>#include <stdio.h>
#include <stdlib.h>
size_t upper_bound(const int* array, size_t n, int value) {
    size_t start = 0;
    while (n > 0) {
        size_t step = n / 2;
        size_t index = start + step;
        if (value >= array[index]) {
            start = index + 1;
            n -= step + 1;
        } else {
            n = step;
        }
    }
    return start;
}

int* bins(const int* limits, size_t nlimits, const int* data, size_t ndata) {
    int* result = calloc(nlimits + 1, sizeof(int));
    if (result == NULL)
        return NULL;
    for (size_t i = 0; i < ndata; ++i)
        ++result[upper_bound(limits, nlimits, data[i])];
    return result;
}

void print_bins(const int* limits, size_t n, const int* bins) {
    if (n == 0)
        return;
    printf(" < %3d: %2d\n", limits[0], bins[0]);
    for (size_t i = 1; i < n; ++i)
        printf(">= %3d and < %3d: %2d\n", limits[i - 1], limits[i], bins[i]);
    printf(">= %3d : %2d\n", limits[n - 1], bins[n]);
}
int main() {
const int limits1[] = {23, 37, 43, 53, 67, 83}; const int data1[] = {95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47, 16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55};
    printf("Example 1:\n");
    size_t n = sizeof(limits1) / sizeof(int);
    int* b = bins(limits1, n, data1, sizeof(data1) / sizeof(int));
    if (b == NULL) {
        fprintf(stderr, "Out of memory\n");
        return EXIT_FAILURE;
    }
    print_bins(limits1, n, b);
    free(b);
const int limits2[] = {14, 18, 249, 312, 389, 392, 513, 591, 634, 720}; const int data2[] = { 445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77, 323, 525, 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47, 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267, 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391, 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213, 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917, 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137, 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981, 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898, 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692, 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40, 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427, 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23, 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374, 101, 684, 727, 749};
    printf("\nExample 2:\n");
    n = sizeof(limits2) / sizeof(int);
    b = bins(limits2, n, data2, sizeof(data2) / sizeof(int));
    if (b == NULL) {
        fprintf(stderr, "Out of memory\n");
        return EXIT_FAILURE;
    }
    print_bins(limits2, n, b);
    free(b);
return EXIT_SUCCESS;
}</lang>
- Output:
Example 1:
< 23: 11
>= 23 and < 37: 4
>= 37 and < 43: 2
>= 43 and < 53: 6
>= 53 and < 67: 9
>= 67 and < 83: 5
>= 83 : 13

Example 2:
< 14: 3
>= 14 and < 18: 0
>= 18 and < 249: 44
>= 249 and < 312: 10
>= 312 and < 389: 16
>= 389 and < 392: 2
>= 392 and < 513: 28
>= 513 and < 591: 16
>= 591 and < 634: 6
>= 634 and < 720: 16
>= 720 : 59
C++
<lang cpp>#include <algorithm>
#include <cassert>
#include <iomanip>
#include <iostream>
#include <vector>
std::vector<int> bins(const std::vector<int>& limits,
                      const std::vector<int>& data) {
    std::vector<int> result(limits.size() + 1, 0);
    for (int n : data) {
        auto i = std::upper_bound(limits.begin(), limits.end(), n);
        ++result[i - limits.begin()];
    }
    return result;
}

void print_bins(const std::vector<int>& limits, const std::vector<int>& bins) {
    size_t n = limits.size();
    if (n == 0)
        return;
    assert(n + 1 == bins.size());
    std::cout << " < " << std::setw(3) << limits[0] << ": "
              << std::setw(2) << bins[0] << '\n';
    for (size_t i = 1; i < n; ++i)
        std::cout << ">= " << std::setw(3) << limits[i - 1] << " and < "
                  << std::setw(3) << limits[i] << ": " << std::setw(2)
                  << bins[i] << '\n';
    std::cout << ">= " << std::setw(3) << limits[n - 1] << " : "
              << std::setw(2) << bins[n] << '\n';
}
int main() {
const std::vector<int> limits1{23, 37, 43, 53, 67, 83}; const std::vector<int> data1{ 95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47, 16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55};
std::cout << "Example 1:\n"; print_bins(limits1, bins(limits1, data1));
const std::vector<int> limits2{14, 18, 249, 312, 389, 392, 513, 591, 634, 720}; const std::vector<int> data2{ 445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77, 323, 525, 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47, 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267, 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391, 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213, 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917, 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137, 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981, 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898, 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692, 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40, 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427, 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23, 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374, 101, 684, 727, 749};
std::cout << "\nExample 2:\n"; print_bins(limits2, bins(limits2, data2));
}</lang>
- Output:
Example 1:
< 23: 11
>= 23 and < 37: 4
>= 37 and < 43: 2
>= 43 and < 53: 6
>= 53 and < 67: 9
>= 67 and < 83: 5
>= 83 : 13

Example 2:
< 14: 3
>= 14 and < 18: 0
>= 18 and < 249: 44
>= 249 and < 312: 10
>= 312 and < 389: 16
>= 389 and < 392: 2
>= 392 and < 513: 28
>= 513 and < 591: 16
>= 591 and < 634: 6
>= 634 and < 720: 16
>= 720 : 59
Factor
Factor provides the bisect-right word in the sorting.extras vocabulary.
<lang factor>USING: assocs formatting grouping io kernel math math.parser
math.statistics sequences sequences.extras sorting.extras ;
: bin ( data limits -- seq )
    dup length 1 + [ 0 ] replicate -rot
    [ bisect-right over [ 1 + ] change-nth ] curry each ;

: .bin ( {lo,hi} n i -- )
    swap "%3d members in " printf zero? "(" "[" ? write
    "%s, %s)\n" vprintf ;

: .bins ( data limits -- )
    dup [ number>string ] map "-∞" prefix "∞" suffix 2 clump
    -rot bin [ .bin ] 2each-index ;

"First example:" print
{
95 21 94 12 99 4 70 75 83 93 52 80 57 5 53 86 65 17 92 83 71 61 54 58 47 16 8 9 32 84 7 87 46 19 30 37 96 6 98 40 79 97 45 64 60 29 49 36 43 55
} { 23 37 43 53 67 83 } .bins nl
"Second example:" print {
445 814 519 697 700 130 255 889 481 122 932 77 323 525 570 219 367 523 442 933 416 589 930 373 202 253 775 47 731 685 293 126 133 450 545 100 741 583 763 306 655 267 248 477 549 238 62 678 98 534 622 907 406 714 184 391 913 42 560 247 346 860 56 138 546 38 985 948 58 213 799 319 390 634 458 945 733 507 916 123 345 110 720 917 313 845 426 9 457 628 410 723 354 895 881 953 677 137 397 97 854 740 83 216 421 94 517 479 292 963 376 981 480 39 257 272 157 5 316 395 787 942 456 242 759 898 576 67 298 425 894 435 831 241 989 614 987 770 384 692 698 765 331 487 251 600 879 342 982 527 736 795 585 40 54 901 408 359 577 237 605 847 353 968 832 205 838 427 876 959 686 646 835 127 621 892 443 198 988 791 466 23 707 467 33 670 921 180 991 396 160 436 717 918 8 374 101 684 727 749
} { 14 18 249 312 389 392 513 591 634 720 } .bins</lang>
- Output:
First example:
11 members in (-∞, 23)
4 members in [23, 37)
2 members in [37, 43)
6 members in [43, 53)
9 members in [53, 67)
5 members in [67, 83)
13 members in [83, ∞)

Second example:
3 members in (-∞, 14)
0 members in [14, 18)
44 members in [18, 249)
10 members in [249, 312)
16 members in [312, 389)
2 members in [389, 392)
28 members in [392, 513)
16 members in [513, 591)
6 members in [591, 634)
16 members in [634, 720)
59 members in [720, ∞)
Go
<lang go>package main
import (
    "fmt"
    "sort"
)
func getBins(limits, data []int) []int {
    n := len(limits)
    bins := make([]int, n+1)
    for _, d := range data {
        index := sort.SearchInts(limits, d) // uses binary search
        if index < len(limits) && d == limits[index] {
            index++
        }
        bins[index]++
    }
    return bins
}
func printBins(limits, bins []int) {
    n := len(limits)
    fmt.Printf(" < %3d = %2d\n", limits[0], bins[0])
    for i := 1; i < n; i++ {
        fmt.Printf(">= %3d and < %3d = %2d\n", limits[i-1], limits[i], bins[i])
    }
    fmt.Printf(">= %3d = %2d\n", limits[n-1], bins[n])
    fmt.Println()
}
func main() {
limitsList := [][]int{ {23, 37, 43, 53, 67, 83}, {14, 18, 249, 312, 389, 392, 513, 591, 634, 720}, }
dataList := [][]int{ { 95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47, 16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55, }, { 445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77, 323, 525, 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47, 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267, 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391, 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213, 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917, 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137, 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981, 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898, 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692, 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40, 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427, 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23, 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374, 101, 684, 727, 749, }, }
    for i := 0; i < len(limitsList); i++ {
        fmt.Println("Example", i+1, "\b\n")
        bins := getBins(limitsList[i], dataList[i])
        printBins(limitsList[i], bins)
    }
}</lang>
- Output:
Example 1

< 23 = 11
>= 23 and < 37 = 4
>= 37 and < 43 = 2
>= 43 and < 53 = 6
>= 53 and < 67 = 9
>= 67 and < 83 = 5
>= 83 = 13

Example 2

< 14 = 3
>= 14 and < 18 = 0
>= 18 and < 249 = 44
>= 249 and < 312 = 10
>= 312 and < 389 = 16
>= 389 and < 392 = 2
>= 392 and < 513 = 28
>= 513 and < 591 = 16
>= 591 and < 634 = 6
>= 634 and < 720 = 16
>= 720 = 59
Julia
<lang julia>"""Add the function Python has in its bisect library"""
function bisect_right(array, x, low = 1, high = length(array) + 1)
    while low < high
        middle = (low + high) ÷ 2
        x < array[middle] ? (high = middle) : (low = middle + 1)
    end
    return low
end

""" Bin data according to (ascending) limits """
function bin_it(limits, data)
    bins = zeros(Int, length(limits) + 1)     # adds under/over range bins too
    for d in data
        bins[bisect_right(limits, d)] += 1
    end
    return bins
end

""" Pretty print the resulting bins and counts """
function bin_print(limits, bins)
    println(" < $(lpad(limits[1], 3)) := $(lpad(bins[1], 3))")
    for (lo, hi, count) in zip(limits, limits[2:end], bins[2:end])
        println(">= $(lpad(lo, 3)) .. < $(lpad(hi, 3)) := $(lpad(count, 3))")
    end
    println(">= $(lpad(limits[end], 3)) := $(lpad(bins[end], 3))")
end

""" Test on data provided """
function testbins()
    println("RC FIRST EXAMPLE:")
    limits = [23, 37, 43, 53, 67, 83]
    data = [95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47,
        16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55]
    bins = bin_it(limits, data)
    bin_print(limits, bins)

    println("\nRC SECOND EXAMPLE:")
    limits = [14, 18, 249, 312, 389, 392, 513, 591, 634, 720]
    data = [445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933,
        416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306,
        655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247,
        346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123,
        345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97,
        854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395,
        787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692,
        698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237,
        605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791,
        466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749]
    bins = bin_it(limits, data)
    bin_print(limits, bins)
end

testbins()
</lang>
- Output:
RC FIRST EXAMPLE:
< 23 := 11
>= 23 .. < 37 := 4
>= 37 .. < 43 := 2
>= 43 .. < 53 := 6
>= 53 .. < 67 := 9
>= 67 .. < 83 := 5
>= 83 := 13

RC SECOND EXAMPLE:
< 14 := 3
>= 14 .. < 18 := 0
>= 18 .. < 249 := 44
>= 249 .. < 312 := 10
>= 312 .. < 389 := 16
>= 389 .. < 392 := 2
>= 392 .. < 513 := 28
>= 513 .. < 591 := 16
>= 591 .. < 634 := 6
>= 634 .. < 720 := 16
>= 720 := 59
Phix
<lang Phix>function bin_it(sequence limits, data)
    -- Bin data according to (ascending) limits.
    sequence bins = repeat(0,length(limits)+1) -- adds under/over range bins too
    for i=1 to length(data) do
        integer bdx = binary_search(data[i],limits)
        bins[abs(bdx)+(bdx>0)] += 1
    end for
    return bins
end function

procedure bin_print(sequence limits, bins)
    printf(1," < %3d := %3d\n",{limits[1],bins[1]})
    for i=2 to length(limits) do
        printf(1,">= %3d and < %3d := %3d\n",{limits[i-1],limits[i],bins[i]})
    end for
    printf(1,">= %3d := %3d\n\n",{limits[$],bins[$]})
end procedure

sequence limits, data printf(1,"Example 1:\n") limits = {23, 37, 43, 53, 67, 83} data = {95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47,
16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55}
bin_print(limits, bin_it(limits, data))
printf(1,"Example 2:\n") limits = {14, 18, 249, 312, 389, 392, 513, 591, 634, 720} data = {445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933,
416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749}
bin_print(limits, bin_it(limits, data))</lang>
- Output:
Example 1:
< 23 := 11
>= 23 and < 37 := 4
>= 37 and < 43 := 2
>= 43 and < 53 := 6
>= 53 and < 67 := 9
>= 67 and < 83 := 5
>= 83 := 13

Example 2:
< 14 := 3
>= 14 and < 18 := 0
>= 18 and < 249 := 44
>= 249 and < 312 := 10
>= 312 and < 389 := 16
>= 389 and < 392 := 2
>= 392 and < 513 := 28
>= 513 and < 591 := 16
>= 591 and < 634 := 6
>= 634 and < 720 := 16
>= 720 := 59
Python
This example uses a binary search through the limits to assign each number to its bin, via bisect_right from the standard bisect module.
collections.Counter is not used: because the number of bins is known in advance, a preallocated list allows faster indexed access for incrementing bin counts than a dict lookup.
<lang python>from bisect import bisect_right

def bin_it(limits: list, data: list) -> list:
    "Bin data according to (ascending) limits."
    bins = [0] * (len(limits) + 1)      # adds under/over range bins too
    for d in data:
        bins[bisect_right(limits, d)] += 1
    return bins

def bin_print(limits: list, bins: list) -> None:
    print(f" < {limits[0]:3} := {bins[0]:3}")
    for lo, hi, count in zip(limits, limits[1:], bins[1:]):
        print(f">= {lo:3} .. < {hi:3} := {count:3}")
    print(f">= {limits[-1]:3} := {bins[-1]:3}")


if __name__ == "__main__":
    print("RC FIRST EXAMPLE\n")
    limits = [23, 37, 43, 53, 67, 83]
    data = [95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47,
            16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55]
    bins = bin_it(limits, data)
    bin_print(limits, bins)

    print("\nRC SECOND EXAMPLE\n")
    limits = [14, 18, 249, 312, 389, 392, 513, 591, 634, 720]
    data = [445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933,
            416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306,
            655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247,
            346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123,
            345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97,
            854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395,
            787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692,
            698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237,
            605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791,
            466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749]
    bins = bin_it(limits, data)
    bin_print(limits, bins)</lang>
- Output:
RC FIRST EXAMPLE

< 23 := 11
>= 23 .. < 37 := 4
>= 37 .. < 43 := 2
>= 43 .. < 53 := 6
>= 53 .. < 67 := 9
>= 67 .. < 83 := 5
>= 83 := 13

RC SECOND EXAMPLE

< 14 := 3
>= 14 .. < 18 := 0
>= 18 .. < 249 := 44
>= 249 .. < 312 := 10
>= 312 .. < 389 := 16
>= 389 .. < 392 := 2
>= 392 .. < 513 := 28
>= 513 .. < 591 := 16
>= 591 .. < 634 := 6
>= 634 .. < 720 := 16
>= 720 := 59
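To illustrate the remark about Counter in the Python notes above: a `collections.Counter` keyed by bin index gives the same counts, but bins that receive no items are simply absent from it and have to be filled in when the result is expanded. This is a sketch for comparison only, under the assumption that the limits are already ascending; `bin_with_list` and `bin_with_counter` are hypothetical names, not part of the solution above.

```python
from bisect import bisect_right
from collections import Counter


def bin_with_list(limits, data):
    """Preallocated list: one slot per bin, so empty bins are already 0."""
    bins = [0] * (len(limits) + 1)
    for d in data:
        bins[bisect_right(limits, d)] += 1
    return bins


def bin_with_counter(limits, data):
    """Counter keyed by bin index; missing keys read as 0 when expanded."""
    counts = Counter(bisect_right(limits, d) for d in data)
    return [counts[i] for i in range(len(limits) + 1)]


limits = [14, 18, 249]
data = [5, 9, 20, 248, 249, 300]
print(bin_with_list(limits, data))
print(bin_with_counter(limits, data))
```

Both functions return one count per bin in limit order; the list version avoids hashing the bin index on every update.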
Raku
<lang perl6> sub bin_it ( @limits, @data ) {
    my @ranges = ( -Inf, |@limits, Inf ).rotor( 2 => -1 ).map: { .[0] ..^ .[1] };
    my @binned = @data.classify(-> $d { @ranges.grep(-> $r { $d ~~ $r }) });
    my %counts = @binned.map: { .key => .value.elems };
    return @ranges.map: { $_ => ( %counts{$_} // 0 ) };
}
sub bin_format ( @bins ) {
    return @bins.map: { .key.gist.fmt('%9s => ') ~ .value.fmt('%2d') };
}
my @tests =
{ limits => (23, 37, 43, 53, 67, 83), data => (95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47,16,8,9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55), }, { limits => (14, 18, 249, 312, 389, 392, 513, 591, 634, 720), data => ( 445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933,416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247,346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97,854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692,698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791,466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749 ), },
;
for @tests -> ( :@limits, :@data ) {
    my @bins = bin_it( @limits, @data );
    .say for bin_format(@bins);
    say '';
}
</lang>
- Output:
-Inf..^23 => 11
23..^37 => 4
37..^43 => 2
43..^53 => 6
53..^67 => 9
67..^83 => 5
83..^Inf => 13

-Inf..^14 => 3
14..^18 => 0
18..^249 => 44
249..^312 => 10
312..^389 => 16
389..^392 => 2
392..^513 => 28
513..^591 => 16
591..^634 => 6
634..^720 => 16
720..^Inf => 59
REXX
<lang rexx>/*REXX program counts how many numbers of a set fall in the range of each bin. */
lims= 23 37 43 53 67 83 /* ◄■■■■■■1st set of bin limits & data.*/
data= 95 21 94 12 99 4 70 75 83 93 52 80 57 5 53 86 65 17 92 83 71 61 54 58 47 ,
      16 8 9 32 84 7 87 46 19 30 37 96 6 98 40 79 97 45 64 60 29 49 36 43 55
call lims lims; call bins data
call show 'the 1st set of bin counts for the specified data:'
say; say; say
lims= 14 18 249 312 389 392 513 591 634 720 /* ◄■■■■■■2nd set of bin limits & data.*/
data= 445 814 519 697 700 130 255 889 481 122 932 77 323 525 570 219 367 523 442 933 ,
      416 589 930 373 202 253 775 47 731 685 293 126 133 450 545 100 741 583 763 306 ,
      655 267 248 477 549 238 62 678 98 534 622 907 406 714 184 391 913 42 560 247 ,
      346 860 56 138 546 38 985 948 58 213 799 319 390 634 458 945 733 507 916 123 ,
      345 110 720 917 313 845 426 9 457 628 410 723 354 895 881 953 677 137 397 97 ,
      854 740 83 216 421 94 517 479 292 963 376 981 480 39 257 272 157 5 316 395 ,
      787 942 456 242 759 898 576 67 298 425 894 435 831 241 989 614 987 770 384 692 ,
      698 765 331 487 251 600 879 342 982 527 736 795 585 40 54 901 408 359 577 237 ,
      605 847 353 968 832 205 838 427 876 959 686 646 835 127 621 892 443 198 988 791 ,
      466 23 707 467 33 670 921 180 991 396 160 436 717 918 8 374 101 684 727 749
call lims lims; call bins data
call show 'the 2nd set of bin counts for the specified data:'
exit 0 /*stick a fork in it, we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
bins: parse arg nums; !.= 0; datum= words(nums); wc= length(datum) /*max width count.*/
        do j=1 for datum; x= word(nums, j)
           do k=0 for # /*find the bin that this number is in. */
           if x < @.k then do; !.k= !.k + 1; iterate j; end /*bump a bin count*/
           end /*k*/
        !.k= !.k + 1 /*number is ≥ the highest bin limit. */
        end /*j*/; return
/*──────────────────────────────────────────────────────────────────────────────────────*/
lims: parse arg limList; #= words(limList); wb= 0 /*max width binLim*/
        do j=1 for #; _= j - 1; @._= word(limList, j); wb= max(wb, length(@._) )
        end /*j*/; return
/*──────────────────────────────────────────────────────────────────────────────────────*/
show: parse arg t; say center(t, 51 ); $= left('', 9) /*$: for indentation*/
      say center('', 51, "═") /*show title separator.*/
      jp= # - 1; ge= '≥'; le='<'; eq= ' count='
        do j=0 for #; jm= j - 1; bin= right(@.j, wb)
        if j==0 then say $ left('', length(ge) +3+wb+length(..) )le bin eq right(!.j, wc)
                else say $ ge right(@.jm, wb) .. le bin eq right(!.j, wc)
        if j==jp then say $ ge right(@.jp,wb) left('', 3+length(..)+wb) eq right(!.#, wc)
        end /*j*/; return</lang>
- output when using the internal default input:
the 1st set of bin counts for the specified data:
═══════════════════════════════════════════════════
< 23 count= 11
≥ 23 .. < 37 count= 4
≥ 37 .. < 43 count= 2
≥ 43 .. < 53 count= 6
≥ 53 .. < 67 count= 9
≥ 67 .. < 83 count= 5
≥ 83 count= 13



the 2nd set of bin counts for the specified data:
═══════════════════════════════════════════════════
< 14 count= 3
≥ 14 .. < 18 count= 0
≥ 18 .. < 249 count= 44
≥ 249 .. < 312 count= 10
≥ 312 .. < 389 count= 16
≥ 389 .. < 392 count= 2
≥ 392 .. < 513 count= 28
≥ 513 .. < 591 count= 16
≥ 591 .. < 634 count= 6
≥ 634 .. < 720 count= 16
≥ 720 count= 59
Rust
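A simple, naive algorithm: for each limit it scans the whole data set, and it stores the binned values themselves in nested dynamic arrays (Vec<Vec<usize>>), so the work grows with the number of limits times the number of data items.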
<lang rust>fn make_bins(limits: &Vec<usize>, data: &Vec<usize>) -> Vec<Vec<usize>> {
    let mut bins: Vec<Vec<usize>> = Vec::with_capacity(limits.len() + 1);
    for _ in 0..=limits.len() {
        bins.push(Vec::new());
    }

    limits.iter().enumerate().for_each(|(idx, limit)| {
        data.iter().for_each(|elem| {
            if idx == 0 && elem < limit {
                bins[0].push(*elem); // smaller than the smallest limit
            } else if idx == limits.len() - 1 && elem >= limit {
                bins[limits.len()].push(*elem); // larger than the largest limit
            } else if elem < limit && elem >= &limits[idx - 1] {
                bins[idx].push(*elem); // otherwise
            }
        });
    });

    bins
}

fn print_bins(limits: &Vec<usize>, bins: &Vec<Vec<usize>>) {
    for (idx, bin) in bins.iter().enumerate() {
        if idx == 0 {
            println!(" < {:3} := {:3}", limits[idx], bin.len());
        } else if idx == limits.len() {
            println!(">= {:3} := {:3}", limits[idx - 1], bin.len());
        } else {
            println!(">= {:3} .. < {:3} := {:3}", limits[idx - 1], limits[idx], bin.len());
        }
    }
}

fn main() {
let limits1 = vec![23, 37, 43, 53, 67, 83]; let data1 = vec![95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47, 16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55];
let limits2 = vec![14, 18, 249, 312, 389, 392, 513, 591, 634, 720]; let data2 = vec![ 445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933, 416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749 ];
    // Why are we calling it RC anyways???
    println!("RC FIRST EXAMPLE");
    let bins1 = make_bins(&limits1, &data1);
    print_bins(&limits1, &bins1);

    println!("\nRC SECOND EXAMPLE");
    let bins2 = make_bins(&limits2, &data2);
    print_bins(&limits2, &bins2);
} </lang>
- Output:
RC FIRST EXAMPLE
< 23 := 11
>= 23 .. < 37 := 4
>= 37 .. < 43 := 2
>= 43 .. < 53 := 6
>= 53 .. < 67 := 9
>= 67 .. < 83 := 5
>= 83 := 13

RC SECOND EXAMPLE
< 14 := 3
>= 14 .. < 18 := 0
>= 18 .. < 249 := 44
>= 249 .. < 312 := 10
>= 312 .. < 389 := 16
>= 389 .. < 392 := 2
>= 392 .. < 513 := 28
>= 513 .. < 591 := 16
>= 591 .. < 634 := 6
>= 634 .. < 720 := 16
>= 720 := 59
Wren
<lang ecmascript>import "/sort" for Find
import "/fmt" for Fmt

var getBins = Fn.new { |limits, data|
    var n = limits.count
    var bins = List.filled(n+1, 0)
    for (d in data) {
        var res = Find.all(limits, d) // uses binary search
        var found = res[0]
        var index = res[2].from
        if (found) index = index + 1
        bins[index] = bins[index] + 1
    }
    return bins
}

var printBins = Fn.new { |limits, bins|
    for (i in 0..limits.count) {
        if (i == 0) {
            Fmt.print(" < $3d = $2d", limits[0], bins[0])
        } else if (i == limits.count) {
            Fmt.print(">= $3d = $2d", limits[i-1], bins[i])
        } else {
            Fmt.print(">= $3d and < $3d = $2d", limits[i-1], limits[i], bins[i])
        }
    }
    System.print()
}

var limitsList = [
[23, 37, 43, 53, 67, 83], [14, 18, 249, 312, 389, 392, 513, 591, 634, 720]
]
var dataList = [
[ 95,21,94,12,99,4,70,75,83,93,52,80,57, 5,53,86,65,17,92,83,71,61,54,58,47, 16, 8, 9,32,84,7,87,46,19,30,37,96, 6,98,40,79,97,45,64,60,29,49,36,43,55 ], [ 445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933, 416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749 ]
]
for (i in 0...limitsList.count) {
    System.print("Example %(i+1):\n")
    var bins = getBins.call(limitsList[i], dataList[i])
    printBins.call(limitsList[i], bins)
}</lang>
- Output:
Example 1:

< 23 = 11
>= 23 and < 37 = 4
>= 37 and < 43 = 2
>= 43 and < 53 = 6
>= 53 and < 67 = 9
>= 67 and < 83 = 5
>= 83 = 13

Example 2:

< 14 = 3
>= 14 and < 18 = 0
>= 18 and < 249 = 44
>= 249 and < 312 = 10
>= 312 and < 389 = 16
>= 389 and < 392 = 2
>= 392 and < 513 = 28
>= 513 and < 591 = 16
>= 591 and < 634 = 6
>= 634 and < 720 = 16
>= 720 = 59