Probabilistic choice: Difference between revisions
Content added Content deleted
Thundergnat (talk | contribs) m (syntax highlighting fixup automation) |
|||
Line 1,986: | Line 1,986: | ||
Zayin 0.0909091 0.0909630 |
Zayin 0.0909091 0.0909630 |
||
Chet 0.0634560 0.0632870 </pre> |
Chet 0.0634560 0.0632870 </pre> |
||
=={{header|jq}}== |
|||
In the task description, the probabilities are given as rationals, |
|||
so in this entry, we shall assume the probabilities are given as rationals |
|||
represented by JSON objects of the form {n: $numerator, d: denominator}. |
|||
For simplicity, it is further assumed that the largest denominator is not too large. |
|||
Neither the C nor the Go implmentations of jq provides a PRNG, |
|||
so in the following, /dev/urandom is used as a source of entropy; |
|||
an appropriate invocation of jq (or gojq) would thus be as follows: |
|||
<syntaxhighlight lang=bash> |
|||
#!/bin/bash |
|||
< /dev/urandom tr -cd '0-9' | fold -w 1 | jq -nr -f probabilistic-choice.jq |
|||
</syntaxhighlight> |
|||
<syntaxhighlight lang=jq> |
|||
# Output: a prn in range(0;$n) where $n is `.` |
|||
def prn: |
|||
if . == 1 then 0 |
|||
else . as $n |
|||
| ([1, (($n-1)|tostring|length)]|max) as $w |
|||
| [limit($w; inputs)] | join("") | tonumber |
|||
| if . < $n then . else ($n | prn) end |
|||
end; |
|||
# General Utility Functions |
|||
# bag of words |
|||
def bow(stream): |
|||
reduce stream as $word ({}; .[($word|tostring)] += 1); |
|||
# left pad with blank |
|||
def lpad($len): tostring | ($len - length) as $l | (" " * $l)[:$l] + .; |
|||
# right-pad with 0 |
|||
def rpad($len): tostring | ($len - length) as $l | .+ ("0" * $l)[:$l]; |
|||
# Input: a string of digits with up to one embedded "." |
|||
# Output: the corresponding string representation with exactly $n decimal digits but aligned at the period |
|||
def align_decimal($n): |
|||
tostring |
|||
| index(".") as $ix |
|||
| if $ix then capture("(?<i>[0-9]*[.])(?<j>[0-9]{0," + ($n|tostring) + "})") as {$i, $j} |
|||
| $i + ($j|rpad($n)) |
|||
else . + ".0" | align_decimal($n) |
|||
end ; |
|||
# Input: a string of digits with up to one embedded "." |
|||
# Output: the corresponding string representation with up to $n decimal digits but aligned at the period |
|||
def align_decimal($n): |
|||
tostring |
|||
| index(".") as $ix |
|||
| if $ix then capture("(?<i>[0-9]*[.])(?<j>[0-9]{0," + ($n|tostring) + "})") as {$i, $j} |
|||
| $i + ($j|rpad($n)) |
|||
else . + ".0" | align_decimal($n) |
|||
end ; |
|||
# least common multiple |
|||
# Define the helper function to take advantage of jq tail-recursion optimization |
|||
def lcm($m; $n): |
|||
def _lcm: |
|||
# state is [m, n, i] |
|||
if (.[2] % .[1]) == 0 then .[2] |
|||
else .[0:2] + [.[2] + $m] | _lcm |
|||
end; |
|||
[m, n, m] | _lcm; |
|||
def lcm(s): reduce s as $_ (1; lcm(.; $_)); |
|||
# rationals |
|||
def r($n; $d): {$n, $d}; |
|||
</syntaxhighlight> |
|||
'''The Task''' |
|||
<syntaxhighlight lang=jq> |
|||
# Given that $integers is an array of integers to be interpreted as |
|||
# relative probabilities, return a corresponding element chosen |
|||
# randomly from the input array. |
|||
def randomly($integers): |
|||
def accumulate: reduce .[1:][] as $i ([.[0]]; . + [$i + .[-1]]); |
|||
if ($integers | length) != length |
|||
then "randomly/1: the array lengths are unequal" | error |
|||
else . as $in |
|||
| $integers |
|||
| add as $sum |
|||
| accumulate as $p |
|||
| ($sum|prn + 1) as $random |
|||
| $in[first(range(0; $p|length) | select( $random <= $p[.] ))] |
|||
end ; |
|||
# Input should be a JSON object giving probabilities of each key as a rational: {n, d} |
|||
def choose($n): |
|||
lcm(.[].d) as $lcm |
|||
| ([.[] | $lcm * .n / .d]) as $p |
|||
| keys_unsorted as $items |
|||
| range(0; $n) |
|||
| $items | randomly($p); |
|||
# Print a table comparing expected, observed and the ratio |
|||
# (expected - observed)^2 / expected |
|||
def compare( $expected; $observed ): |
|||
def p($n): align_decimal($n) | lpad(8); |
|||
" : expected observed (e-o)^2 / e", |
|||
( $expected |
|||
| keys_unsorted[] as $k |
|||
| .[$k] as $e |
|||
| ($observed[$k] // 0) as $o |
|||
| "\($k|lpad(6)) : \($e|p(1)) \($o|floor|lpad(8)) \( (($e - $o) | (.*.) / $e) | p(2))" ); |
|||
# The specific task |
|||
def probabilities: |
|||
{ "aleph": r(1; 5), |
|||
"beth": r(1; 6), |
|||
"gimel": r(1; 7), |
|||
"daleth": r(1; 8), |
|||
"he": r(1; 9), |
|||
"waw": r(1; 10), |
|||
"zayin": r(1; 11), |
|||
"heth": r(1759; 27720) |
|||
}; |
|||
def task($n): |
|||
probabilities |
|||
| bow(choose($n)) as $observed |
|||
| compare( map_values($n * .n / .d); $observed ) ; |
|||
task(1E6) |
|||
' |
|||
</syntaxhighlight> |
|||
'''Output''' |
|||
<pre> |
|||
: expected observed (e-o)^2 / e |
|||
aleph : 200000.0 200247 0.30 |
|||
beth : 166666.6 166668 1.06 |
|||
gimel : 142857.1 142787 0.03 |
|||
daleth : 125000.0 124789 0.35 |
|||
he : 111111.1 111280 0.25 |
|||
waw : 100000.0 100165 0.27 |
|||
zayin : 90909.0 90716 0.41 |
|||
heth : 63455.9 63348 0.18 |
|||
</pre> |
|||
=={{header|Julia}}== |
=={{header|Julia}}== |