Statistics/Basic: Difference between revisions
→{{header|Go}}: a little more on the extra credit
*********************
</pre>
The usual approach to the extra problem is [http://en.wikipedia.org/wiki/Sampling_%28statistics%29 sampling]. That is, not to do it.
For the extra credit, here is an outline of a map-reduce strategy. The main task indicated that numbers should be generated before doing any computations on them. Consistent with that, the function getSegment returns data based on a starting and ending index, as if it were accessing some large data store.
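As a minimal sketch of the combining arithmetic behind that strategy: each segment is mapped to a partial result (count, sum, sum of squares), and partials are merged by a reduce step, from which mean and standard deviation follow. The names here (stats, mapSegment, reduce, run) are illustrative assumptions, not taken from the full program, and getSegment is simulated with uniform random data:

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// stats holds the partial results for one segment: enough
// to reconstruct mean and standard deviation when merged.
type stats struct {
	n          int
	sum, sumSq float64
}

// getSegment simulates access to a large data store, returning
// uniform [0,1) values for indices [start, end).
func getSegment(start, end int) []float64 {
	s := make([]float64, end-start)
	for i := range s {
		s[i] = rand.Float64()
	}
	return s
}

// mapSegment computes the partial sums over one segment.
func mapSegment(start, end int) stats {
	var st stats
	for _, x := range getSegment(start, end) {
		st.n++
		st.sum += x
		st.sumSq += x * x
	}
	return st
}

// reduce merges two partial results; it is associative, so
// segments can be combined in any order, or in parallel.
func reduce(a, b stats) stats {
	return stats{a.n + b.n, a.sum + b.sum, a.sumSq + b.sumSq}
}

// run processes the whole (simulated) data set segment by segment.
func run() (mean, sd float64) {
	const total, segSize = 1e6, 1e4
	var acc stats
	for start := 0; start < total; start += segSize {
		acc = reduce(acc, mapSegment(start, start+segSize))
	}
	mean = acc.sum / float64(acc.n)
	sd = math.Sqrt(acc.sumSq/float64(acc.n) - mean*mean)
	return
}

func main() {
	mean, sd := run()
	fmt.Printf("mean: %.4f  stddev: %.4f\n", mean, sd)
}
```

Because reduce is associative, the same combining step works whether segments are processed serially, across goroutines, or across machines.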
The following runs comfortably on a simulated data size of 10 million. To scale to a trillion, and to use real data, you would want to use a technique like [[Distributed_programming#Go]] to distribute work across multiple computers, and on each computer, use a technique like [[Parallel_calculations#Go]] to distribute work across multiple cores within each computer. You would tune parameters like the constant <tt>threshold</tt> in the code below to optimize cache performance.
<lang go>package main