Statistics/Basic

From Rosetta Code
Revision as of 11:29, 2 July 2011 by rosettacode>Ledrug (init task)
Statistics/Basic is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

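Statistics is all about large groups of numbers. When talking about a set of sampled data, the most frequently used quantities are its mean value and standard deviation (stddev). If you have a set of data $x_i$ where $i = 1, 2, \ldots, n$, the mean is $\bar{x} \equiv \frac{1}{n}\sum_i x_i$, while the stddev is $\sigma \equiv \sqrt{\frac{1}{n}\sum_i \left(x_i - \bar{x}\right)^2}$.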
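As an illustration, here is a minimal Python sketch of the two quantities. It assumes the population form of the standard deviation (dividing by n rather than n - 1); the function name `mean_stddev` and the sample values are made up for this example.

```python
import math

def mean_stddev(data):
    """Return (mean, population stddev) of a non-empty sequence of numbers."""
    n = len(data)
    mean = sum(data) / n
    # Population form: average squared deviation from the mean, divided by n.
    variance = sum((x - mean) ** 2 for x in data) / n
    return mean, math.sqrt(variance)

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values only
m, s = mean_stddev(sample)
print(m, s)  # 5.0 2.0
```

The sample sums to 40, so the mean is 5; the squared deviations sum to 32, giving a variance of 4 and a stddev of 2.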

When examining a large quantity of data, one often uses a histogram, which shows the counts of data samples falling into a prechosen set of intervals (or bins). When plotted, often as a bar graph, it visually indicates how often values in each interval occur.
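The binning step can be sketched in Python as follows. This is one possible approach, assuming equal-width bins over a known range; the names `histogram` and `render`, the `#` bar character and the default of 10 bins are choices made for this example, not part of the task.

```python
def histogram(data, bins=10, lo=0.0, hi=1.0):
    """Count how many values fall into each of `bins` equal-width intervals on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in data:
        if lo <= x <= hi:
            # Clamp so that x == hi lands in the last bin instead of past the end.
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
    return counts

def render(counts, lo=0.0, hi=1.0, max_width=50):
    """Format counts as text bars, scaled so the tallest bar is max_width characters."""
    width = (hi - lo) / len(counts)
    peak = max(counts) or 1   # avoid dividing by zero when every bin is empty
    lines = []
    for i, c in enumerate(counts):
        bar = "#" * round(c * max_width / peak)
        lines.append(f"{lo + i * width:.2f}: {bar} {c}")
    return "\n".join(lines)

data = [0.05, 0.15, 0.17, 0.5, 0.55, 0.95, 1.0]   # illustrative values only
print(render(histogram(data, bins=5)))
```

Values outside [lo, hi] are silently skipped here; a real solution might prefer to count them separately.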

Task

Using your language's random number routine, generate real numbers in the range [0, 1]. It doesn't matter whether you choose an open or a closed range. Create 100 such numbers (i.e. sample size 100) and calculate their mean and stddev. Do the same for sample sizes of 1,000 and 10,000, and maybe even higher if you feel like it. Show a histogram of any of these sets. Do you notice any pattern in the standard deviation?
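A hedged sketch of the sampling experiment in Python, using the standard library's `statistics.fmean` and `statistics.pstdev` (population stddev). The helper name `summarize`, the seed 42 and the extra sample size of 100,000 are arbitrary choices for this example.

```python
import random
import statistics

def summarize(n, rng):
    """Draw n uniform samples from [0, 1) and return (mean, population stddev)."""
    data = [rng.random() for _ in range(n)]
    return statistics.fmean(data), statistics.pstdev(data)

rng = random.Random(42)   # fixed seed only so that runs are repeatable
for n in (100, 1_000, 10_000, 100_000):
    mean, sd = summarize(n, rng)
    print(f"n={n:>7}  mean={mean:.4f}  stddev={sd:.4f}")
```

For a uniform distribution on [0, 1] the theoretical mean is 1/2 and the theoretical stddev is 1/sqrt(12), about 0.2887, so the printed values should settle near those numbers as n grows.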

Extra

Sometimes so much data needs to be processed that it is impossible to keep all of it in memory at once. Can you calculate the mean, stddev and histogram of a trillion numbers?
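One way to approach this is to update the statistics one value at a time so that no sample is ever stored. The Python sketch below uses Welford's online algorithm for the mean and variance, which is a technique chosen for this example rather than something the task prescribes; the class name `RunningStats` and the sample count of 100,000 are likewise illustrative.

```python
import math
import random

class RunningStats:
    """Accumulate count, mean, stddev and a histogram one value at a time."""

    def __init__(self, bins=10):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0              # running sum of squared deviations from the mean
        self.counts = [0] * bins   # equal-width bins over [0, 1]

    def add(self, x):
        # Welford's update: avoids the cancellation that sum / sum-of-squares can suffer.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        # Assumes 0 <= x <= 1; x == 1.0 is clamped into the last bin.
        idx = min(int(x * len(self.counts)), len(self.counts) - 1)
        self.counts[idx] += 1

    def stddev(self):
        """Population stddev of the values seen so far (0.0 if none)."""
        return math.sqrt(self.m2 / self.n) if self.n else 0.0

rng = random.Random(0)
stats = RunningStats()
for _ in range(100_000):   # memory use stays constant however large this count is
    stats.add(rng.random())
print(stats.n, round(stats.mean, 4), round(stats.stddev(), 4))
print(stats.counts)
```

Memory use depends only on the number of bins, so the same loop would work for a trillion values given enough running time. In pure Python that would take very long, so a compiled language or a vectorized, chunked approach may be more practical.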