Talk:Statistics/Basic: Difference between revisions

Revision as of 17:11, 2 July 2011

Wrong emphasis in 'Extra'?

Once one example shows how to calculate them from keeping running sums of x and x-squared then they should all be able to copy. Better to just add reference to the other formulas so we can compare the language implementations.
I guess this is because RC is about showing off language capabilities and (trying to) rely less on the knowledge of individual contributors. --Paddy3118 12:53, 2 July 2011 (UTC)
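
For readers following the thread, here is a minimal sketch of the "running sums of x and x-squared" approach mentioned above. It is not code from the task page: the function name `running_stats` is hypothetical, and it assumes the population (divide-by-n) standard deviation is the one wanted.

```python
import math


def running_stats(values):
    """Return (mean, population stddev) using running sums of x and x*x.

    Hypothetical helper for illustration; assumes at least one value.
    """
    n = 0
    sum_x = 0.0   # running sum of x
    sum_x2 = 0.0  # running sum of x squared
    for x in values:
        n += 1
        sum_x += x
        sum_x2 += x * x
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum_x / n
    # Var(x) = E[x^2] - (E[x])^2; clamp at zero because rounding
    # can push the difference slightly negative.
    variance = max(sum_x2 / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)


mean, stddev = running_stats([2, 4, 4, 4, 5, 5, 7, 9])
print(mean, stddev)
```

Only three numbers are kept between values, so the data can be streamed without being stored.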

There isn't much of a formula to talk about, and the requirement is a real world need. This task isn't meant to be difficult, but more related to what actually happens in real data analysis. And why narrow down on "language capabilities"? What kind of programmer would be hurt by a little thinking about simple algorithms? --Ledrug 13:38, 2 July 2011 (UTC)

The idea is to aid language comparison rather than be yet another generic programmer challenge site. --Paddy3118 16:09, 2 July 2011 (UTC)

I agree. I don't think this is really the place for challenges. We don't have much of a framework for it anyway since the solutions are all on the task page. I think it's best to just name or describe an algorithm right from the start. A "better" algorithm could be used as extra credit (e.g. one that greatly reduces error or handles corner cases well). --Mwn3d 16:27, 2 July 2011 (UTC)

How about stating the possible patterns in the standard deviation you might find and adjusting the task to give languages a chance to show them? --Paddy3118 16:10, 2 July 2011 (UTC)

Which part of it looks like a challenge? You add up some numbers, then maybe divide by another number; it's not like there's tricky coding to be done. A large dataset is a real scenario and not hard to deal with, as long as you don't artificially complicate it. And as sample size increases, numbers such as mean and stddev become stable, which is almost the whole point of statistics: it's an easily noticeable trend, I'm not asking you to find the face of Jesus in the output numbers. As a programmer, none of these should be hard to understand, and I never said anything about greatly reducing errors: you can only avoid greatly increasing them, but that's a natural requirement for anyone doing numerical work. --Ledrug 17:02, 2 July 2011 (UTC)

Making it numerically stable, that's challenging. It's easy enough if you have a small number of values, all of about the same scale, but that's not always the case. –Donal Fellows 17:11, 2 July 2011 (UTC)
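
One commonly used way to address the stability concern raised here is Welford's online algorithm, which updates the mean and the sum of squared deviations incrementally instead of subtracting two large sums. The sketch below is an illustration only, not something proposed in the discussion: the name `welford_stats` is hypothetical, and it again assumes the population standard deviation.

```python
import math


def welford_stats(values):
    """Return (mean, population stddev) via Welford's online algorithm.

    Hypothetical helper for illustration; assumes at least one value.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # uses the already-updated mean
    if n == 0:
        raise ValueError("need at least one value")
    return mean, math.sqrt(m2 / n)


# Values with a large common offset and a small spread, the case where
# the sum-of-squares formula tends to lose precision.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(welford_stats(data))
```

Like the running-sums version, this needs only a constant amount of state per stream, so the extra stability does not require storing the data.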
