WP Confusing

After trying to implement something, I thought the WP article to be less than helpful. How about basing the task on this? --Paddy3118 07:32, 14 December 2009 (UTC)

Agreed that the WP article is confusing (perhaps one of us ought to improve it!) but I think I've extracted the key part to make the Tcl solution. It even handles signed fractional data, should that be presented. I think it is easier to convert the data to stems and leaves and store in a map (from stems to a sequence of leaves) before sorting any of it. Or at least that was what made sense to me. –Donal Fellows 09:37, 14 December 2009 (UTC)
Short question aside: How should missing stems be handled? One example on your page lists stems without leaves in between the stems that have leaves. For an overview of the data distribution this is certainly of value, but it can make the plot needlessly long. Maybe it should be clarified whether stems without leaves should be represented in the output or not. —Johannes Rössel 14:12, 14 December 2009 (UTC)
As WP says: "It is important that each stem is listed only once and that no numbers are skipped, even if it means that some stems have no leaves." This ensures that the vertical direction is always a linear scale. This is not just a table: it is an information graphic. Distances matter. —Kevin Reid 14:42, 14 December 2009 (UTC)
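To make the two points above concrete (collect leaves in a map keyed by stem before sorting, and list every stem even when it has no leaves), here is a minimal Python sketch. It is not the Tcl solution mentioned above; the function name `stem_and_leaf`, the `leaf_unit` parameter and the sample data are made up for illustration, and it assumes non-negative integers only.

```python
from collections import defaultdict

def stem_and_leaf(data, leaf_unit=10):
    """Return plot lines for non-negative integers, keeping empty stems."""
    # Map each stem to its (still unsorted) list of leaves.
    leaves_by_stem = defaultdict(list)
    for value in data:
        stem, leaf = divmod(value, leaf_unit)
        leaves_by_stem[stem].append(leaf)

    lo, hi = min(leaves_by_stem), max(leaves_by_stem)
    width = len(str(hi))
    lines = []
    # Walk every stem from lowest to highest so the vertical axis stays linear.
    for stem in range(lo, hi + 1):
        leaves = " ".join(str(leaf) for leaf in sorted(leaves_by_stem.get(stem, [])))
        lines.append(f"{stem:>{width}} | {leaves}".rstrip())
    return lines

# Hypothetical sample data, chosen so that stems 2 and 4 are empty.
for line in stem_and_leaf([12, 15, 31, 37, 38, 50]):
    print(line)
```

Stems 2 and 4 still get a row of their own in the output, which is what keeps the distances meaningful.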

Less Useful for negative numbers?

For the data 15 14 3 2 1 0 -1 -2 -3 -14 -15, would you generate either graph 1:

 -2 | 5 6
 -1 | 7 8 9
  0 | 0 1 2 3
  1 | 4 5

Which satisfies X | Y where 10*X + Y is a datum, but the digit for Y when the datum<0 is not necessarily the right-most digit of the datum.

Or graph 2:

 -1 | -4 -5
  0 | -3 -2 -1  0  1  2  3
  1 |  4  5

Which also satisfies X | Y where 10*X + Y is a datum, the last digit is preserved, but I don't like the negative leaf numbers.
What thinketh thou? --Paddy3118 13:38, 16 December 2009 (UTC)
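For anyone wanting to try both conventions, the two graphs appear to correspond to the two usual ways of dividing a negative integer by 10: rounding the stem down (floor) or rounding it toward zero (truncation). A rough Python sketch follows; the function names `floor_split` and `trunc_split` are invented here, and it assumes integer data with a leaf unit of 1.

```python
def floor_split(value):
    """Graph 1 style: stem is rounded down, so the leaf is always 0..9."""
    # Python's divmod floors the quotient, e.g. divmod(-15, 10) == (-2, 5).
    return divmod(value, 10)

def trunc_split(value):
    """Graph 2 style: stem is rounded toward zero, so the leaf keeps the sign."""
    stem = -(-value // 10) if value < 0 else value // 10
    return stem, value - 10 * stem

data = [15, 14, 3, 2, 1, 0, -1, -2, -3, -14, -15]
for value in data:
    print(value, floor_split(value), trunc_split(value))
```

In both cases 10*X + Y gives back the datum; the difference is only in whether the leaf of a negative datum is its last digit with a minus sign or the complement of that digit.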

Per Wikipedia, you generate a -0 stem, not a -1 stem, before the 0 stem, and use the actual digits -- that is, for your data

-1 | 4 5
-0 | 1 2 3
 0 | 0 1 2 3
 1 | 4 5

However, in general the choice of what goes in the stems and the leaves is a choice for whatever best illustrates the particular data set. In this task, there are no negative numbers.

Also, your second example is problematic because the 0 stem contains a wider span (19 values, -9..9) than the 10 of every other stem, so it distorts the data. On the other hand, the -0 strategy means that the -0 stem has a range of -9..-1 with only 9 elements whereas everything else has 10. I like your first example for uniformity but it seems confusing to read which sort of defeats the point. —Kevin Reid 13:50, 16 December 2009 (UTC)
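The -0 stem idea can be sketched by splitting on the absolute value and keeping the sign in the stem label, which then has to be a string because an integer cannot represent -0. This is only an illustration under that assumption; `signed_split` is a made-up name, and the count at the end simply checks how many integers each stem can hold over -19..19.

```python
from collections import Counter

def signed_split(value):
    """Return (stem label, leaf) using the actual digits and a signed stem."""
    stem, leaf = divmod(abs(value), 10)
    sign = "-" if value < 0 else ""
    # The label is text so that "-0" and "0" remain distinct stems.
    return f"{sign}{stem}", leaf

print(signed_split(-3))   # stem "-0", leaf 3
print(signed_split(-14))  # stem "-1", leaf 4

# How many distinct integers fall under each stem label?
span = Counter(signed_split(v)[0] for v in range(-19, 20))
print(dict(span))
```

The count shows the unevenness described above: the "-0" stem covers 9 integers (-9..-1) while each of the others covers 10.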
