Talk:Sparkline in unicode


* <code>0 999 4000 4999 7000 7999</code> detects the half-width bug and some smaller errors (see Tcl). Output should have three heights; the half-width bug looks like: ▁▂▅▅▇█

: '''Addendum:''' ''the second test case assumes that each of the 8 heights should represent 1/8<sup>th</sup> of the range, as closely as possible. Not everyone agrees. See [[#counterpoint]] and [[#Deeper_root_of_the_.27bug.27_.3F|Deeper root of the bug?]] below for discussion.''

:: A very helpful intervention and discussion, and I agree absolutely about the first test example.

::

:: Perhaps our interpretation of the '''second''' test example depends on some unstated assumptions about the optimal width (and alignment) of the bins?

:: The Haskell '''Statistics.Sample.Histogram''' library, for example, returns the following allocation of the sample <code>0 999 4000 4999 7000 7999</code> to 8 evenly sized bins:

:: <code>[1,1,0,0,2,0,1,1]</code>

:: which would, I think, correspond to 5 different sparkline heights, unless I am confusing myself.

:: The set of lower bounds suggested by '''Statistics.Sample.Histogram''' for a division of this sample between 8 bins is:

:: <code>[-571.3571428571429,571.3571428571429,1714.0714285714287,2856.7857142857147,3999.5,5142.214285714286,6284.928571428572,7427.642857142857]</code>

:: The assumption they are making is that any given sample is likely to be drawn from a slightly larger range of possible sample values, and that some margin can usefully be allowed.

:: The margin which that library adopts is <code>margin = (hi - lo) / (fromIntegral (intBins - 1) * 2)</code>

:: (yielding fractionally larger bins and a total range that starts a little below the minimum observed value, and ends a little above the maximum observed value)

:: Arguably it would be reasonable for us to do something comparable? [[User:Hout|Hout]] ([[User talk:Hout|talk]]) 12:26, 26 February 2019 (UTC)
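
The margin rule quoted above can be checked with a short Python sketch. This is a re-implementation for illustration, not the Haskell library's code; the function name <code>margin_bins</code> and the clamping of the top value into the last bin are assumptions made here.

```python
def margin_bins(values, n_bins=8):
    """Widen [lo, hi] by the quoted margin, then split it into n_bins equal bins.

    Returns (lower_bounds, counts). Hypothetical helper, not a library API.
    """
    lo, hi = min(values), max(values)
    # margin = (hi - lo) / ((n_bins - 1) * 2), as quoted in the comment above
    margin = (hi - lo) / ((n_bins - 1) * 2) if n_bins > 1 else 0.0
    lo2, hi2 = lo - margin, hi + margin
    width = (hi2 - lo2) / n_bins
    lower_bounds = [lo2 + i * width for i in range(n_bins)]
    counts = [0] * n_bins
    for v in values:
        if width == 0:
            i = 0  # all values equal: everything lands in the first bin
        else:
            i = min(n_bins - 1, int((v - lo2) / width))
        counts[i] += 1
    return lower_bounds, counts


bounds, counts = margin_bins([0, 999, 4000, 4999, 7000, 7999])
print(bounds)
print(counts)
```

For the sample this yields the counts <code>[1, 1, 0, 0, 2, 0, 1, 1]</code>, and lower bounds that start at about -571.357 and pass through 3999.5, in line with the figures quoted above.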

::: PS: the dependence of edge cases on adjustable assumptions (e.g. the relationship between the range of the sample and the range of possible or graphed values) may be underscored by the result given by the '''Mathematica 11 Histogram function'''. If we specify only a target number of bins, it allocates the same sample as follows (a different pattern again, but still, I think, 5 sparkline levels):

:::: <code>Histogram[{0, 999, 4000, 4999, 7000, 7999}, {"Raw", 8}] --> </code>

:::: [2, 0, 0, 1, 1, 0, 1, 1]

::::

:::: And similarly the '''R language hist() function''' expression <code>hist(c(0, 999, 4000, 4999, 7000, 7999), breaks=8)</code>

:::: returns the distribution [2, 0, 0, 1, 1, 0, 1, 1], again using 5 (rather than 3) of the 8 available bins.

:::: The breaks which it derives from that data set can be listed:

:::: <code> > histinfo<-hist(c(0, 999, 4000, 4999, 7000, 7999), breaks=8)</code>

:::: <code> > histinfo</code>

:::: <code>$breaks</code>

:::: <code>[1] 0 1000 2000 3000 4000 5000 6000 7000 8000</code>

::::[[User:Hout|Hout]] ([[User talk:Hout|talk]]) 13:33, 26 February 2019 (UTC)
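
The R allocation above follows from the listed breaks if each bin is closed on the right and the lowest break is included in the first bin, which is how <code>hist()</code> behaves by default (<code>right = TRUE</code>, <code>include.lowest = TRUE</code>). A minimal Python sketch under that reading; the function name <code>right_closed_counts</code> is hypothetical.

```python
def right_closed_counts(values, breaks):
    """Count values into bins (breaks[i], breaks[i+1]], with the lowest break
    also included in the first bin. Values outside the breaks are ignored."""
    counts = [0] * (len(breaks) - 1)
    for v in values:
        for i in range(len(breaks) - 1):
            left, right = breaks[i], breaks[i + 1]
            if left < v <= right or (i == 0 and v == left):
                counts[i] += 1
                break
    return counts


breaks = list(range(0, 9000, 1000))  # 0, 1000, ..., 8000 as listed above
print(right_closed_counts([0, 999, 4000, 4999, 7000, 7999], breaks))
```

This prints <code>[2, 0, 0, 1, 1, 0, 1, 1]</code>: 4000 and 7000 sit exactly on a break and fall into the bin to their left, which is why the pattern differs from the Haskell one while still occupying 5 bins.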

::::: "fractionally larger bins" is the Tcl approach I discussed in the section above. It's fine but requires careful selection of the denominator. If the resulting margin is too big, the bins are wider than they need to be (Tcl's mistake); if it is too small, it can be erased by floating-point errors.

::::: edit: the relationship between the value of <code>breaks</code> and the number of bins in R is completely opaque and does not match the documentation. For example, <code>hist(0:9, breaks=x)</code> gives 2 bins for x=3; 5 bins for x=4,5,6; 9 bins for x=7.

::::: edit2: I should clarify that Haskell's solution exhibits the half-width bug. I don't believe this is defensible. Much better choices of denominator are available. --Oopsiedaisy, 26 February 2019

;sparktest.pl


::Thanks Oopsiedaisy. I started the task off with an initial buggy Python solution. Now fixed and with examples extended to show your problem cases. Thanks again. --[[User:Paddy3118|Paddy3118]] ([[User talk:Paddy3118|talk]]) 19:35, 24 February 2019 (UTC)

====Counterpoint====
