Talk:Sparkline in unicode: Difference between revisions
Revision as of 12:30, 24 February 2019
Most of these are buggy
The wrong way to compute the character index
Anything that uses the number 7 (bins-1 etc.) in the binning assignment has too-wide bin sizes. The two most common manifestations of the bug are:
- when the quotient is truncated (floor/ceil/int), the first or last bin will be one value wide.
- when the quotient is rounded, the widths of the first and last bin are too small by half.
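A minimal Python sketch of both manifestations, assuming the index is computed as int(scale * (v-min) / (max-min) + offset). The helper name bin_widths and its parameters are made up for illustration and are not taken from any of the task entries.

```python
from collections import Counter

def bin_widths(values, scale, offset=0.0):
    """Count how many values land on each bar index 0..7 when the index is
    int(scale * (v - lo) / (hi - lo) + offset). Hypothetical helper."""
    lo, hi = min(values), max(values)
    counts = Counter(int(scale * (v - lo) / (hi - lo) + offset) for v in values)
    return [counts[i] for i in range(8)]

values = range(1, 8001)

# Scaling by 7 and truncating: only the maximum value reaches the top bar.
truncated = bin_widths(values, scale=7)

# Scaling by 7 and rounding to nearest: the first and last bars each get
# roughly half as many values as the bars in between.
rounded = bin_widths(values, scale=7, offset=0.5)

print(truncated)
print(rounded)
```

With a continuous sequence as input, a correct binning would give eight equal counts, so either kind of lopsided list points at the bug.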
The right way to compute the character index
The Go code uses int( 8 * (v-min) / (max-min) ), which works in all cases except when v==max; it deals with that case by clamping values larger than 7 to 7 (for a zero-based array).
The Tcl code gets honorable mention for using int( 8 * (v-min) / ((max-min)*1.01) ), which mostly does the same thing as the Go code. It avoids the need for clamping but gives bins that are 1% too wide, which becomes visible when the range is large. This approach works if the multiplier is larger than 1, smaller than 1 + 1/(max-min), and large enough to not get overwhelmed by floating-point imprecision.
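The scale-by-8-then-clamp approach described for the Go code could look roughly like this in Python. This is a sketch, not a copy of the Go entry: the names bar_index and sparkline are invented here, and returning the lowest bar for flat data (max == min) is an assumption the discussion above does not cover.

```python
BARS = "▁▂▃▄▅▆▇█"

def bar_index(v, lo, hi):
    """Zero-based bar index: scale by 8, truncate, clamp v == hi to 7."""
    if hi == lo:
        return 0  # assumption: flat data is drawn with the lowest bar
    return min(7, int(8 * (v - lo) / (hi - lo)))

def sparkline(values):
    """Render a sequence of numbers as a string of block characters."""
    lo, hi = min(values), max(values)
    return "".join(BARS[bar_index(v, lo, hi)] for v in values)

print(sparkline([0, 1, 19, 20]))
print(sparkline([0, 999, 4000, 4999, 7000, 7999]))
```

Without the min(7, ...) clamp, the maximum value would produce index 8 and run off the end of the eight-character string.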
Test cases that detect bugs
- 0 1 19 20 detects the one-wide bug. Output should be the same as 0 0 1 1 with exactly two heights. The bug looks like ▁▂██ or ▁▁▇█
- 0 999 4000 4999 7000 7999 detects the half-width bug and some smaller errors (see Tcl). Output should have three heights; the half-width bug looks like: ▁▂▅▅▇█
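The two test inputs can be checked mechanically by counting distinct bar indexes. The sketch below compares a clamped scale-by-8 index with two scale-by-7 variants. All function names are hypothetical, and the buggy variants are generic stand-ins rather than code from any specific entry.

```python
def clamped_8(v, lo, hi):
    # Scale by 8, truncate, clamp the maximum into the top bin.
    return min(7, int(8 * (v - lo) / (hi - lo)))

def truncated_7(v, lo, hi):
    # Buggy stand-in: scale by 7 and truncate.
    return int(7 * (v - lo) / (hi - lo))

def rounded_7(v, lo, hi):
    # Buggy stand-in: scale by 7 and round to nearest.
    return int(7 * (v - lo) / (hi - lo) + 0.5)

def distinct_heights(index_fn, values):
    """Number of different bar indexes that index_fn produces for values."""
    lo, hi = min(values), max(values)
    return len({index_fn(v, lo, hi) for v in values})

ONE_WIDE = [0, 1, 19, 20]                      # should give two heights
HALF_WIDTH = [0, 999, 4000, 4999, 7000, 7999]  # should give three heights

for fn in (clamped_8, truncated_7, rounded_7):
    print(fn.__name__,
          distinct_heights(fn, ONE_WIDE),
          distinct_heights(fn, HALF_WIDTH))
```

In this sketch the rounded variant still passes the first input, so both inputs are needed to catch both bugs.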
sparktest.pl
This is some Perl code that will report the widths of same-height sections of output, when provided with a sparkline on standard input. Non-sparkline-lines are ignored. The line produced from a continuous integer sequence should produce eight equal widths (or almost equal if the sequence length is not a multiple of eight).
perl -CS -Mutf8 -nle 'y/▁-█//cd; @x=grep $i^=1, map length, /((.)\2*)/g and print"@x"'
Sample usage (in bash, and assuming program accepts space-separated data on standard input):
echo {1..8000} | sparkline | sparktest.pl
Expected output from the sample is 1000 1000 1000 1000 1000 1000 1000 1000.
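For anyone who would rather not decode the Perl one-liner, here is a rough Python equivalent of the same idea: drop everything that is not one of the eight bars, then report the length of each run of identical bars. The function name run_widths is made up, and this sketch works on a string instead of reading standard input.

```python
from itertools import groupby

BARS = "▁▂▃▄▅▆▇█"  # U+2581 through U+2588, the same range as ▁-█ in the Perl

def run_widths(line):
    """Lengths of consecutive runs of identical bar characters in line.

    Characters that are not sparkline bars are ignored, so a line without
    any bars yields an empty list.
    """
    bars = [ch for ch in line if ch in BARS]
    return [len(list(group)) for _, group in groupby(bars)]

print(run_widths("▁▁▂▂▂█"))
print(run_widths("this line has no bars"))
```

Feeding it the sparkline for a continuous integer sequence should give eight equal (or nearly equal) widths, as described above.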
Not Buggy
- Go. Tested up to echo {1..12345} | go run sl.go | sparktest.pl
Buggy
- C: ▁▂██
- C++: ▁▁▇█
- Clojure: ▁▂▅▅▇█
- Common Lisp: ▁▂▅▅▇█
- D: obvious one-wide bug; didn't run the code
- Elixir: ▁▂▅▅▇█
- Groovy: one-wide; didn't run
- Haskell: looks like half-width bug; didn't run
- Java: one-wide; didn't run
- Javascript: ▁▂▅▅▇█
- jq: one-wide and neglects to check bounds: ▁▃▷►
- Nim: Python translation
- Perl: ▁▁▇█
- Perl 6: ▁▁▇█
- PicoLisp: ▁▂▅▅▇█
- Python: ▁▁▇█
- Ruby: ▁▁▇█
- Rust: thread 'main' panicked at 'attempt to subtract with overflow', sl.rust:8:40
- Tcl: ▁▁▄▅▇█; not a half-width bug (the second character is correct); manifests only on large ranges; see comments above.
... that's 15 tested, 14 failures, plus 5 didn't-runs that almost certainly have the bug.
Bar choices
Hi Tim. There is a problem with your choices of bars in that they have a ragged bottom line:
- ▁▂▃▄▅▆▇█
There is a problem with my choice of bars in that the highest bar is not full width:
- ▁▂▃▅▆▇▉▇▆▅▃▂▁
I find the ragged baseline to be much more irritating. How to resolve? --Paddy3118 (talk) 03:18, 18 June 2013 (UTC)
Oh, my font is Courier. --Paddy3118 (talk) 03:42, 18 June 2013 (UTC)
- I find that there's quite wide differences in the quality of fonts when it comes to blocks and box elements; a lot of fonts simply don't have the things that should extend to the limits of the glyph box they declare actually doing so at all. In my limited experimenting, Courier New is considerably better than the others I've tried (Andale Mono, Consolas, Courier, Monaco) for this sort of thing. Not much we can do about that really (except “blame the font makers”, which isn't very helpful). –Donal Fellows (talk) 11:24, 20 June 2013 (UTC)
I now find that there is raggedness in the baseline of my bar choice if I swap to Consolas font. I think I'll revert to using Tim's seven bars and search for a font as the Unicode page has nothing to say on this, just:
@@ 2580 Block Elements 259F
@ Block elements
2580 UPPER HALF BLOCK
2581 LOWER ONE EIGHTH BLOCK
2582 LOWER ONE QUARTER BLOCK
2583 LOWER THREE EIGHTHS BLOCK
2584 LOWER HALF BLOCK
2585 LOWER FIVE EIGHTHS BLOCK
2586 LOWER THREE QUARTERS BLOCK
2587 LOWER SEVEN EIGHTHS BLOCK
--Paddy3118 (talk) 03:56, 18 June 2013 (UTC)
- The baseline is fine in my terminal font, and the baseline problem only manifests in the browser. In any case, if the font is problematic, that's the font's problem, not our problem. Notionally the blocks should have the same baseline, and I'd much rather have a solution that will be correct after they fix the fonts. (Or fix the font aliasing algorithm, which may be what's really going on here.) --TimToady (talk) 07:57, 18 June 2013 (UTC)
- Yes, it's the font dealiasing that is doing it. Changing the page's font size up and down moves the fuzz from the bottom to the top, and to different characters. So trying to pick the "right" characters is an exercise in futility, because what's right for you will be wrong for someone else. So just use the eight characters that are supposed to be right, and ignore the baseline issue. --TimToady (talk) 08:02, 18 June 2013 (UTC)
Python query
In the (original) Python entry, obviously some kind of to be or not to be unicode thing, can someone explain the try/except on bar, ta? --Pete Lomax (talk)
- It allows the code to work in both Python 2 and Python 3. --Paddy3118 (talk) 10:45, 11 January 2019 (UTC)
- Sorry, I didn't mean the try/except on raw_input, but the one on bar (try: bar = u'▁▂▃▄▅▆▇█' except: bar = '▁▂▃▄▅▆▇█'). Following that link, I am certainly closer to understanding, but still slightly adrift. Is it something to do with u'xx' being invalid syntax in 3.0 .. 3.2 but accepted/ignored in 3.3+? --Pete Lomax (talk) 17:33, 11 January 2019 (UTC)
- Petelomax: It's true that originally Python 3.x didn't accept the u'...' syntax because normal '...' strings are already Unicode. More recent versions accept the syntax, but the u has no effect. So that might explain the try: there, except that a try/except doesn't do any good for syntax errors. So I'm as puzzled as you are. --Markjreed (talk) 19:05, 11 January 2019 (UTC)
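The point about syntax errors can be illustrated with compile(). The sketch below assumes a current Python 3 interpreter, where u'...' is accepted, so a deliberately malformed statement stands in for what u'...' would have been on 3.0 to 3.2. The helper name compiles is made up.

```python
# A module whose try body contains a syntax error. The error is raised while
# the whole module is being compiled, before the try statement can execute,
# so the except clause inside it never gets a chance to run.
source = (
    "try:\n"
    "    bar = 'abc' +\n"        # deliberately invalid: dangling operator
    "except SyntaxError:\n"
    "    bar = 'fallback'\n"
)

def compiles(src):
    """Return True if src compiles as a module, False on SyntaxError."""
    try:
        compile(src, "<demo>", "exec")
        return True
    except SyntaxError:
        return False

print(compiles(source))                 # the inner try/except does not help
print(compiles("bar = u'▁▂▃▄▅▆▇█'\n"))  # accepted by Python 3.3 and later
```

Catching the error only works from outside the failing source, as compiles does here, which is why the in-file try/except around the bar assignment would not help on an interpreter that rejects the u prefix.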