Revision as of 21:20, 9 October 2020 (view source) rosettacode>Gerard Schildberger m (→‎C example problem: added a bemused comment.) ← Older edit		Revision as of 21:39, 10 October 2020 (view source) Rdm (talk \| contribs) m (→‎Details of LZW - second draft) Newer edit →
Line 3: The task currently does not actually specify the LZW algorithm to be implemented. Instead, it defers to whatever is currently present in the Wikipedia presentation of the algorithm. This has left some compatibility details unspecified. (As near as I can tell, none of the implementations here include support for these issues.) But the task page is also back-linked from wikipedia, and someone should take responsibility for mentioning these things: (1) One consequence is that the implementations do not include the LZW bitpacking algorithm. Instead, they represent the numbers which would be fed to that algorithm. The bitpacking algorithm shifts indices into position to take advantage of all guaranteed zeros in the representation of the previous index. (~~This~~GIF ~~probably~~uses ~~suggests~~little-endian arepresentation ~~little~~-- low order bits first, TIFF and PDF use big-endian representation of-- high order ~~the~~bits ~~indices~~first.) (2) Another consequence is that the implementations tend to not include the LZW clear code. (in the context of the compress file format, this is referred to as LZC -- basically all versions of compress skipped past code 0x100 and the documentation on that file format used this starting point for the dictionary as the distinction between LZW and LZC -- image file format specifications do the same but still call the format LZW). LZC may limit the size of the dictionary (12 bit indices in ~~the~~image ~~GIF~~file ~~standard~~formats, ~~apparently~~typically 3216 bit indices in the ~~LZW~~compress file format, ~~though~~(which Iused ~~haven't~~a ~~investigated~~file ~~that~~name ~~closely~~extension ~~enough~~of ~~to be sure~~".Z"). When the dictionary size is reached, the encoder emits the clear code (0x100 for the usual ANSI LZW dictionary) and reverts the dictionary to its original state. (And, of course, the decoder must do the same at that point.) Though, if the dictionary size is known it's also possible to simply clear the dictionary when it gets full (and some versions of compress apparently did work this way). Note also that this was a non-issue when file sizes were small -- this was retrofitted into the algorithm when large file sizes became an issue. (3) Another issue is that the LZW file format begins with a three byte sequence (0x1F 0x9d ~~bitwidth~~XX where ~~only~~(c ~~the~~notation) ~~low~~XX&0x1F ~~five~~was ~~bits~~the ofdictionary ~~bitwidth~~size ~~are~~limit ~~significant).~~and ~~Again,~~XX&0x80 ~~not~~is ~~represented~~true inif ~~this~~clear codes are ~~task~~emitted. (At the time I am writing this, this is an ~~first~~second draft describing these issues. IThis ~~(or~~version ~~someone)~~is, reasonably accurate should ~~probably~~be ~~track~~reviewed ~~down~~by asomeone ~~historically~~with ~~valid~~more ~~LZW~~of ~~implementation~~an ~~and~~emphasis ~~verify~~on ~~the~~grammatical ~~correctness~~issues. ofMaybe ~~the~~me ~~description~~after ofenough ~~this~~time ~~description~~has passed to be baffled by how I wrote it. And then maybe reviewed again for technical accuracy. Note also that further down on this page is some example code that does bit packing,) ==Task and example==

Talk:LZW compression: Difference between revisions

Talk:LZW compression (view source)

Revision as of 21:39, 10 October 2020