I post this here because I’d enjoy a discussion and have the errors of my naive ways pointed out to me; I’ve already promised myself I’ll not be sucked into implementing a proof of concept! Who knows, maybe in here there’s a nugget that will inspire someone else, and that will be nice.
Imagine a hashtable-only-based history for LZ. Imagine it like this:
You pick some order for the key. In this example, I’ll pick order 3.
Think of the stream as being in multiples of this key-size, even though you’re sliding through it one char at a time.
So, for example, our compressor is mid-stream:
input: ...I had no love of world of work, with its ... current: ^ blocks: ...[of ][wor][k, ][wit]...
We look up the previous key in the table,
This will point out to a *list* of historic offsets for the
We also look up where we are now,
[k, ], which is another list of historic offsets
We take these two lists and slide along them, looking for a continuation:
[wor] -> hash lookup -> list of offsets -> 56, 68, 122, 157 …
[k, ] -> hash lookup -> list of offsets -> 35, 125, 220 …
In the sliding-along we spot that
[k, ]125 are adjacent. We know that historicly there was a
[work, ] at 122 and the match was 6 long.
Here’s an augmentory idea, that might be useful elsewhere. We’re doing a *greedy* LZ coder. Look up the prefix -
[wor] - what is it’s history? There will like this for sure:
[ld ]. Now we we *know* that our match is not going to be
[ld ] because, if it was, we’d have previously greedily encoded our LZ! We can use the historic offsets to cull candidates because we know that we’re greedy.
- williamedwardscoder posted this
↓ click the "share" button below!