LZW (Lempel-Ziv-Welch) Compression Algorithm

LZW (Lempel-Ziv-Welch)

This is another dictionary-based compression. It works by entering symbol into the dictionary together with the index then when the symbol occurs again just output the dictionary index instead if the symbol. The original LZW has only 12 bits (4096) dictionary index, the first 256 (0-255) indexs are unuse, they are refer to individual bytes, and the rest indexs 256-4095 refer to inputted symbol.
Building dictionary rules:
1. The dictionary start with a sort value length then growing.
2. The dictionary can not has two same values.

Encoding:
Eg: source: ABCABCBACAACBAA (15 bytes).
The dictionary index value is start with 256, the index value of 0 - 255 is reserved for initial value.

Input symbol	Output symbol / index	Dictionary index	Dictionary value
AB	A	256	AB
C	B	257	BC
A	C	258	CA
BC	256	259	ABC
B	C	260	CB
A	B	261	BA
C	A	262	AC
AA	258	263	CAA
CB	262	264	ACB
AA	261	265	BAA
(End Of File)	A

The result (target) file only have the output symbol / index without dictionary

Decoding:

While decoding the program will build dictionary as well.

Input symbol / index	Output symbol	Dictionary index	Dictionary value
AB	A	256	AB
C	B	257	BC
256	C	258	CA
C	AB	259	ABC
B	C	260	CB
A	B	261	BA
258	A	262	AC
262	CA	263	CAA
261	AC	264	ACB
A	BA	265	BAA
(End Of File)	A

Pros: do not need to store the dictionary in the output file.
Cons: dictionary have to be built in encoding and decoding (double work, time comsuming), also difficult to get the best (the most efficient) dictionary size on different file type and size.
Conclusion: this compression has so many variants and has so many names on the web because people (programmer) like to improving this compression by using a more efficient dictionary size and using different way of indexing the dictionary.

Sliding Window

Grouping

Hosted by www.Geocities.ws