Line 1: |
Line 1: |
− | Many files on the Wii are compressed using the '''LZ77''' compression algorithm. This Python code, by Marcan, will decompress them. | + | Many files on the Wii are compressed using the '''LZ77''' compression algorithm. This page has documentation on how the algorithm works, and example code. |
| + | |
| + | ==LZ77 Algorithm== |
| + | ===Header=== |
| + | In LZ77-compressed Wii files, there is first a simple 8-byte header: |
| + | {| class="wikitable" |
| + | |- |
| + | ! '''Offset''' |
| + | ! '''Size''' |
| + | ! '''Description''' |
| + | |- |
| + | | 0 bytes |
| + | | 4 bytes |
| + | | Magic number - always 0x4C5A3737 ("LZ77" in ASCII) |
| + | |- |
| + | | 4 bytes |
| + | | 3 bytes |
| + | | Size of uncompressed data |
| + | |- |
| + | | 7 bytes |
| + | | 1 byte |
| + | | Compression method - should always be 0x01 |
| + | |} |
| + | |
| + | ===Data=== |
| + | Immediately after the header comes the actual compressed data. The algorithm is very simple: the data is divided in chunks, and each chunk starts with a flag byte. The flag byte indicates whether an entry in the chunk is literal data or a reference to earlier in the data. A chunk has 8 entries, and thus each of the bits in the flag byte gives this indication: if it's 0 the entry is a single byte of literal data. If it's 1 the entry is a reference to earlier in the data. For example, if the flag byte of a chunk is 0xA7 (10100111 in binary), the first entry is a reference, the second a byte of literal data, the third another reference, and so on. Note that an entry of literal data is always a single byte, whereas a reference is two bytes long. |
| + | |
| + | ===References=== |
| + | A reference is a very simple two-byte structure: |
| + | {| class="wikitable" |
| + | |- |
| + | ! '''Offset''' |
| + | ! '''Size''' |
| + | ! '''Description''' |
| + | |- |
| + | | 0 bytes |
| + | | 4 bits |
| + | | A single hexadecimal digit - the length of the data referred to by the reference subtract 3 |
| + | |- |
| + | | 4 bits |
| + | | 12 bits |
| + | | Offset - the offset within the decompressed data of the data referred to by the reference |
| + | |} |
| + | |
| + | For example, a reference 0x24AD would refer to 2 + 3 = '''5''' bytes of data at offset 0x4AD in the uncompressed data. |
| + | |
| + | ==Example Code== |
| + | This code by Marcan will decompress a LZ77-compressed file: |
| + | |
| <source lang="python"> | | <source lang="python"> |
| import sys, struct | | import sys, struct |