LZ77

Many files on the Wii are compressed using the LZ77 compression algorithm. This page has documentation on how the algorithm works, and example code.

LZ77 Algorithm

In LZ77-compressed Wii files, there is first a simple 8-byte header:

Offset Size Description
0 bytes 4 bytes Magic number - always 0x4C5A3737 ("LZ77" in ASCII)
4 bytes 4 bytes Data about the compression - the first three bytes are the size of the uncompressed data and fourth is the compression method, which should always be 0x10 - note this value is stored little endian.

Data

Immediately after the header comes the actual compressed data. The algorithm is very simple: the data is divided in chunks, and each chunk starts with a flag byte. The flag byte indicates whether an entry in the chunk is literal data or a reference to earlier in the data. A chunk has 8 entries, and thus each of the bits in the flag byte gives this indication: if it's 0 the entry is a single byte of literal data. If it's 1 the entry is a reference to earlier in the data. For example, if the flag byte of a chunk is 0xA7 (10100111 in binary), the first entry is a reference, the second a byte of literal data, the third another reference, and so on. Note that an entry of literal data is always a single byte, whereas a reference is two bytes long.

References

A reference is a very simple two-byte structure:

Offset Size Description
0 bytes 4 bits A single hexadecimal digit - the length of the data referred to by the reference subtract 3
4 bits 12 bits Use to calculate offset within uncompressed data referred to by the reference.

Offset is calculated by the current "place" you're writing to in the uncompressed data subtract the value in the reference subtract 1. For example, if your current "place" is 0x9EB3 a reference 0x24AD would refer to 2 + 3 = 5 bytes of data at offset 0x9EB3 - 0x4AD - 1 = 0x9A05.

Example Code

This python code by Marcan will decompress a LZ77-compressed file:

```import sys, struct

class WiiLZ77:
TYPE_LZ77 = 1
def __init__(self, file, offset):
self.file = file
self.offset = offset

self.file.seek(self.offset)

self.uncompressed_length = hdr>>8
self.compression_type = hdr>>4 & 0xF

if self.compression_type != self.TYPE_LZ77:
raise ValueError("Unsupported compression method %d"%self.compression_type)

def uncompress(self):
dout = ""

self.file.seek(self.offset + 0x4)

while len(dout) < self.uncompressed_length:

for i in range(8):
if flags & 0x80:
num = 3 + ((info>>12)&0xF)
disp = info & 0xFFF
ptr = len(dout) - (info & 0xFFF) - 1
for i in range(num):
dout += dout[ptr]
ptr+=1
if len(dout) >= self.uncompressed_length:
break
else:
flags <<= 1
if len(dout) >= self.uncompressed_length:
break

self.data = dout
return self.data

f = open(sys.argv[1])