Difference between revisions of "LZ77"

From WiiBrew
Jump to navigation Jump to search
(Add details of LZ77 algorithm (since it's not very well explained elsewhere on the internet))
Line 46: Line 46:
==Example Code==
==Example Code==
This code by Marcan will decompress a LZ77-compressed file:
This python code by Marcan will decompress a LZ77-compressed file:
<source lang="python">
<source lang="python">

Revision as of 23:46, 14 May 2020

Many files on the Wii are compressed using the LZ77 compression algorithm. This page has documentation on how the algorithm works, and example code.

LZ77 Algorithm


In LZ77-compressed Wii files, there is first a simple 8-byte header:

Offset Size Description
0 bytes 4 bytes Magic number - always 0x4C5A3737 ("LZ77" in ASCII)
4 bytes 3 bytes Size of uncompressed data
7 bytes 1 byte Compression method - should always be 0x01


Immediately after the header comes the actual compressed data. The algorithm is very simple: the data is divided in chunks, and each chunk starts with a flag byte. The flag byte indicates whether an entry in the chunk is literal data or a reference to earlier in the data. A chunk has 8 entries, and thus each of the bits in the flag byte gives this indication: if it's 0 the entry is a single byte of literal data. If it's 1 the entry is a reference to earlier in the data. For example, if the flag byte of a chunk is 0xA7 (10100111 in binary), the first entry is a reference, the second a byte of literal data, the third another reference, and so on. Note that an entry of literal data is always a single byte, whereas a reference is two bytes long.


A reference is a very simple two-byte structure:

Offset Size Description
0 bytes 4 bits A single hexadecimal digit - the length of the data referred to by the reference subtract 3
4 bits 12 bits Offset - the offset within the decompressed data of the data referred to by the reference

For example, a reference 0x24AD would refer to 2 + 3 = 5 bytes of data at offset 0x4AD in the uncompressed data.

Example Code

This python code by Marcan will decompress a LZ77-compressed file:

import sys, struct
class WiiLZ77:
    TYPE_LZ77 = 1
    def __init__(self, file, offset):
        self.file = file
        self.offset = offset


        hdr = struct.unpack("<I",self.file.read(4))[0]
        self.uncompressed_length = hdr>>8
        self.compression_type = hdr>>4 & 0xF

        if self.compression_type != self.TYPE_LZ77:
            raise ValueError("Unsupported compression method %d"%self.compression_type)

    def uncompress(self):
        dout = ""

        self.file.seek(self.offset + 0x4)

        while len(dout) < self.uncompressed_length:
            flags = struct.unpack("<B",self.file.read(1))[0]

            for i in range(8):
                if flags & 0x80:
                    info = struct.unpack(">H",self.file.read(2))[0]
                    num = 3 + ((info>>12)&0xF)
                    disp = info & 0xFFF
                    ptr = len(dout) - (info & 0xFFF) - 1
                    for i in range(num):
                        dout += dout[ptr]
                        if len(dout) >= self.uncompressed_length:
                    dout += self.file.read(1)
                flags <<= 1
                if len(dout) >= self.uncompressed_length:

        self.data = dout
        return self.data
f = open(sys.argv[1])
hdr = f.read(4)
if hdr != "LZ77":
unc = WiiLZ77(f, f.tell())
du = unc.uncompress()
f2 = open(sys.argv[2],"w")