# MIO0 compressed data format - 2015.04.09 - queueRAM This document details the MIO0 compressed data structure used to store data in some Nintendo 64 games and a procedure to decompress it. MIO0 is a byte-granularity compression scheme that relies on identifying repeating data patterns. All data formats assume big-endian encoding and storage (Z64). ## MIO0 compressed block structure +--------+------+-----------------------------+ | Offset | Len | Description | +--------+------+-----------------------------+ | 0x00 | 0x10 | Header: | | 0x00 | 4 | Signature: "MIO0" | | 0x04 | 4 | Decompressed length | | 0x08 | 4 | Compressed offset (C) | | 0x0C | 4 | Uncompressed offset (U) | +--------+------+-----------------------------+ | 0x10 | | Layout bits: | | | | 0 = compressed | | | | 1 = uncompressed data | +--------+------+-----------------------------+ | | | Padding to align compressed | +--------+------+-----------------------------+ | (C) | | Compressed Data: | | | | 16-bit length/offset | +--------+------+-----------------------------+ | (U) | | Uncompressed Data: | | | | individual data bytes | +--------+------+-----------------------------+ ### Header In all games, the header is 16-bytes and aligned to 4-byte boundaries. In SM64, it is aligned to a 16-byte boundary (last 4 address bits are 0). The header contains the "MIO0" signature followed by three 32-bit values: decompressed length, compressed offset, and uncompressed offset. ### Layout Bits The layout bits section begins immediately after the header, at offset 0x10. These bits identify if the next group of output data are described by a compressed group (0) or if the next byte is to be pulled from the uncompressed data (1). The bits are packed starting at most significant first moving down to least significant. Additional padding 0x00 values may be after the layout bits to align the compressed data section to a 4-byte boundary. ### Compressed Data The compressed data are located at "Compressed offset" described in the header and is always aligned to a 4-byte boundary (last 2 address bits are 0). They are composed of 16-bit values that describe where to copy the next sequence of bytes from. Each 16-bit value consists of a 4-bit length, and a 12-bit look-back offset: length = upper 4 bits of first byte + 3 [range: 3-18] = (data[0] >> 4) + 3; offset = lower 4 bits of first and all second byte + 1 [range: 1-4096] = ((data[0] & 0x0F) << 8) + data[1] + 1; Note: The length can be greater than the offset. This just means that it will copy from other data that was already copied during this compressed block. This often occurs when there are large sections of the same data byte. ### Uncompressed Data The uncompressed data immediately follows the compressed data and thus is aligned to a 2-byte boundary since the compressed data is aligned to a 4-byte and the compressed data is 16-bit chunks. The uncompressed data are individual bytes of data, one for each '1' in the layout bits. ## MIO0 Decompression Procedure 1. read header to determine decompressed length and offsets 2. read next bit from layout bits a. if bit is 1, output next byte from uncompressed data b. if bit is 0, read next 2-bytes from compressed data i. length = upper 4 bits + 3 [range: 3-18] ii. offset = next 12 bits + 1 [range: 1-4096] iii. copy 'length' bytes from current output buffer index - 'offset' 3. if (bytes output < decompressed length), go back to step 2. ## Example Input (65B): 0x00 | 4d 49 4f 30 00 00 00 46 00 00 00 18 00 00 00 26 | MIO0...F.......& 0x10 | ff fb ef d8 40 00 00 00 00 04 20 0c 30 05 b0 14 | ....@..... .0... 0x20 | 20 26 30 20 10 37 48 6f 77 20 6d 75 63 68 20 77 | &0 .7How much w 0x30 | 6f 6f 64 75 6c 64 20 61 63 68 75 63 6b 20 69 66 | ooduld achuck if 0x40 | 3f | ? Header: output length=0x46 (70), comp_offset=0x18, uncomp_offset=0x26 1. First 13 bits are 1, pull 13 uncomp bytes starting at 0x26: 48 6f 77 20 6d 75 63 68 20 77 6f 6f 64: 'How much wood' 2. Next bit is 0, look at first comp block at 0x18: 00 04 len=0+3=3, offset=4+1=5: copy 3 bytes starting from 5 bytes ago: ' wo' Buf: 'How much wood wo' 3. Next five bits are 1, copy 5 more from uncomp starting at 0x33: 75 6c 64 20 61: 'uld a' Buf: 'How much wood would a' 4. Next bit is 0, look at next comp block at 0x1A: 20 0c len=2+3=5, offset=c+1=d: copy 5 bytes starting from 13 bytes ago: ' wood' Buf: 'How much wood would a wood' 5. Next six bits are 1, copy 6 more from uncomp starting at 0x38: 63 68 75 63 6b 20: 'chuck ' Buf: 'How much wood would a woodchuck ' 6. Next bit is 0, look at next comp block at 0x1C: 30 05 len=3+3=6, offset=5+1=6: copy 6 bytes starting from 5 bytes ago: 'chuck ' Buf: 'How much wood would a woodchuck chuck ' 7. Next two bits are 1, copy 2 more from uncomp starting at 0x3E: 69 66: 'if' Buf: 'How much wood would a woodchuck chuck if' 8. Next four bits are 0, look at next 4 comp blocks: 0x1E: b0 14: len=14, offset=20: ' a woodchuck c' 0x20: 20 26: len= 5, offset=38: 'ould ' 0x22: 30 20: len= 6, offset=32: 'chuck ' 0x24: 10 37: len= 4, offset=55: 'wood' Buf: 'How much wood would a woodchuck chuck if a woodchuck could chuck wood' 9. Last bit is 1, copy 1 from uncomp at 0x40: 3f: '?' Buf: 'How much wood would a woodchuck chuck if a woodchuck could chuck wood?' Len: 70 ## Software Libraries C: libmio0: https://github.com/queueram/sm64tools VB: M0CK: http://www.emutalk.net/threads/21072-N64-Texture-Image-Extraction/ ## MIO0 compressed data blocks can be found in the following N64 games: F-Zero X Indy Racing 2000 Looney Tunes - Duck Dodgers Mario Kart 64 Pilotwings 64 Star Fox 64 Super Mario 64