The .xz File Format
===================

Version 1.2.1 (2024-04-08)


        0. Preface
           0.1. Notices and Acknowledgements
           0.2. Getting the Latest Version
           0.3. Version History
        1. Conventions
           1.1. Byte and Its Representation
           1.2. Multibyte Integers
        2. Overall Structure of .xz File
           2.1. Stream
                2.1.1. Stream Header
                       2.1.1.1. Header Magic Bytes
                       2.1.1.2. Stream Flags
                       2.1.1.3. CRC32
                2.1.2. Stream Footer
                       2.1.2.1. CRC32
                       2.1.2.2. Backward Size
                       2.1.2.3. Stream Flags
                       2.1.2.4. Footer Magic Bytes
           2.2. Stream Padding
        3. Block
           3.1. Block Header
                3.1.1. Block Header Size
                3.1.2. Block Flags
                3.1.3. Compressed Size
                3.1.4. Uncompressed Size
                3.1.5. List of Filter Flags
                3.1.6. Header Padding
                3.1.7. CRC32
           3.2. Compressed Data
           3.3. Block Padding
           3.4. Check
        4. Index
           4.1. Index Indicator
           4.2. Number of Records
           4.3. List of Records
                4.3.1. Unpadded Size
                4.3.2. Uncompressed Size
           4.4. Index Padding
           4.5. CRC32
        5. Filter Chains
           5.1. Alignment
           5.2. Security
           5.3. Filters
                5.3.1. LZMA2
                5.3.2. Branch/Call/Jump Filters for Executables
                5.3.3. Delta
                       5.3.3.1. Format of the Encoded Output
           5.4. Custom Filter IDs
                5.4.1. Reserved Custom Filter ID Ranges
        6. Cyclic Redundancy Checks
        7. References


0. Preface

        This document describes the .xz file format (filename suffix
        ".xz", MIME type "application/x-xz").
        It is intended that this format replace the old .lzma format
        used by LZMA SDK and LZMA Utils.


0.1. Notices and Acknowledgements

        This file format was designed by Lasse Collin
        and Igor Pavlov.

        Special thanks for helping with this document go to
        Ville Koskinen. Thanks for helping with this document go to
        Mark Adler, H. Peter Anvin, Mikko Pouru, and Lars Wirzenius.

        This document has been put into the public domain.


0.2. Getting the Latest Version

        The latest official version of this document can be downloaded
        from .

        Specific versions of this document have a filename
        xz-file-format-X.Y.Z.txt where X.Y.Z is the version number.
        For example, the version 1.0.0 of this document is available
        at .


0.3. Version History

        Version   Date          Description

        1.2.1     2024-04-08    The URLs of this specification and
                                XZ Utils were changed back to the
                                original ones in Sections 0.2 and 7.

        1.2.0     2024-01-19    Added RISC-V filter and updated URLs in
                                Sections 0.2 and 7. The URL of this
                                specification was changed.

        1.1.0     2022-12-11    Added ARM64 filter and clarified 32-bit
                                ARM endianness in Section 5.3.2,
                                language improvements in Section 5.4

        1.0.4     2009-08-27    Language improvements in Sections 1.2,
                                2.1.1.2, 3.1.1, 3.1.2, and 5.3.1

        1.0.3     2009-06-05    Spelling fixes in Sections 5.1 and 5.4

        1.0.2     2009-06-04    Typo fixes in Sections 4 and 5.3.1

        1.0.1     2009-06-01    Typo fix in Section 0.3 and minor
                                clarifications to Sections 2, 2.2,
                                3.3, 4.4, and 5.3.2

        1.0.0     2009-01-14    The first official version


1. Conventions

        The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD",
        "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in [RFC-2119].

        Indicating a warning means displaying a message, returning
        appropriate exit status, or doing something else to let the
        user know that something worth a warning occurred. The
        operation SHOULD still finish if a warning is indicated.

        Indicating an error means displaying a message, returning
        appropriate exit status, or doing something else to let the
        user know that something prevented successfully finishing the
        operation. The operation MUST be aborted once an error has
        been indicated.


1.1. Byte and Its Representation

        In this document, a byte is always 8 bits.

        A "null byte" has all bits unset. That is, the value of a null
        byte is 0x00.

        To represent byte blocks, this document uses notation that
        is similar to the notation used in [RFC-1952]:

            +-------+
            |  Foo  |   One byte.
            +-------+

            +---+---+
            |  Foo  |   Two bytes; that is, some of the vertical bars
            +---+---+   can be missing.

            +=======+
            |  Foo  |   Zero or more bytes.
            +=======+

        In this document, a boxed byte or a byte sequence declared
        using this notation is called "a field". The example field
        above would be called "the Foo field" or plain "Foo".

        If there are many fields, they may be split across multiple
        lines. This is indicated with an arrow ("--->"):

            +=====+
            | Foo |
            +=====+

                 +=====+
            ---> | Bar |
                 +=====+

        The above is equivalent to this:

            +=====+=====+
            | Foo | Bar |
            +=====+=====+


1.2. Multibyte Integers

        Multibyte integers of static length, such as CRC values,
        are stored in little endian byte order (least significant
        byte first).

        When smaller values are more likely than bigger values (for
        example file sizes), multibyte integers are encoded in a
        variable-length representation:
          - Numbers in the range [0, 127] are copied as is, and take
            one byte of space.
          - Bigger numbers will occupy two or more bytes. All but the
            last byte of the multibyte representation have the highest
            (eighth) bit set.

        For now, the value of the variable-length integers is limited
        to 63 bits, which limits the encoded size of the integer to
        nine bytes. These limits may be increased in the future if
        needed.

        The following C code illustrates encoding and decoding of
        variable-length integers.
        The functions return the number of
        bytes occupied by the integer (1-9), or zero on error.

            #include <stddef.h>
            #include <stdint.h>

            size_t
            encode(uint8_t buf[static 9], uint64_t num)
            {
                /* Only 63-bit values can be encoded. */
                if (num > UINT64_MAX / 2)
                    return 0;

                size_t i = 0;

                /* Emit seven bits at a time, lowest bits first. */
                while (num >= 0x80) {
                    buf[i++] = (uint8_t)(num) | 0x80;
                    num >>= 7;
                }

                /* The last byte has the highest bit unset. */
                buf[i++] = (uint8_t)(num);

                return i;
            }

            size_t
            decode(const uint8_t buf[], size_t size_max, uint64_t *num)
            {
                if (size_max == 0)
                    return 0;

                if (size_max > 9)
                    size_max = 9;

                *num = buf[0] & 0x7F;
                size_t i = 0;

                while (buf[i++] & 0x80) {
                    /* Reject truncated input and a zero last byte. */
                    if (i >= size_max || buf[i] == 0x00)
                        return 0;

                    *num |= (uint64_t)(buf[i] & 0x7F) << (i * 7);
                }

                return i;
            }


2. Overall Structure of .xz File

        Standalone .xz files consist of one or more Streams which may
        have Stream Padding between or after them:

            +========+================+========+================+
            | Stream | Stream Padding | Stream | Stream Padding | ...
            +========+================+========+================+

        The sizes of Stream and Stream Padding are always multiples
        of four bytes, thus the size of every valid .xz file MUST be
        a multiple of four bytes.

        While a typical file contains only one Stream and no Stream
        Padding, a decoder handling standalone .xz files SHOULD support
        files that have more than one Stream or Stream Padding.
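
        The size rule for standalone files lends itself to a cheap
        pre-check before any parsing is attempted. The following is
        a minimal sketch, not part of the format itself; the helper
        name xz_file_size_plausible and the constant
        XZ_MIN_STREAM_SIZE are made up for illustration. The minimum
        of 32 bytes assumes a Stream with no Blocks: a 12-byte Stream
        Header, an 8-byte Index with zero Records, and a 12-byte
        Stream Footer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Smallest possible Stream: 12-byte Stream Header, 8-byte Index
 * with zero Records, and 12-byte Stream Footer. */
#define XZ_MIN_STREAM_SIZE 32

/* Hypothetical helper: returns true if file_size could be the size
 * of a valid standalone .xz file. Passing this check says nothing
 * about the contents of the file. */
bool
xz_file_size_plausible(uint64_t file_size)
{
    return file_size >= XZ_MIN_STREAM_SIZE && file_size % 4 == 0;
}
```

        A file that fails this check cannot be valid, so a tool can
        reject it without reading a single byte of its contents.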

        In contrast to standalone .xz files, when the .xz file format
        is used as an internal part of some other file format or
        communication protocol, it usually is expected that the decoder
        stops after the first Stream, and doesn't look for Stream
        Padding or possibly other Streams.


2.1. Stream

        +-+-+-+-+-+-+-+-+-+-+-+-+=======+=======+     +=======+
        |     Stream Header     | Block | Block | ... | Block |
        +-+-+-+-+-+-+-+-+-+-+-+-+=======+=======+     +=======+

             +=======+-+-+-+-+-+-+-+-+-+-+-+-+
        ---> | Index |     Stream Footer     |
             +=======+-+-+-+-+-+-+-+-+-+-+-+-+

        All the above fields have a size that is a multiple of four. If
        Stream is used as an internal part of another file format, it
        is RECOMMENDED to make the Stream start at an offset that is
        a multiple of four bytes.

        Stream Header, Index, and Stream Footer are always present in
        a Stream. The maximum size of the Index field is 16 GiB (2^34).

        There are zero or more Blocks. The maximum number of Blocks is
        limited only by the maximum size of the Index field.

        Total size of a Stream MUST be less than 8 EiB (2^63 bytes).
        The same limit applies to the total amount of uncompressed
        data stored in a Stream.

        If an implementation supports handling .xz files with multiple
        concatenated Streams, it MAY apply the above limits to the file
        as a whole instead of applying them on a per-Stream basis.


2.1.1. Stream Header

        +---+---+---+---+---+---+-------+------+--+--+--+--+
        |  Header Magic Bytes   | Stream Flags |   CRC32   |
        +---+---+---+---+---+---+-------+------+--+--+--+--+


2.1.1.1. Header Magic Bytes

        The first six (6) bytes of the Stream are so-called Header
        Magic Bytes. They can be used to identify the file type.

            Using a C array and ASCII:
            const uint8_t HEADER_MAGIC[6]
                    = { 0xFD, '7', 'z', 'X', 'Z', 0x00 };

            In plain hexadecimal:
            FD 37 7A 58 5A 00

        Notes:
          - The first byte (0xFD) was chosen so that the files cannot
            be erroneously detected as being in .lzma format, in which
            the first byte is in the range [0x00, 0xE0].
          - The sixth byte (0x00) was chosen to prevent applications
            from misdetecting the file as a text file.

        If the Header Magic Bytes don't match, the decoder MUST
        indicate an error.


2.1.1.2. Stream Flags

        The first byte of Stream Flags is always a null byte. In the
        future, this byte may be used to indicate a new Stream version
        or other Stream properties.
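
        The checks described so far for the start of a Stream (the
        six Header Magic Bytes followed by a null first byte of Stream
        Flags) can be sketched as below. This is only an illustration;
        the function name xz_header_prefix_ok is an assumption and
        does not come from any existing library. The sketch does not
        verify the second byte of Stream Flags or the CRC32.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const uint8_t HEADER_MAGIC[6]
        = { 0xFD, '7', 'z', 'X', 'Z', 0x00 };

/* Hypothetical helper: checks the first seven bytes of a Stream.
 * Returns true if the Header Magic Bytes match and the first byte
 * of Stream Flags is a null byte. */
bool
xz_header_prefix_ok(const uint8_t *buf, size_t size)
{
    if (size < sizeof(HEADER_MAGIC) + 1)
        return false;

    if (memcmp(buf, HEADER_MAGIC, sizeof(HEADER_MAGIC)) != 0)
        return false;

    return buf[sizeof(HEADER_MAGIC)] == 0x00;
}
```

        A decoder would indicate an error when this function returns
        false, and otherwise continue with the second byte of Stream
        Flags.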

        The second byte of Stream Flags is a bit field:

            Bit(s)  Mask  Description
             0-3    0x0F  Type of Check (see Section 3.4):
                              ID    Size      Check name
                              0x00   0 bytes  None
                              0x01   4 bytes  CRC32
                              0x02   4 bytes  (Reserved)
                              0x03   4 bytes  (Reserved)
                              0x04   8 bytes  CRC64
                              0x05   8 bytes  (Reserved)
                              0x06   8 bytes  (Reserved)
                              0x07  16 bytes  (Reserved)
                              0x08  16 bytes  (Reserved)
                              0x09  16 bytes  (Reserved)
                              0x0A  32 bytes  SHA-256
                              0x0B  32 bytes  (Reserved)
                              0x0C  32 bytes  (Reserved)
                              0x0D  64 bytes  (Reserved)
                              0x0E  64 bytes  (Reserved)
                              0x0F  64 bytes  (Reserved)
             4-7    0xF0  Reserved for future use; MUST be zero for now.

        Implementations SHOULD support at least the Check IDs 0x00
        (None) and 0x01 (CRC32). Supporting other Check IDs is
        OPTIONAL. If an unsupported Check is used, the decoder SHOULD
        indicate a warning or error.

        If any reserved bit is set, the decoder MUST indicate an error.
        It is possible that there is a new field present which the
        decoder is not aware of, and can thus parse the Stream Header
        incorrectly.


2.1.1.3. CRC32

        The CRC32 is calculated from the Stream Flags field. It is
        stored as an unsigned 32-bit little endian integer. If the
        calculated value does not match the stored one, the decoder
        MUST indicate an error.

        The idea is that Stream Flags would always be two bytes, even
        if new features are needed.
        This way old decoders will be able
        to verify the CRC32 calculated from Stream Flags, and thus
        distinguish between corrupt files (CRC32 doesn't match) and
        files that the decoder doesn't support (CRC32 matches but
        Stream Flags has reserved bits set).


2.1.2. Stream Footer

        +-+-+-+-+---+---+---+---+-------+------+----------+---------+
        | CRC32 | Backward Size | Stream Flags | Footer Magic Bytes |
        +-+-+-+-+---+---+---+---+-------+------+----------+---------+


2.1.2.1. CRC32

        The CRC32 is calculated from the Backward Size and Stream Flags
        fields. It is stored as an unsigned 32-bit little endian
        integer. If the calculated value does not match the stored one,
        the decoder MUST indicate an error.

        The reason to have the CRC32 field before the Backward Size and
        Stream Flags fields is to keep the four-byte fields aligned to
        a multiple of four bytes.


2.1.2.2. Backward Size

        Backward Size is stored as a 32-bit little endian integer,
        which indicates the size of the Index field as a multiple of
        four bytes, minimum value being four bytes:

            real_backward_size = (stored_backward_size + 1) * 4;

        If the stored value does not match the real size of the Index
        field, the decoder MUST indicate an error.

        Using a fixed-size integer to store Backward Size makes
        it slightly simpler to parse the Stream Footer when the
        application needs to parse the Stream backwards.


2.1.2.3. Stream Flags

        This is a copy of the Stream Flags field from the Stream
        Header.
        The information stored in Stream Flags is needed
        when parsing the Stream backwards. The decoder MUST compare
        the Stream Flags fields in both Stream Header and Stream
        Footer, and indicate an error if they are not identical.


2.1.2.4. Footer Magic Bytes

        As the last step of the decoding process, the decoder MUST
        verify the existence of Footer Magic Bytes. If they don't
        match, an error MUST be indicated.

            Using a C array and ASCII:
            const uint8_t FOOTER_MAGIC[2] = { 'Y', 'Z' };

            In hexadecimal:
            59 5A

        The primary reason to have Footer Magic Bytes is to make
        it easier to detect incomplete files quickly, without
        uncompressing. If the file does not end with Footer Magic Bytes
        (excluding Stream Padding described in Section 2.2), it cannot
        be undamaged, unless someone has intentionally appended garbage
        after the end of the Stream.


2.2. Stream Padding

        Only the decoders that support decoding of concatenated Streams
        MUST support Stream Padding.

        Stream Padding MUST contain only null bytes. To preserve the
        four-byte alignment of consecutive Streams, the size of Stream
        Padding MUST be a multiple of four bytes. Empty Stream Padding
        is allowed. If these requirements are not met, the decoder MUST
        indicate an error.

        Note that non-empty Stream Padding is allowed at the end of the
        file; there doesn't need to be a new Stream after non-empty
        Stream Padding. This can be convenient in certain situations
        [GNU-tar].
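
        One way to locate the end of the last Stream in a buffer is
        to count the null bytes at its end. Because Footer Magic Bytes
        end with a non-null byte ('Z'), any null bytes after them can
        only be Stream Padding. The following is a minimal sketch
        under that assumption; the function name xz_trailing_padding
        and its return convention are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts the null bytes at the end of buf.
 * On success, stores the size of the trailing Stream Padding in
 * *padding and returns 0. Returns -1 if the size of the padding
 * is not a multiple of four bytes, which the decoder has to treat
 * as an error. */
int
xz_trailing_padding(const uint8_t *buf, size_t size, size_t *padding)
{
    size_t n = 0;

    while (n < size && buf[size - 1 - n] == 0x00)
        ++n;

    if (n % 4 != 0)
        return -1;

    *padding = n;
    return 0;
}
```

        After a successful call, the Stream Footer of the last Stream
        is expected to end at offset size - *padding.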

        The possibility of Stream Padding MUST be taken into account
        when designing an application that parses Streams backwards
        and supports concatenated Streams.


3. Block

        +==============+=================+===============+=======+
        | Block Header | Compressed Data | Block Padding | Check |
        +==============+=================+===============+=======+


3.1. Block Header

        +-------------------+-------------+=================+
        | Block Header Size | Block Flags | Compressed Size |
        +-------------------+-------------+=================+

             +===================+======================+
        ---> | Uncompressed Size | List of Filter Flags |
             +===================+======================+

             +================+--+--+--+--+
        ---> | Header Padding |   CRC32   |
             +================+--+--+--+--+


3.1.1. Block Header Size

        This field overlaps with the Index Indicator field (see
        Section 4.1).

        This field contains the size of the Block Header field,
        including the Block Header Size field itself. Valid values are
        in the range [0x01, 0xFF], which indicate the size of the Block
        Header as multiples of four bytes, minimum size being eight
        bytes:

            real_header_size = (encoded_header_size + 1) * 4;

        If a Block Header bigger than 1024 bytes is needed in the
        future, a new field can be added between the Block Header and
        Compressed Data fields. The presence of this new field would
        be indicated in the Block Header field.


3.1.2. Block Flags

        The Block Flags field is a bit field:

            Bit(s)  Mask  Description
             0-1    0x03  Number of filters (1-4)
             2-5    0x3C  Reserved for future use; MUST be zero for now.
              6     0x40  The Compressed Size field is present.
              7     0x80  The Uncompressed Size field is present.

        If any reserved bit is set, the decoder MUST indicate an error.
        It is possible that there is a new field present which the
        decoder is not aware of, and can thus parse the Block Header
        incorrectly.


3.1.3. Compressed Size

        This field is present only if the appropriate bit is set in
        the Block Flags field (see Section 3.1.2).

        The Compressed Size field contains the size of the Compressed
        Data field, which MUST be non-zero. Compressed Size is stored
        using the encoding described in Section 1.2. If the Compressed
        Size doesn't match the size of the Compressed Data field, the
        decoder MUST indicate an error.


3.1.4. Uncompressed Size

        This field is present only if the appropriate bit is set in
        the Block Flags field (see Section 3.1.2).

        The Uncompressed Size field contains the size of the Block
        after uncompressing. Uncompressed Size is stored using the
        encoding described in Section 1.2. If the Uncompressed Size
        does not match the real uncompressed size, the decoder MUST
        indicate an error.

        Storing the Compressed Size and Uncompressed Size fields serves
        several purposes:
          - The decoder knows how much memory it needs to allocate
            for a temporary buffer in multithreaded mode.
          - Simple error detection: wrong size indicates a broken file.
          - Seeking forwards to a specific location in streamed mode.

        It should be noted that the only reliable way to determine
        the real uncompressed size is to uncompress the Block,
        because the Block Header and Index fields may contain
        (intentionally or unintentionally) invalid information.


3.1.5. List of Filter Flags

        +================+================+     +================+
        | Filter 0 Flags | Filter 1 Flags | ... | Filter n Flags |
        +================+================+     +================+

        The number of Filter Flags fields is stored in the Block Flags
        field (see Section 3.1.2).

        The format of each Filter Flags field is as follows:

            +===========+====================+===================+
            | Filter ID | Size of Properties | Filter Properties |
            +===========+====================+===================+

        Both Filter ID and Size of Properties are stored using the
        encoding described in Section 1.2. Size of Properties indicates
        the size of the Filter Properties field in bytes. The list of
        officially defined Filter IDs and the formats of their Filter
        Properties are described in Section 5.3.

        Filter IDs greater than or equal to 0x4000_0000_0000_0000
        (2^62) are reserved for implementation-specific internal use.
        These Filter IDs MUST never be used in List of Filter Flags.


3.1.6. Header Padding

        This field contains as many null bytes as are needed to make
        the Block Header have the size specified in Block Header Size.
        If any of the bytes are not null bytes, the decoder MUST
        indicate an error.
        It is possible that there is a new field
        present which the decoder is not aware of, and can thus parse
        the Block Header incorrectly.


3.1.7. CRC32

        The CRC32 is calculated over everything in the Block Header
        field except the CRC32 field itself. It is stored as an
        unsigned 32-bit little endian integer. If the calculated
        value does not match the stored one, the decoder MUST indicate
        an error.

        Verifying the CRC32 of the Block Header before parsing the
        actual contents allows the decoder to distinguish between
        corrupt and unsupported files.


3.2. Compressed Data

        The format of Compressed Data depends on Block Flags and List
        of Filter Flags. Excluding the descriptions of the simplest
        filters in Section 5.3, the format of the filter-specific
        encoded data is out of scope of this document.


3.3. Block Padding

        Block Padding MUST contain 0-3 null bytes to make the size of
        the Block a multiple of four bytes. This can be needed when
        the size of Compressed Data is not a multiple of four. If any
        of the bytes in Block Padding are not null bytes, the decoder
        MUST indicate an error.


3.4. Check

        The type and size of the Check field depend on which bits
        are set in the Stream Flags field (see Section 2.1.1.2).

        The Check, when used, is calculated from the original
        uncompressed data. If the calculated Check does not match the
        stored one, the decoder MUST indicate an error. If the selected
        type of Check is not supported by the decoder, it SHOULD
        indicate a warning or error.


4. Index

        +-----------------+===================+
        | Index Indicator | Number of Records |
        +-----------------+===================+

             +=================+===============+-+-+-+-+
        ---> | List of Records | Index Padding | CRC32 |
             +=================+===============+-+-+-+-+

        Index serves several purposes. Using it, one can
          - verify that all Blocks in a Stream have been processed;
          - find out the uncompressed size of a Stream; and
          - quickly access the beginning of any Block (random access).


4.1. Index Indicator

        This field overlaps with the Block Header Size field (see
        Section 3.1.1). The value of Index Indicator is always 0x00.


4.2. Number of Records

        This field indicates how many Records there are in the List
        of Records field, and thus how many Blocks there are in the
        Stream. The value is stored using the encoding described in
        Section 1.2. If the decoder has decoded all the Blocks of the
        Stream, and then notices that the Number of Records doesn't
        match the real number of Blocks, the decoder MUST indicate an
        error.


4.3. List of Records

        List of Records consists of as many Records as indicated by the
        Number of Records field:

            +========+========+
            | Record | Record | ...
            +========+========+

        Each Record contains information about one Block:

            +===============+===================+
            | Unpadded Size | Uncompressed Size |
            +===============+===================+

        If the decoder has decoded all the Blocks of the Stream, it
        MUST verify that the contents of the Records match the real
        Unpadded Size and Uncompressed Size of the respective Blocks.

        Implementation hint: It is possible to verify the Index with
        constant memory usage by calculating for example SHA-256 of
        both the real size values and the List of Records, then
        comparing the hash values. Implementing this using a
        non-cryptographic hash like CRC32 SHOULD be avoided unless
        small code size is important.

        If the decoder supports random-access reading, it MUST verify
        that Unpadded Size and Uncompressed Size of every completely
        decoded Block match the sizes stored in the Index. If only
        a partial Block is decoded, the decoder MUST verify that the
        processed sizes don't exceed the sizes stored in the Index.


4.3.1. Unpadded Size

        This field indicates the size of the Block excluding the Block
        Padding field. That is, Unpadded Size is the size of the Block
        Header, Compressed Data, and Check fields. Unpadded Size is
        stored using the encoding described in Section 1.2. The value
        MUST never be zero; with the current structure of Blocks, the
        actual minimum value for Unpadded Size is five.
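
        Since every Block is padded to a multiple of four bytes (see
        Section 3.3), the size a Block occupies in the Stream can be
        derived from its Unpadded Size, and the offset of a Block can
        be derived from the Records preceding it. The following is a
        minimal sketch; the function names and the constant
        XZ_STREAM_HEADER_SIZE are assumptions made for illustration.
        The value 12 is the size of the Stream Header shown in
        Section 2.1.1. Overflow checks are omitted for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Header Magic Bytes (6) + Stream Flags (2) + CRC32 (4) */
#define XZ_STREAM_HEADER_SIZE 12

/* Size of a Block including Block Padding: Unpadded Size rounded
 * up to the next multiple of four. */
uint64_t
xz_padded_block_size(uint64_t unpadded_size)
{
    return (unpadded_size + 3) & ~(uint64_t)3;
}

/* Offset of Block number `index` (counting from zero) relative to
 * the beginning of the Stream, given the Unpadded Size values
 * from the Index in the order they are stored. */
uint64_t
xz_block_offset(const uint64_t *unpadded_sizes, size_t index)
{
    uint64_t offset = XZ_STREAM_HEADER_SIZE;

    for (size_t i = 0; i < index; ++i)
        offset += xz_padded_block_size(unpadded_sizes[i]);

    return offset;
}
```

        For example, with Unpadded Sizes 5, 17, and 100, the Blocks
        start at offsets 12, 20, and 40.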

        Implementation note: Because the size of the Block Padding
        field is not included in Unpadded Size, calculating the total
        size of a Stream or doing random-access reading requires
        calculating the actual size of the Blocks by rounding Unpadded
        Sizes up to the next multiple of four.

        The reason to exclude Block Padding from Unpadded Size is to
        ease making a raw copy of Compressed Data without Block
        Padding. This can be useful, for example, if someone wants
        to convert Streams to some other file format quickly.


4.3.2. Uncompressed Size

        This field indicates the Uncompressed Size of the respective
        Block in bytes. The value is stored using the encoding
        described in Section 1.2.


4.4. Index Padding

        This field MUST contain 0-3 null bytes to pad the Index to
        a multiple of four bytes. If any of the bytes are not null
        bytes, the decoder MUST indicate an error.


4.5. CRC32

        The CRC32 is calculated over everything in the Index field
        except the CRC32 field itself. The CRC32 is stored as an
        unsigned 32-bit little endian integer. If the calculated
        value does not match the stored one, the decoder MUST indicate
        an error.


5. Filter Chains

        The Block Flags field defines how many filters are used. When
        more than one filter is used, the filters are chained; that is,
        the output of one filter is the input of another filter. The
        following figure illustrates the direction of data flow.

                v   Uncompressed Data  ^
                |       Filter 0       |
        Encoder |       Filter 1       | Decoder
                |       Filter n       |
                v    Compressed Data   ^


5.1. Alignment

        Alignment of uncompressed input data is usually the job of
        the application producing the data. For example, to get the
        best results, an archiver tool should make sure that all
        PowerPC executable files in the archive stream start at
        offsets that are multiples of four bytes.

        Some filters, for example LZMA2, can be configured to take
        advantage of specified alignment of input data. Note that
        taking advantage of aligned input can also be beneficial when
        a filter is not the first filter in the chain. For example,
        if you compress PowerPC executables, you may want to use the
        PowerPC filter and chain that with the LZMA2 filter. Because
        not only the input but also the output alignment of the PowerPC
        filter is four bytes, it is now beneficial to set LZMA2
        settings so that the LZMA2 encoder can take advantage of its
        four-byte-aligned input data.

        The output of the last filter in the chain is stored to the
        Compressed Data field, which is guaranteed to be aligned
        to a multiple of four bytes relative to the beginning of the
        Stream. This can increase
          - speed, if the filtered data is handled multiple bytes at
            a time by the filter-specific encoder and decoder,
            because accessing aligned data in computer memory is
            usually faster; and
          - compression ratio, if the output data is later compressed
            with an external compression tool.


5.2. Security

        If filters could be chained freely, it would be possible to
        create malicious files that would be very slow to decode.
        Such files could be used to create denial of service attacks.

        Slow files could occur when multiple filters are chained:

            v   Compressed input data
            |   Filter 1 decoder (last filter)
            |   Filter 0 decoder (non-last filter)
            v   Uncompressed output data

        The decoder of the last filter in the chain produces a lot of
        output from little input. Another filter in the chain takes the
        output of the last filter, and produces very little output
        while consuming a lot of input. As a result, a lot of data is
        moved inside the filter chain, but the filter chain as a whole
        gets very little work done.

        To prevent such slow files, there are restrictions on how
        the filters can be chained. These restrictions MUST be
        taken into account when designing new filters.

        The maximum number of filters in the chain has been limited to
        four, so there can be at most three non-last filters.
        Of these three non-last filters, only two are allowed to change
        the size of the data.

        Non-last filters that change the size of the data MUST have
        a limit on how much the decoder can compress the data: the
        decoder SHOULD produce at least n bytes of output when the
        filter is given 2n bytes of input. This limit is not
        absolute, but significant deviations MUST be avoided.

        The above limitations guarantee that if the last filter in the
        chain produces 4n bytes of output, the chain as a whole will
        produce at least n bytes of output.
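
        The chaining limits above can be sketched as a small C check.
        This is only an illustration, not part of the format: the
        struct, the constants, and the function name are hypothetical,
        and the three flags mirror the "Changes size of data", "Allow
        as a non-last filter", and "Allow as the last filter"
        properties listed for each filter in Section 5.3. Rejecting an
        empty chain is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical per-filter attributes (names are assumptions made for
// this sketch; they mirror the property lists in Section 5.3).
struct filter_info {
    bool changes_size;    // "Changes size of data"
    bool allow_non_last;  // "Allow as a non-last filter"
    bool allow_last;      // "Allow as the last filter"
};

enum {
    MAX_FILTERS = 4,                // At most four filters in a chain
    MAX_SIZE_CHANGING_NON_LAST = 2  // Of the non-last ones, two may resize
};

// Returns true if the chain obeys the limits described in this section.
// chain[0] is the first filter applied by the encoder and
// chain[count - 1] is the last filter.
static bool
chain_is_allowed(const struct filter_info *chain, size_t count)
{
    assert(chain != NULL || count == 0);

    // Assumption: an empty chain is treated as invalid.
    if (count == 0 || count > MAX_FILTERS)
        return false;

    size_t size_changing = 0;
    for (size_t i = 0; i + 1 < count; ++i) {
        if (!chain[i].allow_non_last)
            return false;
        if (chain[i].changes_size)
            ++size_changing;
    }

    if (size_changing > MAX_SIZE_CHANGING_NON_LAST)
        return false;

    return chain[count - 1].allow_last;
}
```

        With the properties listed in Section 5.3, a chain of Delta
        followed by LZMA2 passes this check, while LZMA2 followed by
        Delta does not, because LZMA2 is not allowed as a non-last
        filter and Delta is not allowed as the last one.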


5.3. Filters

5.3.1. LZMA2

        LZMA (Lempel-Ziv-Markov chain-Algorithm) is a general-purpose
        compression algorithm with high compression ratio and fast
        decompression. LZMA is based on LZ77 and range coding
        algorithms.

        LZMA2 is an extension on top of the original LZMA. LZMA2 uses
        LZMA internally, but adds support for flushing the encoder
        and for uncompressed chunks, eases stateful decoder
        implementations, and improves support for multithreading.
        Thus, the plain LZMA will not be supported in this file format.

            Filter ID:                  0x21
            Size of Filter Properties:  1 byte
            Changes size of data:       Yes
            Allow as a non-last filter: No
            Allow as the last filter:   Yes

            Preferred alignment:
                Input data:             Adjustable to 1/2/4/8/16 byte(s)
                Output data:            1 byte

        The format of the one-byte Filter Properties field is as
        follows:

            Bits   Mask   Description
            0-5    0x3F   Dictionary Size
            6-7    0xC0   Reserved for future use; MUST be zero for now.

        Dictionary Size is encoded with one-bit mantissa and five-bit
        exponent. The smallest dictionary size is 4 KiB and the biggest
        is 4 GiB.

            Raw value   Mantissa   Exponent   Dictionary size
                0           2         11         4 KiB
                1           3         11         6 KiB
                2           2         12         8 KiB
                3           3         12        12 KiB
                4           2         13        16 KiB
                5           3         13        24 KiB
                6           2         14        32 KiB
              ...         ...        ...      ...
               35           3         28       768 MiB
               36           2         29      1024 MiB
               37           3         29      1536 MiB
               38           2         30      2048 MiB
               39           3         30      3072 MiB
               40           2         31      4096 MiB - 1 B

        Instead of having a table in the decoder, the dictionary size
        can be decoded using the following C code:

            const uint8_t bits = get_dictionary_flags() & 0x3F;
            if (bits > 40)
                return DICTIONARY_TOO_BIG; // Bigger than 4 GiB

            uint32_t dictionary_size;
            if (bits == 40) {
                dictionary_size = UINT32_MAX;
            } else {
                dictionary_size = 2 | (bits & 1);
                dictionary_size <<= bits / 2 + 11;
            }


5.3.2. Branch/Call/Jump Filters for Executables

        These filters convert relative branch, call, and jump
        instructions to their absolute counterparts in executable
        files. This conversion increases redundancy and thus
        compression ratio.

            Size of Filter Properties:  0 or 4 bytes
            Changes size of data:       No
            Allow as a non-last filter: Yes
            Allow as the last filter:   No

        Below is the list of filters in this category. The alignment
        is the same for both input and output data.

            Filter ID   Alignment   Description
              0x04       1 byte     x86 filter (BCJ)
              0x05       4 bytes    PowerPC (big endian) filter
              0x06      16 bytes    IA64 filter
              0x07       4 bytes    ARM filter [1]
              0x08       2 bytes    ARM Thumb filter [1]
              0x09       4 bytes    SPARC filter
              0x0A       4 bytes    ARM64 filter [2]
              0x0B       2 bytes    RISC-V filter

              [1] These are for little endian instruction encoding.
                  This must not be confused with data endianness.
                  A processor configured for big endian data access
                  may still use little endian instruction encoding.
                  The filters don't care about the data endianness.

              [2] 4096-byte alignment gives the best results
                  because the address in the ADRP instruction
                  is a multiple of 4096 bytes.

        If the size of Filter Properties is four bytes, the Filter
        Properties field contains the start offset used for address
        conversions. It is stored as an unsigned 32-bit little endian
        integer. The start offset MUST be a multiple of the alignment
        of the filter as listed in the table above; if it isn't, the
        decoder MUST indicate an error. If the size of Filter
        Properties is zero, the start offset is zero.

        Setting the start offset may be useful if an executable has
        multiple sections, and there are many cross-section calls.
        Taking advantage of this feature usually requires usage of
        the Subblock filter, whose design is not complete yet.


5.3.3. Delta

        The Delta filter may increase compression ratio when the value
        of the next byte correlates with the value of an earlier byte
        at a specified distance.

            Filter ID:                  0x03
            Size of Filter Properties:  1 byte
            Changes size of data:       No
            Allow as a non-last filter: Yes
            Allow as the last filter:   No

            Preferred alignment:
                Input data:             1 byte
                Output data:            Same as the original input data

        The Properties byte indicates the delta distance, which can be
        1-256 bytes backwards from the current byte: 0x00 indicates
        a distance of 1 byte and 0xFF a distance of 256 bytes.


5.3.3.1. Format of the Encoded Output

        The code below illustrates both encoding and decoding with
        the Delta filter.

            // Distance is in the range [1, 256].
            const unsigned int distance = get_properties_byte() + 1;

            // delta[] is a circular history of the last 256 unfiltered
            // bytes. pos moves backwards and wraps around modulo 256.
            uint8_t pos = 0;
            uint8_t delta[256];

            memset(delta, 0, sizeof(delta));

            while (1) {
                const int byte = read_byte();
                if (byte == EOF)
                    break;

                // The unfiltered byte stored `distance` steps earlier.
                uint8_t tmp = delta[(uint8_t)(distance + pos)];
                if (is_encoder) {
                    tmp = (uint8_t)(byte) - tmp;
                    delta[pos] = (uint8_t)(byte);
                } else {
                    tmp = (uint8_t)(byte) + tmp;
                    delta[pos] = tmp;
                }

                write_byte(tmp);
                --pos;
            }


5.4. Custom Filter IDs

        If a developer wants to use custom Filter IDs, there are two
        choices. The first choice is to contact Lasse Collin and ask
        him to allocate a range of IDs for the developer.

        The second choice is to generate a 40-bit random integer
        which the developer can use as a personal Developer ID.
        To minimize the risk of collisions, the Developer ID has to be
        a randomly generated integer, not a manually selected "hex
        word". The following command, which works on many free
        operating systems, can be used to generate a Developer ID:

            dd if=/dev/urandom bs=5 count=1 | hexdump

        The developer can then use the Developer ID to create unique
        (well, hopefully unique) Filter IDs.

            Bits    Mask                     Description
             0-15   0x0000_0000_0000_FFFF    Filter ID
            16-55   0x00FF_FFFF_FFFF_0000    Developer ID
            56-62   0x3F00_0000_0000_0000    Static prefix: 0x3F

        The resulting 63-bit integer will use 9 bytes of space when
        stored using the encoding described in Section 1.2.
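
        As a sketch of the bit layout above, the following C fragment
        builds a custom Filter ID and counts the bytes it would occupy.
        It assumes that the encoding of Section 1.2 carries seven bits
        of the value in each byte. The function names and the sample
        Developer ID used below are hypothetical and are not part of
        this specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Combines the static prefix 0x3F (bits 56-62), a 40-bit Developer ID
// (bits 16-55), and a 16-bit developer-chosen number (bits 0-15).
// Function and parameter names are assumptions made for this sketch.
static uint64_t
make_custom_filter_id(uint64_t developer_id, uint16_t filter_number)
{
    // The Developer ID must fit in 40 bits.
    assert(developer_id <= UINT64_C(0xFFFFFFFFFF));

    return (UINT64_C(0x3F) << 56)
            | (developer_id << 16)
            | (uint64_t)filter_number;
}

// Counts the bytes needed for a value, assuming seven payload bits
// per byte as in the encoding of Section 1.2.
static size_t
multibyte_size(uint64_t value)
{
    size_t size = 1;
    while (value >= 0x80) {
        value >>= 7;
        ++size;
    }
    return size;
}
```

        For example, with the made-up Developer ID 0x0123456789 and
        filter number 0x0001, make_custom_filter_id() gives
        0x3F01234567890001. Because the static prefix always sets
        bit 61, multibyte_size() reports 9 bytes for every ID built
        this way.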
        To get a shorter ID, see the beginning of this Section for how
        to request a custom ID range.


5.4.1. Reserved Custom Filter ID Ranges

            Range                       Description
            0x0000_0300 - 0x0000_04FF   Reserved to ease .7z compatibility
            0x0002_0000 - 0x0007_FFFF   Reserved to ease .7z compatibility
            0x0200_0000 - 0x07FF_FFFF   Reserved to ease .7z compatibility


6. Cyclic Redundancy Checks

        There are several incompatible variations to calculate CRC32
        and CRC64. For simplicity and clarity, complete examples are
        provided to calculate the checks as they are used in this file
        format. Implementations MAY use different code as long as it
        gives identical results.

        The program below reads data from standard input, calculates
        the CRC32 and CRC64 values, and prints the calculated values
        as big endian hexadecimal strings to standard output.

        #include <stddef.h>
        #include <inttypes.h>
        #include <stdio.h>

        uint32_t crc32_table[256];
        uint64_t crc64_table[256];

        void
        init(void)
        {
            static const uint32_t poly32 = UINT32_C(0xEDB88320);
            static const uint64_t poly64
                    = UINT64_C(0xC96C5795D7870F42);

            for (size_t i = 0; i < 256; ++i) {
                uint32_t crc32 = i;
                uint64_t crc64 = i;

                for (size_t j = 0; j < 8; ++j) {
                    if (crc32 & 1)
                        crc32 = (crc32 >> 1) ^ poly32;
                    else
                        crc32 >>= 1;

                    if (crc64 & 1)
                        crc64 = (crc64 >> 1) ^ poly64;
                    else
                        crc64 >>= 1;
                }

                crc32_table[i] = crc32;
                crc64_table[i] = crc64;
            }
        }

        uint32_t
        crc32(const uint8_t *buf, size_t size, uint32_t crc)
        {
            crc = ~crc;
            for (size_t i = 0; i < size; ++i)
                crc = crc32_table[buf[i] ^ (crc & 0xFF)]
                        ^ (crc >> 8);
            return ~crc;
        }

        uint64_t
        crc64(const uint8_t *buf, size_t size, uint64_t crc)
        {
            crc = ~crc;
            for (size_t i = 0; i < size; ++i)
                crc = crc64_table[buf[i] ^ (crc & 0xFF)]
                        ^ (crc >> 8);
            return ~crc;
        }

        int
        main(void)
        {
            init();

            uint32_t value32 = 0;
            uint64_t value64 = 0;
            uint64_t total_size = 0;
            uint8_t buf[8192];

            while (1) {
                const size_t buf_size
                        = fread(buf, 1, sizeof(buf), stdin);
                if (buf_size == 0)
                    break;

                total_size += buf_size;
                value32 = crc32(buf, buf_size, value32);
                value64 = crc64(buf, buf_size, value64);
            }

            printf("Bytes:  %" PRIu64 "\n", total_size);
            printf("CRC-32: 0x%08" PRIX32 "\n", value32);
            printf("CRC-64: 0x%016" PRIX64 "\n", value64);

            return 0;
        }


7. References

        LZMA SDK - The original LZMA implementation
        https://7-zip.org/sdk.html

        LZMA Utils - LZMA adapted to POSIX-like systems
        https://tukaani.org/lzma/

        XZ Utils - The next generation of LZMA Utils
        https://tukaani.org/xz/

        [RFC-1952]
        GZIP file format specification version 4.3
        https://www.ietf.org/rfc/rfc1952.txt
          - Notation of byte boxes in section "2.1. Overall conventions"

        [RFC-2119]
        Key words for use in RFCs to Indicate Requirement Levels
        https://www.ietf.org/rfc/rfc2119.txt

        [GNU-tar]
        GNU tar 1.35 manual
        https://www.gnu.org/software/tar/manual/html_node/Blocking-Factor.html
          - Node 9.4.2 "Blocking Factor", paragraph that begins
            "gzip will complain about trailing garbage"
          - Note that this URL points to the latest version of the
            manual, and may some day not contain the note which is in
            1.35. For the exact version of the manual, download GNU
            tar 1.35: ftp://ftp.gnu.org/pub/gnu/tar/tar-1.35.tar.gz