The .lzma File Format
=====================

0. Preface
  0.1. Notices and Acknowledgements
  0.2. Changes
1. File Format
  1.1. Header
    1.1.1. Properties
    1.1.2. Dictionary Size
    1.1.3. Uncompressed Size
  1.2. LZMA Compressed Data
2. References


0. Preface

This document describes the .lzma file format, which is
sometimes also called the LZMA_Alone format. It is a legacy file
format, which is being or has been replaced by the .xz format.
The MIME type of the .lzma format is `application/x-lzma'.

The most commonly used software packages for handling .lzma
files are LZMA SDK, LZMA Utils, 7-Zip, and XZ Utils. This
document describes some of the differences between these
implementations and gives hints about which subset of the
.lzma format is the most portable.


0.1. Notices and Acknowledgements

This file format was designed by Igor Pavlov for use in
LZMA SDK. This document was written by Lasse Collin
<lasse.collin@tukaani.org> using the documentation found
in the LZMA SDK.

This document has been put into the public domain.


0.2. Changes

Last modified: 2024-04-08 17:35+0300

From version 2011-04-12 11:55+0300 to 2022-07-13 21:00+0300:
Section 1.1.3 was modified to allow End of Payload Marker
with a known Uncompressed Size.


1. File Format

    +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+
    |         Header          |   LZMA Compressed Data   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+

A .lzma file consists of a 13-byte Header followed by
the LZMA Compressed Data.

Unlike the .gz, .bz2, and .xz formats, it is not possible to
concatenate multiple .lzma files as is and expect the
decompression tool to decode the resulting file as if it were
a single .lzma file.

For example, the command line tools from LZMA Utils and
LZMA SDK silently ignore all the data after the first .lzma
stream. In contrast, the command line tool from XZ Utils
considers the .lzma file to be corrupt if there is data after
the first .lzma stream.


1.1. Header

    +------------+----+----+----+----+--+--+--+--+--+--+--+--+
    | Properties |  Dictionary Size  |   Uncompressed Size   |
    +------------+----+----+----+----+--+--+--+--+--+--+--+--+

The Header is 13 bytes long: a one-byte Properties field,
a four-byte Dictionary Size field, and an eight-byte
Uncompressed Size field.


1.1.1. Properties

The Properties field contains three properties. An abbreviation
is given in parentheses, followed by the value range of the
property. The field consists of

    1) the number of literal context bits (lc, [0, 8]);
    2) the number of literal position bits (lp, [0, 4]); and
    3) the number of position bits (pb, [0, 4]).

The properties are encoded using the following formula:

    Properties = (pb * 5 + lp) * 9 + lc
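
As a minimal sketch of the formula above, the following C function
packs the three values into the Properties byte. The name
props_encode and the -1 error return are assumptions of this
example; they are not taken from any of the implementations
mentioned in this document.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs lc/lp/pb into the Properties byte.
 * Returns the byte value (0..224), or -1 if any input is outside
 * the ranges lc [0, 8], lp [0, 4], pb [0, 4]. */
int props_encode(unsigned lc, unsigned lp, unsigned pb)
{
    if (lc > 8 || lp > 4 || pb > 4)
        return -1;

    unsigned props = (pb * 5 + lp) * 9 + lc;
    assert(props <= UINT8_MAX);  /* the maximum is (4 * 5 + 4) * 9 + 8 = 224 */
    return (int)props;
}
```

For example, lc/lp/pb = 3/0/2 gives (2 * 5 + 0) * 9 + 3 = 93 (0x5D).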

The following C code illustrates a straightforward way to
decode the Properties field:

    uint8_t lc, lp, pb;
    uint8_t prop = get_lzma_properties();

    /* The largest valid value is (4 * 5 + 4) * 9 + 8 = 224. */
    if (prop > (4 * 5 + 4) * 9 + 8)
        return LZMA_PROPERTIES_ERROR;

    /* Undo the encoding formula, starting with pb. */
    pb = prop / (9 * 5);
    prop -= pb * 9 * 5;
    lp = prop / 9;
    lc = prop - lp * 9;

XZ Utils has an additional requirement: lc + lp <= 4. Files
which don't follow this requirement cannot be decompressed
with XZ Utils. Usually this isn't a problem since the most
common lc/lp/pb values are 3/0/2. It is the only lc/lp/pb
combination that the files created by LZMA Utils can have,
but LZMA Utils can decompress files with any lc/lp/pb.
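
The XZ Utils restriction can be checked directly from the
Properties byte. The sketch below is illustrative only:
props_ok_for_xz_utils is a hypothetical name, and the code is not
taken from XZ Utils.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the Properties byte is within the
 * valid range and also satisfies the XZ Utils rule lc + lp <= 4. */
bool props_ok_for_xz_utils(uint8_t prop)
{
    if (prop > (4 * 5 + 4) * 9 + 8)
        return false;

    /* prop = (pb * 5 + lp) * 9 + lc, so lc and lp fall out as remainders. */
    unsigned lc = prop % 9;
    unsigned lp = (prop / 9) % 5;
    assert(lc <= 8 && lp <= 4);

    return lc + lp <= 4;
}
```

The byte 0x5D (lc/lp/pb = 3/0/2) passes this check, while a byte
encoding lc = 8 does not.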


1.1.2. Dictionary Size

Dictionary Size is stored as an unsigned 32-bit little endian
integer. Any 32-bit value is possible, but for maximum
portability, only sizes of 2^n and 2^n + 2^(n-1) should be
used.

LZMA Utils creates only files with dictionary size 2^n,
16 <= n <= 25. LZMA Utils can decompress files with any
dictionary size.

XZ Utils creates and decompresses .lzma files only with
dictionary sizes 2^n and 2^n + 2^(n-1). If some other
dictionary size is specified when compressing, the value
stored in the Dictionary Size field is rounded up, but the
specified value is still used in the actual compression code.
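
The two portable forms and the rounding can be sketched as
follows. Both function names are hypothetical, the return value 0
for sizes that cannot be rounded up within 32 bits is an assumption
of this example, and the code is not the implementation used by
XZ Utils.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if d equals 2^n or 2^n + 2^(n-1) for some n.
 * For n = 0 the second form collapses to 1, which is harmless. */
bool dict_size_is_portable(uint32_t d)
{
    for (unsigned n = 0; n < 32; ++n) {
        uint32_t pow2 = UINT32_C(1) << n;
        uint32_t mid = pow2 + (pow2 >> 1);  /* 2^n + 2^(n-1) */
        if (d == pow2 || d == mid)
            return true;
    }
    return false;
}

/* Smallest size of the form 2^n or 2^n + 2^(n-1) that is >= d.
 * Returns 0 if no such size fits in 32 bits (assumption of this sketch). */
uint32_t dict_size_round_up(uint32_t d)
{
    for (unsigned n = 0; n < 32; ++n) {
        uint32_t pow2 = UINT32_C(1) << n;
        uint32_t mid = pow2 + (pow2 >> 1);
        uint32_t r = 0;

        if (pow2 >= d)
            r = pow2;
        else if (mid >= d)
            r = mid;

        if (r != 0) {
            assert(dict_size_is_portable(r));
            return r;
        }
    }
    return 0;
}
```

With these definitions, a requested size of 5 MiB is not portable
and rounds up to 6 MiB (2^22 + 2^21).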


1.1.3. Uncompressed Size

Uncompressed Size is stored as an unsigned 64-bit little endian
integer. A special value of 0xFFFF_FFFF_FFFF_FFFF indicates
that Uncompressed Size is unknown. End of Payload Marker (*)
is used if Uncompressed Size is unknown. End of Payload Marker
is allowed but rarely used if Uncompressed Size is known.
XZ Utils 5.2.5 and older don't support .lzma files that have
End of Payload Marker together with a known Uncompressed Size.
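
Putting the three fields together, the 13-byte Header could be
split as sketched below. The struct, macro, and function names are
hypothetical. The sketch only decodes the little endian fields and
flags the unknown-size value; it does not validate Properties or
Dictionary Size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ALONE_HEADER_SIZE 13
#define ALONE_SIZE_UNKNOWN UINT64_MAX  /* 0xFFFF_FFFF_FFFF_FFFF */

/* Hypothetical container for the three Header fields. */
struct alone_header {
    uint8_t  props;              /* raw Properties byte */
    uint32_t dict_size;          /* Dictionary Size */
    uint64_t uncompressed_size;  /* Uncompressed Size as stored */
    bool     size_known;         /* false if the special value was stored */
};

/* Reads len little endian bytes (len <= 8) into an integer. */
static uint64_t read_le(const uint8_t *p, size_t len)
{
    assert(p != NULL && len <= 8);

    uint64_t v = 0;
    for (size_t i = 0; i < len; ++i)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Splits the Header bytes into fields. Returns false if fewer than
 * 13 bytes are available. */
bool alone_header_parse(const uint8_t *buf, size_t len,
                        struct alone_header *out)
{
    assert(buf != NULL && out != NULL);

    if (len < ALONE_HEADER_SIZE)
        return false;

    out->props = buf[0];
    out->dict_size = (uint32_t)read_le(buf + 1, 4);
    out->uncompressed_size = read_le(buf + 5, 8);
    out->size_known = out->uncompressed_size != ALONE_SIZE_UNKNOWN;
    return true;
}
```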

XZ Utils rejects files whose Uncompressed Size field specifies
a known size that is 256 GiB or more. This is to reject false
positives when trying to guess if the input file is in the
.lzma format. When Uncompressed Size is unknown, there is no
limit for the uncompressed size of the file.
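
A tool that guesses the file type could apply the same kind of
check. The following is a minimal sketch, assuming only the 256 GiB
limit described above; the names are hypothetical and the code is
not taken from XZ Utils.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GIB (UINT64_C(1) << 30)
#define KNOWN_SIZE_LIMIT (UINT64_C(256) * GIB)

/* An unknown size (all bits set) is always accepted; a known size
 * must be below 256 GiB. */
bool uncompressed_size_plausible(uint64_t size_field)
{
    /* 256 GiB is 2^38 bytes. */
    assert(KNOWN_SIZE_LIMIT == UINT64_C(1) << 38);

    if (size_field == UINT64_MAX)
        return true;

    return size_field < KNOWN_SIZE_LIMIT;
}
```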

(*) Some tools use the term End of Stream (EOS) marker
    instead of End of Payload Marker.


1.2. LZMA Compressed Data

A detailed description of the format of this field is outside
the scope of this document.


2. References

LZMA SDK - The original LZMA implementation
    https://7-zip.org/sdk.html

7-Zip
    https://7-zip.org/

LZMA Utils - LZMA adapted to POSIX-like systems
    https://tukaani.org/lzma/

XZ Utils - The next generation of LZMA Utils
    https://tukaani.org/xz/

The .xz file format - The successor of the .lzma format
    https://tukaani.org/xz/xz-file-format.txt