comparison CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/share/doc/xz/lzma-file-format.txt @ 68:5028fdace37b

planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author jpayne
date Tue, 18 Mar 2025 16:23:26 -0400

The .lzma File Format
=====================

0. Preface
  0.1. Notices and Acknowledgements
  0.2. Changes
1. File Format
  1.1. Header
    1.1.1. Properties
    1.1.2. Dictionary Size
    1.1.3. Uncompressed Size
  1.2. LZMA Compressed Data
2. References


0. Preface

This document describes the .lzma file format, which is
sometimes also called the LZMA_Alone format. It is a legacy
file format that is being, or has been, replaced by the .xz
format. The MIME type of the .lzma format is
`application/x-lzma'.

The most commonly used tools for handling .lzma files are
LZMA SDK, LZMA Utils, 7-Zip, and XZ Utils. This document
describes some of the differences between these implementations
and gives hints about which subset of the .lzma format is the
most portable.


0.1. Notices and Acknowledgements

This file format was designed by Igor Pavlov for use in
LZMA SDK. This document was written by Lasse Collin
<lasse.collin@tukaani.org> using the documentation found
in the LZMA SDK.

This document has been put into the public domain.


0.2. Changes

Last modified: 2024-04-08 17:35+0300

From version 2011-04-12 11:55+0300 to 2022-07-13 21:00+0300:
Section 1.1.3 was modified to allow an End of Payload Marker
with a known Uncompressed Size.


1. File Format

    +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+
    |         Header          |   LZMA Compressed Data   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+

A .lzma format file consists of a 13-byte Header followed by
the LZMA Compressed Data.

Unlike the .gz, .bz2, and .xz formats, it is not possible to
concatenate multiple .lzma files as-is and expect the
decompression tool to decode the resulting file as if it were
a single .lzma file.

For example, the command line tools from LZMA Utils and
LZMA SDK silently ignore all the data after the first .lzma
stream. In contrast, the command line tool from XZ Utils
considers the .lzma file to be corrupt if there is data after
the first .lzma stream.


1.1. Header

    +------------+----+----+----+----+--+--+--+--+--+--+--+--+
    | Properties |  Dictionary Size  |   Uncompressed Size   |
    +------------+----+----+----+----+--+--+--+--+--+--+--+--+
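
The layout above can be sketched in C as follows. This is only
an illustration, not code from any of the tools mentioned in
this document: the names lzma_alone_header and
split_lzma_header are hypothetical. It relies on the field
widths shown in the diagram (1, 4, and 8 bytes) and on the
little endian byte order stated in sections 1.1.2 and 1.1.3,
and it does not validate the field values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three raw Header fields. */
struct lzma_alone_header {
    uint8_t  properties;        /* byte 0 */
    uint32_t dict_size;         /* bytes 1-4, little endian */
    uint64_t uncompressed_size; /* bytes 5-12, little endian */
};

/* Split a 13-byte Header into its fields without validating them.
 * Returns 0 on success, -1 if fewer than 13 bytes are available. */
int split_lzma_header(const uint8_t *buf, size_t len,
                      struct lzma_alone_header *out)
{
    if (len < 13)
        return -1;

    out->properties = buf[0];

    /* Assemble the integers byte by byte so that the result does
     * not depend on the byte order of the host. */
    out->dict_size = 0;
    for (int i = 0; i < 4; ++i)
        out->dict_size |= (uint32_t)buf[1 + i] << (8 * i);

    out->uncompressed_size = 0;
    for (int i = 0; i < 8; ++i)
        out->uncompressed_size |= (uint64_t)buf[5 + i] << (8 * i);

    return 0;
}
```

Reading the fields byte by byte avoids unaligned accesses and
keeps the sketch independent of the host byte order.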


1.1.1. Properties

The Properties field contains three properties. An abbreviation
is given in parentheses, followed by the value range of the
property. The field consists of

    1) the number of literal context bits (lc, [0, 8]);
    2) the number of literal position bits (lp, [0, 4]); and
    3) the number of position bits (pb, [0, 4]).

The properties are encoded using the following formula:

    Properties = (pb * 5 + lp) * 9 + lc

The following C code illustrates a straightforward way to
decode the Properties field:

    uint8_t lc, lp, pb;
    uint8_t prop = get_lzma_properties();

    /* The largest valid value corresponds to lc=8, lp=4, pb=4. */
    if (prop > (4 * 5 + 4) * 9 + 8)
        return LZMA_PROPERTIES_ERROR;

    /* Undo the formula from the most significant part down. */
    pb = prop / (9 * 5);
    prop -= pb * 9 * 5;
    lp = prop / 9;
    lc = prop - lp * 9;

XZ Utils has an additional requirement: lc + lp <= 4. Files
which don't follow this requirement cannot be decompressed
with XZ Utils. Usually this isn't a problem since the most
common lc/lp/pb values are 3/0/2. It is the only lc/lp/pb
combination that the files created by LZMA Utils can have,
but LZMA Utils can decompress files with any lc/lp/pb.
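
For the encoding direction, a minimal sketch of the formula and
of the extra XZ Utils restriction might look like this. The
function names are hypothetical and are not part of LZMA SDK,
LZMA Utils, or XZ Utils.

```c
#include <stdbool.h>
#include <stdint.h>

/* Encode lc/lp/pb into the Properties byte using
 * Properties = (pb * 5 + lp) * 9 + lc.
 * Returns -1 if a value is outside the ranges given in this section. */
int encode_lzma_properties(unsigned lc, unsigned lp, unsigned pb)
{
    if (lc > 8 || lp > 4 || pb > 4)
        return -1;
    return (int)((pb * 5 + lp) * 9 + lc);
}

/* Check the additional XZ Utils requirement lc + lp <= 4. */
bool xz_utils_accepts_lc_lp(unsigned lc, unsigned lp)
{
    return lc + lp <= 4;
}
```

With the common values lc/lp/pb = 3/0/2 the formula gives
(2 * 5 + 0) * 9 + 3 = 93, which is 0x5D in hexadecimal.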


1.1.2. Dictionary Size

Dictionary Size is stored as an unsigned 32-bit little endian
integer. Any 32-bit value is possible, but for maximum
portability, only sizes of 2^n and 2^n + 2^(n-1) should be
used.
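
A check for the portable sizes can be sketched as below. The
function name is hypothetical. The sketch tests only the shape
2^n or 2^n + 2^(n-1) and does not enforce any minimum or
maximum dictionary size that a particular tool might apply.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if size is 2^n or 2^n + 2^(n-1) for some integer n. */
bool is_portable_dict_size(uint32_t size)
{
    if (size == 0)
        return false;

    /* 2^n: exactly one bit is set. */
    if ((size & (size - 1)) == 0)
        return true;

    /* 2^n + 2^(n-1): exactly two bits are set and they are adjacent.
     * After removing the lowest set bit, the remainder must be a
     * single bit that is twice the lowest one. */
    uint32_t low = size & (~size + 1);
    uint32_t rest = size ^ low;
    return (rest & (rest - 1)) == 0 && rest == (low << 1);
}
```

For example, 8 MiB (2^23) and 12 MiB (2^23 + 2^22) pass the
check, while 10 MiB does not.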

LZMA Utils creates only files with dictionary size 2^n,
16 <= n <= 25. LZMA Utils can decompress files with any
dictionary size.

XZ Utils creates and decompresses .lzma files only with
dictionary sizes 2^n and 2^n + 2^(n-1). If some other
dictionary size is specified when compressing, the value
stored in the Dictionary Size field is rounded up, but the
specified value is still used in the actual compression code.
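
The rounding described above can be illustrated with a simple
loop. This is a sketch written for this document and is not
taken from the XZ Utils source code; the function name is
hypothetical. Returning UINT32_MAX for sizes above
2^31 + 2^30, and rounding 0 up to 1, are assumptions of the
sketch.

```c
#include <stdint.h>

/* Round size up to the nearest value of the form 2^n or
 * 2^n + 2^(n-1). Candidates are tried in increasing order:
 * 1, 2, 3, 4, 6, 8, 12, 16, 24, ... */
uint32_t round_up_dict_size(uint32_t size)
{
    for (unsigned n = 0; n < 32; ++n) {
        uint32_t pow2 = (uint32_t)1 << n;
        if (size <= pow2)
            return pow2;

        /* 2^n + 2^(n-1) is an integer only when n >= 1. */
        if (n >= 1) {
            uint32_t pow2_and_half = pow2 + (pow2 >> 1);
            if (size <= pow2_and_half)
                return pow2_and_half;
        }
    }

    /* No 32-bit value of the required form is large enough. */
    return UINT32_MAX;
}
```

Only the value written to the Dictionary Size field would be
rounded this way. As stated above, the size originally
specified is still the one used for compression.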


1.1.3. Uncompressed Size

Uncompressed Size is stored as an unsigned 64-bit little endian
integer. A special value of 0xFFFF_FFFF_FFFF_FFFF indicates
that Uncompressed Size is unknown. End of Payload Marker (*)
is used if Uncompressed Size is unknown. End of Payload Marker
is allowed but rarely used if Uncompressed Size is known.
XZ Utils 5.2.5 and older don't support .lzma files that have
End of Payload Marker together with a known Uncompressed Size.

XZ Utils rejects files whose Uncompressed Size field specifies
a known size that is 256 GiB or more. This is to reject false
positives when trying to guess if the input file is in the
.lzma format. When Uncompressed Size is unknown, there is no
limit for the uncompressed size of the file.

(*) Some tools use the term End of Stream (EOS) marker
    instead of End of Payload Marker.
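
The rules of this section can be sketched as follows. The enum
and function names are hypothetical and do not come from any of
the tools. The 256 GiB limit (2^38 bytes) is the XZ Utils
behavior described above; other tools may accept larger known
sizes.

```c
#include <stdint.h>

enum size_status {
    SIZE_KNOWN,    /* a known size below 256 GiB */
    SIZE_UNKNOWN,  /* 0xFFFF_FFFF_FFFF_FFFF: End of Payload Marker expected */
    SIZE_REJECTED  /* a known size of 256 GiB or more */
};

/* Decode an unsigned 64-bit little endian integer from 8 bytes. */
uint64_t read_le64(const uint8_t *buf)
{
    uint64_t value = 0;
    for (int i = 0; i < 8; ++i)
        value |= (uint64_t)buf[i] << (8 * i);
    return value;
}

/* Classify the Uncompressed Size field the way this section
 * describes for XZ Utils. */
enum size_status classify_uncompressed_size(uint64_t size)
{
    if (size == UINT64_MAX)
        return SIZE_UNKNOWN;
    if (size >= ((uint64_t)1 << 38)) /* 256 GiB */
        return SIZE_REJECTED;
    return SIZE_KNOWN;
}
```

The unknown-size value has to be tested first, because it is
also numerically larger than the 256 GiB limit.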


1.2. LZMA Compressed Data

A detailed description of the format of this field is beyond
the scope of this document.


2. References

LZMA SDK - The original LZMA implementation
https://7-zip.org/sdk.html

7-Zip
https://7-zip.org/

LZMA Utils - LZMA adapted to POSIX-like systems
https://tukaani.org/lzma/

XZ Utils - The next generation of LZMA Utils
https://tukaani.org/xz/

The .xz file format - The successor of the .lzma format
https://tukaani.org/xz/xz-file-format.txt