comparison CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/share/doc/xz/lzma-file-format.txt @ 68:5028fdace37b
planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author | jpayne |
---|---|
date | Tue, 18 Mar 2025 16:23:26 -0400 |
The .lzma File Format
=====================

    0. Preface
      0.1. Notices and Acknowledgements
      0.2. Changes
    1. File Format
      1.1. Header
        1.1.1. Properties
        1.1.2. Dictionary Size
        1.1.3. Uncompressed Size
      1.2. LZMA Compressed Data
    2. References


0. Preface

This document describes the .lzma file format, which is
sometimes also called LZMA_Alone format. It is a legacy file
format, which is being or has been replaced by the .xz format.
The MIME type of the .lzma format is `application/x-lzma'.

The most commonly used software for handling .lzma files is
LZMA SDK, LZMA Utils, 7-Zip, and XZ Utils. This document
describes some of the differences between these implementations
and gives hints about which subset of the .lzma format is the
most portable.


0.1. Notices and Acknowledgements

This file format was designed by Igor Pavlov for use in
LZMA SDK. This document was written by Lasse Collin
<lasse.collin@tukaani.org> using the documentation found
in the LZMA SDK.

This document has been put into the public domain.


0.2. Changes

Last modified: 2024-04-08 17:35+0300

From version 2011-04-12 11:55+0300 to 2022-07-13 21:00+0300:
Section 1.1.3 was modified to allow End of Payload Marker
with a known Uncompressed Size.


1. File Format

    +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+
    |         Header          |   LZMA Compressed Data   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+

A .lzma file consists of a 13-byte Header followed by
the LZMA Compressed Data.

Unlike the .gz, .bz2, and .xz formats, it is not possible to
concatenate multiple .lzma files as is and expect the
decompression tool to decode the resulting file as if it were
a single .lzma file.

For example, the command line tools from LZMA Utils and
LZMA SDK silently ignore all the data after the first .lzma
stream. In contrast, the command line tool from XZ Utils
considers the .lzma file to be corrupt if there is data after
the first .lzma stream.


1.1. Header

    +------------+----+----+----+----+--+--+--+--+--+--+--+--+
    | Properties |  Dictionary Size  |   Uncompressed Size   |
    +------------+----+----+----+----+--+--+--+--+--+--+--+--+
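
As a minimal sketch of the layout above, the 13 header bytes can
be split into the three fields by fixed offsets. The names used
here (raw_header, split_header, and the HDR_* constants) are
hypothetical and do not come from any of the tools mentioned in
this document. The multi-byte fields are kept as raw bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the offsets follow the Header diagram. */
enum {
    HDR_PROPERTIES_OFFSET  = 0,  /* 1 byte  */
    HDR_DICT_SIZE_OFFSET   = 1,  /* 4 bytes */
    HDR_UNCOMP_SIZE_OFFSET = 5,  /* 8 bytes */
    HDR_SIZE               = 13
};

struct raw_header {
    uint8_t properties;
    uint8_t dict_size[4];          /* still in on-disk byte order */
    uint8_t uncompressed_size[8];  /* still in on-disk byte order */
};

/* Returns 0 on success, -1 if fewer than 13 bytes are available. */
int split_header(const uint8_t *buf, size_t len, struct raw_header *out)
{
    if (len < HDR_SIZE)
        return -1;

    out->properties = buf[HDR_PROPERTIES_OFFSET];
    memcpy(out->dict_size, buf + HDR_DICT_SIZE_OFFSET, 4);
    memcpy(out->uncompressed_size, buf + HDR_UNCOMP_SIZE_OFFSET, 8);
    return 0;
}
```

A caller would read the first 13 bytes of the file into a buffer,
pass them to split_header(), and then interpret each field as
described in the following subsections.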


1.1.1. Properties

The Properties field contains three properties. An abbreviation
is given in parentheses, followed by the value range of the
property. The field consists of

    1) the number of literal context bits (lc, [0, 8]);
    2) the number of literal position bits (lp, [0, 4]); and
    3) the number of position bits (pb, [0, 4]).

The properties are encoded using the following formula:

    Properties = (pb * 5 + lp) * 9 + lc

The following C code illustrates a straightforward way to
decode the Properties field:

    uint8_t lc, lp, pb;
    uint8_t prop = get_lzma_properties();
    if (prop > (4 * 5 + 4) * 9 + 8)
        return LZMA_PROPERTIES_ERROR;

    pb = prop / (9 * 5);
    prop -= pb * 9 * 5;
    lp = prop / 9;
    lc = prop - lp * 9;

XZ Utils has an additional requirement: lc + lp <= 4. Files
which don't follow this requirement cannot be decompressed
with XZ Utils. Usually this isn't a problem since the most
common lc/lp/pb values are 3/0/2. It is the only lc/lp/pb
combination that the files created by LZMA Utils can have,
but LZMA Utils can decompress files with any lc/lp/pb.
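
The encoding direction can be sketched in the same way. The
function name encode_properties and its xz_compatible flag are
assumptions made for this illustration. They are not part of any
of the tools discussed here. The flag adds the lc + lp <= 4 check
described above.

```c
#include <stdint.h>

/*
 * Encode lc/lp/pb into the Properties byte using
 * Properties = (pb * 5 + lp) * 9 + lc.
 *
 * Returns 0 on success and -1 if a value is out of range.
 * If xz_compatible is nonzero, lc + lp <= 4 is also required.
 */
int encode_properties(unsigned lc, unsigned lp, unsigned pb,
                      int xz_compatible, uint8_t *out)
{
    if (lc > 8 || lp > 4 || pb > 4)
        return -1;

    if (xz_compatible && lc + lp > 4)
        return -1;

    /* The largest possible value is (4 * 5 + 4) * 9 + 8 = 224. */
    *out = (uint8_t)((pb * 5 + lp) * 9 + lc);
    return 0;
}
```

For the common lc/lp/pb values 3/0/2 the formula gives
(2 * 5 + 0) * 9 + 3 = 93, that is, 0x5D.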


1.1.2. Dictionary Size

Dictionary Size is stored as an unsigned 32-bit little endian
integer. Any 32-bit value is possible, but for maximum
portability, only sizes of 2^n and 2^n + 2^(n-1) should be
used.

LZMA Utils creates only files with dictionary size 2^n,
16 <= n <= 25. LZMA Utils can decompress files with any
dictionary size.

XZ Utils creates and decompresses .lzma files only with
dictionary sizes 2^n and 2^n + 2^(n-1). If some other
dictionary size is specified when compressing, the value
stored in the Dictionary Size field is rounded up, but the
specified value is still used in the actual compression code.
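
The portability rule can be illustrated with a short sketch. It
reads the field, tests whether a value has the form 2^n or
2^n + 2^(n-1), and rounds a value up to the next such size. The
function names are hypothetical, and the code only restates the
rule given above. It is not the code used by XZ Utils or any
other tool, which may apply further rules of their own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Read an unsigned 32-bit little endian integer. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
        | ((uint32_t)p[1] << 8)
        | ((uint32_t)p[2] << 16)
        | ((uint32_t)p[3] << 24);
}

/* True if d is 2^n, or 2^n + 2^(n-1) (which equals 3 * 2^(n-1)). */
bool dict_size_is_portable(uint32_t d)
{
    if (d == 0)
        return false;

    /* Strip trailing zero bits; 1 or 3 must remain. */
    while ((d & 1) == 0)
        d >>= 1;

    return d == 1 || d == 3;
}

/*
 * Round d up to the next 2^n or 2^n + 2^(n-1).
 * Returns 0 if no such value fits in 32 bits.
 */
uint32_t round_up_dict_size(uint32_t d)
{
    for (uint64_t pow2 = 1; pow2 <= UINT32_MAX; pow2 <<= 1) {
        if (pow2 >= d)
            return (uint32_t)pow2;

        /* Candidate between this power of two and the next. */
        uint64_t mid = pow2 + pow2 / 2;
        if (pow2 >= 2 && mid <= UINT32_MAX && mid >= d)
            return (uint32_t)mid;
    }

    return 0;
}
```

For example, the bytes 00 00 80 00 decode to 0x00800000 (8 MiB,
2^23), which is a portable size, while 0x00A00000 (10 MiB) is not
and would round up to 0x00C00000 (12 MiB, 2^23 + 2^22).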


1.1.3. Uncompressed Size

Uncompressed Size is stored as an unsigned 64-bit little endian
integer. A special value of 0xFFFF_FFFF_FFFF_FFFF indicates
that Uncompressed Size is unknown. End of Payload Marker (*)
is used if Uncompressed Size is unknown. End of Payload Marker
is allowed but rarely used if Uncompressed Size is known.
XZ Utils 5.2.5 and older don't support .lzma files that have
End of Payload Marker together with a known Uncompressed Size.

XZ Utils rejects files whose Uncompressed Size field specifies
a known size that is 256 GiB or more. This is to reject false
positives when trying to guess if the input file is in the
.lzma format. When Uncompressed Size is unknown, there is no
limit for the uncompressed size of the file.

(*) Some tools use the term End of Stream (EOS) marker
    instead of End of Payload Marker.
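
The handling of this field can be sketched as follows. The
function and macro names are assumptions made for this example.
The size check only restates the 256 GiB rule described above and
is not taken from the XZ Utils source code.

```c
#include <stdbool.h>
#include <stdint.h>

/* 0xFFFF_FFFF_FFFF_FFFF marks an unknown Uncompressed Size. */
#define UNCOMPRESSED_SIZE_UNKNOWN UINT64_MAX

/* 256 GiB = 2^38 bytes. */
#define KNOWN_SIZE_LIMIT (UINT64_C(1) << 38)

/* Read an unsigned 64-bit little endian integer. */
uint64_t read_le64(const uint8_t *p)
{
    uint64_t v = 0;

    /* Start from the most significant byte, stored last. */
    for (int i = 7; i >= 0; --i)
        v = (v << 8) | p[i];

    return v;
}

bool uncompressed_size_is_unknown(uint64_t size)
{
    return size == UNCOMPRESSED_SIZE_UNKNOWN;
}

/*
 * True if the size is unknown or is a known size below 256 GiB,
 * following the rule described for XZ Utils above.
 */
bool uncompressed_size_is_plausible(uint64_t size)
{
    return uncompressed_size_is_unknown(size)
        || size < KNOWN_SIZE_LIMIT;
}
```

A decoder that finds an unknown size has to rely on the End of
Payload Marker to detect the end of the LZMA Compressed Data.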


1.2. LZMA Compressed Data

A detailed description of the format of this field is outside
the scope of this document.


2. References

LZMA SDK - The original LZMA implementation
    https://7-zip.org/sdk.html

7-Zip
    https://7-zip.org/

LZMA Utils - LZMA adapted to POSIX-like systems
    https://tukaani.org/lzma/

XZ Utils - The next generation of LZMA Utils
    https://tukaani.org/xz/

The .xz file format - The successor of the .lzma format
    https://tukaani.org/xz/xz-file-format.txt