csp2: CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/share/doc/xz/lzma-file-format.txt annotate

annotate CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/share/doc/xz/lzma-file-format.txt @ 68:5028fdace37b

planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d

author	jpayne
date	Tue, 18 Mar 2025 16:23:26 -0400

rev	line source
jpayne@68	1
jpayne@68	2 The .lzma File Format
jpayne@68	3 =====================
jpayne@68	4
jpayne@68	5 0. Preface
jpayne@68	6 0.1. Notices and Acknowledgements
jpayne@68	7 0.2. Changes
jpayne@68	8 1. File Format
jpayne@68	9 1.1. Header
jpayne@68	10 1.1.1. Properties
jpayne@68	11 1.1.2. Dictionary Size
jpayne@68	12 1.1.3. Uncompressed Size
jpayne@68	13 1.2. LZMA Compressed Data
jpayne@68	14 2. References
jpayne@68	15
jpayne@68	16
jpayne@68	17 0. Preface
jpayne@68	18
jpayne@68	19 This document describes the .lzma file format, which is
jpayne@68	20 sometimes also called LZMA_Alone format. It is a legacy file
jpayne@68	21 format, which is being or has been replaced by the .xz format.
jpayne@68	22 The MIME type of the .lzma format is `application/x-lzma'.
jpayne@68	23
jpayne@68	24 The most commonly used software packages for handling .lzma
jpayne@68	25 files are LZMA SDK, LZMA Utils, 7-Zip, and XZ Utils. This document
jpayne@68	26 describes some of the differences between these implementations
jpayne@68	27 and gives hints about which subset of the .lzma format is the
jpayne@68	28 most portable.
jpayne@68	29
jpayne@68	30
jpayne@68	31 0.1. Notices and Acknowledgements
jpayne@68	32
jpayne@68	33 This file format was designed by Igor Pavlov for use in
jpayne@68	34 LZMA SDK. This document was written by Lasse Collin
jpayne@68	35 <lasse.collin@tukaani.org> using the documentation found
jpayne@68	36 in the LZMA SDK.
jpayne@68	37
jpayne@68	38 This document has been put into the public domain.
jpayne@68	39
jpayne@68	40
jpayne@68	41 0.2. Changes
jpayne@68	42
jpayne@68	43 Last modified: 2024-04-08 17:35+0300
jpayne@68	44
jpayne@68	45 From version 2011-04-12 11:55+0300 to 2022-07-13 21:00+0300:
jpayne@68	46 Section 1.1.3 was modified to allow End of Payload Marker
jpayne@68	47 with a known Uncompressed Size.
jpayne@68	48
jpayne@68	49
jpayne@68	50 1. File Format
jpayne@68	51
jpayne@68	52 +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+
jpayne@68	53 |         Header          |   LZMA Compressed Data   |
jpayne@68	54 +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+
jpayne@68	55
jpayne@68	56 A .lzma format file consists of a 13-byte Header followed by
jpayne@68	57 the LZMA Compressed Data.
jpayne@68	58
jpayne@68	59 Unlike the .gz, .bz2, and .xz formats, it is not possible to
jpayne@68	60 concatenate multiple .lzma files as is and expect the
jpayne@68	61 decompression tool to decode the resulting file as if it were
jpayne@68	62 a single .lzma file.
jpayne@68	63
jpayne@68	64 For example, the command line tools from LZMA Utils and
jpayne@68	65 LZMA SDK silently ignore all the data after the first .lzma
jpayne@68	66 stream. In contrast, the command line tool from XZ Utils
jpayne@68	67 considers the .lzma file to be corrupt if there is data after
jpayne@68	68 the first .lzma stream.
jpayne@68	69
jpayne@68	70
jpayne@68	71 1.1. Header
jpayne@68	72
jpayne@68	73 +------------+----+----+----+----+--+--+--+--+--+--+--+--+
jpayne@68	74 | Properties |  Dictionary Size  |   Uncompressed Size   |
jpayne@68	75 +------------+----+----+----+----+--+--+--+--+--+--+--+--+
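
The layout above can be sketched in C as a function that splits the 13 header bytes into the three fields. This is a minimal illustration, not code from any of the implementations named in this document. The struct and function names are hypothetical, and the little endian byte order of the two integer fields is the one described in sections 1.1.2 and 1.1.3. No validation is done at this stage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LZMA_ALONE_HEADER_SIZE 13

/* Hypothetical container for the three raw header fields. */
struct lzma_alone_header {
    uint8_t  properties;        /* byte 0 */
    uint32_t dict_size;         /* bytes 1-4, little endian */
    uint64_t uncompressed_size; /* bytes 5-12, little endian */
};

/* Read an unsigned little endian integer of len bytes (len <= 8). */
static uint64_t read_le(const uint8_t *buf, size_t len)
{
    uint64_t value = 0;
    for (size_t i = 0; i < len; ++i)
        value |= (uint64_t)buf[i] << (8 * i);
    return value;
}

/* Split the 13 header bytes into fields; the values are not validated. */
void lzma_alone_header_parse(const uint8_t buf[LZMA_ALONE_HEADER_SIZE],
                             struct lzma_alone_header *out)
{
    assert(buf != NULL && out != NULL);
    out->properties = buf[0];
    out->dict_size = (uint32_t)read_le(buf + 1, 4);
    out->uncompressed_size = read_le(buf + 5, 8);
}
```

A caller would read the first 13 bytes of the file into a buffer, pass it to the function, and then check each field as described in the following sections.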
jpayne@68	76
jpayne@68	77
jpayne@68	78 1.1.1. Properties
jpayne@68	79
jpayne@68	80 The Properties field contains three properties. An abbreviation
jpayne@68	81 is given in parentheses, followed by the value range of the
jpayne@68	82 property. The field consists of
jpayne@68	83
jpayne@68	84 1) the number of literal context bits (lc, [0, 8]);
jpayne@68	85 2) the number of literal position bits (lp, [0, 4]); and
jpayne@68	86 3) the number of position bits (pb, [0, 4]).
jpayne@68	87
jpayne@68	88 The properties are encoded using the following formula:
jpayne@68	89
jpayne@68	90 Properties = (pb * 5 + lp) * 9 + lc
jpayne@68	91
jpayne@68	92 The following C code illustrates a straightforward way to
jpayne@68	93 decode the Properties field:
jpayne@68	94
jpayne@68	95 uint8_t lc, lp, pb;
jpayne@68	96 uint8_t prop = get_lzma_properties();
jpayne@68	97 if (prop > (4 * 5 + 4) * 9 + 8)
jpayne@68	98     return LZMA_PROPERTIES_ERROR;
jpayne@68	99
jpayne@68	100 pb = prop / (9 * 5);
jpayne@68	101 prop -= pb * 9 * 5;
jpayne@68	102 lp = prop / 9;
jpayne@68	103 lc = prop - lp * 9;
jpayne@68	104
jpayne@68	105 XZ Utils has an additional requirement: lc + lp <= 4. Files
jpayne@68	106 which don't follow this requirement cannot be decompressed
jpayne@68	107 with XZ Utils. Usually this isn't a problem since the most
jpayne@68	108 common lc/lp/pb values are 3/0/2. It is the only lc/lp/pb
jpayne@68	109 combination that the files created by LZMA Utils can have,
jpayne@68	110 but LZMA Utils can decompress files with any lc/lp/pb.
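
For the opposite direction, the formula above can be applied directly when writing a header. The following sketch uses hypothetical function names and is not taken from any of the tools mentioned here. It rejects values outside the ranges listed above and offers a separate check for the additional XZ Utils requirement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode lc/lp/pb into the Properties byte.
 * Returns false if a value is outside the ranges lc [0, 8],
 * lp [0, 4], pb [0, 4]. The largest result is 224, so it fits
 * in one byte.
 */
bool lzma_props_encode(uint8_t lc, uint8_t lp, uint8_t pb, uint8_t *out)
{
    assert(out != NULL);
    if (lc > 8 || lp > 4 || pb > 4)
        return false;
    *out = (uint8_t)((pb * 5 + lp) * 9 + lc);
    return true;
}

/* Extra portability check: XZ Utils requires lc + lp <= 4. */
bool lzma_props_xz_compatible(uint8_t lc, uint8_t lp)
{
    return lc + lp <= 4;
}
```

With the common values lc=3, lp=0, pb=2 the encoded byte is (2 * 5 + 0) * 9 + 3 = 93, that is, 0x5D.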
jpayne@68	111
jpayne@68	112
jpayne@68	113 1.1.2. Dictionary Size
jpayne@68	114
jpayne@68	115 Dictionary Size is stored as an unsigned 32-bit little endian
jpayne@68	116 integer. Any 32-bit value is possible, but for maximum
jpayne@68	117 portability, only sizes of 2^n and 2^n + 2^(n-1) should be
jpayne@68	118 used.
jpayne@68	119
jpayne@68	120 LZMA Utils creates only files with dictionary size 2^n,
jpayne@68	121 16 <= n <= 25. LZMA Utils can decompress files with any
jpayne@68	122 dictionary size.
jpayne@68	123
jpayne@68	124 XZ Utils creates and decompresses .lzma files only with
jpayne@68	125 dictionary sizes 2^n and 2^n + 2^(n-1). If some other
jpayne@68	126 dictionary size is specified when compressing, the value
jpayne@68	127 stored in the Dictionary Size field is rounded up, but the
jpayne@68	128 specified value is still used in the actual compression code.
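
As an illustration of the portable subset, the sketch below tests whether a value has the form 2^n or 2^n + 2^(n-1), and finds the smallest such value that is not less than a given size. The function names are hypothetical, and the rounding function is only one possible way to round up; it is not claimed to be the code used by XZ Utils. Returning 0 when no suitable value fits in 32 bits is a choice made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if size is 2^n or 2^n + 2^(n-1). After the trailing zero
 * bits are removed, such values reduce to 1 or 3 respectively.
 */
bool lzma_dict_size_is_portable(uint32_t size)
{
    if (size == 0)
        return false;
    while ((size & 1) == 0)
        size >>= 1;
    return size == 1 || size == 3;
}

/*
 * Smallest value of the form 2^n or 2^n + 2^(n-1) that is >= size.
 * Returns 0 if no such value fits in 32 bits (an assumption of
 * this sketch, not behavior described by the format).
 */
uint32_t lzma_dict_size_round_up(uint32_t size)
{
    for (unsigned n = 0; n < 32; ++n) {
        uint32_t pow2 = UINT32_C(1) << n;
        if (pow2 >= size)
            return pow2;
        uint32_t mid = pow2 + (pow2 >> 1); /* 2^n + 2^(n-1) */
        if (mid >= size)
            return mid;
    }
    return 0;
}
```

For example, a requested size of 8 MiB + 1 byte rounds up to 12 MiB (2^23 + 2^22), while exactly 8 MiB is left unchanged.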
jpayne@68	129
jpayne@68	130
jpayne@68	131 1.1.3. Uncompressed Size
jpayne@68	132
jpayne@68	133 Uncompressed Size is stored as an unsigned 64-bit little endian
jpayne@68	134 integer. A special value of 0xFFFF_FFFF_FFFF_FFFF indicates
jpayne@68	135 that Uncompressed Size is unknown. End of Payload Marker (*)
jpayne@68	136 is used if Uncompressed Size is unknown. End of Payload Marker
jpayne@68	137 is allowed but rarely used if Uncompressed Size is known.
jpayne@68	138 XZ Utils 5.2.5 and older don't support .lzma files that have
jpayne@68	139 End of Payload Marker together with a known Uncompressed Size.
jpayne@68	140
jpayne@68	141 XZ Utils rejects files whose Uncompressed Size field specifies
jpayne@68	142 a known size that is 256 GiB or more. This is to reject false
jpayne@68	143 positives when trying to guess if the input file is in the
jpayne@68	144 .lzma format. When Uncompressed Size is unknown, there is no
jpayne@68	145 limit for the uncompressed size of the file.
jpayne@68	146
jpayne@68	147 (*) Some tools use the term End of Stream (EOS) marker
jpayne@68	148 instead of End of Payload Marker.
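
The rules in this section can be summarized with a small classification function. This is a sketch with hypothetical names; the 256 GiB limit mirrors the XZ Utils behavior described above and is not a requirement of the format itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LZMA_SIZE_UNKNOWN   UINT64_MAX            /* 0xFFFF_FFFF_FFFF_FFFF */
#define XZ_KNOWN_SIZE_LIMIT (UINT64_C(256) << 30) /* 256 GiB */

enum lzma_size_kind {
    LZMA_SIZE_KIND_UNKNOWN,  /* End of Payload Marker terminates the data */
    LZMA_SIZE_KIND_KNOWN,    /* field holds the uncompressed size */
    LZMA_SIZE_KIND_REJECTED  /* known size of 256 GiB or more */
};

/* Classify the 8-byte little endian Uncompressed Size field. */
enum lzma_size_kind lzma_classify_uncompressed_size(const uint8_t field[8])
{
    assert(field != NULL);

    uint64_t size = 0;
    for (size_t i = 0; i < 8; ++i)
        size |= (uint64_t)field[i] << (8 * i);

    if (size == LZMA_SIZE_UNKNOWN)
        return LZMA_SIZE_KIND_UNKNOWN;
    if (size >= XZ_KNOWN_SIZE_LIMIT)
        return LZMA_SIZE_KIND_REJECTED;
    return LZMA_SIZE_KIND_KNOWN;
}
```

The unknown-size value has to be tested first, because it is numerically larger than the 256 GiB limit and would otherwise be rejected.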
jpayne@68	149
jpayne@68	150
jpayne@68	151 1.2. LZMA Compressed Data
jpayne@68	152
jpayne@68	153 A detailed description of the format of this field is outside
jpayne@68	154 the scope of this document.
jpayne@68	155
jpayne@68	156
jpayne@68	157 2. References
jpayne@68	158
jpayne@68	159 LZMA SDK - The original LZMA implementation
jpayne@68	160 https://7-zip.org/sdk.html
jpayne@68	161
jpayne@68	162 7-Zip
jpayne@68	163 https://7-zip.org/
jpayne@68	164
jpayne@68	165 LZMA Utils - LZMA adapted to POSIX-like systems
jpayne@68	166 https://tukaani.org/lzma/
jpayne@68	167
jpayne@68	168 XZ Utils - The next generation of LZMA Utils
jpayne@68	169 https://tukaani.org/xz/
jpayne@68	170
jpayne@68	171 The .xz file format - The successor of the .lzma format
jpayne@68	172 https://tukaani.org/xz/xz-file-format.txt
jpayne@68	173
