csp2: CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/share/doc/xz/faq.txt annotate

annotate CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/share/doc/xz/faq.txt @ 68:5028fdace37b

planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d

author	jpayne
date	Tue, 18 Mar 2025 16:23:26 -0400

rev	line source
jpayne@68	1
jpayne@68	2 XZ Utils FAQ
jpayne@68	3 ============
jpayne@68	4
jpayne@68	5 Q: What do the letters XZ mean?
jpayne@68	6
jpayne@68	7 A: Nothing. They are just two letters, which come from the file format
jpayne@68	8 suffix .xz. The .xz suffix was selected, because it seemed to be
jpayne@68	9 pretty much unused. It has no deeper meaning.
jpayne@68	10
jpayne@68	11
jpayne@68	12 Q: What are LZMA and LZMA2?
jpayne@68	13
jpayne@68	14 A: LZMA stands for Lempel-Ziv-Markov chain-Algorithm. It is the name
jpayne@68	15 of the compression algorithm designed by Igor Pavlov for 7-Zip.
jpayne@68	16 LZMA is based on LZ77 and range encoding.
jpayne@68	17
jpayne@68	18 LZMA2 is an updated version of the original LZMA to fix a couple of
jpayne@68	19 practical issues. In the context of XZ Utils, LZMA is called LZMA1 to
jpayne@68	20 emphasize that LZMA is not the same thing as LZMA2. LZMA2 is the
jpayne@68	21 primary compression algorithm in the .xz file format.
jpayne@68	22
jpayne@68	23
jpayne@68	24 Q: There are many LZMA related projects. How does XZ Utils relate to them?
jpayne@68	25
jpayne@68	26 A: 7-Zip and LZMA SDK are the original projects. LZMA SDK is roughly
jpayne@68	27 a subset of the 7-Zip source tree.
jpayne@68	28
jpayne@68	29 p7zip is 7-Zip's command-line tools ported to POSIX-like systems.
jpayne@68	30
jpayne@68	31 LZMA Utils provide a gzip-like lzma tool for POSIX-like systems.
jpayne@68	32 LZMA Utils are based on LZMA SDK. XZ Utils are the successor to
jpayne@68	33 LZMA Utils.
jpayne@68	34
jpayne@68	35 There are several other projects using LZMA. Most are more or less
jpayne@68	36 based on LZMA SDK. See <https://7-zip.org/links.html>.
jpayne@68	37
jpayne@68	38
jpayne@68	39 Q: Why is liblzma named liblzma if its primary file format is .xz?
jpayne@68	40 Shouldn't it be e.g. libxz?
jpayne@68	41
jpayne@68	42 A: When the design of the .xz format began, the idea was to replace
jpayne@68	43 the .lzma format and use the same .lzma suffix. It would have been
jpayne@68	44 quite OK to reuse the suffix when there were very few .lzma files
jpayne@68	45 around. However, the old .lzma format became popular before the
jpayne@68	46 new format was finished. The new format was renamed to .xz but the
jpayne@68	47 name of liblzma wasn't changed.
jpayne@68	48
jpayne@68	49
jpayne@68	50 Q: Do XZ Utils support the .7z format?
jpayne@68	51
jpayne@68	52 A: No. Use 7-Zip (Windows) or p7zip (POSIX-like systems) to handle .7z
jpayne@68	53 files.
jpayne@68	54
jpayne@68	55
jpayne@68	56 Q: I have many .tar.7z files. Can I convert them to .tar.xz without
jpayne@68	57 spending hours recompressing the data?
jpayne@68	58
jpayne@68	59 A: In the "extra" directory, there is a script named 7z2lzma.bash which
jpayne@68	60 is able to convert some .7z files to the .lzma format (not .xz). It
jpayne@68	61 needs the 7za (or 7z) command from p7zip. The script may silently
jpayne@68	62 produce corrupt output if certain assumptions are not met, so
jpayne@68	63 decompress the resulting .lzma file and compare it against the
jpayne@68	64 original before deleting the original file!
jpayne@68	65
jpayne@68	66
jpayne@68	67 Q: I have many .lzma files. Can I quickly convert them to the .xz format?
jpayne@68	68
jpayne@68	69 A: For now, no. Since XZ Utils supports the .lzma format, it's usually
jpayne@68	70 not too bad to keep the old files in the old format. If you want to
jpayne@68	71 do the conversion anyway, you need to decompress the .lzma files and
jpayne@68	72 then recompress to the .xz format.
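The decompress-then-recompress route described above can be sketched with Python's standard lzma module, which is a binding to liblzma. This is only an illustration: the function name lzma_to_xz and the default preset are assumptions, not part of XZ Utils, and the whole file is held in memory.

```python
import lzma


def lzma_to_xz(lzma_bytes: bytes, preset: int = 6) -> bytes:
    """Decompress legacy .lzma data and recompress it as .xz.

    Hypothetical helper for illustration; it is not part of XZ Utils.
    """
    # FORMAT_ALONE is the Python name for the legacy .lzma container.
    raw = lzma.decompress(lzma_bytes, format=lzma.FORMAT_ALONE)
    # FORMAT_XZ writes the .xz container, which uses LZMA2 by default.
    return lzma.compress(raw, format=lzma.FORMAT_XZ, preset=preset)
```

Because the data is fully recompressed, this takes as long as a normal compression run; it is not the fast conversion mentioned in the next paragraph.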
jpayne@68	73
jpayne@68	74 Technically, there is a way to make the conversion relatively fast
jpayne@68	75 (roughly twice the time that normal decompression takes). Writing
jpayne@68	76 such a tool would take quite a bit of time though, and would probably
jpayne@68	77 be useful to only a few people. If you really want such a conversion
jpayne@68	78 tool, contact Lasse Collin and offer some money.
jpayne@68	79
jpayne@68	80
jpayne@68	81 Q: I have installed xz, but my tar doesn't recognize .tar.xz files.
jpayne@68	82 How can I extract .tar.xz files?
jpayne@68	83
jpayne@68	84 A: xz -dc foo.tar.xz | tar xf -
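The same two-step idea (decompress first, then hand a plain tar stream to the tar reader) can be sketched in Python for cases where no suitable tar is available. The helper name read_tar_xz is an assumption made for this example, and it reads the members into memory instead of writing them to disk.

```python
import io
import lzma
import tarfile


def read_tar_xz(data: bytes) -> dict:
    """Return {member name: file contents} for a .tar.xz held in memory.

    Hypothetical helper; it mirrors "xz -dc foo.tar.xz | tar xf -".
    """
    files = {}
    # Step 1: the xz layer is removed by lzma.open().
    with lzma.open(io.BytesIO(data)) as tar_stream:
        # Step 2: "r|" reads an uncompressed tar stream sequentially,
        # just like tar reading from a pipe.
        with tarfile.open(fileobj=tar_stream, mode="r|") as archive:
            for member in archive:
                if member.isfile():
                    files[member.name] = archive.extractfile(member).read()
    return files
```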
jpayne@68	85
jpayne@68	86
jpayne@68	87 Q: Can I recover parts of a broken .xz file (e.g. a corrupted CD-R)?
jpayne@68	88
jpayne@68	89 A: It may be possible if the file consists of multiple blocks, which
jpayne@68	90 typically is not the case if the file was created in single-threaded
jpayne@68	91 mode. There is no recovery program yet.
jpayne@68	92
jpayne@68	93
jpayne@68	94 Q: Is (some part of) XZ Utils patented?
jpayne@68	95
jpayne@68	96 A: Lasse Collin is not aware of any patents that could affect XZ Utils.
jpayne@68	97 However, due to the nature of software patents, it's not possible to
jpayne@68	98 guarantee that XZ Utils isn't affected by any third party patent(s).
jpayne@68	99
jpayne@68	100
jpayne@68	101 Q: Where can I find documentation about the file format and algorithms?
jpayne@68	102
jpayne@68	103 A: The .xz format is documented in xz-file-format.txt. It is a container
jpayne@68	104 format only, and doesn't include descriptions of any non-trivial
jpayne@68	105 filters.
jpayne@68	106
jpayne@68	107 Documenting LZMA and LZMA2 is planned, but for now, there is no
jpayne@68	108 documentation other than the source code. Before you begin, you should know
jpayne@68	109 the basics of LZ77 and range-coding algorithms. LZMA is based on LZ77,
jpayne@68	110 but LZMA is a lot more complex. Range coding is used to compress
jpayne@68	111 the final bitstream like Huffman coding is used in Deflate.
jpayne@68	112
jpayne@68	113
jpayne@68	114 Q: I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma?
jpayne@68	115
jpayne@68	116 A: The BCJ filter is called "x86" in liblzma. BCJ2 is not included,
jpayne@68	117 because it requires using more than one encoded output stream.
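As a rough illustration of how the "x86" filter is selected through a liblzma binding, the sketch below uses Python's standard lzma module, where the filter is exposed as FILTER_X86. The helper name compress_x86 and the choice of preset 6 are assumptions for this example.

```python
import lzma

# Filter chain: the x86 BCJ filter runs first and LZMA2 is the last filter.
X86_CHAIN = [
    {"id": lzma.FILTER_X86},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]


def compress_x86(data: bytes) -> bytes:
    """Compress data to .xz with the x86 BCJ filter in front of LZMA2."""
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=X86_CHAIN)
```

The filter chain is recorded in the .xz headers, so lzma.decompress() needs no extra arguments to reverse it.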
jpayne@68	118
jpayne@68	119
jpayne@68	120 Q: I need to use a script that runs "xz -9". On a system with 256 MiB
jpayne@68	121 of RAM, xz says that it cannot allocate memory. Can I make the
jpayne@68	122 script work without modifying it?
jpayne@68	123
jpayne@68	124 A: Set a default memory usage limit for compression. You can do it e.g.
jpayne@68	125 in a shell initialization script such as ~/.bashrc or /etc/profile:
jpayne@68	126
jpayne@68	127 XZ_DEFAULTS=--memlimit-compress=150MiB
jpayne@68	128 export XZ_DEFAULTS
jpayne@68	129
jpayne@68	130 xz will then scale the compression settings down so that the given
jpayne@68	131 memory usage limit is not reached. This way xz shouldn't run out
jpayne@68	132 of memory.
jpayne@68	133
jpayne@68	134 Also check that memory-related resource limits are high enough.
jpayne@68	135 On most systems, "ulimit -a" will show the current resource limits.
jpayne@68	136
jpayne@68	137
jpayne@68	138 Q: How do I create files that can be decompressed with XZ Embedded?
jpayne@68	139
jpayne@68	140 A: See the documentation in XZ Embedded. In short, something like
jpayne@68	141 this is a good start:
jpayne@68	142
jpayne@68	143 xz --check=crc32 --lzma2=preset=6e,dict=64KiB
jpayne@68	144
jpayne@68	145 Or if a BCJ filter is needed too, e.g. if compressing
jpayne@68	146 a kernel image for PowerPC:
jpayne@68	147
jpayne@68	148 xz --check=crc32 --powerpc --lzma2=preset=6e,dict=64KiB
jpayne@68	149
jpayne@68	150 Adjust the dictionary size to get a good compromise between
jpayne@68	151 compression ratio and decompressor memory usage. Note that
jpayne@68	152 in single-call decompression mode of XZ Embedded, a big
jpayne@68	153 dictionary doesn't increase memory usage.
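For programs that compress through a liblzma binding instead of the xz command, roughly the same settings can be expressed as a filter specification. The sketch below uses Python's standard lzma module; the function name compress_for_embedded is an assumption, and the XZ Embedded documentation remains the authority on which settings its decoder accepts.

```python
import lzma


def compress_for_embedded(data: bytes, dict_size: int = 64 * 1024) -> bytes:
    """Approximate "xz --check=crc32 --lzma2=preset=6e,dict=64KiB".

    Hypothetical helper for illustration only.
    """
    filters = [{
        "id": lzma.FILTER_LZMA2,
        "preset": 6 | lzma.PRESET_EXTREME,  # corresponds to preset=6e
        "dict_size": dict_size,             # corresponds to dict=64KiB
    }]
    return lzma.compress(
        data,
        format=lzma.FORMAT_XZ,
        check=lzma.CHECK_CRC32,             # corresponds to --check=crc32
        filters=filters,
    )
```

A BCJ filter such as {"id": lzma.FILTER_POWERPC} could be placed before the LZMA2 entry to mirror the --powerpc variant.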
jpayne@68	154
jpayne@68	155
jpayne@68	156 Q: How is multi-threaded compression implemented in XZ Utils?
jpayne@68	157
jpayne@68	158 A: The simplest method is splitting the uncompressed data into blocks
jpayne@68	159 and compressing them in parallel independently of each other.
jpayne@68	160 This is currently the only threading method supported in XZ Utils.
jpayne@68	161 Since the blocks are compressed independently, they can also be
jpayne@68	162 decompressed independently. Together with the index feature in .xz,
jpayne@68	163 this allows using threads to create .xz files for random-access
jpayne@68	164 reading. This also makes threaded decompression possible.
jpayne@68	165
jpayne@68	166 The independent blocks method has a couple of disadvantages too. It
jpayne@68	167 will compress worse than a single-block method. Often the difference
jpayne@68	168 is not too big (maybe 1-2 %) but sometimes it can be too big. Also,
jpayne@68	169 the memory usage of the compressor increases linearly when adding
jpayne@68	170 threads.
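The independent-blocks idea can be approximated in a few lines of Python. Note the assumptions: this sketch is not how xz implements threading. It writes each chunk as a separate .xz stream and concatenates them, because the standard lzma module offers no block-level interface, whereas xz stores the blocks inside one stream with a shared index. The names compress_independent, block_size, and threads are hypothetical.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor


def compress_independent(data: bytes, block_size: int, threads: int = 4) -> bytes:
    """Compress fixed-size chunks in parallel, one .xz stream per chunk."""
    # Split the input; an empty input still yields one (empty) stream.
    chunks = [data[i:i + block_size]
              for i in range(0, len(data), block_size)] or [b""]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves input order, so the streams line up with the chunks.
        streams = list(pool.map(lzma.compress, chunks))
    return b"".join(streams)
```

lzma.decompress() accepts concatenated streams, so the result round-trips as a single file. Each worker keeps its own encoder state, which is the linear memory growth described above, and no chunk can refer back to data in an earlier chunk, which is where the compression ratio is lost.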
jpayne@68	171
jpayne@68	172 At least two other threading methods are possible but these haven't
jpayne@68	173 been implemented in XZ Utils:
jpayne@68	174
jpayne@68	175 Match finder parallelization has been in 7-Zip for ages. It doesn't
jpayne@68	176 affect compression ratio or memory usage significantly. Among the
jpayne@68	177 three threading methods, only this is useful when compressing small
jpayne@68	178 files (files that are not significantly bigger than the dictionary).
jpayne@68	179 Unfortunately this method scales only to about two CPU cores.
jpayne@68	180
jpayne@68	181 The third method is pigz-style threading (I use that name, because
jpayne@68	182 pigz <https://www.zlib.net/pigz/> uses that method). It doesn't
jpayne@68	183 affect compression ratio significantly and scales to many cores.
jpayne@68	184 The memory usage scales linearly when threads are added. This isn't
jpayne@68	185 significant with pigz, because Deflate uses only a 32 KiB dictionary,
jpayne@68	186 but with LZMA2 the memory usage will increase dramatically just like
jpayne@68	187 with the independent-blocks method. There is also a constant
jpayne@68	188 computational overhead, which may make the pigz method less appealing on
jpayne@68	189 dual-core systems than the parallel match finder method, but with more
jpayne@68	190 cores the overhead is not a big deal anymore.
jpayne@68	191
jpayne@68	192 Combining the threading methods will be possible and also useful.
jpayne@68	193 For example, combining match finder parallelization with pigz-style
jpayne@68	194 threading or independent-blocks-threading can cut the memory usage
jpayne@68	195 by 50 %.
jpayne@68	196
jpayne@68	197
jpayne@68	198 Q: I told xz to use many threads but it is using only one or two
jpayne@68	199 processor cores. What is wrong?
jpayne@68	200
jpayne@68	201 A: Since multi-threaded compression is done by splitting the data into
jpayne@68	202 blocks that are compressed individually, if the input file is too
jpayne@68	203 small for the block size, then many threads cannot be used. The
jpayne@68	204 default block size increases when the compression level is
jpayne@68	205 increased. For example, xz -6 uses an 8 MiB LZMA2 dictionary and
jpayne@68	206 24 MiB blocks, and xz -9 uses a 64 MiB LZMA2 dictionary and 192 MiB
jpayne@68	207 blocks. If the input file is 100 MiB, xz -6 can use five threads
jpayne@68	208 of which one will finish quickly as it has only 4 MiB to compress.
jpayne@68	209 However, for the same file, xz -9 can only use one thread.
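The arithmetic behind these numbers is a ceiling division of the input size by the block size. The short sketch below reproduces it; the function names are made up for this example and the block sizes are the ones quoted above.

```python
MIB = 1024 * 1024


def usable_threads(input_size: int, block_size: int) -> int:
    """Number of blocks, and therefore the most threads that can be busy."""
    # Ceiling division without floating point; at least one block exists.
    return max(1, -(-input_size // block_size))


def last_block_size(input_size: int, block_size: int) -> int:
    """Size of the final block, which may be shorter than the others."""
    remainder = input_size % block_size
    return remainder or min(input_size, block_size)


print(usable_threads(100 * MIB, 24 * MIB))          # 5 (xz -6)
print(last_block_size(100 * MIB, 24 * MIB) // MIB)  # 4
print(usable_threads(100 * MIB, 192 * MIB))         # 1 (xz -9)
```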
jpayne@68	210
jpayne@68	211 One can adjust the block size with --block-size=SIZE, but making the
jpayne@68	212 block size smaller than the LZMA2 dictionary is a waste of RAM: using
jpayne@68	213 xz -9 with 6 MiB blocks isn't any better than using xz -6 with
jpayne@68	214 6 MiB blocks. The default settings use a block size bigger than
jpayne@68	215 the LZMA2 dictionary size because this was seen as a reasonable
jpayne@68	216 compromise between RAM usage and compression ratio.
jpayne@68	217
jpayne@68	218 When decompressing, the ability to use threads depends on how the
jpayne@68	219 file was created. If it was created in multi-threaded mode then
jpayne@68	220 it can be decompressed in multi-threaded mode too if there are
jpayne@68	221 multiple blocks in the file.
jpayne@68	222
jpayne@68	223
jpayne@68	224 Q: How do I build a program that needs liblzmadec (lzmadec.h)?
jpayne@68	225
jpayne@68	226 A: liblzmadec is part of LZMA Utils. XZ Utils has liblzma, but no
jpayne@68	227 liblzmadec. The code using liblzmadec should be ported to use
jpayne@68	228 liblzma instead. If you cannot or don't want to do that, download
jpayne@68	229 LZMA Utils from <https://tukaani.org/lzma/>.
jpayne@68	230
jpayne@68	231
jpayne@68	232 Q: The default build of liblzma is too big. How can I make it smaller?
jpayne@68	233
jpayne@68	234 A: Give --enable-small to the configure script. Also use appropriate
jpayne@68	235 --enable or --disable options to include only those filter encoders
jpayne@68	236 and decoders and integrity checks that you actually need. Use
jpayne@68	237 CFLAGS=-Os (with GCC) or equivalent to tell your compiler to optimize
jpayne@68	238 for size. See INSTALL for information about configure options.
jpayne@68	239
jpayne@68	240 If the result is still too big, take a look at XZ Embedded. It is
jpayne@68	241 a separate project, which provides a limited but significantly
jpayne@68	242 smaller XZ decoder implementation than XZ Utils. You can find it
jpayne@68	243 at <https://tukaani.org/xz/embedded.html>.
jpayne@68	244
