comparison CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/share/man/man1/unxz.1 @ 68:5028fdace37b
planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author | jpayne |
---|---|
date | Tue, 18 Mar 2025 16:23:26 -0400 |
parents | |
children |
67:0e9998148a16 | 68:5028fdace37b |
---|---|
1 '\" t | |
2 .\" SPDX-License-Identifier: 0BSD | |
3 .\" | |
4 .\" Authors: Lasse Collin | |
5 .\" Jia Tan | |
6 .\" | |
7 .TH XZ 1 "2024-12-30" "Tukaani" "XZ Utils" | |
8 . | |
9 .SH NAME | |
10 xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files | |
11 . | |
12 .SH SYNOPSIS | |
13 .B xz | |
14 .RI [ option... ] | |
15 .RI [ file... ] | |
16 . | |
17 .SH COMMAND ALIASES | |
18 .B unxz | |
19 is equivalent to | |
20 .BR "xz \-\-decompress" . | |
21 .br | |
22 .B xzcat | |
23 is equivalent to | |
24 .BR "xz \-\-decompress \-\-stdout" . | |
25 .br | |
26 .B lzma | |
27 is equivalent to | |
28 .BR "xz \-\-format=lzma" . | |
29 .br | |
30 .B unlzma | |
31 is equivalent to | |
32 .BR "xz \-\-format=lzma \-\-decompress" . | |
33 .br | |
34 .B lzcat | |
35 is equivalent to | |
36 .BR "xz \-\-format=lzma \-\-decompress \-\-stdout" . | |
37 .PP | |
38 When writing scripts that need to decompress files, | |
39 it is recommended to always use the name | |
40 .B xz | |
41 with appropriate arguments | |
42 .RB ( "xz \-d" | |
43 or | |
44 .BR "xz \-dc" ) | |
45 instead of the names | |
46 .B unxz | |
47 and | |
48 .BR xzcat . | |
49 . | |
50 .SH DESCRIPTION | |
51 .B xz | |
52 is a general-purpose data compression tool with | |
53 command line syntax similar to | |
54 .BR gzip (1) | |
55 and | |
56 .BR bzip2 (1). | |
57 The native file format is the | |
58 .B .xz | |
59 format, but the legacy | |
60 .B .lzma | |
61 format used by LZMA Utils and | |
62 raw compressed streams with no container format headers | |
63 are also supported. | |
64 In addition, decompression of the | |
65 .B .lz | |
66 format used by | |
67 .B lzip | |
68 is supported. | |
69 .PP | |
70 .B xz | |
71 compresses or decompresses each | |
72 .I file | |
73 according to the selected operation mode. | |
74 If no | |
75 .I files | |
76 are given or | |
77 .I file | |
78 is | |
79 .BR \- , | |
80 .B xz | |
81 reads from standard input and writes the processed data | |
82 to standard output. | |
83 .B xz | |
84 will refuse (display an error and skip the | |
85 .IR file ) | |
86 to write compressed data to standard output if it is a terminal. | |
87 Similarly, | |
88 .B xz | |
89 will refuse to read compressed data | |
90 from standard input if it is a terminal. | |
91 .PP | |
92 Unless | |
93 .B \-\-stdout | |
94 is specified, | |
95 .I files | |
96 other than | |
97 .B \- | |
98 are written to a new file whose name is derived from the source | |
99 .I file | |
100 name: | |
101 .IP \(bu 3 | |
102 When compressing, the suffix of the target file format | |
103 .RB ( .xz | |
104 or | |
105 .BR .lzma ) | |
106 is appended to the source filename to get the target filename. | |
107 .IP \(bu 3 | |
108 When decompressing, the | |
109 .BR .xz , | |
110 .BR .lzma , | |
111 or | |
112 .B .lz | |
113 suffix is removed from the filename to get the target filename. | |
114 .B xz | |
115 also recognizes the suffixes | |
116 .B .txz | |
117 and | |
118 .BR .tlz , | |
119 and replaces them with the | |
120 .B .tar | |
121 suffix. | |
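The naming rules above can be sketched in a few lines of Python. This is only an illustration of the suffix handling described in the text, not the logic used by xz itself: the function name `target_name` and its parameters are hypothetical, and the sketch leaves out the skip rules (existing target, already-suffixed source) and the `--suffix` option.

```python
from typing import Optional

# Suffix tables taken from the text above. The names below are
# hypothetical and are not part of xz.
COMPRESS_SUFFIX = {"xz": ".xz", "lzma": ".lzma"}
STRIP_SUFFIXES = (".xz", ".lzma", ".lz")
TAR_SUFFIXES = (".txz", ".tlz")


def target_name(source: str, decompress: bool, fmt: str = "xz") -> Optional[str]:
    """Return the derived target filename, or None if no known suffix matches."""
    if not decompress:
        # Compression: append the suffix of the target file format.
        return source + COMPRESS_SUFFIX[fmt]
    for suffix in STRIP_SUFFIXES:
        # Decompression: remove a recognized suffix.
        if source.endswith(suffix) and len(source) > len(suffix):
            return source[: -len(suffix)]
    for suffix in TAR_SUFFIXES:
        # .txz and .tlz are replaced with .tar.
        if source.endswith(suffix) and len(source) > len(suffix):
            return source[: -len(suffix)] + ".tar"
    return None


print(target_name("backup.tar", decompress=False))
print(target_name("backup.txz", decompress=True))
```

A return value of `None` corresponds to the case where a file has no suffix of any supported format when decompressing.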
122 .PP | |
123 If the target file already exists, an error is displayed and the | |
124 .I file | |
125 is skipped. | |
126 .PP | |
127 Unless writing to standard output, | |
128 .B xz | |
129 will display a warning and skip the | |
130 .I file | |
131 if any of the following applies: | |
132 .IP \(bu 3 | |
133 .I File | |
134 is not a regular file. | |
135 Symbolic links are not followed, | |
136 and thus they are not considered to be regular files. | |
137 .IP \(bu 3 | |
138 .I File | |
139 has more than one hard link. | |
140 .IP \(bu 3 | |
141 .I File | |
142 has setuid, setgid, or sticky bit set. | |
143 .IP \(bu 3 | |
144 The operation mode is set to compress and the | |
145 .I file | |
146 already has a suffix of the target file format | |
147 .RB ( .xz | |
148 or | |
149 .B .txz | |
150 when compressing to the | |
151 .B .xz | |
152 format, and | |
153 .B .lzma | |
154 or | |
155 .B .tlz | |
156 when compressing to the | |
157 .B .lzma | |
158 format). | |
159 .IP \(bu 3 | |
160 The operation mode is set to decompress and the | |
161 .I file | |
162 doesn't have a suffix of any of the supported file formats | |
163 .RB ( .xz , | |
164 .BR .txz , | |
165 .BR .lzma , | |
166 .BR .tlz , | |
167 or | |
168 .BR .lz ). | |
169 .PP | |
170 After successfully compressing or decompressing the | |
171 .IR file , | |
172 .B xz | |
173 copies the owner, group, permissions, access time, | |
174 and modification time from the source | |
175 .I file | |
176 to the target file. | |
177 If copying the group fails, the permissions are modified | |
178 so that the target file doesn't become accessible to users | |
179 who didn't have permission to access the source | |
180 .IR file . | |
181 .B xz | |
182 doesn't support copying other metadata like access control lists | |
183 or extended attributes yet. | |
184 .PP | |
185 Once the target file has been successfully closed, the source | |
186 .I file | |
187 is removed unless | |
188 .B \-\-keep | |
189 was specified. | |
190 The source | |
191 .I file | |
192 is never removed if the output is written to standard output | |
193 or if an error occurs. | |
194 .PP | |
195 Sending | |
196 .B SIGINFO | |
197 or | |
198 .B SIGUSR1 | |
199 to the | |
200 .B xz | |
201 process makes it print progress information to standard error. | |
202 This has only limited use since when standard error | |
203 is a terminal, using | |
204 .B \-\-verbose | |
205 will display an automatically updating progress indicator. | |
206 . | |
207 .SS "Memory usage" | |
208 The memory usage of | |
209 .B xz | |
210 varies from a few hundred kilobytes to several gigabytes | |
211 depending on the compression settings. | |
212 The settings used when compressing a file determine | |
213 the memory requirements of the decompressor. | |
214 Typically the decompressor needs 5\ % to 20\ % of | |
215 the amount of memory that the compressor needed when | |
216 creating the file. | |
217 For example, decompressing a file created with | |
218 .B xz \-9 | |
219 currently requires 65\ MiB of memory. | |
220 Still, it is possible to have | |
221 .B .xz | |
222 files that require several gigabytes of memory to decompress. | |
223 .PP | |
224 Users of older systems in particular may find | |
225 the possibility of very large memory usage annoying. | |
226 To prevent uncomfortable surprises, | |
227 .B xz | |
228 has a built-in memory usage limiter, which is disabled by default. | |
229 While some operating systems provide ways to limit | |
230 the memory usage of processes, relying on it | |
231 wasn't deemed to be flexible enough (for example, using | |
232 .BR ulimit (1) | |
233 to limit virtual memory tends to cripple | |
234 .BR mmap (2)). | |
235 .PP | |
236 The memory usage limiter can be enabled with | |
237 the command line option \fB\-\-memlimit=\fIlimit\fR. | |
238 Often it is more convenient to enable the limiter | |
239 by default by setting the environment variable | |
240 .BR XZ_DEFAULTS , | |
241 for example, | |
242 .BR XZ_DEFAULTS=\-\-memlimit=150MiB . | |
243 It is possible to set the limits separately | |
244 for compression and decompression by using | |
245 .BI \-\-memlimit\-compress= limit | |
246 and \fB\-\-memlimit\-decompress=\fIlimit\fR. | |
247 Using these two options outside | |
248 .B XZ_DEFAULTS | |
249 is rarely useful because a single run of | |
250 .B xz | |
251 cannot do both compression and decompression and | |
252 .BI \-\-memlimit= limit | |
253 (or | |
254 .B \-M | |
255 .IR limit ) | |
256 is shorter to type on the command line. | |
257 .PP | |
258 If the specified memory usage limit is exceeded when decompressing, | |
259 .B xz | |
260 will display an error and decompressing the file will fail. | |
261 If the limit is exceeded when compressing, | |
262 .B xz | |
263 will try to scale the settings down so that the limit | |
264 is no longer exceeded (except when using | |
265 .B \-\-format=raw | |
266 or | |
267 .BR \-\-no\-adjust ). | |
268 This way the operation won't fail unless the limit is very small. | |
269 The scaling of the settings is done in steps that don't | |
270 match the compression level presets, for example, if the limit is | |
271 only slightly less than the amount required for | |
272 .BR "xz \-9" , | |
273 the settings will be scaled down only a little, | |
274 not all the way down to | |
275 .BR "xz \-8" . | |
276 . | |
277 .SS "Concatenation and padding with .xz files" | |
278 It is possible to concatenate | |
279 .B .xz | |
280 files as is. | |
281 .B xz | |
282 will decompress such files as if they were a single | |
283 .B .xz | |
284 file. | |
285 .PP | |
286 It is possible to insert padding between the concatenated parts | |
287 or after the last part. | |
288 The padding must consist of null bytes and the size | |
289 of the padding must be a multiple of four bytes. | |
290 This can be useful, for example, if the | |
291 .B .xz | |
292 file is stored on a medium that measures file sizes | |
293 in 512-byte blocks. | |
294 .PP | |
295 Concatenation and padding are not allowed with | |
296 .B .lzma | |
297 files or raw streams. | |
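Concatenation of complete .xz streams can be tried out with Python's standard `lzma` module, which is built on liblzma. This is a small sketch rather than a description of the xz command itself. It covers plain concatenation only, since the padding behavior described above is not exercised here.

```python
import lzma

part1 = lzma.compress(b"first part\n")   # one complete .xz stream
part2 = lzma.compress(b"second part\n")  # another complete .xz stream

# Join the two streams byte for byte, as "cat a.xz b.xz" would.
combined = part1 + part2

# lzma.decompress() decodes each stream in turn and joins the results.
restored = lzma.decompress(combined)
print(restored)
```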
298 . | |
299 .SH OPTIONS | |
300 . | |
301 .SS "Integer suffixes and special values" | |
302 In most places where an integer argument is expected, | |
303 an optional suffix is supported to easily indicate large integers. | |
304 There must be no space between the integer and the suffix. | |
305 .TP | |
306 .B KiB | |
307 Multiply the integer by 1,024 (2^10). | |
308 .BR Ki , | |
309 .BR k , | |
310 .BR kB , | |
311 .BR K , | |
312 and | |
313 .B KB | |
314 are accepted as synonyms for | |
315 .BR KiB . | |
316 .TP | |
317 .B MiB | |
318 Multiply the integer by 1,048,576 (2^20). | |
319 .BR Mi , | |
320 .BR m , | |
321 .BR M , | |
322 and | |
323 .B MB | |
324 are accepted as synonyms for | |
325 .BR MiB . | |
326 .TP | |
327 .B GiB | |
328 Multiply the integer by 1,073,741,824 (2^30). | |
329 .BR Gi , | |
330 .BR g , | |
331 .BR G , | |
332 and | |
333 .B GB | |
334 are accepted as synonyms for | |
335 .BR GiB . | |
336 .PP | |
337 The special value | |
338 .B max | |
339 can be used to indicate the maximum integer value | |
340 supported by the option. | |
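As an illustration of the suffix rules above, the following Python sketch converts strings such as `150MiB` into byte counts. The function `parse_size` and its `maximum` parameter are hypothetical names chosen for this example. It only models what the text states (suffix synonyms, no space before the suffix, and the special value `max`) and does not check for overflow or any other syntax xz may accept.

```python
import re

# Suffix synonyms listed in the text above, mapped to their multipliers.
MULTIPLIERS = {}
for names, power in (
    (("KiB", "Ki", "k", "kB", "K", "KB"), 10),
    (("MiB", "Mi", "m", "M", "MB"), 20),
    (("GiB", "Gi", "g", "G", "GB"), 30),
):
    for name in names:
        MULTIPLIERS[name] = 2 ** power


def parse_size(text: str, maximum: int) -> int:
    """Parse an integer with an optional suffix; "max" yields `maximum`."""
    if text == "max":
        return maximum
    # No space is allowed between the integer and the suffix.
    match = re.fullmatch(r"([0-9]+)([A-Za-z]*)", text)
    if match is None:
        raise ValueError(f"not a valid integer argument: {text!r}")
    digits, suffix = match.groups()
    if suffix and suffix not in MULTIPLIERS:
        raise ValueError(f"unknown suffix: {suffix!r}")
    return int(digits) * MULTIPLIERS.get(suffix, 1)


print(parse_size("150MiB", maximum=2**64 - 1))
```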
341 . | |
342 .SS "Operation mode" | |
343 If multiple operation mode options are given, | |
344 the last one takes effect. | |
345 .TP | |
346 .BR \-z ", " \-\-compress | |
347 Compress. | |
348 This is the default operation mode when no operation mode option | |
349 is specified and no other operation mode is implied from | |
350 the command name (for example, | |
351 .B unxz | |
352 implies | |
353 .BR \-\-decompress ). | |
354 .IP "" | |
355 .\" The DESCRIPTION section already says this but it's good to repeat it | |
356 .\" here because the default behavior is a bit dangerous and new users | |
357 .\" in a hurry may skip reading the DESCRIPTION section. | |
358 After successful compression, the source file is removed | |
359 unless writing to standard output or | |
360 .B \-\-keep | |
361 was specified. | |
362 .TP | |
363 .BR \-d ", " \-\-decompress ", " \-\-uncompress | |
364 Decompress. | |
365 .\" The DESCRIPTION section already says this but it's good to repeat it | |
366 .\" here because the default behavior is a bit dangerous and new users | |
367 .\" in a hurry may skip reading the DESCRIPTION section. | |
368 After successful decompression, the source file is removed | |
369 unless writing to standard output or | |
370 .B \-\-keep | |
371 was specified. | |
372 .TP | |
373 .BR \-t ", " \-\-test | |
374 Test the integrity of compressed | |
375 .IR files . | |
376 This option is equivalent to | |
377 .B "\-\-decompress \-\-stdout" | |
378 except that the decompressed data is discarded instead of being | |
379 written to standard output. | |
380 No files are created or removed. | |
381 .TP | |
382 .BR \-l ", " \-\-list | |
383 Print information about compressed | |
384 .IR files . | |
385 No uncompressed output is produced, | |
386 and no files are created or removed. | |
387 In list mode, the program cannot read | |
388 the compressed data from standard | |
389 input or from other unseekable sources. | |
390 .IP "" | |
391 The default listing shows basic information about | |
392 .IR files , | |
393 one file per line. | |
394 To get more detailed information, use also the | |
395 .B \-\-verbose | |
396 option. | |
397 For even more information, use | |
398 .B \-\-verbose | |
399 twice, but note that this may be slow, because getting all the extra | |
400 information requires many seeks. | |
401 The width of verbose output exceeds | |
402 80 characters, so piping the output to, for example, | |
403 .B "less\ \-S" | |
404 may be convenient if the terminal isn't wide enough. | |
405 .IP "" | |
406 The exact output may vary between | |
407 .B xz | |
408 versions and different locales. | |
409 For machine-readable output, | |
410 .B \-\-robot \-\-list | |
411 should be used. | |
412 . | |
413 .SS "Operation modifiers" | |
414 .TP | |
415 .BR \-k ", " \-\-keep | |
416 Don't delete the input files. | |
417 .IP "" | |
418 Since | |
419 .B xz | |
420 5.2.6, | |
421 this option also makes | |
422 .B xz | |
423 compress or decompress even if the input is | |
424 a symbolic link to a regular file, | |
425 has more than one hard link, | |
426 or has the setuid, setgid, or sticky bit set. | |
427 The setuid, setgid, and sticky bits are not copied | |
428 to the target file. | |
429 In earlier versions this was only done with | |
430 .BR \-\-force . | |
431 .TP | |
432 .BR \-f ", " \-\-force | |
433 This option has several effects: | |
434 .RS | |
435 .IP \(bu 3 | |
436 If the target file already exists, | |
437 delete it before compressing or decompressing. | |
438 .IP \(bu 3 | |
439 Compress or decompress even if the input is | |
440 a symbolic link to a regular file, | |
441 has more than one hard link, | |
442 or has the setuid, setgid, or sticky bit set. | |
443 The setuid, setgid, and sticky bits are not copied | |
444 to the target file. | |
445 .IP \(bu 3 | |
446 When used with | |
447 .B \-\-decompress | |
448 .B \-\-stdout | |
449 and | |
450 .B xz | |
451 cannot recognize the type of the source file, | |
452 copy the source file as is to standard output. | |
453 This allows | |
454 .B xzcat | |
455 .B \-\-force | |
456 to be used like | |
457 .BR cat (1) | |
458 for files that have not been compressed with | |
459 .BR xz . | |
460 Note that in future, | |
461 .B xz | |
462 might support new compressed file formats, which may make | |
463 .B xz | |
464 decompress more types of files instead of copying them as is to | |
465 standard output. | |
466 .BI \-\-format= format | |
467 can be used to restrict | |
468 .B xz | |
469 to decompress only a single file format. | |
470 .RE | |
471 .TP | |
472 .BR \-c ", " \-\-stdout ", " \-\-to\-stdout | |
473 Write the compressed or decompressed data to | |
474 standard output instead of a file. | |
475 This implies | |
476 .BR \-\-keep . | |
477 .TP | |
478 .B \-\-single\-stream | |
479 Decompress only the first | |
480 .B .xz | |
481 stream, and | |
482 silently ignore possible remaining input data following the stream. | |
483 Normally such trailing garbage makes | |
484 .B xz | |
485 display an error. | |
486 .IP "" | |
487 .B xz | |
488 never decompresses more than one stream from | |
489 .B .lzma | |
490 files or raw streams, but this option still makes | |
491 .B xz | |
492 ignore the possible trailing data after the | |
493 .B .lzma | |
494 file or raw stream. | |
495 .IP "" | |
496 This option has no effect if the operation mode is not | |
497 .B \-\-decompress | |
498 or | |
499 .BR \-\-test . | |
500 .TP | |
501 .B \-\-no\-sparse | |
502 Disable creation of sparse files. | |
503 By default, if decompressing into a regular file, | |
504 .B xz | |
505 tries to make the file sparse if the decompressed data contains | |
506 long sequences of binary zeros. | |
507 It also works when writing to standard output | |
508 as long as standard output is connected to a regular file | |
509 and certain additional conditions are met to make it safe. | |
510 Creating sparse files may save disk space and speed up | |
511 the decompression by reducing the amount of disk I/O. | |
512 .TP | |
513 \fB\-S\fR \fI.suf\fR, \fB\-\-suffix=\fI.suf | |
514 When compressing, use | |
515 .I .suf | |
516 as the suffix for the target file instead of | |
517 .B .xz | |
518 or | |
519 .BR .lzma . | |
520 If not writing to standard output and | |
521 the source file already has the suffix | |
522 .IR .suf , | |
523 a warning is displayed and the file is skipped. | |
524 .IP "" | |
525 When decompressing, recognize files with the suffix | |
526 .I .suf | |
527 in addition to files with the | |
528 .BR .xz , | |
529 .BR .txz , | |
530 .BR .lzma , | |
531 .BR .tlz , | |
532 or | |
533 .B .lz | |
534 suffix. | |
535 If the source file has the suffix | |
536 .IR .suf , | |
537 the suffix is removed to get the target filename. | |
538 .IP "" | |
539 When compressing or decompressing raw streams | |
540 .RB ( \-\-format=raw ), | |
541 the suffix must always be specified unless | |
542 writing to standard output, | |
543 because there is no default suffix for raw streams. | |
544 .TP | |
545 \fB\-\-files\fR[\fB=\fIfile\fR] | |
546 Read the filenames to process from | |
547 .IR file ; | |
548 if | |
549 .I file | |
550 is omitted, filenames are read from standard input. | |
551 Filenames must be terminated with the newline character. | |
552 A dash | |
553 .RB ( \- ) | |
554 is taken as a regular filename; it doesn't mean standard input. | |
555 If filenames are given also as command line arguments, they are | |
556 processed before the filenames read from | |
557 .IR file . | |
558 .TP | |
559 \fB\-\-files0\fR[\fB=\fIfile\fR] | |
560 This is identical to \fB\-\-files\fR[\fB=\fIfile\fR] except | |
561 that each filename must be terminated with the null character. | |
562 . | |
563 .SS "Basic file format and compression options" | |
564 .TP | |
565 \fB\-F\fR \fIformat\fR, \fB\-\-format=\fIformat | |
566 Specify the file | |
567 .I format | |
568 to compress or decompress: | |
569 .RS | |
570 .TP | |
571 .B auto | |
572 This is the default. | |
573 When compressing, | |
574 .B auto | |
575 is equivalent to | |
576 .BR xz . | |
577 When decompressing, | |
578 the format of the input file is automatically detected. | |
579 Note that raw streams (created with | |
580 .BR \-\-format=raw ) | |
581 cannot be auto-detected. | |
582 .TP | |
583 .B xz | |
584 Compress to the | |
585 .B .xz | |
586 file format, or accept only | |
587 .B .xz | |
588 files when decompressing. | |
589 .TP | |
590 .BR lzma ", " alone | |
591 Compress to the legacy | |
592 .B .lzma | |
593 file format, or accept only | |
594 .B .lzma | |
595 files when decompressing. | |
596 The alternative name | |
597 .B alone | |
598 is provided for backwards compatibility with LZMA Utils. | |
599 .TP | |
600 .B lzip | |
601 Accept only | |
602 .B .lz | |
603 files when decompressing. | |
604 Compression is not supported. | |
605 .IP "" | |
606 The | |
607 .B .lz | |
608 format version 0 and the unextended version 1 are supported. | |
609 Version 0 files were produced by | |
610 .B lzip | |
611 1.3 and older. | |
612 Such files aren't common but may be found from file archives | |
613 as a few source packages were released in this format. | |
614 People might have old personal files in this format too. | |
615 Decompression support for the format version 0 was removed in | |
616 .B lzip | |
617 1.18. | |
618 .IP "" | |
619 .B lzip | |
620 1.4 and later create files in the format version 1. | |
621 The sync flush marker extension to the format version 1 was added in | |
622 .B lzip | |
623 1.6. | |
624 This extension is rarely used and isn't supported by | |
625 .B xz | |
626 (diagnosed as corrupt input). | |
627 .TP | |
628 .B raw | |
629 Compress or uncompress a raw stream (no headers). | |
630 This is meant for advanced users only. | |
631 To decode raw streams, you need to use | |
632 .B \-\-format=raw | |
633 and explicitly specify the filter chain, | |
634 which normally would have been stored in the container headers. | |
635 .RE | |
636 .TP | |
637 \fB\-C\fR \fIcheck\fR, \fB\-\-check=\fIcheck | |
638 Specify the type of the integrity check. | |
639 The check is calculated from the uncompressed data and | |
640 stored in the | |
641 .B .xz | |
642 file. | |
643 This option has an effect only when compressing into the | |
644 .B .xz | |
645 format; the | |
646 .B .lzma | |
647 format doesn't support integrity checks. | |
648 The integrity check (if any) is verified when the | |
649 .B .xz | |
650 file is decompressed. | |
651 .IP "" | |
652 Supported | |
653 .I check | |
654 types: | |
655 .RS | |
656 .TP | |
657 .B none | |
658 Don't calculate an integrity check at all. | |
659 This is usually a bad idea. | |
660 This can be useful when integrity of the data is verified | |
661 by other means anyway. | |
662 .TP | |
663 .B crc32 | |
664 Calculate CRC32 using the polynomial from IEEE-802.3 (Ethernet). | |
665 .TP | |
666 .B crc64 | |
667 Calculate CRC64 using the polynomial from ECMA-182. | |
668 This is the default, since it is slightly better than CRC32 | |
669 at detecting damaged files and the speed difference is negligible. | |
670 .TP | |
671 .B sha256 | |
672 Calculate SHA-256. | |
673 This is somewhat slower than CRC32 and CRC64. | |
674 .RE | |
675 .IP "" | |
676 Integrity of the | |
677 .B .xz | |
678 headers is always verified with CRC32. | |
679 It is not possible to change or disable it. | |
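The effect of the check type can be observed with Python's standard `lzma` module, which exposes the same check identifiers as constants. This sketch is an illustration under that assumption and is not a description of how the xz command implements `--check`. CRC32 is chosen here only because it differs from the default.

```python
import lzma

data = b"example payload " * 64

# check= selects the integrity check stored in the .xz stream.
blob = lzma.compress(data, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC32)

decoder = lzma.LZMADecompressor()
restored = decoder.decompress(blob)  # the stored check is verified while decoding

# After decoding, the decoder reports which check the stream used.
print(decoder.check == lzma.CHECK_CRC32)
```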
680 .TP | |
681 .B \-\-ignore\-check | |
682 Don't verify the integrity check of the compressed data when decompressing. | |
683 The CRC32 values in the | |
684 .B .xz | |
685 headers will still be verified normally. | |
686 .IP "" | |
687 .B "Do not use this option unless you know what you are doing." | |
688 Possible reasons to use this option: | |
689 .RS | |
690 .IP \(bu 3 | |
691 Trying to recover data from a corrupt .xz file. | |
692 .IP \(bu 3 | |
693 Speeding up decompression. | |
694 This matters mostly with SHA-256 or | |
695 with files that have compressed extremely well. | |
696 It's recommended to not use this option for this purpose | |
697 unless the file integrity is verified externally in some other way. | |
698 .RE | |
699 .TP | |
700 .BR \-0 " ... " \-9 | |
701 Select a compression preset level. | |
702 The default is | |
703 .BR \-6 . | |
704 If multiple preset levels are specified, | |
705 the last one takes effect. | |
706 If a custom filter chain was already specified, setting | |
707 a compression preset level clears the custom filter chain. | |
708 .IP "" | |
709 The differences between the presets are more significant than with | |
710 .BR gzip (1) | |
711 and | |
712 .BR bzip2 (1). | |
713 The selected compression settings determine | |
714 the memory requirements of the decompressor, | |
715 thus using too high a preset level might make it painful | |
716 to decompress the file on an old system with little RAM. | |
717 Specifically, | |
718 .B "it's not a good idea to blindly use \-9 for everything" | |
719 like it often is with | |
720 .BR gzip (1) | |
721 and | |
722 .BR bzip2 (1). | |
723 .RS | |
724 .TP | |
725 .BR "\-0" " ... " "\-3" | |
726 These are somewhat fast presets. | |
727 .B \-0 | |
728 is sometimes faster than | |
729 .B "gzip \-9" | |
730 while compressing much better. | |
731 The higher ones often have speed comparable to | |
732 .BR bzip2 (1) | |
733 with comparable or better compression ratio, | |
734 although the results | |
735 depend a lot on the type of data being compressed. | |
736 .TP | |
737 .BR "\-4" " ... " "\-6" | |
738 Good to very good compression while keeping | |
739 decompressor memory usage reasonable even for old systems. | |
740 .B \-6 | |
741 is the default, which is usually a good choice | |
742 for distributing files that need to be decompressible | |
743 even on systems with only 16\ MiB RAM. | |
744 .RB ( \-5e | |
745 or | |
746 .B \-6e | |
747 may be worth considering too. | |
748 See | |
749 .BR \-\-extreme .) | |
750 .TP | |
751 .B "\-7 ... \-9" | |
752 These are like | |
753 .B \-6 | |
754 but with higher compressor and decompressor memory requirements. | |
755 These are useful only when compressing files bigger than | |
756 8\ MiB, 16\ MiB, and 32\ MiB, respectively. | |
757 .RE | |
758 .IP "" | |
759 On the same hardware, the decompression speed is approximately | |
760 a constant number of bytes of compressed data per second. | |
761 In other words, the better the compression, | |
762 the faster the decompression will usually be. | |
763 This also means that the amount of uncompressed output | |
764 produced per second can vary a lot. | |
765 .IP "" | |
766 The following table summarises the features of the presets: | |
767 .RS | |
768 .RS | |
769 .PP | |
770 .TS | |
771 tab(;); | |
772 c c c c c | |
773 n n n n n. | |
774 Preset;DictSize;CompCPU;CompMem;DecMem | |
775 \-0;256 KiB;0;3 MiB;1 MiB | |
776 \-1;1 MiB;1;9 MiB;2 MiB | |
777 \-2;2 MiB;2;17 MiB;3 MiB | |
778 \-3;4 MiB;3;32 MiB;5 MiB | |
779 \-4;4 MiB;4;48 MiB;5 MiB | |
780 \-5;8 MiB;5;94 MiB;9 MiB | |
781 \-6;8 MiB;6;94 MiB;9 MiB | |
782 \-7;16 MiB;6;186 MiB;17 MiB | |
783 \-8;32 MiB;6;370 MiB;33 MiB | |
784 \-9;64 MiB;6;674 MiB;65 MiB | |
785 .TE | |
786 .RE | |
787 .RE | |
788 .IP "" | |
789 Column descriptions: | |
790 .RS | |
791 .IP \(bu 3 | |
792 DictSize is the LZMA2 dictionary size. | |
793 It is a waste of memory to use a dictionary bigger than | |
794 the size of the uncompressed file. | |
795 This is why it is good to avoid using the presets | |
796 .BR \-7 " ... " \-9 | |
797 when there's no real need for them. | |
798 At | |
799 .B \-6 | |
800 and lower, the amount of memory wasted is | |
801 usually low enough to not matter. | |
802 .IP \(bu 3 | |
803 CompCPU is a simplified representation of the LZMA2 settings | |
804 that affect compression speed. | |
805 The dictionary size affects speed too, | |
806 so while CompCPU is the same for levels | |
807 .BR \-6 " ... " \-9 , | |
808 higher levels still tend to be a little slower. | |
809 To get even slower and thus possibly better compression, see | |
810 .BR \-\-extreme . | |
811 .IP \(bu 3 | |
812 CompMem contains the compressor memory requirements | |
813 in the single-threaded mode. | |
814 It may vary slightly between | |
815 .B xz | |
816 versions. | |
817 .IP \(bu 3 | |
818 DecMem contains the decompressor memory requirements. | |
819 That is, the compression settings determine | |
820 the memory requirements of the decompressor. | |
821 The exact decompressor memory usage is slightly more than | |
822 the LZMA2 dictionary size, but the values in the table | |
823 have been rounded up to the next full MiB. | |
824 .RE | |
825 .IP "" | |
826 Memory requirements of the multi-threaded mode are | |
827 significantly higher than that of the single-threaded mode. | |
828 With the default value of | |
829 .BR \-\-block\-size , | |
830 each thread needs 3*3*DictSize plus CompMem or DecMem. | |
831 For example, four threads with preset | |
832 .B \-6 | |
833 need 660\(en670\ MiB of memory. | |
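The arithmetic behind that figure can be checked with a short Python sketch. The function name `threaded_memory_mib` is hypothetical, the values are copied from the preset table above, and the result is only the rough estimate given by the formula in the text (default block size, compression unless stated otherwise).

```python
# (DictSize, CompMem, DecMem) in MiB for a few presets, copied from the table above.
PRESETS = {
    6: (8, 94, 9),
    7: (16, 186, 17),
    9: (64, 674, 65),
}


def threaded_memory_mib(preset: int, threads: int, compress: bool = True) -> int:
    """Estimate memory use as threads * (3*3*DictSize + CompMem or DecMem)."""
    dict_size, comp_mem, dec_mem = PRESETS[preset]
    per_thread = 3 * 3 * dict_size + (comp_mem if compress else dec_mem)
    return threads * per_thread


# Four threads at preset -6: 4 * (72 + 94) MiB.
print(threaded_memory_mib(6, threads=4))
```

The result for four threads at preset 6 is 664 MiB, which falls inside the range quoted above.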
834 .TP | |
835 .BR \-e ", " \-\-extreme | |
836 Use a slower variant of the selected compression preset level | |
837 .RB ( \-0 " ... " \-9 ) | |
838 to hopefully get a little bit better compression ratio, | |
839 but with bad luck this can also make it worse. | |
840 Decompressor memory usage is not affected, | |
841 but compressor memory usage increases a little at preset levels | |
842 .BR \-0 " ... " \-3 . | |
843 .IP "" | |
844 Since there are two presets with dictionary sizes | |
845 4\ MiB and 8\ MiB, the presets | |
846 .B \-3e | |
847 and | |
848 .B \-5e | |
849 use slightly faster settings (lower CompCPU) than | |
850 .B \-4e | |
851 and | |
852 .BR \-6e , | |
853 respectively. | |
854 That way no two presets are identical. | |
855 .RS | |
856 .RS | |
857 .PP | |
858 .TS | |
859 tab(;); | |
860 c c c c c | |
861 n n n n n. | |
862 Preset;DictSize;CompCPU;CompMem;DecMem | |
863 \-0e;256 KiB;8;4 MiB;1 MiB | |
864 \-1e;1 MiB;8;13 MiB;2 MiB | |
865 \-2e;2 MiB;8;25 MiB;3 MiB | |
866 \-3e;4 MiB;7;48 MiB;5 MiB | |
867 \-4e;4 MiB;8;48 MiB;5 MiB | |
868 \-5e;8 MiB;7;94 MiB;9 MiB | |
869 \-6e;8 MiB;8;94 MiB;9 MiB | |
870 \-7e;16 MiB;8;186 MiB;17 MiB | |
871 \-8e;32 MiB;8;370 MiB;33 MiB | |
872 \-9e;64 MiB;8;674 MiB;65 MiB | |
873 .TE | |
874 .RE | |
875 .RE | |
876 .IP "" | |
877 For example, there are a total of four presets that use | |
878 8\ MiB dictionary, whose order from the fastest to the slowest is | |
879 .BR \-5 , | |
880 .BR \-6 , | |
881 .BR \-5e , | |
882 and | |
883 .BR \-6e . | |
884 .TP | |
885 .B \-\-fast | |
886 .PD 0 | |
887 .TP | |
888 .B \-\-best | |
889 .PD | |
890 These are somewhat misleading aliases for | |
891 .B \-0 | |
892 and | |
893 .BR \-9 , | |
894 respectively. | |
895 These are provided only for backwards compatibility | |
896 with LZMA Utils. | |
897 Avoid using these options. | |
898 .TP | |
899 .BI \-\-block\-size= size | |
900 When compressing to the | |
901 .B .xz | |
902 format, split the input data into blocks of | |
903 .I size | |
904 bytes. | |
905 The blocks are compressed independently from each other, | |
906 which helps with multi-threading and | |
907 makes limited random-access decompression possible. | |
908 This option is typically used to override the default | |
909 block size in multi-threaded mode, | |
910 but this option can be used in single-threaded mode too. | |
911 .IP "" | |
912 In multi-threaded mode about three times | |
913 .I size | |
914 bytes will be allocated in each thread for buffering input and output. | |
915 The default | |
916 .I size | |
917 is three times the LZMA2 dictionary size or 1 MiB, | |
918 whichever is more. | |
919 Typically a good value is 2\(en4 times | |
920 the size of the LZMA2 dictionary or at least 1 MiB. | |
921 Using | |
922 .I size | |
923 less than the LZMA2 dictionary size is a waste of RAM | |
924 because then the LZMA2 dictionary buffer will never get fully used. | |
925 In multi-threaded mode, | |
926 the sizes of the blocks are stored in the block headers. | |
927 This size information is required for multi-threaded decompression. | |
928 .IP "" | |
929 In single-threaded mode no block splitting is done by default. | |
930 Setting this option doesn't affect memory usage. | |
931 No size information is stored in block headers, | |
932 thus files created in single-threaded mode | |
933 won't be identical to files created in multi-threaded mode. | |
934 The lack of size information also means that | |
935 .B xz | |
936 won't be able to decompress the files in multi-threaded mode. | |
937 .TP | |
938 .BI \-\-block\-list= items | |
939 When compressing to the | |
940 .B .xz | |
941 format, start a new block with an optional custom filter chain after | |
942 the given intervals of uncompressed data. | |
943 .IP "" | |
944 The | |
945 .I items | |
946 are a comma-separated list. | |
947 Each item consists of an optional filter chain number | |
948 between 0 and 9 followed by a colon | |
949 .RB ( : ) | |
950 and a required size of uncompressed data. | |
951 Omitting an item (two or more consecutive commas) is a | |
952 shorthand to use the size and filters of the previous item. | |
953 .IP "" | |
954 If the input file is bigger than the sum of | |
955 the sizes in | |
956 .IR items , | |
957 the last item is repeated until the end of the file. | |
958 A special value of | |
959 .B 0 | |
960 may be used as the last size to indicate that | |
961 the rest of the file should be encoded as a single block. | |
962 .IP "" | |
963 An alternative filter chain for each block can be | |
964 specified in combination with the | |
965 .BI \-\-filters1= filters | |
966 \&...\& | |
967 .BI \-\-filters9= filters | |
968 options. | |
969 These options define filter chains with an identifier | |
970 in the range 1\(en9. | |
971 Filter chain 0 can be used to refer to the default filter chain, | |
972 which is the same as not specifying a filter chain. | |
973 The filter chain identifier can be used before the uncompressed | |
974 size, followed by a colon | |
975 .RB ( : ). | |
976 For example, if one specifies | |
977 .B \-\-block\-list=1:2MiB,3:2MiB,2:4MiB,,2MiB,0:4MiB | |
978 then blocks will be created using: | |
979 .RS | |
980 .IP \(bu 3 | |
981 The filter chain specified by | |
982 .B \-\-filters1 | |
983 and 2 MiB input | |
984 .IP \(bu 3 | |
985 The filter chain specified by | |
986 .B \-\-filters3 | |
987 and 2 MiB input | |
988 .IP \(bu 3 | |
989 The filter chain specified by | |
990 .B \-\-filters2 | |
991 and 4 MiB input | |
992 .IP \(bu 3 | |
993 The filter chain specified by | |
994 .B \-\-filters2 | |
995 and 4 MiB input | |
996 .IP \(bu 3 | |
997 The default filter chain and 2 MiB input | |
998 .IP \(bu 3 | |
999 The default filter chain and 4 MiB input for every block until | |
1000 end of input. | |
1001 .RE | |
1002 .IP "" | |
1003 If one specifies a size that exceeds the encoder's block size | |
1004 (either the default value in threaded mode or | |
1005 the value specified with \fB\-\-block\-size=\fIsize\fR), | |
1006 the encoder will create additional blocks while | |
1007 keeping the boundaries specified in | |
1008 .IR items . | |
1009 For example, if one specifies | |
1010 .B \-\-block\-size=10MiB | |
1011 .B \-\-block\-list=5MiB,10MiB,8MiB,12MiB,24MiB | |
1012 and the input file is 80 MiB, | |
1013 one will get 11 blocks: | |
1014 5, 10, 8, 10, 2, 10, 10, 4, 10, 10, and 1 MiB. | |
1015 .IP "" | |
1016 In multi-threaded mode the sizes of the blocks | |
1017 are stored in the block headers. | |
1018 This isn't done in single-threaded mode, | |
1019 so the encoded output won't be | |
1020 identical to that of the multi-threaded mode. | |
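The way the block list and the block size interact can be sketched in Python. This is a simplified model, not xz's implementation: the function name is hypothetical, filter chain numbers are left out, and the special last value 0 is not handled.

```python
MIB = 1024 * 1024


def split_blocks(input_size, block_size, items):
    """Return the uncompressed size of every block, in bytes.

    items holds the sizes from the block list; the last item is
    repeated until the end of the input.
    """
    if block_size <= 0 or not items or any(item <= 0 for item in items):
        raise ValueError("sizes must be positive in this sketch")
    blocks = []
    remaining = input_size
    index = 0
    while remaining > 0:
        # Reuse the last item once the list is exhausted.
        item = items[min(index, len(items) - 1)]
        index += 1
        interval = min(item, remaining)
        remaining -= interval
        # Keep the boundary from the list, but cap each block at block_size.
        while interval > 0:
            chunk = min(interval, block_size)
            blocks.append(chunk)
            interval -= chunk
    return blocks


if __name__ == "__main__":
    items = [5 * MIB, 10 * MIB, 8 * MIB, 12 * MIB, 24 * MIB]
    sizes = split_blocks(80 * MIB, 10 * MIB, items)
    print([size // MIB for size in sizes])
```

With the 80 MiB input and the options from the example above, the sketch yields the same 11 block sizes that the text lists.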
1021 .TP | |
1022 .BI \-\-flush\-timeout= timeout | |
1023 When compressing, if more than | |
1024 .I timeout | |
1025 milliseconds (a positive integer) have passed since the previous flush and | |
1026 reading more input would block, | |
1027 all the pending input data is flushed from the encoder and | |
1028 made available in the output stream. | |
1029 This can be useful if | |
1030 .B xz | |
1031 is used to compress data that is streamed over a network. | |
1032 Small | |
1033 .I timeout | |
1034 values make the data available at the receiving end | |
1035 with a small delay, but large | |
1036 .I timeout | |
1037 values give a better compression ratio. | |
1038 .IP "" | |
1039 This feature is disabled by default. | |
1040 If this option is specified more than once, the last one takes effect. | |
1041 The special | |
1042 .I timeout | |
1043 value of | |
1044 .B 0 | |
1045 can be used to explicitly disable this feature. | |
1046 .IP "" | |
1047 This feature is not available on non-POSIX systems. | |
1048 .IP "" | |
1049 .\" FIXME | |
1050 .B "This feature is still experimental." | |
1051 Currently | |
1052 .B xz | |
1053 is unsuitable for decompressing the stream in real time due to how | |
1054 .B xz | |
1055 does buffering. | |
1056 .TP | |
1057 .BI \-\-memlimit\-compress= limit | |
1058 Set a memory usage limit for compression. | |
1059 If this option is specified multiple times, | |
1060 the last one takes effect. | |
1061 .IP "" | |
1062 If the compression settings exceed the | |
1063 .IR limit , | |
1064 .B xz | |
1065 will attempt to adjust the settings downwards so that | |
1066 the limit is no longer exceeded and display a notice that | |
1067 automatic adjustment was done. | |
1068 The adjustments are done in this order: | |
1069 reducing the number of threads, | |
1070 switching to single-threaded mode | |
1071 if even one thread in multi-threaded mode exceeds the | |
1072 .IR limit , | |
1073 and finally reducing the LZMA2 dictionary size. | |
1074 .IP "" | |
1075 When compressing with | |
1076 .B \-\-format=raw | |
1077 or if | |
1078 .B \-\-no\-adjust | |
1079 has been specified, | |
1080 only the number of threads may be reduced | |
1081 since it can be done without affecting the compressed output. | |
1082 .IP "" | |
1083 If the | |
1084 .I limit | |
1085 cannot be met even with the adjustments described above, | |
1086 an error is displayed and | |
1087 .B xz | |
1088 will exit with exit status 1. | |
1089 .IP "" | |
1090 The | |
1091 .I limit | |
1092 can be specified in multiple ways: | |
1093 .RS | |
1094 .IP \(bu 3 | |
1095 The | |
1096 .I limit | |
1097 can be an absolute value in bytes. | |
1098 Using an integer suffix like | |
1099 .B MiB | |
1100 can be useful. | |
1101 Example: | |
1102 .B "\-\-memlimit\-compress=80MiB" | |
1103 .IP \(bu 3 | |
1104 The | |
1105 .I limit | |
1106 can be specified as a percentage of total physical memory (RAM). | |
1107 This can be useful especially when setting the | |
1108 .B XZ_DEFAULTS | |
1109 environment variable in a shell initialization script | |
1110 that is shared between different computers. | |
1111 That way the limit is automatically bigger | |
1112 on systems with more memory. | |
1113 Example: | |
1114 .B "\-\-memlimit\-compress=70%" | |
1115 .IP \(bu 3 | |
1116 The | |
1117 .I limit | |
1118 can be reset back to its default value by setting it to | |
1119 .BR 0 . | |
1120 This is currently equivalent to setting the | |
1121 .I limit | |
1122 to | |
1123 .B max | |
1124 (no memory usage limit). | |
1125 .RE | |
1126 .IP "" | |
1127 For 32-bit | |
1128 .B xz | |
1129 there is a special case: if the | |
1130 .I limit | |
1131 would be over | |
1132 .BR "4020\ MiB" , | |
1133 the | |
1134 .I limit | |
1135 is set to | |
1136 .BR "4020\ MiB" . | |
1137 On MIPS32 | |
1138 .B "2000\ MiB" | |
1139 is used instead. | |
1140 (The values | |
1141 .B 0 | |
1142 and | |
1143 .B max | |
1144 aren't affected by this. | |
1145 A similar feature doesn't exist for decompression.) | |
1146 This can be helpful when a 32-bit executable has access | |
1147 to 4\ GiB address space (2 GiB on MIPS32) | |
1148 while hopefully doing no harm in other situations. | |
1149 .IP "" | |
1150 See also the section | |
1151 .BR "Memory usage" . | |
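The ways of specifying the limit can be modelled with a short Python sketch. It is an illustration only: the function name is hypothetical, only the KiB, MiB, and GiB suffixes are handled (an assumption made to keep it short), and the total RAM is passed in rather than detected.

```python
MIB = 1024 * 1024
# Only these three suffixes are handled in this sketch (an assumption).
SUFFIXES = {"KiB": 1024, "MiB": MIB, "GiB": 1024 * MIB}
CAP_32BIT = 4020 * MIB  # the text gives 2000 MiB for MIPS32 instead


def parse_memlimit(text, total_ram, is_32bit=False, cap=CAP_32BIT):
    """Return the limit in bytes, or None when no limit applies."""
    if text in ("0", "max"):
        return None  # 0 resets to the default, currently the same as max
    if text.endswith("%"):
        limit = total_ram * int(text[:-1]) // 100
    else:
        for suffix, multiplier in SUFFIXES.items():
            if text.endswith(suffix):
                limit = int(text[: -len(suffix)]) * multiplier
                break
        else:
            limit = int(text)  # plain number of bytes
    if is_32bit and limit > cap:
        limit = cap  # special case for 32-bit xz
    return limit


if __name__ == "__main__":
    total = 16 * 1024 * MIB  # assumed amount of RAM, for illustration
    for value in ("80MiB", "70%", "0", "max"):
        print(value, "->", parse_memlimit(value, total))
```

The values 0 and max bypass the 32-bit cap in the sketch, matching the note that they aren't affected by it.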
1152 .TP | |
1153 .BI \-\-memlimit\-decompress= limit | |
1154 Set a memory usage limit for decompression. | |
1155 This also affects the | |
1156 .B \-\-list | |
1157 mode. | |
1158 If the operation is not possible without exceeding the | |
1159 .IR limit , | |
1160 .B xz | |
1161 will display an error and decompressing the file will fail. | |
1162 See | |
1163 .BI \-\-memlimit\-compress= limit | |
1164 for possible ways to specify the | |
1165 .IR limit . | |
1166 .TP | |
1167 .BI \-\-memlimit\-mt\-decompress= limit | |
1168 Set a memory usage limit for multi-threaded decompression. | |
1169 This can only affect the number of threads; | |
1170 this will never make | |
1171 .B xz | |
1172 refuse to decompress a file. | |
1173 If | |
1174 .I limit | |
1175 is too low to allow any multi-threading, the | |
1176 .I limit | |
1177 is ignored and | |
1178 .B xz | |
1179 will continue in single-threaded mode. | |
1180 Note that if | |
1181 .B \-\-memlimit\-decompress | |
1182 is also used, | |
1183 it will always apply to both single-threaded and multi-threaded modes, | |
1184 and so the effective | |
1185 .I limit | |
1186 for multi-threading will never be higher than the limit set with | |
1187 .BR \-\-memlimit\-decompress . | |
1188 .IP "" | |
1189 In contrast to the other memory usage limit options, | |
1190 .BI \-\-memlimit\-mt\-decompress= limit | |
1191 has a system-specific default | |
1192 .IR limit . | |
1193 .B "xz \-\-info\-memory" | |
1194 can be used to see the current value. | |
1195 .IP "" | |
1196 This option and its default value exist | |
1197 because without any limit the threaded decompressor | |
1198 could end up allocating an insane amount of memory with some input files. | |
1199 If the default | |
1200 .I limit | |
1201 is too low on your system, | |
1202 feel free to increase the | |
1203 .I limit | |
1204 but never set it to a value larger than the amount of usable RAM | |
1205 as with appropriate input files | |
1206 .B xz | |
1207 will attempt to use that amount of memory | |
1208 even with a low number of threads. | |
1209 Running out of memory or swapping | |
1210 will not improve decompression performance. | |
1211 .IP "" | |
1212 See | |
1213 .BI \-\-memlimit\-compress= limit | |
1214 for possible ways to specify the | |
1215 .IR limit . | |
1216 Setting | |
1217 .I limit | |
1218 to | |
1219 .B 0 | |
1220 resets the | |
1221 .I limit | |
1222 to the default system-specific value. | |
1223 .TP | |
1224 \fB\-M\fR \fIlimit\fR, \fB\-\-memlimit=\fIlimit\fR, \fB\-\-memory=\fIlimit | |
1225 This is equivalent to specifying | |
1226 .BI \-\-memlimit\-compress= limit | |
1227 .BI \-\-memlimit\-decompress= limit | |
1228 \fB\-\-memlimit\-mt\-decompress=\fIlimit\fR. | |
1229 .TP | |
1230 .B \-\-no\-adjust | |
1231 Display an error and exit if the memory usage limit cannot be | |
1232 met without adjusting settings that affect the compressed output. | |
1233 That is, this prevents | |
1234 .B xz | |
1235 from switching the encoder from multi-threaded mode to single-threaded mode | |
1236 and from reducing the LZMA2 dictionary size. | |
1237 Even when this option is used the number of threads may be reduced | |
1238 to meet the memory usage limit as that won't affect the compressed output. | |
1239 .IP "" | |
1240 Automatic adjusting is always disabled when creating raw streams | |
1241 .RB ( \-\-format=raw ). | |
1242 .TP | |
1243 \fB\-T\fR \fIthreads\fR, \fB\-\-threads=\fIthreads | |
1244 Specify the number of worker threads to use. | |
1245 Setting | |
1246 .I threads | |
1247 to a special value | |
1248 .B 0 | |
1249 makes | |
1250 .B xz | |
1251 use up to as many threads as the processor(s) on the system support. | |
1252 The actual number of threads can be fewer than | |
1253 .I threads | |
1254 if the input file is not big enough | |
1255 for threading with the given settings or | |
1256 if using more threads would exceed the memory usage limit. | |
1257 .IP "" | |
1258 The single-threaded and multi-threaded compressors produce different output. | |
1259 The single-threaded compressor will give the smallest file size, but | |
1260 only the output from the multi-threaded compressor can be decompressed | |
1261 using multiple threads. | |
1262 Setting | |
1263 .I threads | |
1264 to | |
1265 .B 1 | |
1266 will use the single-threaded mode. | |
1267 Setting | |
1268 .I threads | |
1269 to any other value, including | |
1270 .BR 0 , | |
1271 will use the multi-threaded compressor | |
1272 even if the system supports only one hardware thread. | |
1273 .RB ( xz | |
1274 5.2.x | |
1275 used single-threaded mode in this situation.) | |
1276 .IP "" | |
1277 To use multi-threaded mode with only one thread, set | |
1278 .I threads | |
1279 to | |
1280 .BR +1 . | |
1281 The | |
1282 .B + | |
1283 prefix has no effect with values other than | |
1284 .BR 1 . | |
1285 A memory usage limit can still make | |
1286 .B xz | |
1287 switch to single-threaded mode unless | |
1288 .B \-\-no\-adjust | |
1289 is used. | |
1290 Support for the | |
1291 .B + | |
1292 prefix was added in | |
1293 .B xz | |
1294 5.4.0. | |
1295 .IP "" | |
1296 If an automatic number of threads has been requested and | |
1297 no memory usage limit has been specified, | |
1298 then a system-specific default soft limit will be used to possibly | |
1299 limit the number of threads. | |
1300 It is a soft limit in the sense that it is ignored | |
1301 if the number of threads becomes one, | |
1302 thus a soft limit will never stop | |
1303 .B xz | |
1304 from compressing or decompressing. | |
1305 This default soft limit will not make | |
1306 .B xz | |
1307 switch from multi-threaded mode to single-threaded mode. | |
1308 The active limits can be seen with | |
1309 .BR "xz \-\-info\-memory" . | |
1310 .IP "" | |
1311 Currently the only threading method is to split the input into | |
1312 blocks and compress them independently from each other. | |
1313 The default block size depends on the compression level and | |
1314 can be overridden with the | |
1315 .BI \-\-block\-size= size | |
1316 option. | |
1317 .IP "" | |
1318 Threaded decompression only works on files that contain | |
1319 multiple blocks with size information in block headers. | |
1320 All large enough files compressed in multi-threaded mode | |
1321 meet this condition, | |
1322 but files compressed in single-threaded mode don't even if | |
1323 .BI \-\-block\-size= size | |
1324 has been used. | |
1325 .IP "" | |
1326 The default value for | |
1327 .I threads | |
1328 is | |
1329 .BR 0 . | |
1330 In | |
1331 .B xz | |
1332 5.4.x and older the default is | |
1333 .BR 1 . | |
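The mapping from the threads value to the compressor mode can be summarized in a Python sketch. The function name is hypothetical, and the sketch ignores memory usage limits and input size, both of which can still reduce the thread count or switch the mode as described above.

```python
def threading_mode(threads):
    """Map a threads value (a string) to (mode, thread_count).

    thread_count is None when the number is chosen automatically.
    """
    forced_mt = threads.startswith("+")
    count = int(threads)  # int() accepts a leading "+"
    if count < 0:
        raise ValueError("threads must be 0 or a positive integer")
    if count == 0:
        # Up to as many threads as the processors support.
        return ("multi-threaded", None)
    if count == 1 and not forced_mt:
        return ("single-threaded", 1)
    # Any other value, including +1, selects the multi-threaded compressor.
    return ("multi-threaded", count)


if __name__ == "__main__":
    for value in ("0", "1", "+1", "4", "+4"):
        print(value, "->", threading_mode(value))
```

The only case where the prefix changes the result is 1 versus +1, which matches the statement that the prefix has no effect with other values.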
1334 . | |
1335 .SS "Custom compressor filter chains" | |
1336 A custom filter chain allows specifying | |
1337 the compression settings in detail instead of relying on | |
1338 the settings associated with the presets. | |
1339 When a custom filter chain is specified, | |
1340 preset options | |
1341 .RB ( \-0 | |
1342 \&...\& | |
1343 .B \-9 | |
1344 and | |
1345 .BR \-\-extreme ) | |
1346 earlier on the command line are forgotten. | |
1347 If a preset option is specified | |
1348 after one or more custom filter chain options, | |
1349 the new preset takes effect and | |
1350 the custom filter chain options specified earlier are forgotten. | |
1351 .PP | |
1352 A filter chain is comparable to piping on the command line. | |
1353 When compressing, the uncompressed input goes to the first filter, | |
1354 whose output goes to the next filter (if any). | |
1355 The output of the last filter gets written to the compressed file. | |
1356 The maximum number of filters in the chain is four, | |
1357 but typically a filter chain has only one or two filters. | |
1358 .PP | |
1359 Many filters have limitations on where they can be | |
1360 in the filter chain: | |
1361 some filters can work only as the last filter in the chain, | |
1362 some only as a non-last filter, and some work in any position | |
1363 in the chain. | |
1364 Depending on the filter, this limitation is either inherent to | |
1365 the filter design or exists to prevent security issues. | |
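The position limits can be observed through Python's standard lzma module, which is a binding to liblzma rather than the xz command itself. The sketch below assumes that an invalid chain is reported as lzma.LZMAError; the test data is arbitrary.

```python
import lzma

data = bytes(range(256)) * 64

delta = {"id": lzma.FILTER_DELTA, "dist": 1}
lzma2 = {"id": lzma.FILTER_LZMA2, "preset": 6}

# Delta works only as a non-last filter and LZMA2 only as the last one,
# so this order is accepted.
good = lzma.compress(data, format=lzma.FORMAT_XZ, filters=[delta, lzma2])
restored = lzma.decompress(good)

# The reversed order puts LZMA2 in a non-last position.
try:
    lzma.compress(data, format=lzma.FORMAT_XZ, filters=[lzma2, delta])
    rejected = False
except lzma.LZMAError:
    rejected = True

print("round trip ok:", restored == data)
print("reversed chain rejected:", rejected)
```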
1366 .PP | |
1367 A custom filter chain can be specified in two different ways. | |
1368 The options | |
1369 .BI \-\-filters= filters | |
1370 and | |
1371 .BI \-\-filters1= filters | |
1372 \&...\& | |
1373 .BI \-\-filters9= filters | |
1374 allow specifying an entire filter chain in one option using the | |
1375 liblzma filter string syntax. | |
1376 Alternatively, a filter chain can be specified by using one or more | |
1377 individual filter options in the order they are wanted in the filter chain. | |
1378 That is, the order of the individual filter options is significant! | |
1379 When decoding raw streams | |
1380 .RB ( \-\-format=raw ), | |
1381 the filter chain must be specified in the same order as | |
1382 it was specified when compressing. | |
1383 Any individual filter or preset options specified before the full | |
1384 chain option | |
1385 (\fB\-\-filters=\fIfilters\fR) | |
1386 will be forgotten. | |
1387 Individual filters specified after the full chain option will reset the | |
1388 filter chain. | |
1389 .PP | |
1390 Both the full and individual filter options take filter-specific | |
1391 .I options | |
1392 as a comma-separated list. | |
1393 Extra commas in | |
1394 .I options | |
1395 are ignored. | |
1396 Every option has a default value, so | |
1397 specify only those you want to change. | |
1398 .PP | |
1399 To see the whole filter chain and | |
1400 .IR options , | |
1401 use | |
1402 .B "xz \-vv" | |
1403 (that is, use | |
1404 .B \-\-verbose | |
1405 twice). | |
1406 This also works for viewing the filter chain options used by presets. | |
1407 .TP | |
1408 .BI \-\-filters= filters | |
1409 Specify the full filter chain or a preset in a single option. | |
1410 Each filter can be separated by spaces or two dashes | |
1411 .RB ( \-\- ). | |
1412 .I filters | |
1413 may need to be quoted on the shell command line so it is | |
1414 parsed as a single option. | |
1415 To denote | |
1416 .IR options , | |
1417 use | |
1418 .B : | |
1419 or | |
1420 .BR = . | |
1421 A preset can be prefixed with a | |
1422 .B \- | |
1423 and followed with zero or more flags. | |
1424 The only supported flag is | |
1425 .B e | |
1426 to apply the same options as | |
1427 .BR \-\-extreme . | |
1428 .TP | |
1429 \fB\-\-filters1\fR=\fIfilters\fR ... \fB\-\-filters9\fR=\fIfilters | |
1430 Specify up to nine additional filter chains that can be used with | |
1431 .BR \-\-block\-list . | |
1432 .IP "" | |
1433 For example, when compressing an archive with executable files | |
1434 followed by text files, the executable part could use a filter | |
1435 chain with a BCJ filter and the text part only the LZMA2 filter. | |
1436 .TP | |
1437 .B \-\-filters\-help | |
1438 Display a help message describing how to specify presets and | |
1439 custom filter chains in the | |
1440 .B \-\-filters | |
1441 and | |
1442 .BI \-\-filters1= filters | |
1443 \&...\& | |
1444 .BI \-\-filters9= filters | |
1445 options, and exit successfully. | |
1446 .TP | |
1447 \fB\-\-lzma1\fR[\fB=\fIoptions\fR] | |
1448 .PD 0 | |
1449 .TP | |
1450 \fB\-\-lzma2\fR[\fB=\fIoptions\fR] | |
1451 .PD | |
1452 Add the LZMA1 or LZMA2 filter to the filter chain. | |
1453 These filters can be used only as the last filter in the chain. | |
1454 .IP "" | |
1455 LZMA1 is a legacy filter, | |
1456 which is supported almost solely due to the legacy | |
1457 .B .lzma | |
1458 file format, which supports only LZMA1. | |
1459 LZMA2 is an updated | |
1460 version of LZMA1 to fix some practical issues of LZMA1. | |
1461 The | |
1462 .B .xz | |
1463 format uses LZMA2 and doesn't support LZMA1 at all. | |
1464 Compression speed and ratios of LZMA1 and LZMA2 | |
1465 are practically the same. | |
1466 .IP "" | |
1467 LZMA1 and LZMA2 share the same set of | |
1468 .IR options : | |
1469 .RS | |
1470 .TP | |
1471 .BI preset= preset | |
1472 Reset all LZMA1 or LZMA2 | |
1473 .I options | |
1474 to | |
1475 .IR preset . | |
1476 .I Preset | |
1477 consists of an integer, which may be followed by single-letter | |
1478 preset modifiers. | |
1479 The integer can be from | |
1480 .B 0 | |
1481 to | |
1482 .BR 9 , | |
1483 matching the command line options | |
1484 .B \-0 | |
1485 \&...\& | |
1486 .BR \-9 . | |
1487 The only supported modifier is currently | |
1488 .BR e , | |
1489 which matches | |
1490 .BR \-\-extreme . | |
1491 If no | |
1492 .B preset | |
1493 is specified, the default values of LZMA1 or LZMA2 | |
1494 .I options | |
1495 are taken from the preset | |
1496 .BR 6 . | |
1497 .TP | |
1498 .BI dict= size | |
1499 Dictionary (history buffer) | |
1500 .I size | |
1501 indicates how many bytes of the recently processed | |
1502 uncompressed data are kept in memory. | |
1503 The algorithm tries to find repeating byte sequences (matches) in | |
1504 the uncompressed data, and replace them with references | |
1505 to the data currently in the dictionary. | |
1506 The bigger the dictionary, the higher the chance | |
1507 to find a match. | |
1508 Thus, increasing dictionary | |
1509 .I size | |
1510 usually improves compression ratio, but | |
1511 a dictionary bigger than the uncompressed file is a waste of memory. | |
1512 .IP "" | |
1513 Typical dictionary | |
1514 .I size | |
1515 is from 64\ KiB to 64\ MiB. | |
1516 The minimum is 4\ KiB. | |
1517 The maximum for compression is currently 1.5\ GiB (1536\ MiB). | |
1518 The decompressor already supports dictionaries up to | |
1519 one byte less than 4\ GiB, which is the maximum for | |
1520 the LZMA1 and LZMA2 stream formats. | |
1521 .IP "" | |
1522 Dictionary | |
1523 .I size | |
1524 and match finder | |
1525 .RI ( mf ) | |
1526 together determine the memory usage of the LZMA1 or LZMA2 encoder. | |
1527 Decompressing requires the same (or bigger) dictionary | |
1528 .I size | |
1529 as was used when compressing, | |
1530 thus the memory usage of the decoder is determined | |
1531 by the dictionary size used when compressing. | |
1532 The | |
1533 .B .xz | |
1534 headers store the dictionary | |
1535 .I size | |
1536 either as | |
1537 .RI "2^" n | |
1538 or | |
1539 .RI "2^" n " + 2^(" n "\-1)," | |
1540 so these | |
1541 .I sizes | |
1542 are somewhat preferred for compression. | |
1543 Other | |
1544 .I sizes | |
1545 will get rounded up when stored in the | |
1546 .B .xz | |
1547 headers. | |
1548 .TP | |
1549 .BI lc= lc | |
1550 Specify the number of literal context bits. | |
1551 The minimum is 0 and the maximum is 4; the default is 3. | |
1552 In addition, the sum of | |
1553 .I lc | |
1554 and | |
1555 .I lp | |
1556 must not exceed 4. | |
1557 .IP "" | |
1558 All bytes that cannot be encoded as matches | |
1559 are encoded as literals. | |
1560 That is, literals are simply 8-bit bytes | |
1561 that are encoded one at a time. | |
1562 .IP "" | |
1563 The literal coding makes an assumption that the highest | |
1564 .I lc | |
1565 bits of the previous uncompressed byte correlate | |
1566 with the next byte. | |
1567 For example, in typical English text, an upper-case letter is | |
1568 often followed by a lower-case letter, and a lower-case | |
1569 letter is usually followed by another lower-case letter. | |
1570 In the US-ASCII character set, the highest three bits are 010 | |
1571 for upper-case letters and 011 for lower-case letters. | |
1572 When | |
1573 .I lc | |
1574 is at least 3, the literal coding can take advantage of | |
1575 this property in the uncompressed data. | |
1576 .IP "" | |
1577 The default value (3) is usually good. | |
1578 If you want maximum compression, test | |
1579 .BR lc=4 . | |
1580 Sometimes it helps a little, and | |
1581 sometimes it makes compression worse. | |
1582 If it makes it worse, test | |
1583 .B lc=2 | |
1584 too. | |
1585 .TP | |
1586 .BI lp= lp | |
1587 Specify the number of literal position bits. | |
1588 The minimum is 0 and the maximum is 4; the default is 0. | |
1589 .IP "" | |
1590 .I Lp | |
1591 affects what kind of alignment in the uncompressed data is | |
1592 assumed when encoding literals. | |
1593 See | |
1594 .I pb | |
1595 below for more information about alignment. | |
1596 .TP | |
1597 .BI pb= pb | |
1598 Specify the number of position bits. | |
1599 The minimum is 0 and the maximum is 4; the default is 2. | |
1600 .IP "" | |
1601 .I Pb | |
1602 affects what kind of alignment in the uncompressed data is | |
1603 assumed in general. | |
1604 The default means four-byte alignment | |
1605 .RI (2^ pb =2^2=4), | |
1606 which is often a good choice when there's no better guess. | |
1607 .IP "" | |
1608 When the alignment is known, setting | |
1609 .I pb | |
1610 accordingly may reduce the file size a little. | |
1611 For example, with text files having one-byte | |
1612 alignment (US-ASCII, ISO-8859-*, UTF-8), setting | |
1613 .B pb=0 | |
1614 can improve compression slightly. | |
1615 For UTF-16 text, | |
1616 .B pb=1 | |
1617 is a good choice. | |
1618 If the alignment is an odd number like 3 bytes, | |
1619 .B pb=0 | |
1620 might be the best choice. | |
1621 .IP "" | |
1622 Even though the assumed alignment can be adjusted with | |
1623 .I pb | |
1624 and | |
1625 .IR lp , | |
1626 LZMA1 and LZMA2 still slightly favor 16-byte alignment. | |
1627 It might be worth taking into account when designing file formats | |
1628 that are likely to be often compressed with LZMA1 or LZMA2. | |
1629 .TP | |
1630 .BI mf= mf | |
1631 The match finder has a major effect on encoder speed, | |
1632 memory usage, and compression ratio. | |
1633 Usually Hash Chain match finders are faster than Binary Tree | |
1634 match finders. | |
1635 The default depends on the | |
1636 .IR preset : | |
1637 0 uses | |
1638 .BR hc3 , | |
1639 1\(en3 | |
1640 use | |
1641 .BR hc4 , | |
1642 and the rest use | |
1643 .BR bt4 . | |
1644 .IP "" | |
1645 The following match finders are supported. | |
1646 The memory usage formulas below are rough approximations, | |
1647 which are closest to the reality when | |
1648 .I dict | |
1649 is a power of two. | |
1650 .RS | |
1651 .TP | |
1652 .B hc3 | |
1653 Hash Chain with 2- and 3-byte hashing | |
1654 .br | |
1655 Minimum value for | |
1656 .IR nice : | |
1657 3 | |
1658 .br | |
1659 Memory usage: | |
1660 .br | |
1661 .I dict | |
1662 * 7.5 (if | |
1663 .I dict | |
1664 <= 16 MiB); | |
1665 .br | |
1666 .I dict | |
1667 * 5.5 + 64 MiB (if | |
1668 .I dict | |
1669 > 16 MiB) | |
1670 .TP | |
1671 .B hc4 | |
1672 Hash Chain with 2-, 3-, and 4-byte hashing | |
1673 .br | |
1674 Minimum value for | |
1675 .IR nice : | |
1676 4 | |
1677 .br | |
1678 Memory usage: | |
1679 .br | |
1680 .I dict | |
1681 * 7.5 (if | |
1682 .I dict | |
1683 <= 32 MiB); | |
1684 .br | |
1685 .I dict | |
1686 * 6.5 (if | |
1687 .I dict | |
1688 > 32 MiB) | |
1689 .TP | |
1690 .B bt2 | |
1691 Binary Tree with 2-byte hashing | |
1692 .br | |
1693 Minimum value for | |
1694 .IR nice : | |
1695 2 | |
1696 .br | |
1697 Memory usage: | |
1698 .I dict | |
1699 * 9.5 | |
1700 .TP | |
1701 .B bt3 | |
1702 Binary Tree with 2- and 3-byte hashing | |
1703 .br | |
1704 Minimum value for | |
1705 .IR nice : | |
1706 3 | |
1707 .br | |
1708 Memory usage: | |
1709 .br | |
1710 .I dict | |
1711 * 11.5 (if | |
1712 .I dict | |
1713 <= 16 MiB); | |
1714 .br | |
1715 .I dict | |
1716 * 9.5 + 64 MiB (if | |
1717 .I dict | |
1718 > 16 MiB) | |
1719 .TP | |
1720 .B bt4 | |
1721 Binary Tree with 2-, 3-, and 4-byte hashing | |
1722 .br | |
1723 Minimum value for | |
1724 .IR nice : | |
1725 4 | |
1726 .br | |
1727 Memory usage: | |
1728 .br | |
1729 .I dict | |
1730 * 11.5 (if | |
1731 .I dict | |
1732 <= 32 MiB); | |
1733 .br | |
1734 .I dict | |
1735 * 10.5 (if | |
1736 .I dict | |
1737 > 32 MiB) | |
1738 .RE | |
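The memory usage formulas listed for the match finders can be collected into one Python helper for quick comparison. The function name is hypothetical, and the results are the same rough approximations as the formulas themselves.

```python
MIB = 1024 * 1024


def match_finder_memory(mf, dict_size):
    """Approximate encoder memory usage in bytes for a match finder."""
    if mf == "hc3":
        if dict_size <= 16 * MIB:
            return dict_size * 7.5
        return dict_size * 5.5 + 64 * MIB
    if mf == "hc4":
        return dict_size * (7.5 if dict_size <= 32 * MIB else 6.5)
    if mf == "bt2":
        return dict_size * 9.5
    if mf == "bt3":
        if dict_size <= 16 * MIB:
            return dict_size * 11.5
        return dict_size * 9.5 + 64 * MIB
    if mf == "bt4":
        return dict_size * (11.5 if dict_size <= 32 * MIB else 10.5)
    raise ValueError("unknown match finder: " + mf)


if __name__ == "__main__":
    dict_size = 8 * MIB  # assumed dictionary size, for illustration only
    for name in ("hc3", "hc4", "bt2", "bt3", "bt4"):
        print(name, match_finder_memory(name, dict_size) / MIB, "MiB")
```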
1739 .TP | |
1740 .BI mode= mode | |
1741 Compression | |
1742 .I mode | |
1743 specifies the method to analyze | |
1744 the data produced by the match finder. | |
1745 Supported | |
1746 .I modes | |
1747 are | |
1748 .B fast | |
1749 and | |
1750 .BR normal . | |
1751 The default is | |
1752 .B fast | |
1753 for | |
1754 .I presets | |
1755 0\(en3 and | |
1756 .B normal | |
1757 for | |
1758 .I presets | |
1759 4\(en9. | |
1760 .IP "" | |
1761 Usually | |
1762 .B fast | |
1763 is used with Hash Chain match finders and | |
1764 .B normal | |
1765 with Binary Tree match finders. | |
1766 This is also what the | |
1767 .I presets | |
1768 do. | |
1769 .TP | |
1770 .BI nice= nice | |
1771 Specify what is considered to be a nice length for a match. | |
1772 Once a match of at least | |
1773 .I nice | |
1774 bytes is found, the algorithm stops | |
1775 looking for possibly better matches. | |
1776 .IP "" | |
1777 .I Nice | |
1778 can be 2\(en273 bytes. | |
1779 Higher values tend to give better compression ratio | |
1780 at the expense of speed. | |
1781 The default depends on the | |
1782 .IR preset . | |
1783 .TP | |
1784 .BI depth= depth | |
1785 Specify the maximum search depth in the match finder. | |
1786 The default is the special value of 0, | |
1787 which makes the compressor determine a reasonable | |
1788 .I depth | |
1789 from | |
1790 .I mf | |
1791 and | |
1792 .IR nice . | |
1793 .IP "" | |
1794 A reasonable | |
1795 .I depth | |
1796 is 4\(en100 for Hash Chains and 16\(en1000 for Binary Trees. | |
1797 Using very high values for | |
1798 .I depth | |
1799 can make the encoder extremely slow with some files. | |
1800 Avoid setting the | |
1801 .I depth | |
1802 over 1000 unless you are prepared to interrupt | |
1803 the compression in case it is taking far too long. | |
1804 .RE | |
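The rounding of the dictionary size described under dict can be illustrated with a Python sketch. The function name is hypothetical, and the accepted range is limited to the compression limits given in the text (4 KiB to 1.5 GiB).

```python
KIB = 1024
MIB = 1024 * KIB


def stored_dict_size(size):
    """Round size up to the nearest 2^n or 2^n + 2^(n-1)."""
    if not 4 * KIB <= size <= 1536 * MIB:
        raise ValueError("size outside the range supported for compression")
    n = size.bit_length() - 1  # largest n with 2^n <= size
    low = 1 << n
    if size == low:
        return low
    mid = low + (low >> 1)  # 2^n + 2^(n-1)
    if size <= mid:
        return mid
    return low << 1  # 2^(n+1)


if __name__ == "__main__":
    for mib in (4, 5, 6, 7):
        print(mib, "MiB is stored as", stored_dict_size(mib * MIB) // MIB, "MiB")
```

Choosing a size that is already of one of the two forms avoids the rounding.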
1805 .IP "" | |
1806 When decoding raw streams | |
1807 .RB ( \-\-format=raw ), | |
1808 LZMA2 needs only the dictionary | |
1809 .IR size . | |
1810 LZMA1 also needs | |
1811 .IR lc , | |
1812 .IR lp , | |
1813 and | |
1814 .IR pb . | |
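The point about raw LZMA2 streams can be tried with Python's standard lzma module, which wraps liblzma. In this sketch the encoder uses non-default lc, lp, and pb values, while the decoder is given only the dictionary size (the dict_size key in that module). The payload and the 1 MiB dictionary are arbitrary choices.

```python
import lzma

DICT_SIZE = 1 << 20  # 1 MiB, chosen arbitrarily for the illustration
data = b"example payload " * 4096

# Encoder: LZMA2 with non-default literal and position bits.
encode_filters = [
    {"id": lzma.FILTER_LZMA2, "dict_size": DICT_SIZE, "lc": 2, "lp": 1, "pb": 1}
]
raw = lzma.compress(data, format=lzma.FORMAT_RAW, filters=encode_filters)

# Decoder: only the dictionary size is supplied.
decode_filters = [{"id": lzma.FILTER_LZMA2, "dict_size": DICT_SIZE}]
restored = lzma.decompress(raw, format=lzma.FORMAT_RAW, filters=decode_filters)

print("round trip ok:", restored == data)
```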
1815 .TP | |
1816 \fB\-\-x86\fR[\fB=\fIoptions\fR] | |
1817 .PD 0 | |
1818 .TP | |
1819 \fB\-\-arm\fR[\fB=\fIoptions\fR] | |
1820 .TP | |
1821 \fB\-\-armthumb\fR[\fB=\fIoptions\fR] | |
1822 .TP | |
1823 \fB\-\-arm64\fR[\fB=\fIoptions\fR] | |
1824 .TP | |
1825 \fB\-\-powerpc\fR[\fB=\fIoptions\fR] | |
1826 .TP | |
1827 \fB\-\-ia64\fR[\fB=\fIoptions\fR] | |
1828 .TP | |
1829 \fB\-\-sparc\fR[\fB=\fIoptions\fR] | |
1830 .TP | |
1831 \fB\-\-riscv\fR[\fB=\fIoptions\fR] | |
1832 .PD | |
1833 Add a branch/call/jump (BCJ) filter to the filter chain. | |
1834 These filters can be used only as a non-last filter | |
1835 in the filter chain. | |
1836 .IP "" | |
1837 A BCJ filter converts relative addresses in | |
1838 the machine code to their absolute counterparts. | |
1839 This doesn't change the size of the data | |
1840 but it increases redundancy, | |
1841 which can help LZMA2 to produce a 0\(en15\ % smaller | |
1842 .B .xz | |
1843 file. | |
1844 The BCJ filters are always reversible, | |
1845 so using a BCJ filter for the wrong type of data | |
1846 doesn't cause any data loss, although it may make | |
1847 the compression ratio slightly worse. | |
1848 The BCJ filters are very fast and | |
1849 use an insignificant amount of memory. | |
1850 .IP "" | |
1851 These BCJ filters have known problems related to | |
1852 the compression ratio: | |
1853 .RS | |
1854 .IP \(bu 3 | |
1855 Some types of files containing executable code | |
1856 (for example, object files, static libraries, and Linux kernel modules) | |
1857 have the addresses in the instructions filled with filler values. | |
1858 These BCJ filters will still do the address conversion, | |
1859 which will make the compression worse with these files. | |
1860 .IP \(bu 3 | |
1861 If a BCJ filter is applied on an archive, | |
1862 it is possible that it makes the compression ratio | |
1863 worse than not using a BCJ filter. | |
1864 For example, if there are similar or even identical executables | |
1865 then filtering will likely make the files less similar | |
1866 and thus compression is worse. | |
1867 The contents of non-executable files in the same archive can matter too. | |
1868 In practice one has to try with and without a BCJ filter to see | |
1869 which is better in each situation. | |
1870 .RE | |
1871 .IP "" | |
1872 Different instruction sets have different alignment: | |
1873 the executable file must be aligned to a multiple of | |
1874 this value in the input data to make the filter work. | |
1875 .RS | |
1876 .RS | |
1877 .PP | |
1878 .TS | |
1879 tab(;); | |
1880 l n l | |
1881 l n l. | |
1882 Filter;Alignment;Notes | |
1883 x86;1;32-bit or 64-bit x86 | |
1884 ARM;4; | |
1885 ARM-Thumb;2; | |
1886 ARM64;4;4096-byte alignment is best | |
1887 PowerPC;4;Big endian only | |
1888 IA-64;16;Itanium | |
1889 SPARC;4; | |
1890 RISC-V;2; | |
1891 .TE | |
1892 .RE | |
1893 .RE | |
1894 .IP "" | |
1895 Since the BCJ-filtered data is usually compressed with LZMA2, | |
1896 the compression ratio may be improved slightly if | |
1897 the LZMA2 options are set to match the | |
1898 alignment of the selected BCJ filter. | |
1899 Examples: | |
1900 .RS | |
1901 .IP \(bu 3 | |
1902 The IA-64 filter has 16-byte alignment, so | |
1903 .B pb=4,lp=4,lc=0 | |
1904 is good | |
1905 with LZMA2 (2^4=16). | |
1906 .IP \(bu 3 | |
1907 RISC-V code has 2-byte or 4-byte alignment | |
1908 depending on whether the file contains | |
1909 16-bit compressed instructions (the C extension). | |
1910 When 16-bit instructions are used, | |
1911 .B pb=2,lp=1,lc=3 | |
1912 or | |
1913 .B pb=1,lp=1,lc=3 | |
1914 is good. | |
1915 When 16-bit instructions aren't present, | |
1916 .B pb=2,lp=2,lc=2 | |
1917 is the best. | |
1918 .B readelf \-h | |
1919 can be used to check if "RVC" | |
1920 appears on the "Flags" line. | |
1921 .IP \(bu 3 | |
1922 ARM64 is always 4-byte aligned so | |
1923 .B pb=2,lp=2,lc=2 | |
1924 is the best. | |
1925 .IP \(bu 3 | |
1926 The x86 filter is an exception. | |
1927 It's usually good to stick to LZMA2's defaults | |
1928 .RB ( pb=2,lp=0,lc=3 ) | |
1929 when compressing x86 executables. | |
1930 .RE | |
1931 .IP "" | |
1932 All BCJ filters support the same | |
1933 .IR options : | |
1934 .RS | |
1935 .TP | |
1936 .BI start= offset | |
1937 Specify the start | |
1938 .I offset | |
1939 that is used when converting between relative | |
1940 and absolute addresses. | |
1941 The | |
1942 .I offset | |
1943 must be a multiple of the alignment of the filter | |
1944 (see the table above). | |
1945 The default is zero. | |
1946 In practice, the default is good; specifying a custom | |
1947 .I offset | |
1948 is almost never useful. | |
1949 .RE | |
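The alignment rule for start=offset can be checked before a custom offset is passed to xz. The following is a minimal sh sketch, not part of xz: the function name bcj_start_ok is hypothetical, the filter names are assumed to be the long option names without the leading dashes, and the alignments are copied from the table above.

```sh
# Hypothetical helper: succeed if OFFSET ($2) is a multiple of the
# alignment of the BCJ filter named in $1 (values from the table above).
# Returns 2 for a filter name it does not know.
bcj_start_ok() {
    case $1 in
        x86)                     align=1 ;;
        armthumb|riscv)          align=2 ;;
        arm|arm64|powerpc|sparc) align=4 ;;
        ia64)                    align=16 ;;
        *) echo "unknown BCJ filter: $1" >&2; return 2 ;;
    esac
    # The exit status of the test is the exit status of the function.
    [ $(( $2 % align )) -eq 0 ]
}
```

With this sketch, bcj_start_ok ia64 4096 succeeds and bcj_start_ok ia64 8 fails, because 8 is not a multiple of 16.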
1950 .TP | |
1951 \fB\-\-delta\fR[\fB=\fIoptions\fR] | |
1952 Add the Delta filter to the filter chain. | |
1953 The Delta filter can only be used as a non-last filter | |
1954 in the filter chain. | |
1955 .IP "" | |
1956 Currently only simple byte-wise delta calculation is supported. | |
1957 It can be useful when compressing, for example, uncompressed bitmap images | |
1958 or uncompressed PCM audio. | |
1959 However, special purpose algorithms may give significantly better | |
1960 results than Delta + LZMA2. | |
1961 This is true especially with audio, | |
1962 which compresses faster and better, for example, with | |
1963 .BR flac (1). | |
1964 .IP "" | |
1965 Supported | |
1966 .IR options : | |
1967 .RS | |
1968 .TP | |
1969 .BI dist= distance | |
1970 Specify the | |
1971 .I distance | |
1972 of the delta calculation in bytes. | |
1973 .I distance | |
1974 must be 1\(en256. | |
1975 The default is 1. | |
1976 .IP "" | |
1977 For example, with | |
1978 .B dist=2 | |
1979 and eight-byte input A1 B1 A2 B3 A3 B5 A4 B7, the output will be | |
1980 A1 B1 01 02 01 02 01 02. | |
1981 .RE | |
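The byte-wise delta calculation from the dist=2 example can be reproduced with a few lines of sh and awk. This is only an illustrative sketch of the arithmetic, not how xz implements the filter: the function name delta_encode is hypothetical, bytes are passed as two-digit hexadecimal arguments, and bytes before the start of the input are assumed to be zero, which matches the example where the first two bytes pass through unchanged.

```sh
# Hypothetical helper: delta_encode DIST BYTE...
# Each output byte is the input byte minus the byte DIST positions
# earlier, modulo 256. Bytes are given and printed as two-digit hex.
delta_encode() {
    dist=$1
    shift
    # Convert the hex arguments to decimal, one value per line.
    for b in "$@"; do
        printf '%d\n' "0x$b"
    done |
    awk -v dist="$dist" '
        {
            byte[NR] = $1
            # History before the first byte is treated as zero.
            prev = (NR > dist) ? byte[NR - dist] : 0
            printf "%s%02X", (NR > 1 ? " " : ""), ($1 - prev + 256) % 256
        }
        END { if (NR) print "" }
    '
}
```

Running delta_encode 2 A1 B1 A2 B3 A3 B5 A4 B7 prints A1 B1 01 02 01 02 01 02, the same output as in the example above.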
1982 . | |
1983 .SS "Other options" | |
1984 .TP | |
1985 .BR \-q ", " \-\-quiet | |
1986 Suppress warnings and notices. | |
1987 Specify this twice to suppress errors too. | |
1988 This option has no effect on the exit status. | |
1989 That is, even if a warning message was suppressed, | |
1990 the exit status still indicates that a warning occurred. | |
1991 .TP | |
1992 .BR \-v ", " \-\-verbose | |
1993 Be verbose. | |
1994 If standard error is connected to a terminal, | |
1995 .B xz | |
1996 will display a progress indicator. | |
1997 Specifying | |
1998 .B \-\-verbose | |
1999 twice will give even more verbose output. | |
2000 .IP "" | |
2001 The progress indicator shows the following information: | |
2002 .RS | |
2003 .IP \(bu 3 | |
2004 Completion percentage is shown | |
2005 if the size of the input file is known. | |
2006 That is, the percentage cannot be shown in pipes. | |
2007 .IP \(bu 3 | |
2008 Amount of compressed data produced (compressing) | |
2009 or consumed (decompressing). | |
2010 .IP \(bu 3 | |
2011 Amount of uncompressed data consumed (compressing) | |
2012 or produced (decompressing). | |
2013 .IP \(bu 3 | |
2014 Compression ratio, which is calculated by dividing | |
2015 the amount of compressed data processed so far by | |
2016 the amount of uncompressed data processed so far. | |
2017 .IP \(bu 3 | |
2018 Compression or decompression speed. | |
2019 This is measured as the amount of uncompressed data consumed | |
2020 (compression) or produced (decompression) per second. | |
2021 It is shown after a few seconds have passed since | |
2022 .B xz | |
2023 started processing the file. | |
2024 .IP \(bu 3 | |
2025 Elapsed time in the format M:SS or H:MM:SS. | |
2026 .IP \(bu 3 | |
2027 Estimated remaining time is shown | |
2028 only when the size of the input file is | |
2029 known and a couple of seconds have already passed since | |
2030 .B xz | |
2031 started processing the file. | |
2032 The time is shown in a less precise format which | |
2033 never has any colons, for example, 2 min 30 s. | |
2034 .RE | |
2035 .IP "" | |
2036 When standard error is not a terminal, | |
2037 .B \-\-verbose | |
2038 will make | |
2039 .B xz | |
2040 print the filename, compressed size, uncompressed size, | |
2041 compression ratio, and possibly also the speed and elapsed time | |
2042 on a single line to standard error after compressing or | |
2043 decompressing the file. | |
2044 The speed and elapsed time are included only when | |
2045 the operation took at least a few seconds. | |
2046 If the operation didn't finish, for example, due to user interruption, | |
2047 the completion percentage is also printed | |
2048 if the size of the input file is known. | |
2049 .TP | |
2050 .BR \-Q ", " \-\-no\-warn | |
2051 Don't set the exit status to 2 | |
2052 even if a condition worth a warning was detected. | |
2053 This option doesn't affect the verbosity level, thus both | |
2054 .B \-\-quiet | |
2055 and | |
2056 .B \-\-no\-warn | |
2057 have to be used to suppress the warnings and | |
2058 to leave the exit status unaltered. | |
2059 .TP | |
2060 .B \-\-robot | |
2061 Print messages in a machine-parsable format. | |
2062 This is intended to ease writing frontends that want to use | |
2063 .B xz | |
2064 instead of liblzma, which may be the case with various scripts. | |
2065 The output with this option enabled is meant to be stable across | |
2066 .B xz | |
2067 releases. | |
2068 See the section | |
2069 .B "ROBOT MODE" | |
2070 for details. | |
2071 .TP | |
2072 .B \-\-info\-memory | |
2073 Display, in human-readable format, how much physical memory (RAM) | |
2074 and how many processor threads | |
2075 .B xz | |
2076 thinks the system has and the memory usage limits for compression | |
2077 and decompression, and exit successfully. | |
2078 .TP | |
2079 .BR \-h ", " \-\-help | |
2080 Display a help message describing the most commonly used options, | |
2081 and exit successfully. | |
2082 .TP | |
2083 .BR \-H ", " \-\-long\-help | |
2084 Display a help message describing all features of | |
2085 .BR xz , | |
2086 and exit successfully. | |
2087 .TP | |
2088 .BR \-V ", " \-\-version | |
2089 Display the version number of | |
2090 .B xz | |
2091 and liblzma in human-readable format. | |
2092 To get machine-parsable output, specify | |
2093 .B \-\-robot | |
2094 before | |
2095 .BR \-\-version . | |
2096 . | |
2097 .SH "ROBOT MODE" | |
2098 The robot mode is activated with the | |
2099 .B \-\-robot | |
2100 option. | |
2101 It makes the output of | |
2102 .B xz | |
2103 easier to parse by other programs. | |
2104 Currently | |
2105 .B \-\-robot | |
2106 is supported only together with | |
2107 .BR \-\-list , | |
2108 .BR \-\-filters\-help , | |
2109 .BR \-\-info\-memory , | |
2110 and | |
2111 .BR \-\-version . | |
2112 It will be supported for compression and | |
2113 decompression in the future. | |
2114 . | |
2115 .SS "List mode" | |
2116 .B "xz \-\-robot \-\-list" | |
2117 uses tab-separated output. | |
2118 The first column of every line has a string | |
2119 that indicates the type of the information found on that line: | |
2120 .TP | |
2121 .B name | |
2122 This is always the first line when starting to list a file. | |
2123 The second column on the line is the filename. | |
2124 .TP | |
2125 .B file | |
2126 This line contains overall information about the | |
2127 .B .xz | |
2128 file. | |
2129 This line is always printed after the | |
2130 .B name | |
2131 line. | |
2132 .TP | |
2133 .B stream | |
2134 This line type is used only when | |
2135 .B \-\-verbose | |
2136 was specified. | |
2137 There are as many | |
2138 .B stream | |
2139 lines as there are streams in the | |
2140 .B .xz | |
2141 file. | |
2142 .TP | |
2143 .B block | |
2144 This line type is used only when | |
2145 .B \-\-verbose | |
2146 was specified. | |
2147 There are as many | |
2148 .B block | |
2149 lines as there are blocks in the | |
2150 .B .xz | |
2151 file. | |
2152 The | |
2153 .B block | |
2154 lines are shown after all the | |
2155 .B stream | |
2156 lines; different line types are not interleaved. | |
2157 .TP | |
2158 .B summary | |
2159 This line type is used only when | |
2160 .B \-\-verbose | |
2161 was specified twice. | |
2162 This line is printed after all | |
2163 .B block | |
2164 lines. | |
2165 Like the | |
2166 .B file | |
2167 line, the | |
2168 .B summary | |
2169 line contains overall information about the | |
2170 .B .xz | |
2171 file. | |
2172 .TP | |
2173 .B totals | |
2174 This line is always the very last line of the list output. | |
2175 It shows the total counts and sizes. | |
2176 .PP | |
2177 The columns of the | |
2178 .B file | |
2179 lines: | |
2180 .PD 0 | |
2181 .RS | |
2182 .IP 2. 4 | |
2183 Number of streams in the file | |
2184 .IP 3. 4 | |
2185 Total number of blocks in the stream(s) | |
2186 .IP 4. 4 | |
2187 Compressed size of the file | |
2188 .IP 5. 4 | |
2189 Uncompressed size of the file | |
2190 .IP 6. 4 | |
2191 Compression ratio, for example, | |
2192 .BR 0.123 . | |
2193 If the ratio is over 9.999, three dashes | |
2194 .RB ( \-\-\- ) | |
2195 are displayed instead of the ratio. | |
2196 .IP 7. 4 | |
2197 Comma-separated list of integrity check names. | |
2198 The following strings are used for the known check types: | |
2199 .BR None , | |
2200 .BR CRC32 , | |
2201 .BR CRC64 , | |
2202 and | |
2203 .BR SHA\-256 . | |
2204 For unknown check types, | |
2205 .BI Unknown\- N | |
2206 is used, where | |
2207 .I N | |
2208 is the Check ID as a decimal number (one or two digits). | |
2209 .IP 8. 4 | |
2210 Total size of stream padding in the file | |
2211 .RE | |
2212 .PD | |
2213 .PP | |
2214 The columns of the | |
2215 .B stream | |
2216 lines: | |
2217 .PD 0 | |
2218 .RS | |
2219 .IP 2. 4 | |
2220 Stream number (the first stream is 1) | |
2221 .IP 3. 4 | |
2222 Number of blocks in the stream | |
2223 .IP 4. 4 | |
2224 Compressed start offset | |
2225 .IP 5. 4 | |
2226 Uncompressed start offset | |
2227 .IP 6. 4 | |
2228 Compressed size (does not include stream padding) | |
2229 .IP 7. 4 | |
2230 Uncompressed size | |
2231 .IP 8. 4 | |
2232 Compression ratio | |
2233 .IP 9. 4 | |
2234 Name of the integrity check | |
2235 .IP 10. 4 | |
2236 Size of stream padding | |
2237 .RE | |
2238 .PD | |
2239 .PP | |
2240 The columns of the | |
2241 .B block | |
2242 lines: | |
2243 .PD 0 | |
2244 .RS | |
2245 .IP 2. 4 | |
2246 Number of the stream containing this block | |
2247 .IP 3. 4 | |
2248 Block number relative to the beginning of the stream | |
2249 (the first block is 1) | |
2250 .IP 4. 4 | |
2251 Block number relative to the beginning of the file | |
2252 .IP 5. 4 | |
2253 Compressed start offset relative to the beginning of the file | |
2254 .IP 6. 4 | |
2255 Uncompressed start offset relative to the beginning of the file | |
2256 .IP 7. 4 | |
2257 Total compressed size of the block (includes headers) | |
2258 .IP 8. 4 | |
2259 Uncompressed size | |
2260 .IP 9. 4 | |
2261 Compression ratio | |
2262 .IP 10. 4 | |
2263 Name of the integrity check | |
2264 .RE | |
2265 .PD | |
2266 .PP | |
2267 If | |
2268 .B \-\-verbose | |
2269 was specified twice, additional columns are included on the | |
2270 .B block | |
2271 lines. | |
2272 These are not displayed with a single | |
2273 .BR \-\-verbose , | |
2274 because getting this information requires many seeks | |
2275 and can thus be slow: | |
2276 .PD 0 | |
2277 .RS | |
2278 .IP 11. 4 | |
2279 Value of the integrity check in hexadecimal | |
2280 .IP 12. 4 | |
2281 Block header size | |
2282 .IP 13. 4 | |
2283 Block flags: | |
2284 .B c | |
2285 indicates that compressed size is present, and | |
2286 .B u | |
2287 indicates that uncompressed size is present. | |
2288 If the flag is not set, a dash | |
2289 .RB ( \- ) | |
2290 is shown instead to keep the string length fixed. | |
2291 New flags may be added to the end of the string in the future. | |
2292 .IP 14. 4 | |
2293 Size of the actual compressed data in the block (this excludes | |
2294 the block header, block padding, and check fields) | |
2295 .IP 15. 4 | |
2296 Amount of memory (in bytes) required to decompress | |
2297 this block with this | |
2298 .B xz | |
2299 version | |
2300 .IP 16. 4 | |
2301 Filter chain. | |
2302 Note that most of the options used at compression time | |
2303 cannot be known, because only the options | |
2304 that are needed for decompression are stored in the | |
2305 .B .xz | |
2306 headers. | |
2307 .RE | |
2308 .PD | |
2309 .PP | |
2310 The columns of the | |
2311 .B summary | |
2312 lines: | |
2313 .PD 0 | |
2314 .RS | |
2315 .IP 2. 4 | |
2316 Amount of memory (in bytes) required to decompress | |
2317 this file with this | |
2318 .B xz | |
2319 version | |
2320 .IP 3. 4 | |
2321 .B yes | |
2322 or | |
2323 .B no | |
2324 indicating if all block headers have both compressed size and | |
2325 uncompressed size stored in them | |
2326 .PP | |
2327 .I Since | |
2328 .B xz | |
2329 .I 5.1.2alpha: | |
2330 .IP 4. 4 | |
2331 Minimum | |
2332 .B xz | |
2333 version required to decompress the file | |
2334 .RE | |
2335 .PD | |
2336 .PP | |
2337 The columns of the | |
2338 .B totals | |
2339 line: | |
2340 .PD 0 | |
2341 .RS | |
2342 .IP 2. 4 | |
2343 Number of streams | |
2344 .IP 3. 4 | |
2345 Number of blocks | |
2346 .IP 4. 4 | |
2347 Compressed size | |
2348 .IP 5. 4 | |
2349 Uncompressed size | |
2350 .IP 6. 4 | |
2351 Average compression ratio | |
2352 .IP 7. 4 | |
2353 Comma-separated list of integrity check names | |
2354 that were present in the files | |
2355 .IP 8. 4 | |
2356 Stream padding size | |
2357 .IP 9. 4 | |
2358 Number of files. | |
2359 This is here to | |
2360 keep the order of the earlier columns the same as on | |
2361 .B file | |
2362 lines. | |
2363 .PD | |
2364 .RE | |
2365 .PP | |
2366 If | |
2367 .B \-\-verbose | |
2368 was specified twice, additional columns are included on the | |
2369 .B totals | |
2370 line: | |
2371 .PD 0 | |
2372 .RS | |
2373 .IP 10. 4 | |
2374 Maximum amount of memory (in bytes) required to decompress | |
2375 the files with this | |
2376 .B xz | |
2377 version | |
2378 .IP 11. 4 | |
2379 .B yes | |
2380 or | |
2381 .B no | |
2382 indicating if all block headers have both compressed size and | |
2383 uncompressed size stored in them | |
2384 .PP | |
2385 .I Since | |
2386 .B xz | |
2387 .I 5.1.2alpha: | |
2388 .IP 12. 4 | |
2389 Minimum | |
2390 .B xz | |
2391 version required to decompress the file | |
2392 .RE | |
2393 .PD | |
2394 .PP | |
2395 Future versions may add new line types and | |
2396 may add new columns to the existing line types, | |
2397 but the existing columns won't be changed. | |
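The name and file line types described above are enough to compute per-file savings. The following sh sketch is one possible way to do it; the function name xz_saved_per_file is hypothetical, and it assumes that filenames contain no tabs or newlines. It relies only on the documented column order (column 4 is the compressed size and column 5 the uncompressed size on file lines) and ignores all other line types, so new line types or columns would not break it.

```sh
# Hypothetical helper: read "xz --robot --list" output on standard input
# and print "filename<TAB>bytes saved" for every listed file.
xz_saved_per_file() {
    awk -F '\t' '
        # A "name" line starts each file; remember the filename.
        $1 == "name" { name = $2 }
        # The "file" line follows; columns 4 and 5 are the sizes.
        $1 == "file" { printf "%s\t%.0f\n", name, $5 - $4 }
    '
}
```

Typical use would be xz \-\-robot \-\-list *.xz | xz_saved_per_file.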
2398 . | |
2399 .SS "Filters help" | |
2400 .B "xz \-\-robot \-\-filters\-help" | |
2401 prints the supported filters in the following format: | |
2402 .PP | |
2403 \fIfilter\fB:\fIoption\fB=<\fIvalue\fB>,\fIoption\fB=<\fIvalue\fB>\fR... | |
2404 .TP | |
2405 .I filter | |
2406 Name of the filter | |
2407 .TP | |
2408 .I option | |
2409 Name of a filter specific option | |
2410 .TP | |
2411 .I value | |
2412 Numeric | |
2413 .I value | |
2414 ranges appear as | |
2415 \fB<\fImin\fB\-\fImax\fB>\fR. | |
2416 String | |
2417 .I value | |
2418 choices are shown within | |
2419 .B "< >" | |
2420 and separated by a | |
2421 .B | | |
2422 character. | |
2423 .PP | |
2424 Each filter is printed on its own line. | |
2425 . | |
2426 .SS "Memory limit information" | |
2427 .B "xz \-\-robot \-\-info\-memory" | |
2428 prints a single line with multiple tab-separated columns: | |
2429 .IP 1. 4 | |
2430 Total amount of physical memory (RAM) in bytes. | |
2431 .IP 2. 4 | |
2432 Memory usage limit for compression in bytes | |
2433 .RB ( \-\-memlimit\-compress ). | |
2434 A special value of | |
2435 .B 0 | |
2436 indicates the default setting | |
2437 which for single-threaded mode is the same as no limit. | |
2438 .IP 3. 4 | |
2439 Memory usage limit for decompression in bytes | |
2440 .RB ( \-\-memlimit\-decompress ). | |
2441 A special value of | |
2442 .B 0 | |
2443 indicates the default setting | |
2444 which for single-threaded mode is the same as no limit. | |
2445 .IP 4. 4 | |
2446 Since | |
2447 .B xz | |
2448 5.3.4alpha: | |
2449 Memory usage for multi-threaded decompression in bytes | |
2450 .RB ( \-\-memlimit\-mt\-decompress ). | |
2451 This is never zero because a system-specific default value | |
2452 shown in column 5 | |
2453 is used if no limit has been specified explicitly. | |
2454 This is also never greater than the value in column 3 | |
2455 even if a larger value has been specified with | |
2456 .BR \-\-memlimit\-mt\-decompress . | |
2457 .IP 5. 4 | |
2458 Since | |
2459 .B xz | |
2460 5.3.4alpha: | |
2461 A system-specific default memory usage limit | |
2462 that is used to limit the number of threads | |
2463 when compressing with an automatic number of threads | |
2464 .RB ( \-\-threads=0 ) | |
2465 and no memory usage limit has been specified | |
2466 .RB ( \-\-memlimit\-compress ). | |
2467 This is also used as the default value for | |
2468 .BR \-\-memlimit\-mt\-decompress . | |
2469 .IP 6. 4 | |
2470 Since | |
2471 .B xz | |
2472 5.3.4alpha: | |
2473 Number of available processor threads. | |
2474 .PP | |
2475 In the future, the output of | |
2476 .B "xz \-\-robot \-\-info\-memory" | |
2477 may have more columns, but never more than a single line. | |
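Because columns 4 to 6 exist only since xz 5.3.4alpha, a parser should not assume that they are present. The sketch below labels the columns in the documented order. The function name xz_info_memory_fields and the key names it prints are hypothetical and chosen only for illustration.

```sh
# Hypothetical helper: read the single line printed by
# "xz --robot --info-memory" on standard input and print one
# key=value pair per known column.
xz_info_memory_fields() {
    awk -F '\t' '{
        printf "ram=%s\n", $1
        printf "limit_compress=%s\n", $2
        printf "limit_decompress=%s\n", $3
        # Columns 4-6 are printed only by xz 5.3.4alpha and later.
        if (NF >= 6) {
            printf "limit_mt_decompress=%s\n", $4
            printf "default_limit=%s\n", $5
            printf "threads=%s\n", $6
        }
    }'
}
```

The values are copied as strings, so large byte counts are not rounded. Any extra columns added by future versions are ignored.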
2478 . | |
2479 .SS Version | |
2480 .B "xz \-\-robot \-\-version" | |
2481 prints the version number of | |
2482 .B xz | |
2483 and liblzma in the following format: | |
2484 .PP | |
2485 .BI XZ_VERSION= XYYYZZZS | |
2486 .br | |
2487 .BI LIBLZMA_VERSION= XYYYZZZS | |
2488 .TP | |
2489 .I X | |
2490 Major version. | |
2491 .TP | |
2492 .I YYY | |
2493 Minor version. | |
2494 Even numbers are stable. | |
2495 Odd numbers are alpha or beta versions. | |
2496 .TP | |
2497 .I ZZZ | |
2498 Patch level for stable releases or | |
2499 just a counter for development releases. | |
2500 .TP | |
2501 .I S | |
2502 Stability. | |
2503 0 is alpha, 1 is beta, and 2 is stable. | |
2504 .I S | |
2505 should always be 2 when | |
2506 .I YYY | |
2507 is even. | |
2508 .PP | |
2509 .I XYYYZZZS | |
2510 are the same on both lines if | |
2511 .B xz | |
2512 and liblzma are from the same XZ Utils release. | |
2513 .PP | |
2514 Examples: 4.999.9beta is | |
2515 .B 49990091 | |
2516 and | |
2517 5.0.0 is | |
2518 .BR 50000002 . | |
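The XYYYZZZS encoding can be split back into its parts with integer arithmetic. The following sh sketch shows one way to do that; the function name xz_version_decode is hypothetical and the output format is an arbitrary choice.

```sh
# Hypothetical helper: decode an XYYYZZZS number into "X.Y.Z stability".
xz_version_decode() {
    v=$1
    # S is the last digit: 0 is alpha, 1 is beta, 2 is stable.
    case $(( v % 10 )) in
        0) s=alpha ;;
        1) s=beta ;;
        2) s=stable ;;
        *) s=unknown ;;
    esac
    # X is everything above the seven low digits; YYY and ZZZ are
    # three-digit groups that follow it.
    echo "$(( v / 10000000 )).$(( v / 10000 % 1000 )).$(( v / 10 % 1000 )) $s"
}
```

For the two examples above, xz_version_decode 49990091 prints "4.999.9 beta" and xz_version_decode 50000002 prints "5.0.0 stable". In a script, the argument would typically be $XZ_VERSION after evaluating the output of xz \-\-robot \-\-version.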
2519 . | |
2520 .SH "EXIT STATUS" | |
2521 .TP | |
2522 .B 0 | |
2523 All is good. | |
2524 .TP | |
2525 .B 1 | |
2526 An error occurred. | |
2527 .TP | |
2528 .B 2 | |
2529 Something worth a warning occurred, | |
2530 but no actual errors occurred. | |
2531 .PP | |
2532 Notices (not warnings or errors) printed on standard error | |
2533 don't affect the exit status. | |
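A script that wants to continue after warnings but stop on errors has to tell exit status 2 apart from exit status 1. The wrapper below is a minimal sketch of that idea; the name warn_ok is hypothetical. It has the same effect on the exit status as the \-\-no\-warn option, but it is applied in the calling script rather than in xz itself.

```sh
# Hypothetical wrapper: run a command and treat exit status 2
# (a warning) as success, while passing other statuses through.
warn_ok() {
    status=0
    "$@" || status=$?
    if [ "$status" -eq 2 ]; then
        return 0
    fi
    return "$status"
}
```

For example, warn_ok xz \-dk bar.xz || echo "decompression failed" >&2 reports a failure only when xz exits with an error.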
2534 . | |
2535 .SH ENVIRONMENT | |
2536 .B xz | |
2537 parses space-separated lists of options | |
2538 from the environment variables | |
2539 .B XZ_DEFAULTS | |
2540 and | |
2541 .BR XZ_OPT , | |
2542 in this order, before parsing the options from the command line. | |
2543 Note that only options are parsed from the environment variables; | |
2544 all non-options are silently ignored. | |
2545 Parsing is done with | |
2546 .BR getopt_long (3) | |
2547 which is used also for the command line arguments. | |
2548 .TP | |
2549 .B XZ_DEFAULTS | |
2550 User-specific or system-wide default options. | |
2551 Typically this is set in a shell initialization script to enable | |
2552 .BR xz 's | |
2553 memory usage limiter by default. | |
2554 Excluding shell initialization scripts | |
2555 and similar special cases, scripts must never set or unset | |
2556 .BR XZ_DEFAULTS . | |
2557 .TP | |
2558 .B XZ_OPT | |
2559 This is for passing options to | |
2560 .B xz | |
2561 when it is not possible to set the options directly on the | |
2562 .B xz | |
2563 command line. | |
2564 This is the case when | |
2565 .B xz | |
2566 is run by a script or tool, for example, GNU | |
2567 .BR tar (1): | |
2568 .RS | |
2569 .RS | |
2570 .PP | |
2571 .nf | |
2572 .ft CR | |
2573 XZ_OPT=\-2v tar caf foo.tar.xz foo | |
2574 .ft R | |
2575 .fi | |
2576 .RE | |
2577 .RE | |
2578 .IP "" | |
2579 Scripts may use | |
2580 .BR XZ_OPT , | |
2581 for example, to set script-specific default compression options. | |
2582 It is still recommended to allow users to override | |
2583 .B XZ_OPT | |
2584 if that is reasonable. | |
2585 For example, in | |
2586 .BR sh (1) | |
2587 scripts one may use something like this: | |
2588 .RS | |
2589 .RS | |
2590 .PP | |
2591 .nf | |
2592 .ft CR | |
2593 XZ_OPT=${XZ_OPT\-"\-7e"} | |
2594 export XZ_OPT | |
2595 .ft R | |
2596 .fi | |
2597 .RE | |
2598 .RE | |
2599 . | |
2600 .SH "LZMA UTILS COMPATIBILITY" | |
2601 The command line syntax of | |
2602 .B xz | |
2603 is practically a superset of | |
2604 .BR lzma , | |
2605 .BR unlzma , | |
2606 and | |
2607 .B lzcat | |
2608 as found in LZMA Utils 4.32.x. | |
2609 In most cases, it is possible to replace | |
2610 LZMA Utils with XZ Utils without breaking existing scripts. | |
2611 There are some incompatibilities though, | |
2612 which may sometimes cause problems. | |
2613 . | |
2614 .SS "Compression preset levels" | |
2615 The numbering of the compression level presets is not identical in | |
2616 .B xz | |
2617 and LZMA Utils. | |
2618 The most important difference is how dictionary sizes | |
2619 are mapped to different presets. | |
2620 Dictionary size is roughly equal to the decompressor memory usage. | |
2621 .RS | |
2622 .PP | |
2623 .TS | |
2624 tab(;); | |
2625 c c c | |
2626 c n n. | |
2627 Level;xz;LZMA Utils | |
2628 \-0;256 KiB;N/A | |
2629 \-1;1 MiB;64 KiB | |
2630 \-2;2 MiB;1 MiB | |
2631 \-3;4 MiB;512 KiB | |
2632 \-4;4 MiB;1 MiB | |
2633 \-5;8 MiB;2 MiB | |
2634 \-6;8 MiB;4 MiB | |
2635 \-7;16 MiB;8 MiB | |
2636 \-8;32 MiB;16 MiB | |
2637 \-9;64 MiB;32 MiB | |
2638 .TE | |
2639 .RE | |
2640 .PP | |
2641 The dictionary size differences affect | |
2642 the compressor memory usage too, | |
2643 but there are some other differences between | |
2644 LZMA Utils and XZ Utils, which | |
2645 make the difference even bigger: | |
2646 .RS | |
2647 .PP | |
2648 .TS | |
2649 tab(;); | |
2650 c c c | |
2651 c n n. | |
2652 Level;xz;LZMA Utils 4.32.x | |
2653 \-0;3 MiB;N/A | |
2654 \-1;9 MiB;2 MiB | |
2655 \-2;17 MiB;12 MiB | |
2656 \-3;32 MiB;12 MiB | |
2657 \-4;48 MiB;16 MiB | |
2658 \-5;94 MiB;26 MiB | |
2659 \-6;94 MiB;45 MiB | |
2660 \-7;186 MiB;83 MiB | |
2661 \-8;370 MiB;159 MiB | |
2662 \-9;674 MiB;311 MiB | |
2663 .TE | |
2664 .RE | |
2665 .PP | |
2666 The default preset level in LZMA Utils is | |
2667 .B \-7 | |
2668 while in XZ Utils it is | |
2669 .BR \-6 , | |
2670 so both use an 8 MiB dictionary by default. | |
2671 . | |
2672 .SS "Streamed vs. non-streamed .lzma files" | |
2673 The uncompressed size of the file can be stored in the | |
2674 .B .lzma | |
2675 header. | |
2676 LZMA Utils does that when compressing regular files. | |
2677 The alternative is to mark the uncompressed size as unknown | |
2678 and use an end-of-payload marker to indicate | |
2679 where the decompressor should stop. | |
2680 LZMA Utils uses this method when the uncompressed size isn't known, | |
2681 which is the case, for example, in pipes. | |
2682 .PP | |
2683 .B xz | |
2684 supports decompressing | |
2685 .B .lzma | |
2686 files with or without an end-of-payload marker, but all | |
2687 .B .lzma | |
2688 files created by | |
2689 .B xz | |
2690 will use an end-of-payload marker and have the uncompressed size | |
2691 marked as unknown in the | |
2692 .B .lzma | |
2693 header. | |
2694 This may be a problem in some uncommon situations. | |
2695 For example, a | |
2696 .B .lzma | |
2697 decompressor in an embedded device might work | |
2698 only with files that have known uncompressed size. | |
2699 If you hit this problem, you need to use LZMA Utils | |
2700 or LZMA SDK to create | |
2701 .B .lzma | |
2702 files with known uncompressed size. | |
2703 . | |
2704 .SS "Unsupported .lzma files" | |
2705 The | |
2706 .B .lzma | |
2707 format allows | |
2708 .I lc | |
2709 values up to 8, and | |
2710 .I lp | |
2711 values up to 4. | |
2712 LZMA Utils can decompress files with any | |
2713 .I lc | |
2714 and | |
2715 .IR lp , | |
2716 but always creates files with | |
2717 .B lc=3 | |
2718 and | |
2719 .BR lp=0 . | |
2720 Creating files with other | |
2721 .I lc | |
2722 and | |
2723 .I lp | |
2724 is possible with | |
2725 .B xz | |
2726 and with LZMA SDK. | |
2727 .PP | |
2728 The implementation of the LZMA1 filter in liblzma | |
2729 requires that the sum of | |
2730 .I lc | |
2731 and | |
2732 .I lp | |
2733 must not exceed 4. | |
2734 Thus, | |
2735 .B .lzma | |
2736 files that exceed this limit cannot be decompressed with | |
2737 .BR xz . | |
2738 .PP | |
2739 LZMA Utils creates only | |
2740 .B .lzma | |
2741 files which have a dictionary size of | |
2742 .RI "2^" n | |
2743 (a power of 2) but accepts files with any dictionary size. | |
2744 liblzma accepts only | |
2745 .B .lzma | |
2746 files which have a dictionary size of | |
2747 .RI "2^" n | |
2748 or | |
2749 .RI "2^" n " + 2^(" n "\-1)." | |
2750 This is to decrease false positives when detecting | |
2751 .B .lzma | |
2752 files. | |
2753 .PP | |
2754 These limitations shouldn't be a problem in practice, | |
2755 since practically all | |
2756 .B .lzma | |
2757 files have been compressed with settings that liblzma will accept. | |
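The two limits described in this section can be written down as simple arithmetic checks. The sh sketch below mirrors only the rules as stated here and is not taken from liblzma, which may handle additional special cases. The function names lzma1_lclp_ok and lzma_dict_size_ok are hypothetical.

```sh
# Hypothetical helper: succeed if lc ($1) plus lp ($2) does not exceed 4.
lzma1_lclp_ok() {
    [ $(( $1 + $2 )) -le 4 ]
}

# Hypothetical helper: succeed if SIZE ($1, in bytes) is
# 2^n or 2^n + 2^(n-1) for some n.
lzma_dict_size_ok() {
    size=$1
    p=1
    # Try every power of two that does not exceed SIZE.
    while [ "$p" -le "$size" ]; do
        if [ "$size" -eq "$p" ] || [ "$size" -eq $(( p + p / 2 )) ]; then
            return 0
        fi
        p=$(( p * 2 ))
    done
    return 1
}
```

For example, lzma1_lclp_ok 3 0 succeeds for the settings that LZMA Utils always uses, lzma_dict_size_ok 12582912 succeeds because 12 MiB is 2^23 + 2^22, and lzma_dict_size_ok 10000000 fails.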
2758 . | |
2759 .SS "Trailing garbage" | |
2760 When decompressing, | |
2761 LZMA Utils silently ignores everything after the first | |
2762 .B .lzma | |
2763 stream. | |
2764 In most situations, this is a bug. | |
2765 This also means that LZMA Utils | |
2766 doesn't support decompressing concatenated | |
2767 .B .lzma | |
2768 files. | |
2769 .PP | |
2770 If there is data left after the first | |
2771 .B .lzma | |
2772 stream, | |
2773 .B xz | |
2774 considers the file to be corrupt unless | |
2775 .B \-\-single\-stream | |
2776 was used. | |
2777 This may break obscure scripts which have | |
2778 assumed that trailing garbage is ignored. | |
2779 . | |
2780 .SH NOTES | |
2781 . | |
2782 .SS "Compressed output may vary" | |
2783 The exact compressed output produced from | |
2784 the same uncompressed input file | |
2785 may vary between XZ Utils versions even if | |
2786 compression options are identical. | |
2787 This is because the encoder can be improved | |
2788 (faster or better compression) | |
2789 without affecting the file format. | |
2790 The output can vary even between different | |
2791 builds of the same XZ Utils version, | |
2792 if different build options are used. | |
2793 .PP | |
2794 The above means that once | |
2795 .B \-\-rsyncable | |
2796 has been implemented, | |
2797 the resulting files won't necessarily be rsyncable | |
2798 unless both old and new files have been compressed | |
2799 with the same xz version. | |
2800 This problem can be fixed if a part of the encoder | |
2801 implementation is frozen to keep rsyncable output | |
2802 stable across xz versions. | |
2803 . | |
2804 .SS "Embedded .xz decompressors" | |
2805 Embedded | |
2806 .B .xz | |
2807 decompressor implementations like XZ Embedded don't necessarily | |
2808 support files created with integrity | |
2809 .I check | |
2810 types other than | |
2811 .B none | |
2812 and | |
2813 .BR crc32 . | |
2814 Since the default is | |
2815 .BR \-\-check=crc64 , | |
2816 you must use | |
2817 .B \-\-check=none | |
2818 or | |
2819 .B \-\-check=crc32 | |
2820 when creating files for embedded systems. | |
2821 .PP | |
2822 Outside embedded systems, all | |
2823 .B .xz | |
2824 format decompressors support all the | |
2825 .I check | |
2826 types, or at least are able to decompress | |
2827 the file without verifying the | |
2828 integrity check if the particular | |
2829 .I check | |
2830 is not supported. | |
2831 .PP | |
2832 XZ Embedded supports BCJ filters, | |
2833 but only with the default start offset. | |
2834 . | |
2835 .SH EXAMPLES | |
2836 . | |
2837 .SS Basics | |
2838 Compress the file | |
2839 .I foo | |
2840 into | |
2841 .I foo.xz | |
2842 using the default compression level | |
2843 .RB ( \-6 ), | |
2844 and remove | |
2845 .I foo | |
2846 if compression is successful: | |
2847 .RS | |
2848 .PP | |
2849 .nf | |
2850 .ft CR | |
2851 xz foo | |
2852 .ft R | |
2853 .fi | |
2854 .RE | |
2855 .PP | |
2856 Decompress | |
2857 .I bar.xz | |
2858 into | |
2859 .I bar | |
2860 and don't remove | |
2861 .I bar.xz | |
2862 even if decompression is successful: | |
2863 .RS | |
2864 .PP | |
2865 .nf | |
2866 .ft CR | |
2867 xz \-dk bar.xz | |
2868 .ft R | |
2869 .fi | |
2870 .RE | |
2871 .PP | |
2872 Create | |
2873 .I baz.tar.xz | |
2874 with the preset | |
2875 .B \-4e | |
2876 .RB ( "\-4 \-\-extreme" ), | |
2877 which is slower than the default | |
2878 .BR \-6 , | |
2879 but needs less memory for compression and decompression (48\ MiB | |
2880 and 5\ MiB, respectively): | |
2881 .RS | |
2882 .PP | |
2883 .nf | |
2884 .ft CR | |
2885 tar cf \- baz | xz \-4e > baz.tar.xz | |
2886 .ft R | |
2887 .fi | |
2888 .RE | |
2889 .PP | |
2890 A mix of compressed and uncompressed files can be decompressed | |
2891 to standard output with a single command: | |
2892 .RS | |
2893 .PP | |
2894 .nf | |
2895 .ft CR | |
2896 xz \-dcf a.txt b.txt.xz c.txt d.txt.lzma > abcd.txt | |
2897 .ft R | |
2898 .fi | |
2899 .RE | |
2900 . | |
2901 .SS "Parallel compression of many files" | |
2902 On GNU and *BSD, | |
2903 .BR find (1) | |
2904 and | |
2905 .BR xargs (1) | |
2906 can be used to parallelize compression of many files: | |
2907 .RS | |
2908 .PP | |
2909 .nf | |
2910 .ft CR | |
2911 find . \-type f \e! \-name '*.xz' \-print0 \e | |
2912 | xargs \-0r \-P4 \-n16 xz \-T1 | |
2913 .ft R | |
2914 .fi | |
2915 .RE | |
2916 .PP | |
2917 The | |
2918 .B \-P | |
2919 option to | |
2920 .BR xargs (1) | |
2921 sets the number of parallel | |
2922 .B xz | |
2923 processes. | |
2924 The best value for the | |
2925 .B \-n | |
2926 option depends on how many files there are to be compressed. | |
2927 If there are only a couple of files, | |
2928 the value should probably be 1; | |
2929 with tens of thousands of files, | |
2930 100 or even more may be appropriate to reduce the number of | |
2931 .B xz | |
2932 processes that | |
2933 .BR xargs (1) | |
2934 will eventually create. | |
2935 .PP | |
2936 The option | |
2937 .B \-T1 | |
2938 for | |
2939 .B xz | |
2940 is there to force it to single-threaded mode, because | |
2941 .BR xargs (1) | |
2942 is used to control the amount of parallelization. | |
2943 . | |
2944 .SS "Robot mode" | |
2945 Calculate how many bytes have been saved in total | |
2946 after compressing multiple files: | |
2947 .RS | |
2948 .PP | |
2949 .nf | |
2950 .ft CR | |
2951 xz \-\-robot \-\-list *.xz | awk '/^totals/{print $5\-$4}' | |
2952 .ft R | |
2953 .fi | |
2954 .RE | |
2955 .PP | |
2956 A script may want to know that it is using a new enough | |
2957 .BR xz . | |
2958 The following | |
2959 .BR sh (1) | |
2960 script checks that the version number of the | |
2961 .B xz | |
2962 tool is at least 5.0.0. | |
2963 This method is compatible with old beta versions, | |
2964 which didn't support the | |
2965 .B \-\-robot | |
2966 option: | |
2967 .RS | |
2968 .PP | |
2969 .nf | |
2970 .ft CR | |
2971 if ! eval "$(xz \-\-robot \-\-version 2> /dev/null)" || | |
2972 [ "$XZ_VERSION" \-lt 50000002 ]; then | |
2973 echo "Your xz is too old." | |
2974 fi | |
2975 unset XZ_VERSION LIBLZMA_VERSION | |
2976 .ft R | |
2977 .fi | |
2978 .RE | |
2979 .PP | |
2980 Set a memory usage limit for decompression using | |
2981 .BR XZ_OPT , | |
2982 but if a limit has already been set, don't increase it: | |
2983 .RS | |
2984 .PP | |
2985 .nf | |
2986 .ft CR | |
2987 NEWLIM=$((123 << 20))\ \ # 123 MiB | |
2988 OLDLIM=$(xz \-\-robot \-\-info\-memory | cut \-f3) | |
2989 if [ "$OLDLIM" \-eq 0 ] || [ "$OLDLIM" \-gt "$NEWLIM" ]; then | |
2990 XZ_OPT="$XZ_OPT \-\-memlimit\-decompress=$NEWLIM" | |
2991 export XZ_OPT | |
2992 fi | |
2993 .ft R | |
2994 .fi | |
2995 .RE | |
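.PP
The decision rule in the script above can be isolated from xz and checked on its own. The sketch below assumes, as the script does, that a current limit of 0 means no limit has been set; the function name effective_limit is hypothetical.

```shell
# Hypothetical helper: given the current limit and a proposed limit (both
# in bytes), print the limit that should be in effect afterwards.
# A current limit of 0 is treated as "no limit set".
effective_limit() {
    old=$1
    new=$2
    if [ "$old" -eq 0 ] || [ "$old" -gt "$new" ]; then
        echo "$new"    # no limit yet, or the existing one is looser
    else
        echo "$old"    # the existing limit is already at least as strict
    fi
}

NEWLIM=$((123 << 20))   # 123 MiB = 128974848 bytes
effective_limit 0 "$NEWLIM"
effective_limit 1073741824 "$NEWLIM"
effective_limit 67108864 "$NEWLIM"
```

Only the first two calls would lead to a change of XZ_OPT; in the third, the existing 64 MiB limit is stricter and is kept.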
2996 . | |
2997 .SS "Custom compressor filter chains" | |
2998 The simplest use for custom filter chains is | |
2999 customizing an LZMA2 preset. | |
3000 This can be useful, | |
3001 because the presets cover only a subset of the | |
3002 potentially useful combinations of compression settings. | |
3003 .PP | |
3004 The CompCPU columns of the tables | |
3005 from the descriptions of the options | |
3006 .BR "\-0" " ... " "\-9" | |
3007 and | |
3008 .B \-\-extreme | |
3009 are useful when customizing LZMA2 presets. | |
3010 Here are the relevant parts collected from those two tables: | |
3011 .RS | |
3012 .PP | |
3013 .TS | |
3014 tab(;); | |
3015 c c | |
3016 n n. | |
3017 Preset;CompCPU | |
3018 \-0;0 | |
3019 \-1;1 | |
3020 \-2;2 | |
3021 \-3;3 | |
3022 \-4;4 | |
3023 \-5;5 | |
3024 \-6;6 | |
3025 \-5e;7 | |
3026 \-6e;8 | |
3027 .TE | |
3028 .RE | |
3029 .PP | |
3030 If you know that a file requires | |
3031 a somewhat big dictionary (for example, 32\ MiB) to compress well, | |
3032 but you want to compress it quicker than | |
3033 .B "xz \-8" | |
3034 would do, a preset with a low CompCPU value (for example, 1) | |
3035 can be modified to use a bigger dictionary: | |
3036 .RS | |
3037 .PP | |
3038 .nf | |
3039 .ft CR | |
3040 xz \-\-lzma2=preset=1,dict=32MiB foo.tar | |
3041 .ft R | |
3042 .fi | |
3043 .RE | |
3044 .PP | |
3045 With certain files, the above command may be faster than | |
3046 .B "xz \-6" | |
3047 while compressing significantly better. | |
3048 However, it must be emphasized that only some files benefit from | |
3049 a big dictionary while keeping the CompCPU value low. | |
3050 The most obvious situation, | |
3051 where a big dictionary can help a lot, | |
3052 is an archive containing very similar files | |
3053 of at least a few megabytes each. | |
3054 The dictionary size has to be significantly bigger | |
3055 than any individual file to allow LZMA2 to take | |
3056 full advantage of the similarities between consecutive files. | |
3057 .PP | |
3058 If very high compressor and decompressor memory usage is fine, | |
3059 and the file being compressed is | |
3060 at least several hundred megabytes, it may be useful | |
3061 to use an even bigger dictionary than the 64\ MiB that | |
3062 .B "xz \-9" | |
3063 would use: | |
3064 .RS | |
3065 .PP | |
3066 .nf | |
3067 .ft CR | |
3068 xz \-vv \-\-lzma2=dict=192MiB big_foo.tar | |
3069 .ft R | |
3070 .fi | |
3071 .RE | |
3072 .PP | |
3073 Using | |
3074 .B \-vv | |
3075 .RB ( "\-\-verbose \-\-verbose" ) | |
3076 as in the above example can be useful | |
3077 to see the memory requirements | |
3078 of the compressor and decompressor. | |
3079 Remember that using a dictionary bigger than | |
3080 the size of the uncompressed file is a waste of memory, | |
3081 so the above command isn't useful for small files. | |
3082 .PP | |
3083 Sometimes the compression time doesn't matter, | |
3084 but the decompressor memory usage has to be kept low, for example, | |
3085 to make it possible to decompress the file on an embedded system. | |
3086 The following command uses | |
3087 .B \-6e | |
3088 .RB ( "\-6 \-\-extreme" ) | |
3089 as a base and sets the dictionary to only 64\ KiB. | |
3090 The resulting file can be decompressed with XZ Embedded | |
3091 (that's why there is | |
3092 .BR \-\-check=crc32 ) | |
3093 using about 100\ KiB of memory. | |
3094 .RS | |
3095 .PP | |
3096 .nf | |
3097 .ft CR | |
3098 xz \-\-check=crc32 \-\-lzma2=preset=6e,dict=64KiB foo | |
3099 .ft R | |
3100 .fi | |
3101 .RE | |
3102 .PP | |
3103 If you want to squeeze out as many bytes as possible, | |
3104 adjusting the number of literal context bits | |
3105 .RI ( lc ) | |
3106 and number of position bits | |
3107 .RI ( pb ) | |
3108 can sometimes help. | |
3109 Adjusting the number of literal position bits | |
3110 .RI ( lp ) | |
3111 might help too, but usually | |
3112 .I lc | |
3113 and | |
3114 .I pb | |
3115 are more important. | |
3116 For example, a source code archive contains mostly US-ASCII text, | |
3117 so something like the following might give | |
3118 a slightly (about 0.1\ %) smaller file than | |
3119 .B "xz \-6e" | |
3120 (try also without | |
3121 .BR lc=4 ): | |
3122 .RS | |
3123 .PP | |
3124 .nf | |
3125 .ft CR | |
3126 xz \-\-lzma2=preset=6e,pb=0,lc=4 source_code.tar | |
3127 .ft R | |
3128 .fi | |
3129 .RE | |
3130 .PP | |
3131 Using another filter together with LZMA2 can improve | |
3132 compression with certain file types. | |
3133 For example, to compress an x86-32 or x86-64 shared library | |
3134 using the x86 BCJ filter: | |
3135 .RS | |
3136 .PP | |
3137 .nf | |
3138 .ft CR | |
3139 xz \-\-x86 \-\-lzma2 libfoo.so | |
3140 .ft R | |
3141 .fi | |
3142 .RE | |
3143 .PP | |
3144 Note that the order of the filter options is significant. | |
3145 If | |
3146 .B \-\-x86 | |
3147 is specified after | |
3148 .BR \-\-lzma2 , | |
3149 .B xz | |
3150 will give an error, | |
3151 because there cannot be any filter after LZMA2, | |
3152 and also because the x86 BCJ filter cannot be used | |
3153 as the last filter in the chain. | |
3154 .PP | |
3155 The Delta filter together with LZMA2 | |
3156 can give good results with bitmap images. | |
3157 It should usually beat PNG, | |
3158 which has a few more advanced filters than simple | |
3159 delta but uses Deflate for the actual compression. | |
3160 .PP | |
3161 The image has to be saved in uncompressed format, | |
3162 for example, as uncompressed TIFF. | |
3163 The distance parameter of the Delta filter is set | |
3164 to match the number of bytes per pixel in the image. | |
3165 For example, a 24-bit RGB bitmap needs | |
3166 .BR dist=3 , | |
3167 and it is also good to pass | |
3168 .B pb=0 | |
3169 to LZMA2 to accommodate the three-byte alignment: | |
3170 .RS | |
3171 .PP | |
3172 .nf | |
3173 .ft CR | |
3174 xz \-\-delta=dist=3 \-\-lzma2=pb=0 foo.tiff | |
3175 .ft R | |
3176 .fi | |
3177 .RE | |
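.PP
The distance follows directly from the pixel format: it is the number of bits per pixel divided by 8. The sketch below derives the Delta option for a given bit depth. The function name delta_opt is hypothetical, and the sketch deliberately leaves out the LZMA2 pb setting, since pb=0 is suggested above only for the three-byte case.

```shell
# Hypothetical helper: print the Delta filter option for an uncompressed
# bitmap with the given number of bits per pixel.
delta_opt() {
    bits=$1
    # Only whole bytes per pixel map onto a Delta distance.
    if [ "$bits" -lt 8 ] || [ $((bits % 8)) -ne 0 ]; then
        echo "bits per pixel must be a positive multiple of 8" >&2
        return 1
    fi
    echo "--delta=dist=$((bits / 8))"
}

delta_opt 24
```

For a 24-bit image the output can be combined with the LZMA2 option from the example above, as in: xz $(delta_opt 24) --lzma2=pb=0 foo.tiff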
3178 .PP | |
3179 If multiple images have been put into a single archive (for example, | |
3180 .BR .tar ), | |
3181 the Delta filter will work on that too as long as all images | |
3182 have the same number of bytes per pixel. | |
3183 . | |
3184 .SH "SEE ALSO" | |
3185 .BR xzdec (1), | |
3186 .BR xzdiff (1), | |
3187 .BR xzgrep (1), | |
3188 .BR xzless (1), | |
3189 .BR xzmore (1), | |
3190 .BR gzip (1), | |
3191 .BR bzip2 (1), | |
3192 .BR 7z (1) | |
3193 .PP | |
3194 XZ Utils: <https://tukaani.org/xz/> | |
3195 .br | |
3196 XZ Embedded: <https://tukaani.org/xz/embedded.html> | |
3197 .br | |
3198 LZMA SDK: <https://7-zip.org/sdk.html> |