diff CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/share/man/man1/mash-triangle.1 @ 68:5028fdace37b
planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author: jpayne
date: Tue, 18 Mar 2025 16:23:26 -0400
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/share/man/man1/mash-triangle.1	Tue Mar 18 16:23:26 2025 -0400
@@ -0,0 +1,169 @@
+'\" t
+.\"     Title: mash-triangle
+.\"    Author: [see the "AUTHOR(S)" section]
+.\" Generator: Asciidoctor 2.0.10
+.\"      Date: 2019-12-13
+.\"    Manual: \ \&
+.\"    Source: \ \&
+.\"  Language: English
+.\"
+.TH "MASH\-TRIANGLE" "1" "2019-12-13" "\ \&" "\ \&"
+.ie \n(.g .ds Aq \(aq
+.el       .ds Aq '
+.ss \n[.ss] 0
+.nh
+.ad l
+.de URL
+\fI\\$2\fP <\\$1>\\$3
+..
+.als MTO URL
+.if \n[.g] \{\
+.  mso www.tmac
+.  am URL
+.    ad l
+.  .
+.  am MTO
+.    ad l
+.  .
+.  LINKSTYLE blue R < >
+.\}
+.SH "NAME"
+mash\-triangle \- estimate a lower\-triangular distance matrix
+.SH "SYNOPSIS"
+.sp
+\fBmash triangle\fP [options] <seq1> [<seq2>] ...
+.SH "DESCRIPTION"
+.sp
+Estimate the distance of each input sequence to every other input
+sequence.  Outputs a lower\-triangular distance matrix in relaxed Phylip
+format. The input sequences can be fasta or fastq, gzipped or not, or
+Mash sketch files (.msh) with matching k\-mer sizes. Input files can also
+be files of file names (see \-l). If more than one input file is provided,
+whole files are compared by default (see \-i).
+.SH "OPTIONS"
+.sp
+\fB\-h\fP
+.RS 4
+Help
+.RE
+.sp
+\fB\-p\fP <int>
+.RS 4
+Parallelism. This many threads will be spawned for processing. [1]
+.RE
+.SS "Input"
+.sp
+\fB\-l\fP
+.RS 4
+List input. Each input file contains a list of sequence files, one
+per line.
+.RE
+.SS "Output"
+.sp
+\fB\-C\fP
+.RS 4
+Use comment fields for sequence names instead of IDs.
+.RE
+.sp
+\fB\-E\fP
+.RS 4
+Output edge list instead of Phylip matrix, with fields [seq1, seq2,
+dist, p\-val, shared\-hashes].
+.RE
+.sp
+\fB\-v\fP <num>
+.RS 4
+Maximum p\-value to report in edge list. Implies \-E. (0\-1) [1.0]
+.RE
+.sp
+\fB\-d\fP <num>
+.RS 4
+Maximum distance to report in edge list. Implies \-E. (0\-1) [1.0]
+.RE
+.SS "Sketching"
+.sp
+\fB\-k\fP <int>
+.RS 4
+K\-mer size. Hashes will be based on strings of this many
+nucleotides. Canonical k\-mers are used by default (see the
+"Sketching (alphabet)" options below). (1\-32) [21]
+.RE
+.sp
+\fB\-s\fP <int>
+.RS 4
+Sketch size. Each sketch will have at most this many non\-redundant
+min\-hashes. [1000]
+.RE
+.sp
+\fB\-i\fP
+.RS 4
+Sketch individual sequences, rather than whole files, e.g. for
+multi\-fastas of single\-chromosome genomes or pair\-wise gene comparisons.
+.RE
+.sp
+\fB\-w\fP <num>
+.RS 4
+Probability threshold for warning about low k\-mer size. (0\-1) [0.01]
+.RE
+.sp
+\fB\-r\fP
+.RS 4
+Input is a read set. See the "Sketching (reads)" options below. Incompatible with \fB\-i\fP.
+.RE
+.SS "Sketching (reads)"
+.sp
+\fB\-b\fP <size>
+.RS 4
+Use a Bloom filter of this size (raw bytes or with K/M/G/T) to
+filter out unique k\-mers. This is useful if exact filtering with \fB\-m\fP
+uses too much memory. However, some unique k\-mers may pass
+erroneously, and copies cannot be counted beyond 2. Implies \fB\-r\fP.
+.RE
+.sp
+\fB\-m\fP <int>
+.RS 4
+Minimum copies of each k\-mer required to pass noise filter for
+reads. Implies \fB\-r\fP. [1]
+.RE
+.sp
+\fB\-c\fP <num>
+.RS 4
+Target coverage. Sketching will conclude if this coverage (estimated
+by average k\-mer multiplicity) is reached before the end of the
+input file. Implies \fB\-r\fP.
+.RE
+.sp
+\fB\-g\fP <size>
+.RS 4
+Genome size. If specified, will be used for p\-value calculation
+instead of an estimated size from k\-mer content. Implies \fB\-r\fP.
+.RE
+.SS "Sketching (alphabet)"
+.sp
+\fB\-n\fP
+.RS 4
+Preserve strand (by default, strand is ignored by using canonical
+DNA k\-mers, which are alphabetical minima of forward\-reverse
+pairs). Implied if an alphabet is specified with \fB\-a\fP or \fB\-z\fP.
+.RE
+.sp
+\fB\-a\fP
+.RS 4
+Use amino acid alphabet (A\-Z, except BJOUXZ). Implies \fB\-n\fP, \fB\-k\fP 9.
+.RE
+.sp
+\fB\-z\fP <text>
+.RS 4
+Alphabet to base hashes on (case ignored by default; see \fB\-Z\fP).
+K\-mers with other characters will be ignored. Implies \fB\-n\fP.
+.RE
+.sp
+\fB\-Z\fP
+.RS 4
+Preserve case in k\-mers and alphabet (case is ignored by default).
+Sequence letters whose case is not in the current alphabet will be
+skipped when sketching.
+.RE
+.SH "SEE ALSO"
+.sp
+mash(1)
\ No newline at end of file
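
The man page above says that mash triangle writes a lower-triangular distance matrix in relaxed Phylip format. As a rough illustration of consuming that output, the sketch below expands such a matrix into a full symmetric one. The exact layout is an assumption, not something the man page states: a first line holding the sequence count, then one row per sequence with the name followed by tab-separated distances to all earlier sequences. The function name parse_lower_triangle and the sample data are hypothetical.

```python
from typing import List, Tuple


def parse_lower_triangle(text: str) -> Tuple[List[str], List[List[float]]]:
    """Expand a lower-triangular distance matrix into a full symmetric one.

    Assumed layout (not confirmed by the man page): line 1 is the number of
    sequences; row i holds a name plus i tab-separated distances.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0].strip())
    rows = lines[1:n + 1]
    if len(rows) != n:
        raise ValueError(f"expected {n} rows, found {len(rows)}")

    names: List[str] = []
    full = [[0.0] * n for _ in range(n)]
    for i, line in enumerate(rows):
        # Split on tabs only, since names may contain spaces.
        fields = line.split("\t")
        names.append(fields[0])
        values = [float(x) for x in fields[1:] if x]
        if len(values) != i:
            raise ValueError(f"row {i} has {len(values)} distances, expected {i}")
        for j, dist in enumerate(values):
            # Mirror each value across the diagonal.
            full[i][j] = dist
            full[j][i] = dist
    return names, full


# Made-up sample in the assumed layout (not real Mash output).
SAMPLE = "\t3\nA.fna\nB.fna\t0.02\nC.fna\t0.03\t0.01\n"
names, matrix = parse_lower_triangle(SAMPLE)
```

The diagonal is left at 0.0 because the distance of a sequence to itself is not stored in the lower triangle. If the real output uses a different separator, only the split call should need to change.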