kkonganti@1: # bettercallsal
kkonganti@1: 
`bettercallsal` is an automated workflow that assigns Salmonella serotypes using the [NCBI Pathogens Database](https://www.ncbi.nlm.nih.gov/pathogens). It uses `MASH` to reduce the search space, followed by additional genome filtering with `sourmash`. It then performs genome-based alignment with `kma`, followed by count generation using `salmon`. This workflow is especially useful when a sample contains a mixture of multiple serovars.
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: <!-- TOC -->
kkonganti@1: 
kkonganti@1: - [Minimum Requirements](#minimum-requirements)
kkonganti@1: - [Usage and Examples](#usage-and-examples)
kkonganti@1:   - [Database](#database)
kkonganti@1:   - [Input](#input)
kkonganti@1:   - [Output](#output)
kkonganti@1:   - [Computational resources](#computational-resources)
kkonganti@1:   - [Runtime profiles](#runtime-profiles)
kkonganti@1:   - [your_institution.config](#your_institutionconfig)
kkonganti@1:   - [Cloud computing](#cloud-computing)
kkonganti@1:   - [Example data](#example-data)
kkonganti@1: - [Using sourmash](#using-sourmash)
kkonganti@1: - [bettercallsal CLI Help](#bettercallsal-cli-help)
kkonganti@1: 
kkonganti@1: <!-- /TOC -->
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ## Minimum Requirements
kkonganti@1: 
1. [Nextflow version 22.10.0](https://github.com/nextflow-io/nextflow/releases/download/v22.10.0/nextflow).
    - Make the `nextflow` binary executable (`chmod 755 nextflow`) and make sure that it is available in your `$PATH`.
    - If your existing `JAVA` install does not support the newest **Nextflow** version, you can try **Amazon**'s `JAVA` (OpenJDK): [Corretto](https://corretto.aws/downloads/latest/amazon-corretto-17-x64-linux-jdk.tar.gz).
2. One of `micromamba`, `docker` or `singularity` installed and available in your `$PATH`.
    - Running the workflow via `micromamba` software provisioning is **preferred**, as it requires neither `sudo` or `admin` privileges nor any additional configuration for the various container providers.
    - To install `micromamba` for your system type, please follow these [installation steps](https://mamba.readthedocs.io/en/latest/installation.html#manual-installation) and make sure that the `micromamba` binary is available in your `$PATH`.
    - Just the `curl` step is sufficient to download the binary as far as running the workflow is concerned.
3. A minimum of 10 CPU cores and about 16 GB of memory for the main workflow steps. More memory may be required if your **FASTQ** files are large.
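
The requirements above can be checked with a short shell sketch before the first run. The helper names `first_available` and `check_requirements` are hypothetical and not part of `cpipes`. The CPU count relies on `getconf _NPROCESSORS_ONLN`, which is assumed to be available on your system.

```bash
#!/usr/bin/env bash

# Print the first command from the argument list that is found on $PATH.
# Returns non-zero if none of them is available.
first_available() {
    local cmd
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Print a message for each requirement that does not appear to be met.
check_requirements() {
    local cores
    first_available nextflow >/dev/null \
        || echo "nextflow not found in PATH"
    first_available micromamba docker singularity >/dev/null \
        || echo "none of micromamba, docker or singularity found in PATH"
    # Number of online CPU cores; falls back to 0 if getconf is unavailable.
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 0)
    [ "$cores" -ge 10 ] || echo "only $cores CPU cores detected (10 recommended)"
}
```

Calling `check_requirements` prints nothing when all checks pass. Memory is not checked here because the way to query it differs between operating systems.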
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ## Usage and Examples
kkonganti@1: 
kkonganti@1: Clone or download this repository and then call `cpipes`.
kkonganti@1: 
kkonganti@1: ```bash
kkonganti@1: cpipes --pipeline bettercallsal [options]
kkonganti@1: ```
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: **Example**: Run the default `bettercallsal` pipeline in single-end mode.
kkonganti@1: 
```bash
cd /data/scratch/$USER
mkdir nf-cpipes
cd nf-cpipes
cpipes \
      --pipeline bettercallsal \
      --input /path/to/illumina/fastq/dir \
      --output /path/to/output \
      --bcs_root_dbdir /data/Kranti_Konganti/bettercallsal_db
```
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
**Example**: Run the `bettercallsal` pipeline in paired-end mode. In this mode, the `R1` and `R2` files are concatenated. We have found that concatenated reads yield better calling rates. Please refer to the **Methods** and **Results** sections of our [preprint](https://www.biorxiv.org/content/10.1101/2023.04.06.535929v1.full) for more information. Users can still choose to use `bbmerge.sh` by adding the following options on the command line: `--bbmerge_run true --bcs_concat_pe false`.
kkonganti@1: 
kkonganti@1: ```bash
kkonganti@1: cd /data/scratch/$USER
kkonganti@1: mkdir nf-cpipes
kkonganti@1: cd nf-cpipes
kkonganti@1: cpipes \
kkonganti@1:       --pipeline bettercallsal \
kkonganti@1:       --input /path/to/illumina/fastq/dir \
kkonganti@1:       --output /path/to/output \
kkonganti@1:       --bcs_root_dbdir /data/Kranti_Konganti/bettercallsal_db \
kkonganti@1:       --fq_single_end false \
kkonganti@1:       --fq_suffix '_R1_001.fastq.gz'
kkonganti@1: ```
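
As an illustration of the `bbmerge.sh` alternative mentioned above, the sketch below assembles the same paired-end command with `--bbmerge_run true --bcs_concat_pe false` added. Building the command as a bash array is only a convenience for inspecting it before launch. The array name `cmd_bbmerge` and the database path are placeholders.

```bash
# Assemble the paired-end command with BBMerge enabled and concatenation disabled.
cmd_bbmerge=(
    cpipes
    --pipeline bettercallsal
    --input /path/to/illumina/fastq/dir
    --output /path/to/output
    --bcs_root_dbdir /path/to/bettercallsal_db
    --fq_single_end false
    --fq_suffix '_R1_001.fastq.gz'
    --bbmerge_run true
    --bcs_concat_pe false
)

# Print the command for review; run it with: "${cmd_bbmerge[@]}"
printf '%s ' "${cmd_bbmerge[@]}"
echo
```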
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ### Database
kkonganti@1: 
kkonganti@1: ---
kkonganti@1: 
A successful run requires database flat files that are specific to this workflow.

Please refer to the `bettercallsal_db` [README](./bettercallsal_db.md) if you would like to run the workflow on the latest version of the **PDG** release.
kkonganti@1: 
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ### Input
kkonganti@1: 
kkonganti@1: ---
kkonganti@1: 
The input to the workflow is a folder containing compressed (`.gz`) FASTQ files. Please note that samples are grouped automatically by FASTQ file name. If, for example, a single sample is sequenced across multiple sequencing lanes, you can group those FASTQ files into one sample by using the `--fq_filename_delim` and `--fq_filename_delim_idx` options. By default, `--fq_filename_delim` is set to `_` (underscore) and `--fq_filename_delim_idx` is set to 1.
kkonganti@1: 
kkonganti@1: For example, if the directory contains FASTQ files as shown below:
kkonganti@1: 
kkonganti@1: - KB-01_apple_L001_R1.fastq.gz
kkonganti@1: - KB-01_apple_L001_R2.fastq.gz
kkonganti@1: - KB-01_apple_L002_R1.fastq.gz
kkonganti@1: - KB-01_apple_L002_R2.fastq.gz
kkonganti@1: - KB-02_mango_L001_R1.fastq.gz
kkonganti@1: - KB-02_mango_L001_R2.fastq.gz
kkonganti@1: - KB-02_mango_L002_R1.fastq.gz
kkonganti@1: - KB-02_mango_L002_R2.fastq.gz
kkonganti@1: 
Then, to create 2 sample groups, `apple` and `mango`, we split the file name by the delimiter (underscore in this case, which is the default) and group by the first 2 words (`--fq_filename_delim_idx 2`).

It goes without saying that all the FASTQ files should follow a uniform naming pattern so that the `--fq_filename_delim` and `--fq_filename_delim_idx` options do not adversely affect how files are collected and how the sample metadata sheet is created.
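
The split-and-join rule described above can be illustrated with a small shell sketch. This only mimics the rule with `cut`. The helper `sample_key` is hypothetical, and the exact sample names that the workflow writes to its metadata sheet may differ.

```bash
# Hypothetical helper: split a FASTQ file name on a delimiter and keep the
# first N fields as the grouping key (defaults: delimiter "_", N = 1).
sample_key() {
    local filename=$1 delim=${2:-_} idx=${3:-1}
    printf '%s\n' "$filename" | cut -d "$delim" -f "1-$idx"
}

# Files from two lanes collapse into one key per sample when N = 2.
for fq in KB-01_apple_L001_R1.fastq.gz KB-01_apple_L002_R1.fastq.gz \
          KB-02_mango_L001_R1.fastq.gz KB-02_mango_L002_R1.fastq.gz; do
    sample_key "$fq" _ 2
done | sort -u
# prints:
# KB-01_apple
# KB-02_mango
```

Running the helper over your own file names before launching the workflow is a quick way to confirm that the naming pattern is uniform.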
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ### Output
kkonganti@1: 
kkonganti@1: ---
kkonganti@1: 
All the outputs for each step are stored inside the folder specified with the `--output` option. The `multiqc_report.html` file inside the `bettercallsal-multiqc` folder contains a brief consolidated report and can be opened in any browser on your local workstation.
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ### Computational resources
kkonganti@1: 
kkonganti@1: ---
kkonganti@1: 
The `bettercallsal` workflow requires at least 16 GB of memory to finish successfully. By default, `bettercallsal` uses 10 CPU cores where possible. You can adjust the number of CPU cores with the `--max_cpus` option.
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: Example:
kkonganti@1: 
```bash
cpipes \
    --pipeline bettercallsal \
    --input /path/to/bettercallsal_sim_reads \
    --output /path/to/bettercallsal_sim_reads_output \
    --bcs_root_dbdir /path/to/PDG000000002.2537 \
    --kmaalign_ignorequals \
    --max_cpus 5 \
    -profile stdkondagac \
    -resume
```
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ### Runtime profiles
kkonganti@1: 
kkonganti@1: ---
kkonganti@1: 
You can use different runtime profiles to suit your specific compute environment. For example, you can run the workflow locally on your machine or on a grid computing infrastructure.
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: Example:
kkonganti@1: 
kkonganti@1: ```bash
kkonganti@1: cd /data/scratch/$USER
kkonganti@1: mkdir nf-cpipes
kkonganti@1: cd nf-cpipes
kkonganti@1: cpipes \
kkonganti@1:     --pipeline bettercallsal \
kkonganti@1:     --input /path/to/fastq_pass_dir \
kkonganti@1:     --output /path/to/where/output/should/go \
kkonganti@1:     -profile your_institution
kkonganti@1: ```
kkonganti@1: 
The above command runs the pipeline and stores the output at the location given by the `--output` flag. The **NEXTFLOW** reports are always stored in the current working directory from which `cpipes` is run. For example, for the above command, a directory called `CPIPES-bettercallsal` would hold all the **NEXTFLOW** related logs, reports and trace files.
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ### `your_institution.config`
kkonganti@1: 
kkonganti@1: ---
kkonganti@1: 
The above example specifies the runtime profile `your_institution`. For this to work, add the following lines at the end of the [`computeinfra.config`](../conf/computeinfra.config) file, which should be located inside the `conf` folder. For example, if your institution uses **SGE** or **UNIVA** for grid computing instead of **SLURM** and has a job queue named `normal.q`, then add these lines:
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ```groovy
kkonganti@1: your_institution {
kkonganti@1:     process.executor = 'sge'
kkonganti@1:     process.queue = 'normal.q'
kkonganti@1:     singularity.enabled = false
kkonganti@1:     singularity.autoMounts = true
kkonganti@1:     docker.enabled = false
kkonganti@1:     params.enable_conda = true
kkonganti@1:     conda.enabled = true
kkonganti@1:     conda.useMicromamba = true
kkonganti@1:     params.enable_module = false
kkonganti@1: }
kkonganti@1: ```
kkonganti@1: 
In the above example, all the software provisioning choices are disabled except `conda`. You can also remove the `process.queue` line altogether, and the `bettercallsal` workflow will request the appropriate memory and number of CPU cores automatically. These requests range from 1 CPU core, 1 GB and 1 hour for job completion up to 10 CPU cores, 1 TB and 120 hours for job completion.
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ### Cloud computing
kkonganti@1: 
kkonganti@1: ---
kkonganti@1: 
You can run the workflow in the cloud (this works only if the AWS resources are set up properly). Add new runtime profiles with the required parameters per the [Nextflow docs](https://www.nextflow.io/docs/latest/executor.html):
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: Example:
kkonganti@1: 
```groovy
my_aws_batch {
    // Executor and queue are set in the process scope, as in the
    // your_institution profile.
    process.executor = 'awsbatch'
    process.queue = 'my-batch-queue'
    aws.batch.cliPath = '/home/ec2-user/miniconda/bin/aws'
    aws.batch.region = 'us-east-1'
    singularity.enabled = false
    singularity.autoMounts = true
    docker.enabled = true
    params.enable_conda = false
    params.enable_module = false
}
```
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ### Example data
kkonganti@1: 
kkonganti@1: ---
kkonganti@1: 
After you make sure that you have all the [minimum requirements](#minimum-requirements) to run the workflow, you can try the `bettercallsal` pipeline on some simulated reads. The following input dataset contains simulated reads for `Montevideo` and `I 4,[5],12:i:-` in roughly equal proportions.
kkonganti@1: 
- Download simulated reads: [S3](https://cfsan-pub-xfer.s3.amazonaws.com/Kranti.Konganti/bettercallsal/bettercallsal_sim_reads.tar.bz2) (~ 3 GB).
- Download pre-formatted test database: [S3](https://cfsan-pub-xfer.s3.amazonaws.com/Kranti.Konganti/bettercallsal/PDG000000002.2491.test-db.tar.bz2) (~ 75 MB). This test database works only with the simulated reads.
- Download pre-formatted full database (**Optional**): If you would like to do a complete run with your own **FASTQ** datasets, you can either create your own [database](./bettercallsal_db.md) or use the [PDG000000002.2537](https://cfsan-pub-xfer.s3.amazonaws.com/Kranti.Konganti/bettercallsal/PDG000000002.2537.tar.bz2) version of the database (~ 37 GB).
- After a successful run of the workflow, your **MultiQC** report should look something like [this](https://cfsan-pub-xfer.s3.amazonaws.com/Kranti.Konganti/bettercallsal/bettercallsal_sim_reads_mqc.html).
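
The downloads listed above can be scripted. The following is a minimal sketch that assumes `curl` and a `tar` with bzip2 support are installed. The helper `fetch_and_extract` and the `DRY_RUN` switch are hypothetical conveniences, not part of the workflow.

```bash
# Base location of the archives linked above.
BASE_URL='https://cfsan-pub-xfer.s3.amazonaws.com/Kranti.Konganti/bettercallsal'

# Download one archive and unpack it into the current directory.
# With DRY_RUN=1 the commands are printed instead of executed.
fetch_and_extract() {
    local archive=$1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "curl -fL -O $BASE_URL/$archive"
        echo "tar -xjf $archive"
        return 0
    fi
    # -f fails on HTTP errors, -L follows redirects, -O keeps the remote name.
    curl -fL -O "$BASE_URL/$archive" && tar -xjf "$archive"
}

# Example calls (not run automatically; the reads archive is about 3 GB):
#   fetch_and_extract bettercallsal_sim_reads.tar.bz2
#   fetch_and_extract PDG000000002.2491.test-db.tar.bz2
```

The names of the directories created by extraction depend on the archive contents, so check them before setting `--input` and `--bcs_root_dbdir`.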
kkonganti@1: 
Now run the workflow, ignoring quality values since the base qualities are simulated:
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
```bash
cpipes \
    --pipeline bettercallsal \
    --input /path/to/bettercallsal_sim_reads \
    --output /path/to/bettercallsal_sim_reads_output \
    --bcs_root_dbdir /path/to/PDG000000002.2537 \
    --kmaalign_ignorequals \
    -profile stdkondagac \
    -resume
```
kkonganti@1: 
kkonganti@1: Please note that the run time profile `stdkondagac` will run jobs locally using `micromamba` for software provisioning. The first time you run the command, a new folder called `kondagac_cache` will be created and subsequent runs should use this `conda` cache.
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ## Using `sourmash`
kkonganti@1: 
Beginning with `v0.3.0` of the `bettercallsal` workflow, `sourmash` sketching is used to further narrow down possible serotype hits. It is **ON** by default. This enables the generation of an **ANI Containment** matrix for **Samples** vs **Genomes**. There may be multiple hits for the same serotype in the final **MultiQC** report, as multiple genome accessions can belong to a single serotype.

You can turn this feature **OFF** with the `--sourmashsketch_run false` option.
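
For example, a single-end run with `sourmash` sketching disabled might be assembled as in the sketch below. The array name `cmd_nosourmash` and the paths are placeholders. Only options documented in this README are used.

```bash
# Assemble the command as an array so it can be reviewed before launching.
cmd_nosourmash=(
    cpipes
    --pipeline bettercallsal
    --input /path/to/illumina/fastq/dir
    --output /path/to/output
    --bcs_root_dbdir /path/to/bettercallsal_db
    --sourmashsketch_run false
)

# Print the command for review; run it with: "${cmd_nosourmash[@]}"
printf '%s ' "${cmd_nosourmash[@]}"
echo
```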
kkonganti@1: 
kkonganti@1: \
kkonganti@1: &nbsp;
kkonganti@1: 
kkonganti@1: ## `bettercallsal` CLI Help
kkonganti@1: 
kkonganti@1: ```text
kkonganti@1: [Kranti_Konganti@my-unix-box ]$ cpipes --pipeline bettercallsal --help
kkonganti@1: N E X T F L O W  ~  version 22.10.0
kkonganti@1: Launching `./bettercallsal/cpipes` [awesome_chandrasekhar] DSL2 - revision: 8da4e11078
kkonganti@1: ================================================================================
kkonganti@1:              (o)
kkonganti@1:   ___  _ __   _  _ __    ___  ___
kkonganti@1:  / __|| '_ \ | || '_ \  / _ \/ __|
kkonganti@1: | (__ | |_) || || |_) ||  __/\__ \
kkonganti@1:  \___|| .__/ |_|| .__/  \___||___/
kkonganti@1:       | |       | |
kkonganti@1:       |_|       |_|
kkonganti@1: --------------------------------------------------------------------------------
kkonganti@1: A collection of modular pipelines at CFSAN, FDA.
kkonganti@1: --------------------------------------------------------------------------------
kkonganti@1: Name                            : CPIPES
kkonganti@1: Author                          : Kranti Konganti
kkonganti@1: Version                         : 0.5.0
kkonganti@1: Center                          : CFSAN, FDA.
kkonganti@1: ================================================================================
kkonganti@1: 
kkonganti@1: Workflow                        : bettercallsal
kkonganti@1: 
kkonganti@1: Author                          : Kranti Konganti
kkonganti@1: 
kkonganti@1: Version                         : 0.5.0
kkonganti@1: 
kkonganti@1: 
kkonganti@1: Usage                           : cpipes --pipeline bettercallsal [options]
kkonganti@1: 
kkonganti@1: 
kkonganti@1: Required                        :
kkonganti@1: 
kkonganti@1: --input                         : Absolute path to directory containing FASTQ
kkonganti@1:                                   files. The directory should contain only
kkonganti@1:                                   FASTQ files as all the files within the
kkonganti@1:                                   mentioned directory will be read. Ex: --
kkonganti@1:                                   input /path/to/fastq_pass
kkonganti@1: 
kkonganti@1: --output                        : Absolute path to directory where all the
kkonganti@1:                                   pipeline outputs should be stored. Ex: --
kkonganti@1:                                   output /path/to/output
kkonganti@1: 
kkonganti@1: Other options                   :
kkonganti@1: 
kkonganti@1: --metadata                      : Absolute path to metadata CSV file
kkonganti@1:                                   containing five mandatory columns: sample,
kkonganti@1:                                   fq1,fq2,strandedness,single_end. The fq1
kkonganti@1:                                   and fq2 columns contain absolute paths to
kkonganti@1:                                   the FASTQ files. This option can be used in
kkonganti@1:                                   place of --input option. This is rare. Ex
kkonganti@1:                                   : --metadata samplesheet.csv
kkonganti@1: 
kkonganti@1: --fq_suffix                     : The suffix of FASTQ files (Unpaired reads
kkonganti@1:                                   or R1 reads or Long reads) if an input
kkonganti@1:                                   directory is mentioned via --input option.
kkonganti@1:                                   Default: .fastq.gz
kkonganti@1: 
kkonganti@1: --fq2_suffix                    : The suffix of FASTQ files (Paired-end reads
kkonganti@1:                                   or R2 reads) if an input directory is
kkonganti@1:                                   mentioned via --input option. Default:
kkonganti@1:                                   _R2_001.fastq.gz
kkonganti@1: 
kkonganti@1: --fq_filter_by_len              : Remove FASTQ reads that are less than this
kkonganti@1:                                   many bases. Default: 0
kkonganti@1: 
kkonganti@1: --fq_strandedness               : The strandedness of the sequencing run.
kkonganti@1:                                   This is mostly needed if your sequencing
kkonganti@1:                                   run is RNA-SEQ. For most of the other runs
kkonganti@1:                                   , it is probably safe to use unstranded for
kkonganti@1:                                   the option. Default: unstranded
kkonganti@1: 
kkonganti@1: --fq_single_end                 : SINGLE-END information will be auto-
kkonganti@1:                                   detected but this option forces PAIRED-END
kkonganti@1:                                   FASTQ files to be treated as SINGLE-END so
kkonganti@1:                                   only read 1 information is included in auto
kkonganti@1:                                   -generated samplesheet. Default: true
kkonganti@1: 
kkonganti@1: --fq_filename_delim             : Delimiter by which the file name is split
kkonganti@1:                                   to obtain sample name. Default: _
kkonganti@1: 
kkonganti@1: --fq_filename_delim_idx         : After splitting FASTQ file name by using
kkonganti@1:                                   the --fq_filename_delim option, all
kkonganti@1:                                   elements before this index (1-based) will
kkonganti@1:                                   be joined to create final sample name.
kkonganti@1:                                   Default: 1
kkonganti@1: 
kkonganti@1: --bcs_concat_pe                 : Concatenate paired-end files. Default: true
kkonganti@1: 
kkonganti@1: --bbmerge_run                   : Run BBMerge tool. Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_reads                 : Quit after this many read pairs (-1 means
kkonganti@1:                                   all) Default: -1
kkonganti@1: 
kkonganti@1: --bbmerge_adapters              : Absolute UNIX path pointing to the adapters
kkonganti@1:                                   file in FASTA format. Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_ziplevel              : Set to 1 (lowest) through 9 (max) to change
kkonganti@1:                                   compression level; lower compression is
kkonganti@1:                                   faster. Default: 1
kkonganti@1: 
kkonganti@1: --bbmerge_ordered               : Output reads in the same order as input.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_qtrim                 : Trim read ends to remove bases with quality
kkonganti@1:                                   below --bbmerge_minq. Trims BEFORE merging
kkonganti@1:                                   . Values: t (trim both ends), f (neither
kkonganti@1:                                   end), r (right end only), l (left end only
kkonganti@1:                                   ). Default: true
kkonganti@1: 
kkonganti@1: --bbmerge_qtrim2                : May be specified instead of --bbmerge_qtrim
kkonganti@1:                                   to perform trimming only if merging is
kkonganti@1:                                   unsuccesful. then retry merging. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --bbmerge_trimq                 : Trim quality threshold. This may be comma-
kkonganti@1:                                   delimited list (ascending) to try multiple
kkonganti@1:                                   values. Default: 10
kkonganti@1: 
kkonganti@1: --bbmerge_minlength             : (ml) Reads shorter than this after trimming
kkonganti@1:                                   , but before merging, will be discarded.
kkonganti@1:                                   Pairs will be discarded onlyif both are
kkonganti@1:                                   shorter. Default: 1
kkonganti@1: 
kkonganti@1: --bbmerge_tbo                   : (trimbyoverlap). Trim overlapping reads to
kkonganti@1:                                   remove right most (3') non-overlaping
kkonganti@1:                                   portion instead of joining Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_minavgquality         : (maq). Reads with average quality below
kkonganti@1:                                   this after trimming will not be attempted
kkonganti@1:                                   to merge. Default: 30
kkonganti@1: 
kkonganti@1: --bbmerge_trimpolya             : Trim trailing poly-A tail from adapter
kkonganti@1:                                   output. Only affects outadapter.  This also
kkonganti@1:                                   trims poly-A followed by poly-G, which
kkonganti@1:                                   occurs on NextSeq. Default: true
kkonganti@1: 
kkonganti@1: --bbmerge_pfilter               : Ban improbable overlaps. Higher is more
kkonganti@1:                                   strict. 0 will disable the filter; 1 will
kkonganti@1:                                   allow only perfect overlaps. Default: 1
kkonganti@1: 
kkonganti@1: --bbmerge_ouq                   : Calculate best overlap using quality values
kkonganti@1:                                   . Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_owq                   : Calculate best overlap without using
kkonganti@1:                                   quality values. Default: true
kkonganti@1: 
kkonganti@1: --bbmerge_strict                : Decrease false positive rate and merging
kkonganti@1:                                   rate. Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_verystrict            : Greatly decrease false positive rate and
kkonganti@1:                                   merging rate. Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_ultrastrict           : Decrease false positive rate and merging
kkonganti@1:                                   rate even more. Default: true
kkonganti@1: 
kkonganti@1: --bbmerge_maxstrict             : Maxiamally decrease false positive rate and
kkonganti@1:                                   merging rate. Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_loose                 : Increase false positive rate and merging
kkonganti@1:                                   rate. Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_veryloose             : Greatly increase false positive rate and
kkonganti@1:                                   merging rate. Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_ultraloose            : Increase false positive rate and merging
kkonganti@1:                                   rate even more. Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_maxloose              : Maximally increase false positive rate and
kkonganti@1:                                   merging rate. Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_fast                  : Fastest possible preset. Default: false
kkonganti@1: 
kkonganti@1: --bbmerge_k                     : Kmer length.  31 (or less) is fastest and
kkonganti@1:                                   uses the least memory, but higher values
kkonganti@1:                                   may be more accurate. 60 tends to work well
kkonganti@1:                                   for 150bp reads. Default: 60
kkonganti@1: 
kkonganti@1: --bbmerge_prealloc              : Pre-allocate memory rather than dynamically
kkonganti@1:                                   growing. Faster and more memory-efficient
kkonganti@1:                                   for large datasets. A float fraction (0-1)
kkonganti@1:                                   may be specified, default 1. Default: true
kkonganti@1: 
kkonganti@1: --fastp_run                     : Run fastp tool. Default: true
kkonganti@1: 
kkonganti@1: --fastp_failed_out              : Specify whether to store reads that cannot
kkonganti@1:                                   pass the filters. Default: false
kkonganti@1: 
kkonganti@1: --fastp_merged_out              : Specify whether to store merged output or
kkonganti@1:                                   not. Default: false
kkonganti@1: 
kkonganti@1: --fastp_overlapped_out          : For each read pair, output the overlapped
kkonganti@1:                                   region if it has no mismatched base.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --fastp_6                       : Indicate that the input is using phred64
kkonganti@1:                                   scoring (it'll be converted to phred33, so
kkonganti@1:                                   the output will still be phred33). Default
kkonganti@1:                                   : false
kkonganti@1: 
kkonganti@1: --fastp_reads_to_process        : Specify how many reads/pairs are to be
kkonganti@1:                                   processed. Default value 0 means process
kkonganti@1:                                   all reads. Default: 0
kkonganti@1: 
kkonganti@1: --fastp_fix_mgi_id              : The MGI FASTQ ID format is not compatible
kkonganti@1:                                   with many BAM operation tools, enable this
kkonganti@1:                                   option to fix it. Default: false
kkonganti@1: 
kkonganti@1: --fastp_A                       : Disable adapter trimming. On by default.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --fastp_adapter_fasta           : Specify a FASTA file to trim both read1 and
kkonganti@1:                                   read2 (if PE) by all the sequences in this
kkonganti@1:                                   FASTA file. Default: false
kkonganti@1: 
kkonganti@1: --fastp_f                       : Trim how many bases in front of read1.
kkonganti@1:                                   Default: 0
kkonganti@1: 
kkonganti@1: --fastp_t                       : Trim how many bases at the end of read1.
kkonganti@1:                                   Default: 0
kkonganti@1: 
kkonganti@1: --fastp_b                       : Max length of read1 after trimming. Default
kkonganti@1:                                   : 0
kkonganti@1: 
kkonganti@1: --fastp_F                       : Trim how many bases in front of read2.
kkonganti@1:                                   Default: 0
kkonganti@1: 
kkonganti@1: --fastp_T                       : Trim how many bases at the end of read2.
kkonganti@1:                                   Default: 0
kkonganti@1: 
kkonganti@1: --fastp_B                       : Max length of read2 after trimming. Default
kkonganti@1:                                   : 0
kkonganti@1: 
kkonganti@1: --fastp_dedup                   : Enable deduplication to drop the duplicated
kkonganti@1:                                   reads/pairs. Default: true
kkonganti@1: 
kkonganti@1: --fastp_dup_calc_accuracy       : Accuracy level to calculate duplication (1~
kkonganti@1:                                   6), higher level uses more memory (1G, 2G,
kkonganti@1:                                   4G, 8G, 16G, 24G). Default 1 for no-dedup
kkonganti@1:                                   mode, and 3 for dedup mode. Default: 6
kkonganti@1: 
kkonganti@1: --fastp_poly_g_min_len          : The minimum length to detect polyG in the
kkonganti@1:                                   read tail. Default: 10
kkonganti@1: 
kkonganti@1: --fastp_G                       : Disable polyG tail trimming. Default: true
kkonganti@1: 
kkonganti@1: --fastp_x                       : Enable polyX trimming in 3' ends. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --fastp_poly_x_min_len          : The minimum length to detect polyX in the
kkonganti@1:                                   read tail. Default: 10
kkonganti@1: 
kkonganti@1: --fastp_cut_front               : Move a sliding window from front (5') to
kkonganti@1:                                   tail, drop the bases in the window if its
kkonganti@1:                                   mean quality < threshold, stop otherwise.
kkonganti@1:                                   Default: true
kkonganti@1: 
kkonganti@1: --fastp_cut_tail                : Move a sliding window from tail (3') to
kkonganti@1:                                   front, drop the bases in the window if its
kkonganti@1:                                   mean quality < threshold, stop otherwise.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --fastp_cut_right               : Move a sliding window from tail, drop the
kkonganti@1:                                   bases in the window and the right part if
kkonganti@1:                                   its mean quality < threshold, and then stop
kkonganti@1:                                   . Default: true
kkonganti@1: 
kkonganti@1: --fastp_W                       : Sliding window size shared by --
kkonganti@1:                                   fastp_cut_front, --fastp_cut_tail and --
kkonganti@1:                                   fastp_cut_right. Default: 20
kkonganti@1: 
kkonganti@1: --fastp_M                       : The mean quality requirement shared by --
kkonganti@1:                                   fastp_cut_front, --fastp_cut_tail and --
kkonganti@1:                                   fastp_cut_right. Default: 30
kkonganti@1: 
kkonganti@1: --fastp_q                       : The quality value below which a base should
kkonganti@1:                                   is not qualified. Default: 30
kkonganti@1: 
kkonganti@1: --fastp_u                       : What percent of bases are allowed to be
kkonganti@1:                                   unqualified. Default: 40
kkonganti@1: 
kkonganti@1: --fastp_n                       : How many N's can a read have. Default: 5
kkonganti@1: 
kkonganti@1: --fastp_e                       : If a read's average quality is below this
kkonganti@1:                                   value, then it is discarded. Default: 0
kkonganti@1: 
kkonganti@1: --fastp_l                       : Reads shorter than this length will be
kkonganti@1:                                   discarded. Default: 35
kkonganti@1: 
kkonganti@1: --fastp_max_len                 : Reads longer than this length will be
kkonganti@1:                                   discarded. Default: 0
kkonganti@1: 
kkonganti@1: --fastp_y                       : Enable low complexity filter. The
kkonganti@1:                                   complexity is defined as the percentage of
kkonganti@1:                                   bases that differ from the next base
kkonganti@1:                                   (base[i] != base[i+1]). Default: true
kkonganti@1: 
kkonganti@1: --fastp_Y                       : The threshold for low complexity filter
kkonganti@1:                                   (0~100). Ex: A value of 30 means 30%
kkonganti@1:                                   complexity is required. Default: 30
kkonganti@1: 
kkonganti@1: --fastp_U                       : Enable Unique Molecular Identifier (UMI)
kkonganti@1:                                   pre-processing. Default: false
kkonganti@1: 
kkonganti@1: --fastp_umi_loc                 : Specify the location of UMI, can be one of
kkonganti@1:                                   index1/index2/read1/read2/per_index/
kkonganti@1:                                   per_read. Default: false
kkonganti@1: 
kkonganti@1: --fastp_umi_len                 : If the UMI is in read1 or read2, its length
kkonganti@1:                                   should be provided. Default: false
kkonganti@1: 
kkonganti@1: --fastp_umi_prefix              : If specified, an underline will be used to
kkonganti@1:                                   connect prefix and UMI (i.e. prefix=UMI,
kkonganti@1:                                   UMI=AATTCG, final=UMI_AATTCG). Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --fastp_umi_skip                : If the UMI is in read1 or read2, fastp can
kkonganti@1:                                   skip several bases following the UMI.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --fastp_p                       : Enable overrepresented sequence analysis.
kkonganti@1:                                   Default: true
kkonganti@1: 
kkonganti@1: --fastp_P                       : One in this many reads will be used for
kkonganti@1:                                   overrepresentation analysis (1~10000).
kkonganti@1:                                   Smaller is slower. Default: 20
kkonganti@1: 
kkonganti@1: --fastp_use_custom_adapaters    : Use custom adapter FASTA with fastp on top
kkonganti@1:                                   of built-in adapter sequence
kkonganti@1:                                   auto-detection. Enabling this option will
kkonganti@1:                                   attempt to find and remove all possible
kkonganti@1:                                   Illumina adapter and primer sequences but
kkonganti@1:                                   will make the workflow run slower.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --mashscreen_run                : Run `mash screen` tool. Default: true
kkonganti@1: 
kkonganti@1: --mashscreen_w                  : Winner-takes-all strategy for identity
kkonganti@1:                                   estimates. After counting hashes for each
kkonganti@1:                                   query, hashes that appear in multiple
kkonganti@1:                                   queries will be removed from all except the
kkonganti@1:                                   one with the best identity (ties broken by
kkonganti@1:                                   larger query), and other identities will
kkonganti@1:                                   be reduced. This removes output redundancy,
kkonganti@1:                                   providing a rough compositional outline.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --mashscreen_i                  : Minimum identity to report. Inclusive
kkonganti@1:                                   unless set to zero, in which case only
kkonganti@1:                                   identities greater than zero (i.e. with at
kkonganti@1:                                   least one shared hash) will be reported.
kkonganti@1:                                   Set to -1 to output everything. (-1-1).
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --mashscreen_v                  : Maximum p-value to report (0-1). Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --tuspy_run                     : Run the get_top_unique_mash_hits_genomes.py
kkonganti@1:                                   script. Default: true
kkonganti@1: 
kkonganti@1: --tuspy_s                       : Absolute UNIX path to metadata text file
kkonganti@1:                                   with the field separator, | and 5 fields:
kkonganti@1:                                   serotype|asm_lvl|asm_url|snp_cluster_id.
kkonganti@1:                                   Ex: serotype=Derby,antigen_formula=4:f,g:-|
kkonganti@1:                                   Scaffold|402440|ftp://...|PDS000096654.2.
kkonganti@1:                                   Mentioning this option will create a pickle
kkonganti@1:                                   file for the provided metadata and exit.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --tuspy_m                       : Absolute UNIX path to mash screen results
kkonganti@1:                                   file. Default: false
kkonganti@1: 
kkonganti@1: --tuspy_ps                      : Absolute UNIX Path to serialized metadata
kkonganti@1:                                   object in a pickle file. Default: /hpc/db/
kkonganti@1:                                   bettercallsal/latest/index_metadata/
kkonganti@1:                                   per_snp_cluster.ACC2SERO.pickle
kkonganti@1: 
kkonganti@1: --tuspy_gd                      : Absolute UNIX Path to directory containing
kkonganti@1:                                   gzipped genome FASTA files. Default: /hpc/
kkonganti@1:                                   db/bettercallsal/latest/scaffold_genomes
kkonganti@1: 
kkonganti@1: --tuspy_gds                     : Genome FASTA file suffix to search for in
kkonganti@1:                                   the genome directory. Default:
kkonganti@1:                                   _scaffolded_genomic.fna.gz
kkonganti@1: 
kkonganti@1: --tuspy_n                       : Return up to this many top unique genome
kkonganti@1:                                   accession hits. Default: 10
kkonganti@1: 
kkonganti@1: --sourmashsketch_run            : Run `sourmash sketch dna` tool. Default:
kkonganti@1:                                   true
kkonganti@1: 
kkonganti@1: --sourmashsketch_mode           : Select which type of signatures to be
kkonganti@1:                                   created: dna, protein, fromfile or
kkonganti@1:                                   translate. Default: dna
kkonganti@1: 
kkonganti@1: --sourmashsketch_p              : Signature parameters to use. Default:
kkonganti@1:                                   abund,scaled=1000,k=51,k=61,k=71
kkonganti@1: 
kkonganti@1: --sourmashsketch_file           : <path>  A text file containing a list of
kkonganti@1:                                   sequence files to load. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsketch_f              : Recompute signatures even if the file
kkonganti@1:                                   exists. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsketch_merge          : Merge all input files into one signature
kkonganti@1:                                   file with the specified name. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --sourmashsketch_singleton      : Compute a signature for each sequence
kkonganti@1:                                   record individually. Default: true
kkonganti@1: 
kkonganti@1: --sourmashsketch_name           : Name the signature generated from each file
kkonganti@1:                                   after the first record in the file.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --sourmashsketch_randomize      : Shuffle the list of input files randomly.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_run            : Run `sourmash gather` tool. Default: true
kkonganti@1: 
kkonganti@1: --sourmashgather_n              : Number of results to report. By default,
kkonganti@1:                                   will terminate at --sourmashgather_thr_bp
kkonganti@1:                                   value. Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_thr_bp         : Reporting threshold (in bp) for estimated
kkonganti@1:                                   overlap with remaining query. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --sourmashgather_ignoreabn      : Do NOT use k-mer abundances if present.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_prefetch       : Use prefetch before gather. Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_noprefetch     : Do not use prefetch before gather.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_ani_ci         : Output confidence intervals for ANI
kkonganti@1:                                   estimates. Default: true
kkonganti@1: 
kkonganti@1: --sourmashgather_k              : The k-mer size to select. Default: 71
kkonganti@1: 
kkonganti@1: --sourmashgather_protein        : Choose a protein signature. Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_noprotein      : Do not choose a protein signature.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_dayhoff        : Choose Dayhoff-encoded amino acid
kkonganti@1:                                   signatures. Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_nodayhoff      : Do not choose Dayhoff-encoded amino acid
kkonganti@1:                                   signatures. Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_hp             : Choose hydrophobic-polar-encoded amino acid
kkonganti@1:                                   signatures. Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_nohp           : Do not choose hydrophobic-polar-encoded
kkonganti@1:                                   amino acid signatures. Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_dna            : Choose DNA signature. Default: true
kkonganti@1: 
kkonganti@1: --sourmashgather_nodna          : Do not choose DNA signature. Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_scaled         : Scaled value should be between 100 and
kkonganti@1:                                   1e6. Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_inc_pat        : Search only signatures that match this
kkonganti@1:                                   pattern in name, filename, or md5.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --sourmashgather_exc_pat        : Search only signatures that do not match
kkonganti@1:                                   this pattern in name, filename, or md5.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_run            : Run `sourmash search` tool. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_n              : Number of results to report. By default,
kkonganti@1:                                   will terminate at --sourmashsearch_thr
kkonganti@1:                                   value. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_thr            : Reporting threshold (similarity) to return
kkonganti@1:                                   results. Default: 0
kkonganti@1: 
kkonganti@1: --sourmashsearch_contain        : Score based on containment rather than
kkonganti@1:                                   similarity. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_maxcontain     : Score based on max containment rather than
kkonganti@1:                                   similarity. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_ignoreabn      : Do NOT use k-mer abundances if present.
kkonganti@1:                                   Default: true
kkonganti@1: 
kkonganti@1: --sourmashsearch_ani_ci         : Output confidence intervals for ANI
kkonganti@1:                                   estimates. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_k              : The k-mer size to select. Default: 71
kkonganti@1: 
kkonganti@1: --sourmashsearch_protein        : Choose a protein signature. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_noprotein      : Do not choose a protein signature.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_dayhoff        : Choose Dayhoff-encoded amino acid
kkonganti@1:                                   signatures. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_nodayhoff      : Do not choose Dayhoff-encoded amino acid
kkonganti@1:                                   signatures. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_hp             : Choose hydrophobic-polar-encoded amino acid
kkonganti@1:                                   signatures. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_nohp           : Do not choose hydrophobic-polar-encoded
kkonganti@1:                                   amino acid signatures. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_dna            : Choose DNA signature. Default: true
kkonganti@1: 
kkonganti@1: --sourmashsearch_nodna          : Do not choose DNA signature. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_scaled         : Scaled value should be between 100 and
kkonganti@1:                                   1e6. Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_inc_pat        : Search only signatures that match this
kkonganti@1:                                   pattern in name, filename, or md5.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --sourmashsearch_exc_pat        : Search only signatures that do not match
kkonganti@1:                                   this pattern in name, filename, or md5.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --sfhpy_run                     : Run the sourmash_filter_hits.py script.
kkonganti@1:                                   Default: true
kkonganti@1: 
kkonganti@1: --sfhpy_fcn                     : Column name by which filtering of rows
kkonganti@1:                                   should be applied. Default: f_match
kkonganti@1: 
kkonganti@1: --sfhpy_fcv                     : Remove genomes whose match with the query
kkonganti@1:                                   FASTQ is less than this much. Default: 0.1
kkonganti@1: 
kkonganti@1: --sfhpy_gt                      : Apply greater than or equal to condition
kkonganti@1:                                   on numeric values of --sfhpy_fcn column.
kkonganti@1:                                   Default: true
kkonganti@1: 
kkonganti@1: --sfhpy_lt                      : Apply less than or equal to condition on
kkonganti@1:                                   numeric values of --sfhpy_fcn column.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --kmaindex_run                  : Run kma index tool. Default: true
kkonganti@1: 
kkonganti@1: --kmaindex_t_db                 : Add to existing DB. Default: false
kkonganti@1: 
kkonganti@1: --kmaindex_k                    : k-mer size. Default: 31
kkonganti@1: 
kkonganti@1: --kmaindex_m                    : Minimizer size. Default: false
kkonganti@1: 
kkonganti@1: --kmaindex_hc                   : Homopolymer compression. Default: false
kkonganti@1: 
kkonganti@1: --kmaindex_ML                   : Minimum length of templates. Defaults to
kkonganti@1:                                   --kmaindex_k. Default: false
kkonganti@1: 
kkonganti@1: --kmaindex_ME                   : Mega DB. Default: false
kkonganti@1: 
kkonganti@1: --kmaindex_Sparse               : Make Sparse DB. Default: false
kkonganti@1: 
kkonganti@1: --kmaindex_ht                   : Homology template. Default: false
kkonganti@1: 
kkonganti@1: --kmaindex_hq                   : Homology query. Default: false
kkonganti@1: 
kkonganti@1: --kmaindex_and                  : Both homology thresholds have to reach.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --kmaindex_nbp                  : No bias print. Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_run                  : Run kma tool. Default: true
kkonganti@1: 
kkonganti@1: --kmaalign_int                  : Input file has interleaved reads.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_ef                   : Output additional features. Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_vcf                  : Output vcf file. 2 to apply FT. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --kmaalign_sam                  : Output SAM, 4/2096 for mapped/aligned.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_nc                   : No consensus file. Default: true
kkonganti@1: 
kkonganti@1: --kmaalign_na                   : No aln file. Default: true
kkonganti@1: 
kkonganti@1: --kmaalign_nf                   : No frag file. Default: true
kkonganti@1: 
kkonganti@1: --kmaalign_a                    : Output all template mappings. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --kmaalign_and                  : Use both -mrs and p-value on consensus.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_oa                   : Use neither -mrs nor p-value on consensus.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_bc                   : Minimum support to call bases. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --kmaalign_bcNano               : Altered indel calling for ONT data.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_bcd                  : Minimum depth to call bases. Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_bcg                  : Maintain insignificant gaps. Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_ID                   : Minimum consensus ID. Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_md                   : Minimum depth. Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_dense                : Skip insertion in consensus. Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_ref_fsa              : Use Ns on indels. Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_Mt1                  : Map everything to one template. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --kmaalign_1t1                  : Map one query to one template. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --kmaalign_mrs                  : Minimum relative alignment score. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --kmaalign_mrc                  : Minimum query coverage. Default: 0.99
kkonganti@1: 
kkonganti@1: --kmaalign_mp                   : Minimum phred score of trailing and leading
kkonganti@1:                                   bases. Default: 30
kkonganti@1: 
kkonganti@1: --kmaalign_mq                   : Set the minimum mapping quality. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --kmaalign_eq                   : Minimum average quality score. Default: 30
kkonganti@1: 
kkonganti@1: --kmaalign_5p                   : Trim 5 prime by this many bases. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --kmaalign_3p                   : Trim 3 prime by this many bases. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --kmaalign_apm                  : Sets both -pm and -fpm. Default: false
kkonganti@1: 
kkonganti@1: --kmaalign_cge                  : Set CGE penalties and rewards. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --salmonidx_run                 : Run `salmon index` tool. Default: true
kkonganti@1: 
kkonganti@1: --salmonidx_k                   : The size of k-mers that should be used for
kkonganti@1:                                   the quasi index. Default: false
kkonganti@1: 
kkonganti@1: --salmonidx_gencode             : This flag will expect the input transcript
kkonganti@1:                                   FASTA to be in GENCODE format, and will
kkonganti@1:                                   split the transcript name at the first `|`
kkonganti@1:                                   character. These reduced names will be used
kkonganti@1:                                   in the output and when looking for these
kkonganti@1:                                   transcripts in a gene to transcript GTF.
kkonganti@1:                                   Default: false
kkonganti@1: 
kkonganti@1: --salmonidx_features            : This flag will expect the input reference
kkonganti@1:                                   to be in the tsv file format, and will
kkonganti@1:                                   split the feature name at the first `tab`
kkonganti@1:                                   character. These reduced names will be used
kkonganti@1:                                   in the output and when looking for the
kkonganti@1:                                   sequence of the features. Default: false
kkonganti@1: 
kkonganti@1: --salmonidx_keepDuplicates      : This flag will disable the default indexing
kkonganti@1:                                   behavior of discarding sequence-identical
kkonganti@1:                                   duplicate transcripts. If this flag is
kkonganti@1:                                   passed then duplicate transcripts that
kkonganti@1:                                   appear in the input will be retained and
kkonganti@1:                                   quantified separately. Default: false
kkonganti@1: 
kkonganti@1: --salmonidx_keepFixedFasta      : Retain the fixed fasta file (without short
kkonganti@1:                                   transcripts and duplicates, clipped, etc.)
kkonganti@1:                                   generated during indexing. Default: false
kkonganti@1: 
kkonganti@1: --salmonidx_filterSize          : The size of the Bloom filter that will be
kkonganti@1:                                   used by TwoPaCo during indexing. The filter
kkonganti@1:                                   will be of size 2^{filterSize}. A value of
kkonganti@1:                                   -1 means that the filter size will be
kkonganti@1:                                   automatically set based on the number of
kkonganti@1:                                   distinct k-mers in the input, as estimated
kkonganti@1:                                   by nthll. Default: false
kkonganti@1: 
kkonganti@1: --salmonidx_sparse              : Build the index using a sparse sampling of
kkonganti@1:                                   k-mer positions. This will require less
kkonganti@1:                                   memory (especially during quantification),
kkonganti@1:                                   but will take longer to construct and can
kkonganti@1:                                   slow down mapping / alignment. Default:
kkonganti@1:                                   false
kkonganti@1: 
kkonganti@1: --salmonidx_n                   : Do not clip poly-A tails from the ends of
kkonganti@1:                                   target sequences. Default: false
kkonganti@1: 
kkonganti@1: --gsrpy_run                     : Run the gen_salmon_res_table.py script.
kkonganti@1:                                   Default: true
kkonganti@1: 
kkonganti@1: --gsrpy_url                     : Generate an additional column in final
kkonganti@1:                                   results table which links out to NCBI
kkonganti@1:                                   Pathogens Isolate Browser.  Default: true
kkonganti@1: 
kkonganti@1: Help options                    :
kkonganti@1: 
kkonganti@1: --help                          : Display this message.
kkonganti@1: 
kkonganti@1: ```
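
The help text for `--fastp_y` and `--fastp_Y` defines read complexity as the percentage of bases that differ from the next base. The sketch below illustrates that definition only; it is not `fastp`'s own code. The function names `complexity_percent` and `passes_complexity_filter` are hypothetical, and dividing by `len(seq) - 1` (the number of adjacent base pairs) is an assumption about how the percentage is normalized.

```python
def complexity_percent(seq: str) -> float:
    """Percentage of positions where base[i] != base[i+1].

    Assumption: the denominator is the number of adjacent pairs
    (len(seq) - 1); reads shorter than 2 bases get 0.0.
    """
    if len(seq) < 2:
        return 0.0
    # Count adjacent pairs whose two bases differ.
    diffs = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    return 100.0 * diffs / (len(seq) - 1)


def passes_complexity_filter(seq: str, threshold: float = 30.0) -> bool:
    """Keep a read when its complexity reaches the threshold.

    The default of 30 mirrors the --fastp_Y default listed above.
    Treating the threshold as inclusive is an assumption.
    """
    return complexity_percent(seq) >= threshold


if __name__ == "__main__":
    for read in ("AAAAAAAAAA", "ACGTACGTAC", "AAAAACCCCC"):
        print(read, round(complexity_percent(read), 1),
              passes_complexity_filter(read))
```

Under these assumptions a homopolymer read such as `AAAAAAAAAA` scores 0% and would be dropped at the default threshold, while a read with no repeated adjacent bases scores 100%.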