annotate 0.4.2/readme/centriflaken.md @ 0:082e0091e813 draft default tip

planemo upload
author galaxytrakr
date Fri, 29 May 2026 13:27:47 +0000
# centriflaken

`centriflaken` is an automated precision metagenomics workflow for assembly and _in silico_ analyses of food-borne pathogens. Although primarily fine-tuned for detecting and classifying Shiga toxin-producing **_Escherichia coli_** (**STEC**), `centriflaken` can also be used to analyze other food-borne pathogens such as **_Salmonella enterica_**. It takes as input a UNIX path to FASTQ files, generates metagenome-assembled genomes (MAGs), and performs _in silico_ analysis for STECs as described in [Maguire et al. 2021](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0245172).

`centriflaken` works on both **Illumina** short reads and **Oxford Nanopore** long reads.

It is written in **Nextflow** and is part of the modular data analysis pipelines at **HFP**.

\
&nbsp;

<!-- TOC -->

- [Minimum Requirements](#minimum-requirements)
- [HFP GalaxyTrakr](#hfp-galaxytrakr)
- [Usage and Examples](#usage-and-examples)
  - [Databases](#databases)
  - [Input](#input)
  - [Illumina short reads](#illumina-short-reads)
  - [Output](#output)
  - [Computational resources](#computational-resources)
  - [Runtime profiles](#runtime-profiles)
  - [your_institution.config](#your_institutionconfig)
  - [Cloud computing](#cloud-computing)
  - [Test run](#test-run)
- [centriflaken CLI Help](#centriflaken-cli-help)
- [centriflaken_hy CLI Help](#centriflaken_hy-cli-help)

<!-- /TOC -->

\
&nbsp;

## Minimum Requirements

1. [Nextflow version 24.10.4](https://github.com/nextflow-io/nextflow/releases/download/v24.10.4/nextflow).
    - Make the `nextflow` binary executable (`chmod 755 nextflow`) and make sure that it is available in your `$PATH`.
    - If your existing `JAVA` install does not support the newest **Nextflow** version, you can try **Amazon**'s `JAVA` (OpenJDK): [Corretto](https://docs.aws.amazon.com/corretto/latest/corretto-21-ug/downloads-list.html).
2. Any one of `micromamba` (version `1.5.9`), `docker` or `singularity` installed and available in your `$PATH`.
    - Running the workflow via `micromamba` software provisioning is **preferred** as it does not require any `sudo` or `admin` privileges or any other configurations with respect to the various container providers.
    - To install `micromamba` for your system type, please follow these [installation steps](https://mamba.readthedocs.io/en/latest/installation/micromamba-installation.html#linux-and-macos) and make sure that the `micromamba` binary is available in your `$PATH`.
    - Just the `curl` step is sufficient to download the binary as far as running the workflows is concerned.
    - Once you have finished the installation, **it is important that you downgrade `micromamba` to version `1.5.9`**.
    - First check whether your version is other than `1.5.9` and, if it is, do the downgrade.

    ```bash
    micromamba --version
    micromamba self-update --version 1.5.9 -c conda-forge
    ```

3. Minimum of 10 CPU cores and about 60 GB of memory for the main workflow steps. More memory may be required if your **FASTQ** files are big.

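The version check and downgrade from step 2 can be combined into one small script. This is only a minimal sketch: it assumes that `micromamba --version` prints just the version number, and the function names `needs_downgrade` and `pin_micromamba` are hypothetical helpers that are not part of `centriflaken`.

```bash
# Version of micromamba required by the workflow (from the requirements above).
required_version="1.5.9"

# needs_downgrade VERSION: succeeds when VERSION differs from the required one.
needs_downgrade() {
    [ "$1" != "$required_version" ]
}

# pin_micromamba: run the self-update only when the installed version differs.
# Assumption: `micromamba --version` prints only the version string.
pin_micromamba() {
    if ! command -v micromamba >/dev/null 2>&1; then
        echo "micromamba not found in PATH" >&2
        return 1
    fi
    installed_version="$(micromamba --version)"
    if needs_downgrade "$installed_version"; then
        echo "micromamba $installed_version found; switching to $required_version"
        micromamba self-update --version "$required_version" -c conda-forge
    else
        echo "micromamba $required_version is already installed"
    fi
}
```

Calling `pin_micromamba` once after installation should leave an already correct install untouched.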
\
&nbsp;

## HFP GalaxyTrakr

The `centriflaken` pipeline is also available for use on the [Galaxy instance supported by HFP, FDA](https://galaxytrakr.org/). If you wish to run the analysis using **Galaxy**, please register for an account, after which [you can run the workflow using this protocol](https://www.protocols.io/view/centriflaken-an-automated-data-analysis-pipeline-f-kxygxzdbwv8j/v5).

Please note that the pipeline on [HFP GalaxyTrakr](https://galaxytrakr.org) in most cases may be a version older than the one on **GitHub** due to testing prioritization.

\
&nbsp;

## Usage and Examples

Clone or download this repository and then call `cpipes`.

```bash
cpipes --pipeline centriflaken [options]
```

Alternatively, you can use `nextflow` to directly pull and run the pipeline.

```bash
nextflow pull CFSAN-Biostatistics/centriflaken
nextflow list
nextflow info CFSAN-Biostatistics/centriflaken
nextflow run CFSAN-Biostatistics/centriflaken --pipeline centriflaken --help
nextflow run CFSAN-Biostatistics/centriflaken --pipeline centriflaken_hy --help
```

\
&nbsp;

### Databases

---

A successful run of the workflow requires all of the following databases:

- `kraken2`, `centrifuge`, `serotypefinder` and `abricate`: [Download](https://cfsan-pub-xfer.s3.amazonaws.com/Kranti.Konganti/centriflaken/centriflaken_dbs.tar.bz2).

Once you have downloaded the databases, uncompress them and set the **UNIX** paths in the configuration files as follows:

- [Line no. 4](../workflows/conf/centriflaken.config#L4): `centrifuge_x = /path/to/centriflaken_dbs/centrifuge/ab`. The `ab` prefix is necessary.
- [Line no. 11](../workflows/conf/centriflaken_hy.config#L11): `centrifuge_x = /path/to/centriflaken_dbs/centrifuge/ab`. The `ab` prefix is necessary.
- [Line no. 10](../workflows/conf/centriflaken.config#L10): `kraken2_db = /path/to/centriflaken_dbs/kraken2`.
- [Line no. 17](../workflows/conf/centriflaken_hy.config#L17): `kraken2_db = /path/to/centriflaken_dbs/kraken2`.
- [Line no. 36](../workflows/conf/centriflaken.config#L36): `serotypefinder_db = /path/to/centriflaken_dbs/serotypefinder`.
- [Line no. 64](../workflows/conf/centriflaken_hy.config#L64): `serotypefinder_db = /path/to/centriflaken_dbs/serotypefinder`.
- [Line no. 53](../workflows/conf/centriflaken.config#L53): `abricate_datadir = /path/to/centriflaken_dbs/abricate`.
- [Line no. 81](../workflows/conf/centriflaken_hy.config#L81): `abricate_datadir = /path/to/centriflaken_dbs/abricate`.

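Before editing the configuration files, it may help to confirm that the uncompressed archive has the four folders that the paths above point to. The following is a minimal sketch; `check_db_layout` is a hypothetical helper, and the folder names are taken from the example paths above rather than from an inspection of the archive.

```bash
# check_db_layout DIR: report which expected database folders exist under DIR.
# Returns 0 only when all four folders are present.
check_db_layout() {
    db_root="$1"
    missing=0
    for sub in centrifuge kraken2 serotypefinder abricate; do
        if [ -d "$db_root/$sub" ]; then
            echo "found:   $db_root/$sub"
        else
            echo "missing: $db_root/$sub" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, after `tar -xjf centriflaken_dbs.tar.bz2`, running `check_db_layout /path/to/centriflaken_dbs` should list all four folders if the archive extracted as assumed.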
\
&nbsp;

### Input

---

The input to the workflow is a folder containing compressed (`.gz`) FASTQ files of long reads or short reads. Please note that the sample grouping happens automatically by the file name of the FASTQ file. If, for example, a single sample is sequenced across multiple sequencing lanes, you can choose to group those FASTQ files into one sample by using the `--fq_filename_delim` and `--fq_filename_delim_idx` options. By default, `--fq_filename_delim` is set to `_` (underscore) and `--fq_filename_delim_idx` is set to 1.

For example, if the directory contains FASTQ files as shown below:

- KB-01_apple_L001_R1.fastq.gz
- KB-01_apple_L001_R2.fastq.gz
- KB-01_apple_L002_R1.fastq.gz
- KB-01_apple_L002_R2.fastq.gz
- KB-02_mango_L001_R1.fastq.gz
- KB-02_mango_L001_R2.fastq.gz
- KB-02_mango_L002_R1.fastq.gz
- KB-02_mango_L002_R2.fastq.gz

Then, to create 2 sample groups, `apple` and `mango`, we split the file name by the delimiter (underscore in this case, which is the default) and group by the first 2 words (`--fq_filename_delim_idx 2`).

It goes without saying that all the FASTQ files should have uniform naming patterns so that the `--fq_filename_delim` and `--fq_filename_delim_idx` options do not have any adverse effect in collecting and creating a sample metadata sheet.

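The grouping rule described above can be imitated with standard shell tools to preview which files would fall into the same sample. This is only an illustrative sketch: `sample_group` is a hypothetical helper, not the pipeline's own implementation, and the labels it prints (the first two fields joined by the delimiter) may differ from the names the pipeline assigns.

```bash
# sample_group FILE DELIM IDX: keep the first IDX delimiter-separated
# fields of the file name, mimicking --fq_filename_delim and
# --fq_filename_delim_idx.
sample_group() {
    basename "$1" | cut -d "$2" -f "1-$3"
}

# Preview the groups for a few of the example file names.
for fq in KB-01_apple_L001_R1.fastq.gz KB-01_apple_L002_R1.fastq.gz \
          KB-02_mango_L001_R1.fastq.gz KB-02_mango_L002_R1.fastq.gz; do
    sample_group "$fq" "_" 2
done | sort -u
# prints:
# KB-01_apple
# KB-02_mango
```

Files from lanes `L001` and `L002` collapse into one label per sample because the lane field lies beyond the second delimiter.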
\
&nbsp;

### Illumina short reads

---

`centriflaken` was primarily developed for **ONT** long reads but also supports **Illumina** short reads. Use `--pipeline centriflaken_hy` instead of `--pipeline centriflaken` to activate this feature. The `centriflaken_hy` variant of the pipeline uses `megahit` instead of `flye` to perform short read assembly. No other change is needed from the user apart from using the `--pipeline centriflaken_hy` parameter for Illumina short reads.

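Since the pipeline name is the only difference between the two read types, the choice can be captured in a small wrapper. The sketch below only prints the command instead of running it; `build_cpipes_cmd` and the read type labels `ont` and `illumina` are hypothetical names introduced for illustration.

```bash
# build_cpipes_cmd READ_TYPE INPUT_DIR OUTPUT_DIR
# Prints (does not run) the cpipes command for the given read type.
build_cpipes_cmd() {
    read_type="$1"
    input_dir="$2"
    output_dir="$3"
    case "$read_type" in
        ont)      pipeline="centriflaken" ;;     # long reads
        illumina) pipeline="centriflaken_hy" ;;  # short reads
        *)
            echo "unknown read type: $read_type" >&2
            return 1
            ;;
    esac
    echo "cpipes --pipeline $pipeline --input $input_dir --output $output_dir"
}

build_cpipes_cmd illumina /path/to/illumina_fastq_dir /path/to/output
```

A `-profile` option, as used in the other examples in this document, would be appended to the printed command before running it.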
\
&nbsp;

### Output

---

All the outputs for each step are stored inside the folder mentioned with the `--output` option. The `multiqc_report.html` file inside the `centriflaken-multiqc` folder contains a brief consolidated report and can be opened in any browser on your local workstation.

\
&nbsp;

### Computational resources

---

The workflows `centriflaken` and `centriflaken_hy` require a minimum of 60 GB of memory to finish successfully.

\
&nbsp;

### Runtime profiles

---

You can use different run time profiles that suit your specific compute environment, i.e., you can run the workflow locally on your machine or on a grid computing infrastructure.

\
&nbsp;

Example:

```bash
cd /data/scratch/$USER
mkdir nf-cpipes
cd nf-cpipes
cpipes \
    --pipeline centriflaken \
    --input /path/to/fastq_pass_dir \
    --output /path/to/where/output/should/go \
    -profile your_institution
```

The above command runs the pipeline and stores the output at the location given by the `--output` flag. The **NEXTFLOW** reports are always stored in the current working directory from which `cpipes` is run. For the above command, a directory called `CPIPES-centriflaken` would hold all the **NEXTFLOW** related logs, reports and trace files.

\
&nbsp;

### `your_institution.config`

---

The above example sets the run time profile to `your_institution`. For this to work, add the following lines at the end of the [`computeinfra.config`](../conf/computeinfra.config) file, which should be located inside the `conf` folder. For example, if your institution uses **SGE** or **UNIVA** for grid computing instead of **SLURM** and has a job queue named `normal.q`, then add these lines:

\
&nbsp;

```groovy
your_institution {
    process.executor = 'sge'
    process.queue = 'normal.q'
    singularity.enabled = false
    singularity.autoMounts = true
    docker.enabled = false
    params.enable_conda = true
    conda.enabled = true
    conda.useMicromamba = true
    params.enable_module = false
}
```

In the above example, all the software provisioning choices except `conda` are disabled. You can also remove the `process.queue` line altogether, and the `centriflaken` workflow will automatically request the appropriate memory and number of CPU cores, ranging from 1 CPU core, 1 GB and 1 hour for job completion up to 10 CPU cores, 1 TB and 120 hours for job completion.

\
&nbsp;

### Cloud computing

---

You can run the workflow in the cloud (works only with proper set up of AWS resources). Add new run time profiles with required parameters per [Nextflow docs](https://www.nextflow.io/docs/latest/executor.html):

\
&nbsp;

Example:

```groovy
my_aws_batch {
    executor = 'awsbatch'
    queue = 'my-batch-queue'
    aws.batch.cliPath = '/home/ec2-user/miniconda/bin/aws'
    aws.batch.region = 'us-east-1'
    singularity.enabled = false
    singularity.autoMounts = true
    docker.enabled = true
    params.conda_enabled = false
    params.enable_module = false
}
```

\
&nbsp;

082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
241 ### Test run
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
242
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
243 ---
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
244
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
245 After you make sure that you have all the [minimum requirements](#minimum-requirements) to run the workflow, you can try the `centriflaken` pipeline on some subsampled reads belonging to the NCBI BioProject `PRJNA639799` as discussed in [Maguire _et al_](https://pmc.ncbi.nlm.nih.gov/articles/PMC10500926/).
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
246
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
247 - Please note that the input reads are subsampled to validate the software install.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
248 - Download them [from S3](https://cfsan-pub-xfer.s3.amazonaws.com/Kranti.Konganti/centriflaken/macguire_et_al_subsampled_reads.tar.bz2) (~ 20 GB).
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
249
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
250 | Samples | Biosample | SRA accession | Flowcell |
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
251 |:---------------------------------------------------------------|:-------------|:--------------|:---------|
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
252 | FAL00958 | SAMN46790801 | SRR32346290 | FAL00958 |
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
253 | FAL01198 | SAMN46793213 | SRR32346289 | FAL01198 |
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
254 | FAL01556 | SAMN46793220 | SRR32346278 | FAL01556 |
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
255 | ZymoBIOMICS Microbial Community DNA Standard R1 | SAMN46793392 | SRR32381322 | FAL11413 |
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
256 | ZymoBIOMICS Microbial Community DNA Standard R2 | SAMN46793393 | SRR32381321 | FAL01565 |
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
257 | ZymoBIOMICS Microbial Community Standard II - log distribution | SAMN46793397 | SRR32381320 | FAL01514 |
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
258
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
259 - Download pre-formatted databases (**MANDATORY**) [from S3](https://cfsan-pub-xfer.s3.amazonaws.com/Kranti.Konganti/centriflaken/centriflaken_dbs.tar.bz2) (~ 47 GB).
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
260 - One of the assembly jobs should fail to assemble the reads and the pipeline will ignore the failed assembly and finish to completion.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
261 - After successful download, untar and change the paths to the databases in **BOTH** the [long reads conf file](../workflows/conf/centriflaken.config) and [short reads conf file](../workflows/conf/centriflaken_hy.config) as described in the [Databases](#databases) section.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
262 - The following values should point to the UNIX paths of the downloaded databases.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
263
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
264 ```bash
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
265 centrifuge_x = '/path/to/centrifuge/ab' # /ab suffix SHOULD NOT change. Only the /path/to/centrifuge changes to your specific UNIX path.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
266 kraken2_db = '/path/to/kraken2'
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
267 serotypefinder_db = '/path/to/serotypefinder'
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
268 abricate_datadir = '/path/to/abricate'
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
269 amrfinderplus_db = '/hpc/db/amrfinderplus/3.10.24/latest' # IGNORE THIS PATH SINCE AMRFINDERPLUS SHOULD NOT BE RUN.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
270 ```
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
271
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
272 - It is always a best practice to use absolute UNIX paths and real destinations of symbolic links during pipeline execution. For example, find out the real path(s) of your absolute UNIX path(s) and use that for the `--input` and `--output` options of the pipeline.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
273
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
274 ```bash
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
275 realpath /hpc/scratch/user/input/srr
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
276 ```

- Now run the workflow, ignoring quality values since the base qualities are simulated:

```bash
cpipes \
    --pipeline centriflaken \
    --input /path/to/macguire_et_al_subsampled_reads \
    --output /path/to/centriflaken_test_output \
    -profile stdkondagac \
    -resume
```
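
The command above relies on the pipeline's configured defaults for quality handling. The CLI help lists `--centrifuge_ignore_quals` (default `false`) as the option that treats all quality values as 30 on the Phred scale. As a hedged sketch, assuming the option has to be passed explicitly in your setup and accepts a `true` value in the usual Nextflow parameter style, the helper below only prints the full command for review rather than running it. The function name `print_centriflaken_cmd` is hypothetical.

```bash
# Hypothetical helper: print (not run) a cpipes command line for review.
# Arguments: <input dir> <output dir>.
print_centriflaken_cmd() {
    printf '%s \\\n' "cpipes"
    # printf repeats the format for each argument, one option per line.
    printf '    %s \\\n' \
        "--pipeline centriflaken" \
        "--input $1" \
        "--output $2" \
        "--centrifuge_ignore_quals true" \
        "-profile stdkondagac"
    printf '    %s\n' "-resume"
}

print_centriflaken_cmd /path/to/macguire_et_al_subsampled_reads /path/to/centriflaken_test_output
```

Once the printed command looks right, it can be copied into the terminal and run as is.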

- After a successful run of the workflow, your **MultiQC** report should look something like [this](https://cfsan-pub-xfer.s3.us-east-1.amazonaws.com/Kranti.Konganti/centriflaken/macquire_et_al_test_report.html).

Please note that the run-time profile `stdkondagac` runs jobs locally using `micromamba` for software provisioning. The first time you run the command, a new folder called `kondagac_cache` is created, and subsequent runs should reuse this `conda` cache.

\
&nbsp;

## `centriflaken` CLI Help

```text
cpipes --pipeline centriflaken --help

N E X T F L O W ~ version 24.10.4

Launching `/home/user/centriflaken/cpipes` [sleepy_pauling] DSL2 - revision: 55d6f63710

================================================================================
             (o)
  ___  _ __   _  _ __    ___  ___
 / __|| '_ \ | || '_ \  / _ \/ __|
| (__ | |_) || || |_) ||  __/\__ \
 \___|| .__/ |_|| .__/  \___||___/
      | |       | |
      |_|       |_|
--------------------------------------------------------------------------------
A collection of modular pipelines at CFSAN, FDA.
--------------------------------------------------------------------------------
Name                            : CPIPES
Author                          : Kranti.Konganti@fda.hhs.gov
Version                         : 0.4.1
Center                          : CFSAN, FDA.
================================================================================

Workflow                        : centriflaken

Author                          : Kranti.Konganti@fda.hhs.gov

Version                         : 0.4.2


Usage                           : cpipes --pipeline centriflaken [options]


Required                        :

--input                         : Absolute path to directory containing FASTQ
                                  files. The directory should contain only
                                  FASTQ files as all the files within the
                                  mentioned directory will be read. Ex: --
                                  input /path/to/fastq_pass

--output                        : Absolute path to directory where all the
                                  pipeline outputs should be stored. Ex: --
                                  output /path/to/output

Other options                   :

--metadata                      : Absolute path to metadata CSV file
                                  containing five mandatory columns: sample,
                                  fq1,fq2,strandedness,single_end. The fq1
                                  and fq2 columns contain absolute paths to
                                  the FASTQ files. This option can be used in
                                  place of --input option. This is rare. Ex: --
                                  metadata samplesheet.csv

--fq_suffix                     : The suffix of FASTQ files (Unpaired reads
                                  or R1 reads or Long reads) if an input
                                  directory is mentioned via --input option.
                                  Default: .fastq.gz

--fq2_suffix                    : The suffix of FASTQ files (Paired-end reads
                                  or R2 reads) if an input directory is
                                  mentioned via --input option. Default:
                                  false

--fq_filter_by_len              : Remove FASTQ reads that are less than this
                                  many bases. Default: 4000

--fq_strandedness               : The strandedness of the sequencing run.
                                  This is mostly needed if your sequencing
                                  run is RNA-SEQ. For most of the other runs,
                                  it is probably safe to use unstranded for
                                  the option. Default: unstranded

--fq_single_end                 : SINGLE-END information will be auto-
                                  detected but this option forces PAIRED-END
                                  FASTQ files to be treated as SINGLE-END so
                                  only read 1 information is included in auto-
                                  generated samplesheet. Default: false

--fq_filename_delim             : Delimiter by which the file name is split
                                  to obtain sample name. Default: _

--fq_filename_delim_idx         : After splitting FASTQ file name by using
                                  the --fq_filename_delim option, all
                                  elements before this index (1-based) will
                                  be joined to create final sample name.
                                  Default: 1

--kraken2_db                    : Absolute path to kraken database. Default: /
                                  hpc/db/kraken2/standard-210914

--kraken2_confidence            : Confidence score threshold which must be
                                  between 0 and 1. Default: 0.0

--kraken2_quick                 : Quick operation (use first hit or hits).
                                  Default: false

--kraken2_use_mpa_style         : Report output like Kraken 1's kraken-mpa-
                                  report. Default: false

--kraken2_minimum_base_quality  : Minimum base quality used in classification
                                  which is only effective with FASTQ input.
                                  Default: 0

--kraken2_report_zero_counts    : Report counts for ALL taxa, even if counts
                                  are zero. Default: false

--kraken2_report_minmizer_data  : Report minimizer and distinct minimizer
                                  count information in addition to normal
                                  Kraken report. Default: false

--kraken2_use_names             : Print scientific names instead of just
                                  taxids. Default: true

--kraken2_extract_bug           : Extract the reads or contigs beloging to
                                  this bug. Default: Escherichia coli

--centrifuge_x                  : Absolute path to centrifuge database.
                                  Default: /hpc/db/centrifuge/2022-04-12/ab

--centrifuge_save_unaligned     : Save SINGLE-END reads that did not align.
                                  For PAIRED-END reads, save read pairs that
                                  did not align concordantly. Default: false

--centrifuge_save_aligned       : Save SINGLE-END reads that aligned. For
                                  PAIRED-END reads, save read pairs that
                                  aligned concordantly. Default: false

--centrifuge_out_fmt_sam        : Centrifuge output should be in SAM. Default:
                                  false

--centrifuge_extract_bug        : Extract this bug from centrifuge results.
                                  Default: Escherichia coli

--centrifuge_ignore_quals       : Treat all quality values as 30 on Phred
                                  scale. Default: false

--flye_pacbio_raw               : Input FASTQ reads are PacBio regular CLR
                                  reads (<20% error) Defaut: false

--flye_pacbio_corr              : Input FASTQ reads are PacBio reads that
                                  were corrected with other methods (<3%
                                  error). Default: false

--flye_pacbio_hifi              : Input FASTQ reads are PacBio HiFi reads (<1%
                                  error). Default: false

--flye_nano_raw                 : Input FASTQ reads are ONT regular reads,
                                  pre-Guppy5 (<20% error). Default: true

--flye_nano_corr                : Input FASTQ reads are ONT reads that were
                                  corrected with other methods (<3% error).
                                  Default: false

--flye_nano_hq                  : Input FASTQ reads are ONT high-quality
                                  reads: Guppy5+ SUP or Q20 (<5% error).
                                  Default: false

--flye_genome_size              : Estimated genome size (for example, 5m or 2.
                                  6g). Default: 5.5m

--flye_polish_iter              : Number of genome polishing iterations.
                                  Default: false

--flye_meta                     : Do a metagenome assembly (unenven coverage
                                  mode). Default: true

--flye_min_overlap              : Minimum overlap between reads. Default:
                                  false

--flye_scaffold                 : Enable scaffolding using assembly graph.
                                  Default: false

--serotypefinder_run            : Run SerotypeFinder tool. Default: true

--serotypefinder_x              : Generate extended output files. Default:
                                  true

--serotypefinder_db             : Path to SerotypeFinder databases. Default: /
                                  hpc/db/serotypefinder/2.0.2

--serotypefinder_min_threshold  : Minimum percent identity (in float)
                                  required for calling a hit. Default: 0.85

--serotypefinder_min_cov        : Minumum percent coverage (in float)
                                  required for calling a hit. Default: 0.80

--seqsero2_run                  : Run SeqSero2 tool. Default: false

--seqsero2_t                    : '1' for interleaved paired-end reads, '2'
                                  for separated paired-end reads, '3' for
                                  single reads, '4' for genome assembly, '5'
                                  for nanopore reads (fasta/fastq). Default:
                                  4

--seqsero2_m                    : Which workflow to apply, 'a'(raw reads
                                  allele micro-assembly), 'k'(raw reads and
                                  genome assembly k-mer). Default: k

--seqsero2_c                    : SeqSero2 will only output serotype
                                  prediction without the directory containing
                                  log files. Default: false

--seqsero2_s                    : SeqSero2 will not output header in
                                  SeqSero_result.tsv. Default: false

--mlst_run                      : Run MLST tool. Default: true

--mlst_minid                    : DNA %identity of full allelle to consider '
                                  similar' [~]. Default: 95

--mlst_mincov                   : DNA %cov to report partial allele at all [?].
                                  Default: 10

--mlst_minscore                 : Minumum score out of 100 to match a scheme.
                                  Default: 50

--abricate_run                  : Run ABRicate tool. Default: true

--abricate_minid                : Minimum DNA %identity. Defaut: 90

--abricate_mincov               : Minimum DNA %coverage. Defaut: 80

--abricate_datadir              : ABRicate databases folder. Defaut: /hpc/db/
                                  abricate/1.0.1/db

Help options                    :

--help                          : Display this message.
```
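
The `--metadata` option expects a CSV with the five columns `sample,fq1,fq2,strandedness,single_end`. As a rough sketch of how such a file could be produced for single-end (for example, long-read) FASTQ files, the helper below writes one row per file. Several details are assumptions made for illustration and are not confirmed by the help text: the function name `make_samplesheet`, leaving `fq2` empty, writing `true` in `single_end`, and using the file name minus its suffix as the sample name.

```bash
# Hypothetical helper: print a samplesheet for single-end FASTQ files.
# Arguments: <absolute fastq dir> [suffix, default .fastq.gz].
make_samplesheet() {
    local dir suffix fq sample
    dir="$1"
    suffix="${2:-.fastq.gz}"
    echo "sample,fq1,fq2,strandedness,single_end"
    for fq in "$dir"/*"$suffix"; do
        [ -e "$fq" ] || continue    # skip when the glob matched nothing
        sample="$(basename "$fq" "$suffix")"
        # Empty fq2 and single_end=true are assumed conventions.
        echo "${sample},${fq},,unstranded,true"
    done
}

# With a placeholder directory this prints only the header line.
make_samplesheet /path/to/fastq_pass .fastq.gz
```

Redirecting the output to a file such as `samplesheet.csv` gives something that can be passed to `--metadata`. Because `fq1` must hold absolute paths, the directory argument should itself be an absolute path.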

\
&nbsp;

## `centriflaken_hy` CLI Help

```text
cpipes --pipeline centriflaken_hy --help

N E X T F L O W ~ version 24.10.4

Launching `/home/user/centriflaken/cpipes` [big_ramanujan] DSL2 - revision: 55d6f63710

================================================================================
             (o)
  ___  _ __   _  _ __    ___  ___
 / __|| '_ \ | || '_ \  / _ \/ __|
| (__ | |_) || || |_) ||  __/\__ \
 \___|| .__/ |_|| .__/  \___||___/
      | |       | |
      |_|       |_|
--------------------------------------------------------------------------------
A collection of modular pipelines at CFSAN, FDA.
--------------------------------------------------------------------------------
Name                            : CPIPES
Author                          : Kranti.Konganti@fda.hhs.gov
Version                         : 0.4.1
Center                          : CFSAN, FDA.
================================================================================

Workflow                        : centriflaken_hy

Author                          : Kranti.Konganti@fda.hhs.gov

Version                         : 0.4.1


Usage                           : cpipes --pipeline centriflaken_hy [options]


Required                        :

--input                         : Absolute path to directory containing FASTQ
                                  files. The directory should contain only
                                  FASTQ files as all the files within the
                                  mentioned directory will be read. Ex: --
                                  input /path/to/fastq_pass

--output                        : Absolute path to directory where all the
                                  pipeline outputs should be stored. Ex: --
                                  output /path/to/output

Other options                   :

--metadata                      : Absolute path to metadata CSV file
                                  containing five mandatory columns: sample,
                                  fq1,fq2,strandedness,single_end. The fq1
                                  and fq2 columns contain absolute paths to
                                  the FASTQ files. This option can be used in
                                  place of --input option. This is rare. Ex: --
                                  metadata samplesheet.csv

--fq_suffix                     : The suffix of FASTQ files (Unpaired reads
                                  or R1 reads or Long reads) if an input
                                  directory is mentioned via --input option.
                                  Default: _R1_001.fastq.gz

--fq2_suffix                    : The suffix of FASTQ files (Paired-end reads
                                  or R2 reads) if an input directory is
                                  mentioned via --input option. Default:
                                  _R2_001.fastq.gz

--fq_filter_by_len              : Remove FASTQ reads that are less than this
                                  many bases. Default: 75

--fq_strandedness               : The strandedness of the sequencing run.
                                  This is mostly needed if your sequencing
                                  run is RNA-SEQ. For most of the other runs,
                                  it is probably safe to use unstranded for
                                  the option. Default: unstranded

--fq_single_end                 : SINGLE-END information will be auto-
                                  detected but this option forces PAIRED-END
                                  FASTQ files to be treated as SINGLE-END so
                                  only read 1 information is included in auto-
                                  generated samplesheet. Default: false

--fq_filename_delim             : Delimiter by which the file name is split
                                  to obtain sample name. Default: _

--fq_filename_delim_idx         : After splitting FASTQ file name by using
                                  the --fq_filename_delim option, all
                                  elements before this index (1-based) will
                                  be joined to create final sample name.
                                  Default: 1

--seqkit_rmdup_run              : Remove duplicate sequences using seqkit
                                  rmdup. Default: false

--seqkit_rmdup_n                : Match and remove duplicate sequences by
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
630 full name instead of just ID. Defaut: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
631
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
632 --seqkit_rmdup_s : Match and remove duplicate sequences by
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
633 sequence content. Defaut: true
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
634
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
635 --seqkit_rmdup_d : Save the duplicated sequences to a file.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
636 Defaut: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
637
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
638 --seqkit_rmdup_D : Save the number and list of duplicated
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
639 sequences to a file. Defaut: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
640
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
641 --seqkit_rmdup_i : Ignore case while using seqkit rmdup.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
642 Defaut: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
643
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
644 --seqkit_rmdup_P : Only consider positive strand (i.e. 5')
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
645 when comparing by sequence content. Defaut:
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
646 false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
647
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
648 --kraken2_db : Absolute path to kraken database. Default: /
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
649 hpc/db/kraken2/standard-210914
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
650
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
651 --kraken2_confidence : Confidence score threshold which must be
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
652 between 0 and 1. Default: 0.0
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
653
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
654 --kraken2_quick : Quick operation (use first hit or hits).
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
655 Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
656
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
657 --kraken2_use_mpa_style : Report output like Kraken 1's kraken-mpa-
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
658 report. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
659
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
660 --kraken2_minimum_base_quality : Minimum base quality used in classification
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
661 which is only effective with FASTQ input.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
662 Default: 0
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
663
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
664 --kraken2_report_zero_counts : Report counts for ALL taxa, even if counts
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
665 are zero. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
666
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
667 --kraken2_report_minmizer_data : Report minimizer and distinct minimizer
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
668 count information in addition to normal
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
669 Kraken report. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
670
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
671 --kraken2_use_names : Print scientific names instead of just
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
672 taxids. Default: true
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
673
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
674 --kraken2_extract_bug : Extract the reads or contigs beloging to
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
675 this bug. Default: Escherichia coli
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
676
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
677 --centrifuge_x : Absolute path to centrifuge database.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
678 Default: /hpc/db/centrifuge/2022-04-12/ab
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
679
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
680 --centrifuge_save_unaligned : Save SINGLE-END reads that did not align.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
681 For PAIRED-END reads, save read pairs that
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
682 did not align concordantly. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
683
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
684 --centrifuge_save_aligned : Save SINGLE-END reads that aligned. For
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
685 PAIRED-END reads, save read pairs that
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
686 aligned concordantly. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
687
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
688 --centrifuge_out_fmt_sam : Centrifuge output should be in SAM. Default:
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
689 false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
690
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
691 --centrifuge_extract_bug : Extract this bug from centrifuge results.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
692 Default: Escherichia coli
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
693
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
694 --centrifuge_ignore_quals : Treat all quality values as 30 on Phred
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
695 scale. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
696
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
697 --megahit_run : Run MEGAHIT assembler. Default: true
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
698
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
699 --megahit_min_count : <int>. Minimum multiplicity for filtering (
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
700 k_min+1)-mers. Defaut: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
701
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
702 --megahit_k_list : Comma-separated list of kmer size. All
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
703 values must be odd, in the range 15-255,
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
704 increment should be <= 28. Ex: '21,29,39,59,
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
705 79,99,119,141'. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
706
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
707 --megahit_no_mercy : Do not add mercy k-mers. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
708
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
709 --megahit_bubble_level : <int>. Intensity of bubble merging (0-2), 0
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
710 to disable. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
711
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
712 --megahit_merge_level : <l,s>. Merge complex bubbles of length <= l*
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
713 kmer_size and similarity >= s. Default:
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
714 false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
715
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
716 --megahit_prune_level : <int>. Strength of low depth pruning (0-3).
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
717 Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
718
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
719 --megahit_prune_depth : <int>. Remove unitigs with avg k-mer depth
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
720 less than this value. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
721
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
722 --megahit_low_local_ratio : <float>. Ratio threshold to define low
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
723 local coverage contigs. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
724
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
725 --megahit_max_tip_len : <int>. remove tips less than this value [<
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
726 int> * k]. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
727
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
728 --megahit_no_local : Disable local assembly. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
729
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
730 --megahit_kmin_1pass : Use 1pass mode to build SdBG of k_min.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
731 Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
732
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
733 --megahit_preset : <str>. Override a group of parameters.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
734 Valid values are meta-sensitive which
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
735 enforces '--min-count 1 --k-list 21,29,39,
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
736 49,...,129,141', meta-large (large &
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
737 complex metagenomes, like soil) which
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
738 enforces '--k-min 27 --k-max 127 --k-step
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
739 10'. Default: meta-sensitive
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
740
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
741 --megahit_mem_flag : <int>. SdBG builder memory mode. 0: minimum;
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
742 1: moderate; 2: use all memory specified.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
743 Default: 2
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
744
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
745 --megahit_min_contig_len : <int>. Minimum length of contigs to output.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
746 Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
747
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
748 --spades_run : Run SPAdes assembler. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
749
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
750 --spades_isolate : This flag is highly recommended for high-
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
751 coverage isolate and multi-cell data.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
752 Defaut: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
753
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
754 --spades_sc : This flag is required for MDA (single-cell)
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
755 data. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
756
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
757 --spades_meta : This flag is required for metagenomic data.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
758 Default: true
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
759
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
760 --spades_bio : This flag is required for biosytheticSPAdes
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
761 mode. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
762
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
763 --spades_corona : This flag is required for coronaSPAdes mode.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
764 Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
765
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
766 --spades_rna : This flag is required for RNA-Seq data.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
767 Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
768
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
769 --spades_plasmid : Runs plasmidSPAdes pipeline for plasmid
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
770 detection. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
771
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
772 --spades_metaviral : Runs metaviralSPAdes pipeline for virus
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
773 detection. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
774
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
775 --spades_metaplasmid : Runs metaplasmidSPAdes pipeline for plasmid
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
776 detection in metagenomics datasets. Default:
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
777 false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
778
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
779 --spades_rnaviral : This flag enables virus assembly module
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
780 from RNA-Seq data. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
781
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
782 --spades_iontorrent : This flag is required for IonTorrent data.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
783 Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
784
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
785 --spades_only_assembler : Runs only the SPAdes assembler module (
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
786 without read error correction). Default:
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
787 false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
788
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
789 --spades_careful : Tries to reduce the number of mismatches
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
790 and short indels in the assembly. Default:
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
791 false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
792
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
793 --spades_cov_cutoff : Coverage cutoff value (a positive float
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
794 number). Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
795
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
796 --spades_k : List of k-mer sizes (must be odd and less
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
797 than 128). Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
798
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
799 --spades_hmm : Directory with custom hmms that replace the
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
800 default ones (very rare). Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
801
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
802 --serotypefinder_run : Run SerotypeFinder tool. Default: true
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
803
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
804 --serotypefinder_x : Generate extended output files. Default:
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
805 true
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
806
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
807 --serotypefinder_db : Path to SerotypeFinder databases. Default: /
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
808 hpc/db/serotypefinder/2.0.2
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
809
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
810 --serotypefinder_min_threshold : Minimum percent identity (in float)
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
811 required for calling a hit. Default: 0.85
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
812
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
813 --serotypefinder_min_cov : Minumum percent coverage (in float)
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
814 required for calling a hit. Default: 0.80
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
815
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
816 --seqsero2_run : Run SeqSero2 tool. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
817
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
818 --seqsero2_t : '1' for interleaved paired-end reads, '2'
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
819 for separated paired-end reads, '3' for
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
820 single reads, '4' for genome assembly, '5'
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
821 for nanopore reads (fasta/fastq). Default:
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
822 4
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
823
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
824 --seqsero2_m : Which workflow to apply, 'a'(raw reads
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
825 allele micro-assembly), 'k'(raw reads and
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
826 genome assembly k-mer). Default: k
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
827
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
828 --seqsero2_c : SeqSero2 will only output serotype
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
829 prediction without the directory containing
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
830 log files. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
831
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
832 --seqsero2_s : SeqSero2 will not output header in
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
833 SeqSero_result.tsv. Default: false
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
834
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
835 --mlst_run : Run MLST tool. Default: true
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
836
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
837 --mlst_minid : DNA %identity of full allelle to consider '
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
838 similar' [~]. Default: 95
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
839
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
840 --mlst_mincov : DNA %cov to report partial allele at all [?].
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
841 Default: 10
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
842
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
843 --mlst_minscore : Minumum score out of 100 to match a scheme.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
844 Default: 50
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
845
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
846 --abricate_run : Run ABRicate tool. Default: true
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
847
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
848 --abricate_minid : Minimum DNA %identity. Defaut: 90
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
849
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
850 --abricate_mincov : Minimum DNA %coverage. Defaut: 80
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
851
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
852 --abricate_datadir : ABRicate databases folder. Defaut: /hpc/db/
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
853 abricate/1.0.1/db
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
854
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
855 Help options :
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
856
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
857 --help : Display this message.
082e0091e813 planemo upload
galaxytrakr
parents:
diff changeset
858 ```
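
The interaction between `--fq_filename_delim` and `--fq_filename_delim_idx` is easier to see with a small example. The following is a minimal Python sketch, not the pipeline's own code. It assumes that "all elements before this index (1-based)" means the first `delim_idx` elements of the split file name are kept, so the default of `1` keeps only the first element. The function name `sample_name` and the example path are hypothetical.

```python
import os


def sample_name(fastq_path: str, delim: str = "_", delim_idx: int = 1) -> str:
    """Derive a sample name from a FASTQ file name.

    Assumption (not confirmed by the help text): the first `delim_idx`
    elements of the split file name are joined back with `delim`.
    """
    parts = os.path.basename(fastq_path).split(delim)
    return delim.join(parts[:delim_idx])


# Hypothetical Illumina-style file name.
fastq = "/data/run1/SampleA_S1_L001_R1_001.fastq.gz"
print(sample_name(fastq))                # SampleA
print(sample_name(fastq, delim_idx=2))   # SampleA_S1
```

Under this reading, raising `--fq_filename_delim_idx` keeps more of the file name in the sample name, which helps when the first element alone is not unique across files.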
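
The constraints listed for `--megahit_k_list` (odd values, range 15-255, increment of at most 28) can be checked before launching a run. The sketch below is an illustrative Python helper, not part of `centriflaken` or MEGAHIT. The function name `validate_k_list` is hypothetical, and the requirement that the list be strictly increasing is an added assumption.

```python
def validate_k_list(k_list: str) -> list[int]:
    """Check a comma-separated k-mer list against the documented rules.

    Rules taken from the help text: every value is odd and within 15-255,
    and consecutive values differ by at most 28.
    Assumption: values must also be strictly increasing.
    """
    ks = [int(k) for k in k_list.split(",")]
    for k in ks:
        if k % 2 == 0:
            raise ValueError(f"k-mer size {k} is not odd")
        if not 15 <= k <= 255:
            raise ValueError(f"k-mer size {k} is outside the range 15-255")
    for prev, cur in zip(ks, ks[1:]):
        step = cur - prev
        if step <= 0:
            raise ValueError(f"k-mer sizes must increase: {prev} -> {cur}")
        if step > 28:
            raise ValueError(f"increment {step} ({prev} -> {cur}) exceeds 28")
    return ks


# The example list from the help text passes the checks.
print(validate_k_list("21,29,39,59,79,99,119,141"))
```

A list such as `21,51` would be rejected by this helper because the increment of 30 exceeds 28.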