cfsan_centriflaken: 0.2.1/readme/centriflaken

annotate 0.2.1/readme/centriflaken_hy.md @ 1:e0d902b50cff

"planemo upload"

author	kkonganti
date	Mon, 27 Jun 2022 16:00:59 -0400
parents	77494b0fa3c7
children	30191f39a957

# CPIPES (CFSAN PIPELINES)

## The modular pipeline repository at CFSAN, FDA

CPIPES (CFSAN PIPELINES) is a collection of modular pipelines based on NEXTFLOW,
mostly for bioinformatics data analysis at CFSAN, FDA.

---

### centriflaken_hy

---
`centriflaken_hy` is a variant of the original `centriflaken` pipeline for Illumina short reads, either single-end or paired-end.

#### Workflow Usage

```bash
module load cpipes/0.2.0

cpipes --pipeline centriflaken_hy [options]
```

Example: Run the default `centriflaken_hy` pipeline with E. coli as the taxon of interest.

```bash
cd /hpc/scratch/$USER
mkdir -p nf-cpipes
cd nf-cpipes
cpipes --pipeline centriflaken_hy --input /path/to/illumina/fastq/dir --output /path/to/output --user_email 'Kranti.Konganti@fda.hhs.gov'
```
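
Because `--input` reads every file in the given directory, a quick pre-flight check can catch stray files before a run. The following is a minimal sketch and not part of CPIPES: the function name `check_fastq_dir` is hypothetical, and it assumes the default `--fq_suffix` and `--fq2_suffix` values shown in the help output.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of CPIPES): report files in an input directory
# that match neither the R1 nor the R2 suffix. The suffix defaults are the
# --fq_suffix and --fq2_suffix defaults listed in the help output.
check_fastq_dir() {
    local dir="$1"
    local r1_suffix="${2:-_R1_001.fastq.gz}"
    local r2_suffix="${3:-_R2_001.fastq.gz}"
    local stray=0 f name

    for f in "$dir"/*; do
        [ -e "$f" ] || continue              # empty directory: glob did not expand
        name="$(basename "$f")"
        case "$name" in
            *"$r1_suffix"|*"$r2_suffix") ;;  # expected FASTQ file
            *) echo "unexpected file: $name" >&2
               stray=$((stray + 1)) ;;
        esac
    done

    # Exit status is non-zero when at least one stray file was found.
    [ "$stray" -eq 0 ]
}
```

One possible use is to gate the run on the check, for example `check_fastq_dir /path/to/illumina/fastq/dir && cpipes --pipeline centriflaken_hy ...`.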

Example: Run the `centriflaken_hy` pipeline with Salmonella as the taxon of interest. In this mode, the `SerotypeFinder` tool is replaced with the `SeqSero2` tool.

```bash
cd /hpc/scratch/$USER
mkdir -p nf-cpipes
cd nf-cpipes
cpipes --pipeline centriflaken_hy --centrifuge_extract_bug 'Salmonella' --input /path/to/illumina/fastq/dir --output /path/to/output --user_email 'Kranti.Konganti@fda.hhs.gov'
```
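
The two examples differ only in the `--centrifuge_extract_bug` value. When the same reads need to be screened for more than one taxon, a dry-run helper that prints the commands first may be handy. This is only a sketch: `print_cpipes_cmds` is a hypothetical name, and writing each taxon to its own output subdirectory is an assumption, not a CPIPES requirement.

```bash
#!/usr/bin/env bash
# Hypothetical dry-run helper (not part of CPIPES): print one cpipes command
# per taxon of interest, each with its own output directory. It only prints;
# nothing is executed.
print_cpipes_cmds() {
    local input="$1" outroot="$2"
    shift 2
    local taxon outdir

    for taxon in "$@"; do
        # Directory-safe label: spaces in the taxon name become underscores.
        outdir="$outroot/$(printf '%s' "$taxon" | tr ' ' '_')"
        printf "cpipes --pipeline centriflaken_hy --centrifuge_extract_bug '%s' --input %s --output %s\n" \
            "$taxon" "$input" "$outdir"
    done
}

print_cpipes_cmds /path/to/illumina/fastq/dir /path/to/output 'Escherichia coli' 'Salmonella'
```

Only flags that appear in the examples above are used; review the printed commands before running them.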

#### `centriflaken_hy` Help

```text
[Kranti.Konganti@login2-slurm ]$ cpipes --pipeline centriflaken_hy --help
N E X T F L O W  ~  version 21.12.1-edge
Launching `/nfs/software/apps/cpipes/0.2.1/cpipes` [wise_noyce] - revision: 72db279311
================================================================================
             (o)
  ___  _ __   _  _ __    ___  ___
 / __|| '_ \ | || '_ \  / _ \/ __|
| (__ | |_) || || |_) ||  __/\__ \
 \___|| .__/ |_|| .__/  \___||___/
      | |       | |
      |_|       |_|
--------------------------------------------------------------------------------
A collection of modular pipelines at CFSAN, FDA.
--------------------------------------------------------------------------------
Name                            : CPIPES
Author                          : Kranti.Konganti@fda.hhs.gov
Version                         : 0.2.1
Center                          : CFSAN, FDA.
================================================================================

Workflow                        : centriflaken_hy

Author                          : Kranti.Konganti@fda.hhs.gov

Version                         : 0.2.0


Usage                           : cpipes --pipeline centriflaken_hy [options]


Required                        :

--input                         : Absolute path to directory containing FASTQ
                                  files. The directory should contain only
                                  FASTQ files as all the files within the
                                  mentioned directory will be read. Ex: --
                                  input /path/to/fastq_pass

--output                        : Absolute path to directory where all the
                                  pipeline outputs should be stored. Ex: --
                                  output /path/to/output

Other options                   :

--metadata                      : Absolute path to metadata CSV file
                                  containing five mandatory columns: sample,
                                  fq1,fq2,strandedness,single_end. The fq1
                                  and fq2 columns contain absolute paths to
                                  the FASTQ files. This option can be used in
                                  place of --input option. This is rare. Ex: --
                                  metadata samplesheet.csv

--fq_suffix                     : The suffix of FASTQ files (Unpaired reads
                                  or R1 reads or Long reads) if an input
                                  directory is mentioned via --input option.
                                  Default: _R1_001.fastq.gz

--fq2_suffix                    : The suffix of FASTQ files (Paired-end reads
                                  or R2 reads) if an input directory is
                                  mentioned via --input option. Default:
                                  _R2_001.fastq.gz

--fq_filter_by_len              : Remove FASTQ reads that are less than this
                                  many bases. Default: 75

--fq_strandedness               : The strandedness of the sequencing run.
                                  This is mostly needed if your sequencing
                                  run is RNA-SEQ. For most of the other runs,
                                  it is probably safe to use unstranded for
                                  the option. Default: unstranded

--fq_single_end                 : SINGLE-END information will be auto-
                                  detected but this option forces PAIRED-END
                                  FASTQ files to be treated as SINGLE-END so
                                  only read 1 information is included in auto-
                                  generated samplesheet. Default: false

--fq_filename_delim             : Delimiter by which the file name is split
                                  to obtain sample name. Default: _

--fq_filename_delim_idx         : After splitting FASTQ file name by using
                                  the --fq_filename_delim option, all
                                  elements before this index (1-based) will
                                  be joined to create final sample name.
                                  Default: 1

--kraken2_db                    : Absolute path to kraken database. Default: /
                                  hpc/db/kraken2/standard-210914

--kraken2_confidence            : Confidence score threshold which must be
                                  between 0 and 1. Default: 0.0

--kraken2_quick                 : Quick operation (use first hit or hits).
                                  Default: false

--kraken2_use_mpa_style         : Report output like Kraken 1's kraken-mpa-
                                  report. Default: false

--kraken2_minimum_base_quality  : Minimum base quality used in classification
                                  which is only effective with FASTQ input.
                                  Default: 0

--kraken2_report_zero_counts    : Report counts for ALL taxa, even if counts
                                  are zero. Default: false

--kraken2_report_minmizer_data  : Report minimizer and distinct minimizer
                                  count information in addition to normal
                                  Kraken report. Default: false

--kraken2_use_names             : Print scientific names instead of just
                                  taxids. Default: true

--kraken2_extract_bug           : Extract the reads or contigs beloging to
                                  this bug. Default: Escherichia coli

--centrifuge_x                  : Absolute path to centrifuge database.
                                  Default: /hpc/db/centrifuge/2022-04-12/ab

--centrifuge_save_unaligned     : Save SINGLE-END reads that did not align.
                                  For PAIRED-END reads, save read pairs that
                                  did not align concordantly. Default: false

--centrifuge_save_aligned       : Save SINGLE-END reads that aligned. For
                                  PAIRED-END reads, save read pairs that
                                  aligned concordantly. Default: false

--centrifuge_out_fmt_sam        : Centrifuge output should be in SAM. Default:
                                  false

--centrifuge_extract_bug        : Extract this bug from centrifuge results.
                                  Default: Escherichia coli

--centrifuge_ignore_quals       : Treat all quality values as 30 on Phred
                                  scale. Default: false

--spades_isolate                : This flag is highly recommended for high-
                                  coverage isolate and multi-cell data.
                                  Defaut: false

--spades_sc                     : This flag is required for MDA (single-cell)
                                  data. Default: false

--spades_meta                   : This flag is required for metagenomic data.
                                  Default: true

--spades_bio                    : This flag is required for biosytheticSPAdes
                                  mode. Default: false

--spades_corona                 : This flag is required for coronaSPAdes mode.
                                  Default: false

--spades_rna                    : This flag is required for RNA-Seq data.
                                  Default: false

--spades_plasmid                : Runs plasmidSPAdes pipeline for plasmid
                                  detection. Default: false

--spades_metaviral              : Runs metaviralSPAdes pipeline for virus
                                  detection. Default: false

--spades_metaplasmid            : Runs metaplasmidSPAdes pipeline for plasmid
                                  detection in metagenomics datasets. Default:
                                  false

--spades_rnaviral               : This flag enables virus assembly module
                                  from RNA-Seq data. Default: false

--spades_iontorrent             : This flag is required for IonTorrent data.
                                  Default: false

--spades_only_assembler         : Runs only the SPAdes assembler module (
                                  without read error correction).Default:
                                  false

--spades_careful                : Tries to reduce the number of mismatches
                                  and short indels in the assembly. Default:
                                  false

--spades_cov_cutoff             : Coverage cutoff value (a positive float
                                  number). Default: false

--spades_k                      : List of k-mer sizes (must be odd and less
                                  than 128). Default: false

--spades_hmm                    : Directory with custom hmms that replace the
                                  default ones (very rare). Default: false

--serotypefinder_run            : Run SerotypeFinder tool. Default: true

--serotypefinder_x              : Generate extended output files. Default:
                                  true

--serotypefinder_db             : Path to SerotypeFinder databases. Default: /
                                  hpc/db/serotypefinder/2.0.2

--serotypefinder_min_threshold  : Minimum percent identity (in float)
                                  required for calling a hit. Default: 0.85

--serotypefinder_min_cov        : Minumum percent coverage (in float)
                                  required for calling a hit. Default: 0.80

--seqsero2_run                  : Run SeqSero2 tool. Default: false

--seqsero2_t                    : '1' for interleaved paired-end reads, '2'
                                  for separated paired-end reads, '3' for
                                  single reads, '4' for genome assembly, '5'
                                  for nanopore reads (fasta/fastq). Default:
                                  4

--seqsero2_m                    : Which workflow to apply, 'a'(raw reads
                                  allele micro-assembly), 'k'(raw reads and
                                  genome assembly k-mer). Default: k

--seqsero2_c                    : SeqSero2 will only output serotype
                                  prediction without the directory containing
                                  log files. Default: false

--seqsero2_s                    : SeqSero2 will not output header in
                                  SeqSero_result.tsv. Default: false

--mlst_run                      : Run MLST tool. Default: true

--mlst_minid                    : DNA %identity of full allelle to consider '
                                  similar' [~]. Default: 95

--mlst_mincov                   : DNA %cov to report partial allele at all [?].
                                  Default: 10

--mlst_minscore                 : Minumum score out of 100 to match a scheme.
                                  Default: 50

--abricate_run                  : Run ABRicate tool. Default: true

--abricate_minid                : Minimum DNA %identity. Defaut: 90

--abricate_mincov               : Minimum DNA %coverage. Defaut: 80

--abricate_datadir              : ABRicate databases folder. Defaut: /hpc/db/
                                  abricate/1.0.1/db

Help options                    :

--help                          : Display this message.
```
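
The help text describes how sample names are derived from file names (`--fq_filename_delim`, `--fq_filename_delim_idx`) and which five columns a `--metadata` CSV must contain. The sketch below illustrates both ideas outside of CPIPES. The function names are hypothetical, and several details are assumptions rather than documented behavior: "before this index" is read as "up to and including" the index, `single_end` is spelled `true`/`false`, and `fq2` is left empty for single-end samples.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of CPIPES): derive a sample name by keeping
# the first <idx> delimiter-separated fields of the FASTQ file name.
# Assumption: "before this index" means "up to and including" the index.
sample_name() {
    local file="$1" delim="${2:-_}" idx="${3:-1}"
    basename "$file" | cut -d "$delim" -f "1-$idx"
}

# Hypothetical helper: print a samplesheet with the five mandatory columns
# for every R1 file in a directory, pairing it with R2 when that file exists.
# Pass an absolute directory path so that fq1 and fq2 are absolute paths.
make_samplesheet() {
    local dir="$1"
    local r1_suffix="${2:-_R1_001.fastq.gz}"
    local r2_suffix="${3:-_R2_001.fastq.gz}"
    local fq1 fq2 single_end

    echo "sample,fq1,fq2,strandedness,single_end"
    for fq1 in "$dir"/*"$r1_suffix"; do
        [ -e "$fq1" ] || continue            # no R1 files: glob did not expand
        fq2="${fq1%"$r1_suffix"}$r2_suffix"  # swap the R1 suffix for the R2 suffix
        if [ -e "$fq2" ]; then
            single_end=false
        else
            fq2=""                           # assumption: empty fq2 for single-end
            single_end=true
        fi
        echo "$(sample_name "$fq1"),$fq1,$fq2,unstranded,$single_end"
    done
}
```

For example, `make_samplesheet /path/to/illumina/fastq/dir > samplesheet.csv` would write a file that could then be inspected and, if its values match what the pipeline expects, passed via `--metadata samplesheet.csv`.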

### PRE ALPHA

---
This modular structure and flow is still in rapid development and may change
depending on assessment of various computational topics and other considerations.
