comparison cfsan_cronology.xml @ 5:6e5ceea33843
"planemo upload"
author | kkonganti |
---|---|
date | Mon, 27 Nov 2023 14:50:43 -0500 |
parents | 7ac696717239 |
children | a3c1cba6f773 |
4:7ac696717239 | 5:6e5ceea33843 |
---|---|
73 </when> | 73 </when> |
74 </conditional> | 74 </conditional> |
75 <param name="refgenome" optional="true" value="GCF_003516125" type="text" | 75 <param name="refgenome" optional="true" value="GCF_003516125" type="text" |
76 label="NCBI reference genome accession" | 76 label="NCBI reference genome accession" |
77 help="Is the reference genome other than Cronobacter sakazakii? Reference genome FASTA is used as a model for gene prediction. DO NOT ENTER THE DECIMAL PART (Ex: GCF_003516125.1)." /> | 77 help="Is the reference genome other than Cronobacter sakazakii? Reference genome FASTA is used as a model for gene prediction. DO NOT ENTER THE DECIMAL PART (Ex: GCF_003516125.1)." /> |
78 <param name="tuspy_n" optional="true" value="10" type="integer" label="Enter the number of top unique hits to retain after initial MASH screen step" | 78 <param name="tuspy_n" optional="true" value="2" type="integer" label="Enter the number of top unique hits to retain after initial MASH screen step" |
79 help="These hits will be used to build a genome distance based tree for your experiment run. Default value of 2 is suitable for almost all scenarios."/> | 79 help="These hits will be used to build a genome distance based tree for your experiment run. Default value of 2 is suitable for almost all scenarios."/> |
80 <param name="fq_filename_delim" type="text" value="_" label="File name delimitor by which samples are grouped together (--fq_filename_delim)" | 80 <param name="fq_filename_delim" type="text" value="_" label="File name delimitor by which samples are grouped together (--fq_filename_delim)" |
81 help="This is the delimitor by which samples are grouped together to display in the final MultiQC report. For example, if your input data sets are mango_replicate1.fastq.gz, mango_replicate2.fastq.gz, orange_replicate1_maryland.fastq.gz, orange_replicate2_maryland.fastq.gz, then to create 2 samples mango and orange, the value for --fq_filename_delim would be _ (underscore) and the value for --fq_filename_delim_idx would be 1, since you want to group by the first word (i.e. mango or orange) after splitting the filename based on _ (underscore)."/> | 81 help="This is the delimitor by which samples are grouped together to display in the final MultiQC report. For example, if your input data sets are mango_replicate1.fastq.gz, mango_replicate2.fastq.gz, orange_replicate1_maryland.fastq.gz, orange_replicate2_maryland.fastq.gz, then to create 2 samples mango and orange, the value for --fq_filename_delim would be _ (underscore) and the value for --fq_filename_delim_idx would be 1, since you want to group by the first word (i.e. mango or orange) after splitting the filename based on _ (underscore)."/> |
82 <param name="fq_filename_delim_idx" type="integer" value="1" label="File name delimitor index (--fq_filename_delim_idx)" /> | 82 <param name="fq_filename_delim_idx" type="integer" value="1" label="File name delimitor index (--fq_filename_delim_idx)" /> |
83 </inputs> | 83 </inputs> |
113 | 113 |
114 .. class:: infomark | 114 .. class:: infomark |
115 | 115 |
116 **Purpose** | 116 **Purpose** |
117 | 117 |
118 cronology is an automated workflow to assign Salmonella serotype based on NCBI Pathogen Detection Project for Salmonella. | 118 cronology is an automated workflow for Cronobacter isolate assembly, |
119 It uses MASH to reduce the search space followed by additional genome filtering with sourmash. It then performs genome based | 119 sequencing typing and traceback. The workflow version 0.1.0 takes in single-end |
120 alignment with kma followed by count generation using salmon. This workflow can be used to analyze shotgun metagenomics | 120 or paired-end Illumina short read data, performs QC using fastp, assembly and polish using shovill and polypolish |
121 datasets, quasi-metagenomic datasets (enriched for Salmonella) and target enriched datasets (enriched with molecular baits specific for Salmonella) | 121 and whole genome distance based clustering using mashtree based on NCBI Pathogen Detection DB for Cronobacter. |
122 and is especially useful in a case where a sample is of multi-serovar mixture. | |
123 | 122 |
124 It is written in Nextflow and is part of the modular data analysis pipelines (CFSAN PIPELINES or CPIPES for short) at CFSAN. | 123 It is written in Nextflow and is part of the modular data analysis pipelines (CFSAN PIPELINES or CPIPES for short) at CFSAN. |
125 | 124 |
126 | 125 |
127 ---- | 126 ---- |
128 | 127 |
129 .. class:: infomark | 128 .. class:: infomark |
130 | 129 |
131 **Testing and Validation** | 130 **Testing and Validation** |
132 | 131 |
133 The CPIPES - cronology Nextflow pipeline has been wrapped to make it work in Galaxy. It takes in either paired or unpaired short reads list as an input | 132 The CPIPES - cronology Nextflow pipeline has been wrapped to make it work in Galaxy. |
134 and performs read quality control followed by de novo assembly, gene prediction and annotation, sequence typing and whole genome distance based clustering. | |
135 All the testing has been done on the command line on the CFSAN Raven2 HPC Cluster. | 133 All the testing has been done on the command line on the CFSAN Raven2 HPC Cluster. |
136 | 134 |
137 | 135 |
138 ---- | 136 ---- |
139 | 137 |
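
The help text for `fq_filename_delim` and `fq_filename_delim_idx` above describes grouping input files into samples by splitting each file name on a delimiter. The sketch below illustrates that grouping on the file names from the help text. It is not taken from the cronology code base: the function name `group_by_delim` is hypothetical, and it assumes that an index greater than 1 means "keep the first `idx` fields joined by the delimiter", which the help text does not specify.

```python
from collections import defaultdict


def group_by_delim(filenames, delim="_", idx=1):
    """Group file names by the first `idx` delimiter-separated fields.

    Assumption (not confirmed by the tool's help text): for idx > 1 the
    sample name is the first `idx` fields re-joined with the delimiter.
    """
    groups = defaultdict(list)
    for name in filenames:
        # Split on the delimiter and keep only the leading fields.
        sample = delim.join(name.split(delim)[:idx])
        groups[sample].append(name)
    return dict(groups)


# File names taken from the help text of the fq_filename_delim parameter.
files = [
    "mango_replicate1.fastq.gz",
    "mango_replicate2.fastq.gz",
    "orange_replicate1_maryland.fastq.gz",
    "orange_replicate2_maryland.fastq.gz",
]

for sample, members in group_by_delim(files, delim="_", idx=1).items():
    print(sample, members)
```

With the delimiter `_` and index 1, the four files fall into the two samples named in the help text, mango and orange.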