Mercurial > repos > kkonganti > hfp_nowayout
changeset 0:97cd2f532efe
planemo upload
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/LICENSE.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,98 @@ +# CPIPES (CFSAN PIPELINES) + +## The modular pipeline repository at CFSAN, FDA + +**CPIPES** (CFSAN PIPELINES) is a collection of modular pipelines based on **NEXTFLOW**, +mostly for bioinformatics data analysis at **CFSAN, FDA.** + +--- + +### **LICENSES** + +\ + + +**CPIPES** is licensed under: + +```text +MIT License + +In the U.S.A. Public Domain; elsewhere Copyright (c) 2022 U.S. Food and Drug Administration + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +``` + +\ + + +Portions of **CPIPES** are built on modified versions of many tools, scripts and libraries from [nf-core/modules](https://github.com/nf-core/modules) and [nf-core/rnaseq](https://github.com/nf-core/rnaseq), which are originally licensed under: + +```text +MIT License + +Copyright (c) Philip Ewels +Copyright (c) Phil Ewels, Rickard Hammarén + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +``` + +\ + + +The **MultiQC** report, in addition, uses [DataTables](https://datatables.net), which is licensed under: + +```text +MIT License + +Copyright (C) 2008-2022, SpryMedia Ltd.
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +```
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,48 @@ +# CPIPES (CFSAN PIPELINES) + +## The modular pipeline repository at CFSAN, FDA + +**CPIPES** (CFSAN PIPELINES) is a collection of modular pipelines based on **NEXTFLOW**, +mostly for bioinformatics data analysis at **CFSAN, FDA.** + +--- + +### **Pipelines** + +--- +**CPIPES**: + + 1. `centriflaken` : [README](./readme/centriflaken.md). + 2. `centriflaken_hy` : [README](./readme/centriflaken_hy.md). + +#### Workflow Usage + +The following is an example of how to run the `centriflaken` pipeline on the **CFSAN** raven cluster. + +```bash +module load cpipes/0.4.0 + +cpipes --pipeline centriflaken [options] +``` + +Example: + +```bash +cd /hpc/scratch/$USER +mkdir nf-cpipes +cd nf-cpipes +cpipes \ + --pipeline centriflaken \ + --input /path/to/fastq_pass_dir \ + --output /path/to/where/output/should/go \ + --user_email First.Last@fda.hhs.gov \ + -profile raven +``` + +The above command would run the pipeline and store the output wherever the workflow author configured it to go; the **NEXTFLOW** reports are always stored in the current working directory from which `cpipes` is run. For example, for the above command, a directory called `CPIPES-centriflaken` would hold all the **NEXTFLOW** +related logs, reports and trace files. + +### **BETA** + +--- +The development of the modular structure and flow is an ongoing effort and may change depending on the assessment of various computational topics and other considerations.
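As a minimal follow-up sketch of where to look after a run: the README above states that the **NEXTFLOW** logs, reports and trace files land in a `CPIPES-centriflaken` directory under the working directory from which `cpipes` was launched; the exact file names inside it are not specified in the README and will vary by **NEXTFLOW** version and configuration.

```bash
# Hedged example: after the run in the README example finishes, inspect the
# NEXTFLOW logs, reports and trace files created in the launch directory
# (/hpc/scratch/$USER/nf-cpipes in the example above).
cd /hpc/scratch/$USER/nf-cpipes
ls -lh CPIPES-centriflaken/
```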
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/assets/adaptors.fa Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,1194 @@ +>gnl|uv|NGB00360.1 Illumina PCR Primer +AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB00362.1 Illumina Paired End PCR Primer 2.0 +CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT +>gnl|uv|NGB00363.1 Illumina Multiplexing PCR Primer 2.0 +GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00364.1 Illumina Multiplexing PCR Primer Index 1 +CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTC +>gnl|uv|NGB00365.1 Illumina Multiplexing PCR Primer Index 2 +CAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTC +>gnl|uv|NGB00366.1 Illumina Multiplexing PCR Primer Index 3 +CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTC +>gnl|uv|NGB00367.1 Illumina Multiplexing PCR Primer Index 4 +CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTC +>gnl|uv|NGB00368.1 Illumina Multiplexing PCR Primer Index 5 +CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTC +>gnl|uv|NGB00369.1 Illumina Multiplexing PCR Primer Index 6 +CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTC +>gnl|uv|NGB00370.1 Illumina Multiplexing PCR Primer Index 7 +CAAGCAGAAGACGGCATACGAGATGATCTGGTGACTGGAGTTC +>gnl|uv|NGB00371.1 Illumina Multiplexing PCR Primer Index 8 +CAAGCAGAAGACGGCATACGAGATTCAAGTGTGACTGGAGTTC +>gnl|uv|NGB00372.1 Illumina Multiplexing PCR Primer Index 9 +CAAGCAGAAGACGGCATACGAGATCTGATCGTGACTGGAGTTC +>gnl|uv|NGB00373.1 Illumina Multiplexing PCR Primer Index 10 +CAAGCAGAAGACGGCATACGAGATAAGCTAGTGACTGGAGTTC +>gnl|uv|NGB00374.1 Illumina Multiplexing PCR Primer Index 11 +CAAGCAGAAGACGGCATACGAGATGTAGCCGTGACTGGAGTTC +>gnl|uv|NGB00375.1 Illumina Multiplexing PCR Primer Index 12 +CAAGCAGAAGACGGCATACGAGATTACAAGGTGACTGGAGTTC +>gnl|uv|NGB00376.1 Illumina Gex PCR Primer 2 +AATGATACGGCGACCACCGACAGGTTCAGAGTTCTACAGTCCGA +>gnl|uv|NGB00377.1 Illumina DpnII Gex Sequencing Primer +CGACAGGTTCAGAGTTCTACAGTCCGACGATC +>gnl|uv|NGB00378.1 Illumina NlaIII Gex Sequencing Primer +CCGACAGGTTCAGAGTTCTACAGTCCGACATG +>gnl|uv|NGB00379.1 Illumina 3' RNA Adapter +TCGTATGCCGTCTTCTGCTTGTT +>gnl|uv|NGB00380.1 Illumina Small RNA 3' Adapter +AATCTCGTATGCCGTCTTCTGCTTGC +>gnl|uv|NGB00385.1 454 FLX linker +GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC +>gnl|uv|NGB00414.1 454 Life Sciences GS FLX Titanium Primer A-key +CGTATCGCCTCCCTCGCGCCATCAG +>gnl|uv|NGB00415.1 454 Life Sciences GS FLX Titanium Primer B-key +CTATGCGCCTTGCCAGCCCGCTCAG +>gnl|uv|NGB00416.1 454 Life Sciences GS FLX Titanium MID Adaptor B +CCTATCCCCTGTGTGCCTTGGCAGTCTCAG +>gnl|uv|NGB00417.1 454 Life Sciences GS FLX Titanium MID-1 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACGAGTGCGT +>gnl|uv|NGB00418.1 454 Life Sciences GS FLX Titanium MID-2 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACGCTCGACA +>gnl|uv|NGB00419.1 454 Life Sciences GS FLX Titanium MID-3 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGACGCACTC +>gnl|uv|NGB00420.1 454 Life Sciences GS FLX Titanium MID-4 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCACTGTAG +>gnl|uv|NGB00421.1 454 Life Sciences GS FLX Titanium MID-5 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATCAGACACG +>gnl|uv|NGB00422.1 454 Life Sciences GS FLX Titanium MID-6 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATATCGCGAG +>gnl|uv|NGB00423.1 454 Life Sciences GS FLX Titanium MID-7 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTGTCTCTA +>gnl|uv|NGB00424.1 454 Life Sciences GS FLX Titanium MID-8 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTCGCGTGTC +>gnl|uv|NGB00425.1 454 Life Sciences GS FLX Titanium MID-10 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTCTATGCG +>gnl|uv|NGB00426.1 454 Life Sciences GS 
FLX Titanium MID-11 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGATACGTCT +>gnl|uv|NGB00427.1 454 Life Sciences GS FLX Titanium MID-13 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCATAGTAGTG +>gnl|uv|NGB00428.1 454 Life Sciences GS FLX Titanium MID-14 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAGAGATAC +>gnl|uv|NGB00429.1 454 Life Sciences GS FLX Titanium MID-15 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATACGACGTA +>gnl|uv|NGB00430.1 454 Life Sciences GS FLX Titanium MID-16 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCACGTACTA +>gnl|uv|NGB00431.1 454 Life Sciences GS FLX Titanium MID-17 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTCTAGTAC +>gnl|uv|NGB00432.1 454 Life Sciences GS FLX Titanium MID-18 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTACGTAGC +>gnl|uv|NGB00433.1 454 Life Sciences GS FLX Titanium MID-19 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTACTACTC +>gnl|uv|NGB00434.1 454 Life Sciences GS FLX Titanium MID-20 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACGACTACAG +>gnl|uv|NGB00435.1 454 Life Sciences GS FLX Titanium MID-21 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTAGACTAG +>gnl|uv|NGB00436.1 454 Life Sciences GS FLX Titanium MID-22 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTACGAGTATG +>gnl|uv|NGB00437.1 454 Life Sciences GS FLX Titanium MID-23 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTACTCTCGTG +>gnl|uv|NGB00438.1 454 Life Sciences GS FLX Titanium MID-24 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGAGACGAG +>gnl|uv|NGB00439.1 454 Life Sciences GS FLX Titanium MID-25 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGTCGCTCG +>gnl|uv|NGB00440.1 454 Life Sciences GS FLX Titanium MID-26 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACATACGCGT +>gnl|uv|NGB00441.1 454 Life Sciences GS FLX Titanium MID-27 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACGCGAGTAT +>gnl|uv|NGB00442.1 454 Life Sciences GS FLX Titanium MID-28 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACTACTATGT +>gnl|uv|NGB00443.1 454 Life Sciences GS FLX Titanium MID-29 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACTGTACAGT +>gnl|uv|NGB00444.1 454 Life Sciences GS FLX Titanium MID-30 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGACTATACT +>gnl|uv|NGB00445.1 454 Life Sciences GS FLX Titanium MID-31 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCGTCGTCT +>gnl|uv|NGB00446.1 454 Life Sciences GS FLX Titanium MID-32 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTACGCTAT +>gnl|uv|NGB00447.1 454 Life Sciences GS FLX Titanium MID-33 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATAGAGTACT +>gnl|uv|NGB00448.1 454 Life Sciences GS FLX Titanium MID-34 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCACGCTACGT +>gnl|uv|NGB00449.1 454 Life Sciences GS FLX Titanium MID-35 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGTAGACGT +>gnl|uv|NGB00450.1 454 Life Sciences GS FLX Titanium MID-36 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGACGTGACT +>gnl|uv|NGB00451.1 454 Life Sciences GS FLX Titanium MID-37 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTACACACACT +>gnl|uv|NGB00452.1 454 Life Sciences GS FLX Titanium MID-38 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTACACGTGAT +>gnl|uv|NGB00453.1 454 Life Sciences GS FLX Titanium MID-39 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTACAGATCGT +>gnl|uv|NGB00454.1 454 Life Sciences GS FLX Titanium MID-40 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTACGCTGTCT +>gnl|uv|NGB00455.1 454 Life Sciences GS FLX Titanium MID-41 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGTGTAGAT +>gnl|uv|NGB00456.1 454 Life Sciences GS FLX Titanium MID-42 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGATCACGT +>gnl|uv|NGB00457.1 454 Life Sciences GS FLX Titanium MID-43 Adaptor A 
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGCACTAGT +>gnl|uv|NGB00458.1 454 Life Sciences GS FLX Titanium MID-44 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTAGCGACT +>gnl|uv|NGB00459.1 454 Life Sciences GS FLX Titanium MID-45 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTATACTAT +>gnl|uv|NGB00460.1 454 Life Sciences GS FLX Titanium MID-46 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGACGTATGT +>gnl|uv|NGB00461.1 454 Life Sciences GS FLX Titanium MID-47 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTGAGTAGT +>gnl|uv|NGB00462.1 454 Life Sciences GS FLX Titanium MID-48 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACAGTATATA +>gnl|uv|NGB00463.1 454 Life Sciences GS FLX Titanium MID-49 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACGCGATCGA +>gnl|uv|NGB00464.1 454 Life Sciences GS FLX Titanium MID-50 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACTAGCAGTA +>gnl|uv|NGB00465.1 454 Life Sciences GS FLX Titanium MID-51 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCTCACGTA +>gnl|uv|NGB00466.1 454 Life Sciences GS FLX Titanium MID-52 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTATACATA +>gnl|uv|NGB00467.1 454 Life Sciences GS FLX Titanium MID-53 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTCGAGAGA +>gnl|uv|NGB00468.1 454 Life Sciences GS FLX Titanium MID-54 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTGCTACGA +>gnl|uv|NGB00469.1 454 Life Sciences GS FLX Titanium MID-55 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGATCGTATA +>gnl|uv|NGB00470.1 454 Life Sciences GS FLX Titanium MID-56 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGCAGTACGA +>gnl|uv|NGB00471.1 454 Life Sciences GS FLX Titanium MID-57 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGCGTATACA +>gnl|uv|NGB00472.1 454 Life Sciences GS FLX Titanium MID-58 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTACAGTCA +>gnl|uv|NGB00473.1 454 Life Sciences GS FLX Titanium MID-59 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTACTCAGA +>gnl|uv|NGB00474.1 454 Life Sciences GS FLX Titanium MID-60 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTACGCTCTA +>gnl|uv|NGB00475.1 454 Life Sciences GS FLX Titanium MID-61 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTATAGCGTA +>gnl|uv|NGB00476.1 454 Life Sciences GS FLX Titanium MID-62 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTACGTCATCA +>gnl|uv|NGB00477.1 454 Life Sciences GS FLX Titanium MID-63 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGTCGCATA +>gnl|uv|NGB00478.1 454 Life Sciences GS FLX Titanium MID-64 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTATATATACA +>gnl|uv|NGB00479.1 454 Life Sciences GS FLX Titanium MID-65 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTATGCTAGTA +>gnl|uv|NGB00480.1 454 Life Sciences GS FLX Titanium MID-66 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCACGCGAGA +>gnl|uv|NGB00481.1 454 Life Sciences GS FLX Titanium MID-67 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGATAGTGA +>gnl|uv|NGB00482.1 454 Life Sciences GS FLX Titanium MID-68 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGCTGCGTA +>gnl|uv|NGB00483.1 454 Life Sciences GS FLX Titanium MID-69 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGACGTCA +>gnl|uv|NGB00484.1 454 Life Sciences GS FLX Titanium MID-70 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGAGTCAGTA +>gnl|uv|NGB00485.1 454 Life Sciences GS FLX Titanium MID-71 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTAGTGTGA +>gnl|uv|NGB00486.1 454 Life Sciences GS FLX Titanium MID-72 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTCACACGA +>gnl|uv|NGB00487.1 454 Life Sciences GS FLX Titanium MID-73 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTCGTCGCA +>gnl|uv|NGB00488.1 454 Life Sciences GS FLX Titanium MID-74 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACACATACGC 
+>gnl|uv|NGB00489.1 454 Life Sciences GS FLX Titanium MID-75 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACAGTCGTGC +>gnl|uv|NGB00490.1 454 Life Sciences GS FLX Titanium MID-76 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACATGACGAC +>gnl|uv|NGB00491.1 454 Life Sciences GS FLX Titanium MID-77 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACGACAGCTC +>gnl|uv|NGB00492.1 454 Life Sciences GS FLX Titanium MID-78 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACGTCTCATC +>gnl|uv|NGB00493.1 454 Life Sciences GS FLX Titanium MID-79 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACTCATCTAC +>gnl|uv|NGB00494.1 454 Life Sciences GS FLX Titanium MID-80 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACTCGCGCAC +>gnl|uv|NGB00495.1 454 Life Sciences GS FLX Titanium MID-81 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGAGCGTCAC +>gnl|uv|NGB00496.1 454 Life Sciences GS FLX Titanium MID-82 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCGACTAGC +>gnl|uv|NGB00497.1 454 Life Sciences GS FLX Titanium MID-83 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTAGTGATC +>gnl|uv|NGB00498.1 454 Life Sciences GS FLX Titanium MID-84 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTGACACAC +>gnl|uv|NGB00499.1 454 Life Sciences GS FLX Titanium MID-85 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTGTATGTC +>gnl|uv|NGB00500.1 454 Life Sciences GS FLX Titanium MID-86 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATAGATAGAC +>gnl|uv|NGB00501.1 454 Life Sciences GS FLX Titanium MID-87 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATATAGTCGC +>gnl|uv|NGB00502.1 454 Life Sciences GS FLX Titanium MID-88 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATCTACTGAC +>gnl|uv|NGB00503.1 454 Life Sciences GS FLX Titanium MID-89 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCACGTAGATC +>gnl|uv|NGB00504.1 454 Life Sciences GS FLX Titanium MID-90 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCACGTGTCGC +>gnl|uv|NGB00505.1 454 Life Sciences GS FLX Titanium MID-91 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCATACTCTAC +>gnl|uv|NGB00506.1 454 Life Sciences GS FLX Titanium MID-92 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGACACTATC +>gnl|uv|NGB00507.1 454 Life Sciences GS FLX Titanium MID-93 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAGACGCGC +>gnl|uv|NGB00508.1 454 Life Sciences GS FLX Titanium MID-94 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTATGCGAC +>gnl|uv|NGB00509.1 454 Life Sciences GS FLX Titanium MID-95 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTCGATCTC +>gnl|uv|NGB00510.1 454 Life Sciences GS FLX Titanium MID-96 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTACGACTGC +>gnl|uv|NGB00511.1 454 Life Sciences GS FLX Titanium MID-97 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAGTCACTC +>gnl|uv|NGB00512.1 454 Life Sciences GS FLX Titanium MID-98 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTCTACGCTC +>gnl|uv|NGB00513.1 454 Life Sciences GS FLX Titanium MID-99 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGTACATAC +>gnl|uv|NGB00514.1 454 Life Sciences GS FLX Titanium MID-100 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGACTGCAC +>gnl|uv|NGB00515.1 454 Life Sciences GS FLX Titanium MID-101 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGCGCGCGC +>gnl|uv|NGB00516.1 454 Life Sciences GS FLX Titanium MID-102 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGCTCTATC +>gnl|uv|NGB00517.1 454 Life Sciences GS FLX Titanium MID-103 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTATAGACATC +>gnl|uv|NGB00518.1 454 Life Sciences GS FLX Titanium MID-104 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTATGATACGC +>gnl|uv|NGB00519.1 454 Life Sciences GS FLX Titanium MID-105 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCACTCATAC +>gnl|uv|NGB00520.1 454 Life Sciences GS FLX 
Titanium MID-106 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCATCGAGTC +>gnl|uv|NGB00521.1 454 Life Sciences GS FLX Titanium MID-107 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGAGCTCTC +>gnl|uv|NGB00522.1 454 Life Sciences GS FLX Titanium MID-108 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGCAGACAC +>gnl|uv|NGB00523.1 454 Life Sciences GS FLX Titanium MID-109 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGTCTCGC +>gnl|uv|NGB00524.1 454 Life Sciences GS FLX Titanium MID-110 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGAGTGACGC +>gnl|uv|NGB00525.1 454 Life Sciences GS FLX Titanium MID-111 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGATGTGTAC +>gnl|uv|NGB00526.1 454 Life Sciences GS FLX Titanium MID-112 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGCTATAGAC +>gnl|uv|NGB00527.1 454 Life Sciences GS FLX Titanium MID-113 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGCTCGCTAC +>gnl|uv|NGB00528.1 454 Life Sciences GS FLX Titanium MID-114 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACGTGCAGCG +>gnl|uv|NGB00529.1 454 Life Sciences GS FLX Titanium MID-115 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGACTCACAGAG +>gnl|uv|NGB00530.1 454 Life Sciences GS FLX Titanium MID-116 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGACTCAGCG +>gnl|uv|NGB00531.1 454 Life Sciences GS FLX Titanium MID-117 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGAGAGTGTG +>gnl|uv|NGB00532.1 454 Life Sciences GS FLX Titanium MID-118 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCTATCGCG +>gnl|uv|NGB00533.1 454 Life Sciences GS FLX Titanium MID-119 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTCTGACTG +>gnl|uv|NGB00534.1 454 Life Sciences GS FLX Titanium MID-120 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTGAGCTCG +>gnl|uv|NGB00535.1 454 Life Sciences GS FLX Titanium MID-121 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATAGCTCTCG +>gnl|uv|NGB00536.1 454 Life Sciences GS FLX Titanium MID-122 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATCACGTGCG +>gnl|uv|NGB00537.1 454 Life Sciences GS FLX Titanium MID-123 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATCGTAGCAG +>gnl|uv|NGB00538.1 454 Life Sciences GS FLX Titanium MID-124 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATCGTCTGTG +>gnl|uv|NGB00539.1 454 Life Sciences GS FLX Titanium MID-125 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATGTACGATG +>gnl|uv|NGB00540.1 454 Life Sciences GS FLX Titanium MID-126 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGATGTGTCTAG +>gnl|uv|NGB00541.1 454 Life Sciences GS FLX Titanium MID-127 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCACACGATAG +>gnl|uv|NGB00542.1 454 Life Sciences GS FLX Titanium MID-128 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCACTCGCACG +>gnl|uv|NGB00543.1 454 Life Sciences GS FLX Titanium MID-129 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGACGTCTG +>gnl|uv|NGB00544.1 454 Life Sciences GS FLX Titanium MID-130 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGTACTGCG +>gnl|uv|NGB00545.1 454 Life Sciences GS FLX Titanium MID-131 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGACAGCGAG +>gnl|uv|NGB00546.1 454 Life Sciences GS FLX Titanium MID-132 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGATCTGTCG +>gnl|uv|NGB00547.1 454 Life Sciences GS FLX Titanium MID-133 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGCGTGCTAG +>gnl|uv|NGB00548.1 454 Life Sciences GS FLX Titanium MID-134 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGCTCGAGTG +>gnl|uv|NGB00549.1 454 Life Sciences GS FLX Titanium MID-135 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTGATGACG +>gnl|uv|NGB00550.1 454 Life Sciences GS FLX Titanium MID-136 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTATGTACAG +>gnl|uv|NGB00551.1 454 Life Sciences GS FLX Titanium MID-137 
Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTCGATATAG +>gnl|uv|NGB00552.1 454 Life Sciences GS FLX Titanium MID-138 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTCGCACGCG +>gnl|uv|NGB00553.1 454 Life Sciences GS FLX Titanium MID-139 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGCGTCACG +>gnl|uv|NGB00554.1 454 Life Sciences GS FLX Titanium MID-140 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGTGCGTCG +>gnl|uv|NGB00555.1 454 Life Sciences GS FLX Titanium MID-141 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGCATACTG +>gnl|uv|NGB00556.1 454 Life Sciences GS FLX Titanium MID-142 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTATACATGTG +>gnl|uv|NGB00557.1 454 Life Sciences GS FLX Titanium MID-143 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTATCACTCAG +>gnl|uv|NGB00558.1 454 Life Sciences GS FLX Titanium MID-144 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTATCTGATAG +>gnl|uv|NGB00559.1 454 Life Sciences GS FLX Titanium MID-145 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGTGACATG +>gnl|uv|NGB00560.1 454 Life Sciences GS FLX Titanium MID-146 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGATCGAG +>gnl|uv|NGB00561.1 454 Life Sciences GS FLX Titanium MID-147 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGACATCTCG +>gnl|uv|NGB00562.1 454 Life Sciences GS FLX Titanium MID-148 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGAGCTAGAG +>gnl|uv|NGB00563.1 454 Life Sciences GS FLX Titanium MID-149 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGATAGAGCG +>gnl|uv|NGB00564.1 454 Life Sciences GS FLX Titanium MID-150 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGCGTGTGCG +>gnl|uv|NGB00565.1 454 Life Sciences GS FLX Titanium MID-151 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGCTAGTCAG +>gnl|uv|NGB00566.1 454 Life Sciences GS FLX Titanium MID-152 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTATCACAG +>gnl|uv|NGB00567.1 454 Life Sciences GS FLX Titanium MID-153 Adaptor A +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTGCGCGTG +>gnl|uv|NGB00568.1 454 GS FLX Titanium Rapid Library Adaptor A universal segment +CCATCTCATCCCTGCGTGTCTCCGACGACT +>gnl|uv|NGB00569.1 454 GS FLX Titanium Rapid Library Adaptor B universal segment +NGTCGNCGTCTCTCAAGGCACACAGGGGATAGG +>gnl|uv|NGB00099.1 CLONTECH GenomeWalker Adaptor +GTAATACGACTCACTATAGGGCACGCGTGGTCGACGGCCCGGGCTGGT +>gnl|uv|NGB00361.2 Illumina PCR Primer (Oligonucleotide sequence copyright 2007-2009 Illumina, Inc. All rights reserved.) 
+CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT +>gnl|uv|NGB00623.1 ABI SOLiD P1 Adaptor +AACCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT +>gnl|uv|NGB00624.1 ABI SOLiD P2 Adaptor +AGAGAATGAGGAACCCGGGGCAGTT +>gnl|uv|NGB00625.1 ABI SOLiD P2-T Adaptor +AGAGAATGAGGAACCCGGGGCAGCC +>gnl|uv|NGB00626.1 ABI SOLiD Internal Adaptor +CTGCTGTACCGTACATCCGCCTTGGCCGTACAGCAG +>gnl|uv|NGB00627.1 ABI SOLiD P1-T Adaptor +GGCCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT +>gnl|uv|NGB00628.1 ABI SOLiD Barcode Adaptor T-001 +CTGCCCCGGGTTCCTCATTCTCTGTGTAAGAGGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00629.1 ABI SOLiD Barcode Adaptor T-002 +CTGCCCCGGGTTCCTCATTCTCTAGGGAGTGGTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00630.1 ABI SOLiD Barcode Adaptor T-003 +CTGCCCCGGGTTCCTCATTCTCTATAGGTTATACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00631.1 ABI SOLiD Barcode Adaptor T-004 +CTGCCCCGGGTTCCTCATTCTCTGGATGCGGTCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00632.1 ABI SOLiD Barcode Adaptor T-005 +CTGCCCCGGGTTCCTCATTCTCTGTGGTGTAAGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00633.1 ABI SOLiD Barcode Adaptor T-006 +CTGCCCCGGGTTCCTCATTCTCTGCGAGGGACACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00634.1 ABI SOLiD Barcode Adaptor T-007 +CTGCCCCGGGTTCCTCATTCTCTGGGTTATGCCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00635.1 ABI SOLiD Barcode Adaptor T-008 +CTGCCCCGGGTTCCTCATTCTCTGAGCGAGGATCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00636.1 ABI SOLiD Barcode Adaptor T-009 +CTGCCCCGGGTTCCTCATTCTCTAGGTTGCGACCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00637.1 ABI SOLiD Barcode Adaptor T-010 +CTGCCCCGGGTTCCTCATTCTCTGCGGTAAGCTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00638.1 ABI SOLiD Barcode Adaptor T-011 +CTGCCCCGGGTTCCTCATTCTCTGTGCGACACGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00639.1 ABI SOLiD Barcode Adaptor T-012 +CTGCCCCGGGTTCCTCATTCTCTAAGAGGAAAACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00640.1 ABI SOLiD Barcode Adaptor T-013 +CTGCCCCGGGTTCCTCATTCTCTGCGGTAAGGCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00641.1 ABI SOLiD Barcode Adaptor T-014 +CTGCCCCGGGTTCCTCATTCTCTGTGCGGCAGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00642.1 ABI SOLiD Barcode Adaptor T-015 +CTGCCCCGGGTTCCTCATTCTCTGAGTTGAATGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00643.1 ABI SOLiD Barcode Adaptor T-016 +CTGCCCCGGGTTCCTCATTCTCTGGGAGACGTTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00644.1 ABI SOLiD Barcode Adaptor T-017 +CTGCCCCGGGTTCCTCATTCTCTGGCTCACCGCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00645.1 ABI SOLiD Barcode Adaptor T-018 +CTGCCCCGGGTTCCTCATTCTCTAGGCGGATGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00646.1 ABI SOLiD Barcode Adaptor T-019 +CTGCCCCGGGTTCCTCATTCTCTATGGTAACTGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00647.1 ABI SOLiD Barcode Adaptor T-020 +CTGCCCCGGGTTCCTCATTCTCTGTCAAGCTTTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00648.1 ABI SOLiD Barcode Adaptor T-021 +CTGCCCCGGGTTCCTCATTCTCTGTGCGGTTCCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00649.1 ABI SOLiD Barcode Adaptor T-022 +CTGCCCCGGGTTCCTCATTCTCTGAGAAGATGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00650.1 ABI SOLiD Barcode Adaptor T-023 +CTGCCCCGGGTTCCTCATTCTCTGCGGTGCTTGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00651.1 ABI SOLiD Barcode Adaptor T-024 +CTGCCCCGGGTTCCTCATTCTCTGGGTCGGTATCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00652.1 ABI SOLiD Barcode Adaptor T-025 +CTGCCCCGGGTTCCTCATTCTCTAACATGATGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00653.1 ABI SOLiD Barcode Adaptor T-026 +CTGCCCCGGGTTCCTCATTCTCTCGGGAGCCCGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00654.1 ABI SOLiD Barcode Adaptor T-027 +CTGCCCCGGGTTCCTCATTCTCTCAGCAAACTTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00655.1 ABI SOLiD Barcode Adaptor T-028 +CTGCCCCGGGTTCCTCATTCTCTAGCTTACTACCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00656.1 ABI SOLiD Barcode Adaptor T-029 +CTGCCCCGGGTTCCTCATTCTCTGAATCTAGGGCTGCTGTACGGCCAAGGCGT 
+>gnl|uv|NGB00657.1 ABI SOLiD Barcode Adaptor T-030 +CTGCCCCGGGTTCCTCATTCTCTGTAGCGAAGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00658.1 ABI SOLiD Barcode Adaptor T-031 +CTGCCCCGGGTTCCTCATTCTCTGCTGGTGCGTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00659.1 ABI SOLiD Barcode Adaptor T-032 +CTGCCCCGGGTTCCTCATTCTCTGGTTGGGTGCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00660.1 ABI SOLiD Barcode Adaptor T-033 +CTGCCCCGGGTTCCTCATTCTCTCGTTGGATACCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00661.1 ABI SOLiD Barcode Adaptor T-034 +CTGCCCCGGGTTCCTCATTCTCTTCGTTAAAGGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00662.1 ABI SOLiD Barcode Adaptor T-035 +CTGCCCCGGGTTCCTCATTCTCTAAGCGTAGGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00663.1 ABI SOLiD Barcode Adaptor T-036 +CTGCCCCGGGTTCCTCATTCTCTGTTCTCACATCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00664.1 ABI SOLiD Barcode Adaptor T-037 +CTGCCCCGGGTTCCTCATTCTCTCTGTTATACCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00665.1 ABI SOLiD Barcode Adaptor T-038 +CTGCCCCGGGTTCCTCATTCTCTGTCGTCTTAGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00666.1 ABI SOLiD Barcode Adaptor T-039 +CTGCCCCGGGTTCCTCATTCTCTTATCGTGAGTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00667.1 ABI SOLiD Barcode Adaptor T-040 +CTGCCCCGGGTTCCTCATTCTCTAAAAGGGTTACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00668.1 ABI SOLiD Barcode Adaptor T-041 +CTGCCCCGGGTTCCTCATTCTCTTGTGGGATTGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00669.1 ABI SOLiD Barcode Adaptor T-042 +CTGCCCCGGGTTCCTCATTCTCTGAATGTACTACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00670.1 ABI SOLiD Barcode Adaptor T-043 +CTGCCCCGGGTTCCTCATTCTCTCGCTAGGGTTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00671.1 ABI SOLiD Barcode Adaptor T-044 +CTGCCCCGGGTTCCTCATTCTCTAAGGATGATCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00672.1 ABI SOLiD Barcode Adaptor T-045 +CTGCCCCGGGTTCCTCATTCTCTGTACTTGGCTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00673.1 ABI SOLiD Barcode Adaptor T-046 +CTGCCCCGGGTTCCTCATTCTCTGGTCGTCGAACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00674.1 ABI SOLiD Barcode Adaptor T-047 +CTGCCCCGGGTTCCTCATTCTCTGAGGGATGGCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00675.1 ABI SOLiD Barcode Adaptor T-048 +CTGCCCCGGGTTCCTCATTCTCTGCCGTAAGTGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00676.1 ABI SOLiD Barcode Adaptor T-049 +CTGCCCCGGGTTCCTCATTCTCTATGTCATAAGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00677.1 ABI SOLiD Barcode Adaptor T-050 +CTGCCCCGGGTTCCTCATTCTCTGAAGGCTTGCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00678.1 ABI SOLiD Barcode Adaptor T-051 +CTGCCCCGGGTTCCTCATTCTCTAAGCAGGAGTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00679.1 ABI SOLiD Barcode Adaptor T-052 +CTGCCCCGGGTTCCTCATTCTCTGTAATTGTAACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00680.1 ABI SOLiD Barcode Adaptor T-053 +CTGCCCCGGGTTCCTCATTCTCTGTCATCAAGTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00681.1 ABI SOLiD Barcode Adaptor T-054 +CTGCCCCGGGTTCCTCATTCTCTAAAAGGCGGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00682.1 ABI SOLiD Barcode Adaptor T-055 +CTGCCCCGGGTTCCTCATTCTCTAGCTTAAGCGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00683.1 ABI SOLiD Barcode Adaptor T-056 +CTGCCCCGGGTTCCTCATTCTCTGCATGTCACCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00684.1 ABI SOLiD Barcode Adaptor T-057 +CTGCCCCGGGTTCCTCATTCTCTCTAGTAAGAACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00685.1 ABI SOLiD Barcode Adaptor T-058 +CTGCCCCGGGTTCCTCATTCTCTTAAAGTGGCGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00686.1 ABI SOLiD Barcode Adaptor T-059 +CTGCCCCGGGTTCCTCATTCTCTAAGTAATGTCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00687.1 ABI SOLiD Barcode Adaptor T-060 +CTGCCCCGGGTTCCTCATTCTCTGTGCCTCGGTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00688.1 ABI SOLiD Barcode Adaptor T-061 +CTGCCCCGGGTTCCTCATTCTCTAAGATTATCGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00689.1 ABI SOLiD Barcode Adaptor T-062 +CTGCCCCGGGTTCCTCATTCTCTAGGTGAGGGTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00690.1 ABI 
SOLiD Barcode Adaptor T-063 +CTGCCCCGGGTTCCTCATTCTCTGCGGGTTCGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00691.1 ABI SOLiD Barcode Adaptor T-064 +CTGCCCCGGGTTCCTCATTCTCTGTGCTACACCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00692.1 ABI SOLiD Barcode Adaptor T-065 +CTGCCCCGGGTTCCTCATTCTCTGGGATCAAGCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00693.1 ABI SOLiD Barcode Adaptor T-066 +CTGCCCCGGGTTCCTCATTCTCTGATGTAATGTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00694.1 ABI SOLiD Barcode Adaptor T-067 +CTGCCCCGGGTTCCTCATTCTCTGTCCTTAGGGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00695.1 ABI SOLiD Barcode Adaptor T-068 +CTGCCCCGGGTTCCTCATTCTCTGCATTGACGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00696.1 ABI SOLiD Barcode Adaptor T-069 +CTGCCCCGGGTTCCTCATTCTCTGATATGCTTTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00697.1 ABI SOLiD Barcode Adaptor T-070 +CTGCCCCGGGTTCCTCATTCTCTGCCCTACAGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00698.1 ABI SOLiD Barcode Adaptor T-071 +CTGCCCCGGGTTCCTCATTCTCTACAGGGAACGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00699.1 ABI SOLiD Barcode Adaptor T-072 +CTGCCCCGGGTTCCTCATTCTCTAAGTGAATACCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00700.1 ABI SOLiD Barcode Adaptor T-073 +CTGCCCCGGGTTCCTCATTCTCTGCAATGACGTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00701.1 ABI SOLiD Barcode Adaptor T-074 +CTGCCCCGGGTTCCTCATTCTCTAGGACGCTGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00702.1 ABI SOLiD Barcode Adaptor T-075 +CTGCCCCGGGTTCCTCATTCTCTGTATCTGGGCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00703.1 ABI SOLiD Barcode Adaptor T-076 +CTGCCCCGGGTTCCTCATTCTCTAAGTTTTAGGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00704.1 ABI SOLiD Barcode Adaptor T-077 +CTGCCCCGGGTTCCTCATTCTCTATCTGGTCTTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00705.1 ABI SOLiD Barcode Adaptor T-078 +CTGCCCCGGGTTCCTCATTCTCTGGCAATCATCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00706.1 ABI SOLiD Barcode Adaptor T-079 +CTGCCCCGGGTTCCTCATTCTCTAGTAGAATTACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00707.1 ABI SOLiD Barcode Adaptor T-080 +CTGCCCCGGGTTCCTCATTCTCTGTTTACGGTGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00708.1 ABI SOLiD Barcode Adaptor T-081 +CTGCCCCGGGTTCCTCATTCTCTGAACGTCATTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00709.1 ABI SOLiD Barcode Adaptor T-082 +CTGCCCCGGGTTCCTCATTCTCTGTGAAGGGAGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00710.1 ABI SOLiD Barcode Adaptor T-083 +CTGCCCCGGGTTCCTCATTCTCTGGATGGCGTACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00711.1 ABI SOLiD Barcode Adaptor T-084 +CTGCCCCGGGTTCCTCATTCTCTGCGGATGAACCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00712.1 ABI SOLiD Barcode Adaptor T-085 +CTGCCCCGGGTTCCTCATTCTCTGGAAAGCGTTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00713.1 ABI SOLiD Barcode Adaptor T-086 +CTGCCCCGGGTTCCTCATTCTCTAGTACCAGGACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00714.1 ABI SOLiD Barcode Adaptor T-087 +CTGCCCCGGGTTCCTCATTCTCTATAGCAAAGCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00715.1 ABI SOLiD Barcode Adaptor T-088 +CTGCCCCGGGTTCCTCATTCTCTGTTGATCATGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00716.1 ABI SOLiD Barcode Adaptor T-089 +CTGCCCCGGGTTCCTCATTCTCTAGGCTGTCTACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00717.1 ABI SOLiD Barcode Adaptor T-090 +CTGCCCCGGGTTCCTCATTCTCTGTGACCTACTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00718.1 ABI SOLiD Barcode Adaptor T-091 +CTGCCCCGGGTTCCTCATTCTCTGCGTATTGGGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00719.1 ABI SOLiD Barcode Adaptor T-092 +CTGCCCCGGGTTCCTCATTCTCTAAGGGATTACCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00720.1 ABI SOLiD Barcode Adaptor T-093 +CTGCCCCGGGTTCCTCATTCTCTGTTACGATGCCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00721.1 ABI SOLiD Barcode Adaptor T-094 +CTGCCCCGGGTTCCTCATTCTCTATGGGTGTTTCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00722.1 ABI SOLiD Barcode Adaptor T-095 +CTGCCCCGGGTTCCTCATTCTCTGAGTCCGGCACTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00723.1 ABI SOLiD Barcode Adaptor 
T-096 +CTGCCCCGGGTTCCTCATTCTCTAATCGAAGAGCTGCTGTACGGCCAAGGCGT +>gnl|uv|NGB00724.1 ABI SOLiD Barcode Adaptor A +GCTGTACGGCCAAGGCGCAGCAGCATG +>gnl|uv|NGB00727.1 Illumina Nextera PCR primer i5 index N501 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACTAGATCGCTCGTCGGCAGCGTC +>gnl|uv|NGB00728.1 Illumina Nextera PCR primer i5 index N502 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACCTCTCTATTCGTCGGCAGCGTC +>gnl|uv|NGB00729.1 Illumina Nextera PCR primer i5 index N503 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACTATCCTCTTCGTCGGCAGCGTC +>gnl|uv|NGB00730.1 Illumina Nextera PCR primer i5 index N504 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACAGAGTAGATCGTCGGCAGCGTC +>gnl|uv|NGB00731.1 Illumina Nextera PCR primer i5 index N505 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACGTAAGGAGTCGTCGGCAGCGTC +>gnl|uv|NGB00732.1 Illumina Nextera PCR primer i5 index N506 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACACTGCATATCGTCGGCAGCGTC +>gnl|uv|NGB00733.1 Illumina Nextera PCR primer i5 index N507 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACAAGGAGTATCGTCGGCAGCGTC +>gnl|uv|NGB00734.1 Illumina Nextera PCR primer i5 index N508 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACCTAAGCCTTCGTCGGCAGCGTC +>gnl|uv|NGB00735.1 Illumina Nextera PCR primer i7 index N701 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTCTCGTGGGCTCGG +>gnl|uv|NGB00736.1 Illumina Nextera PCR primer i7 index N702 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCTAGTACGGTCTCGTGGGCTCGG +>gnl|uv|NGB00737.1 Illumina Nextera PCR primer i7 index N703 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTTCTGCCTGTCTCGTGGGCTCGG +>gnl|uv|NGB00738.1 Illumina Nextera PCR primer i7 index N704 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGCTCAGGAGTCTCGTGGGCTCGG +>gnl|uv|NGB00739.1 Illumina Nextera PCR primer i7 index N705 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATAGGAGTCCGTCTCGTGGGCTCGG +>gnl|uv|NGB00740.1 Illumina Nextera PCR primer i7 index N706 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCATGCCTAGTCTCGTGGGCTCGG +>gnl|uv|NGB00741.1 Illumina Nextera PCR primer i7 index N707 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGTAGAGAGGTCTCGTGGGCTCGG +>gnl|uv|NGB00742.1 Illumina Nextera PCR primer i7 index N708 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCCTCTCTGGTCTCGTGGGCTCGG +>gnl|uv|NGB00743.1 Illumina Nextera PCR primer i7 index N709 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) 
+CAAGCAGAAGACGGCATACGAGATAGCGTAGCGTCTCGTGGGCTCGG +>gnl|uv|NGB00744.1 Illumina Nextera PCR primer i7 index N710 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCAGCCTCGGTCTCGTGGGCTCGG +>gnl|uv|NGB00745.1 Illumina Nextera PCR primer i7 index N711 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTGCCTCTTGTCTCGTGGGCTCGG +>gnl|uv|NGB00746.1 Illumina Nextera PCR primer i7 index N712 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTCCTCTACGTCTCGTGGGCTCGG +>gnl|uv|NGB00747.1 Illumina TruSeq DNA HT and RNA HT i5 index D501 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACTATAGCCTACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB00748.1 Illumina TruSeq DNA HT and RNA HT i5 index D502 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACATAGAGGCACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB00749.1 Illumina TruSeq DNA HT and RNA HT i5 index D503 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACCCTATCCTACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB00750.1 Illumina TruSeq DNA HT and RNA HT i5 index D504 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACGGCTCTGAACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB00751.1 Illumina TruSeq DNA HT and RNA HT i5 index D505 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACAGGCGAAGACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB00752.1 Illumina TruSeq DNA HT and RNA HT i5 index D506 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACTAATCTTAACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB00753.1 Illumina TruSeq DNA HT and RNA HT i5 index D507 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACCAGGACGTACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB00754.1 Illumina TruSeq DNA HT and RNA HT i5 index D508 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACGTACTGACACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB00755.1 Illumina TruSeq DNA HT and RNA HT i7 index D701 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACATTACTCGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00756.1 Illumina TruSeq DNA HT and RNA HT i7 index D702 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCGGAGAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00757.1 Illumina TruSeq DNA HT and RNA HT i7 index D703 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGCTCATTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00758.1 Illumina TruSeq DNA HT and RNA HT i7 index D704 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGATTCCATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00759.1 Illumina TruSeq DNA HT and RNA HT i7 index D705 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) 
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACATTCAGAAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00760.1 Illumina TruSeq DNA HT and RNA HT i7 index D706 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACGAATTCGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00761.1 Illumina TruSeq DNA HT and RNA HT i7 index D707 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTGAAGCTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00762.1 Illumina TruSeq DNA HT and RNA HT i7 index D708 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACTAATGCGCATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00763.1 Illumina TruSeq DNA HT and RNA HT i7 index D709 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGGCTATGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00764.1 Illumina TruSeq DNA HT and RNA HT i7 index D710 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCGCGAAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00765.1 Illumina TruSeq DNA HT and RNA HT i7 index D711 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACTCTCGCGCATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00766.1 Illumina TruSeq DNA HT and RNA HT i7 index D712 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGCGATAGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00767.1 Illumina TruSeq Adapter Index 1 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00768.1 Illumina TruSeq Adapter Index 2 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00769.1 Illumina TruSeq Adapter Index 3 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00770.1 Illumina TruSeq Adapter Index 4 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00771.1 Illumina TruSeq Adapter Index 5 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00772.1 Illumina TruSeq Adapter Index 6 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00773.1 Illumina TruSeq Adapter Index 7 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00774.1 Illumina TruSeq Adapter Index 8 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACACTTGAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00775.1 Illumina TruSeq Adapter Index 9 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) 
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00776.1 Illumina TruSeq Adapter Index 10 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00777.1 Illumina TruSeq Adapter Index 11 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCTACATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00778.1 Illumina TruSeq Adapter Index 12 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTTGTAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00779.1 Illumina TruSeq Adapter Index 13 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00780.1 Illumina TruSeq Adapter Index 14 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00781.1 Illumina TruSeq Adapter Index 15 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACATGTCAGAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00782.1 Illumina TruSeq Adapter Index 16 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACCCGTCCCGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00783.1 Illumina TruSeq Adapter Index 18 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTCCGCACATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00784.1 Illumina TruSeq Adapter Index 19 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTGAAACGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00785.1 Illumina TruSeq Adapter Index 20 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTGGCCTTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00786.1 Illumina TruSeq Adapter Index 21 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTTTCGGAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00787.1 Illumina TruSeq Adapter Index 22 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGTACGTAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00788.1 Illumina TruSeq Adapter Index 23 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTGGATATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00789.1 Illumina TruSeq Adapter Index 25 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACACTGATATATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00790.1 Illumina TruSeq Adapter Index 27 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GATCGGAAGAGCACACGTCTGAACTCCAGTCACATTCCTTTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB00791.1 Illumina TruSeq Small RNA Sample Prep Kit Stop Oligo (STP) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +GAAUUCCACCACGUUCCCGUGG +>gnl|uv|NGB00792.1 Illumina TruSeq Small RNA Sample Prep Kit RNA RT Primer (RTP) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) 
+GCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00793.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer (RP1) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGA +>gnl|uv|NGB00794.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 1 (RPI1) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00795.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 2 (RPI2) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00796.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 3 (RPI3) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00797.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 4 (RPI4) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00798.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 5 (RPI5) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00799.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 6 (RPI6) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00800.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 7 (RPI7) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGATCTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00801.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 8 (RPI8) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTCAAGTGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00802.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 9 (RPI9) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCTGATCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00803.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 10 (RPI10) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATAAGCTAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00804.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 11 (RPI11) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGTAGCCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00805.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 12 (RPI12) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTACAAGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00806.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 13 (RPI13) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTTGACTGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00807.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 14 (RPI14) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) 
+CAAGCAGAAGACGGCATACGAGATGGAACTGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00808.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 15 (RPI15) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTGACATGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00809.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 16 (RPI16) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGGACGGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00810.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 17 (RPI17) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCTCTACGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00811.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 18 (RPI18) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGCGGACGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00812.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 19 (RPI19) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTTTCACGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00813.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 20 (RPI20) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGGCCACGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00814.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 21 (RPI21) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCGAAACGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00815.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 22 (RPI22) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCGTACGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00816.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 23 (RPI23) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCCACTCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00817.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 24 (RPI24) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGCTACCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00818.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 25 (RPI25) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATATCAGTGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00819.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 26 (RPI26) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGCTCATGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00820.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 27 (RPI27) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATAGGAATGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00821.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 28 (RPI28) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) 
+CAAGCAGAAGACGGCATACGAGATCTTTTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00822.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 29 (RPI29) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTAGTTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00823.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 30 (RPI30) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCCGGTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00824.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 31 (RPI31) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATATCGTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00825.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 32 (RPI32) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTGAGTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00826.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 33 (RPI33) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCGCCTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00827.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 34 (RPI34) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGCCATGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00828.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 35 (RPI35) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATAAAATGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00829.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 36 (RPI36) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTGTTGGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00830.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 37 (RPI37) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATATTCCGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00831.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 38 (RPI38) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATAGCTAGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00832.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 39 (RPI39) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGTATAGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00833.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 40 (RPI40) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTCTGAGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00834.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 41 (RPI41) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGTCGTCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00835.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 42 (RPI42) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) 
+CAAGCAGAAGACGGCATACGAGATCGATTAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00836.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 43 (RPI43) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGCTGTAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00837.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 44 (RPI44) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATATTATAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00838.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 45 (RPI45) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATGAATGAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00839.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 46 (RPI46) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTCGGGAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00840.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 47 (RPI47) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATCTTCGAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00841.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 48 (RPI48) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +CAAGCAGAAGACGGCATACGAGATTGCCGAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA +>gnl|uv|NGB00844.1 Epicentre BiotechnologiesNextera DNA Sample Prep Kit Adaptor (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) +AATGATACGGCGACCACCGAGATCTACACGCCTCCCTCGCGCCATCAG +>gnl|uv|NGB00845.1 Epicentre Biotechnologies Nextera DNA Sample Prep Kit Adaptor, following the barcode (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) 
+CGGTCTGCCTTGCCAGCCCGCTCAG +>gnl|uv|NGB00846.1 NEBNext Adaptor for Illumina +GATCGGAAGAGCACACGTCTGAACTCCAGTCTACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB00847.1 NEBNext Index 1 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00848.1 NEBNext Index 2 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00849.1 NEBNext Index 3 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00850.1 NEBNext Index 4 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00851.1 NEBNext Index 5 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00852.1 NEBNext Index 6 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00853.1 NEBNext Index 7 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATGATCTGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00854.1 NEBNext Index 8 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATTCAAGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00855.1 NEBNext Index 9 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATCTGATCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00856.1 NEBNext Index 10 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATAAGCTAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00857.1 NEBNext Index 11 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATGTAGCCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00858.1 NEBNext Index 12 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATTACAAGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00859.1 NEBNext Index 13 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATTGTTGACTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00860.1 NEBNext Index 14 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATACGGAACTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00861.1 NEBNext Index 15 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATTCTGACATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00862.1 NEBNext Index 16 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATCGGGACGGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00863.1 NEBNext Index 18 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATGTGCGGACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00864.1 NEBNext Index 19 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATCGTTTCACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00865.1 NEBNext Index 20 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATAAGGCCACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00866.1 NEBNext Index 21 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATTCCGAAACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00867.1 NEBNext Index 22 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATTACGTACGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00868.1 NEBNext Index 23 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATATCCACTCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00869.1 NEBNext Index 25 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATATATCAGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00870.1 NEBNext Index 27 Primer for Illumina +CAAGCAGAAGACGGCATACGAGATAAAGGAATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT +>gnl|uv|NGB00871.1 Ion Xpress A Adapter +AACCATCTCATCCCTGCGTGTCTCCGACTCAG +>gnl|uv|NGB00872.1 Ion Xpress P1 Adapter +AACCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT +>gnl|uv|NGB00873.1 Ion Xpress Barcode 1 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAAGGTAACGAT +>gnl|uv|NGB00874.1 Ion Xpress Barcode 2 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAAGGAGAACGAT +>gnl|uv|NGB00875.1 Ion Xpress Barcode 3 A Adapter 
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAAGAGGATTCGAT +>gnl|uv|NGB00876.1 Ion Xpress Barcode 4 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTACCAAGATCGAT +>gnl|uv|NGB00877.1 Ion Xpress Barcode 5 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGAAGGAACGAT +>gnl|uv|NGB00878.1 Ion Xpress Barcode 6 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGCAAGTTCGAT +>gnl|uv|NGB00879.1 Ion Xpress Barcode 7 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCGTGATTCGAT +>gnl|uv|NGB00880.1 Ion Xpress Barcode 8 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCCGATAACGAT +>gnl|uv|NGB00881.1 Ion Xpress Barcode 9 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGAGCGGAACGAT +>gnl|uv|NGB00882.1 Ion Xpress Barcode 10 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGACCGAACGAT +>gnl|uv|NGB00883.1 Ion Xpress Barcode 11 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTCGAATCGAT +>gnl|uv|NGB00884.1 Ion Xpress Barcode 12 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGGTGGTTCGAT +>gnl|uv|NGB00885.1 Ion Xpress Barcode 13 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTAACGGACGAT +>gnl|uv|NGB00886.1 Ion Xpress Barcode 14 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTGGAGTGTCGAT +>gnl|uv|NGB00887.1 Ion Xpress Barcode 15 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTAGAGGTCGAT +>gnl|uv|NGB00888.1 Ion Xpress Barcode 16 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGGATGACGAT +>gnl|uv|NGB00889.1 Ion Xpress Barcode 17 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTATTCGTCGAT +>gnl|uv|NGB00890.1 Ion Xpress Barcode 18 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGGCAATTGCGAT +>gnl|uv|NGB00891.1 Ion Xpress Barcode 19 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTAGTCGGACGAT +>gnl|uv|NGB00892.1 Ion Xpress Barcode 20 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGATCCATCGAT +>gnl|uv|NGB00893.1 Ion Xpress Barcode 21 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGCAATTACGAT +>gnl|uv|NGB00894.1 Ion Xpress Barcode 22 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCGAGACGCGAT +>gnl|uv|NGB00895.1 Ion Xpress Barcode 23 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGCCACGAACGAT +>gnl|uv|NGB00896.1 Ion Xpress Barcode 24 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGAACCTCATTCGAT +>gnl|uv|NGB00897.1 Ion Xpress Barcode 25 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCCTGAGATACGAT +>gnl|uv|NGB00898.1 Ion Xpress Barcode 26 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTACAACCTCGAT +>gnl|uv|NGB00899.1 Ion Xpress Barcode 27 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGAACCATCCGCGAT +>gnl|uv|NGB00900.1 Ion Xpress Barcode 28 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGATCCGGAATCGAT +>gnl|uv|NGB00901.1 Ion Xpress Barcode 29 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGACCACTCGAT +>gnl|uv|NGB00902.1 Ion Xpress Barcode 30 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAGGTTATCGAT +>gnl|uv|NGB00903.1 Ion Xpress Barcode 31 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCAAGCTGCGAT +>gnl|uv|NGB00904.1 Ion Xpress Barcode 32 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTTACACACGAT +>gnl|uv|NGB00905.1 Ion Xpress Barcode 33 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCTCATTGAACGAT +>gnl|uv|NGB00906.1 Ion Xpress Barcode 34 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGCATCGTTCGAT +>gnl|uv|NGB00907.1 Ion Xpress Barcode 35 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAAGCCATTGTCGAT +>gnl|uv|NGB00908.1 Ion Xpress Barcode 36 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGAAGGAATCGTCGAT +>gnl|uv|NGB00909.1 Ion Xpress Barcode 37 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTTGAGAATGTCGAT +>gnl|uv|NGB00910.1 Ion Xpress Barcode 38 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGGAGGACGGACGAT +>gnl|uv|NGB00911.1 Ion Xpress Barcode 39 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAACAATCGGCGAT 
+>gnl|uv|NGB00912.1 Ion Xpress Barcode 40 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGACATAATCGAT +>gnl|uv|NGB00913.1 Ion Xpress Barcode 41 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCCACTTCGCGAT +>gnl|uv|NGB00914.1 Ion Xpress Barcode 42 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCACGAATCGAT +>gnl|uv|NGB00915.1 Ion Xpress Barcode 43 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTTGACACCGCGAT +>gnl|uv|NGB00916.1 Ion Xpress Barcode 44 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTGGAGGCCAGCGAT +>gnl|uv|NGB00917.1 Ion Xpress Barcode 45 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGGAGCTTCCTCGAT +>gnl|uv|NGB00918.1 Ion Xpress Barcode 46 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCAGTCCGAACGAT +>gnl|uv|NGB00919.1 Ion Xpress Barcode 47 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTAAGGCAACCACGAT +>gnl|uv|NGB00920.1 Ion Xpress Barcode 48 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCTAAGAGACGAT +>gnl|uv|NGB00921.1 Ion Xpress Barcode 49 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTAACATAACGAT +>gnl|uv|NGB00922.1 Ion Xpress Barcode 50 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGGACAATGGCGAT +>gnl|uv|NGB00923.1 Ion Xpress Barcode 51 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTGAGCCTATTCGAT +>gnl|uv|NGB00924.1 Ion Xpress Barcode 52 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCCGCATGGAACGAT +>gnl|uv|NGB00925.1 Ion Xpress Barcode 53 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGGCAATCCTCGAT +>gnl|uv|NGB00926.1 Ion Xpress Barcode 54 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCCGGAGAATCGCGAT +>gnl|uv|NGB00927.1 Ion Xpress Barcode 55 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCACCTCCTCGAT +>gnl|uv|NGB00928.1 Ion Xpress Barcode 56 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGCATTAATTCGAT +>gnl|uv|NGB00929.1 Ion Xpress Barcode 57 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGGCAACGGCGAT +>gnl|uv|NGB00930.1 Ion Xpress Barcode 58 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTAGAACACGAT +>gnl|uv|NGB00931.1 Ion Xpress Barcode 59 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTTGATGTTCGAT +>gnl|uv|NGB00932.1 Ion Xpress Barcode 60 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTAGCTCTTCGAT +>gnl|uv|NGB00933.1 Ion Xpress Barcode 61 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCACTCGGATCGAT +>gnl|uv|NGB00934.1 Ion Xpress Barcode 62 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCCTGCTTCACGAT +>gnl|uv|NGB00935.1 Ion Xpress Barcode 63 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCCTTAGAGTTCGAT +>gnl|uv|NGB00936.1 Ion Xpress Barcode 64 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGAGTTCCGACGAT +>gnl|uv|NGB00937.1 Ion Xpress Barcode 65 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTGGCACATCGAT +>gnl|uv|NGB00938.1 Ion Xpress Barcode 66 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCCGCAATCATCGAT +>gnl|uv|NGB00939.1 Ion Xpress Barcode 67 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCCTACCAGTCGAT +>gnl|uv|NGB00940.1 Ion Xpress Barcode 68 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCAAGAAGTTCGAT +>gnl|uv|NGB00941.1 Ion Xpress Barcode 69 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCAATTGGCGAT +>gnl|uv|NGB00942.1 Ion Xpress Barcode 70 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCCTACTGGTCGAT +>gnl|uv|NGB00943.1 Ion Xpress Barcode 71 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTGAGGCTCCGACGAT +>gnl|uv|NGB00944.1 Ion Xpress Barcode 72 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAAGGCCACACGAT +>gnl|uv|NGB00945.1 Ion Xpress Barcode 73 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGCCTGTCGAT +>gnl|uv|NGB00946.1 Ion Xpress Barcode 74 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGATCGGTTCGAT +>gnl|uv|NGB00947.1 Ion Xpress Barcode 75 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCAGGAATACGAT 
+>gnl|uv|NGB00948.1 Ion Xpress Barcode 76 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGGAAGAACCTCGAT +>gnl|uv|NGB00949.1 Ion Xpress Barcode 77 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAAGCGATTCGAT +>gnl|uv|NGB00950.1 Ion Xpress Barcode 78 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGCCAATTCTCGAT +>gnl|uv|NGB00951.1 Ion Xpress Barcode 79 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCCTGGTTGTCGAT +>gnl|uv|NGB00952.1 Ion Xpress Barcode 80 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGAAGGCAGGCGAT +>gnl|uv|NGB00953.1 Ion Xpress Barcode 81 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCCTGCCATTCGCGAT +>gnl|uv|NGB00954.1 Ion Xpress Barcode 82 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTGGCATCTCGAT +>gnl|uv|NGB00955.1 Ion Xpress Barcode 83 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAGGACATTCGAT +>gnl|uv|NGB00956.1 Ion Xpress Barcode 84 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTTCCATAACGAT +>gnl|uv|NGB00957.1 Ion Xpress Barcode 85 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCCAGCCTCAACGAT +>gnl|uv|NGB00958.1 Ion Xpress Barcode 86 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTTGGTTATTCGAT +>gnl|uv|NGB00959.1 Ion Xpress Barcode 87 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTGGCTGGACGAT +>gnl|uv|NGB00960.1 Ion Xpress Barcode 88 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCCGAACACTTCGAT +>gnl|uv|NGB00961.1 Ion Xpress Barcode 89 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTGAATCTCGAT +>gnl|uv|NGB00962.1 Ion Xpress Barcode 90 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAACCACGGCGAT +>gnl|uv|NGB00963.1 Ion Xpress Barcode 91 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGGAAGGATGCGAT +>gnl|uv|NGB00964.1 Ion Xpress Barcode 92 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAGGAACCGCGAT +>gnl|uv|NGB00965.1 Ion Xpress Barcode 93 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCTTGTCCAATCGAT +>gnl|uv|NGB00966.1 Ion Xpress Barcode 94 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCGACAAGCGAT +>gnl|uv|NGB00967.1 Ion Xpress Barcode 95 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGCGGACAGATCGAT +>gnl|uv|NGB00968.1 Ion Xpress Barcode 96 A Adapter +CCATCTCATCCCTGCGTGTCTCCGACTCAGTTAAGCGGTCGAT +>gnl|uv|NGB00969.1 Illumina Single End Apapter 1 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.) 
+ACACTCTTTCCCTACACGACGCTGTTCCATCT +>gnl|uv|NGB00970.1 ABI SOLiD SAGE Dynabeads Oligo-dT EcoP Primer +CTGATCTAGAGGTACCGGATCCCAGCAGTTTTTTTTTTTTTTTTTTTTTTTTT +>gnl|uv|NGB00971.1 ABI SOLiD SAGE Adapter A +CTGCCCCGGGTTCCTCATTCTCTCAGCAGCATG +>gnl|uv|NGB00972.1 Pacific Biosciences Blunt Adapter +ATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGAT +>gnl|uv|NGB00973.1 Pacific Biosciences C2 Primer +AAAAAAAAAAAAAAAAAATTAACGGAGGAGGAGGA +>gnl|uv|NGB00982.1 Universal primer-dN6 +GCCGGAGCTCTGCAGAATTCNNNNNN +>gnl|uv|NGB00983.1 Whole Transcriptome Amplification 5'-end tag +GTGGTGTGTTGGGTGTGTTTGGNNNNNNNNN +>gnl|uv|NGB01026.1 SISPA primer FR20RV +GCCGGAGCTCTGCAGATATC +>gnl|uv|NGB01029.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT1 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGTTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01030.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT2 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTTTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01031.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT3 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCATATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01032.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT4 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCACTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01033.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT5 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01034.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT6 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01035.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT7 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCTGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01036.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT8 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACACTTGATGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01037.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT9 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGCGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01038.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT10 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01039.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT11 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCTACAGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01040.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT12 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCTTGTACTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01041.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT13 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGGTTGTTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01042.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT14 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCTCGGTTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01043.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT15 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAAGCGTTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01044.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT16 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCGTCTTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01045.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT17 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGTACCTTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01046.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT18 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTCTGTGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01047.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT19 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCTGCTGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01048.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT20 
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTGGAGGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01049.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT21 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCGAGCGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01050.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT22 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGATACGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01051.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT99 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGTGCTACCATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01052.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT101 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGGTTGGACATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01053.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT25 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGCGATCTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01054.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT26 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTCCTGCTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01055.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT27 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGTGACTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01056.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT28 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTACAGGATATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01057.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT29 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCTCAATATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01058.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT30 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGTGGTTGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01059.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT31 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGTCTTGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01060.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT32 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTCCATTGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01061.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT33 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCGAAGTGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01062.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT34 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAACGCTGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01063.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT35 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTGGTATGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01064.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT36 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGAACTGGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01065.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT102 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCACAACATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01066.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT38 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCTCACGGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01067.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT39 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCAGGAGGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01068.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT40 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAAGTTCGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01069.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT41 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCAGTCGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01070.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT42 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGTATGCGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01071.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT43 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCATTGAGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01072.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT44 
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGGCTCAGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01073.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT45 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTATGCCAGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01074.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT46 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCAGATTCATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01075.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT47 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTACTAGTCATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01076.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT48 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTCAGCTCATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01077.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D701 +AATGATACGGCGACCACCGAGATCTACACATTACTCGACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01078.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D702 +AATGATACGGCGACCACCGAGATCTACACTCCGGAGAACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01079.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D703 +AATGATACGGCGACCACCGAGATCTACACCGCTCATTACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01080.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D704 +AATGATACGGCGACCACCGAGATCTACACGAGATTCCACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01081.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D705 +AATGATACGGCGACCACCGAGATCTACACATTCAGAAACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01082.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D706 +AATGATACGGCGACCACCGAGATCTACACGAATTCGTACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01083.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D707 +AATGATACGGCGACCACCGAGATCTACACCTGAAGCTACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01084.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D708 +AATGATACGGCGACCACCGAGATCTACACTAATGCGCACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01085.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D709 +AATGATACGGCGACCACCGAGATCTACACCGGCTATGACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01086.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D710 +AATGATACGGCGACCACCGAGATCTACACTCCGCGAAACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01087.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D711 +AATGATACGGCGACCACCGAGATCTACACTCTCGCGCACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01088.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D712 +AATGATACGGCGACCACCGAGATCTACACAGCGATAGACACTCTTTCCCTACACGACGCTCTTCCGATCT +>gnl|uv|NGB01089.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D501 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTATAGCCTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01090.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D502 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACATAGAGGCATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01091.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D503 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCCTATCCTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01092.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D504 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCTCTGAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01093.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D505 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACAGGCGAAGATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01094.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D506 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAATCTTAATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01095.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D507 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGGACGTATCTCGTATGCCGTCTTCTGCTTG +>gnl|uv|NGB01096.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D508 +AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGTACTGACATCTCGTATGCCGTCTTCTGCTTG
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/assets/dummy_file.txt Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,1 @@ +DuMmY
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/assets/dummy_file2.txt Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,1 @@ +DuMmY
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/check_samplesheet.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,188 @@ +#!/usr/bin/env python3 + +import argparse +import errno +import os +import sys + + +def parse_args(args=None): + Description = "Reformat samplesheet file and check its contents." + Epilog = "Example usage: python check_samplesheet.py <FILE_IN> <FILE_OUT>" + + parser = argparse.ArgumentParser(description=Description, epilog=Epilog) + parser.add_argument("FILE_IN", help="Input samplesheet file.") + parser.add_argument("FILE_OUT", help="Output file.") + return parser.parse_args(args) + + +def make_dir(path): + if len(path) > 0: + try: + os.makedirs(path) + except OSError as exception: + if exception.errno != errno.EEXIST: + raise exception + + +def print_error(error, context="Line", context_str=""): + error_str = f"ERROR: Please check samplesheet -> {error}" + if context != "" and context_str != "": + error_str = f"ERROR: Please check samplesheet -> {error}\n{context.strip()}: '{context_str.strip()}'" + print(error_str) + sys.exit(1) + + +def check_samplesheet(file_in, file_out): + """ + This function checks that the samplesheet follows the following structure: + + sample,fq1,fq2,strandedness + SAMPLE_PE,SAMPLE_PE_RUN1_1.fastq.gz,SAMPLE_PE_RUN1_2.fastq.gz,forward + SAMPLE_PE,SAMPLE_PE_RUN2_1.fastq.gz,SAMPLE_PE_RUN2_2.fastq.gz,forward + SAMPLE_SE,SAMPLE_SE_RUN1_1.fastq,,forward + SAMPLE_SE,SAMPLE_SE_RUN1_2.fastq.gz,,forward + + For an example see: + https://github.com/nf-core/test-datasets/blob/rnaseq/samplesheet/v3.1/samplesheet_test.csv + """ + + sample_mapping_dict = {} + with open(file_in, "r", encoding="utf-8-sig") as fin: + + ## Check header + MIN_COLS = 3 + HEADER = ["sample", "fq1", "fq2", "strandedness"] + header = [x.strip('"') for x in fin.readline().strip().split(",")] + if header[: len(HEADER)] != HEADER: + print( + f"ERROR: Please check samplesheet header -> {','.join(header)} != {','.join(HEADER)}" + ) + sys.exit(1) + + ## Check sample entries + for line in fin: + if line.strip(): + lspl = [x.strip().strip('"') for x in line.strip().split(",")] + + ## Check valid number of columns per row + if len(lspl) < len(HEADER): + print_error( + f"Invalid number of columns (minimum = {len(HEADER)})!", + "Line", + line, + ) + + num_cols = len([x for x in lspl if x]) + if num_cols < MIN_COLS: + print_error( + f"Invalid number of populated columns (minimum = {MIN_COLS})!", + "Line", + line, + ) + + ## Check sample name entries + sample, fq1, fq2, strandedness = lspl[: len(HEADER)] + if sample.find(" ") != -1: + print( + f"WARNING: Spaces have been replaced by underscores for sample: {sample}" + ) + sample = sample.replace(" ", "_") + if not sample: + print_error("Sample entry has not been specified!", "Line", line) + + ## Check FastQ file extension + for fastq in [fq1, fq2]: + if fastq: + if fastq.find(" ") != -1: + print_error("FastQ file contains spaces!", "Line", line) + # if not fastq.endswith(".fastq.gz") and not fastq.endswith(".fq.gz"): + # print_error( + # "FastQ file does not have extension '.fastq.gz' or '.fq.gz'!", + # "Line", + # line, + # ) + + ## Check strandedness + strandednesses = ["unstranded", "forward", "reverse"] + if strandedness: + if strandedness not in strandednesses: + print_error( + f"Strandedness must be one of '{', '.join(strandednesses)}'!", + "Line", + line, + ) + else: + print_error( + f"Strandedness has not been specified! 
Must be one of {', '.join(strandednesses)}.", + "Line", + line, + ) + + ## Auto-detect paired-end/single-end + sample_info = [] ## [single_end, fq1, fq2, strandedness] + if sample and fq1 and fq2: ## Paired-end short reads + sample_info = ["0", fq1, fq2, strandedness] + elif sample and fq1 and not fq2: ## Single-end short reads + sample_info = ["1", fq1, fq2, strandedness] + else: + print_error( + "Invalid combination of columns provided!", "Line", line + ) + + ## Create sample mapping dictionary = {sample: [[ single_end, fq1, fq2, strandedness ]]} + if sample not in sample_mapping_dict: + sample_mapping_dict[sample] = [sample_info] + else: + if sample_info in sample_mapping_dict[sample]: + print_error( + "Samplesheet contains duplicate rows!", "Line", line + ) + else: + sample_mapping_dict[sample].append(sample_info) + + ## Write validated samplesheet with appropriate columns + if len(sample_mapping_dict) > 0: + out_dir = os.path.dirname(file_out) + make_dir(out_dir) + with open(file_out, "w") as fout: + fout.write( + ",".join(["sample", "single_end", "fq1", "fq2", "strandedness"]) + "\n" + ) + for sample in sorted(sample_mapping_dict.keys()): + + ## Check that multiple runs of the same sample are of the same datatype i.e. single-end / paired-end + if not all( + x[0] == sample_mapping_dict[sample][0][0] + for x in sample_mapping_dict[sample] + ): + print_error( + f"Multiple runs of a sample must be of the same datatype i.e. single-end or paired-end!", + "Sample", + sample, + ) + + ## Check that multiple runs of the same sample are of the same strandedness + if not all( + x[-1] == sample_mapping_dict[sample][0][-1] + for x in sample_mapping_dict[sample] + ): + print_error( + f"Multiple runs of a sample must have the same strandedness!", + "Sample", + sample, + ) + + for idx, val in enumerate(sample_mapping_dict[sample]): + fout.write(",".join([f"{sample}_T{idx+1}"] + val) + "\n") + else: + print_error(f"No entries to process!", "Samplesheet: {file_in}") + + +def main(args=None): + args = parse_args(args) + check_samplesheet(args.FILE_IN, args.FILE_OUT) + + +if __name__ == "__main__": + sys.exit(main())
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/create_fasta_and_lineages.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,518 @@ +#!/usr/bin/env python3 + +import argparse +import gzip +import inspect +import logging +import os +import pprint +import re +import shutil +import ssl +import tempfile +from html.parser import HTMLParser +from urllib.request import urlopen + +from Bio import SeqIO +from Bio.Seq import Seq +from Bio.SeqRecord import SeqRecord + + +# Multiple inheritence for pretty printing of help text. +class MultiArgFormatClasses( + argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter +): + pass + + +# HTMLParser override class to get fna.gz and gbff.gz +class NCBIHTMLParser(HTMLParser): + def __init__(self, *, convert_charrefs: bool = ...) -> None: + super().__init__(convert_charrefs=convert_charrefs) + self.reset() + self.href_data = list() + + def handle_data(self, data): + self.href_data.append(data) + + +# Download organelle FASTA and GenBank file. +def dl_mito_seqs_and_flat_files(url: str, suffix: re, out: os.PathLike) -> os.PathLike: + """ + Method to save .fna.gz and .gbff.gz files for the + RefSeq mitochondrion release. + """ + contxt = ssl.create_default_context() + contxt.check_hostname = False + contxt.verify_mode = ssl.CERT_NONE + + if url == None: + logging.error( + "Please provide the base URL where .fna.gz and .gbff.gz" + + "\nfiles for RefSeq mitochondrion can be found." + ) + exit(1) + + if os.path.exists(out): + for file in os.listdir(out): + file_path = os.path.join(out, file) + + if suffix.match(file_path) and os.path.getsize(file_path) > 0: + logging.info( + f"The required mitochondrion file(s)\n[{os.path.basename(file_path)}]" + + " already exists.\nSkipping download from NCBI..." + + "\nPlease use -f to delete and overwrite." + ) + return file_path + else: + os.makedirs(out) + + html_parser = NCBIHTMLParser() + logging.info(f"Finding latest NCBI RefSeq mitochondrion release at:\n{url}") + + with urlopen(url, context=contxt) as response: + with tempfile.NamedTemporaryFile(delete=False) as tmp_html_file: + shutil.copyfileobj(response, tmp_html_file) + + with open(tmp_html_file.name, "r") as html: + html_parser.feed("".join(html.readlines())) + + file = suffix.search("".join(html_parser.href_data)).group(0) + file_url = "/".join([url, file + ".gz"]) + file_at = os.path.join(out, file) + + logging.info(f"Found NCBI RefSeq mitochondrian file(s):\n{file_url}") + + logging.info(f"Saving to:\n{file_at}") + + with tempfile.NamedTemporaryFile(delete=False) as tmp_gz: + with urlopen(file_url, context=contxt) as response: + tmp_gz.write(response.read()) + + with open(file_at, "w") as fh: + with gzip.open(tmp_gz.name, "rb") as web_gz: + fh.write(web_gz.read().decode("utf-8")) + + html.close() + tmp_gz.close() + tmp_html_file.close() + os.unlink(tmp_gz.name) + os.unlink(tmp_html_file.name) + fh.close() + web_gz.close() + response.close() + + return file_at + + +def get_lineages(csv: os.PathLike, cols: list) -> list: + """ + Parse the output from `ncbitax2lin` tool and + return a dict of lineages where the key is + genusspeciesstrain. + """ + lineages = dict() + if csv == None or not (os.path.exists(csv) or os.path.getsize(csv) > 0): + logging.error( + f"The CSV file [{os.path.basename(csv)}] is empty or does not exist!" 
+ ) + exit(1) + + logging.info(f"Indexing {os.path.basename(csv)}...") + + with open(csv, "r") as csv_fh: + header_cols = csv_fh.readline().strip().split(",") + user_req_cols = [ + tcol_i for tcol_i, tcol in enumerate(header_cols) if tcol in cols + ] + cols_not_found = [tcol for tcol in cols if tcol not in header_cols] + raw_recs = 0 + + if len(cols_not_found) > 0: + logging.error( + f"The following columns do not exist in the" + + f"\nCSV file [ {os.path.basename(csv)} ]:\n" + + "".join(cols_not_found) + ) + exit(1) + elif len(user_req_cols) > 9: + logging.error( + f"Only a total of 9 columns are needed!" + + "\ntax_id,kindom,phylum,class,order,family,genus,species,strain" + ) + exit(1) + + for tax in csv_fh: + raw_recs += 1 + lcols = tax.strip().split(",") + + if bool(lcols[user_req_cols[8]]): + lineages[lcols[user_req_cols[8]]] = ",".join( + [lcols[l] for l in user_req_cols[1:]] + ) + elif bool(lcols[user_req_cols[7]]): + lineages[lcols[user_req_cols[7]]] = ",".join( + [lcols[l] for l in user_req_cols[1:8]] + [str()] + ) + + csv_fh.close() + return lineages, raw_recs + + +def from_genbank(gbk: os.PathLike, min_len: int) -> dict: + """ + Method to parse GenBank file and return + organism to latest accession mapping. + """ + accs2orgs = dict() + + if not (os.path.exists(gbk) or os.path.getsize(gbk) > 0): + logging.info( + f"The GenBank file [{os.path.basename(gbk)}] does not exist" + + "\nor is of size 0." + ) + exit(1) + + logging.info(f"Indexing {os.path.basename(gbk)}...") + + # a = open("./_accs", "w") + for record in SeqIO.parse(gbk, "genbank"): + if len(record.seq) < min_len: + continue + else: + # a.write(f"{record.id}\n") + accs2orgs[record.id] = record.annotations["organism"] + + return accs2orgs + + +def from_genbank_alt(gbk: os.PathLike) -> dict: + """ + Method to parse GenBank file and return + organism to latest accession mapping without + using BioPython's GenBank Scanner + """ + accs2orgs = dict() + accs = dict() + orgs = dict() + acc = False + acc_pat = re.compile(r"^VERSION\s+(.+)") + org_pat = re.compile(r"^\s+ORGANISM\s+(.+)") + + if not (os.path.exists(gbk) or os.path.getsize(gbk) > 0): + logging.info( + f"The GenBank file [{os.path.basename(gbk)}] does not exist" + + "\nor is of size 0." + ) + exit(1) + + logging.info( + f"Indexing {os.path.basename(gbk)} without using\nBioPython's GenBank Scanner..." + ) + + with open(gbk, "r") as gbk_fh: + for line in gbk_fh: + line = line.rstrip() + if line.startswith("VERSION") and acc_pat.match(line): + acc = acc_pat.match(line).group(1) + accs[acc] = 1 + if org_pat.match(line): + if acc and acc not in orgs.keys(): + orgs[acc] = org_pat.match(line).group(1) + elif acc and acc in orgs.keys(): + logging.error(f"Duplicate VERSION line: {acc}") + exit(1) + if len(accs.keys()) != len(orgs.keys()): + logging.error( + f"Got unequal number of organisms ({len(orgs.keys())})\n" + + f"and accessions ({len(accs.keys())})" + ) + exit(1) + else: + for acc in accs.keys(): + if acc not in orgs.keys(): + logging.error(f"ORAGANISM not found for accession: {acc}") + exit(1) + accs2orgs[acc] = orgs[acc] + + gbk_fh.close() + return accs2orgs + + +def write_fasta(seq: str, id: str, basedir: os.PathLike, suffix: str) -> None: + """ + Write sequence with no description to specified file. + """ + SeqIO.write( + SeqRecord(Seq(seq), id=id, description=str()), + os.path.join(basedir, id + suffix), + "fasta", + ) + + +# Main +def main() -> None: + """ + This script takes: + 1. Downloads the RefSeq Mitochrondrial GenBank and FASTA format files. + 2. 
Takes as input and output .csv.gz or .csv file generated by `ncbitax2lin`. + + and then generates a folder containing individual FASTA sequence files + per organelle, and a corresponding lineage file in CSV format. + """ + + # Set logging. + logging.basicConfig( + format="\n" + + "=" * 55 + + "\n%(asctime)s - %(levelname)s\n" + + "=" * 55 + + "\n%(message)s\n\n", + level=logging.DEBUG, + ) + + # Debug print. + ppp = pprint.PrettyPrinter(width=55) + prog_name = os.path.basename(inspect.stack()[0].filename) + + parser = argparse.ArgumentParser( + prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses + ) + + required = parser.add_argument_group("required arguments") + + required.add_argument( + "-csv", + dest="csv", + default=False, + required=True, + help="Absolute UNIX path to .csv or .csv.gz file which is generated " + + "\nby the `ncbitax2lin` tool.", + ) + parser.add_argument( + "-cols", + dest="lineage_cols", + default="tax_id,superkingdom,phylum,class,order,family,genus,species,strain", + required=False, + help="Taxonomic lineage will be built using these columns from the output of" + + "\n`ncbitax2lin` tool.", + ) + parser.add_argument( + "-url", + dest="url", + default="https://ftp.ncbi.nlm.nih.gov/refseq/release/mitochondrion", + required=False, + help="Base URL from where NCBI RefSeq mitochondrion files will be downloaded\nfrom.", + ) + parser.add_argument( + "-out", + dest="out_folder", + default=os.path.join(os.getcwd(), "organelles"), + required=False, + help="By default, the output is written to this folder.", + ) + parser.add_argument( + "-f", + dest="force_write_out", + default=False, + action="store_true", + required=False, + help="Force overwrite output directory contents.", + ) + parser.add_argument( + "--fna-suffix", + dest="fna_suffix", + default=".fna", + required=False, + help="Suffix of the individual organelle FASTA files that will be saved.", + ) + parser.add_argument( + "-ml", + dest="fa_min_len", + default=200, + required=False, + help="Minimum length of the FASTA sequence for it to be considered for" + + "\nfurther processing", + ) + parser.add_argument( + "--skip-per-fa", + dest="skip_per_fa", + default=False, + required=False, + action="store_true", + help="Do not generate per sequence FASTA file.", + ) + parser.add_argument( + "--alt-gb-parser", + dest="alt_gb_parser", + default=False, + required=False, + action="store_true", + help="Use alternate GenBank parser instead of BioPython's.", + ) + + # Parse defaults + args = parser.parse_args() + csv = args.csv + out = args.out_folder + overwrite = args.force_write_out + fna_suffix = args.fna_suffix + url = args.url + tax_cols = args.lineage_cols + skip_per_fa = args.skip_per_fa + alt_gb_parser = args.alt_gb_parser + min_len = int(args.fa_min_len) + tcols_pat = re.compile(r"^[\w\,]+?\w$") + mito_fna_suffix = re.compile(r".*?\.genomic\.fna") + mito_gbff_suffix = re.compile(r".*?\.genomic\.gbff") + final_lineages = os.path.join(out, "lineages.csv") + lineages_not_found = os.path.join(out, "lineages_not_found.csv") + base_fasta_dir = os.path.join(out, "fasta") + + # Basic checks + if not overwrite and os.path.exists(out): + logging.warning( + f"Output destination [{os.path.basename(out)}] already exists!" + + "\nPlease use -f to delete and overwrite." + ) + elif overwrite and os.path.exists(out): + logging.info(f"Overwrite requested. 
Deleting {os.path.basename(out)}...") + shutil.rmtree(out) + + if not tcols_pat.match(tax_cols): + logging.error( + f"Supplied columns' names {tax_cols} should only have words (alphanumeric) separated by a comma." + ) + exit(1) + else: + tax_cols = re.sub("\n", "", tax_cols).split(",") + + # Get .fna and .gbk files + fna = dl_mito_seqs_and_flat_files(url, mito_fna_suffix, out) + gbk = dl_mito_seqs_and_flat_files(url, mito_gbff_suffix, out) + + # Get taxonomy from ncbitax2lin + lineages, raw_recs = get_lineages(csv, tax_cols) + + # Get parsed organisms and latest accession from GenBank file. + if alt_gb_parser: + accs2orgs = from_genbank_alt(gbk) + else: + accs2orgs = from_genbank(gbk, min_len) + + # # Finally, read FASTA and create individual FASTA if lineage exists. + logging.info(f"Creating new sequences and lineages...") + + l_fh = open(final_lineages, "w") + ln_fh = open(lineages_not_found, "w") + l_fh.write( + "identifiers,superkingdom,phylum,class,order,family,genus,species,strain\n" + ) + ln_fh.write("fna_id,gbk_org\n") + passed_lookup = 0 + failed_lookup = 0 + gbk_recs_missing = 0 + skipped_len_short = 0 + + if not os.path.exists(base_fasta_dir): + os.makedirs(base_fasta_dir) + + for record in SeqIO.parse(fna, "fasta"): + if len(record.seq) < min_len: + skipped_len_short += 1 + continue + elif record.id in accs2orgs.keys(): + org_words = accs2orgs[record.id].split(" ") + else: + gbk_recs_missing += 1 + continue + + genus_species = ( + " ".join(org_words[0:2]) if len(org_words) > 2 else " ".join(org_words[0:]) + ) + + if not skip_per_fa: + write_fasta(record.seq, record.id, base_fasta_dir, fna_suffix) + + if record.id in accs2orgs.keys() and accs2orgs[record.id] in lineages.keys(): + l_fh.write(",".join([record.id, lineages[accs2orgs[record.id]]]) + "\n") + passed_lookup += 1 + elif record.id in accs2orgs.keys() and genus_species in lineages.keys(): + if len(org_words) > 2: + l_fh.write( + ",".join( + [ + record.id, + lineages[genus_species].rstrip(","), + accs2orgs[record.id], + ] + ) + + "\n" + ) + else: + l_fh.write(",".join([record.id, lineages[genus_species]]) + "\n") + passed_lookup += 1 + else: + if len(org_words) > 2: + l_fh.write( + ",".join( + [ + record.id, + "", + "", + "", + "", + "", + org_words[0], + org_words[0] + " " + org_words[1], + accs2orgs[record.id], + ] + ) + + "\n" + ) + else: + l_fh.write( + ",".join( + [ + record.id, + "", + "", + "", + "", + "", + org_words[0], + accs2orgs[record.id], + "", + ] + ) + + "\n" + ) + ln_fh.write(",".join([record.id, accs2orgs[record.id]]) + "\n") + failed_lookup += 1 + + logging.info( + f"No. of raw records present in `ncbitax2lin` [{os.path.basename(csv)}]: {raw_recs}" + + f"\nNo. of valid records collected from `ncbitax2lin` [{os.path.basename(csv)}]: {len(lineages.keys())}" + + f"\nNo. of sequences skipped (Sequence length < {min_len}): {skipped_len_short}" + + f"\nNo. of records in FASTA [{os.path.basename(fna)}]: {passed_lookup + failed_lookup}" + + f"\nNo. of records in GenBank [{os.path.basename(gbk)}]: {len(accs2orgs.keys())}" + + f"\nNo. of FASTA records for which new lineages were created: {passed_lookup}" + + f"\nNo. of FASTA records for which only genus, species and/or strain information were created: {failed_lookup}" + + f"\nNo. 
of FASTA records for which no GenBank records exist: {gbk_recs_missing}" + ) + + if (passed_lookup + failed_lookup) != len(accs2orgs.keys()): + logging.error( + f"The number of FASTA records written [{passed_lookup + failed_lookup}]" + + f"\nis not equal to the number of GenBank records [{len(accs2orgs.keys())}]!" + ) + exit(1) + else: + logging.info("Successfully created lineages and FASTA records! Done!!") + + l_fh.close() + ln_fh.close() + + +if __name__ == "__main__": + main()
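As a rough guide to running `bin/create_fasta_and_lineages.py` above, here is a hedged invocation sketch; the `ncbitax2lin` CSV path is a placeholder, and the run downloads the RefSeq mitochondrion release from NCBI, so it needs network access.

```python
# Hedged example: drive create_fasta_and_lineages.py with its documented options.
# "ncbi_lineages.csv" stands in for the real ncbitax2lin output file.
import subprocess

subprocess.run(
    [
        "python3", "create_fasta_and_lineages.py",
        "-csv", "ncbi_lineages.csv",  # required: ncbitax2lin output
        "-out", "organelles",         # output folder (default shown)
        "-ml", "200",                 # minimum sequence length (default shown)
    ],
    check=True,
)
# Expected outputs under ./organelles: per-sequence FASTA files in fasta/,
# plus lineages.csv and lineages_not_found.csv.
```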
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/create_mqc_data_table.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,153 @@ +#!/usr/bin/env python + +import os +import sys +from textwrap import dedent + +import yaml + + +def main(): + """ + Takes a tab-delimited text file with a mandatory header + column and generates an HTML table. + """ + + args = sys.argv + if len(args) < 2 or len(args) >= 4: + print( + f"\nAt least one argument specifying the *.tblsum file is required.\n" + + "No more than 2 command-line arguments should be passed.\n" + ) + exit(1) + + table_sum_on = str(args[1]).lower() + table_sum_on_file = table_sum_on + ".tblsum.txt" + cell_colors = f"{table_sum_on}.cellcolors.yml" + + if len(args) == 3: + description = str(args[2]) + else: + description = "The results table shown here is a collection from all samples." + + if os.path.exists(cell_colors) and os.path.getsize(cell_colors) > 0: + with open(cell_colors, "r") as cc_yml: + cell_colors = yaml.safe_load(cc_yml) + else: + cell_colors = dict() + + if not ( + os.path.exists(table_sum_on_file) and os.path.getsize(table_sum_on_file) > 0 + ): + exit(0) + + with open(table_sum_on_file, "r") as tbl: + header = tbl.readline() + header_cols = header.strip().split("\t") + + html = [ + dedent( + f"""<script type="text/javascript"> + $(document).ready(function () {{ + $('#cpipes-process-custom-res-{table_sum_on}').DataTable({{ + scrollX: true, + fixedColumns: true, dom: 'Bfrtip', + buttons: [ + 'copy', + {{ + extend: 'print', + title: 'CPIPES: MultiQC Report: {table_sum_on}' + }}, + {{ + extend: 'excel', + filename: '{table_sum_on}_results', + }}, + {{ + extend: 'csv', + filename: '{table_sum_on}_results', + }} + ] + }}); + }}); + </script> + <div class="table-responsive"> + <style> + #cpipes-process-custom-res tr:nth-child(even) {{ + background-color: #f2f2f2; + }} + </style> + <table class="table" style="width:100%" id="cpipes-process-custom-res-{table_sum_on}"> + <thead> + <tr>""" + ) + ] + + for header_col in header_cols: + html.append( + dedent( + f""" + <th> {header_col} </th>""" + ) + ) + + html.append( + dedent( + """ + </tr> + </thead> + <tbody>""" + ) + ) + + for row in tbl: + html.append("<tr>\n") + data_cols = row.strip().split("\t") + if len(header_cols) != len(data_cols): + print( + f"\nWARN: Number of header columns ({len(header_cols)}) and data " + + f"columns ({len(data_cols)}) are not equal!\nWill append empty columns!\n" + ) + if len(header_cols) > len(data_cols): + data_cols += (len(header_cols) - len(data_cols)) * " " + print(len(data_cols)) + else: + header_cols += (len(data_cols) - len(header_cols)) * " " + + html.append( + dedent( + f""" + <td><samp>{data_cols[0]}</samp></td> + """ + ) + ) + + for data_col in data_cols[1:]: + data_col_w_color = f"""<td>{data_col}</td> + """ + if ( + table_sum_on in cell_colors.keys() + and data_col in cell_colors[table_sum_on].keys() + ): + data_col_w_color = f"""<td style="background-color: {cell_colors[table_sum_on][data_col]}">{data_col}</td> + """ + html.append(dedent(data_col_w_color)) + html.append("</tr>\n") + html.append("</tbody>\n") + html.append("</table>\n") + html.append("</div>\n") + + mqc_yaml = { + "id": f"{table_sum_on.upper()}_collated_table", + "section_name": f"{table_sum_on.upper()}", + "section_href": f"https://github.com/CFSAN-Biostatistics/nowayout", + "plot_type": "html", + "description": f"{description}", + "data": ("").join(html), + } + + with open(f"{table_sum_on.lower()}_mqc.yml", "w") as html_mqc: + yaml.dump(mqc_yaml, html_mqc, 
default_flow_style=False) + + +if __name__ == "__main__": + main()
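A minimal sketch of how `bin/create_mqc_data_table.py` above might be invoked (names are illustrative); it expects a `<name>.tblsum.txt` tab-delimited file in the working directory and emits `<name>_mqc.yml` for MultiQC custom content.

```python
# Hedged example: build a tiny tab-delimited summary and convert it to a
# MultiQC custom-content YAML. The table name and contents are made up.
import subprocess
from pathlib import Path

Path("demo.tblsum.txt").write_text(
    "Sample\tReads classified\n"
    "S1\t12345\n"
)

subprocess.run(
    ["python3", "create_mqc_data_table.py", "demo", "Example description."],
    check=True,
)
# Produces demo_mqc.yml wrapping the HTML DataTable generated by the script above.
```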
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/fasta_join.pl Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,88 @@ +#!/usr/bin/env perl + +# Kranti Konganti +# Takes in a gzipped multi-fasta file +# and joins contigs by 10 N's + +use strict; +use warnings; +use Cwd; +use Bio::SeqIO; +use Getopt::Long; +use File::Find; +use File::Basename; +use File::Spec::Functions; + +my ( $in_dir, $out_dir, $suffix, @uncatted_genomes ); + +GetOptions( + 'in_dir=s' => \$in_dir, + 'out_dir=s' => \$out_dir, + 'suffix=s' => \$suffix +) or die usage(); + +$in_dir = getcwd if ( !defined $in_dir ); +$out_dir = getcwd if ( !defined $out_dir ); +$suffix = '_genomic.fna.gz' if ( !defined $suffix ); + +find( + { + wanted => sub { + push @uncatted_genomes, $File::Find::name if ( $_ =~ m/$suffix$/ ); + } + }, + $in_dir +); + +if ( $out_dir ne getcwd && !-d $out_dir ) { + mkdir $out_dir || die "\nCannot create directory $out_dir: $!\n\n"; +} + +open( my $geno_path, '>genome_paths.txt' ) + || die "\nCannot open file genome_paths.txt: $!\n\n"; + +foreach my $uncatted_genome_path (@uncatted_genomes) { + my $catted_genome_header = '>' . basename( $uncatted_genome_path, $suffix ); + $catted_genome_header =~ s/(GC[AF]\_\d+\.\d+)\_*.*/$1/; + + my $catted_genome = + catfile( $out_dir, $catted_genome_header . '_scaffolded' . $suffix ); + + $catted_genome =~ s/\/\>(GC[AF])/\/$1/; + + print $geno_path "$catted_genome\n"; + + open( my $fh, "gunzip -c $uncatted_genome_path |" ) + || die "\nCannot create pipe for $uncatted_genome_path: $!\n\n"; + + open( my $fho, '|-', "gzip -c > $catted_genome" ) + || die "\nCannot pipe to gzip: $!\n\n"; + + my $seq_obj = Bio::SeqIO->new( + -fh => $fh, + -format => 'Fasta' + ); + + my $joined_seq = ''; + while ( my $seq = $seq_obj->next_seq ) { + $joined_seq = $joined_seq . 'NNNNNNNNNN' . $seq->seq; + } + + $joined_seq =~ s/NNNNNNNNNN$//; + $joined_seq =~ s/^NNNNNNNNNN//; + + # $joined_seq =~ s/.{80}\K/\n/g; + # $joined_seq =~ s/\n$//; + print $fho $catted_genome_header, "\n", $joined_seq, "\n"; + + $seq_obj->close(); + close $fh; + close $fho; +} + +sub usage { + print +"\nUsage: $0 [-in IN_DIR] [-ou OUT_DIR] [-su Filename Suffix for Header]\n\n"; + exit; +} +
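For `bin/fasta_join.pl` above, a hedged driver sketch follows (directory names are hypothetical); the Perl script itself needs BioPerl's Bio::SeqIO available.

```python
# Hedged example: scaffold every *_genomic.fna.gz under ./genomes into a single
# joined record (contigs separated by 10 N's) using fasta_join.pl. Paths are
# illustrative only.
import subprocess

subprocess.run(
    [
        "perl", "fasta_join.pl",
        "-in_dir", "genomes",
        "-out_dir", "scaffolded",
        "-suffix", "_genomic.fna.gz",
    ],
    check=True,
)
# Writes one *_scaffolded_genomic.fna.gz per input genome into ./scaffolded and
# records the output paths in genome_paths.txt.
```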
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/fastq_dir_to_samplesheet.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,177 @@ +#!/usr/bin/env python3 + +import os +import sys +import glob +import argparse +import re + + +def parse_args(args=None): + Description = "Generate samplesheet from a directory of FastQ files." + Epilog = "Example usage: python fastq_dir_to_samplesheet.py <FASTQ_DIR> <SAMPLESHEET_FILE>" + + parser = argparse.ArgumentParser(description=Description, epilog=Epilog) + parser.add_argument("FASTQ_DIR", help="Folder containing raw FastQ files.") + parser.add_argument("SAMPLESHEET_FILE", help="Output samplesheet file.") + parser.add_argument( + "-st", + "--strandedness", + type=str, + dest="STRANDEDNESS", + default="unstranded", + help="Value for 'strandedness' in samplesheet. Must be one of 'unstranded', 'forward', 'reverse'.", + ) + parser.add_argument( + "-r1", + "--read1_extension", + type=str, + dest="READ1_EXTENSION", + default="_R1_001.fastq.gz", + help="File extension for read 1.", + ) + parser.add_argument( + "-r2", + "--read2_extension", + type=str, + dest="READ2_EXTENSION", + default="_R2_001.fastq.gz", + help="File extension for read 2.", + ) + parser.add_argument( + "-se", + "--single_end", + dest="SINGLE_END", + action="store_true", + help="Single-end information will be auto-detected but this option forces paired-end FastQ files to be treated as single-end so only read 1 information is included in the samplesheet.", + ) + parser.add_argument( + "-sn", + "--sanitise_name", + dest="SANITISE_NAME", + action="store_true", + help="Whether to further sanitise FastQ file name to get sample id. Used in conjunction with --sanitise_name_delimiter and --sanitise_name_index.", + ) + parser.add_argument( + "-sd", + "--sanitise_name_delimiter", + type=str, + dest="SANITISE_NAME_DELIMITER", + default="_", + help="Delimiter to use to sanitise sample name.", + ) + parser.add_argument( + "-si", + "--sanitise_name_index", + type=int, + dest="SANITISE_NAME_INDEX", + default=1, + help="After splitting FastQ file name by --sanitise_name_delimiter all elements before this index (1-based) will be joined to create final sample name.", + ) + return parser.parse_args(args) + + +def fastq_dir_to_samplesheet( + fastq_dir, + samplesheet_file, + strandedness="unstranded", + read1_extension="_R1_001.fastq.gz", + read2_extension="_R2_001.fastq.gz", + single_end=False, + sanitise_name=False, + sanitise_name_delimiter="_", + sanitise_name_index=1, +): + def sanitize_sample(path, extension): + """Retrieve sample id from filename""" + sample = os.path.basename(path).replace(extension, "") + if sanitise_name: + if sanitise_name_index > 0: + sample = sanitise_name_delimiter.join( + os.path.basename(path).split(sanitise_name_delimiter)[ + :sanitise_name_index + ] + ) + # elif sanitise_name_index == -1: + # sample = os.path.basename(path)[ :os.path.basename(path).index('.') ] + return sample + + def get_fastqs(extension): + """ + Needs to be sorted to ensure R1 and R2 are in the same order + when merging technical replicates. Glob is not guaranteed to produce + sorted results. 
+ See also https://stackoverflow.com/questions/6773584/how-is-pythons-glob-glob-ordered + """ + abs_fq_files = glob.glob(os.path.join(fastq_dir, f"**", f"*{extension}"), recursive=True) + return sorted( + [ + fq for _, fq in enumerate(abs_fq_files) if re.match('^((?!undetermined|unclassified|downloads).)*$', fq, flags=re.IGNORECASE) + ] + ) + + read_dict = {} + + ## Get read 1 files + for read1_file in get_fastqs(read1_extension): + sample = sanitize_sample(read1_file, read1_extension) + if sample not in read_dict: + read_dict[sample] = {"R1": [], "R2": []} + read_dict[sample]["R1"].append(read1_file) + + ## Get read 2 files + if not single_end: + for read2_file in get_fastqs(read2_extension): + sample = sanitize_sample(read2_file, read2_extension) + read_dict[sample]["R2"].append(read2_file) + + ## Write to file + if len(read_dict) > 0: + out_dir = os.path.dirname(samplesheet_file) + if out_dir and not os.path.exists(out_dir): + os.makedirs(out_dir) + + with open(samplesheet_file, "w") as fout: + header = ["sample", "fq1", "fq2", "strandedness"] + fout.write(",".join(header) + "\n") + for sample, reads in sorted(read_dict.items()): + for idx, read_1 in enumerate(reads["R1"]): + read_2 = "" + if idx < len(reads["R2"]): + read_2 = reads["R2"][idx] + sample_info = ",".join([sample, read_1, read_2, strandedness]) + fout.write(f"{sample_info}\n") + else: + error_str = ( + "\nWARNING: No FastQ files found so samplesheet has not been created!\n\n" + ) + error_str += "Please check the values provided for the:\n" + error_str += " - Path to the directory containing the FastQ files\n" + error_str += " - '--read1_extension' parameter\n" + error_str += " - '--read2_extension' parameter\n" + print(error_str) + sys.exit(1) + + +def main(args=None): + args = parse_args(args) + + strandedness = "unstranded" + if args.STRANDEDNESS in ["unstranded", "forward", "reverse"]: + strandedness = args.STRANDEDNESS + + fastq_dir_to_samplesheet( + fastq_dir=args.FASTQ_DIR, + samplesheet_file=args.SAMPLESHEET_FILE, + strandedness=strandedness, + read1_extension=args.READ1_EXTENSION, + read2_extension=args.READ2_EXTENSION, + single_end=args.SINGLE_END, + sanitise_name=args.SANITISE_NAME, + sanitise_name_delimiter=args.SANITISE_NAME_DELIMITER, + sanitise_name_index=args.SANITISE_NAME_INDEX, + ) + + +if __name__ == "__main__": + sys.exit(main())
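The sample id written to the samplesheet is derived from the FastQ file name: strip the read extension, and when `--sanitise_name` is given keep only the first `--sanitise_name_index` fields of the `--sanitise_name_delimiter`-separated name. A stand-alone restatement of that rule, with a made-up file name:

```python
# Stand-alone restatement of the sample-id sanitisation rule above.
import os


def sample_id(path, extension="_R1_001.fastq.gz",
              sanitise=True, delimiter="_", index=1):
    sample = os.path.basename(path).replace(extension, "")
    if sanitise and index > 0:
        sample = delimiter.join(os.path.basename(path).split(delimiter)[:index])
    return sample


# Hypothetical file name, for illustration only.
print(sample_id("/data/run1/sampleA_S1_L001_R1_001.fastq.gz"))  # sampleA
```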
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/gen_otf_genome.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,227 @@ +#!/usr/bin/env python3 + +# Kranti Konganti + +import argparse +import glob +import gzip +import inspect +import logging +import os +import pprint +import re + +# Set logging. +logging.basicConfig( + format="\n" + + "=" * 55 + + "\n%(asctime)s - %(levelname)s\n" + + "=" * 55 + + "\n%(message)s\n\n", + level=logging.DEBUG, +) + +# Debug print. +ppp = pprint.PrettyPrinter(width=50, indent=4) + + +# Multiple inheritence for pretty printing of help text. +class MultiArgFormatClasses( + argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter +): + pass + + +def main() -> None: + """ + This script works only in the context of a Nextflow workflow. + It takes: + 1. A text file containing accessions or FASTA IDs, one per line and + then, + 2. Optionally, searches for a genome FASTA file in gzipped format in specified + search path, where the prefix of the filename is the accession or + FASTA ID from 1. and then, creates a new concatenated gzipped genome FASTA + file with all the genomes in the text file from 1. + 3. Creates a new FASTQ file with reads aligned to the accessions in the text + file from 1. + """ + + prog_name = os.path.basename(inspect.stack()[0].filename) + + parser = argparse.ArgumentParser( + prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses + ) + + required = parser.add_argument_group("required arguments") + + required.add_argument( + "-txt", + dest="accs_txt", + default=False, + required=True, + help="Absolute UNIX path to .txt file containing accessions\n" + + "FASTA IDs, one per line.", + ) + required.add_argument( + "-op", + dest="out_prefix", + default="CATTED_GENOMES", + required=True, + help="Set the output file prefix for .fna.gz and .txt\n" + "files.", + ) + parser.add_argument( + "-gd", + dest="genomes_dir", + default=False, + required=False, + help="Absolute UNIX path to a directory containing\n" + + "gzipped genome FASTA files or a file.\n", + ) + parser.add_argument( + "-gds", + dest="genomes_dir_suffix", + default="_scaffolded_genomic.fna.gz", + required=False, + help="Genome FASTA file suffix to search for\nin the directory mentioned using\n-gd.", + ) + parser.add_argument( + "-query", + dest="id_is_query", + default=False, + action="store_true", + required=False, + help="In the produced FASTQ file, should the FASTA ID should be of KMA query ID\n" + + "or template ID.", + ) + parser.add_argument( + "-txts", + dest="accs_suffix", + default="_template_hits.txt", + required=False, + help="The suffix of the file supplied with -txt option. 
It is assumed that the\n" + + "sample name is present in the file supplied with -txt option and the suffix\n" + + "will be stripped and stored in a file that logs samples which have no hits.", + ) + parser.add_argument( + "-frag_delim", + dest="frag_delim", + default="\t", + required=False, + help="The delimitor by which the fields are separated in *_frag.gz file.", + ) + + args = parser.parse_args() + accs_txt = args.accs_txt + genomes_dir = args.genomes_dir + genomes_dir_suffix = args.genomes_dir_suffix + id_is_query = args.id_is_query + out_prefix = args.out_prefix + accs_suffix = args.accs_suffix + frag_delim = args.frag_delim + accs_seen = dict() + cat_genomes_gz = os.path.join(os.getcwd(), out_prefix + "_" + genomes_dir_suffix) + cat_genomes_gz = re.sub("__", "_", str(cat_genomes_gz)) + frags_gz = os.path.join(os.getcwd(), out_prefix + ".frag.gz") + cat_reads_gz = os.path.join(os.getcwd(), out_prefix + "_aln_reads.fna.gz") + cat_reads_gz = re.sub("__", "_", cat_reads_gz) + + if ( + accs_txt + and os.path.exists(cat_genomes_gz) + and os.path.getsize(cat_genomes_gz) > 0 + ): + logging.error( + "A concatenated genome FASTA file,\n" + + f"{os.path.basename(cat_genomes_gz)} already exists in:\n" + + f"{os.getcwd()}\n" + + "Please remove or move it as we will not " + + "overwrite it." + ) + exit(1) + + if accs_txt and (not os.path.exists(accs_txt) or not os.path.getsize(accs_txt) > 0): + logging.error("File,\n" + f"{accs_txt}\ndoes not exist " + "or is empty!") + failed_sample_name = re.sub(accs_suffix, "", os.path.basename(accs_txt)) + with open( + os.path.join(os.getcwd(), "_".join([out_prefix, "FAILED.txt"])), "w" + ) as failed_sample_fh: + failed_sample_fh.write(f"{failed_sample_name}\n") + failed_sample_fh.close() + exit(0) + + # ppp.pprint(mash_hits) + empty_lines = 0 + empty_lines_msg = "" + + with open(accs_txt, "r") as accs_txt_fh: + for line in accs_txt_fh: + if line in ["\n", "\n\r"]: + empty_lines += 1 + continue + else: + line = line.strip() + + if line in accs_seen.keys(): + continue + else: + accs_seen[line] = 1 + accs_txt_fh.close() + + if genomes_dir: + if not os.path.isdir(genomes_dir): + logging.error("UNIX path\n" + f"{genomes_dir}\n" + "does not exist!") + exit(1) + if len(glob.glob(os.path.join(genomes_dir, "*" + genomes_dir_suffix))) <= 0: + logging.error( + "Genomes directory" + + f"{genomes_dir}" + + "\ndoes not seem to have any\n" + + f"files ending with suffix: {genomes_dir_suffix}" + ) + exit(1) + + with open(cat_genomes_gz, "wb") as genomes_out_gz: + for line in accs_seen.keys(): + genome_file = os.path.join(genomes_dir, line + genomes_dir_suffix) + + if not os.path.exists(genome_file) or os.path.getsize(genome_file) <= 0: + logging.error( + f"Genome file {os.path.basename(genome_file)} does not\n" + + "exits or is empty!" 
+ ) + exit(1) + else: + with open(genome_file, "rb") as genome_file_h: + genomes_out_gz.writelines(genome_file_h.readlines()) + genome_file_h.close() + genomes_out_gz.close() + + if ( + len(accs_seen.keys()) > 0 + and os.path.exists(frags_gz) + and os.path.getsize(frags_gz) > 0 + ): + with gzip.open( + cat_reads_gz, "wt", encoding="utf-8", compresslevel=6 + ) as cat_reads_gz_fh: + with gzip.open(frags_gz, "rb", compresslevel=6) as fragz_gz_fh: + fasta_id = 7 if id_is_query else 6 + for frag_line in fragz_gz_fh: + frag_lines = frag_line.decode("utf-8").strip().split(frag_delim) + # Per KMA specification, 6=template, 7=query, 1=read + cat_reads_gz_fh.write(f">{frag_lines[fasta_id]}\n{frag_lines[0]}\n") + fragz_gz_fh.close() + cat_reads_gz_fh.close() + + if empty_lines > 0: + empty_lines_msg = f"Skipped {empty_lines} empty line(s).\n" + + logging.info( + empty_lines_msg + + f"File {os.path.basename(cat_genomes_gz)}\n" + + f"written in:\n{os.getcwd()}\nDone! Bye!" + ) + + +if __name__ == "__main__": + main()
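The read-extraction step above streams the KMA `*.frag.gz` file and writes one FASTA record per aligned fragment, taking the header from the template column (or the query column with `-query`) and the sequence from the first column, per the field layout noted in the script's comment. A minimal stand-alone sketch of just that step (file names are hypothetical):

```python
# Sketch of the read-extraction step: stream a KMA *.frag.gz file and
# emit one FASTA record per fragment. Field layout (0-based) follows
# the script's comment: 0 = read sequence, 6 = template ID, 7 = query ID.
import gzip


def frags_to_fasta(frag_gz, out_fna_gz, id_is_query=False, delim="\t"):
    fasta_id_col = 7 if id_is_query else 6
    with gzip.open(frag_gz, "rt") as fin, gzip.open(out_fna_gz, "wt") as fout:
        for line in fin:
            fields = line.strip().split(delim)
            fout.write(f">{fields[fasta_id_col]}\n{fields[0]}\n")


# Hypothetical file names, for illustration only.
# frags_to_fasta("sampleA.frag.gz", "sampleA_aln_reads.fna.gz")
```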
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/gen_per_species_fa_from_bold.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,437 @@ +#!/usr/bin/env python3 + +import argparse +import gzip +import inspect +import logging +import os +import pprint +import re +import shutil +from collections import defaultdict +from typing import BinaryIO, TextIO, Union + +from Bio import SeqIO +from Bio.Seq import Seq +from Bio.SeqRecord import SeqRecord + + +# Multiple inheritence for pretty printing of help text. +class MultiArgFormatClasses( + argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter +): + pass + + +def get_lineages(csv: os.PathLike, cols: list) -> list: + """ + Parse the output from `ncbitax2lin` tool and + return a dict of lineages where the key is + genusspeciesstrain. + """ + lineages = dict() + if csv == None or not (os.path.exists(csv) or os.path.getsize(csv) > 0): + logging.error( + f"The CSV file [{os.path.basename(csv)}] is empty or does not exist!" + ) + exit(1) + + logging.info(f"Indexing {os.path.basename(csv)}...") + + with open(csv, "r") as csv_fh: + header_cols = csv_fh.readline().strip().split(",") + user_req_cols = [ + tcol_i for tcol_i, tcol in enumerate(header_cols) if tcol in cols + ] + cols_not_found = [tcol for tcol in cols if tcol not in header_cols] + raw_recs = 0 + + if len(cols_not_found) > 0: + logging.error( + f"The following columns do not exist in the" + + f"\nCSV file [ {os.path.basename(csv)} ]:\n" + + "".join(cols_not_found) + ) + exit(1) + elif len(user_req_cols) > 9: + logging.error( + f"Only a total of 9 columns are needed!" + + "\ntax_id,kindom,phylum,class,order,family,genus,species,strain" + ) + exit(1) + + for tax in csv_fh: + raw_recs += 1 + lcols = tax.strip().split(",") + + if bool(lcols[user_req_cols[8]]): + lineages[lcols[user_req_cols[8]]] = ",".join( + [lcols[l] for l in user_req_cols[1:]] + ) + elif bool(lcols[user_req_cols[7]]): + lineages[lcols[user_req_cols[7]]] = ",".join( + [lcols[l] for l in user_req_cols[1:8]] + [str()] + ) + + csv_fh.close() + return lineages, raw_recs + + +def write_fasta(recs: list, basedir: os.PathLike, name: str, suffix: str) -> None: + """ + Write sequence with no description to a specified file. + """ + SeqIO.write( + recs, + os.path.join(basedir, name + suffix), + "fasta", + ) + + +def check_and_get_cols(pat: re, cols: str, delim: str) -> list: + """ + Check if header column matches the pattern and return + columns. + """ + if not pat.match(cols): + logging.error( + f"Supplied columns' names {cols} should only have words" + f"\n(alphanumeric) separated by: {delim}." + ) + exit(1) + else: + cols = re.sub("\n", "", cols).split(delim) + + return cols + + +def parse_tsv(fh: Union[TextIO, BinaryIO], tcols: list, delim: str) -> list: + """ + Parse the TSV file and produce the required per + species FASTA's. 
+ """ + records, sp2accs = (defaultdict(list), defaultdict(list)) + header = fh.readline().strip().split(delim) + raw_recs = 0 + + if not all(col in header for col in tcols): + logging.error( + "The following columns were not found in the" + + f"\nheader row of file {os.path.basename(fh.name)}\n" + + "\n".join([ele for ele in tcols if ele not in header]) + ) + + id_i, genus_i, species_i, strain_i, seq_i = [ + i for i, ele in enumerate(header) if ele in tcols + ] + + for record in fh: + raw_recs += 1 + + id = record.strip().split(delim)[id_i] + genus = record.strip().split(delim)[genus_i] + species = re.sub(r"[\/\\]+", "-", record.strip().split(delim)[species_i]) + strain = record.strip().split(delim)[strain_i] + seq = re.sub(r"[^ATGC]+", "", record.strip().split(delim)[seq_i], re.IGNORECASE) + + if re.match(r"None|Null", species, re.IGNORECASE): + continue + + # print(id) + # print(genus) + # print(species) + # print(strain) + # print(seq) + + records.setdefault(species, []).append( + SeqRecord(Seq(seq), id=id, description=str()) + ) + sp2accs.setdefault(species, []).append(id) + + logging.info(f"Collected FASTA records for {len(records.keys())} species'.") + fh.close() + return records, sp2accs, raw_recs + + +# Main +def main() -> None: + """ + This script takes: + 1. The TSV file from BOLD systems, + 2. Takes as input a .csv file generated by `ncbitax2lin`. + + and then generates a folder containing individual FASTA sequence files + per species. This is only possible if the full taxonomy of the barcode + sequence is present in the FASTA header. + """ + + # Set logging. + logging.basicConfig( + format="\n" + + "=" * 55 + + "\n%(asctime)s - %(levelname)s\n" + + "=" * 55 + + "\n%(message)s\r\r", + level=logging.DEBUG, + ) + + # Debug print. + ppp = pprint.PrettyPrinter(width=55) + prog_name = os.path.basename(inspect.stack()[0].filename) + + parser = argparse.ArgumentParser( + prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses + ) + + required = parser.add_argument_group("required arguments") + + required.add_argument( + "-tsv", + dest="tsv", + default=False, + required=True, + help="Absolute UNIX path to the TSV file from BOLD systems" + + "\nin uncompressed TXT format.", + ) + required.add_argument( + "-csv", + dest="csv", + default=False, + required=True, + help="Absolute UNIX path to .csv or .csv.gz file which is generated " + + "\nby the `ncbitax2lin` tool.", + ) + parser.add_argument( + "-out", + dest="out_folder", + default=os.path.join(os.getcwd(), "species"), + required=False, + help="By default, the output is written to this\nfolder.", + ) + parser.add_argument( + "-f", + dest="force_write_out", + default=False, + action="store_true", + required=False, + help="Force overwrite output directory contents.", + ) + parser.add_argument( + "-suffix", + dest="fna_suffix", + default=".fna", + required=False, + help="Suffix of the individual species FASTA files\nthat will be saved.", + ) + parser.add_argument( + "-ccols", + dest="csv_cols", + default="tax_id,superkingdom,phylum,class,order,family,genus,species,strain", + required=False, + help="Taxonomic lineage will be built using these columns from the output of" + + "\n`ncbitax2lin`\ntool.", + ) + parser.add_argument( + "-ccols-sep", + dest="csv_delim", + default=",", + required=False, + help="The delimitor of the fields in the CSV file.", + ) + parser.add_argument( + "-tcols", + dest="tsv_cols", + default="processid\tgenus\tspecies\tsubspecies\tnucraw", + required=False, + help="For each species, the 
nucletide sequences will be\naggregated.", + ) + parser.add_argument( + "-tcols-sep", + dest="tsv_delim", + default="\t", + required=False, + help="The delimitor of the fields in the TSV file.", + ) + + # Parse defaults + args = parser.parse_args() + tsv = args.tsv + csv = args.csv + csep = args.csv_delim + tsep = args.tsv_delim + csv_cols = args.csv_cols + tsv_cols = args.tsv_cols + out = args.out_folder + overwrite = args.force_write_out + fna_suffix = args.fna_suffix + ccols_pat = re.compile(f"^[\w\{csep}]+?\w$") + tcols_pat = re.compile(f"^[\w\{tsep}]+?\w$") + final_lineages = os.path.join(out, "lineages.csv") + lineages_not_found = os.path.join(out, "lineages_not_found.csv") + base_fasta_dir = os.path.join(out, "fasta") + + # Basic checks + if not overwrite and os.path.exists(out): + logging.warning( + f"Output destination [{os.path.basename(out)}] already exists!" + + "\nPlease use -f to delete and overwrite." + ) + elif overwrite and os.path.exists(out): + logging.info(f"Overwrite requested. Deleting {os.path.basename(out)}...") + shutil.rmtree(out) + + # Validate user requested columns + passed_ccols = check_and_get_cols(ccols_pat, csv_cols, csep) + passed_tcols = check_and_get_cols(tcols_pat, tsv_cols, tsep) + + # Get taxonomy from ncbitax2lin + lineages, raw_recs = get_lineages(csv, passed_ccols) + + # Finally, read BOLD tsv if lineage exists. + logging.info(f"Creating new squences per species...") + + if not os.path.exists(out): + os.makedirs(out) + + try: + gz_fh = gzip.open(tsv, "rt") + records, sp2accs, traw_recs = parse_tsv(gz_fh, passed_tcols, tsep) + except gzip.BadGzipFile: + logging.info(f"Input TSV file {os.path.basename(tsv)} is not in\nGZIP format.") + txt_fh = open(tsv, "r") + records, sp2accs, traw_recs = parse_tsv(txt_fh, passed_tcols, tsep) + + passed_tax_check = 0 + failed_tax_check = 0 + fasta_recs_written = 0 + l_fh = open(final_lineages, "w") + ln_fh = open(lineages_not_found, "w") + l_fh.write( + "identifiers,superkingdom,phylum,class,order,family,genus,species,strain\n" + ) + ln_fh.write("fna_id,parsed_org\n") + + if not os.path.exists(base_fasta_dir): + os.makedirs(base_fasta_dir) + + for genus_species in records.keys(): + fasta_recs_written += len(records[genus_species]) + write_fasta( + records[genus_species], + base_fasta_dir, + "_".join(genus_species.split(" ")), + fna_suffix, + ) + org_words = genus_species.split(" ") + + for id in sp2accs[genus_species]: + if genus_species in lineages.keys(): + this_line = ",".join([id, lineages[genus_species]]) + "\n" + + if len(org_words) > 2: + this_line = ( + ",".join( + [id, lineages[genus_species].rstrip(","), genus_species] + ) + + "\n" + ) + + l_fh.write(this_line) + passed_tax_check += 1 + else: + this_line = ( + ",".join( + [ + id, + "", + "", + "", + "", + "", + org_words[0], + genus_species, + "", + ] + ) + + "\n" + ) + if len(org_words) > 2: + this_line = ( + ",".join( + [ + id, + "", + "", + "", + "", + "", + org_words[0], + org_words[0] + " " + org_words[1], + genus_species, + ] + ) + + "\n" + ) + l_fh.write(this_line) + ln_fh.write(",".join([id, genus_species]) + "\n") + failed_tax_check += 1 + + logging.info( + f"No. of raw records present in `ncbitax2lin` [{os.path.basename(csv)}]: {raw_recs}" + + f"\nNo. of valid records collected from `ncbitax2lin` [{os.path.basename(csv)}]: {len(lineages.keys())}" + + f"\nNo. of raw records in TSV [{os.path.basename(tsv)}]: {traw_recs}" + + f"\nNo. of valid records in TSV [{os.path.basename(tsv)}]: {passed_tax_check + failed_tax_check}" + + f"\nNo. 
of FASTA records for which new lineages were created: {passed_tax_check}" + + f"\nNo. of FASTA records for which only genus, species and/or strain information were created: {failed_tax_check}" + ) + + if (passed_tax_check + failed_tax_check) != fasta_recs_written: + logging.error( + f"The number of input FASTA records [{fasta_recs_written}]" + + f"\nis not equal to number of lineages created [{passed_tax_check + failed_tax_check}]!" + ) + exit(1) + else: + logging.info("Succesfully created lineages and FASTA records! Done!!") + + +if __name__ == "__main__": + main() + +# ~/apps/nowayout/bin/gen_per_species_fa_from_bold.py -tsv BOLD_Public.05-Feb-2024.tsv -csv ../tax.csv ─╯ + +# ======================================================= +# 2024-02-08 21:37:28,541 - INFO +# ======================================================= +# Indexing tax.csv... + +# ======================================================= +# 2024-02-08 21:38:06,567 - INFO +# ======================================================= +# Creating new squences per species... + +# ======================================================= +# 2024-02-08 21:38:06,572 - INFO +# ======================================================= +# Input TSV file BOLD_Public.05-Feb-2024.tsv is not in +# GZIP format. + +# ======================================================= +# 2024-02-08 22:01:04,554 - INFO +# ======================================================= +# Collected FASTA records for 497421 species'. + +# ======================================================= +# 2024-02-08 22:24:35,000 - INFO +# ======================================================= +# No. of raw records present in `ncbitax2lin` [tax.csv]: 2550767 +# No. of valid records collected from `ncbitax2lin` [tax.csv]: 2134980 +# No. of raw records in TSV [BOLD_Public.05-Feb-2024.tsv]: 9735210 +# No. of valid records in TSV [BOLD_Public.05-Feb-2024.tsv]: 4988323 +# No. of FASTA records for which new lineages were created: 4069202 +# No. of FASTA records for which only genus, species and/or strain information were created: 919121 + +# ======================================================= +# 2024-02-08 22:24:35,001 - INFO +# ======================================================= +# Succesfully created lineages and FASTA records! Done!!
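At its core the script groups BOLD barcode records by their species string and writes one FASTA file per species. A minimal stand-alone version of that grouping, using Biopython as the script itself does; the records below are invented:

```python
# Minimal sketch of the per-species grouping: bucket SeqRecords by
# species name and write one FASTA file per bucket.
import os
from collections import defaultdict

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord


def write_per_species(records, out_dir, suffix=".fna"):
    """records: iterable of (accession, species, sequence) tuples."""
    buckets = defaultdict(list)
    for acc, species, seq in records:
        buckets[species].append(SeqRecord(Seq(seq), id=acc, description=""))
    os.makedirs(out_dir, exist_ok=True)
    for species, recs in buckets.items():
        out_fna = os.path.join(out_dir, "_".join(species.split()) + suffix)
        SeqIO.write(recs, out_fna, "fasta")


# Invented records, for illustration only.
write_per_species(
    [("AB000001.1", "Apis mellifera", "ATGCATGC"),
     ("AB000002.1", "Apis mellifera", "GGCCATAT"),
     ("AB000003.1", "Drosophila melanogaster", "TTAACCGG")],
    "species",
)
```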
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/gen_per_species_fa_from_lin.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,248 @@ +#!/usr/bin/env python3 + +import argparse +import gzip +import inspect +import logging +import os +import pprint +import re +import shutil +from collections import defaultdict +from typing import BinaryIO, TextIO, Union + +from Bio import SeqIO +from Bio.Seq import Seq +from Bio.SeqRecord import SeqRecord + + +# Multiple inheritence for pretty printing of help text. +class MultiArgFormatClasses( + argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter +): + pass + + +def get_lineages(csv: os.PathLike) -> defaultdict: + """ + Parse the lineages.csv file and store a list of + accessions. + """ + lineages = dict() + if csv == None or not (os.path.exists(csv) or os.path.getsize(csv) > 0): + logging.error( + f"The CSV file [{os.path.basename(csv)}] is empty or does not exist!" + ) + exit(1) + + logging.info(f"Indexing {os.path.basename(csv)}...") + + with open(csv, "r") as csv_fh: + _ = csv_fh.readline().strip().split(",") + for line in csv_fh: + cols = line.strip().split(",") + + if len(cols) < 9: + logging.error( + f"The CSV file {os.path.basename(csv)} should have a mandatory 9 columns." + + "\n\nEx: identifiers,superkingdom,phylum,class,order,family,genus,species,strain" + + "\nAB211151.1,Eukaryota,Arthropoda,Malacostraca,Decapoda,Majidae,Chionoecetes,Chionoecetes opilio," + + f"\n\nGot:\n{line}" + ) + exit(1) + + lineages[cols[0]] = re.sub(r"\W+", "-", "_".join(cols[7].split(" "))) + + csv_fh.close() + return lineages + + +def write_fasta(recs: list, basedir: os.PathLike, name: str, suffix: str) -> None: + """ + Write sequence with no description to a specified file. + """ + SeqIO.write( + recs, + os.path.join(basedir, name + suffix), + "fasta", + ) + + +def parse_fasta(fh: Union[TextIO, BinaryIO], sp2accs: dict) -> list: + """ + Parse the sequences and create per species FASTA record. + """ + records = defaultdict() + logging.info("") + + for record in SeqIO.parse(fh, "fasta"): + + id = record.id + seq = record.seq + + if id in sp2accs.keys(): + records.setdefault(sp2accs[id], []).append( + SeqRecord(Seq(seq), id=id, description=str()) + ) + else: + print(f"Lineage row does not exist for accession: {id}") + + logging.info(f"Collected FASTA records for {len(records.keys())} species'.") + fh.close() + return records + + +# Main +def main() -> None: + """ + This script takes: + 1. The FASTA file and, + 2. Takes the corresponding lineages.csv file and, + + then generates a folder containing individual FASTA sequence files + per species. + """ + + # Set logging. + logging.basicConfig( + format="\n" + + "=" * 55 + + "\n%(asctime)s - %(levelname)s\n" + + "=" * 55 + + "\n%(message)s\r\r", + level=logging.DEBUG, + ) + + # Debug print. 
+ ppp = pprint.PrettyPrinter(width=55) + prog_name = os.path.basename(inspect.stack()[0].filename) + + parser = argparse.ArgumentParser( + prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses + ) + + required = parser.add_argument_group("required arguments") + + required.add_argument( + "-fa", + dest="fna", + default=False, + required=True, + help="Absolute UNIX path to the FASTA file that corresponds" + + "\nto the lineages.csv file.", + ) + required.add_argument( + "-csv", + dest="csv", + default=False, + required=True, + help="Absolute UNIX path to lineages.csv which has a guaranteed 9 " + + "\ncolumns with the first being an accession.", + ) + parser.add_argument( + "-out", + dest="out_folder", + default=os.path.join(os.getcwd(), "species"), + required=False, + help="By default, the output is written to this\nfolder.", + ) + parser.add_argument( + "-f", + dest="force_write_out", + default=False, + action="store_true", + required=False, + help="Force overwrite output directory contents.", + ) + parser.add_argument( + "-suffix", + dest="fna_suffix", + default=".fna", + required=False, + help="Suffix of the individual species FASTA files\nthat will be saved.", + ) + + # Parse defaults + args = parser.parse_args() + csv = args.csv + fna = args.fna + out = args.out_folder + overwrite = args.force_write_out + fna_suffix = args.fna_suffix + + # Basic checks + if not overwrite and os.path.exists(out): + logging.warning( + f"Output destination [{os.path.basename(out)}] already exists!" + + "\nPlease use -f to delete and overwrite." + ) + elif overwrite and os.path.exists(out): + logging.info(f"Overwrite requested. Deleting {os.path.basename(out)}...") + shutil.rmtree(out) + + # Get taxonomy from ncbitax2lin + lineages = get_lineages(csv) + + logging.info(f"Creating new squences per species...") + + if not os.path.exists(out): + os.makedirs(out) + + try: + gz_fh = gzip.open(fna, "rt") + fa_recs = parse_fasta(gz_fh, lineages) + except gzip.BadGzipFile: + logging.info( + f"Input FASTA file {os.path.basename(csv)} is not in\nGZIP format." + ) + txt_fh = open(fna, "r") + fa_recs = parse_fasta(txt_fh, lineages) + finally: + logging.info("Assigned FASTA records per species...") + + logging.info("Writing FASTA records per species...") + + for sp in fa_recs.keys(): + write_fasta(fa_recs[sp], out, sp, fna_suffix) + + +if __name__ == "__main__": + main() + +# ~/apps/nowayout/bin/gen_per_species_fa_from_bold.py -tsv BOLD_Public.05-Feb-2024.tsv -csv ../tax.csv ─╯ + +# ======================================================= +# 2024-02-08 21:37:28,541 - INFO +# ======================================================= +# Indexing tax.csv... + +# ======================================================= +# 2024-02-08 21:38:06,567 - INFO +# ======================================================= +# Creating new squences per species... + +# ======================================================= +# 2024-02-08 21:38:06,572 - INFO +# ======================================================= +# Input TSV file BOLD_Public.05-Feb-2024.tsv is not in +# GZIP format. + +# ======================================================= +# 2024-02-08 22:01:04,554 - INFO +# ======================================================= +# Collected FASTA records for 497421 species'. + +# ======================================================= +# 2024-02-08 22:24:35,000 - INFO +# ======================================================= +# No. of raw records present in `ncbitax2lin` [tax.csv]: 2550767 +# No. 
of valid records collected from `ncbitax2lin` [tax.csv]: 2134980 +# No. of raw records in TSV [BOLD_Public.05-Feb-2024.tsv]: 9735210 +# No. of valid records in TSV [BOLD_Public.05-Feb-2024.tsv]: 4988323 +# No. of FASTA records for which new lineages were created: 4069202 +# No. of FASTA records for which only genus, species and/or strain information were created: 919121 + +# ======================================================= +# 2024-02-08 22:24:35,001 - INFO +# ======================================================= +# Succesfully created lineages and FASTA records! Done!!
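The lineage index this script builds is simply a map from accession (column 1 of the 9-column lineages.csv) to a file-system-safe species name (column 8, spaces to underscores, other non-word characters to dashes); each FASTA record is then routed to the file named after its species. A small stand-alone illustration, reusing the example row from the script's own error message:

```python
# Sketch of the accession -> species-name key used to route records
# from a 9-column lineages.csv row.
import re


def species_key(csv_row):
    cols = csv_row.strip().split(",")
    assert len(cols) >= 9, "lineages.csv rows are expected to have 9 columns"
    species = re.sub(r"\W+", "-", "_".join(cols[7].split(" ")))
    return cols[0], species


acc, species = species_key(
    "AB211151.1,Eukaryota,Arthropoda,Malacostraca,Decapoda,Majidae,"
    "Chionoecetes,Chionoecetes opilio,"
)
print(acc, species)  # AB211151.1 Chionoecetes_opilio
```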
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/gen_salmon_tph_and_krona_tsv.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,523 @@ +#!/usr/bin/env python3 + +# Kranti Konganti +# 03/06/2024 + +import argparse +import glob +import inspect +import logging +import os +import pprint +import re +from collections import defaultdict + + +# Multiple inheritence for pretty printing of help text. +class MultiArgFormatClasses( + argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter +): + pass + + +# Main +def main() -> None: + """ + The succesful execution of this script requires access to properly formatted + lineages.csv file which has no more than 9 columns. + + It takes the lineages.csv file, the *_hits.csv results from `sourmash gather` + mentioned with -smres option and and a root parent directory of the + `salmon quant` results mentioned with -sal option and generates a final + results table with the TPM values and a .krona.tsv file for each sample + to be used by KronaTools. + """ + # Set logging. + logging.basicConfig( + format="\n" + + "=" * 55 + + "\n%(asctime)s - %(levelname)s\n" + + "=" * 55 + + "\n%(message)s\n\n", + level=logging.DEBUG, + ) + + # Debug print. + ppp = pprint.PrettyPrinter(width=55) + prog_name = inspect.stack()[0].filename + + parser = argparse.ArgumentParser( + prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses + ) + + required = parser.add_argument_group("required arguments") + + required.add_argument( + "-sal", + dest="salmon_res_dir", + default=False, + required=True, + help="Absolute UNIX path to the parent directory that contains the\n" + + "`salmon quant` results. For example, if path to\n" + + "`quant.sf` is in /hpc/john_doe/test/salmon_res/sampleA/quant.sf, then\n" + + "use this command-line option as:\n" + + "-sal /hpc/john_doe/test/salmon_res", + ) + required.add_argument( + "-lin", + dest="lin", + default=False, + required=True, + help="Absolute UNIX Path to the lineages CSV file.\n" + + "This file should have only 9 columns.", + ) + required.add_argument( + "-smres", + dest="sm_res_dir", + default=False, + required=True, + help="Absolute UNIX path to the parent directory that contains the\n" + + "filtered `sourmas gather` results. 
For example, if path to\n" + + "`sampleA.csv` is in /hpc/john_doe/test/sourmash_gather/sampleA.csv,\n" + + "then use this command-line option as:\n" + + "-sal /hpc/john_doe/test", + ) + parser.add_argument( + "-op", + dest="out_prefix", + default="nowayout.tblsum", + required=False, + help="Set the output file(s) prefix for output(s) generated\n" + + "by this program.", + ) + parser.add_argument( + "-sf", + dest="scale_down_factor", + default=float(10000), + required=False, + help="Set the scaling factor by which TPM values are scaled\ndown.", + ) + parser.add_argument( + "-smres-suffix", + dest="sm_res_suffix", + default="_hits.csv", + required=False, + help="Find the `sourmash gather` result files ending in this\nsuffix.", + ) + parser.add_argument( + "-failed-suffix", + dest="failed_suffix", + default="_FAILED.txt", + required=False, + help="Find the sample names which failed classification stored\n" + + "inside the files ending in this suffix.", + ) + parser.add_argument( + "-num-lin-cols", + dest="num_lin_cols", + default=int(9), + required=False, + help="Number of columns expected in the lineages CSV file.", + ) + parser.add_argument( + "-lin-acc-regex", + dest="lin_acc_regex", + default=re.compile(r"\w+[\-\.]{1}[0-9]+"), + required=False, + help="The pattern of the lineage's accession.", + ) + + args = parser.parse_args() + salmon_res_dir = args.salmon_res_dir + sm_res_dir = args.sm_res_dir + sm_res_suffix = args.sm_res_suffix + failed_suffix = args.failed_suffix + out_prefix = args.out_prefix + lin = args.lin + num_lin_cols = args.num_lin_cols + acc_pat = args.lin_acc_regex + scale_down = float(args.scale_down_factor) + no_hit = "Unclassified" + no_hit_reads = "reads mapped to the database" + tpm_const = float(1000000.0000000000) + round_to = 10 + all_samples = set() + ( + lineage2sample, + unclassified2sample, + lineage2sm, + sm2passed, + reads_total, + per_taxon_reads, + lineages, + ) = ( + defaultdict(defaultdict), + defaultdict(defaultdict), + defaultdict(defaultdict), + defaultdict(defaultdict), + defaultdict(defaultdict), + defaultdict(defaultdict), + defaultdict(int), + ) + + salmon_comb_res = os.path.join(os.getcwd(), out_prefix + ".txt") + # salmon_comb_res_reads_mapped = os.path.join( + # os.getcwd(), re.sub(".tblsum", "_reads_mapped.tblsum", out_prefix) + ".txt" + # ) + salmon_comb_res_indiv_reads_mapped = os.path.join( + os.getcwd(), + re.sub(".tblsum", "_indiv_reads_mapped.tblsum", out_prefix) + ".txt", + ) + salmon_res_files = glob.glob( + os.path.join(salmon_res_dir, "*", "quant.sf"), recursive=True + ) + sample_res_files_failed = glob.glob( + os.path.join(salmon_res_dir, "*" + failed_suffix), recursive=True + ) + sm_res_files = glob.glob( + os.path.join(sm_res_dir, "*" + sm_res_suffix), recursive=True + ) + + # Basic checks + if lin and not (os.path.exists(lin) and os.path.getsize(lin) > 0): + logging.error( + "The lineages file,\n" + + f"{os.path.basename(lin)} does not exist or is empty!" 
+ ) + exit(1) + + if salmon_res_dir: + if not os.path.isdir(salmon_res_dir): + logging.error("UNIX path\n" + f"{salmon_res_dir}\n" + "does not exist!") + exit(1) + if len(salmon_res_files) <= 0: + with open(salmon_comb_res, "w") as salmon_comb_res_fh, open( + salmon_comb_res_indiv_reads_mapped, "w" + ) as salmon_comb_res_indiv_reads_mapped_fh: + salmon_comb_res_fh.write(f"Sample\n{no_hit} reads in all samples\n") + salmon_comb_res_indiv_reads_mapped_fh.write( + f"Sample\nNo {no_hit_reads} from all samples\n" + ) + salmon_comb_res_fh.close() + salmon_comb_res_indiv_reads_mapped_fh.close() + exit(0) + + # Only proceed if lineages.csv exists. + if lin and os.path.exists(lin) and os.path.getsize(lin) > 0: + lin_fh = open(lin, "r") + _ = lin_fh.readline() + + # Index lineages.csv + for line in lin_fh: + cols = line.strip().split(",") + + if len(cols) < num_lin_cols: + logging.error( + f"The file {os.path.basename(lin)} seems to\n" + + "be malformed. It contains less than required 9 columns." + ) + exit(1) + + if cols[0] in lineages.keys(): + continue + # logging.info( + # f"There is a duplicate accession [{cols[0]}]" + # + f" in the lineages file {os.path.basename(lin)}!" + # ) + elif acc_pat.match(cols[0]): + lineages[cols[0]] = ",".join(cols[1:]) + + lin_fh.close() + + # Index each samples' filtered sourmash results. + for sm_res_file in sm_res_files: + sample_name = re.sub(sm_res_suffix, "", os.path.basename(sm_res_file)) + + with open(sm_res_file, "r") as sm_res_fh: + _ = sm_res_fh.readline() + for line in sm_res_fh: + acc = acc_pat.findall(line.strip().split(",")[9]) + + if len(acc) == 0: + logging.info( + f"Got empty lineage accession: {acc}" + + f"\nRow elements: {line.strip().split(',')}" + ) + exit(1) + if len(acc) not in [1]: + logging.info( + f"Got more than one lineage accession: {acc}" + + f"\nRow elements: {line.strip().split(',')}" + ) + logging.info(f"Considering first element: {acc[0]}") + if acc[0] not in lineages.keys(): + logging.error( + f"The lineage accession {acc[0]} is not found in {os.path.basename(lin)}" + ) + exit(1) + lineage2sm[lineages[acc[0]]].setdefault(sample_name, 1) + sm2passed["sourmash_passed"].setdefault(sample_name, 1) + sm_res_fh.close() + + # Index each samples' salmon results. + for salmon_res_file in salmon_res_files: + sample_name = re.match( + r"(^.+?)((\_salmon\_res)|(\.salmon))$", + os.path.basename(os.path.dirname(salmon_res_file)), + )[1] + salmon_meta_json = os.path.join( + os.path.dirname(salmon_res_file), "aux_info", "meta_info.json" + ) + + if ( + not os.path.exists(salmon_meta_json) + or not os.path.getsize(salmon_meta_json) > 0 + ): + logging.error( + "The file\n" + + f"{salmon_meta_json}\ndoes not exist or is empty!\n" + + "Did `salmon quant` fail?" + ) + exit(1) + + if ( + not os.path.exists(salmon_res_file) + or not os.path.getsize(salmon_res_file) > 0 + ): + logging.error( + "The file\n" + + f"{salmon_res_file}\ndoes not exist or is empty!\n" + + "Did `salmon quant` fail?" 
+ ) + exit(1) + + # Initiate all_tpm, rem_tpm and reads_mapped + # all_tpm + reads_total[sample_name].setdefault("all_tpm", []).append(float(0.0)) + # rem_tpm + reads_total[sample_name].setdefault("rem_tpm", []).append(float(0.0)) + # reads_mapped + reads_total[sample_name].setdefault("reads_mapped", []).append(float(0.0)) + + with open(salmon_res_file, "r") as salmon_res_fh: + for line in salmon_res_fh.readlines(): + if re.match(r"^Name.+", line): + continue + cols = line.strip().split("\t") + ref_acc = cols[0] + tpm = cols[3] + num_reads_mapped = cols[4] + + ( + reads_total[sample_name] + .setdefault("all_tpm", []) + .append( + round(float(tpm), round_to), + ) + ) + + ( + reads_total[sample_name] + .setdefault("reads_mapped", []) + .append( + round(float(num_reads_mapped), round_to), + ) + ) + + if lineages[ref_acc] in lineage2sm.keys(): + ( + lineage2sample[lineages[ref_acc]] + .setdefault(sample_name, []) + .append(round(float(tpm), round_to)) + ) + ( + per_taxon_reads[sample_name] + .setdefault(lineages[ref_acc], []) + .append(round(float(num_reads_mapped))) + ) + else: + ( + reads_total[sample_name] + .setdefault("rem_tpm", []) + .append( + round(float(tpm), round_to), + ) + ) + + salmon_res_fh.close() + + # Index each samples' complete failure results i.e., 100% unclassified. + for sample_res_file_failed in sample_res_files_failed: + sample_name = re.sub( + failed_suffix, "", os.path.basename(sample_res_file_failed) + ) + with open("".join(sample_res_file_failed), "r") as no_calls_fh: + for line in no_calls_fh.readlines(): + if line in ["\n", "\n\r", "\r"]: + continue + unclassified2sample[sample_name].setdefault(no_hit, tpm_const) + no_calls_fh.close() + + # Finally, write all results. + for sample in sorted(reads_total.keys()) + sorted(unclassified2sample.keys()): + all_samples.add(sample) + + # Check if sourmash results exist but salmon `quant` failed + # and if so, set the sample to 100% Unclassified as well. + for sample in sm2passed["sourmash_passed"].keys(): + if sample not in all_samples: + unclassified2sample[sample].setdefault(no_hit, tpm_const) + all_samples.add(sample) + + # Write total number of reads mapped to nowayout database. + # with open(salmon_comb_res_reads_mapped, "w") as nowo_reads_mapped_fh: + # nowo_reads_mapped_fh.write( + # "\t".join( + # [ + # "Sample", + # "All reads", + # "Classified reads", + # "Unclassified reads (Reads failed thresholds )", + # ] + # ) + # ) + + # for sample in all_samples: + # if sample in reads_total.keys(): + # nowo_reads_mapped_fh.write( + # "\n" + # + "\t".join( + # [ + # f"\n{sample}", + # f"{int(sum(reads_total[sample]['reads_mapped']))}", + # f"{int(reads_total[sample]['reads_mapped'])}", + # f"{int(reads_total[sample]['rem_tpm'])}", + # ], + # ) + # ) + # else: + # nowo_reads_mapped_fh.write(f"\n{sample}\t{int(0.0)}") + # nowo_reads_mapped_fh.close() + + # Write scaled down TPM values for each sample. + with open(salmon_comb_res, "w") as salmon_comb_res_fh, open( + salmon_comb_res_indiv_reads_mapped, "w" + ) as salmon_comb_res_indiv_reads_mapped_fh: + salmon_comb_res_fh.write("Lineage\t" + "\t".join(all_samples) + "\n") + salmon_comb_res_indiv_reads_mapped_fh.write( + "Lineage\t" + "\t".join(all_samples) + "\n" + ) + + # Write *.krona.tsv header for all samples. 
+ for sample in all_samples: + krona_fh = open( + os.path.join(salmon_res_dir, sample + ".krona.tsv"), "w" + ) + krona_fh.write( + "\t".join( + [ + "fraction", + "superkingdom", + "phylum", + "class", + "order", + "family", + "genus", + "species", + ] + ) + ) + krona_fh.close() + + # Write the TPM values (TPM/scale_down) for valid lineages. + for lineage in lineage2sm.keys(): + salmon_comb_res_fh.write(lineage) + salmon_comb_res_indiv_reads_mapped_fh.write(lineage) + + for sample in all_samples: + krona_fh = open( + os.path.join(salmon_res_dir, sample + ".krona.tsv"), "a" + ) + + if sample in unclassified2sample.keys(): + salmon_comb_res_fh.write(f"\t0.0") + salmon_comb_res_indiv_reads_mapped_fh.write(f"\t0") + elif sample in lineage2sample[lineage].keys(): + reads = sum(per_taxon_reads[sample][lineage]) + tpm = sum(lineage2sample[lineage][sample]) + tph = round(tpm / scale_down, round_to) + lineage2sample[sample].setdefault("hits_tpm", []).append( + float(tpm) + ) + + salmon_comb_res_fh.write(f"\t{tph}") + salmon_comb_res_indiv_reads_mapped_fh.write(f"\t{reads}") + krona_lin_row = lineage.split(",") + + if len(krona_lin_row) > num_lin_cols - 1: + logging.error( + "Taxonomy columns are more than 8 for the following lineage:" + + f"{krona_lin_row}" + ) + exit(1) + else: + krona_fh.write( + "\n" + + str(round((tpm / tpm_const), round_to)) + + "\t" + + "\t".join(krona_lin_row[:-1]) + ) + else: + salmon_comb_res_fh.write(f"\t0.0") + salmon_comb_res_indiv_reads_mapped_fh.write(f"\t0") + krona_fh.close() + + salmon_comb_res_fh.write("\n") + salmon_comb_res_indiv_reads_mapped_fh.write(f"\n") + + # Finally write TPH (TPM/scale_down) for Unclassified + # Row = Unclassified / No reads mapped to the database ... + salmon_comb_res_fh.write(f"{no_hit}") + salmon_comb_res_indiv_reads_mapped_fh.write(f"Total {no_hit_reads}") + + for sample in all_samples: + krona_ufh = open( + os.path.join(salmon_res_dir, sample + ".krona.tsv"), "a" + ) + # krona_ufh.write("\t") + if sample in unclassified2sample.keys(): + salmon_comb_res_fh.write( + f"\t{round((unclassified2sample[sample][no_hit] / scale_down), round_to)}" + ) + salmon_comb_res_indiv_reads_mapped_fh.write(f"\t0") + krona_ufh.write( + f"\n{round((unclassified2sample[sample][no_hit] / tpm_const), round_to)}" + ) + else: + trace_tpm = tpm_const - sum(reads_total[sample]["all_tpm"]) + trace_tpm = float(f"{trace_tpm:.{round_to}f}") + if trace_tpm <= 0: + trace_tpm = float(0.0) + tph_unclassified = float( + f"{(sum(reads_total[sample]['rem_tpm']) + trace_tpm) / scale_down:{round_to}f}" + ) + krona_unclassified = float( + f"{(sum(reads_total[sample]['rem_tpm']) + trace_tpm) / tpm_const:{round_to}f}" + ) + salmon_comb_res_fh.write(f"\t{tph_unclassified}") + salmon_comb_res_indiv_reads_mapped_fh.write( + f"\t{int(sum(sum(per_taxon_reads[sample].values(), [])))}" + ) + krona_ufh.write(f"\n{krona_unclassified}") + krona_ufh.write("\t" + "\t".join(["unclassified"] * (num_lin_cols - 2))) + krona_ufh.close() + + salmon_comb_res_fh.close() + salmon_comb_res_indiv_reads_mapped_fh.close() + # ppp.pprint(lineage2sample) + # ppp.pprint(lineage2sm) + # ppp.pprint(reads_total) + + +if __name__ == "__main__": + main()
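The two values written per lineage and sample are plain rescalings of the summed Salmon TPM: the table entry is TPM divided by the `-sf` scale-down factor (10000 by default), and the Krona fraction is TPM divided by 1,000,000, both rounded to 10 decimal places. A worked example with invented TPM values:

```python
# Worked example of the per-lineage arithmetic: table value ("TPH") is
# sum(TPM) / scale_down; the Krona fraction is sum(TPM) / 1e6.
ROUND_TO = 10
TPM_CONST = 1_000_000.0


def tph_and_krona_fraction(tpms, scale_down=10_000.0):
    tpm = sum(tpms)
    return round(tpm / scale_down, ROUND_TO), round(tpm / TPM_CONST, ROUND_TO)


# Invented TPM values for one lineage in one sample.
print(tph_and_krona_fraction([1250.5, 310.25]))  # (0.156075, 0.00156075)
```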
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/gen_sim_abn_table.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,191 @@ +#!/usr/bin/env python3 + +# Kranti Konganti + +import argparse +import glob +import inspect +import logging +import os +import pprint +import re +from collections import defaultdict + + +# Multiple inheritence for pretty printing of help text. +class MultiArgFormatClasses( + argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter +): + pass + + +# Main +def main() -> None: + """ + This script will take the final taxonomic classification files and create a + global relative abundance type file in the current working directory. The + relative abundance type files should be in CSV or TSV format and should have + the lineage or taxonomy in first column and samples in the subsequent columns. + """ + # Set logging. + logging.basicConfig( + format="\n" + + "=" * 55 + + "\n%(asctime)s - %(levelname)s\n" + + "=" * 55 + + "\n%(message)s\n\n", + level=logging.DEBUG, + ) + + # Debug print. + ppp = pprint.PrettyPrinter(width=55) + prog_name = inspect.stack()[0].filename + + parser = argparse.ArgumentParser( + prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses + ) + + required = parser.add_argument_group("required arguments") + + required.add_argument( + "-abn", + dest="rel_abn_dir", + default=False, + required=True, + help="Absolute UNIX path to the parent directory that contains the\n" + + "abundance type files.", + ) + parser.add_argument( + "-op", + dest="out_prefix", + default="nowayout.tblsum", + required=False, + help="Set the output file(s) prefix for output(s) generated\nby this program.", + ) + parser.add_argument( + "-header", + dest="header", + action="store_true", + default=True, + required=False, + help="Do the relative abundance files have a header.", + ) + parser.add_argument( + "-filepat", + dest="file_pat", + default="*.lineage_summary.tsv", + required=False, + help="Files will be searched by this suffix for merged output generation\nby this program.", + ) + parser.add_argument( + "-failedfilepat", + dest="failed_file_pat", + default="*FAILED.txt", + required=False, + help="Files will be searched by this suffix for merged output generation\nby this program.", + ) + parser.add_argument( + "-delim", + dest="delim", + default="\t", + required=False, + help="The delimitor by which the fields are separated in the file.", + ) + + args = parser.parse_args() + rel_abn_dir = args.rel_abn_dir + is_header = args.header + out_prefix = args.out_prefix + file_pat = args.file_pat + failed_file_pat = args.failed_file_pat + delim = args.delim + suffix = re.sub(r"^\*", "", file_pat) + rel_abn_comb = os.path.join(os.getcwd(), out_prefix + ".txt") + rel_abn_files = glob.glob(os.path.join(rel_abn_dir, file_pat)) + failed_rel_abn_files = glob.glob(os.path.join(rel_abn_dir, failed_file_pat)) + empty_results = "Relative abundance results did not pass thresholds" + sample2lineage, seen_lineage = (defaultdict(defaultdict), defaultdict(int)) + + if len(rel_abn_files) == 0: + logging.info( + "Unable to find any files with .tsv extentsion.\nNow trying .csv extension." + ) + rel_abn_files = glob.glob(os.path.join(rel_abn_dir, "*.csv")) + delim = "," + + if len(failed_rel_abn_files) == 0: + logging.info( + f"Unable to find any files with patttern {failed_file_pat}.\n" + + "The failed samples will not appear in the final aggregate file." 
+ ) + + if rel_abn_dir: + if not os.path.isdir(rel_abn_dir): + logging.error("UNIX path\n" + f"{rel_abn_dir}\n" + "does not exist!") + exit(1) + if len(rel_abn_files) <= 0: + with open(rel_abn_comb, "w") as rel_abn_comb_fh: + rel_abn_comb_fh.write(f"Sample\n{empty_results} in any samples\n") + rel_abn_comb_fh.close() + exit(0) + + for failed_rel_abn in failed_rel_abn_files: + with open(failed_rel_abn, "r") as failed_fh: + sample2lineage[failed_fh.readline().strip()].setdefault( + "unclassified", [] + ).append(float("1.0")) + failed_fh.close() + + for rel_abn_file in rel_abn_files: + sample_name = re.match(r"(^.+?)\..*$", os.path.basename(rel_abn_file))[1] + + with open(rel_abn_file, "r") as rel_abn_fh: + if is_header: + sample_names = rel_abn_fh.readline().strip().split(delim)[1:] + if len(sample_names) > 2: + logging.error( + "The individual relative abundance file has more " + + "\nthan 1 sample. This is rare in the context of running the " + + "\n nowayout Nextflow workflow." + ) + exit(1) + elif len(sample_names) < 2: + sample_name = re.sub(suffix, "", os.path.basename(rel_abn_file)) + logging.info( + "Seems like there is no sample name in the lineage summary file." + + f"\nTherefore, sample name has been extracted from file name: {sample_name}." + ) + else: + sample_name = sample_names[0] + + for line in rel_abn_fh.readlines(): + cols = line.strip().split(delim) + lineage = cols[0] + abn = cols[1] + sample2lineage[sample_name].setdefault(lineage, []).append( + float(abn) + ) + seen_lineage[lineage] = 1 + + with open(rel_abn_comb, "w") as rel_abn_comb_fh: + samples = sorted(sample2lineage.keys()) + rel_abn_comb_fh.write(f"Lineage{delim}" + delim.join(samples) + "\n") + + for lineage in sorted(seen_lineage.keys()): + rel_abn_comb_fh.write(lineage) + for sample in samples: + if lineage in sample2lineage[sample].keys(): + rel_abn_comb_fh.write( + delim + + "".join( + [str(abn) for abn in sample2lineage[sample][lineage]] + ) + ) + else: + rel_abn_comb_fh.write(f"{delim}0.0") + rel_abn_comb_fh.write("\n") + rel_abn_comb_fh.close() + + +if __name__ == "__main__": + main()
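The aggregation itself is a simple pivot: each per-sample lineage summary contributes one column, and lineages missing from a sample are filled with 0.0. A stand-alone sketch of that pivot over invented in-memory values:

```python
# Sketch of the pivot performed above: one column per sample, one row
# per lineage, 0.0 where a sample has no value for that lineage.
def merge_abundances(per_sample, delim="\t"):
    """per_sample: dict of sample -> dict of lineage -> abundance."""
    lineages = sorted({lin for vals in per_sample.values() for lin in vals})
    samples = sorted(per_sample)
    rows = ["Lineage" + delim + delim.join(samples)]
    for lin in lineages:
        vals = (str(per_sample[s].get(lin, 0.0)) for s in samples)
        rows.append(lin + delim + delim.join(vals))
    return "\n".join(rows)


# Invented per-sample abundances, for illustration only.
print(merge_abundances({
    "sampleA": {"Apis mellifera": 0.9, "unclassified": 0.1},
    "sampleB": {"Drosophila melanogaster": 1.0},
}))
```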
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/remove_dup_fasta_ids.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,201 @@ +#!/usr/bin/env python3 + +import argparse +import gzip +import inspect +import logging +import os +import pprint +import shutil +from typing import BinaryIO, TextIO, Union + +from Bio import SeqIO +from Bio.Seq import Seq +from Bio.SeqRecord import SeqRecord +from genericpath import isdir + + +# Multiple inheritence for pretty printing of help text. +class MultiArgFormatClasses( + argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter +): + pass + + +def write_fasta(seq: str, id: str, fh: Union[TextIO, BinaryIO]) -> None: + """ + Write sequence with no description to specified file. + """ + SeqIO.write( + SeqRecord(Seq(seq), id=id, description=str()), + fh, + "fasta", + ) + + +# Main +def main() -> None: + """ + This script takes: + 1. A FASTA file in gzip or non-gzip (ASCII TXT) format and + + and then generates a new FASTA file with duplicate FASTA IDs replaced + with a unique ID. + """ + + # Set logging. + logging.basicConfig( + format="\n" + + "=" * 55 + + "\n%(asctime)s - %(levelname)s\n" + + "=" * 55 + + "\n%(message)s\n\n", + level=logging.DEBUG, + ) + + # Debug print. + ppp = pprint.PrettyPrinter(width=55) + prog_name = os.path.basename(inspect.stack()[0].filename) + + parser = argparse.ArgumentParser( + prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses + ) + + required = parser.add_argument_group("required arguments") + + required.add_argument( + "-fna", + dest="fna", + default=False, + required=True, + help="Absolute UNIX path to .fna or .fna.gz file.", + ) + parser.add_argument( + "-lin", + dest="lineages", + default=False, + required=False, + help="Absolute UNIX path to lineages.csv file for which the" + + "\nthe duplicate IDs will be made unique corresponding to" + + "\nthe FASTA IDs", + ) + parser.add_argument( + "-outdir", + dest="out_folder", + default=os.getcwd(), + required=False, + help="By default, the output is written to this\nfolder.", + ) + parser.add_argument( + "-f", + dest="force_write_out", + default=False, + action="store_true", + required=False, + help="Force overwrite the output file.", + ) + parser.add_argument( + "--fna-suffix", + dest="fna_suffix", + default=".fna", + required=False, + help="Suffix of the output FASTA file.", + ) + + # Parse defaults + args = parser.parse_args() + fna = args.fna + lineages = args.lineages + outdir = args.out_folder + overwrite = args.force_write_out + fna_suffix = args.fna_suffix + new_fna = os.path.join( + outdir, os.path.basename(fna).split(".")[0] + "_dedup_ids" + fna_suffix + ) + lin_header = False + new_lin = False + seen_ids = dict() + seen_lineages = dict() + + # Basic checks + if not overwrite and os.path.exists(new_fna): + logging.warning( + f"Output destination [{os.path.basename(new_fna)}] already exists!" + + "\nPlease use -f to delete and overwrite." + ) + elif overwrite and os.path.exists(new_fna): + logging.info(f"Overwrite requested. Deleting {os.path.basename(new_fna)}...") + if os.path.isdir(new_fna): + shutil.rmtree(new_fna) + else: + os.remove(new_fna) + + # Prepare for writing + new_fna_fh = open(new_fna, "+at") + + # If lineages file is mentioned, index it. 
+ if lineages and os.path.exists(lineages) and os.path.getsize(lineages) > 0: + new_lin = os.path.join(os.getcwd(), os.path.basename(lineages) + "_dedup.csv") + new_lin_fh = open(new_lin, "w") + with open(lineages, "r") as l_fh: + lin_header = l_fh.readline() + for line in l_fh: + cols = line.strip().split(",") + if len(cols) < 9: + logging.error( + f"The row in the lineages file {os.path.basename(lineages)}" + + f"\ndoes not have 9 required columns: {len(cols)}" + + f"\n\n{lin_header.strip()}\n{line.strip()}" + ) + exit(1) + elif len(cols) > 9: + logging.info( + f"The row in the lineages file {os.path.basename(lineages)}" + + f"\nhas more than 9 required columns: {len(cols)}" + + f"\nRetaining only 9 columns of the following 10 columns." + + f"\n\n{lin_header.strip()}\n{line.strip()}" + ) + + if cols[0] not in seen_lineages.keys(): + seen_lineages[cols[0]] = ",".join(cols[1:9]) + + new_lin_fh.write(lin_header) + l_fh.close() + + # Read FASTA and create unique FASTA IDs. + logging.info(f"Creating new FASTA with unique IDs.") + try: + fna_fh = gzip.open(fna, "rt") + _ = fna_fh.readline() + except gzip.BadGzipFile: + logging.info( + f"Input FASTA file {os.path.basename(fna)} is not in\nGZIP format." + + "\nAttempting text parsing." + ) + fna_fh = open(fna, "r") + + for record in SeqIO.parse(fna_fh, format="fasta"): + seq_id = record.id + + if record.id not in seen_ids.keys(): + seen_ids[record.id] = 1 + else: + seen_ids[record.id] += 1 + + if seen_ids[seq_id] > 1: + seq_id = str(record.id) + str(seen_ids[record.id]) + + if new_lin: + new_lin_fh.write(",".join([seq_id, seen_lineages[record.id]]) + "\n") + + write_fasta(record.seq, seq_id, new_fna_fh) + + if new_lin: + new_lin_fh.close() + + logging.info("Done!") + + +if __name__ == "__main__": + + main()
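The renaming rule above is: the first occurrence of a FASTA ID is kept as-is, and every later duplicate gets its running occurrence count appended. A stand-alone restatement with invented IDs:

```python
# Stand-alone restatement of the ID de-duplication rule: first
# occurrence unchanged, later duplicates get the occurrence count appended.
from collections import Counter


def dedup_ids(ids):
    seen = Counter()
    deduped = []
    for fasta_id in ids:
        seen[fasta_id] += 1
        deduped.append(fasta_id if seen[fasta_id] == 1 else f"{fasta_id}{seen[fasta_id]}")
    return deduped


# Invented IDs, for illustration only.
print(dedup_ids(["AB1.1", "AB2.1", "AB1.1", "AB1.1"]))
# ['AB1.1', 'AB2.1', 'AB1.12', 'AB1.13']
```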
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/bin/sourmash_filter_hits.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,193 @@ +#!/usr/bin/env python3 + +# Kranti Konganti + +import argparse +import gzip +import inspect +import logging +import os +import pprint +import re + +# Set logging. +logging.basicConfig( + format="\n" + "=" * 55 + "\n%(asctime)s - %(levelname)s\n" + "=" * 55 + "\n%(message)s\n\n", + level=logging.DEBUG, +) + +# Debug print. +ppp = pprint.PrettyPrinter(width=50, indent=4) + +# Multiple inheritence for pretty printing of help text. +class MultiArgFormatClasses(argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter): + pass + + +def write_failures(prefix: str, file: os.PathLike) -> None: + with open(file, "w") as outfile_failed_fh: + outfile_failed_fh.write(f"{prefix}\n") + outfile_failed_fh.close() + + +def main() -> None: + """ + This script will take the CSV output of `sourmash search` and `sourmash gather` + and will return a column's value filtered by requested column name and its value + """ + + prog_name = os.path.basename(inspect.stack()[0].filename) + + parser = argparse.ArgumentParser( + prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses + ) + + required = parser.add_argument_group("required arguments") + + required.add_argument( + "-csv", + dest="csv", + default=False, + required=True, + help="Absolute UNIX path to CSV file containing output from\n" + + "`sourmash gather` or `sourmash search`", + ) + required.add_argument( + "-extract", + dest="extract", + required=False, + default="name", + help="Extract this column's value which matches the filters.\n" + + "Controlled by -fcn and -fcv.", + ) + parser.add_argument( + "-all", + dest="alllines", + required=False, + default=False, + action="store_true", + help="Instead of just the column value, print entire row.", + ) + parser.add_argument( + "-fcn", + dest="filter_col_name", + default="f_match", + required=False, + help="Column name by which the filtering of rows\nshould be applied.", + ) + parser.add_argument( + "-fcv", + dest="filter_col_val", + default="0", + required=False, + help="Only rows where the column (defined by --fcn)\nsatisfies this value will be\n" + + "will be considered. This can be numeric, regex\nor a string value.", + ) + parser.add_argument( + "-gt", + dest="gt", + default=True, + required=False, + action="store_true", + help="Apply greater than or equal to condition on\nnumeric values of --fcn column.", + ) + parser.add_argument( + "-lt", + dest="lt", + default=False, + required=False, + action="store_true", + help="Apply less than or equal to condition on\nnumeric values of --fcn column.", + ) + + args = parser.parse_args() + csv = args.csv + ex = args.extract + all_lines = args.alllines + fcn = args.filter_col_name + fcv = args.filter_col_val + gt = args.gt + lt = args.lt + hits = set() + hit_lines = set() + empty_lines = 0 + + outfile_prefix = re.sub(r"(^.*?)\.csv\.gz", r"\1", os.path.basename(csv)) + outfile_failed = os.path.join(os.getcwd(), "_".join([outfile_prefix, "FAILED.txt"])) + + if csv and (not os.path.exists(csv) or not os.path.getsize(csv) > 0): + logging.error( + "The CSV file,\n" + f"{os.path.basename(csv)} does not exists or\nis of size zero." 
+ ) + write_failures(outfile_prefix, outfile_failed) + exit(0) + + if all_lines: + outfile = os.path.join(os.getcwd(), "_".join([outfile_prefix, "hits.csv"])) + else: + outfile = os.path.join(os.getcwd(), "_".join([outfile_prefix, "template_hits.txt"])) + + with gzip.open(csv, "rb") as csv_fh: + header_cols = dict( + [ + (col, ele) + for ele, col in enumerate(csv_fh.readline().decode("utf-8").strip().split(",")) + ] + ) + + if fcn and ex not in header_cols.keys(): + logging.info( + f"The header row in file\n{os.path.basename(csv)}\n" + + "does not have a column whose names are:\n" + + f"-fcn: {fcn} and -extract: {ex}" + ) + exit(1) + + for line in csv_fh: + line = line.decode("utf-8") + + if line in ["\n", "\n\r"]: + empty_lines += 1 + continue + + cols = [x.strip() for x in line.strip().split(",")] + investigate = float(format(float(cols[header_cols[fcn]]), '.10f')) + fcv = float(fcv) + + if re.match(r"[\d\.]+", str(investigate)): + if gt and investigate >= fcv: + hits.add(cols[header_cols[ex]]) + hit_lines.add(line.strip()) + elif lt and investigate <= fcv: + hits.add(cols[header_cols[ex]]) + hit_lines.add(line.strip()) + elif investigate == fcv: + hits.add(cols[header_cols[ex]]) + hit_lines.add(line.strip()) + + csv_fh.close() + + if len(hits) >= 1: + with open(outfile, "w") as outfile_fh: + outfile_fh.write(",".join(header_cols.keys()) + "\n") + if all_lines: + outfile_fh.write("\n".join(hit_lines) + "\n") + else: + outfile_fh.writelines("\n".join(hits) + "\n") + outfile_fh.close() + else: + write_failures(outfile_prefix, outfile_failed) + + if empty_lines > 0: + empty_lines_msg = f"Skipped {empty_lines} empty line(s).\n" + + logging.info( + empty_lines_msg + + f"File {os.path.basename(csv)}\n" + + f"written in:\n{os.getcwd()}\nDone! Bye!" + ) + exit(0) + + +if __name__ == "__main__": + main()
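The filter keeps rows of the gzipped `sourmash gather`/`search` CSV whose `f_match` column (by default) passes the requested threshold and collects the `name` column from those rows. A minimal stand-alone sketch of the same filter, using `csv.DictReader` instead of the script's manual split (the input file name is hypothetical):

```python
# Sketch of the hit filter: keep rows of a gzipped sourmash CSV whose
# f_match value meets the threshold, and collect their `name` column.
import csv
import gzip


def filter_hits(csv_gz, threshold=0.0, filter_col="f_match", extract_col="name"):
    hits = set()
    with gzip.open(csv_gz, "rt", newline="") as fh:
        for row in csv.DictReader(fh):
            if float(row[filter_col]) >= threshold:
                hits.add(row[extract_col])
    return hits


# Hypothetical input, for illustration only.
# print(filter_hits("sampleA.csv.gz", threshold=0.1))
```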
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/conf/base.config Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,58 @@ +plugins { + id 'nf-amazon' +} + +params { + fs = File.separator + cfsanpipename = 'CPIPES' + center = 'CFSAN, FDA.' + libs = "${projectDir}${params.fs}lib" + modules = "${projectDir}${params.fs}modules" + projectconf = "${projectDir}${params.fs}conf" + assetsdir = "${projectDir}${params.fs}assets" + subworkflows = "${projectDir}${params.fs}subworkflows" + workflows = "${projectDir}${params.fs}workflows" + workflowsconf = "${workflows}${params.fs}conf" + routines = "${libs}${params.fs}routines" + toolshelp = "${libs}${params.fs}help" + swmodulepath = "${params.fs}nfs${params.fs}software${params.fs}modules" + tracereportsdir = "${launchDir}${params.fs}${cfsanpipename}-${params.pipeline}${params.fs}nextflow-reports" + dummyfile = "${projectDir}${params.fs}assets${params.fs}dummy_file.txt" + dummyfile2 = "${projectDir}${params.fs}assets${params.fs}dummy_file2.txt" + max_cpus = 10 + linewidth = 80 + pad = 32 + pipeline = null + help = null + input = null + output = null + metadata = null + publish_dir_mode = "copy" + publish_dir_overwrite = true + user_email = null +} + +dag { + enabled = true + file = "${params.tracereportsdir}${params.fs}${params.pipeline}_dag.html" + overwrite = true +} + +report { + enabled = true + file = "${params.tracereportsdir}${params.fs}${params.pipeline}_exec_report.html" + overwrite = true +} + +trace { + enabled = true + file = "${params.tracereportsdir}${params.fs}${params.pipeline}_exec_trace.txt" + overwrite = true +} + +timeline { + enabled = true + file = "${params.tracereportsdir}${params.fs}${params.pipeline}_exec_timeline.html" + overwrite = true +} +
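As a quick, illustrative check of how the `File.separator`-based path parameters above compose (the checkout location below is hypothetical):

```groovy
// Illustrative only: resolving a few of the path params on a POSIX system.
def fs         = File.separator                 // '/'
def projectDir = '/opt/cpipes'                  // hypothetical checkout path
def libs       = "${projectDir}${fs}lib"
def toolshelp  = "${libs}${fs}help"

assert libs      == '/opt/cpipes/lib'
assert toolshelp == '/opt/cpipes/lib/help'
// tracereportsdir resolves under the launch directory, e.g.
// <launchDir>/CPIPES-nowayout/nextflow-reports when --pipeline nowayout is used.
```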
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/conf/computeinfra.config Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,155 @@ +standard { + process.executor = 'local' + process.cpus = 1 + params.enable_conda = false + params.enable_module = true + singularity.enabled = false + docker.enabled = false +} + +stdkondagac { + process.executor = 'local' + process.cpus = 4 + params.enable_conda = true + conda.enabled = true + conda.useMicromamba = true + params.enable_module = false + singularity.enabled = false + docker.enabled = false +} + +stdcingularitygac { + process.executor = 'local' + process.cpus = 4 + params.enable_conda = false + params.enable_module = false + singularity.enabled = true + singularity.autoMounts = true + singularity.runOptions = "-B ${params.input} -B ${params.bcs_root_dbdir}" + docker.enabled = false +} + +raven { + process.executor = 'slurm' + process.queue = 'prod' + process.memory = '10GB' + process.cpus = 4 + params.enable_conda = false + params.enable_module = true + singularity.enabled = false + docker.enabled = false + clusterOptions = '--signal B:USR2' +} + +eprod { + process.executor = 'slurm' + process.queue = 'lowmem,midmem,bigmem' + process.memory = '10GB' + process.cpus = 4 + params.enable_conda = false + params.enable_module = true + singularity.enabled = false + docker.enabled = false + clusterOptions = '--signal B:USR2' +} + +eprodkonda { + process.executor = 'slurm' + process.queue = 'lowmem,midmem,bigmem' + process.memory = '10GB' + process.cpus = 4 + params.enable_conda = true + conda.enabled = true + conda.useMicromamba = true + params.enable_module = false + singularity.enabled = false + singularity.autoMounts = true + singularity.runOptions = "-B ${params.input} -B ${params.bcs_root_dbdir}" + docker.enabled = false + clusterOptions = '--signal B:USR2' +} + +eprodcingularity { + process.executor = 'slurm' + process.queue = 'lowmem,midmem,bigmem' + process.memory = '10GB' + process.cpus = 4 + params.enable_conda = false + params.enable_module = false + singularity.enabled = true + singularity.autoMounts = true + singularity.runOptions = "-B ${params.input} -B ${params.bcs_root_dbdir}" + docker.enabled = false + clusterOptions = '--signal B:USR2' +} + +cingularity { + process.executor = 'slurm' + process.queue = 'prod' + process.memory = '10GB' + process.cpus = 4 + singularity.enabled = true + singularity.autoMounts = true + singularity.runOptions = "-B ${params.input} -B ${params.bcs_root_dbdir}" + docker.enabled = false + params.enable_conda = false + params.enable_module = false + clusterOptions = '--signal B:USR2' +} + +cingularitygac { + process.executor = 'slurm' + executor.$slurm.exitReadTimeout = 120000 + process.queue = 'centriflaken' + process.cpus = 4 + singularity.enabled = true + singularity.autoMounts = true + singularity.runOptions = "-B ${params.input} -B ${params.bcs_root_dbdir}" + docker.enabled = false + params.enable_conda = false + params.enable_module = false + clusterOptions = '-n 1 --signal B:USR2' +} + +konda { + process.executor = 'slurm' + process.queue = 'prod' + process.memory = '10GB' + process.cpus = 4 + singularity.enabled = false + docker.enabled = false + params.enable_conda = true + conda.enabled = true + conda.useMicromamba = true + params.enable_module = false + clusterOptions = '--signal B:USR2' +} + +kondagac { + process.executor = 'slurm' + executor.$slurm.exitReadTimeout = 120000 + process.queue = 'centriflaken' + process.cpus = 4 + singularity.enabled = false + docker.enabled = false + 
params.enable_conda = true + conda.enabled = true + conda.useMicromamba = true + params.enable_module = false + clusterOptions = '-n 1 --signal B:USR2' +} + +cfsanawsbatch { + process.executor = 'awsbatch' + process.queue = 'cfsan-nf-batch-job-queue' + aws.batch.cliPath = '/home/ec2-user/miniconda/bin/aws' + aws.batch.region = 'us-east-1' + aws.batch.volumes = ['/hpc/db:/hpc/db:ro', '/hpc/scratch:/hpc/scratch:rw'] + singularity.enabled = false + singularity.autoMounts = true + docker.enabled = true + params.enable_conda = false + conda.enabled = false + conda.useMicromamba = false + params.enable_module = false +}
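The Singularity-based profiles above bind `params.input` and `params.bcs_root_dbdir` into the container via `singularity.runOptions`, so both must point at readable host paths. A hedged sketch with hypothetical values:

```groovy
// Hypothetical values; adjust to the actual host layout before running.
params {
    input          = '/hpc/scratch/my_run/fastq'   // bound via -B ${params.input}
    bcs_root_dbdir = '/hpc/db'                     // bound via -B ${params.bcs_root_dbdir}
}
```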
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/conf/fastq.config Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,9 @@ +params { + fq_filter_by_len = "4000" + fq_suffix = ".fastq.gz" + fq2_suffix = false + fq_strandedness = "unstranded" + fq_single_end = false + fq_filename_delim = "_" + fq_filename_delim_idx = "1" +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/conf/logtheseparams.config Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,17 @@ +params { + logtheseparams = [ + "${params.metadata}" ? 'metadata' : null, + "${params.input}" ? 'input' : null, + "${params.output}" ? 'output' : null, + "${params.fq_suffix}" ? 'fq_suffix' : null, + "${params.fq2_suffix}" ? 'fq2_suffix' : null, + "${params.fq_strandedness}" ? 'fq_strandedness' : null, + "${params.fq_single_end}" ? 'fq_single_end' : null, + "${params.fq_filter_by_len}" ? 'fq_filter_by_len' : null, + "${params.fq_filename_delim}" ? 'fq_filename_delim' : null, + "${params.fq_filename_delim_idx}" ? 'fq_filename_delim_idx' : null, + 'enable_conda', + 'enable_module', + 'max_cpus' + ] +}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/conf/manifest.config Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,8 @@ +manifest { + author = 'Kranti.Konganti@fda.hhs.gov' + homePage = 'https://cfsan-git.fda.gov/Kranti.Konganti/cpipes' + name = 'CPIPES' + version = '0.8.0' + nextflowVersion = '>=23.04' + description = 'Modular Nextflow pipelines at CFSAN, FDA.' +}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/conf/modules.config Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,122 @@ +process { + publishDir = [ + path: { + "${task.process.tokenize(':')[-1].toLowerCase()}" == "multiqc" ? + "${params.output}${params.fs}${params.pipeline.toLowerCase()}-${task.process.tokenize(':')[-1].toLowerCase()}" : + "${params.output}${params.fs}${task.process.tokenize(':')[-1].toLowerCase()}" + }, + mode: params.publish_dir_mode, + overwrite: params.publish_dir_overwrite, + saveAs: { filename -> filename =~ /^versions.yml|.+?_mqc.*/ ? null : filename } + ] + + errorStrategy = { + ![0].contains(task.exitStatus) ? dynamic_retry(task.attempt, 10) : 'finish' + } + + maxRetries = 1 + resourceLabels = {[ + process: task.process, + memoryRequested: task.memory.toString(), + cpusRequested: task.cpus.toString() + ]} + + withLabel: 'process_femto' { + cpus = { 1 * task.attempt } + memory = { 1.GB * task.attempt } + time = { 1.h * task.attempt } + } + + withLabel: 'process_pico' { + cpus = { min_cpus(2) * task.attempt } + memory = { 4.GB * task.attempt } + time = { 2.h * task.attempt } + } + + withLabel: 'process_nano' { + cpus = { min_cpus(4) * task.attempt } + memory = { 8.GB * task.attempt } + time = { 4.h * task.attempt } + } + + withLabel: 'process_micro' { + cpus = { min_cpus(8) * task.attempt } + memory = { 16.GB * task.attempt } + time = { 8.h * task.attempt } + } + + withLabel: 'process_only_mem_low' { + cpus = { 1 * task.attempt } + memory = { 60.GB * task.attempt } + time = { 20.h * task.attempt } + } + + withLabel: 'process_only_mem_medium' { + cpus = { 1 * task.attempt } + memory = { 100.GB * task.attempt } + time = { 30.h * task.attempt } + } + + withLabel: 'process_only_mem_high' { + cpus = { 1 * task.attempt } + memory = { 128.GB * task.attempt } + time = { 60.h * task.attempt } + } + + withLabel: 'process_low' { + cpus = { min_cpus(10) * task.attempt } + memory = { 60.GB * task.attempt } + time = { 20.h * task.attempt } + } + + withLabel: 'process_medium' { + cpus = { min_cpus(10) * task.attempt } + memory = { 100.GB * task.attempt } + time = { 30.h * task.attempt } + } + + withLabel: 'process_high' { + cpus = { min_cpus(10) * task.attempt } + memory = { 128.GB * task.attempt } + time = { 60.h * task.attempt } + } + + withLabel: 'process_higher' { + cpus = { min_cpus(10) * task.attempt } + memory = { 256.GB * task.attempt } + time = { 60.h * task.attempt } + } + + withLabel: 'process_gigantic' { + cpus = { min_cpus(10) * task.attempt } + memory = { 512.GB * task.attempt } + time = { 60.h * task.attempt } + } +} + +if ( (params.input || params.metadata ) && params.pipeline ) { + try { + includeConfig "${params.workflowsconf}${params.fs}process${params.fs}${params.pipeline}.process.config" + } catch (Exception e) { + System.err.println('-'.multiply(params.linewidth) + "\n" + + "\033[0;31m${params.cfsanpipename} - ERROR\033[0m\n" + + '-'.multiply(params.linewidth) + "\n" + "\033[0;31mCould not load " + + "default pipeline's process configuration. Please provide a pipeline \n" + + "name using the --pipeline option.\n\033[0m" + '-'.multiply(params.linewidth) + "\n") + System.exit(1) + } +} + +// Function will return after sleeping for some time. +// Sleep time increases exponentially by task attempt. 
+def dynamic_retry(task_retry_num, factor_by) {
+    // sleep(Math.pow(2, task_retry_num.toInteger()) * factor_by.toInteger() as long)
+    sleep(Math.pow(1.27, task_retry_num.toInteger()) as long)
+    return 'retry'
+}
+
+// Function that caps the number of CPU cores requested
+// by the user at params.max_cpus.
+def min_cpus(cores) {
+    return Math.min(cores as int, "${params.max_cpus}" as int)
+}
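A small worked example of how the label-based resources, `min_cpus()` and `dynamic_retry()` above combine on a retried task; the numbers are illustrative and assume `params.max_cpus = 10` from `base.config`:

```groovy
// Illustrative re-statement of the helpers above, outside of Nextflow.
def maxCpus = 10
def minCpus = { int cores -> Math.min(cores, maxCpus) }
def attempt = 2                                  // second attempt after one retry

// A 'process_micro' task on attempt 2:
assert minCpus(8) * attempt == 16                // cpus
// memory: 16.GB * 2 -> 32 GB, time: 8.h * 2 -> 16 h

// dynamic_retry(2, 10) sleeps about Math.pow(1.27, 2) = 1.6 ms (truncated to
// 1 ms by the long cast) before returning 'retry', so the active back-off is mild.
```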
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/conf/multiqc/nowayout_mqc.yml Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,65 @@ +title: CPIPES Report +intro_text: > + CPIPES (CFSAN PIPELINES) is a modular bioinformatics data analysis project at CFSAN, FDA based on NEXTFLOW DSL2. +report_comment: > + This report has been generated by the <a href="https://github.com/CFSAN-Biostatistics/sequoia/blob/master/readme/Workflow_Name_Placeholder.md" target="_blank">CPIPES - Workflow_Name_Placeholder</a> + analysis pipeline. Only certain tables and plots are reported here. For complete results, please refer to the analysis pipeline output directory. +report_header_info: + - CPIPES Version: CPIPES_Version_Placeholder + - Workflow: Workflow_Name_Placeholder + - Workflow Version: Workflow_Version_Placeholder + - Conceived By: "Kranti Konganti" + - Input Directory: Workflow_Input_Placeholder + - Output Directory: Workflow_Output_Placeholder +show_analysis_paths: False +show_analysis_time: False +disable_version_detection: true +report_section_order: + kraken: + order: -994 + NOWAYOUT_collated_table: + order: -995 + NOWAYOUT_INDIV_READS_MAPPED_collated_table: + order: -996 + fastp: + order: -997 + fastqc: + order: -998 + software_versions: + order: -999 + +export_plots: true + +# Run only these modules +run_modules: + - fastqc + - fastp + - kraken + - custom_content + +module_order: + - kraken: + name: "SOURMASH TAX METAGENOME" + href: "https://sourmash.readthedocs.io/en/latest/command-line.html#sourmash-tax-metagenome-summarize-metagenome-content-from-gather-results" + doi: "10.21105/joss.00027" + info: > + section of the report shows how <b>reads</b> are approximately classified. + Please note that the plot title below is shown as + <b>Kraken2: Top taxa</b> since <code>kreport</code> fornat was used + to create Kraken-style reports with <code>sourmash tax metagenome</code>. + path_filters: + - "*.kreport.txt" + - fastqc: + name: "FastQC" + info: > + section of the report shows FastQC results <b>before</b> adapter trimming + on SE reads or on merged PE reads. + path_filters: + - "*_fastqc.zip" + - fastp: + name: "fastp" + info: > + section of the report shows read statistics <b>before</b> and <b>after</b> adapter trimming + with <code>fastp</code> on SE reads or on merged PE reads. + path_filters: + - "*.fastp.json"
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/cpipes Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,58 @@ +#!/usr/bin/env nextflow + +/* +---------------------------------------------------------------------------------------- + cfsan-dev/cpipes +---------------------------------------------------------------------------------------- + NAME : CPIPES + DESCRIPTION : Modular Nextflow pipelines at CFSAN, FDA. + GITLAB : https://xxxxxxxxxx/Kranti.Konganti/cpipes-framework + JIRA : https://xxxxxxxxxx/jira/projects/CPIPES/ + CONTRIBUTORS : Kranti Konganti +---------------------------------------------------------------------------------------- +*/ + +// Enable DSL 2 +nextflow.enable.dsl = 2 + +// Default routines for MAIN +include { pipelineBanner; stopNow; } from "${params.routines}" + +// Our banner for CPIPES +log.info pipelineBanner() + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + INCLUDE ALL WORKFLOWS +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +switch ("${params.pipeline}") { + case "nowayout": + include { NOWAYOUT } from "${params.workflows}${params.fs}${params.pipeline}" + break + default: + stopNow("PLEASE MENTION A PIPELINE NAME. Ex: --pipeline nowayout") +} + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + RUN ALL WORKFLOWS +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +workflow { + // THIS IS REPETETIVE BUT WE ARE NOT ALLOWED TO INCLUDE "INCLUDE" + // INSIDE WORKFLOW + switch ("${params.pipeline}") { + case "nowayout": + NOWAYOUT() + break + } +} + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + THE END +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/
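The entry script dispatches on `--pipeline` twice: once to `include` the matching workflow and once to call it inside the anonymous `workflow` block, so registering a new workflow means adding a matching `case` to both switch blocks above. A minimal, self-contained sketch of that dispatch-by-name pattern (any pipeline name other than `nowayout` is hypothetical):

```groovy
// Illustrative only: the switch-on-name pattern reduced to plain Groovy.
def pipeline = 'nowayout'

switch (pipeline) {
    case 'nowayout':
        println 'would include and run NOWAYOUT()'
        break
    default:
        println 'PLEASE MENTION A PIPELINE NAME. Ex: --pipeline nowayout'
}
```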
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/dbcheck Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,128 @@ +#!/usr/bin/env bash + +########################################################## +# Constants +########################################################## +GREEN=$(tput setaf 2) +RED=$(tput setaf 1) +CYAN=$(tput setaf 6) +CLRESET=$(tput sgr0) +prog_name="nowayout" +dbBuild="03182024" +dbPath="/hpc/db/${prog_name}/$dbBuild" +taxonomyPath="$dbPath/taxonomy" + +usage() +{ + echo + echo usage: "$0" [-h] + echo + echo "Check for species presence in ${prog_name} database(s)." + echo + echo 'Example usage:' + echo + echo 'dbcheck -l' + echo 'dbcheck -g Cathartus' + echo 'dbcheck -d mitomine -g Cathartus' + echo 'dbcheck -d mitomine -s "Cathartus quadriculus"' + echo + echo 'Options:' + echo " -l : List ${prog_name} databases" + echo ' -d : Search this database. Default: mitomine.' + echo ' -g : Genus to search for.' + echo ' -s : "Genus Species" to search for.' + echo ' -h : Show this help message and exit' + echo + echo "$1" +} + +while getopts ":d:g:s:l" OPT; do + case "${OPT}" in + l) + listdb="list" + ;; + d) + dbname=${OPTARG} + ;; + g) + genus=${OPTARG} + ;; + s) + species=${OPTARG} + ;; + ?) + usage + exit 0 + ;; + esac +done + + + +if [ -n "$listdb" ]; then + num_dbs=$(find "$taxonomyPath" -type d | tail -n+2 | wc -l) + echo "==============================================" + + db_num="1" + find $taxonomyPath -type d | tail -n+2 | while read -r db; do + dbName=$(basename "$db") + echo "${db_num}. $dbName" + db_num=$(( db_num + 1 )) + done + echo "==============================================" + echo "Number of ${prog_name} databases: $num_dbs" + echo "==============================================" + + exit 0 +fi + + + +if [ -z "$dbname" ]; then + dbname="mitomine" +fi + +if [[ -n "$genus" && -n "$species" ]]; then + usage "ERROR: Only one of -g or -s needs to be defined!" + exit 1 +elif [ -n "$genus" ]; then + check="$genus" +elif [ -n "$species" ]; then + check="$species" +else + check="" +fi + +if [ -z "$check" ]; then + usage "ERROR: -g or -s is required! check:$check" + exit 1 +fi + +lineages="$taxonomyPath/$dbname/lineages.csv" + +echo +echo -e "Checking ${dbname} for ${CYAN}${check}${CLRESET}...\nPlease wait..." +echo + +num=$(grep -F ",$check," "$lineages" | cut -f1 -d, | sort -u | wc -l) +num_species=$(tail -n+2 "$lineages" | cut -f8 -d, | sort -u | wc -l) +num_entries=$(tail -n+2 "$lineages" | wc -l) + +echo "$dbname brief stats" +echo "==============================================" +echo "DB Build: $dbBuild" +echo "Number of unique species: $num_species" +echo "Number of accessions in database: $num_entries" +echo "==============================================" + + +if [ "$num" -gt 0 ]; then + echo + echo "${GREEN}$check is present in ${dbname}${CLRESET}." + echo "Number of accessions representing $check: $num" + echo "==============================================" +else + echo "${RED}$check is absent in ${dbname}${CLRESET}." + echo -e "No worries. Please request the developer of\n${prog_name} to augment the database!" + echo "==============================================" +fi \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/fastp.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,280 @@ +// Help text for fastp within CPIPES. + +def fastpHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'fastp_run': [ + clihelp: 'Run fastp tool. Default: ' + + (params.fastp_run ?: false), + cliflag: null, + clivalue: null + ], + 'fastp_failed_out': [ + clihelp: 'Specify whether to store reads that cannot pass the filters. ' + + "Default: ${params.fastp_failed_out}", + cliflag: null, + clivalue: null + ], + 'fastp_merged_out': [ + clihelp: 'Specify whether to store merged output or not. ' + + "Default: ${params.fastp_merged_out}", + cliflag: null, + clivalue: null + ], + 'fastp_overlapped_out': [ + clihelp: 'For each read pair, output the overlapped region if it has no mismatched base. ' + + "Default: ${params.fastp_overlapped_out}", + cliflag: '--overlapped_out', + clivalue: (params.fastp_overlapped_out ?: '') + ], + 'fastp_6': [ + clihelp: "Indicate that the input is using phred64 scoring (it'll be converted to phred33, " + + 'so the output will still be phred33). ' + + "Default: ${params.fastp_6}", + cliflag: '-6', + clivalue: (params.fastp_6 ? ' ' : '') + ], + 'fastp_reads_to_process': [ + clihelp: 'Specify how many reads/pairs are to be processed. Default value 0 means ' + + 'process all reads. ' + + "Default: ${params.fastp_reads_to_process}", + cliflag: '--reads_to_process', + clivalue: (params.fastp_reads_to_process ?: '') + ], + 'fastp_fix_mgi_id': [ + clihelp: 'The MGI FASTQ ID format is not compatible with many BAM operation tools, ' + + 'enable this option to fix it. ' + + "Default: ${params.fastp_fix_mgi_id}", + cliflag: '--fix_mgi_id', + clivalue: (params.fastp_fix_mgi_id ? ' ' : '') + ], + 'fastp_A': [ + clihelp: 'Disable adapter trimming. On by default. ' + + "Default: ${params.fastp_A}", + cliflag: '-A', + clivalue: (params.fastp_A ? ' ' : '') + ], + 'fastp_adapter_fasta': [ + clihelp: 'Specify a FASTA file to trim both read1 and read2 (if PE) by all the sequences ' + + 'in this FASTA file. ' + + "Default: ${params.fastp_adapter_fasta}", + cliflag: '--adapter_fasta', + clivalue: (params.fastp_adapter_fasta ?: '') + ], + 'fastp_f': [ + clihelp: 'Trim how many bases in front of read1. ' + + "Default: ${params.fastp_f}", + cliflag: '-f', + clivalue: (params.fastp_f ?: '') + ], + 'fastp_t': [ + clihelp: 'Trim how many bases at the end of read1. ' + + "Default: ${params.fastp_t}", + cliflag: '-t', + clivalue: (params.fastp_t ?: '') + ], + 'fastp_b': [ + clihelp: 'Max length of read1 after trimming. ' + + "Default: ${params.fastp_b}", + cliflag: '-b', + clivalue: (params.fastp_b ?: '') + ], + 'fastp_F': [ + clihelp: 'Trim how many bases in front of read2. ' + + "Default: ${params.fastp_F}", + cliflag: '-F', + clivalue: (params.fastp_F ?: '') + ], + 'fastp_T': [ + clihelp: 'Trim how many bases at the end of read2. ' + + "Default: ${params.fastp_T}", + cliflag: '-T', + clivalue: (params.fastp_T ?: '') + ], + 'fastp_B': [ + clihelp: 'Max length of read2 after trimming. ' + + "Default: ${params.fastp_B}", + cliflag: '-B', + clivalue: (params.fastp_B ?: '') + ], + 'fastp_dedup': [ + clihelp: 'Enable deduplication to drop the duplicated reads/pairs. ' + + "Default: ${params.fastp_dedup}", + cliflag: '--dedup', + clivalue: (params.fastp_dedup ? 
' ' : '') + ], + 'fastp_dup_calc_accuracy': [ + clihelp: 'Accuracy level to calculate duplication (1~6), higher level uses more memory ' + + '(1G, 2G, 4G, 8G, 16G, 24G). Default 1 for no-dedup mode, and 3 for dedup mode. ' + + "Default: ${params.fastp_dup_calc_accuracy}", + cliflag: '--dup_calc_accuracy', + clivalue: (params.fastp_dup_calc_accuracy ?: '') + ], + 'fastp_poly_g_min_len': [ + clihelp: 'The minimum length to detect polyG in the read tail. ' + + "Default: ${params.fastp_poly_g_min_len}", + cliflag: '--poly_g_min_len', + clivalue: (params.fastp_poly_g_min_len ?: '') + ], + 'fastp_G': [ + clihelp: 'Disable polyG tail trimming. ' + + "Default: ${params.fastp_G}", + cliflag: '-G', + clivalue: (params.fastp_G ? ' ' : '') + ], + 'fastp_x': [ + clihelp: "Enable polyX trimming in 3' ends. " + + "Default: ${params.fastp_x}", + cliflag: 'x=', + clivalue: (params.fastp_x ? ' ' : '') + ], + 'fastp_poly_x_min_len': [ + clihelp: 'The minimum length to detect polyX in the read tail. ' + + "Default: ${params.fastp_poly_x_min_len}", + cliflag: '--poly_x_min_len', + clivalue: (params.fastp_poly_x_min_len ?: '') + ], + 'fastp_cut_front': [ + clihelp: "Move a sliding window from front (5') to tail, drop the bases in the window " + + 'if its mean quality < threshold, stop otherwise. ' + + "Default: ${params.fastp_cut_front}", + cliflag: '--cut_front', + clivalue: (params.fastp_cut_front ? ' ' : '') + ], + 'fastp_cut_tail': [ + clihelp: "Move a sliding window from tail (3') to front, drop the bases in the window " + + 'if its mean quality < threshold, stop otherwise. ' + + "Default: ${params.fastp_cut_tail}", + cliflag: '--cut_tail', + clivalue: (params.fastp_cut_tail ? ' ' : '') + ], + 'fastp_cut_right': [ + clihelp: "Move a sliding window from tail, drop the bases in the window and the right part " + + 'if its mean quality < threshold, and then stop. ' + + "Default: ${params.fastp_cut_right}", + cliflag: '--cut_right', + clivalue: (params.fastp_cut_right ? ' ' : '') + ], + 'fastp_W': [ + clihelp: "Sliding window size shared by --fastp_cut_front, --fastp_cut_tail and " + + '--fastp_cut_right. ' + + "Default: ${params.fastp_W}", + cliflag: '--cut_window_size', + clivalue: (params.fastp_W ?: '') + ], + 'fastp_M': [ + clihelp: "The mean quality requirement shared by --fastp_cut_front, --fastp_cut_tail and " + + '--fastp_cut_right. ' + + "Default: ${params.fastp_M}", + cliflag: '--cut_mean_quality', + clivalue: (params.fastp_M ?: '') + ], + 'fastp_q': [ + clihelp: 'The quality value below which a base should is not qualified. ' + + "Default: ${params.fastp_q}", + cliflag: '-q', + clivalue: (params.fastp_q ?: '') + ], + 'fastp_u': [ + clihelp: 'What percent of bases are allowed to be unqualified. ' + + "Default: ${params.fastp_u}", + cliflag: '-u', + clivalue: (params.fastp_u ?: '') + ], + 'fastp_n': [ + clihelp: "How many N's can a read have. " + + "Default: ${params.fastp_n}", + cliflag: '-n', + clivalue: (params.fastp_n ?: '') + ], + 'fastp_e': [ + clihelp: "If the full reads' average quality is below this value, then it is discarded. " + + "Default: ${params.fastp_e}", + cliflag: '-e', + clivalue: (params.fastp_e ?: '') + ], + 'fastp_l': [ + clihelp: 'Reads shorter than this length will be discarded. ' + + "Default: ${params.fastp_l}", + cliflag: '-l', + clivalue: (params.fastp_l ?: '') + ], + 'fastp_max_len': [ + clihelp: 'Reads longer than this length will be discarded. 
' + + "Default: ${params.fastp_max_len}", + cliflag: '--length_limit', + clivalue: (params.fastp_max_len ?: '') + ], + 'fastp_y': [ + clihelp: 'Enable low complexity filter. The complexity is defined as the percentage ' + + 'of bases that are different from its next base (base[i] != base[i+1]). ' + + "Default: ${params.fastp_y}", + cliflag: '-y', + clivalue: (params.fastp_y ? ' ' : '') + ], + 'fastp_Y': [ + clihelp: 'The threshold for low complexity filter (0~100). Ex: A value of 30 means ' + + '30% complexity is required. ' + + "Default: ${params.fastp_Y}", + cliflag: '-Y', + clivalue: (params.fastp_Y ?: '') + ], + 'fastp_U': [ + clihelp: 'Enable Unique Molecular Identifier (UMI) pre-processing. ' + + "Default: ${params.fastp_U}", + cliflag: '-U', + clivalue: (params.fastp_U ? ' ' : '') + ], + 'fastp_umi_loc': [ + clihelp: 'Specify the location of UMI, can be one of ' + + 'index1/index2/read1/read2/per_index/per_read. ' + + "Default: ${params.fastp_umi_loc}", + cliflag: '--umi_loc', + clivalue: (params.fastp_umi_loc ?: '') + ], + 'fastp_umi_len': [ + clihelp: 'If the UMI is in read1 or read2, its length should be provided. ' + + "Default: ${params.fastp_umi_len}", + cliflag: '--umi_len', + clivalue: (params.fastp_umi_len ?: '') + ], + 'fastp_umi_prefix': [ + clihelp: 'If specified, an underline will be used to connect prefix and UMI ' + + '(i.e. prefix=UMI, UMI=AATTCG, final=UMI_AATTCG). ' + + "Default: ${params.fastp_umi_prefix}", + cliflag: '--umi_prefix', + clivalue: (params.fastp_umi_prefix ?: '') + ], + 'fastp_umi_skip': [ + clihelp: 'If the UMI is in read1 or read2, fastp can skip several bases following the UMI. ' + + "Default: ${params.fastp_umi_skip}", + cliflag: '--umi_skip', + clivalue: (params.fastp_umi_skip ?: '') + ], + 'fastp_p': [ + clihelp: 'Enable overrepresented sequence analysis. ' + + "Default: ${params.fastp_p}", + cliflag: '-p', + clivalue: (params.fastp_p ? ' ' : '') + ], + 'fastp_P': [ + clihelp: 'One in this many number of reads will be computed for overrepresentation analysis ' + + '(1~10000), smaller is slower. ' + + "Default: ${params.fastp_P}", + cliflag: '-P', + clivalue: (params.fastp_P ?: '') + ] + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/gsalkronapy.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,58 @@ +// Help text for `gen_salmon_tph_and_krona_tsv.py` (gsalkronapy) within CPIPES. + +def gsalkronapyHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'gsalkronapy_run': [ + clihelp: 'Run the `gen_salmon_tph_and_krona_tsv.py` script. Default: ' + + (params.gsalkronapy_run ?: false), + cliflag: null, + clivalue: null + ], + 'gsalkronapy_sf': [ + clihelp: 'Set the scaling factor by which TPM values ' + + 'are scaled down.' + + " Default: ${params.gsalkronapy_sf}", + cliflag: '-sf', + clivalue: (params.gsalkronapy_sf ?: '') + ], + 'gsalkronapy_smres_suffix': [ + clihelp: 'Find the `sourmash gather` result files ' + + 'ending in this suffix.' + + " Default: ${params.gsalkronapy_smres_suffix}", + cliflag: '-smres-suffix', + clivalue: (params.gsalkronapy_smres_suffix ?: '') + ], + 'gsalkronapy_failed_suffix': [ + clihelp: 'Find the sample names which failed classification stored ' + + 'inside the files ending in this suffix.' + + " Default: ${params.gsalkronapy_failed_suffix}", + cliflag: '-failed-suffix', + clivalue: (params.gsalkronapy_failed_suffix ?: '') + ], + 'gsalkronapy_num_lin_cols': [ + clihelp: 'Number of columns expected in the lineages CSV file. ' + + " Default: ${params.gsalkronapy_num_lin_cols}", + cliflag: '-num-lin-cols', + clivalue: (params.gsalkronapy_num_lin_cols ?: '') + ], + 'gsalkronapy_lin_regex': [ + clihelp: 'Number of columns expected in the lineages CSV file. ' + + " Default: ${params.gsalkronapy_num_lin_cols}", + cliflag: '-num-lin-cols', + clivalue: (params.gsalkronapy_num_lin_cols ?: '') + ] + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/gsatpy.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,32 @@ +// Help text for gen_sim_abn_table.py (gsat) within CPIPES. + +def gsatpyHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'gsatpy_run': [ + clihelp: 'Run the gen_sim_abn_table.py script. Default: ' + + (params.gsatpy_run ?: false), + cliflag: null, + clivalue: null + ], + 'gsatpy_header': [ + clihelp: 'Does the taxonomic summary result files have ' + + 'a header line. ' + + " Default: ${params.gsatpy_header}", + cliflag: '-header', + clivalue: (params.gsatpy_header ? ' ' : '') + ] + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/kmaalign.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,200 @@ +// Help text for kma align within CPIPES. + +def kmaalignHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'kmaalign_run': [ + clihelp: 'Run kma tool. Default: ' + + (params.kmaalign_run ?: false), + cliflag: null, + clivalue: null + ], + 'kmaalign_int': [ + clihelp: 'Input file has interleaved reads. ' + + " Default: ${params.kmaalign_int}", + cliflag: '-int', + clivalue: (params.kmaalign_int ? ' ' : '') + ], + 'kmaalign_ef': [ + clihelp: 'Output additional features. ' + + "Default: ${params.kmaalign_ef}", + cliflag: '-ef', + clivalue: (params.kmaalign_ef ? ' ' : '') + ], + 'kmaalign_vcf': [ + clihelp: 'Output vcf file. 2 to apply FT. ' + + "Default: ${params.kmaalign_vcf}", + cliflag: '-vcf', + clivalue: (params.kmaalign_vcf ? ' ' : '') + ], + 'kmaalign_sam': [ + clihelp: 'Output SAM, 4/2096 for mapped/aligned. ' + + "Default: ${params.kmaalign_sam}", + cliflag: '-sam', + clivalue: (params.kmaalign_sam ? ' ' : '') + ], + 'kmaalign_nc': [ + clihelp: 'No consensus file. ' + + "Default: ${params.kmaalign_nc}", + cliflag: '-nc', + clivalue: (params.kmaalign_nc ? ' ' : '') + ], + 'kmaalign_na': [ + clihelp: 'No aln file. ' + + "Default: ${params.kmaalign_na}", + cliflag: '-na', + clivalue: (params.kmaalign_na ? ' ' : '') + ], + 'kmaalign_nf': [ + clihelp: 'No frag file. ' + + "Default: ${params.kmaalign_nf}", + cliflag: '-nf', + clivalue: (params.kmaalign_nf ? ' ' : '') + ], + 'kmaalign_a': [ + clihelp: 'Output all template mappings. ' + + "Default: ${params.kmaalign_a}", + cliflag: '-a', + clivalue: (params.kmaalign_a ? ' ' : '') + ], + 'kmaalign_and': [ + clihelp: 'Use both -mrs and p-value on consensus. ' + + "Default: ${params.kmaalign_and}", + cliflag: '-and', + clivalue: (params.kmaalign_and ? ' ' : '') + ], + 'kmaalign_oa': [ + clihelp: 'Use neither -mrs or p-value on consensus. ' + + "Default: ${params.kmaalign_oa}", + cliflag: '-oa', + clivalue: (params.kmaalign_oa ? ' ' : '') + ], + 'kmaalign_bc': [ + clihelp: 'Minimum support to call bases. ' + + "Default: ${params.kmaalign_bc}", + cliflag: '-bc', + clivalue: (params.kmaalign_bc ?: '') + ], + 'kmaalign_bcNano': [ + clihelp: 'Altered indel calling for ONT data. ' + + "Default: ${params.kmaalign_bcNano}", + cliflag: '-bcNano', + clivalue: (params.kmaalign_bcNano ? ' ' : '') + ], + 'kmaalign_bcd': [ + clihelp: 'Minimum depth to call bases. ' + + "Default: ${params.kmaalign_bcd}", + cliflag: '-bcd', + clivalue: (params.kmaalign_bcd ?: '') + ], + 'kmaalign_bcg': [ + clihelp: 'Maintain insignificant gaps. ' + + "Default: ${params.kmaalign_bcg}", + cliflag: '-bcg', + clivalue: (params.kmaalign_bcg ? ' ' : '') + ], + 'kmaalign_ID': [ + clihelp: 'Minimum consensus ID. ' + + "Default: ${params.kmaalign_ID}", + cliflag: '-ID', + clivalue: (params.kmaalign_ID ?: '') + ], + 'kmaalign_md': [ + clihelp: 'Minimum depth. ' + + "Default: ${params.kmaalign_md}", + cliflag: '-md', + clivalue: (params.kmaalign_md ?: '') + ], + 'kmaalign_dense': [ + clihelp: 'Skip insertion in consensus. ' + + "Default: ${params.kmaalign_dense}", + cliflag: '-dense', + clivalue: (params.kmaalign_dense ? ' ' : '') + ], + 'kmaalign_ref_fsa': [ + clihelp: 'Use Ns on indels. ' + + "Default: ${params.kmaalign_ref_fsa}", + cliflag: '-ref_fsa', + clivalue: (params.kmaalign_ref_fsa ? ' ' : '') + ], + 'kmaalign_Mt1': [ + clihelp: 'Map everything to one template. 
' + + "Default: ${params.kmaalign_Mt1}", + cliflag: '-Mt1', + clivalue: (params.kmaalign_Mt1 ? ' ' : '') + ], + 'kmaalign_1t1': [ + clihelp: 'Map one query to one template. ' + + "Default: ${params.kmaalign_1t1}", + cliflag: '-1t1', + clivalue: (params.kmaalign_1t1 ? ' ' : '') + ], + 'kmaalign_mrs': [ + clihelp: 'Minimum relative alignment score. ' + + "Default: ${params.kmaalign_mrs}", + cliflag: '-mrs', + clivalue: (params.kmaalign_mrs ?: '') + ], + 'kmaalign_mrc': [ + clihelp: 'Minimum query coverage. ' + + "Default: ${params.kmaalign_mrc}", + cliflag: '-mrc', + clivalue: (params.kmaalign_mrc ?: '') + ], + 'kmaalign_mp': [ + clihelp: 'Minimum phred score of trailing and leading bases. ' + + "Default: ${params.kmaalign_mp}", + cliflag: '-mp', + clivalue: (params.kmaalign_mp ?: '') + ], + 'kmaalign_mq': [ + clihelp: 'Set the minimum mapping quality. ' + + "Default: ${params.kmaalign_mq}", + cliflag: '-mq', + clivalue: (params.kmaalign_mq ?: '') + ], + 'kmaalign_eq': [ + clihelp: 'Minimum average quality score. ' + + "Default: ${params.kmaalign_eq}", + cliflag: '-eq', + clivalue: (params.kmaalign_eq ?: '') + ], + 'kmaalign_5p': [ + clihelp: 'Trim 5 prime by this many bases. ' + + "Default: ${params.kmaalign_5p}", + cliflag: '-5p', + clivalue: (params.kmaalign_5p ?: '') + ], + 'kmaalign_3p': [ + clihelp: 'Trim 3 prime by this many bases ' + + "Default: ${params.kmaalign_3p}", + cliflag: '-3p', + clivalue: (params.kmaalign_3p ?: '') + ], + 'kmaalign_apm': [ + clihelp: 'Sets both -pm and -fpm ' + + "Default: ${params.kmaalign_apm}", + cliflag: '-apm', + clivalue: (params.kmaalign_apm ?: '') + ], + 'kmaalign_cge': [ + clihelp: 'Set CGE penalties and rewards ' + + "Default: ${params.kmaalign_cge}", + cliflag: '-cge', + clivalue: (params.kmaalign_cge ? ' ' : '') + ], + + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/kraken2.nf Mon Mar 31 14:50:40 2025 -0400
@@ -0,0 +1,72 @@
+// Help text for kraken2 within CPIPES.
+
+def kraken2Help(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'kraken2_db': [
+            clihelp: "Absolute path to kraken database. Default: ${params.kraken2_db}",
+            cliflag: '--db',
+            clivalue: null
+        ],
+        'kraken2_confidence': [
+            clihelp: 'Confidence score threshold which must be ' +
+                "between 0 and 1. Default: ${params.kraken2_confidence}",
+            cliflag: '--confidence',
+            clivalue: (params.kraken2_confidence ?: '')
+        ],
+        'kraken2_quick': [
+            clihelp: "Quick operation (use first hit or hits). Default: ${params.kraken2_quick}",
+            cliflag: '--quick',
+            clivalue: (params.kraken2_quick ? ' ' : '')
+        ],
+        'kraken2_use_mpa_style': [
+            clihelp: "Report output like Kraken 1's " +
+                "kraken-mpa-report. Default: ${params.kraken2_use_mpa_style}",
+            cliflag: '--use-mpa-style',
+            clivalue: (params.kraken2_use_mpa_style ? ' ' : '')
+        ],
+        'kraken2_minimum_base_quality': [
+            clihelp: 'Minimum base quality used in classification, ' +
                "which is only effective with FASTQ input. Default: ${params.kraken2_minimum_base_quality}",
+            cliflag: '--minimum-base-quality',
+            clivalue: (params.kraken2_minimum_base_quality ?: '')
+        ],
+        'kraken2_report_zero_counts': [
+            clihelp: 'Report counts for ALL taxa, even if counts are zero. ' +
+                "Default: ${params.kraken2_report_zero_counts}",
+            cliflag: '--report-zero-counts',
+            clivalue: (params.kraken2_report_zero_counts ? ' ' : '')
+        ],
+        'kraken2_report_minimizer_data': [
+            clihelp: 'Report minimizer and distinct minimizer count ' +
+                'information in addition to the normal Kraken report. ' +
+                "Default: ${params.kraken2_report_minimizer_data}",
+            cliflag: '--report-minimizer-data',
+            clivalue: (params.kraken2_report_minimizer_data ? ' ' : '')
+        ],
+        'kraken2_use_names': [
+            clihelp: 'Print scientific names instead of just taxids. ' +
+                "Default: ${params.kraken2_use_names}",
+            cliflag: '--use-names',
+            clivalue: (params.kraken2_use_names ? ' ' : '')
+        ],
+        'kraken2_extract_bug': [
+            clihelp: 'Extract the reads or contigs belonging to this bug. ' +
+                "Default: ${params.kraken2_extract_bug}",
+            cliflag: null,
+            clivalue: null
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+            tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/kronaktimporttext.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,44 @@ +// Help text for ktImportText (krona) within CPIPES. + +def kronaktimporttextHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'krona_ktIT_run': [ + clihelp: 'Run the ktImportText (ktIT) from krona. Default: ' + + (params.krona_ktIT_run ?: false), + cliflag: null, + clivalue: null + ], + 'krona_ktIT_n': [ + clihelp: 'Name of the highest level. ' + + "Default: ${params.krona_ktIT_n}", + cliflag: '-n', + clivalue: (params.krona_ktIT_n ?: '') + ], + 'krona_ktIT_q': [ + clihelp: 'Input file(s) do not have a field for quantity. ' + + "Default: ${params.krona_ktIT_q}", + cliflag: '-q', + clivalue: (params.krona_ktIT_q ? ' ' : '') + ], + 'krona_ktIT_c': [ + clihelp: 'Combine data from each file, rather than creating separate datasets ' + + 'within the chart. ' + + "Default: ${params.krona_ktIT_c}", + cliflag: '-c', + clivalue: (params.krona_ktIT_c ? ' ' : '') + ] + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/salmonidx.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,91 @@ +// Help text for salmon index within CPIPES. + +def salmonidxHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'salmonidx_run': [ + clihelp: 'Run `salmon index` tool. Default: ' + + (params.salmonidx_run ?: false), + cliflag: null, + clivalue: null + ], + 'salmonidx_k': [ + clihelp: 'The size of k-mers that should be used for the ' + + " quasi index. Default: ${params.salmonidx_k}", + cliflag: '-k', + clivalue: (params.salmonidx_k ?: '') + ], + 'salmonidx_gencode': [ + clihelp: 'This flag will expect the input transcript FASTA ' + + 'to be in GENCODE format, and will split the transcript ' + + 'name at the first `|` character. These reduced names ' + + 'will be used in the output and when looking for these ' + + 'transcripts in a gene to transcript GTF.' + + " Default: ${params.salmonidx_gencode}", + cliflag: '--gencode', + clivalue: (params.salmonidx_gencode ? ' ' : '') + ], + 'salmonidx_features': [ + clihelp: 'This flag will expect the input reference to be in the ' + + 'tsv file format, and will split the feature name at the first ' + + '`tab` character. These reduced names will be used in the output ' + + 'and when looking for the sequence of the features. GTF.' + + " Default: ${params.salmonidx_features}", + cliflag: '--features', + clivalue: (params.salmonidx_features ? ' ' : '') + ], + 'salmonidx_keepDuplicates': [ + clihelp: 'This flag will disable the default indexing behavior of ' + + 'discarding sequence-identical duplicate transcripts. If this ' + + 'flag is passed then duplicate transcripts that appear in the ' + + 'input will be retained and quantified separately.' + + " Default: ${params.salmonidx_keepDuplicates}", + cliflag: '--keepDuplicates', + clivalue: (params.salmonidx_keepDuplicates ? ' ' : '') + ], + 'salmonidx_keepFixedFasta': [ + clihelp: 'Retain the fixed fasta file (without short ' + + 'transcripts and duplicates, clipped, etc.) generated ' + + "during indexing. Default: ${params.salmonidx_keepFixedFasta}", + cliflag: '--keepFixedFasta', + clivalue: (params.salmonidx_keepFixedFasta ?: '') + ], + 'salmonidx_filterSize': [ + clihelp: 'The size of the Bloom filter that will be used ' + + 'by TwoPaCo during indexing. The filter will be of ' + + 'size 2^{filterSize}. A value of -1 means that the ' + + 'filter size will be automatically set based on the ' + + 'number of distinct k-mers in the input, as estimated by ' + + "nthll. Default: ${params.salmonidx_filterSize}", + cliflag: '--filterSize', + clivalue: (params.salmonidx_filterSize ?: '') + ], + 'salmonidx_sparse': [ + clihelp: 'Build the index using a sparse sampling of k-mer ' + + 'positions This will require less memory (especially ' + + 'during quantification), but will take longer to construct' + + 'and can slow down mapping / alignment.' + + " Default: ${params.salmonidx_sparse}", + cliflag: '--sparse', + clivalue: (params.salmonidx_sparse ? ' ' : '') + ], + 'salmonidx_n': [ + clihelp: 'Do not clip poly-A tails from the ends of target ' + + "sequences. Default: ${params.salmonidx_n}", + cliflag: '-n', + clivalue: (params.salmonidx_n ? ' ' : '') + ] + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/seqkitgrep.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,75 @@ +// Help text for seqkit `grep` within CPIPES. + +def seqkitgrepHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'seqkit_grep_run': [ + clihelp: 'Run the seqkit `grep` tool. Default: ' + + (params.seqkit_grep_run ?: false), + cliflag: null, + clivalue: null + ], + 'seqkit_grep_n': [ + clihelp: 'Match by full name instead of just ID. ' + + "Default: " + (params.seqkit_grep_n ?: 'undefined'), + cliflag: '--seqkit_grep_n', + clivalue: (params.seqkit_grep_n ? ' ' : '') + ], + 'seqkit_grep_s': [ + clihelp: 'Search subseq on seq, both positive and negative ' + + 'strand are searched, and mismatch allowed using flag --seqkit_grep_m. ' + + "Default: " + (params.seqkit_grep_s ?: 'undefined'), + cliflag: '--seqkit_grep_s', + clivalue: (params.seqkit_grep_s ? ' ' : '') + ], + 'seqkit_grep_c': [ + clihelp: 'Input is circular genome ' + + "Default: " + (params.seqkit_grep_c ?: 'undefined'), + cliflag: '--seqkit_grep_c', + clivalue: (params.seqkit_grep_c ? ' ' : '') + ], + 'seqkit_grep_C': [ + clihelp: 'Just print a count of matching records. With the ' + + '--seqkit_grep_v flag, count non-matching records. ' + + "Default: " + (params.seqkit_grep_v ?: 'undefined'), + cliflag: '--seqkit_grep_v', + clivalue: (params.seqkit_grep_v ? ' ' : '') + ], + 'seqkit_grep_i': [ + clihelp: 'Ignore case while using seqkit grep. ' + + "Default: " + (params.seqkit_grep_i ?: 'undefined'), + cliflag: '--seqkit_grep_i', + clivalue: (params.seqkit_grep_i ? ' ' : '') + ], + 'seqkit_grep_v': [ + clihelp: 'Invert the match i.e. select non-matching records. ' + + "Default: " + (params.seqkit_grep_v ?: 'undefined'), + cliflag: '--seqkit_grep_v', + clivalue: (params.seqkit_grep_v ? ' ' : '') + ], + 'seqkit_grep_m': [ + clihelp: 'Maximum mismatches when matching by sequence. ' + + "Default: " + (params.seqkit_grep_m ?: 'undefined'), + cliflag: '--seqkit_grep_m', + clivalue: (params.seqkit_grep_v ?: '') + ], + 'seqkit_grep_r': [ + clihelp: 'Input patters are regular expressions. ' + + "Default: " + (params.seqkit_grep_m ?: 'undefined'), + cliflag: '--seqkit_grep_m', + clivalue: (params.seqkit_grep_v ?: '') + ] + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/seqkitrmdup.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,61 @@ +// Help text for seqkit rmdup within CPIPES. + +def seqkitrmdupHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'seqkit_rmdup_run': [ + clihelp: 'Remove duplicate sequences using seqkit rmdup. Default: ' + + (params.seqkit_rmdup_run ?: false), + cliflag: null, + clivalue: null + ], + 'seqkit_rmdup_n': [ + clihelp: 'Match and remove duplicate sequences by full name instead of just ID. ' + + "Default: ${params.seqkit_rmdup_n}", + cliflag: '-n', + clivalue: (params.seqkit_rmdup_n ? ' ' : '') + ], + 'seqkit_rmdup_s': [ + clihelp: 'Match and remove duplicate sequences by sequence content. ' + + "Default: ${params.seqkit_rmdup_s}", + cliflag: '-s', + clivalue: (params.seqkit_rmdup_s ? ' ' : '') + ], + 'seqkit_rmdup_d': [ + clihelp: 'Save the duplicated sequences to a file. ' + + "Default: ${params.seqkit_rmdup_d}", + cliflag: null, + clivalue: null + ], + 'seqkit_rmdup_D': [ + clihelp: 'Save the number and list of duplicated sequences to a file. ' + + "Default: ${params.seqkit_rmdup_D}", + cliflag: null, + clivalue: null + ], + 'seqkit_rmdup_i': [ + clihelp: 'Ignore case while using seqkit rmdup. ' + + "Default: ${params.seqkit_rmdup_i}", + cliflag: '-i', + clivalue: (params.seqkit_rmdup_i ? ' ' : '') + ], + 'seqkit_rmdup_P': [ + clihelp: "Only consider positive strand (i.e. 5') when comparing by sequence content. " + + "Default: ${params.seqkit_rmdup_P}", + cliflag: '-P', + clivalue: (params.seqkit_rmdup_P ? ' ' : '') + ] + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/sfhpy.nf Mon Mar 31 14:50:40 2025 -0400
@@ -0,0 +1,58 @@
+// Help text for sourmash_filter_hits.py (sfhpy) within CPIPES.
+def sfhpyHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'sfhpy_run': [
+            clihelp: 'Run the sourmash_filter_hits.py ' +
+                'script. Default: ' +
+                (params.sfhpy_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'sfhpy_fcn': [
+            clihelp: 'Column name by which filtering of rows should be applied. ' +
+                "Default: ${params.sfhpy_fcn}",
+            cliflag: '-fcn',
+            clivalue: (params.sfhpy_fcn ?: '')
+        ],
+        'sfhpy_fcv': [
+            clihelp: 'Remove genomes whose match with the query FASTQ is less than ' +
+                'this much. ' +
+                "Default: ${params.sfhpy_fcv}",
+            cliflag: '-fcv',
+            clivalue: (params.sfhpy_fcv ?: '')
+        ],
+        'sfhpy_gt': [
+            clihelp: 'Apply greater than or equal to condition on numeric values of ' +
+                '--sfhpy_fcn column. ' +
+                "Default: ${params.sfhpy_gt}",
+            cliflag: '-gt',
+            clivalue: (params.sfhpy_gt ? ' ' : '')
+        ],
+        'sfhpy_lt': [
+            clihelp: 'Apply less than or equal to condition on numeric values of ' +
+                '--sfhpy_fcn column. ' +
+                "Default: ${params.sfhpy_lt}",
+            cliflag: '-lt',
+            clivalue: (params.sfhpy_lt ? ' ' : '')
+        ],
+        'sfhpy_all': [
+            clihelp: 'Instead of just the column value, print entire row. ' +
+                "Default: ${params.sfhpy_all}",
+            cliflag: '-all',
+            clivalue: (params.sfhpy_all ? ' ' : '')
+        ],
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+            tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
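These `sfhpy_*` parameters map directly onto the flags that `bin/sourmash_filter_hits.py` (earlier in this changeset) defines. With hypothetical values, the resulting call would filter `sourmash gather` rows roughly as sketched here:

```groovy
// Hypothetical parameter values; the flags come from the cliflag entries above.
def sfhpy = [fcn: 'f_match', fcv: '0.1', gt: true]

def flags = "-fcn ${sfhpy.fcn} -fcv ${sfhpy.fcv}" + (sfhpy.gt ? ' -gt' : ' -lt')
assert flags == '-fcn f_match -fcv 0.1 -gt'
// i.e. keep only rows whose f_match column is >= 0.1
```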
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/sourmashgather.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,86 @@ +// Help text for sourmash gather within CPIPES.mashsketch + +def sourmashgatherHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'sourmashgather_run': [ + clihelp: 'Run `sourmash gather` tool. Default: ' + + (params.sourmashgather_run ?: false), + cliflag: null, + clivalue: null + ], + 'sourmashgather_n': [ + clihelp: 'Number of results to report. ' + + 'By default, will terminate at --sourmashgather_thr_bp value. ' + + "Default: ${params.sourmashgather_n}", + cliflag: '-n', + clivalue: (params.sourmashgather_n ?: '') + ], + 'sourmashgather_thr_bp': [ + clihelp: 'Reporting threshold (in bp) for estimated overlap with remaining query. ' + + "Default: ${params.sourmashgather_thr_bp}", + cliflag: '--threshold-bp', + clivalue: (params.sourmashgather_thr_bp ?: '') + ], + 'sourmashgather_ani_ci': [ + clihelp: 'Output confidence intervals for ANI estimates. ' + + "Default: ${params.sourmashgather_ani_ci}", + cliflag: '--estimate-ani-ci', + clivalue: (params.sourmashgather_ani_ci ? ' ' : '') + ], + 'sourmashgather_k': [ + clihelp: 'The k-mer size to select. ' + + "Default: ${params.sourmashgather_k}", + cliflag: '-k', + clivalue: (params.sourmashgather_k ?: '') + ], + 'sourmashgather_dna': [ + clihelp: 'Choose DNA signature. ' + + "Default: ${params.sourmashgather_dna}", + cliflag: '--dna', + clivalue: (params.sourmashgather_dna ? ' ' : '') + ], + 'sourmashgather_rna': [ + clihelp: 'Choose RNA signature. ' + + "Default: ${params.sourmashgather_rna}", + cliflag: '--rna', + clivalue: (params.sourmashgather_rna ? ' ' : '') + ], + 'sourmashgather_nuc': [ + clihelp: 'Choose Nucleotide signature. ' + + "Default: ${params.sourmashgather_nuc}", + cliflag: '--nucleotide', + clivalue: (params.sourmashgather_nuc ? ' ' : '') + ], + 'sourmashgather_scaled': [ + clihelp: 'Scaled value should be between 100 and 1e6. ' + + "Default: ${params.sourmashgather_scaled}", + cliflag: '--scaled', + clivalue: (params.sourmashgather_scaled ?: '') + ], + 'sourmashgather_inc_pat': [ + clihelp: 'Search only signatures that match this pattern in name, filename, or md5. ' + + "Default: ${params.sourmashgather_inc_pat}", + cliflag: '--include-db-pattern', + clivalue: (params.sourmashgather_inc_pat ?: '') + ], + 'sourmashgather_exc_pat': [ + clihelp: 'Search only signatures that do not match this pattern in name, filename, or md5. ' + + "Default: ${params.sourmashgather_exc_pat}", + cliflag: '--exclude-db-pattern', + clivalue: (params.sourmashgather_exc_pat ?: '') + ] + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/sourmashsearch.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,134 @@ +// Help text for sourmash search within CPIPES. + +def sourmashsearchHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'sourmashsearch_run': [ + clihelp: 'Run `sourmash search` tool. Default: ' + + (params.sourmashsearch_run ?: false), + cliflag: null, + clivalue: null + ], + 'sourmashsearch_n': [ + clihelp: 'Number of results to report. ' + + 'By default, will terminate at --sourmashsearch_thr value. ' + + "Default: ${params.sourmashsearch_n}", + cliflag: '-n', + clivalue: (params.sourmashsearch_n ?: '') + ], + 'sourmashsearch_thr': [ + clihelp: 'Reporting threshold (similarity) to return results. ' + + "Default: ${params.sourmashsearch_thr}", + cliflag: '--threshold', + clivalue: (params.sourmashsearch_thr ?: '') + ], + 'sourmashsearch_contain': [ + clihelp: 'Score based on containment rather than similarity. ' + + "Default: ${params.sourmashsearch_contain}", + cliflag: '--containment', + clivalue: (params.sourmashsearch_contain ? ' ' : '') + ], + 'sourmashsearch_maxcontain': [ + clihelp: 'Score based on max containment rather than similarity. ' + + "Default: ${params.sourmashsearch_contain}", + cliflag: '--max-containment', + clivalue: (params.sourmashsearch_maxcontain ? ' ' : '') + ], + 'sourmashsearch_ignoreabn': [ + clihelp: 'Do NOT use k-mer abundances if present. ' + + "Default: ${params.sourmashsearch_ignoreabn}", + cliflag: '--ignore-abundance', + clivalue: (params.sourmashsearch_ignoreabn ? ' ' : '') + ], + 'sourmashsearch_ani_ci': [ + clihelp: 'Output confidence intervals for ANI estimates. ' + + "Default: ${params.sourmashsearch_ani_ci}", + cliflag: '--estimate-ani-ci', + clivalue: (params.sourmashsearch_ani_ci ? ' ' : '') + ], + 'sourmashsearch_k': [ + clihelp: 'The k-mer size to select. ' + + "Default: ${params.sourmashsearch_k}", + cliflag: '-k', + clivalue: (params.sourmashsearch_k ?: '') + ], + 'sourmashsearch_protein': [ + clihelp: 'Choose a protein signature. ' + + "Default: ${params.sourmashsearch_protein}", + cliflag: '--protein', + clivalue: (params.sourmashsearch_protein ? ' ' : '') + ], + 'sourmashsearch_noprotein': [ + clihelp: 'Do not choose a protein signature. ' + + "Default: ${params.sourmashsearch_noprotein}", + cliflag: '--no-protein', + clivalue: (params.sourmashsearch_noprotein ? ' ' : '') + ], + 'sourmashsearch_dayhoff': [ + clihelp: 'Choose Dayhoff-encoded amino acid signatures. ' + + "Default: ${params.sourmashsearch_dayhoff}", + cliflag: '--dayhoff', + clivalue: (params.sourmashsearch_dayhoff ? ' ' : '') + ], + 'sourmashsearch_nodayhoff': [ + clihelp: 'Do not choose Dayhoff-encoded amino acid signatures. ' + + "Default: ${params.sourmashsearch_nodayhoff}", + cliflag: '--no-dayhoff', + clivalue: (params.sourmashsearch_nodayhoff ? ' ' : '') + ], + 'sourmashsearch_hp': [ + clihelp: 'Choose hydrophobic-polar-encoded amino acid signatures. ' + + "Default: ${params.sourmashsearch_hp}", + cliflag: '--hp', + clivalue: (params.sourmashsearch_hp ? ' ' : '') + ], + 'sourmashsearch_nohp': [ + clihelp: 'Do not choose hydrophobic-polar-encoded amino acid signatures. ' + + "Default: ${params.sourmashsearch_nohp}", + cliflag: '--no-hp', + clivalue: (params.sourmashsearch_nohp ? ' ' : '') + ], + 'sourmashsearch_dna': [ + clihelp: 'Choose DNA signature. ' + + "Default: ${params.sourmashsearch_dna}", + cliflag: '--dna', + clivalue: (params.sourmashsearch_dna ? 
' ' : '') + ], + 'sourmashsearch_nodna': [ + clihelp: 'Do not choose DNA signature. ' + + "Default: ${params.sourmashsearch_nodna}", + cliflag: '--no-dna', + clivalue: (params.sourmashsearch_nodna ? ' ' : '') + ], + 'sourmashsearch_scaled': [ + clihelp: 'Scaled value should be between 100 and 1e6. ' + + "Default: ${params.sourmashsearch_scaled}", + cliflag: '--scaled', + clivalue: (params.sourmashsearch_scaled ?: '') + ], + 'sourmashsearch_inc_pat': [ + clihelp: 'Search only signatures that match this pattern in name, filename, or md5. ' + + "Default: ${params.sourmashsearch_inc_pat}", + cliflag: '--include-db-pattern', + clivalue: (params.sourmashsearch_inc_pat ?: '') + ], + 'sourmashsearch_exc_pat': [ + clihelp: 'Search only signatures that do not match this pattern in name, filename, or md5. ' + + "Default: ${params.sourmashsearch_exc_pat}", + cliflag: '--exclude-db-pattern', + clivalue: (params.sourmashsearch_exc_pat ?: '') + ] + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/sourmashsketch.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,61 @@ +// Help text for sourmash sketch dna within CPIPES. + +def sourmashsketchHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'sourmashsketch_run': [ + clihelp: 'Run `sourmash sketch dna` tool. Default: ' + + (params.sourmashsketch_run ?: false), + cliflag: null, + clivalue: null + ], + 'sourmashsketch_mode': [ + clihelp: "Select which type of signatures to be created: dna, protein, fromfile or translate. " + + "Default: ${params.sourmashsketch_mode}", + cliflag: "${params.sourmashsketch_mode}", + clivalue: ' ' + ], + 'sourmashsketch_p': [ + clihelp: 'Signature parameters to use. ' + + "Default: ${params.sourmashsketch_p}", + cliflag: '-p', + clivalue: (params.sourmashsketch_p ?: '') + ], + 'sourmashsketch_file': [ + clihelp: '<path> A text file containing a list of sequence files to load. ' + + "Default: ${params.sourmashsketch_file}", + cliflag: '--from-file', + clivalue: (params.sourmashsketch_file ?: '') + ], + 'sourmashsketch_f': [ + clihelp: 'Recompute signatures even if the file exists. ' + + "Default: ${params.sourmashsketch_f}", + cliflag: '-f', + clivalue: (params.sourmashsketch_f ? ' ' : '') + ], + 'sourmashsketch_name': [ + clihelp: 'Name the signature generated from each file after the first record in the file. ' + + "Default: ${params.sourmashsketch_name}", + cliflag: '--name-from-first', + clivalue: (params.sourmashsketch_name ? ' ' : '') + ], + 'sourmashsketch_randomize': [ + clihelp: 'Shuffle the list of input files randomly. ' + + "Default: ${params.sourmashsketch_randomize}", + cliflag: '--randomize', + clivalue: (params.sourmashsketch_randomize ? ' ' : '') + ] + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
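The `sourmashsketch_p` value is handed to `sourmash sketch` after `-p`, which expects a comma-separated parameter string (k-mer size, scaled value, abundance tracking). An illustration with hypothetical values, not this changeset's defaults:

```groovy
// Hypothetical override; the actual defaults live in the workflow's process config.
params {
    sourmashsketch_mode = 'dna'
    sourmashsketch_p    = 'abund,scaled=1000,k=51'
}
// Roughly: sourmash sketch dna -p abund,scaled=1000,k=51 <input reads>
```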
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/help/sourmashtaxmetagenome.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,69 @@ +// Help text for sourmash tax metagenome within CPIPES. + +def sourmashtaxmetagenomeHelp(params) { + + Map tool = [:] + Map toolspecs = [:] + tool.text = [:] + tool.helpparams = [:] + + toolspecs = [ + 'sourmashtaxmetagenome_run': [ + clihelp: 'Run `sourmash tax metagenome` tool. Default: ' + + (params.sourmashtaxmetagenome_run ?: false), + cliflag: null, + clivalue: null + ], + 'sourmashtaxmetagenome_t': [ + clihelp: "Taxonomy CSV file. " + + "Default: ${params.sourmashtaxmetagenome_t}", + cliflag: '-t', + clivalue: (params.sourmashtaxmetagenome_t ?: '') + ], + 'sourmashtaxmetagenome_r': [ + clihelp: 'For non-default output formats: Summarize genome' + + ' taxonomy at this rank and above. Note that the taxonomy CSV must' + + ' contain lineage information at this rank.' + + " Default: ${params.sourmashtaxmetagenome_r}", + cliflag: '-r', + clivalue: (params.sourmashtaxmetagenome_r ?: '') + ], + 'sourmashtaxmetagenome_F': [ + clihelp: 'Choose output format. ' + + "Default: ${params.sourmashtaxmetagenome_F}", + cliflag: '--output-format', + clivalue: (params.sourmashtaxmetagenome_F ?: '') + ], + 'sourmashtaxmetagenome_f': [ + clihelp: 'Continue past errors in taxonomy database loading. ' + + "Default: ${params.sourmashtaxmetagenome_f}", + cliflag: '-f', + clivalue: (params.sourmashtaxmetagenome_f ?: '') + ], + 'sourmashtaxmetagenome_kfi': [ + clihelp: 'Do not split identifiers on whitespace. ' + + "Default: ${params.sourmashtaxmetagenome_kfi}", + cliflag: '--keep-full-identifiers', + clivalue: (params.sourmashtaxmetagenome_kfi ? ' ' : '') + ], + 'sourmashtaxmetagenome_kiv': [ + clihelp: 'After splitting identifiers do not remove accession versions. ' + + "Default: ${params.sourmashtaxmetagenome_kiv}", + cliflag: '--keep-identifier-versions', + clivalue: (params.sourmashtaxmetagenome_kiv ?: '') + ], + 'sourmashtaxmetagenome_fomt': [ + clihelp: 'Fail quickly if taxonomy is not available for an identifier. ' + + "Default: ${params.sourmashtaxmetagenome_fomt}", + cliflag: '--fail-on-missing-taxonomy', + clivalue: (params.sourmashtaxmetagenome_fomt ? ' ' : '') + ] + ] + + toolspecs.each { + k, v -> tool.text['--' + k] = "${v.clihelp}" + tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ] + } + + return tool +} \ No newline at end of file
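All three sourmash help maps above follow the same convention: `tool.text` carries the `--option` help strings and `tool.helpparams` carries the `cliflag`/`clivalue` pairs. As a minimal sketch (the helper function and the literal values below are illustrative, not part of CPIPES), a caller could flatten `helpparams` into a command-line fragment like this:

```groovy
// Hypothetical sketch: flatten a helpparams-style map into a CLI fragment.
// The map literal below uses made-up values; real entries come from the
// *Help(params) functions above.
def buildCliArgs(Map helpparams) {
    helpparams.collect { name, spec ->
        (spec.cliflag && spec.clivalue) ? "${spec.cliflag} ${spec.clivalue}".trim() : ''
    }.findAll { it }.join(' ')
}

def helpparams = [
    sourmashsearch_n  : [cliflag: '-n', clivalue: 10],
    sourmashsearch_thr: [cliflag: '--threshold', clivalue: 0.85],
    sourmashsearch_dna: [cliflag: '--dna', clivalue: ' ']
]

println buildCliArgs(helpparams) // -n 10 --threshold 0.85 --dna
```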
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/lib/routines.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,391 @@ +// Hold methods to print: +// 1. Colored logo. +// 2. Summary of parameters. +// 3. Single dashed line. +// 4. Double dashed line. +// + +import groovy.json.JsonSlurper +import nextflow.config.ConfigParser +// import groovy.json.JsonOutput + +// ASCII logo +def pipelineBanner() { + + def padding = (params.pad) ?: 30 + Map fgcolors = getANSIColors() + + def banner = [ + name: "${fgcolors.magenta}${workflow.manifest.name}${fgcolors.reset}", + author: "${fgcolors.cyan}${workflow.manifest.author}${fgcolors.reset}", + // workflow: "${fgcolors.magenta}${params.pipeline}${fgcolors.reset}", + version: "${fgcolors.green}${workflow.manifest.version}${fgcolors.reset}", + center: "${fgcolors.green}${params.center}${fgcolors.reset}", + pad: padding + ] + + manifest = addPadding(banner) + + return """${fgcolors.white}${dashedLine(type: '=')}${fgcolors.magenta} + (o) + ___ _ __ _ _ __ ___ ___ + / __|| '_ \\ | || '_ \\ / _ \\/ __| +| (__ | |_) || || |_) || __/\\__ \\ + \\___|| .__/ |_|| .__/ \\___||___/ + | | | | + |_| |_|${fgcolors.reset} +${dashedLine()} +${fgcolors.blue}A collection of modular pipelines at CFSAN, FDA.${fgcolors.reset} +${dashedLine()} +${manifest} +${dashedLine(type: '=')} +""".stripIndent() +} + +// Add padding to keys so that +// they indent nicely on the +// terminal +def addPadding(values) { + + def pad = (params.pad) ?: 30 + values.pad = pad + + def padding = values.pad.toInteger() + def nocapitalize = values.nocapitalize + def stopnow = values.stopNow + def help = values.help + + values.removeAll { + k, v -> [ + 'nocapitalize', + 'pad', + 'stopNow', + 'help' + ].contains(k) + } + + values.keySet().each { k -> + v = values[k] + s = params.linewidth - (pad + 5) + if (v.toString().size() > s && !stopnow) { + def sen = '' + // v.toString().findAll(/.{1,${s}}\b(?:\W*|\s*)/).each { + // sen += ' '.multiply(padding + 2) + it + '\n' + // } + v.toString().eachMatch(/.{1,${s}}(?=.*)\b|\w+/) { + sen += ' '.multiply(padding + 2) + it.trim() + '\n' + } + values[k] = ( + help ? sen.replaceAll(/^(\n|\s)*/, '') : sen.trim() + ) + } else { + values[k] = (help ? v + "\n" : v) + } + k = k.replaceAll(/\./, '_') + } + + return values.findResults { + k, v -> nocapitalize ? 
+ k.padRight(padding) + ': ' + v : + k.capitalize().padRight(padding) + ': ' + v + }.join("\n") +} + +// Method for error messages +def stopNow(msg) { + + Map fgcolors = getANSIColors() + Map errors = [:] + + if (msg == null) { + msg = "Unknown error" + } + + errors['stopNow'] = true + errors["${params.cfsanpipename} - ${params.pipeline} - ERROR"] = """ +${fgcolors.reset}${dashedLine()} +${fgcolors.red}${msg}${fgcolors.reset} +${dashedLine()} +""".stripIndent() + // println dashedLine() // defaults to stdout + // log.info addPadding(errors) // prints to stdout + exit 1, "\n" + dashedLine() + + "${fgcolors.red}\n" + addPadding(errors) +} + +// Method to validate 4 required parameters +// if input for entry point is FASTQ files +def validateParamsForFASTQ() { + switch (params) { + case { params.metadata == null && params.input == null }: + stopNow("Either metadata CSV file with 5 required columns\n" + + "in order: sample, fq1, fq2, strandedness, single_end or \n" + + "input directory of only FASTQ files (gzipped or unzipped) should be provided\n" + + "using --metadata or --input options.\n" + + "None of these two options were provided!") + break + case { params.metadata != null && params.input != null }: + stopNow("Either metadata or input directory of FASTQ files\n" + + "should be provided using --metadata or --input options.\n" + + "Using both these options is not allowed!") + break + case { params.output == null }: + stopNow("Please mention output directory to store all results " + + "using --output option!") + break + } + return 1 +} + +// Method to print summary of parameters +// before running +def summaryOfParams() { + + def pipeline_specific_config = new ConfigParser().setIgnoreIncludes(true).parse( + file("${params.workflowsconf}${params.fs}${params.pipeline}.config").text + ) + Map fgcolors = getANSIColors() + Map globalparams = [:] + Map localparams = params.subMap( + pipeline_specific_config.params.keySet().toList() + params.logtheseparams + ) + + if (localparams !instanceof Map) { + stopNow("Need a Map of paramters. 
We got: " + localparams.getClass()) + } + + if (localparams.size() != 0) { + localparams['nocapitalize'] = true + globalparams['nocapitalize'] = true + globalparams['nextflow_version'] = "${nextflow.version}" + globalparams['nextflow_build'] = "${nextflow.build}" + globalparams['nextflow_timestamp'] = "${nextflow.timestamp}" + globalparams['workflow_projectDir'] = "${workflow.projectDir}" + globalparams['workflow_launchDir'] = "${workflow.launchDir}" + globalparams['workflow_workDir'] = "${workflow.workDir}" + globalparams['workflow_container'] = "${workflow.container}" + globalparams['workflow_containerEngine'] = "${workflow.containerEngine}" + globalparams['workflow_runName'] = "${workflow.runName}" + globalparams['workflow_sessionId'] = "${workflow.sessionId}" + globalparams['workflow_profile'] = "${workflow.profile}" + globalparams['workflow_start'] = "${workflow.start}" + globalparams['workflow_commandLine'] = "${workflow.commandLine}" + return """${dashedLine()} +Summary of the current workflow (${fgcolors.magenta}${params.pipeline}${fgcolors.reset}) parameters +${dashedLine()} +${addPadding(localparams)} +${dashedLine()} +${fgcolors.cyan}N E X T F L O W${fgcolors.reset} - ${fgcolors.magenta}${params.cfsanpipename}${fgcolors.reset} - Runtime metadata +${dashedLine()} +${addPadding(globalparams)} +${dashedLine()}""".stripIndent() + } + return 1 +} + +// Method to display +// Return dashed line either '-' +// type or '=' type +def dashedLine(Map defaults = [:]) { + + Map fgcolors = getANSIColors() + def line = [color: 'white', type: '-'] + + if (!defaults.isEmpty()) { + line.putAll(defaults) + } + + return fgcolors."${line.color}" + + "${line.type}".multiply(params.linewidth) + + fgcolors.reset +} + +// Return slurped keys parsed from JSON +def slurpJson(file) { + def slurped = null + def jsonInst = new JsonSlurper() + + try { + slurped = jsonInst.parse(new File ("${file}")) + } + catch (Exception e) { + log.error 'Please check your JSON schema. Invalid JSON file: ' + file + } + + // Declare globals for the nanofactory + // workflow. + return [keys: slurped.keySet().toList(), cparams: slurped] +} + +// Default help text in a map if the entry point +// to a pipeline is FASTQ files. +def fastqEntryPointHelp() { + + Map helptext = [:] + Map fgcolors = getANSIColors() + + helptext['Workflow'] = "${fgcolors.magenta}${params.pipeline}${fgcolors.reset}" + helptext['Author'] = "${fgcolors.cyan}${params.workflow_built_by}${fgcolors.reset}" + helptext['Version'] = "${fgcolors.green}${params.workflow_version}${fgcolors.reset}\n" + helptext['Usage'] = "cpipes --pipeline ${params.pipeline} [options]\n" + helptext['Required'] = "" + helptext['--input'] = "Absolute path to directory containing FASTQ files. " + + "The directory should contain only FASTQ files as all the " + + "files within the mentioned directory will be read. " + + "Ex: --input /path/to/fastq_pass" + helptext['--output'] = "Absolute path to directory where all the pipeline " + + "outputs should be stored. Ex: --output /path/to/output" + helptext['Other options'] = "" + helptext['--metadata'] = "Absolute path to metadata CSV file containing five " + + "mandatory columns: sample,fq1,fq2,strandedness,single_end. The fq1 and fq2 " + + "columns contain absolute paths to the FASTQ files. This option can be used in place " + + "of --input option. This is rare. 
Ex: --metadata samplesheet.csv" + helptext['--fq_suffix'] = "The suffix of FASTQ files (Unpaired reads or R1 reads or Long reads) if " + + "an input directory is mentioned via --input option. Default: ${params.fq_suffix}" + helptext['--fq2_suffix'] = "The suffix of FASTQ files (Paired-end reads or R2 reads) if an input directory is mentioned via " + + "--input option. Default: ${params.fq2_suffix}" + helptext['--fq_filter_by_len'] = "Remove FASTQ reads that are less than this many bases. " + + "Default: ${params.fq_filter_by_len}" + helptext['--fq_strandedness'] = "The strandedness of the sequencing run. This is mostly needed " + + "if your sequencing run is RNA-SEQ. For most of the other runs, it is probably safe to use " + + "unstranded for the option. Default: ${params.fq_strandedness}" + helptext['--fq_single_end'] = "SINGLE-END information will be auto-detected but this option forces " + + "PAIRED-END FASTQ files to be treated as SINGLE-END so only read 1 information is included in " + + "auto-generated samplesheet. Default: ${params.fq_single_end}" + helptext['--fq_filename_delim'] = "Delimiter by which the file name is split to obtain sample name. " + + "Default: ${params.fq_filename_delim}" + helptext['--fq_filename_delim_idx'] = "After splitting FASTQ file name by using the --fq_filename_delim option," + + " all elements before this index (1-based) will be joined to create final sample name." + + " Default: ${params.fq_filename_delim_idx}" + + return helptext +} + +// Show concise help text if configured within the main workflow. +def conciseHelp(def tool = null) { + Map fgcolors = getANSIColors() + + tool ?= "fastp" + tools = tool?.tokenize(',') + + return """ +${dashedLine()} +Show configurable CLI options for each tool within ${fgcolors.magenta}${params.pipeline}${fgcolors.reset} +${dashedLine()} +Ex: cpipes --pipeline ${params.pipeline} --help +""" + (tools.size() > 1 ? "Ex: cpipes --pipeline ${params.pipeline} --help ${tools[0]}" + + """ +Ex: cpipes --pipeline ${params.pipeline} --help ${tools[0]},${tools[1]} +${dashedLine()}""".stripIndent() : """Ex: cpipes --pipeline ${params.pipeline} --help ${tool} +${dashedLine()}""".stripIndent()) + +} + +// Wrap help text with the following options +def wrapUpHelp() { + + return [ + 'Help options' : "", + '--help': "Display this message.\n", + 'help': true, + 'nocapitalize': true + ] +} + +// Method to send email on workflow complete. +def sendMail() { + + if (params.user_email == null) { + return 1 + } + + def pad = (params.pad) ?: 30 + def contact_emails = [ + stakeholder: (params.workflow_blueprint_by ?: 'Not defined'), + author: (params.workflow_built_by ?: 'Not defined') + ] + def msg = """ +${pipelineBanner()} +${summaryOfParams()} +${params.cfsanpipename} - ${params.pipeline} +${dashedLine()} +Please check the following directory for N E X T F L O W +reports. You can view the HTML files directly by double clicking +them on your workstation. +${dashedLine()} +${params.tracereportsdir} +${dashedLine()} +Please send any bug reports to CFSAN Dev Team or the author or +the stakeholder of the current pipeline. +${dashedLine()} +Error messages (if any) +${dashedLine()} +${workflow.errorMessage} +${workflow.errorReport} +${dashedLine()} +Contact emails +${dashedLine()} +${addPadding(contact_emails)} +${dashedLine()} +Thank you for using ${params.cfsanpipename} - ${params.pipeline}! 
+${dashedLine()} +""".stripIndent() + + def mail_cmd = [ + 'sendmail', + '-f', 'noreply@gmail.com', + '-F', 'noreply', + '-t', "${params.user_email}" + ] + + def email_subject = "${params.cfsanpipename} - ${params.pipeline}" + Map fgcolors = getANSIColors() + + if (workflow.success) { + email_subject += ' completed successfully!' + } + else if (!workflow.success) { + email_subject += ' has failed!' + } + + try { + ['env', 'bash'].execute() << """${mail_cmd.join(' ')} +Subject: ${email_subject} +Mime-Version: 1.0 +Content-Type: text/html +<pre> +${msg.replaceAll(/\x1b\[[0-9;]*m/, '')} +</pre> +""".stripIndent() + } catch (all) { + def warning_msg = "${fgcolors.yellow}${params.cfsanpipename} - ${params.pipeline} - WARNING" + .padRight(pad) + ':' + log.info """ +${dashedLine()} +${warning_msg} +${dashedLine()} +Could not send mail with the sendmail command! +${dashedLine()} +""".stripIndent() + } + return 1 +} + +// Set ANSI colors for any and all +// STDOUT or STDERR +def getANSIColors() { + + Map fgcolors = [:] + + fgcolors['reset'] = "\033[0m" + fgcolors['black'] = "\033[0;30m" + fgcolors['red'] = "\033[0;31m" + fgcolors['green'] = "\033[0;32m" + fgcolors['yellow'] = "\033[0;33m" + fgcolors['blue'] = "\033[0;34m" + fgcolors['magenta'] = "\033[0;35m" + fgcolors['cyan'] = "\033[0;36m" + fgcolors['white'] = "\033[0;37m" + + return fgcolors +}
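A hedged usage sketch for the helpers above: `addPadding` right-pads the keys of a Map and `dashedLine` draws a separator line. The summary map below is illustrative; `params.pad` and `params.linewidth` are expected to be defined by the pipeline configuration.

```groovy
// Hypothetical usage of addPadding() and dashedLine(); the map contents are
// illustrative and params.pad / params.linewidth must already be set.
def summary = [
    'Pipeline'  : 'nowayout',
    'Output dir': '/path/to/output',
    nocapitalize: true
]

log.info dashedLine(type: '=')
log.info addPadding(summary)
log.info dashedLine()
```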
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/bwa/mem/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,50 @@ +process BWA_MEM { + tag "$meta.id" + label 'process_micro' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}bwa${params.fs}0.7.17" : null) + conda (params.enable_conda ? "bioconda::bwa=0.7.17 conda-forge::perl" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bwa:0.7.17--he4a0461_11' : + 'quay.io/biocontainers/bwa:0.7.17--he4a0461_11' }" + + input: + tuple val(meta), path(reads), path(index) + val index2 + + output: + tuple val(meta), path("*.sam"), emit: aligned_sam + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def this_index = (index ?: index2) + """ + + if [ "${params.fq_single_end}" = "false" ]; then + bwa mem \\ + $args \\ + -t $task.cpus \\ + $this_index \\ + ${reads[0]} ${reads[1]} > ${prefix}.aligned.sam + else + bwa mem \\ + $args \\ + -t $task.cpus \\ + -a \\ + $this_index \\ + $reads > ${prefix}.aligned.sam + + fi + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bwa: \$(echo \$(bwa 2>&1) | sed 's/^.*Version: //; s/Contact:.*\$//') + END_VERSIONS + """ +} \ No newline at end of file
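The module forwards `task.ext.args` verbatim to `bwa mem`, so run-specific options can be supplied from `nextflow.config`. A hedged example (the flags shown are standard `bwa mem` options chosen for illustration):

```groovy
// Hypothetical nextflow.config snippet; -M and -k are standard bwa mem flags.
process {
    withName: 'BWA_MEM' {
        ext.args = '-M -k 19'
    }
}
```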
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/cat/fastq/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,96 @@ +# NextFlow DSL2 Module + +```bash +CAT_FASTQ +``` + +## Description + +Concatenates a list of FASTQ files. Produces 2 files per sample (`id:`) if `single_end` is `false` as mentioned in the metadata Groovy Map. + +\ + + +### `input:` + +___ + +Type: `tuple` + +Takes in the following tuple of metadata (`meta`) and a list of FASTQ files of input type `path` (`reads`) to be concatenated. + +Ex: + +```groovy +[ [id: 'sample1', single_end: true], ['/data/sample1/f_L001.fq', '/data/sample1/f_L002.fq'] ] +``` + +\ + + +#### `meta` + +Type: Groovy Map + +A Groovy Map containing the metadata about the FASTQ file. + +Ex: + +```groovy +[ id: 'FAL00870', strandedness: 'unstranded', single_end: true ] +``` + +\ + + +#### `reads` + +Type: `path` + +NextFlow input type of `path` pointing to list of FASTQ files. + +\ + + +#### `args` + +Type: Groovy String + +String of optional command-line arguments to be passed to the tool. This can be mentioned in `process` scope within `withName:process_name` block using `ext.args` option within your `nextflow.config` file. + +Ex: + +```groovy +withName: 'CAT_FASTQ' { + ext.args = '--genome_size 5.5m' +} +``` + +\ + + +### `output:` + +___ + +Type: `tuple` + +Outputs a tuple of metadata (`meta` from `input:`) and list of concatenated FASTQ files (`catted_reads`). + +\ + + +#### `catted_reads` + +Type: `path` + +NextFlow output type of `path` pointing to the concatenated FASTQ files per sample (`id:`). + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/cat/fastq/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,89 @@ +process CAT_FASTQ { + tag "$meta.id" + label 'process_micro' + + conda (params.enable_conda ? "conda-forge::sed=4.7 conda-forge::gzip" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img' : + 'biocontainers/biocontainers:v1.2.0_cv1' }" + + input: + tuple val(meta), path(reads, stageAs: "input*/*") + + output: + tuple val(meta), path("*.merged.fastq.gz"), emit: catted_reads + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def readList = reads.collect{ it.toString() } + def is_in_gz = readList[0].endsWith('.gz') + def gz_or_ungz = (is_in_gz ? '' : ' | gzip') + def pigz_or_ungz = (is_in_gz ? '' : " | pigz -p ${task.cpus}") + if (meta.single_end) { + if (readList.size > 1) { + """ + zcmd="gzip" + zver="" + + if type pigz > /dev/null 2>&1; then + cat ${readList.join(' ')} ${pigz_or_ungz} > ${prefix}.merged.fastq.gz + zcmd="pigz" + zver=\$( echo \$( \$zcmd --version 2>&1 ) | sed -e '1!d' | sed "s/\$zcmd //" ) + else + cat ${readList.join(' ')} ${gz_or_ungz} > ${prefix}.merged.fastq.gz + zcmd="gzip" + + if [ "${workflow.containerEngine}" != "null" ]; then + zver=\$( echo \$( \$zcmd --help 2>&1 ) | sed -e '1!d; s/ (.*\$//' ) + else + zver=\$( echo \$( \$zcmd --version 2>&1 ) | sed "s/^.*(\$zcmd) //; s/\$zcmd //; s/ Copyright.*\$//" ) + fi + fi + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + cat: \$( echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//' ) + \$zcmd: \$zver + END_VERSIONS + """ + } + } else { + if (readList.size > 2) { + def read1 = [] + def read2 = [] + readList.eachWithIndex{ v, ix -> ( ix & 1 ? read2 : read1 ) << v } + """ + zcmd="gzip" + zver="" + + if type pigz > /dev/null 2>&1; then + cat ${read1.join(' ')} ${pigz_or_ungz} > ${prefix}_1.merged.fastq.gz + cat ${read2.join(' ')} ${pigz_or_ungz} > ${prefix}_2.merged.fastq.gz + zcmd="pigz" + zver=\$( echo \$( \$zcmd --version 2>&1 ) | sed -e '1!d' | sed "s/\$zcmd //" ) + else + cat ${read1.join(' ')} ${gz_or_ungz} > ${prefix}_1.merged.fastq.gz + cat ${read2.join(' ')} ${gz_or_ungz} > ${prefix}_2.merged.fastq.gz + zcmd="gzip" + + if [ "${workflow.containerEngine}" != "null" ]; then + zver=\$( echo \$( \$zcmd --help 2>&1 ) | sed -e '1!d; s/ (.*\$//' ) + else + zver=\$( echo \$( \$zcmd --version 2>&1 ) | sed "s/^.*(\$zcmd) //; s/\$zcmd //; s/ Copyright.*\$//" ) + fi + fi + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + cat: \$( echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//' ) + \$zcmd: \$zver + END_VERSIONS + """ + } + } +} \ No newline at end of file
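A hedged workflow-level sketch of calling `CAT_FASTQ` (the include path and file paths are illustrative). Note that for paired-end input the module splits the incoming list by position, so files must be ordered R1, R2, R1, R2, ... across lanes.

```groovy
// Hypothetical usage; paths are illustrative. Even-indexed files are treated
// as R1 and odd-indexed files as R2 by the module's eachWithIndex split.
include { CAT_FASTQ } from './modules/cat/fastq/main'

workflow {
    ch_reads = Channel.of(
        [
            [id: 'sample1', single_end: false],
            [
                file('/data/sample1/s1_L001_R1.fastq.gz'),
                file('/data/sample1/s1_L001_R2.fastq.gz'),
                file('/data/sample1/s1_L002_R1.fastq.gz'),
                file('/data/sample1/s1_L002_R2.fastq.gz')
            ]
        ]
    )

    CAT_FASTQ(ch_reads)
    CAT_FASTQ.out.catted_reads.view()
}
```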
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/cat/tables/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,88 @@ +# NextFlow DSL2 Module + +```bash +TABLE_SUMMARY +``` + +## Description + +Concatenates a list of tables (CSV or TAB delimited) in `.txt` or `.csv` format. The table files to be concatenated **must** have a header as the header from one of the table files will be used as the header for the concatenated result table file. + +\ + + +### `input:` + +___ + +Type: `tuple` + +Takes in the following tuple of `val` table key (`table_sum_on`) and a list of table files of input type `path` (`tables`) to be concatenated. For this module to work, a `bin` directory with the script `create_mqc_data_table.py` should be present where the NextFlow script using this DSL2 module will be run. This `python` script will convert the aggregated table to `.yml` format to be used with `multiqc`. + +Ex: + +```groovy +[ ['ectyper'], ['/data/sample1/f1_ectyper.txt', '/data/sample2/f2_ectyper.txt'] ] +``` + +\ + + +#### `table_sum_on` + +Type: `val` + +A single key defining what tables are being concatenated. For example, if all the `ectyper` results are being concatenated for all samples, then this can be `ectyper`. + +Ex: + +```groovy +[ ['ectyper'], ['/data/sample1/f1_ectyper.txt', '/data/sample2/f2_ectyper.txt'] ] +``` + +\ + + +#### `tables` + +Type: `path` + +NextFlow input type of `path` pointing to a list of tables (files) to be concatenated. + +\ + + +### `output:` + +___ + +Type: `tuple` + +Outputs a tuple of table key (`table_sum_on` from `input:`) and list of concatenated table files (`tblsummed`). + +\ + + +#### `tblsummed` + +Type: `path` + +NextFlow output type of `path` pointing to the concatenated table files per table key (Ex: `ectyper`). + +\ + + +#### `mqc_yml` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing table contents in `YAML` format which can be used to inject this table as part of the `multiqc` report. + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/cat/tables/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,58 @@ +process TABLE_SUMMARY { + tag "$table_sum_on" + label 'process_low' + + // Requires `pyyaml` which does not have a dedicated container but is in the MultiQC container + module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null) + conda (params.enable_conda ? "conda-forge::python=3.9 conda-forge::pyyaml conda-forge::coreutils" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/multiqc:1.11--pyhdfd78af_0' : + 'quay.io/biocontainers/multiqc:1.11--pyhdfd78af_0' }" + + input: + tuple val(table_sum_on), path(tables) + + output: + tuple val(table_sum_on), path("*.tblsum.txt"), emit: tblsummed + path "*_mqc.yml" , emit: mqc_yml + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when || tables + + script: + def args = task.ext.args ?: '' + def onthese = tables.collect().join('\\n') + """ + filenum="1" + header="" + + echo -e "$onthese" | while read -r file; do + + if [ "\${filenum}" == "1" ]; then + header=\$( head -n1 "\${file}" ) + echo -e "\${header}" > ${table_sum_on}.tblsum.txt + fi + + tail -n+2 "\${file}" >> ${table_sum_on}.tblsum.txt + + filenum=\$((filenum+1)) + done + + create_mqc_data_table.py $table_sum_on ${workflow.manifest.name} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bash: \$( bash --version 2>&1 | sed '1!d; s/^.*version //; s/ (.*\$//' ) + python: \$( python --version | sed 's/Python //g' ) + END_VERSIONS + + headver=\$( head --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' ) + tailver=\$( tail --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' ) + + cat <<-END_VERSIONS >> versions.yml + head: \$headver + tail: \$tailver + END_VERSIONS + """ +} \ No newline at end of file
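A hedged usage sketch for `TABLE_SUMMARY`, mirroring the tuple shape shown in the README above (the key and paths are illustrative; `create_mqc_data_table.py` must be available in the calling pipeline's `bin` directory):

```groovy
// Hypothetical usage; the table key and file paths are illustrative.
include { TABLE_SUMMARY } from './modules/cat/tables/main'

workflow {
    ch_tables = Channel.of(
        [ ['ectyper'], [ file('/data/sample1/f1_ectyper.txt'),
                         file('/data/sample2/f2_ectyper.txt') ] ]
    )
    TABLE_SUMMARY(ch_tables)
}
```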
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/custom/dump_software_versions/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,57 @@ +# NextFlow DSL2 Module + +```bash +DUMP_SOFTWARE_VERSIONS +``` + +## Description + +Given a `YAML`-format file, produce a final `.yml` file which has unique entries and a corresponding `.mqc.yml` file for use with `multiqc`. + +\ + + +### `input:` + +___ + +Type: `path` + +Takes in an input of type `path` (`versions`) pointing to the file to be used to produce a final `.yml` file without any duplicate entries and a `.mqc.yml` file. Generally, this is passed by mixing `versions` from various runtime channels and finally passed to this module to produce a final software versions list. + +Ex: + +```groovy +[ '/hpc/scratch/test/work/9b/e7bf7e28806419c1c9a571dacd1f67/versions.yml' ] +``` + +\ + + +### `output:` + +___ + +#### `yml` + +Type: `path` + +NextFlow output type of `path` pointing to a `YAML` file with software versions. + +\ + + +#### `mqc_yml` + +Type: `path` + +NextFlow output type of `path` pointing to the `.mqc.yml` file which can be used to produce a software versions table with `multiqc`. + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/custom/dump_software_versions/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,26 @@ +process DUMP_SOFTWARE_VERSIONS { + tag "${params.pipeline} software versions" + label 'process_pico' + + // Requires `pyyaml` which does not have a dedicated container but is in the MultiQC container + module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null) + conda (params.enable_conda ? "conda-forge::python=3.9 conda-forge::pyyaml" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mulled-v2-ca258a039fcd88610bc4e297b13703e8be53f5ca:d638c4f85566099ea0c74bc8fddc6f531fe56753-0' : + 'quay.io/biocontainers/mulled-v2-ca258a039fcd88610bc4e297b13703e8be53f5ca:d638c4f85566099ea0c74bc8fddc6f531fe56753-0' }" + + input: + path versions + + output: + path "software_versions.yml" , emit: yml + path "software_versions_mqc.yml", emit: mqc_yml + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + template 'dumpsoftwareversions.py' +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/custom/dump_software_versions/templates/dumpsoftwareversions.py Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,101 @@ +#!/usr/bin/env python + +import platform +import subprocess +from textwrap import dedent + +import yaml + + +def _make_versions_html(versions): + html = [ + dedent( + """\\ + <link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/v/dt/jszip-2.5.0/dt-1.12.1/b-2.2.3/b-colvis-2.2.3/b-html5-2.2.3/b-print-2.2.3/fc-4.1.0/r-2.3.0/sc-2.0.6/sb-1.3.3/sp-2.0.1/datatables.min.css"/> + <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/pdfmake/0.1.36/pdfmake.min.js"></script> + <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/pdfmake/0.1.36/vfs_fonts.js"></script> + <script type="text/javascript" src="https://cdn.datatables.net/v/dt/jszip-2.5.0/dt-1.12.1/b-2.2.3/b-colvis-2.2.3/b-html5-2.2.3/b-print-2.2.3/fc-4.1.0/r-2.3.0/sc-2.0.6/sb-1.3.3/sp-2.0.1/datatables.min.js"></script> + <style> + #cpipes-software-versions tbody:nth-child(even) { + background-color: #f2f2f2; + } + </style> + <table class="table" style="width:100%" id="cpipes-software-versions"> + <thead> + <tr> + <th> Process Name </th> + <th> Software </th> + <th> Version </th> + </tr> + </thead> + """ + ) + ] + for process, tmp_versions in sorted(versions.items()): + html.append("<tbody>") + for i, (tool, version) in enumerate(sorted(tmp_versions.items())): + html.append( + dedent( + f"""\\ + <tr> + <td><samp>{process if (i == 0) else ''}</samp></td> + <td><samp>{tool}</samp></td> + <td><samp>{version}</samp></td> + </tr> + """ + ) + ) + html.append("</tbody>") + html.append("</table>") + return "\\n".join(html) + + +versions_this_module = {} +versions_this_module["${task.process}"] = { + "python": platform.python_version(), + "yaml": yaml.__version__, +} + +with open("$versions") as f: + versions_by_process = yaml.load(f, Loader=yaml.BaseLoader) + versions_by_process.update(versions_this_module) + +# aggregate versions by the module name (derived from fully-qualified process name) +versions_by_module = {} +for process, process_versions in versions_by_process.items(): + module = process.split(":")[-1] + try: + assert versions_by_module[module] == process_versions, ( + "We assume that software versions are the same between all modules. " + "If you see this error-message it means you discovered an edge-case " + "and should open an issue in nf-core/tools. 
" + ) + except KeyError: + versions_by_module[module] = process_versions + +versions_by_module["CPIPES"] = { + "Nextflow": "$workflow.nextflow.version", + "$workflow.manifest.name": "$workflow.manifest.version", + "${params.pipeline}": "${params.workflow_version}", +} + +versions_mqc = { + "id": "software_versions", + "section_name": "${workflow.manifest.name} Software Versions", + "section_href": "https://cfsan-git.fda.gov/Kranti.Konganti/${workflow.manifest.name.toLowerCase()}", + "plot_type": "html", + "description": "Collected at run time from the software output (STDOUT/STDERR).", + "data": _make_versions_html(versions_by_module), +} + +with open("software_versions.yml", "w") as f: + yaml.dump(versions_by_module, f, default_flow_style=False) + +# print('sed -i -e "' + "s%'%%g" + '" *.yml') +subprocess.run('sed -i -e "' + "s%'%%g" + '" software_versions.yml', shell=True) + +with open("software_versions_mqc.yml", "w") as f: + yaml.dump(versions_mqc, f, default_flow_style=False) + +with open("versions.yml", "w") as f: + yaml.dump(versions_this_module, f, default_flow_style=False)
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/fastp/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,96 @@ +process FASTP { + tag "$meta.id" + label 'process_low' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}fastp${params.fs}0.23.2" : null) + conda (params.enable_conda ? "bioconda::fastp=0.23.2 conda-forge::isa-l" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/fastp:0.23.2--h79da9fb_0' : + 'quay.io/biocontainers/fastp:0.23.2--h79da9fb_0' }" + + input: + tuple val(meta), path(reads) + + output: + tuple val(meta), path('*.fastp.fastq.gz') , emit: passed_reads, optional: true + tuple val(meta), path('*.fail.fastq.gz') , emit: failed_reads, optional: true + tuple val(meta), path('*.merged.fastq.gz'), emit: merged_reads, optional: true + tuple val(meta), path('*.json') , emit: json + tuple val(meta), path('*.html') , emit: html + tuple val(meta), path('*.log') , emit: log + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def fail_fastq = params.fastp_failed_out && meta.single_end ? "--failed_out ${prefix}.fail.fastq.gz" : params.fastp_failed_out && !meta.single_end ? "--unpaired1 ${prefix}_1.fail.fastq.gz --unpaired2 ${prefix}_2.fail.fastq.gz" : '' + // Added soft-links to original fastqs for consistent naming in MultiQC + // Use single ended for interleaved. Add --interleaved_in in config. + if ( task.ext.args?.contains('--interleaved_in') ) { + """ + [ ! -f ${prefix}.fastq.gz ] && ln -sf $reads ${prefix}.fastq.gz + + fastp \\ + --stdout \\ + --in1 ${prefix}.fastq.gz \\ + --thread $task.cpus \\ + --json ${prefix}.fastp.json \\ + --html ${prefix}.fastp.html \\ + $fail_fastq \\ + $args \\ + 2> ${prefix}.fastp.log \\ + | gzip -c > ${prefix}.fastp.fastq.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g") + END_VERSIONS + """ + } else if (meta.single_end) { + """ + [ ! -f ${prefix}.fastq.gz ] && ln -sf $reads ${prefix}.fastq.gz + + fastp \\ + --in1 ${prefix}.fastq.gz \\ + --out1 ${prefix}.fastp.fastq.gz \\ + --thread $task.cpus \\ + --json ${prefix}.fastp.json \\ + --html ${prefix}.fastp.html \\ + $fail_fastq \\ + $args \\ + 2> ${prefix}.fastp.log + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g") + END_VERSIONS + """ + } else { + def merge_fastq = params.fastp_merged_out ? "-m --merged_out ${prefix}.merged.fastq.gz" : '' + """ + [ ! -f ${prefix}_1.fastq.gz ] && ln -sf ${reads[0]} ${prefix}_1.fastq.gz + [ ! -f ${prefix}_2.fastq.gz ] && ln -sf ${reads[1]} ${prefix}_2.fastq.gz + fastp \\ + --in1 ${prefix}_1.fastq.gz \\ + --in2 ${prefix}_2.fastq.gz \\ + --out1 ${prefix}_1.fastp.fastq.gz \\ + --out2 ${prefix}_2.fastp.fastq.gz \\ + --json ${prefix}.fastp.json \\ + --html ${prefix}.fastp.html \\ + $fail_fastq \\ + $merge_fastq \\ + --thread $task.cpus \\ + --detect_adapter_for_pe \\ + $args \\ + 2> ${prefix}.fastp.log + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g") + END_VERSIONS + """ + } +} \ No newline at end of file
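Additional read-trimming and filtering options reach `fastp` through `task.ext.args`, while `params.fastp_failed_out` and `params.fastp_merged_out` toggle the failed-read and merged-read outputs. A hedged `nextflow.config` example (the flags are standard `fastp` options, shown only for illustration):

```groovy
// Hypothetical nextflow.config snippet; all flags are standard fastp options.
process {
    withName: 'FASTP' {
        ext.args = '--qualified_quality_phred 30 --length_required 50 --cut_tail'
    }
}
```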
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/fastqc/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,113 @@ +# NextFlow DSL2 Module + +```bash +FASTQC +``` + +## Description + +Run the `fastqc` tool on reads in FASTQ format. Produces an HTML report file and a `.zip` file containing plots and data used to produce the plots. + +\ + + +### `input:` + +___ + +Type: `tuple` + +Takes in the following tuple of metadata (`meta`) and a list of reads of type `path` (`reads`) per sample (`id:`). + +Ex: + +```groovy +[ + [ id: 'FAL00870', + strandedness: 'unstranded', + single_end: true, + centrifuge_x: '/hpc/db/centrifuge/2022-04-12/ab' + ], + '/hpc/scratch/test/FAL000870/f1.merged.fq.gz' +] +``` + +\ + + +#### `meta` + +Type: Groovy Map + +A Groovy Map containing the metadata about the FASTQ file. + +Ex: + +```groovy +[ + id: 'FAL00870', + strandedness: 'unstranded', + single_end: true +] +``` + +\ + + +#### `reads` + +Type: `path` + +NextFlow input type of `path` pointing to FASTQ files on which `fastqc` should be run. + +\ + + +#### `args` + +Type: Groovy String + +String of optional command-line arguments to be passed to the tool. This can be mentioned in `process` scope within `withName:process_name` block using `ext.args` option within your `nextflow.config` file. + +Ex: + +```groovy +withName: 'FASTQC' { + ext.args = '--nano' +} +``` + +### `output:` + +___ + +Type: `tuple` + +Outputs a tuple of metadata (`meta` from `input:`) and list of `fastqc` result files. + +\ + + +#### `html` + +Type: `path` + +NextFlow output type of `path` pointing to the `fastqc` report file in HTML format per sample (`id:`). + +\ + + +#### `zip` + +Type: `path` + +NextFlow output type of `path` pointing to the zipped `fastqc` results per sample (`id:`). + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/fastqc/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,48 @@ +process FASTQC { + tag "$meta.id" + label 'process_low' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}fastqc${params.fs}0.11.9" : null) + conda (params.enable_conda ? "conda-forge::perl bioconda::fastqc=0.11.9" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0' : + 'quay.io/biocontainers/fastqc:0.11.9--0' }" + + input: + tuple val(meta), path(reads) + + output: + tuple val(meta), path("*.html"), emit: html + tuple val(meta), path("*.zip") , emit: zip + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + // Add soft-links to original FastQs for consistent naming in pipeline + def prefix = task.ext.prefix ?: "${meta.id}" + if (meta.single_end) { + """ + [ ! -f ${prefix}.fastq.gz ] && ln -s $reads ${prefix}.fastq.gz + fastqc $args --threads $task.cpus ${prefix}.fastq.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" ) + END_VERSIONS + """ + } else { + """ + [ ! -f ${prefix}_1.fastq.gz ] && ln -s ${reads[0]} ${prefix}_1.fastq.gz + [ ! -f ${prefix}_2.fastq.gz ] && ln -s ${reads[1]} ${prefix}_2.fastq.gz + fastqc $args --threads $task.cpus ${prefix}_1.fastq.gz ${prefix}_2.fastq.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" ) + END_VERSIONS + """ + } +} \ No newline at end of file
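A hedged workflow-level sketch of calling `FASTQC` (the include path is illustrative; the metadata map and FASTQ path reuse the example from the README above):

```groovy
// Hypothetical usage; the metadata map and file path are illustrative.
include { FASTQC } from './modules/fastqc/main'

workflow {
    FASTQC(
        Channel.of([ [id: 'FAL00870', single_end: true],
                     file('/hpc/scratch/test/FAL000870/f1.merged.fq.gz') ])
    )
    FASTQC.out.html.view()
}
```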
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/gen_samplesheet/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,55 @@ +# NextFlow DSL2 Module + +```bash +GEN_SAMPLESHEET +``` + +## Description + +Generates a sample sheet in CSV format that contains required fields to be used to construct a Groovy Map of metadata. It requires as input, an absolute UNIX path to a folder containing only FASTQ files. This module requires the `fastq_dir_to_samplesheet.py` script to be present in the `bin` folder from where the NextFlow script including this module will be executed. + +\ + + +### `input:` + +___ + +Type: `val` + +Takes in the absolute UNIX path to a folder containing only FASTQ files (`inputdir`). + +Ex: + +```groovy +'/hpc/scratch/test/reads' +``` + +\ + + +### `output:` + +___ + +Type: `path` + +NextFlow output of type `path` pointing to auto-generated CSV sample sheet (`csv`). + +\ + + +#### `csv` + +Type: `path` + +NextFlow output type of `path` pointing to auto-generated CSV sample sheet for all FASTQ files present in the folder given by NextFlow input type of `val` (`inputdir`). + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/gen_samplesheet/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,41 @@ +process GEN_SAMPLESHEET { + tag "${inputdir.simpleName}" + label "process_pico" + + module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null) + conda (params.enable_conda ? "conda-forge::python=3.9.5" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/python:3.9--1' : + 'quay.io/biocontainers/python:3.9--1' }" + + input: + val inputdir + + output: + path '*.csv' , emit: csv + path 'versions.yml', emit: versions + + when: + task.ext.when == null || task.ext.when + + // This script (fastq_dir_to_samplesheet.py) is distributed + // as part of the pipeline nf-core/rnaseq/bin/. MIT License. + script: + def this_script_args = (params.fq_single_end ? ' -se' : '') + this_script_args += (params.fq_suffix ? " -r1 '${params.fq_suffix}'" : '') + this_script_args += (params.fq2_suffix ? " -r2 '${params.fq2_suffix}'" : '') + + """ + fastq_dir_to_samplesheet.py -sn \\ + -st '${params.fq_strandedness}' \\ + -sd '${params.fq_filename_delim}' \\ + -si ${params.fq_filename_delim_idx} \\ + ${this_script_args} \\ + ${inputdir} autogen_samplesheet.csv + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + python: \$( python --version | sed 's/Python //g' ) + END_VERSIONS + """ +} \ No newline at end of file
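A hedged usage sketch for `GEN_SAMPLESHEET` (the directory path is illustrative; `fastq_dir_to_samplesheet.py` must be present in the calling pipeline's `bin` folder):

```groovy
// Hypothetical usage; the input directory is illustrative.
include { GEN_SAMPLESHEET } from './modules/gen_samplesheet/main'

workflow {
    GEN_SAMPLESHEET( Channel.of('/hpc/scratch/test/reads') )
    GEN_SAMPLESHEET.out.csv.view()
}
```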
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/kma/align/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,135 @@ +# NextFlow DSL2 Module + +```bash +KMA_ALIGN +``` + +## Description + +Run the `kma` aligner on input FASTQ files with a pre-formatted `kma` index. + +\ + + +### `input:` + +___ + +Type: `tuple` + +Takes in the following tuple of metadata (`meta`) and a list of reads of type `path` (`reads`) and a corresponding pre-formatted `kma` index folder per sample (`id:`). + +Ex: + +```groovy +[ + [ + id: 'FAL00870', + strandedness: 'unstranded', + single_end: false + ], + [ + '/hpc/scratch/test/f1.R1.fq.gz', + '/hpc/scratch/test/f1.R2.fq.gz' + ], + '/path/to/kma/index/folder' +] +``` + +\ + + +#### `meta` + +Type: Groovy Map + +A Groovy Map containing the metadata about the FASTQ file. + +Ex: + +```groovy +[ + id: 'FAL00870', + strandedness: 'unstranded', + single_end: true +] +``` + +\ + + +#### `reads` + +Type: `path` + +NextFlow input type of `path` pointing to FASTQ files (single-end or paired-end) on which `kma` should be run. + +\ + + +#### `index` + +Type: `path` + +NextFlow input type of `path` pointing to the folder containing `kma` index files. + +\ + + +#### `args` + +Type: Groovy String + +String of optional command-line arguments to be passed to the tool. This can be mentioned in `process` scope within `withName:process_name` block using `ext.args` option within your `nextflow.config` file. + +Ex: + +```groovy +withName: 'KMA_ALIGN' { + ext.args = '-mint2' +} +``` + +### `output:` + +___ + +Type: `tuple` + +Outputs a tuple of metadata (`meta` from `input:`) and `kma` result files. + +\ + + +#### `res` + +Type: `path` + +NextFlow output type of `path` pointing to the `.res` file from `kma` per sample (`id:`). + +\ + + +#### `mapstat` + +Type: `path` + +NextFlow output type of `path` pointing to the `.mapstat` file from `kma` per sample (`id:`). Optional: `true` + +\ + + +#### `hits` + +Type: `path` + +NextFlow output type of `path` pointing to a `*_template_hits.txt` file containing only hit IDs. Optional: `true` + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/kma/align/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,73 @@ +process KMA_ALIGN { + tag "$meta.id" + label 'process_low' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}kma${params.fs}1.4.4" : null) + conda (params.enable_conda ? "conda-forge::libgcc-ng bioconda::kma=1.4.3" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/kma:1.4.3--h7132678_1': + 'quay.io/biocontainers/kma:1.4.3--h7132678_1' }" + + input: + tuple val(meta), path(reads), path(index) + + output: + path "${meta.id}_kma_res" + tuple val(meta), path("${meta.id}_kma_res${params.fs}*.res") , emit: res + tuple val(meta), path("${meta.id}_kma_res${params.fs}*.mapstat") , emit: mapstat, optional: true + tuple val(meta), path("${meta.id}_kma_res${params.fs}*.frag.gz") , emit: frags, optional: true + tuple val(meta), path("${meta.id}_kma_res${params.fs}*_template_hits.txt"), emit: hits, optional: true + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def reads_in = (meta.single_end ? "-i $reads" : "-ipe ${reads[0]} ${reads[1]}") + def db = (meta.kma_t_db ?: "${index}") + def db_basename = (meta.kma_t_db ? '' : "${params.fs}${index.baseName}") + def get_hit_accs = (meta.get_kma_hit_accs ? 'true' : 'false') + def res_dir = prefix + '_kma_res' + reads_in = (params.kmaalign_int ? "-int $reads" : "$reads_in") + """ + mkdir -p $res_dir || exit 1 + kma \\ + $args \\ + -t_db $db$db_basename \\ + -t $task.cpus \\ + -o $res_dir${params.fs}$prefix \\ + $reads_in + + if [ "$get_hit_accs" == "true" ]; then + grep -v '^#' $res_dir${params.fs}${prefix}.res | \\ + grep -E -o '^[[:alnum:]]+\\-*\\.*[0-9]+' > $res_dir${params.fs}${prefix}_template_hits.txt || true + fi + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + kma: \$( kma -v | sed -e 's%KMA-%%' ) + END_VERSIONS + + mkdirver="" + cutver="" + grepver="" + + if [ "${workflow.containerEngine}" != "null" ]; then + mkdirver=\$( mkdir --help 2>&1 | sed -e '1!d; s/ (.*\$//' | cut -f1-2 -d' ' ) + cutver="\$mkdirver" + grepver="\$mkdirver" + else + mkdirver=\$( mkdir --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' ) + cutver=\$( cut --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' ) + grepver=\$( echo \$(grep --version 2>&1) | sed 's/^.*(GNU grep) //; s/ Copyright.*\$//' ) + fi + + cat <<-END_VERSIONS >> versions.yml + mkdir: \$mkdirver + cut: \$cutver + grep: \$grepver + END_VERSIONS + """ +} \ No newline at end of file
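Extra `kma` alignment options are passed through `task.ext.args`, in the same way the README above shows for `-mint2`. A hedged example (the flags are standard `kma` options, chosen for illustration):

```groovy
// Hypothetical nextflow.config snippet; -mem_mode and -1t1 are standard kma flags.
process {
    withName: 'KMA_ALIGN' {
        ext.args = '-mem_mode -1t1'
    }
}
```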
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/kma/index/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,86 @@ +# NextFlow DSL2 Module + +```bash +KMA_INDEX +``` + +## Description + +Run `kma index` on the input FASTA file. + +\ + + +### `input:` + +___ + +Type: `tuple` + +Takes in the following tuple of metadata (`meta`) and a FASTA file of type `path` (`fasta`) per sample (`id:`). + +Ex: + +```groovy +[ + [ + id: 'FAL00870', + ], + '/path/to/FAL00870_contigs.fasta' +] +``` + +\ + + +#### `meta` + +Type: Groovy Map + +A Groovy Map containing the metadata about the FASTA file. + +Ex: + +```groovy +[ + id: 'FAL00870' +] +``` + +\ + + +#### `fasta` + +Type: `path` + +NextFlow input type of `path` pointing to the FASTA file on which the `kma index` command should be run. + +\ + + +### `output:` + +___ + +Type: `tuple` + +Outputs a tuple of metadata (`meta` from `input:`) and a folder containing `kma index` files. + +\ + + +#### `idx` + +Type: `path` + +NextFlow output type of `path` pointing to the folder containing `kma index` files per sample (`id:`). + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/kma/index/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,57 @@ +process KMA_INDEX { + tag "$meta.id" + label 'process_nano' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}kma${params.fs}1.4.4" : null) + conda (params.enable_conda ? "conda-forge::libgcc-ng bioconda::kma=1.4.3 conda-forge::coreutils" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/kma:1.4.3--h7132678_1': + 'quay.io/biocontainers/kma:1.4.3--h7132678_1' }" + + input: + tuple val(meta), path(fasta) + + output: + tuple val(meta), path("${meta.id}_kma_idx"), emit: idx + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}_kma_idx" + def add_to_db = (meta.kmaindex_t_db ? "-t_db ${meta.kmaindex_t_db}" : '') + """ + mkdir -p $prefix && cd $prefix || exit 1 + kma \\ + index \\ + $args \\ + $add_to_db \\ + -i ../$fasta \\ + -o $prefix + cd .. || exit 1 + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + kma: \$( kma -v | sed -e 's%KMA-%%' ) + END_VERSIONS + + mkdirver="" + cutver="" + + if [ "${workflow.containerEngine}" != "null" ]; then + mkdirver=\$( mkdir --help 2>&1 | sed -e '1!d; s/ (.*\$//' | cut -f1-2 -d' ' ) + cutver="\$mkdirver" + else + mkdirver=\$( mkdir --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' ) + cutver=\$( cut --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' ) + fi + + cat <<-END_VERSIONS >> versions.yml + mkdir: \$mkdirver + cut: \$cutver + cd: \$( bash --version 2>&1 | sed '1!d; s/^.*version //; s/ (.*\$//' ) + END_VERSIONS + """ +} \ No newline at end of file
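A hedged chaining sketch: build a per-sample `kma` index and join it back to the reads channel on the shared metadata map before aligning (include paths and file paths are illustrative):

```groovy
// Hypothetical chaining of KMA_INDEX and KMA_ALIGN; paths are illustrative.
include { KMA_INDEX } from './modules/kma/index/main'
include { KMA_ALIGN } from './modules/kma/align/main'

workflow {
    def meta = [id: 'FAL00870', single_end: false]

    ch_fasta = Channel.of([ meta, file('/path/to/FAL00870_contigs.fasta') ])
    ch_reads = Channel.of([ meta, [ file('/hpc/scratch/test/f1.R1.fq.gz'),
                                    file('/hpc/scratch/test/f1.R2.fq.gz') ] ])

    KMA_INDEX(ch_fasta)
    // Join on the shared meta map, yielding tuple(meta, reads, index).
    KMA_ALIGN(ch_reads.join(KMA_INDEX.out.idx))
}
```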
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/krona/ktimporttext/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,39 @@ +process KRONA_KTIMPORTTEXT { + tag "$meta.id" + label 'process_nano' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}krona${params.fs}2.8.1" : null) + conda (params.enable_conda ? "conda-forge::curl bioconda::krona=2.8.1" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/krona:2.8.1--pl5321hdfd78af_1': + 'quay.io/biocontainers/krona:2.8.1--pl5321hdfd78af_1' }" + + input: + tuple val(meta), path(report) + + output: + tuple val(meta), path ('*.html'), emit: html + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def krona_suffix = params.krona_res_suffix ?: '.krona.tsv' + def reports = report.collect { + it = it.toString() + ',' + it.toString().replaceAll(/(.*)${krona_suffix}$/, /$1/) + }.sort().join(' ') + """ + ktImportText \\ + $args \\ + -o ${prefix}.html \\ + $reports + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + krona: \$( echo \$(ktImportText 2>&1) | sed 's/^.*KronaTools //g; s/- ktImportText.*\$//g') + END_VERSIONS + """ +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/multiqc/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,67 @@ +# NextFlow DSL2 Module + +```bash +MULTIQC +``` + +## Description + +Generate an aggregated [**MultiQC**](https://multiqc.info/) report. This particular module **will only work** within the framework of `cpipes`, as it uses many `cpipes`-related UNIX absolute paths to store and retrieve **MultiQC**-related configuration files and `cpipes` context-aware metadata. It also uses a custom logo with filename `FDa-Logo-Blue---medium-01.png` which should be located inside an `assets` folder from where the NextFlow script including this module will be executed. + +\ + + +### `input:` + +___ + +Type: `path` + +Takes in NextFlow input type of `path` which points to many log files that **MultiQC** should parse. + +Ex: + +```groovy +[ '/data/sample1/centrifuge/cent_output.txt', '/data/sample1/kraken/kraken_output.txt' ] +``` + +\ + + +### `output:` + +___ + +#### `report` + +Type: `path` + +Outputs a NextFlow output type of `path` pointing to the location of the final **MultiQC** HTML report. + +\ + + +#### `data` + +Type: `path` + +NextFlow output type of `path` pointing to the data files folder generated by **MultiQC** which were used to generate the plots and the HTML report. + +\ + + +#### `plots` + +Type: `path` +Optional: `true` + +NextFlow output type of `path` pointing to the plots folder generated by **MultiQC**. + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/multiqc/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,51 @@ +process MULTIQC { + label 'process_micro' + tag 'MultiQC' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}multiqc${params.fs}1.19" : null) + conda (params.enable_conda ? 'conda-forge::python=3.11 conda-forge::spectra conda-forge::lzstring conda-forge::imp bioconda::multiqc=1.19' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/multiqc:1.19--pyhdfd78af_0' : + 'quay.io/biocontainers/multiqc:1.19--pyhdfd78af_0' }" + + input: + path multiqc_files + + output: + path "*multiqc*" + path "*multiqc_report.html", emit: report + path "*_data" , emit: data + path "*_plots" , emit: plots, optional: true + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + """ + cp ${params.projectconf}${params.fs}multiqc${params.fs}${params.pipeline}_mqc.yml cpipes_mqc_config.yml + sed -i -e 's/Workflow_Name_Placeholder/${params.pipeline}/g; s/Workflow_Version_Placeholder/${params.workflow_version}/g' cpipes_mqc_config.yml + sed -i -e 's/CPIPES_Version_Placeholder/${workflow.manifest.version}/g; s%Workflow_Output_Placeholder%${params.output}%g' cpipes_mqc_config.yml + sed -i -e 's%Workflow_Input_Placeholder%${params.input}%g' cpipes_mqc_config.yml + + multiqc --interactive -c cpipes_mqc_config.yml -f $args . + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + multiqc: \$( multiqc --version | sed -e "s/multiqc, version //g" ) + END_VERSIONS + + sedver="" + + if [ "${workflow.containerEngine}" != "null" ]; then + sedver=\$( sed --help 2>&1 | sed -e '1!d; s/ (.*\$//' ) + else + sedver=\$( echo \$(sed --version 2>&1) | sed 's/^.*(GNU sed) //; s/ Copyright.*\$//' ) + fi + + cat <<-END_VERSIONS >> versions.yml + sed: \$sedver + END_VERSIONS + """ +} \ No newline at end of file
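A hedged sketch of assembling the `multiqc_files` input: mix the per-module log outputs into one channel and collect them into a single list before calling `MULTIQC` (the channels mixed here are illustrative, not a fixed list):

```groovy
// Hypothetical collection of MultiQC inputs; the mixed channels are examples.
ch_multiqc_files = Channel.empty()
    .mix(FASTQC.out.zip.map { meta, zips -> zips })
    .mix(FASTP.out.json.map { meta, json -> json })
    .mix(DUMP_SOFTWARE_VERSIONS.out.mqc_yml)

MULTIQC(ch_multiqc_files.collect())
```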
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/nowayout_results/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,45 @@ +process NOWAYOUT_RESULTS { + tag "nowayout aggregate" + label "process_pico" + + module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null) + conda (params.enable_conda ? 'conda-forge::python=3.11 conda-forge::spectra conda-forge::lzstring conda-forge::imp bioconda::multiqc=1.19' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/multiqc:1.19--pyhdfd78af_0' : + 'quay.io/biocontainers/multiqc:1.19--pyhdfd78af_0' }" + + input: + path pass_and_fail_rel_abn_files + path lineage_csv + + output: + path '*.tblsum.txt', emit: mqc_txt, optional: true + path '*_mqc.json' , emit: mqc_json, optional: true + path '*_mqc.yml' , emit: mqc_yml, optional: true + path '*.tsv' , emit: tsv, optional: true + path 'versions.yml', emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + """ + gen_salmon_tph_and_krona_tsv.py \\ + $args \\ + -sal "." \\ + -smres "." \\ + -lin $lineage_csv + + create_mqc_data_table.py \\ + "nowayout" "The results shown here are <code>salmon quant</code> TPM values scaled down by a factor of ${params.gsalkronapy_sf}." + + create_mqc_data_table.py \\ + "nowayout_indiv_reads_mapped" "The results shown here are the number of reads mapped (post threshold filters) per taxon to the <code>nowayout</code>'s custom <code>${params.db_mode}</code> database for each sample." + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + python: \$( python --version | sed 's/Python //g' ) + END_VERSIONS + """ +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/otf_genome/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,39 @@ +process OTF_GENOME { + tag "$meta.id" + label "process_nano" + + module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null) + conda (params.enable_conda ? "conda-forge::python=3.10.4" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/python:3.10.4' : + 'quay.io/biocontainers/python:3.10.4' }" + + input: + tuple val(meta), path(kma_hits), path(kma_fragz) + + output: + tuple val(meta), path('*_scaffolded_genomic.fna.gz'), emit: genomes_fasta, optional: true + tuple val(meta), path('*_aln_reads.fna.gz') , emit: reads_extracted, optional: true + path '*FAILED.txt' , emit: failed, optional: true + path 'versions.yml' , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + args += (kma_hits ? " -txt ${kma_hits}" : '') + args += (params.tuspy_gd ? " -gd ${params.tuspy_gd}" : '') + args += (prefix ? " -op ${prefix}" : '') + + """ + gen_otf_genome.py \\ + $args + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + python: \$( python --version | sed 's/Python //g' ) + END_VERSIONS + """ +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/salmon/index/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,88 @@ +# NextFlow DSL2 Module + +```bash +SALMON_INDEX +``` + +## Description + +Run `salmon index` command on input FASTA file. + +\ + + +### `input:` + +___ + +Type: `tuple` + +Takes in the following tuple of metadata (`meta`) and a FASTA file of type `path` (`genome_fasta`) per sample (`id:`). + +Ex: + +```groovy +[ + [ + id: 'FAL00870' + ], + [ + '/hpc/scratch/test/FAL00870_contigs.fasta', + ] +] +``` + +\ + + +#### `meta` + +Type: Groovy Map + +A Groovy Map containing the metadata about the genome FASTA file. + +Ex: + +```groovy +[ + id: 'FAL00870' +] +``` + +\ + + +#### `genome_fasta` + +Type: `path` + +NextFlow input type of `path` pointing to the FASTA file (gzipped or unzipped) on which `salmon index` should be run. + +\ + + +### `output:` + +___ + +Type: `tuple` + +Outputs a tuple of metadata (`meta` from `input:`) and a folder containing `salmon index` result files. + +\ + + +#### `idx` + +Type: `path` + +NextFlow output type of `path` pointing to the `salmon index` result files per sample (`id:`). + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
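As a usage illustration, a minimal sketch of wiring this module into a workflow (the include path and channel contents are hypothetical, following the `[ meta, fasta ]` shape described above):

```groovy
// Minimal sketch: feed [ meta, genome_fasta ] tuples to SALMON_INDEX and
// collect the per-sample index folder from the `idx` output.
include { SALMON_INDEX } from "${params.modules}${params.fs}salmon${params.fs}index${params.fs}main"

workflow {
    ch_genome = Channel.of(
        [ [ id: 'FAL00870' ], file('/hpc/scratch/test/FAL00870_contigs.fasta') ]
    )
    SALMON_INDEX( ch_genome )
    SALMON_INDEX.out.idx.view()
}
```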
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/salmon/index/main.nf Mon Mar 31 14:50:40 2025 -0400
@@ -0,0 +1,40 @@
+process SALMON_INDEX {
+    tag "$meta.id"
+    label "process_micro"
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}salmon${params.fs}1.10.0" : null)
+    conda (params.enable_conda ? 'conda-forge::libgcc-ng bioconda::salmon=1.10.1' : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/salmon:1.10.1--h7e5ed60_1' :
+        'quay.io/biocontainers/salmon:1.10.1--h7e5ed60_1' }"
+
+    input:
+    tuple val(meta), path(genome_fasta)
+
+    output:
+    tuple val(meta), path("${meta.id}_salmon_idx"), emit: idx
+    path "versions.yml"                           , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}_salmon_idx"
+    def decoys_file = file( meta.salmon_decoys )
+    def decoys = !("${decoys_file.simpleName}" ==~ 'dummy_file.*') && decoys_file.exists() ? "--decoys ${meta.salmon_decoys}" : ''
+    """
+    salmon \\
+        index \\
+        $decoys \\
+        --threads $task.cpus \\
+        $args \\
+        --index $prefix \\
+        --transcripts $genome_fasta
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        salmon: \$(echo \$(salmon --version) | sed -e "s/salmon //g")
+    END_VERSIONS
+    """
+}
\ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/salmon/quant/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,118 @@ +# NextFlow DSL2 Module + +```bash +SALMON_QUANT +``` + +## Description + +Run `salmon quant` in `reads` or `alignments` mode. The inputs can be either the alignment (Ex: `.bam`) files or read (Ex: `.fastq.gz`) files. + +\ + + +### `input:` + +___ + +Type: `tuple` + +Takes in the following tuple of metadata (`meta`) and either an alignment file or reads file and a `salmon index` or a transcript FASTA file per sample (`id:`). + +Ex: + +```groovy +[ + [ + id: 'FAL00870', + strandedness: 'unstranded', + single_end: true + ], + [ + '/hpc/scratch/test/FAL00870_R1.fastq.gz' + ], + [ + '/hpc/scratch/test/salmon_idx_for_FAL00870' + ] +] +``` + +\ + + +#### `meta` + +Type: Groovy Map + +A Groovy Map containing the metadata about the input setup for `salmon quant`. + +Ex: + +```groovy +[ + id: 'FAL00870', + strandedness: 'unstranded', + single_end: true +] +``` + +\ + + +#### `reads_or_bam` + +Type: `path` + +NextFlow input type of `path` pointing to either an alignment file (Ex: `.bam`) or a reads file (Ex: `.fastq.gz`) on which `salmon quant` should be run. + +\ + + +#### `index_or_tr_fasta` + +Type: `path` + +NextFlow input type of `path` pointing to either a folder containing `salmon index` files or a trasnscript FASTA file. + +\ + + +#### `args` + +Type: Groovy String + +String of optional command-line arguments to be passed to the tool. This can be mentioned in `process` scope within `withName:process_name` block using `ext.args` option within your `nextflow.config` file. + +Ex: + +```groovy +withName: 'SALMON_QUANT' { + ext.args = '--vbPrior 0.02' +} +``` + +### `output:` + +___ + +Type: `tuple` + +Outputs a tuple of metadata (`meta` from `input:`) and a folder containing `salmon quant` result files. + +\ + + +#### `results` + +Type: `path` + +NextFlow output type of `path` pointing to the `salmon quant` result files per sample (`id:`). + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
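As a usage illustration, a minimal sketch of pairing each sample's reads with its `salmon index` before calling the module (channel names are hypothetical; the join assumes identical `meta` maps on both channels):

```groovy
// Minimal sketch: join reads and index by meta, producing the
// [ meta, reads_or_bam, index_or_tr_fasta ] tuple SALMON_QUANT expects.
include { SALMON_QUANT } from "${params.modules}${params.fs}salmon${params.fs}quant${params.fs}main"

workflow {
    def meta = [ id: 'FAL00870', strandedness: 'unstranded', single_end: true ]
    reads_ch = Channel.of( [ meta, file('/hpc/scratch/test/FAL00870_R1.fastq.gz') ] )
    idx_ch   = Channel.of( [ meta, file('/hpc/scratch/test/salmon_idx_for_FAL00870') ] )

    SALMON_QUANT( reads_ch.join( idx_ch ) )
}
```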
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/salmon/quant/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,75 @@ +process SALMON_QUANT { + tag "$meta.id" + label "process_micro" + + module (params.enable_module ? "${params.swmodulepath}${params.fs}salmon${params.fs}1.10.0" : null) + conda (params.enable_conda ? 'conda-forge::libgcc-ng bioconda::salmon=1.10.1' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/salmon:1.10.1--h7e5ed60_1' : + 'quay.io/biocontainers/salmon:1.10.1--h7e5ed60_1' }" + input: + tuple val(meta), path(reads_or_bam), path(index_or_tr_fasta) + + output: + tuple val(meta), path("${meta.id}_salmon_res"), emit: results + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}_salmon_res" + def reference = "--index $index_or_tr_fasta" + def lib_type = (meta.salmon_lib_type ?: '') + def alignment_mode = (meta.salmon_alignment_mode ?: '') + def gtf = (meta.salmon_gtf ? "--geneMap ${meta.salmon_gtf}" : '') + def input_reads =(meta.single_end || !reads_or_bam[1] ? "-r $reads_or_bam" : "-1 ${reads_or_bam[0]} -2 ${reads_or_bam[1]}") + + // Use path(reads_or_bam) to point to BAM and path(index_or_tr_fasta) to point to transcript fasta + // if using salmon DSL2 module in alignment-based mode. + // By default, this module will be run in selective-alignment-based mode of salmon. + if (alignment_mode) { + reference = "-t $index_or_tr_fasta" + input_reads = "-a $reads_or_bam" + } + + def strandedness_opts = [ + 'A', 'U', 'SF', 'SR', + 'IS', 'IU' , 'ISF', 'ISR', + 'OS', 'OU' , 'OSF', 'OSR', + 'MS', 'MU' , 'MSF', 'MSR' + ] + + def strandedness = 'A' + + if (lib_type) { + if (strandedness_opts.contains(lib_type)) { + strandedness = lib_type + } else { + log.info "[Salmon Quant] Invalid library type specified '--libType=${lib_type}', defaulting to auto-detection with '--libType=A'." + } + } else { + strandedness = meta.single_end ? 'U' : 'IU' + if (meta.strandedness == 'forward') { + strandedness = meta.single_end ? 'SF' : 'ISF' + } else if (meta.strandedness == 'reverse') { + strandedness = meta.single_end ? 'SR' : 'ISR' + } + } + """ + salmon quant \\ + --threads $task.cpus \\ + --libType=$strandedness \\ + $gtf \\ + $args \\ + -o $prefix \\ + $reference \\ + $input_reads + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + salmon: \$(echo \$(salmon --version) | sed -e "s/salmon //g") + END_VERSIONS + """ +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/samplesheet_check/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,55 @@ +# NextFlow DSL2 Module + +```bash +SAMPLESHEET_CHECK +``` + +## Description + +Checks the validity of the sample sheet in CSV format to make sure there are required mandatory fields. This module generally succeeds `GEN_SAMPLESHEET` module as part of the `cpipes` pipelines to make sure that all fields of the columns are properly formatted to be used as Groovy Map for `meta` which is of input type `val`. This module requires the `check_samplesheet.py` script to be present in the `bin` folder from where the NextFlow script including this module will be executed + +\ + + +### `input:` + +___ + +Type: `path` + +Takes in the absolute UNIX path to the sample sheet in CSV format (`samplesheet`). + +Ex: + +```groovy +'/hpc/scratch/test/reads/output/gen_samplesheet/autogen_samplesheet.csv' +``` + +\ + + +### `output:` + +___ + +Type: `path` + +NextFlow output of type `path` pointing to properly formatted CSV sample sheet (`csv`). + +\ + + +#### `csv` + +Type: `path` + +NextFlow output type of `path` pointing to auto-generated CSV sample sheet for all FASTQ files present in the folder given by NextFlow input type of `val` (`inputdir`). + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
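For reference, a minimal sketch of a sample sheet this module would accept, with the five mandatory columns described in the pipeline help (paths are placeholders; `fq2` is left empty for single-end samples):

```csv
sample,fq1,fq2,strandedness,single_end
FAL00870,/hpc/scratch/test/FAL00870_R1.fastq.gz,/hpc/scratch/test/FAL00870_R2.fastq.gz,unstranded,false
FAL00871,/hpc/scratch/test/FAL00871_R1.fastq.gz,,unstranded,true
```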
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/samplesheet_check/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,32 @@ +process SAMPLESHEET_CHECK { + tag "$samplesheet" + label "process_femto" + + module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null) + conda (params.enable_conda ? "conda-forge::python=3.9.5" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/python:3.9--1' : + 'quay.io/biocontainers/python:3.9--1' }" + + input: + path samplesheet + + output: + path '*.csv' , emit: csv + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: // This script is bundled with the pipeline, in nf-core/rnaseq/bin/ + """ + check_samplesheet.py \\ + $samplesheet \\ + samplesheet.valid.csv + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + python: \$( python --version | sed 's/Python //g' ) + END_VERSIONS + """ +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/samtools/fastq/main.nf Mon Mar 31 14:50:40 2025 -0400
@@ -0,0 +1,54 @@
+process SAMTOOLS_FASTQ {
+    tag "$meta.id"
+    label 'process_micro'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}samtools${params.fs}1.13" : null)
+    conda (params.enable_conda ? "bioconda::samtools=1.18 bioconda::htslib=1.18 conda-forge::bzip2" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/samtools:1.18--h50ea8bc_1' :
+        'quay.io/biocontainers/samtools:1.18--h50ea8bc_1' }"
+
+    input:
+    tuple val(meta), path(input)
+    val(interleave)
+
+    output:
+    tuple val(meta), path("*_{1,2}.fastq.gz")     , optional:true, emit: fastq
+    tuple val(meta), path("mapped_refs.txt")      , optional:true, emit: mapped_refs
+    tuple val(meta), path("*_interleaved.fastq")  , optional:true, emit: interleaved
+    tuple val(meta), path("*_singleton.fastq.gz") , optional:true, emit: singleton
+    tuple val(meta), path("*_other.fastq.gz")     , optional:true, emit: other
+    path "versions.yml"                           , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def args2 = task.ext.args2 ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+    def output = ( interleave && ! meta.single_end ) ? "> ${prefix}_interleaved.fastq" :
+        meta.single_end ? "-1 ${prefix}_1.fastq.gz -s ${prefix}_singleton.fastq.gz" :
+        "-1 ${prefix}_1.fastq.gz -2 ${prefix}_2.fastq.gz -s ${prefix}_singleton.fastq.gz"
+    """
+    samtools \\
+        fastq \\
+        $args \\
+        --threads ${task.cpus-1} \\
+        -0 ${prefix}_other.fastq.gz \\
+        $input \\
+        $output
+
+    samtools \\
+        view \\
+        $args2 \\
+        --threads ${task.cpus-1} \\
+        $input \\
+        | grep -v '*' | cut -f3 | sort -u > mapped_refs.txt
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
+    END_VERSIONS
+    """
+}
\ No newline at end of file
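The process feeds `task.ext.args` to `samtools fastq` and `task.ext.args2` to the `samtools view` call that lists references with mapped reads. A hedged sketch of a config override using standard SAM flag filters (it mirrors the commented-out example in the nowayout process-scope config later in this changeset):

```groovy
// Hypothetical override: keep only mapped records when converting back to
// FASTQ, and exclude unmapped records when listing mapped reference names.
process {
    withName: 'SAMTOOLS_FASTQ' {
        ext.args  = (params.fq_single_end ? '-F 4' : '-f 2')
        ext.args2 = '-F 4'
    }
}
```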
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/seqkit/grep/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,113 @@ +# NextFlow DSL2 Module + +```bash +SEQKIT_GREP +``` + +## Description + +Run `seqkit grep` command on reads in FASTQ format. Produces a filtered FASTQ file as per the filter strategy in the supplied input file. + +\ + + +### `input:` + +___ + +Type: `tuple` + +Takes in the following tuple of metadata (`meta`) and a list of reads of type `path` (`reads`) per sample (`id:`). + +Ex: + +```groovy +[ + [ id: 'FAL00870', + strandedness: 'unstranded', + single_end: true, + centrifuge_x: '/hpc/db/centrifuge/2022-04-12/ab' + ], + '/hpc/scratch/test/FAL000870/f1.merged.fq.gz' +] +``` + +\ + + +#### `meta` + +Type: Groovy Map + +A Groovy Map containing the metadata about the FASTQ file. + +Ex: + +```groovy +[ + id: 'FAL00870', + strandedness: 'unstranded', + single_end: true +] +``` + +\ + + +#### `reads` + +Type: `path` + +NextFlow input type of `path` pointing to FASTQ files on which `seqkit grep` should be run. + +\ + + +#### `pattern_file` + +Type: path + +NextFlow input type of `path` pointing to the pattern file which has the patterns, one per line, by which FASTQ sequence ids should be searched and whose reads will be extracted. + +\ + + +#### `args` + +Type: Groovy String + +String of optional command-line arguments to be passed to the tool. This can be mentioned in `process` scope within `withName:process_name` block using `ext.args` option within your `nextflow.config` file. + +Ex: + +```groovy +withName: 'SEQKIT_GREP' { + ext.args = '--only-positive-strand' +} +``` + +### `output:` + +___ + +Type: `tuple` + +Outputs a tuple of metadata (`meta` from `input:`) and and filtered gzipped FASTQ file. + +\ + + +#### `fastx` + +Type: `path` + +NextFlow output type of `path` pointing to the FASTQ format filtered gzipped file per sample (`id:`). + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
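For reference, a minimal sketch of a pattern file (read IDs are hypothetical). The companion `main.nf` treats a first line of `DuMmY` as "no filtering"; otherwise it keeps only the first whitespace-delimited field of each line as the sequence ID to search for:

```text
read_00001 optional_annotation_ignored
read_00042
read_00107
```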
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/seqkit/grep/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,90 @@ +process SEQKIT_GREP { + tag "$meta.id" + label 'process_low' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}seqkit${params.fs}2.2.0" : null) + conda (params.enable_conda ? "bioconda::seqkit=2.2.0 conda-forge::sed=4.7 conda-forge::coreutils" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/seqkit:2.1.0--h9ee0642_0': + 'quay.io/biocontainers/seqkit:2.1.0--h9ee0642_0' }" + + input: + tuple val(meta), path(reads), path(pattern_file) + + output: + tuple val(meta), path("*.gz"), emit: fastx + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def num_read_files = reads.toList().size() + def extension = "fastq" + if ("$reads" ==~ /.+\.fasta|.+\.fasta.gz|.+\.fa|.+\.fa.gz|.+\.fas|.+\.fas.gz|.+\.fna|.+\.fna.gz/) { + extension = "fasta" + } + + if (meta.single_end || num_read_files == 1) { + """ + pattern_file_contents=\$(sed '1!d' $pattern_file) + if [ "\$pattern_file_contents" != "DuMmY" ]; then + cut -f1 -d " " $pattern_file > ${prefix}.seqids.txt + additional_args="-f ${prefix}.seqids.txt $args" + else + additional_args="$args" + fi + + seqkit \\ + grep \\ + -j $task.cpus \\ + -o ${prefix}.seqkit-grep.${extension}.gz \\ + \$additional_args \\ + $reads + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqkit: \$( seqkit | sed '3!d; s/Version: //' ) + END_VERSIONS + """ + } else { + """ + pattern_file_contents=\$(sed '1!d' $pattern_file) + if [ "\$pattern_file_contents" != "DuMmY" ]; then + additional_args="-f $pattern_file $args" + else + additional_args="$args" + fi + + seqkit \\ + grep \\ + -j $task.cpus \\ + -o ${prefix}.R1.seqkit-grep.${extension}.gz \\ + \$additional_args \\ + ${reads[0]} + + seqkit \\ + grep \\ + -j $task.cpus \\ + -o ${prefix}.R2.seqkit-grep.${extension}.gz \\ + \$additional_args \\ + ${reads[1]} + + seqkit \\ + pair \\ + -j $task.cpus \\ + -1 ${prefix}.R1.seqkit-grep.${extension}.gz \\ + -2 ${prefix}.R2.seqkit-grep.${extension}.gz + + rm ${prefix}.R1.seqkit-grep.${extension}.gz + rm ${prefix}.R2.seqkit-grep.${extension}.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqkit: \$( seqkit | sed '3!d; s/Version: //' ) + END_VERSIONS + """ + } +}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/seqkit/seq/README.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,104 @@ +# NextFlow DSL2 Module + +```bash +SEQKIT_SEQ +``` + +## Description + +Run `seqkit seq` command on reads in FASTQ format. Produces a filtered FASTQ file as per the filter strategy mentioned using the `ext.args` within the process scope. + +\ + + +### `input:` + +___ + +Type: `tuple` + +Takes in the following tuple of metadata (`meta`) and a list of reads of type `path` (`reads`) per sample (`id:`). + +Ex: + +```groovy +[ + [ id: 'FAL00870', + strandedness: 'unstranded', + single_end: true, + centrifuge_x: '/hpc/db/centrifuge/2022-04-12/ab' + ], + '/hpc/scratch/test/FAL000870/f1.merged.fq.gz' +] +``` + +\ + + +#### `meta` + +Type: Groovy Map + +A Groovy Map containing the metadata about the FASTQ file. + +Ex: + +```groovy +[ + id: 'FAL00870', + strandedness: 'unstranded', + single_end: true +] +``` + +\ + + +#### `reads` + +Type: `path` + +NextFlow input type of `path` pointing to FASTQ files on which `seqkit seq` should be run. + +\ + + +#### `args` + +Type: Groovy String + +String of optional command-line arguments to be passed to the tool. This can be mentioned in `process` scope within `withName:process_name` block using `ext.args` option within your `nextflow.config` file. + +Ex: + +```groovy +withName: 'SEQKIT_SEQ' { + ext.args = '--max-len 4000' +} +``` + +### `output:` + +___ + +Type: `tuple` + +Outputs a tuple of metadata (`meta` from `input:`) and filtered gzipped FASTQ file. + +\ + + +#### `fastx` + +Type: `path` + +NextFlow output type of `path` pointing to the FASTQ format filtered gzipped file per sample (`id:`). + +\ + + +#### `versions` + +Type: `path` + +NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/seqkit/seq/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,75 @@ +process SEQKIT_SEQ { + tag "$meta.id" + label 'process_micro' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}seqkit${params.fs}2.2.0" : null) + conda (params.enable_conda ? "bioconda::seqkit=2.2.0" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/seqkit:2.1.0--h9ee0642_0': + 'quay.io/biocontainers/seqkit:2.1.0--h9ee0642_0' }" + + input: + tuple val(meta), path(reads) + + output: + tuple val(meta), path("*.gz"), emit: fastx + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + + def extension = "fastq" + if ("$reads" ==~ /.+\.fasta|.+\.fasta.gz|.+\.fa|.+\.fa.gz|.+\.fas|.+\.fas.gz|.+\.fna|.+\.fna.gz/) { + extension = "fasta" + } + + if (meta.single_end) { + """ + seqkit \\ + seq \\ + -j $task.cpus \\ + -o ${prefix}.seqkit-seq.${extension}.gz \\ + $args \\ + $reads + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqkit: \$( seqkit | sed '3!d; s/Version: //' ) + END_VERSIONS + """ + } else { + """ + seqkit \\ + seq \\ + -j $task.cpus \\ + -o ${prefix}.R1.seqkit-seq.${extension}.gz \\ + $args \\ + ${reads[0]} + + seqkit \\ + seq \\ + -j $task.cpus \\ + -o ${prefix}.R2.seqkit-seq.${extension}.gz \\ + $args \\ + ${reads[1]} + + seqkit \\ + pair \\ + -j $task.cpus \\ + -1 ${prefix}.R1.seqkit-seq.${extension}.gz \\ + -2 ${prefix}.R2.seqkit-seq.${extension}.gz + + rm ${prefix}.R1.seqkit-seq.${extension}.gz + rm ${prefix}.R2.seqkit-seq.${extension}.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqkit: \$( seqkit | sed '3!d; s/Version: //' ) + END_VERSIONS + """ + } +}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/sourmash/gather/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,60 @@ +process SOURMASH_GATHER { + tag "$meta.id" + label 'process_nano' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}sourmash${params.fs}4.6.1" : null) + conda (params.enable_conda ? "conda-forge::python bioconda::sourmash=4.6.1" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/sourmash:4.6.1--hdfd78af_0': + 'quay.io/biocontainers/sourmash:4.6.1--hdfd78af_0' }" + + input: + tuple val(meta), path(signature), path(database) + val save_unassigned + val save_matches_sig + val save_prefetch + val save_prefetch_csv + + output: + tuple val(meta), path("*_hits.csv") , emit: result , optional: true + tuple val(meta), path("*_unassigned.sig.zip"), emit: unassigned , optional: true + tuple val(meta), path("*_matches.sig.zip") , emit: matches , optional: true + tuple val(meta), path("*_prefetch.sig.zip") , emit: prefetch , optional: true + tuple val(meta), path("*_prefetch.csv.gz") , emit: prefetchcsv , optional: true + tuple val(meta), path("*FAILED.txt") , emit: failed , optional: true + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def unassigned = save_unassigned ? "--output-unassigned ${prefix}_unassigned.sig.zip" : '' + def matches = save_matches_sig ? "--save-matches ${prefix}_matches.sig.zip" : '' + def prefetch = save_prefetch ? "--save-prefetch ${prefix}_prefetch.sig.zip" : '' + def prefetchcsv = save_prefetch_csv ? "--save-prefetch-csv ${prefix}_prefetch.csv.gz" : '' + + """ + sourmash gather \\ + $args \\ + --output ${prefix}.csv.gz \\ + ${unassigned} \\ + ${matches} \\ + ${prefetch} \\ + ${prefetchcsv} \\ + ${signature} \\ + ${database} + + sourmash_filter_hits.py \\ + $args2 \\ + -csv ${prefix}.csv.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + sourmash: \$(echo \$(sourmash --version 2>&1) | sed 's/^sourmash //' ) + python: \$( python --version | sed 's/Python //g' ) + END_VERSIONS + """ +} \ No newline at end of file
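Besides the `[ meta, signature, database ]` tuple, the process takes four booleans that switch the optional `sourmash gather` outputs on or off. A minimal call sketch (the channel name is hypothetical):

```groovy
// Minimal sketch: keep the unassigned and matched signatures, skip the
// prefetch signature and prefetch CSV outputs.
SOURMASH_GATHER(
    ch_sig_and_db,   // [ meta, query_signature, database ]
    true,            // save_unassigned   -> *_unassigned.sig.zip
    true,            // save_matches_sig  -> *_matches.sig.zip
    false,           // save_prefetch
    false            // save_prefetch_csv
)
```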
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/sourmash/search/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,55 @@ +process SOURMASH_SEARCH { + tag "$meta.id" + label 'process_micro' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}sourmash${params.fs}4.6.1" : null) + conda (params.enable_conda ? "conda-forge::python bioconda::sourmash=4.6.1" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/sourmash:4.6.1--hdfd78af_0': + 'quay.io/biocontainers/sourmash:4.6.1--hdfd78af_0' }" + + input: + tuple val(meta), path(signature), path(database) + val save_matches_sig + + output: + tuple val(meta), path("*.csv.gz") , emit: result , optional: true + tuple val(meta), path("*_scaffolded_genomic.fna.gz"), emit: genomes_fasta, optional: true + tuple val(meta), path("*_matches.sig.zip") , emit: matches , optional: true + path "*FAILED.txt" , emit: failed , optional: true + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def matches = save_matches_sig ? "--save-matches ${prefix}_matches.sig.zip" : '' + def gd = params.tuspy_gd ? "-gd ${params.tuspy_gd}" : '' + + """ + sourmash search \\ + $args \\ + --output ${prefix}.csv.gz \\ + ${matches} \\ + ${signature} \\ + ${database} + + sourmash_filter_hits.py \\ + $args2 \\ + -csv ${prefix}.csv.gz + + gen_otf_genome.py \\ + $gd \\ + -op ${prefix} \\ + -txt ${prefix}_template_hits.txt + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + sourmash: \$(echo \$(sourmash --version 2>&1) | sed 's/^sourmash //' ) + python: \$( python --version | sed 's/Python //g' ) + END_VERSIONS + """ +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/modules/sourmash/sketch/main.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,43 @@ +process SOURMASH_SKETCH { + tag "$meta.id" + label 'process_nano' + + module (params.enable_module ? "${params.swmodulepath}${params.fs}sourmash${params.fs}4.6.1" : null) + conda (params.enable_conda ? "conda-forge::python bioconda::sourmash=4.6.1" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/sourmash:4.6.1--hdfd78af_0': + 'quay.io/biocontainers/sourmash:4.6.1--hdfd78af_0' }" + + input: + tuple val(meta), path(sequence) + val singleton + val merge + val db_or_query + + output: + tuple val(meta), path("*.{query,db}.sig"), emit: signatures + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + // required defaults for the tool to run, but can be overridden + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def merge_sig = merge ? "--merge ${meta.id}" : '' + def singleton = singleton ? '--singleton' : '' + """ + sourmash sketch \\ + $args \\ + $merge_sig \\ + $singleton \\ + --output "${prefix}.${db_or_query}.sig" \\ + $sequence + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + sourmash: \$(echo \$(sourmash --version 2>&1) | sed 's/^sourmash //' ) + END_VERSIONS + """ +} \ No newline at end of file
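The sketch parameters (molecule type, k-mer size, scaling, abundance tracking) are passed through `task.ext.args`; the nowayout workflow config assembles them from `params.sourmash_scale` and `params.sourmash_k`. A hedged sketch of the equivalent static override:

```groovy
// Hypothetical override mirroring the nowayout 'strict' defaults
// (sourmashsketch_mode = 'dna', scaled = 100, k = 71, abundance tracking on).
process {
    withName: 'SOURMASH_SKETCH' {
        ext.args = "dna -p 'abund,scaled=100,k=71'"
    }
}
```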
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/sourmash/tax/metagenome/main.nf Mon Mar 31 14:50:40 2025 -0400
@@ -0,0 +1,40 @@
+process SOURMASH_TAX_METAGENOME {
+    tag "$meta.id"
+    label 'process_nano'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}sourmash${params.fs}4.6.1" : null)
+    conda (params.enable_conda ? "conda-forge::python bioconda::sourmash=4.6.1" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/sourmash:4.6.1--hdfd78af_0':
+        'quay.io/biocontainers/sourmash:4.6.1--hdfd78af_0' }"
+
+    input:
+    tuple val(meta), path(csv), path(lineage)
+
+    output:
+    tuple val(meta), path("*.txt"), emit: txt, optional: true
+    tuple val(meta), path("*.tsv"), emit: tsv, optional: true
+    tuple val(meta), path("*.csv"), emit: csv, optional: true
+    path "versions.yml"           , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    // required defaults for the tool to run, but can be overridden
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+    def output_format = args.findAll(/(--output-format\s+[\w\,]+)\s*/).join("").replaceAll(/\,/, / --output-format /)
+    args = args.replaceAll(/--output-format\s+[\w\,]+\s*/, /${output_format}/)
+    """
+    sourmash tax metagenome \\
+        $args \\
+        -g $csv \\
+        --output-base $prefix
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        sourmash: \$(echo \$(sourmash --version 2>&1) | sed 's/^sourmash //' )
+    END_VERSIONS
+    """
+}
\ No newline at end of file
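The script block rewrites a comma-separated `--output-format` list in `task.ext.args` into repeated `--output-format` flags before invoking `sourmash tax metagenome`. A hedged config sketch (the flag values assume sourmash 4.x report formats):

```groovy
// Hypothetical override: request two report formats; the module expands the
// comma-separated list into `--output-format csv_summary --output-format krona`.
process {
    withName: 'SOURMASH_TAX_METAGENOME' {
        ext.args = '--rank species --output-format csv_summary,krona'
    }
}
```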
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/nextflow.config Mon Mar 31 14:50:40 2025 -0400
@@ -0,0 +1,41 @@
+// Main driver script
+manifest.mainScript = 'cpipes'
+
+def fs = File.separator
+def pd = "${projectDir}"
+
+// Global parameters
+includeConfig "${pd}${fs}conf${fs}manifest.config"
+includeConfig "${pd}${fs}conf${fs}base.config"
+
+// Include FASTQ config to prepare for a case when the entry point is
+// FASTQ metadata CSV or FASTQ input directory
+includeConfig "${pd}${fs}conf${fs}fastq.config"
+
+if (params.pipeline != null) {
+    try {
+        includeConfig "${params.workflowsconf}${fs}${params.pipeline}.config"
+    } catch (Exception e) {
+        System.err.println('-'.multiply(params.linewidth) + "\n" +
+            "\033[0;31m${params.cfsanpipename} - ERROR\033[0m\n" +
+            '-'.multiply(params.linewidth) + "\n" + "\033[0;31mCould not load " +
+            "default pipeline configuration. Please provide a pipeline \n" +
+            "name using the --pipeline option.\n\033[0m" + '-'.multiply(params.linewidth) + "\n")
+        System.exit(1)
+    }
+}
+
+// Include modules' config last.
+includeConfig "${pd}${fs}conf${fs}logtheseparams.config"
+includeConfig "${pd}${fs}conf${fs}modules.config"
+
+// Conda and Singularity cache directories
+conda.cacheDir = "${pd}${fs}kondagac_cache"
+singularity.cacheDir = "${pd}${fs}cingularitygac_cache"
+
+// Clean up after successful run
+// cleanup = true
+
+profiles {
+    includeConfig "${pd}${fs}conf${fs}computeinfra.config"
+}
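The pipeline-specific configuration named by `--pipeline` is pulled in from `params.workflowsconf`; if that include fails, the run aborts with the error above. A hedged sketch of the minimal shape such a per-pipeline config file takes (compare `workflows/conf/nowayout.config` later in this changeset; names and values here are placeholders):

```groovy
// Hypothetical ${params.workflowsconf}/<pipeline>.config skeleton: a params
// block carrying that pipeline's defaults, loaded only when --pipeline <name>
// is supplied on the command line.
params {
    workflow_conceived_by = 'Author Name'
    workflow_built_by     = 'Author Name'
    workflow_version      = '0.5.0'
    // tool switches and thresholds for this pipeline go here
}
```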
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/readme/centriflaken.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,276 @@ +# CPIPES (CFSAN PIPELINES) + +## The modular pipeline repository at CFSAN, FDA + +**CPIPES** (CFSAN PIPELINES) is a collection of modular pipelines based on **NEXTFLOW**, +mostly for bioinformatics data analysis at **CFSAN, FDA.** + +--- + +### **centriflaken** + +--- +Precision long-read metagenomics sequencing for food safety by detection and assembly of Shiga toxin-producing Escherichia coli. + +#### Workflow Usage + +```bash +module load cpipes/0.4.0 + +cpipes --pipeline centriflaken [options] +``` + +Example: Run the default `centriflaken` pipeline with taxa of interest as *E. coli*. + +```bash +cd /hpc/scratch/$USER +mkdir nf-cpipes +cd nf-cpipes +cpipes --pipeline centriflaken --input /path/to/fastq/dir --output /path/to/output --user_email 'Kranti.Konganti@fda.hhs.gov' +``` + +Example: Run the `centriflaken` pipeline with taxa of interest as *Salmonella*. In this mode, `SerotypeFinder` tool will be replaced with `SeqSero2` tool. + +```bash +cd /hpc/scratch/$USER +mkdir nf-cpipes +cd nf-cpipes +cpipes --pipeline centriflaken --centrifuge_extract_bug 'Salmonella' --input /path/to/fastq/dir --output /path/to/output --user_email 'Kranti.Konganti@fda.hhs.gov' +``` + +#### `centriflaken` Help + +```text +[Kranti.Konganti@login2-slurm ]$ cpipes --pipeline centriflaken --help +N E X T F L O W ~ version 21.12.1-edge +Launching `/nfs/software/apps/cpipes/0.4.0/cpipes` [crazy_euler] - revision: 72db279311 +================================================================================ + (o) + ___ _ __ _ _ __ ___ ___ + / __|| '_ \ | || '_ \ / _ \/ __| +| (__ | |_) || || |_) || __/\__ \ + \___|| .__/ |_|| .__/ \___||___/ + | | | | + |_| |_| +-------------------------------------------------------------------------------- +A collection of modular pipelines at CFSAN, FDA. +-------------------------------------------------------------------------------- +Name : CPIPES +Author : Kranti.Konganti@fda.hhs.gov +Version : 0.4.0 +Center : CFSAN, FDA. +================================================================================ + +Workflow : centriflaken + +Author : Kranti.Konganti@fda.hhs.gov + +Version : 0.2.1 + + +Usage : cpipes --pipeline centriflaken [options] + + +Required : + +--input : Absolute path to directory containing FASTQ + files. The directory should contain only + FASTQ files as all the files within the + mentioned directory will be read. Ex: -- + input /path/to/fastq_pass + +--output : Absolute path to directory where all the + pipeline outputs should be stored. Ex: -- + output /path/to/output + +Other options : + +--metadata : Absolute path to metadata CSV file + containing five mandatory columns: sample, + fq1,fq2,strandedness,single_end. The fq1 + and fq2 columns contain absolute paths to + the FASTQ files. This option can be used in + place of --input option. This is rare. Ex: -- + metadata samplesheet.csv + +--fq_suffix : The suffix of FASTQ files (Unpaired reads + or R1 reads or Long reads) if an input + directory is mentioned via --input option. + Default: .fastq.gz + +--fq2_suffix : The suffix of FASTQ files (Paired-end reads + or R2 reads) if an input directory is + mentioned via --input option. Default: + false + +--fq_filter_by_len : Remove FASTQ reads that are less than this + many bases. Default: 4000 + +--fq_strandedness : The strandedness of the sequencing run. + This is mostly needed if your sequencing + run is RNA-SEQ. 
For most of the other runs, + it is probably safe to use unstranded for + the option. Default: unstranded + +--fq_single_end : SINGLE-END information will be auto- + detected but this option forces PAIRED-END + FASTQ files to be treated as SINGLE-END so + only read 1 information is included in auto- + generated samplesheet. Default: false + +--fq_filename_delim : Delimiter by which the file name is split + to obtain sample name. Default: _ + +--fq_filename_delim_idx : After splitting FASTQ file name by using + the --fq_filename_delim option, all + elements before this index (1-based) will + be joined to create final sample name. + Default: 1 + +--kraken2_db : Absolute path to kraken database. Default: / + hpc/db/kraken2/standard-210914 + +--kraken2_confidence : Confidence score threshold which must be + between 0 and 1. Default: 0.0 + +--kraken2_quick : Quick operation (use first hit or hits). + Default: false + +--kraken2_use_mpa_style : Report output like Kraken 1's kraken-mpa- + report. Default: false + +--kraken2_minimum_base_quality : Minimum base quality used in classification + which is only effective with FASTQ input. + Default: 0 + +--kraken2_report_zero_counts : Report counts for ALL taxa, even if counts + are zero. Default: false + +--kraken2_report_minmizer_data : Report minimizer and distinct minimizer + count information in addition to normal + Kraken report. Default: false + +--kraken2_use_names : Print scientific names instead of just + taxids. Default: true + +--kraken2_extract_bug : Extract the reads or contigs beloging to + this bug. Default: Escherichia coli + +--centrifuge_x : Absolute path to centrifuge database. + Default: /hpc/db/centrifuge/2022-04-12/ab + +--centrifuge_save_unaligned : Save SINGLE-END reads that did not align. + For PAIRED-END reads, save read pairs that + did not align concordantly. Default: false + +--centrifuge_save_aligned : Save SINGLE-END reads that aligned. For + PAIRED-END reads, save read pairs that + aligned concordantly. Default: false + +--centrifuge_out_fmt_sam : Centrifuge output should be in SAM. Default: + false + +--centrifuge_extract_bug : Extract this bug from centrifuge results. + Default: Escherichia coli + +--centrifuge_ignore_quals : Treat all quality values as 30 on Phred + scale. Default: false + +--flye_pacbio_raw : Input FASTQ reads are PacBio regular CLR + reads (<20% error) Defaut: false + +--flye_pacbio_corr : Input FASTQ reads are PacBio reads that + were corrected with other methods (<3% + error). Default: false + +--flye_pacbio_hifi : Input FASTQ reads are PacBio HiFi reads (<1% + error). Default: false + +--flye_nano_raw : Input FASTQ reads are ONT regular reads, + pre-Guppy5 (<20% error). Default: true + +--flye_nano_corr : Input FASTQ reads are ONT reads that were + corrected with other methods (<3% error). + Default: false + +--flye_nano_hq : Input FASTQ reads are ONT high-quality + reads: Guppy5+ SUP or Q20 (<5% error). + Default: false + +--flye_genome_size : Estimated genome size (for example, 5m or 2. + 6g). Default: 5.5m + +--flye_polish_iter : Number of genome polishing iterations. + Default: false + +--flye_meta : Do a metagenome assembly (unenven coverage + mode). Default: true + +--flye_min_overlap : Minimum overlap between reads. Default: + false + +--flye_scaffold : Enable scaffolding using assembly graph. + Default: false + +--serotypefinder_run : Run SerotypeFinder tool. Default: true + +--serotypefinder_x : Generate extended output files. 
Default: + true + +--serotypefinder_db : Path to SerotypeFinder databases. Default: / + hpc/db/serotypefinder/2.0.2 + +--serotypefinder_min_threshold : Minimum percent identity (in float) + required for calling a hit. Default: 0.85 + +--serotypefinder_min_cov : Minumum percent coverage (in float) + required for calling a hit. Default: 0.80 + +--seqsero2_run : Run SeqSero2 tool. Default: false + +--seqsero2_t : '1' for interleaved paired-end reads, '2' + for separated paired-end reads, '3' for + single reads, '4' for genome assembly, '5' + for nanopore reads (fasta/fastq). Default: + 4 + +--seqsero2_m : Which workflow to apply, 'a'(raw reads + allele micro-assembly), 'k'(raw reads and + genome assembly k-mer). Default: k + +--seqsero2_c : SeqSero2 will only output serotype + prediction without the directory containing + log files. Default: false + +--seqsero2_s : SeqSero2 will not output header in + SeqSero_result.tsv. Default: false + +--mlst_run : Run MLST tool. Default: true + +--mlst_minid : DNA %identity of full allelle to consider ' + similar' [~]. Default: 95 + +--mlst_mincov : DNA %cov to report partial allele at all [?]. + Default: 10 + +--mlst_minscore : Minumum score out of 100 to match a scheme. + Default: 50 + +--abricate_run : Run ABRicate tool. Default: true + +--abricate_minid : Minimum DNA %identity. Defaut: 90 + +--abricate_mincov : Minimum DNA %coverage. Defaut: 80 + +--abricate_datadir : ABRicate databases folder. Defaut: /hpc/db/ + abricate/1.0.1/db + +Help options : + +--help : Display this message. +``` + +### **BETA** + +--- +The development of the modular structure and flow is an ongoing effort and may change depending on assessment of various computational topics and other considerations.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/readme/centriflaken_hy.md Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,367 @@ +# CPIPES (CFSAN PIPELINES) + +## The modular pipeline repository at CFSAN, FDA + +**CPIPES** (CFSAN PIPELINES) is a collection of modular pipelines based on **NEXTFLOW**, +mostly for bioinformatics data analysis at **CFSAN, FDA.** + +--- + +### **centriflaken_hy** + +--- +`centriflaken_hy` is a variant of the original `centriflaken` pipeline but for Illumina short reads either single-end or paired-end. + +#### Workflow Usage + +```bash +module load cpipes/0.4.0 + +cpipes --pipeline centriflaken_hy [options] +``` + +Example: Run the default `centriflaken_hy` pipeline with taxa of interest as *E. coli*. + +```bash +cd /hpc/scratch/$USER +mkdir nf-cpipes +cd nf-cpipes +cpipes --pipeline centriflaken_hy --input /path/to/illumina/fastq/dir --output /path/to/output --user_email 'Kranti.Konganti@fda.hhs.gov' +``` + +Example: Run the `centriflaken_hy` pipeline with taxa of interest as *Salmonella*. In this mode, `SerotypeFinder` tool will be replaced with `SeqSero2` tool. + +```bash +cd /hpc/scratch/$USER +mkdir nf-cpipes +cd nf-cpipes +cpipes --pipeline centriflaken_hy --centrifuge_extract_bug 'Salmonella' --input /path/to/illumina/fastq/dir --output /path/to/output --user_email 'Kranti.Konganti@fda.hhs.gov' +``` + +#### `centriflaken_hy` Help + +```text +[Kranti.Konganti@login2-slurm ]$ cpipes --pipeline centriflaken_hy --help +N E X T F L O W ~ version 21.12.1-edge +Launching `/home/Kranti.Konganti/apps/cpipes/cpipes` [soggy_curie] - revision: 72db279311 +================================================================================ + (o) + ___ _ __ _ _ __ ___ ___ + / __|| '_ \ | || '_ \ / _ \/ __| +| (__ | |_) || || |_) || __/\__ \ + \___|| .__/ |_|| .__/ \___||___/ + | | | | + |_| |_| +-------------------------------------------------------------------------------- +A collection of modular pipelines at CFSAN, FDA. +-------------------------------------------------------------------------------- +Name : CPIPES +Author : Kranti.Konganti@fda.hhs.gov +Version : 0.4.0 +Center : CFSAN, FDA. +================================================================================ + +Workflow : centriflaken_hy + +Author : Kranti.Konganti@fda.hhs.gov + +Version : 0.4.0 + + +Usage : cpipes --pipeline centriflaken_hy [options] + + +Required : + +--input : Absolute path to directory containing FASTQ + files. The directory should contain only + FASTQ files as all the files within the + mentioned directory will be read. Ex: -- + input /path/to/fastq_pass + +--output : Absolute path to directory where all the + pipeline outputs should be stored. Ex: -- + output /path/to/output + +Other options : + +--metadata : Absolute path to metadata CSV file + containing five mandatory columns: sample, + fq1,fq2,strandedness,single_end. The fq1 + and fq2 columns contain absolute paths to + the FASTQ files. This option can be used in + place of --input option. This is rare. Ex: -- + metadata samplesheet.csv + +--fq_suffix : The suffix of FASTQ files (Unpaired reads + or R1 reads or Long reads) if an input + directory is mentioned via --input option. + Default: _R1_001.fastq.gz + +--fq2_suffix : The suffix of FASTQ files (Paired-end reads + or R2 reads) if an input directory is + mentioned via --input option. Default: + _R2_001.fastq.gz + +--fq_filter_by_len : Remove FASTQ reads that are less than this + many bases. Default: 75 + +--fq_strandedness : The strandedness of the sequencing run. 
+ This is mostly needed if your sequencing + run is RNA-SEQ. For most of the other runs, + it is probably safe to use unstranded for + the option. Default: unstranded + +--fq_single_end : SINGLE-END information will be auto- + detected but this option forces PAIRED-END + FASTQ files to be treated as SINGLE-END so + only read 1 information is included in auto- + generated samplesheet. Default: false + +--fq_filename_delim : Delimiter by which the file name is split + to obtain sample name. Default: _ + +--fq_filename_delim_idx : After splitting FASTQ file name by using + the --fq_filename_delim option, all + elements before this index (1-based) will + be joined to create final sample name. + Default: 1 + +--seqkit_rmdup_run : Remove duplicate sequences using seqkit + rmdup. Default: false + +--seqkit_rmdup_n : Match and remove duplicate sequences by + full name instead of just ID. Defaut: false + +--seqkit_rmdup_s : Match and remove duplicate sequences by + sequence content. Defaut: true + +--seqkit_rmdup_d : Save the duplicated sequences to a file. + Defaut: false + +--seqkit_rmdup_D : Save the number and list of duplicated + sequences to a file. Defaut: false + +--seqkit_rmdup_i : Ignore case while using seqkit rmdup. + Defaut: false + +--seqkit_rmdup_P : Only consider positive strand (i.e. 5') + when comparing by sequence content. Defaut: + false + +--kraken2_db : Absolute path to kraken database. Default: / + hpc/db/kraken2/standard-210914 + +--kraken2_confidence : Confidence score threshold which must be + between 0 and 1. Default: 0.0 + +--kraken2_quick : Quick operation (use first hit or hits). + Default: false + +--kraken2_use_mpa_style : Report output like Kraken 1's kraken-mpa- + report. Default: false + +--kraken2_minimum_base_quality : Minimum base quality used in classification + which is only effective with FASTQ input. + Default: 0 + +--kraken2_report_zero_counts : Report counts for ALL taxa, even if counts + are zero. Default: false + +--kraken2_report_minmizer_data : Report minimizer and distinct minimizer + count information in addition to normal + Kraken report. Default: false + +--kraken2_use_names : Print scientific names instead of just + taxids. Default: true + +--kraken2_extract_bug : Extract the reads or contigs beloging to + this bug. Default: Escherichia coli + +--centrifuge_x : Absolute path to centrifuge database. + Default: /hpc/db/centrifuge/2022-04-12/ab + +--centrifuge_save_unaligned : Save SINGLE-END reads that did not align. + For PAIRED-END reads, save read pairs that + did not align concordantly. Default: false + +--centrifuge_save_aligned : Save SINGLE-END reads that aligned. For + PAIRED-END reads, save read pairs that + aligned concordantly. Default: false + +--centrifuge_out_fmt_sam : Centrifuge output should be in SAM. Default: + false + +--centrifuge_extract_bug : Extract this bug from centrifuge results. + Default: Escherichia coli + +--centrifuge_ignore_quals : Treat all quality values as 30 on Phred + scale. Default: false + +--megahit_run : Run MEGAHIT assembler. Default: true + +--megahit_min_count : <int>. Minimum multiplicity for filtering ( + k_min+1)-mers. Defaut: false + +--megahit_k_list : Comma-separated list of kmer size. All + values must be odd, in the range 15-255, + increment should be <= 28. Ex: '21,29,39,59, + 79,99,119,141'. Default: false + +--megahit_no_mercy : Do not add mercy k-mers. Default: false + +--megahit_bubble_level : <int>. Intensity of bubble merging (0-2), 0 + to disable. 
Default: false + +--megahit_merge_level : <l,s>. Merge complex bubbles of length <= l* + kmer_size and similarity >= s. Default: + false + +--megahit_prune_level : <int>. Strength of low depth pruning (0-3). + Default: false + +--megahit_prune_depth : <int>. Remove unitigs with avg k-mer depth + less than this value. Default: false + +--megahit_low_local_ratio : <float>. Ratio threshold to define low + local coverage contigs. Default: false + +--megahit_max_tip_len : <int>. remove tips less than this value [< + int> * k]. Default: false + +--megahit_no_local : Disable local assembly. Default: false + +--megahit_kmin_1pass : Use 1pass mode to build SdBG of k_min. + Default: false + +--megahit_preset : <str>. Override a group of parameters. + Valid values are meta-sensitive which + enforces '--min-count 1 --k-list 21,29,39, + 49,...,129,141', meta-large (large & + complex metagenomes, like soil) which + enforces '--k-min 27 --k-max 127 --k-step + 10'. Default: meta-sensitive + +--megahit_mem_flag : <int>. SdBG builder memory mode. 0: minimum; + 1: moderate; 2: use all memory specified. + Default: 2 + +--megahit_min_contig_len : <int>. Minimum length of contigs to output. + Default: false + +--spades_run : Run SPAdes assembler. Default: false + +--spades_isolate : This flag is highly recommended for high- + coverage isolate and multi-cell data. + Defaut: false + +--spades_sc : This flag is required for MDA (single-cell) + data. Default: false + +--spades_meta : This flag is required for metagenomic data. + Default: true + +--spades_bio : This flag is required for biosytheticSPAdes + mode. Default: false + +--spades_corona : This flag is required for coronaSPAdes mode. + Default: false + +--spades_rna : This flag is required for RNA-Seq data. + Default: false + +--spades_plasmid : Runs plasmidSPAdes pipeline for plasmid + detection. Default: false + +--spades_metaviral : Runs metaviralSPAdes pipeline for virus + detection. Default: false + +--spades_metaplasmid : Runs metaplasmidSPAdes pipeline for plasmid + detection in metagenomics datasets. Default: + false + +--spades_rnaviral : This flag enables virus assembly module + from RNA-Seq data. Default: false + +--spades_iontorrent : This flag is required for IonTorrent data. + Default: false + +--spades_only_assembler : Runs only the SPAdes assembler module ( + without read error correction). Default: + false + +--spades_careful : Tries to reduce the number of mismatches + and short indels in the assembly. Default: + false + +--spades_cov_cutoff : Coverage cutoff value (a positive float + number). Default: false + +--spades_k : List of k-mer sizes (must be odd and less + than 128). Default: false + +--spades_hmm : Directory with custom hmms that replace the + default ones (very rare). Default: false + +--serotypefinder_run : Run SerotypeFinder tool. Default: true + +--serotypefinder_x : Generate extended output files. Default: + true + +--serotypefinder_db : Path to SerotypeFinder databases. Default: / + hpc/db/serotypefinder/2.0.2 + +--serotypefinder_min_threshold : Minimum percent identity (in float) + required for calling a hit. Default: 0.85 + +--serotypefinder_min_cov : Minumum percent coverage (in float) + required for calling a hit. Default: 0.80 + +--seqsero2_run : Run SeqSero2 tool. Default: false + +--seqsero2_t : '1' for interleaved paired-end reads, '2' + for separated paired-end reads, '3' for + single reads, '4' for genome assembly, '5' + for nanopore reads (fasta/fastq). 
Default: + 4 + +--seqsero2_m : Which workflow to apply, 'a'(raw reads + allele micro-assembly), 'k'(raw reads and + genome assembly k-mer). Default: k + +--seqsero2_c : SeqSero2 will only output serotype + prediction without the directory containing + log files. Default: false + +--seqsero2_s : SeqSero2 will not output header in + SeqSero_result.tsv. Default: false + +--mlst_run : Run MLST tool. Default: true + +--mlst_minid : DNA %identity of full allelle to consider ' + similar' [~]. Default: 95 + +--mlst_mincov : DNA %cov to report partial allele at all [?]. + Default: 10 + +--mlst_minscore : Minumum score out of 100 to match a scheme. + Default: 50 + +--abricate_run : Run ABRicate tool. Default: true + +--abricate_minid : Minimum DNA %identity. Defaut: 90 + +--abricate_mincov : Minimum DNA %coverage. Defaut: 80 + +--abricate_datadir : ABRicate databases folder. Defaut: /hpc/db/ + abricate/1.0.1/db + +Help options : + +--help : Display this message. +``` + +### **BETA** + +--- +The development of the modular structure and flow is an ongoing effort and may change depending on assessment of various computational topics and other considerations.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/subworkflows/process_fastq.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,144 @@ +// Include any necessary methods and modules +include { stopNow; validateParamsForFASTQ } from "${params.routines}" +include { GEN_SAMPLESHEET } from "${params.modules}${params.fs}gen_samplesheet${params.fs}main" +include { SAMPLESHEET_CHECK } from "${params.modules}${params.fs}samplesheet_check${params.fs}main" +include { CAT_FASTQ } from "${params.modules}${params.fs}cat${params.fs}fastq${params.fs}main" +include { SEQKIT_SEQ } from "${params.modules}${params.fs}seqkit${params.fs}seq${params.fs}main" + +// Validate 4 required workflow parameters if +// FASTQ files are the input for the +// entry point. +validateParamsForFASTQ() + +// Start the subworkflow +workflow PROCESS_FASTQ { + main: + versions = Channel.empty() + input_ch = Channel.empty() + reads = Channel.empty() + + def input = file( (params.input ?: params.metadata) ) + + if (params.input) { + def fastq_files = [] + + if (params.fq_suffix == null) { + stopNow("We need to know what suffix the FASTQ files ends with inside the\n" + + "directory. Please use the --fq_suffix option to indicate the file\n" + + "suffix by which the files are to be collected to run the pipeline on.") + } + + if (params.fq_strandedness == null) { + stopNow("We need to know if the FASTQ files inside the directory\n" + + "are sequenced using stranded or non-stranded sequencing. This is generally\n" + + "required if the sequencing experiment is RNA-SEQ. For almost all of the other\n" + + "cases, you can probably use the --fq_strandedness unstranded option to indicate\n" + + "that the reads are unstranded.") + } + + if (params.fq_filename_delim == null || params.fq_filename_delim_idx == null) { + stopNow("We need to know the delimiter of the filename of the FASTQ files.\n" + + "By default the filename delimiter is _ (underscore). This delimiter character\n" + + "is used to split and assign a group name. The group name can be controlled by\n" + + "using the --fq_filename_delim_idx option (1-based). For example, if the FASTQ\n" + + "filename is WT_REP1_001.fastq, then to create a group WT, use the following\n" + + "options: --fq_filename_delim _ --fq_filename_delim_idx 1") + } + + if (!input.exists()) { + stopNow("The input directory,\n${params.input}\ndoes not exist!") + } + + input.eachFileRecurse { + it.name.endsWith("${params.fq_suffix}") ? fastq_files << it : fastq_files << null + } + + if (fastq_files.findAll{ it != null }.size() == 0) { + stopNow("The input directory,\n${params.input}\nis empty! 
or does not " + + "have FASTQ files ending with the suffix: ${params.fq_suffix}") + } + + GEN_SAMPLESHEET( Channel.fromPath(params.input, type: 'dir') ) + GEN_SAMPLESHEET.out.csv.set{ input_ch } + versions.mix( GEN_SAMPLESHEET.out.versions ) + .set { versions } + } else if (params.metadata) { + if (!input.exists()) { + stopNow("The metadata CSV file,\n${params.metadata}\ndoes not exist!") + } + + if (input.size() <= 0) { + stopNow("The metadata CSV file,\n${params.metadata}\nis empty!") + } + + Channel.fromPath(params.metadata, type: 'file') + .set { input_ch } + } + + SAMPLESHEET_CHECK( input_ch ) + .csv + .splitCsv( header: true, sep: ',') + .map { create_fastq_channel(it) } + .groupTuple(by: [0]) + .branch { + meta, fastq -> + single : fastq.size() == 1 + return [ meta, fastq.flatten() ] + multiple : fastq.size() > 1 + return [ meta, fastq.flatten() ] + } + .set { reads } + + CAT_FASTQ( reads.multiple ) + .catted_reads + .mix( reads.single ) + .set { processed_reads } + + if (params.fq_filter_by_len.toInteger() > 0) { + SEQKIT_SEQ( processed_reads ) + .fastx + .set { processed_reads } + + versions.mix( SEQKIT_SEQ.out.versions.first().ifEmpty(null) ) + .set { versions } + } + + versions.mix( + SAMPLESHEET_CHECK.out.versions, + CAT_FASTQ.out.versions.first().ifEmpty(null) + ) + .set { versions } + + emit: + processed_reads + versions +} + +// Function to get list of [ meta, [ fq1, fq2 ] ] +def create_fastq_channel(LinkedHashMap row) { + + def meta = [:] + meta.id = row.sample + meta.single_end = row.single_end.toBoolean() + meta.strandedness = row.strandedness + meta.id = meta.id.split(params.fq_filename_delim)[0..params.fq_filename_delim_idx.toInteger() - 1] + .join(params.fq_filename_delim) + meta.id = (meta.id =~ /\./ ? meta.id.take(meta.id.indexOf('.')) : meta.id) + + def array = [] + + if (!file(row.fq1).exists()) { + stopNow("Please check input metadata CSV. The following Read 1 FASTQ file does not exist!" + + "\n${row.fq1}") + } + if (meta.single_end) { + array = [ meta, [ file(row.fq1) ] ] + } else { + if (!file(row.fq2).exists()) { + stopNow("Please check input metadata CSV. The following Read 2 FASTQ file does not exist!" + + "\n${row.fq2}") + } + array = [ meta, [ file(row.fq1), file(row.fq2) ] ] + } + return array +} \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/subworkflows/prodka.nf Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,45 @@ +// Include any necessary methods and modules +include { PRODIGAL } from "${params.modules}${params.fs}prodigal${params.fs}main" +include { PROKKA } from "${params.modules}${params.fs}prokka${params.fs}main" + +// Start the subworkflow +workflow PRODKA { + take: + trained_asm + predict_asm + + main: + PRODIGAL( + trained_asm, + (params.prodigal_f ?: 'gbk') + ) + + PROKKA( + predict_asm + .join(PRODIGAL.out.proteins) + .join(PRODIGAL.out.trained) + ) + + PRODIGAL.out.versions + .mix( PROKKA.out.versions ) + .set{ versions } + emit: + prodigal_gene_annots = PRODIGAL.out.gene_annotations + prodigal_fna = PRODIGAL.out.cds + prodigal_faa = PRODIGAL.out.proteins + prodigal_all_gene_annots = PRODIGAL.out.all_gene_annotations + prodigal_trained = PRODIGAL.out.trained + prokka_gff = PROKKA.out.gff + prokka_gbk = PROKKA.out.gbk + prokka_fna = PROKKA.out.fna + prokka_sqn = PROKKA.out.sqn + prokka_ffn = PROKKA.out.ffn + prokka_fsa = PROKKA.out.fsa + prokka_faa = PROKKA.out.faa + prokka_tbl = PROKKA.out.tbl + prokka_err = PROKKA.out.err + prokka_log = PROKKA.out.log + prokka_txt = PROKKA.out.txt + prokka_tsv = PROKKA.out.tsv + versions +}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/workflows/conf/nowayout.config Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,180 @@ +params { + workflow_conceived_by = 'Kranti Konganti' + workflow_built_by = 'Kranti Konganti' + workflow_version = '0.5.0' + db_mode = 'mitomine' + db_root = '/galaxy/cfsan-centriflaken-db/nowayout' + nowo_thresholds = 'strict' + fastp_run = true + fastp_failed_out = false + fastp_merged_out = false + fastp_overlapped_out = false + fastp_6 = false + fastp_reads_to_process = 0 + fastp_fix_mgi_id = false + fastp_A = false + fastp_use_custom_adapters = false + fastp_adapter_fasta = (params.fastp_use_custom_adapters ? "${projectDir}" + + File.separator + + 'assets' + + File.separator + + 'adaptors.fa' : false) + fastp_f = 0 + fastp_t = 0 + fastp_b = 0 + fastp_F = 0 + fastp_T = 0 + fastp_B = 0 + fastp_dedup = true + fastp_dup_calc_accuracy = 6 + fastp_poly_g_min_len = 10 + fastp_G = true + fastp_x = false + fastp_poly_x_min_len = 10 + fastp_cut_front = true + fastp_cut_tail = false + fastp_cut_right = true + fastp_W = 20 + fastp_M = 30 + fastp_q = 30 + fastp_u = 40 + fastp_n = 5 + fastp_e = 0 + fastp_l = 35 + fastp_max_len = 0 + fastp_y = true + fastp_Y = 30 + fastp_U = false + fastp_umi_loc = false + fastp_umi_len = false + fastp_umi_prefix = false + fastp_umi_skip = false + fastp_p = true + fastp_P = 20 + kmaalign_run = true + kmaalign_idx = ("${params.db_root}" + + File.separator + + "kma" + + File.separator + + "${params.db_mode}") + kmaalign_ignorequals = false + kmaalign_int = false + kmaalign_ef = false + kmaalign_vcf = false + kmaalign_sam = false + kmaalign_nc = true + kmaalign_na = true + kmaalign_nf = false + kmaalign_a = false + kmaalign_and = true + kmaalign_oa = false + kmaalign_bc = false + kmaalign_bcNano = false + kmaalign_bcd = false + kmaalign_bcg = false + kmaalign_ID = (params.nowo_thresholds =~ /strict|mild/ ? 85.0 : 50.0) + kmaalign_md = false + kmaalign_dense = false + kmaalign_ref_fsa = false + kmaalign_Mt1 = false + kmaalign_1t1 = false + kmaalign_mrs = (params.nowo_thresholds ==~ /strict/ ? 0.99 : 0.90) + kmaalign_mrc = (params.nowo_thresholds ==~ /strict/ ? 0.99 : 0.90) + kmaalign_mp = (params.nowo_thresholds ==~ /strict/ ? 30 : 20) + kmaalign_eq = (params.nowo_thresholds ==~ /strict/ ? 30 : 20) + kmaalign_mrs = (params.nowo_thresholds ==~ /mild/ ? 0.90 : params.kmaalign_mrs) + kmaalign_mrc = (params.nowo_thresholds ==~ /mild/ ? 0.90 : params.kmaalign_mrc) + kmaalign_mp = (params.nowo_thresholds ==~ /mild/ ? 20 : params.kmaalign_mp) + kmaalign_eq = (params.nowo_thresholds ==~ /mild/ ? 20 : params.kmaalign_eq) + kmaalign_mp = (params.kmaalign_ignorequals ? 0 : params.kmaalign_mp) + kmaalign_eq = (params.kmaalign_ignorequals ? 
0 : params.kmaalign_eq) + kmaalign_mq = false + kmaalign_5p = false + kmaalign_3p = false + kmaalign_apm = false + kmaalign_cge = false + tuspy_gd = false + seqkit_grep_run = true + seqkit_grep_n = false + seqkit_grep_s = false + seqkit_grep_c = false + seqkit_grep_C = false + seqkit_grep_i = false + seqkit_grep_v = false + seqkit_grep_m = false + seqkit_grep_r = false + salmonidx_run = true + salmonidx_k = false + salmonidx_gencode = false + salmonidx_features = false + salmonidx_keepDuplicates = true + salmonidx_keepFixedFasta = false + salmonidx_filterSize = false + salmonidx_sparse = false + salmonidx_n = true + salmonidx_decoys = false + salmonalign_libtype = 'SF' + ref_fna = ("${params.db_root}" + + File.separator + + "reference" + + File.separator + + "${params.db_mode}" + + ".fna") + sourmash_k = (params.nowo_thresholds ==~ /strict/ ? 71 : 51) + sourmash_scale = (params.nowo_thresholds ==~ /strict/ ? 100 : 100) + sourmashsketch_run = true + sourmashsketch_mode = 'dna' + sourmashsketch_file = false + sourmashsketch_f = false + sourmashsketch_name = false + sourmashsketch_p = "'abund,scaled=${params.sourmash_scale},k=${params.sourmash_k}'" + sourmashsketch_randomize = false + sourmashgather_run = (params.sourmashsketch_run ?: false) + sourmashgather_n = false + sourmashgather_thr_bp = (params.nowo_thresholds ==~ /strict/ ? 100 : 100) + sourmashgather_ignoreabn = false + sourmashgather_prefetch = false + sourmashgather_noprefetch = false + sourmashgather_ani_ci = true + sourmashgather_k = "${params.sourmash_k}" + sourmashgather_protein = false + sourmashgather_rna = false + sourmashgather_nuc = false + sourmashgather_noprotein = false + sourmashgather_dayhoff = false + sourmashgather_nodayhoff = false + sourmashgather_hp = false + sourmashgather_nohp = false + sourmashgather_dna = true + sourmashgather_nodna = false + sourmashgather_scaled = false + sourmashgather_inc_pat = false + sourmashgather_exc_pat = false + sfhpy_run = true + sfhpy_fcn = 'f_match' + sfhpy_fcv = (params.nowo_thresholds ==~ /strict/ ? "0.8" : "0.5") + sfhpy_gt = true + sfhpy_lt = false + sfhpy_all = true + lineages_csv = ("${params.db_root}" + + File.separator + + "taxonomy" + + File.separator + + "${params.db_mode}" + + File.separator + + "lineages.csv") + gsalkronapy_run = true + gsalkronapy_sf = 10000 + gsalkronapy_smres_suffix = false + gsalkronapy_failed_suffix = false + gsalkronapy_num_lin_cols = false + gsalkronapy_lin_regex = false + krona_ktIT_run = true + krona_ktIT_n = 'all' + krona_ktIT_q = false + krona_ktIT_c = false + krona_res_suffix = '.krona.tsv' + fq_filter_by_len = 0 + fq_suffix = (params.fq_single_end ? '.fastq.gz' : '_R1_001.fastq.gz') + fq2_suffix = '_R2_001.fastq.gz' +} \ No newline at end of file
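The threshold-dependent values in nowayout.config above hinge on two different Groovy regex operators: `==~` requires the whole string to match the pattern, while `=~` (used for kmaalign_ID) only requires the pattern to be found somewhere in the string. A standalone Groovy sketch with an example value (not part of the pipeline) illustrates the difference.

```groovy
// Example value; the real value comes from params.nowo_thresholds.
def thresholds = 'strict'

// ==~ (match operator) is true only when the WHOLE string matches the pattern.
assert thresholds ==~ /strict/
assert !('semi-strict' ==~ /strict/)

// =~ (find operator) is truthy when the pattern occurs ANYWHERE in the string,
// which is why kmaalign_ID treats both 'strict' and 'mild' the same way.
assert 'semi-strict' =~ /strict|mild/

// The same ternary style as the config entries above.
def kmaalign_mrs = (thresholds ==~ /strict/ ? 0.99 : 0.90)
assert kmaalign_mrs == 0.99
```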
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/0.5.0/workflows/conf/process/nowayout.process.config Mon Mar 31 14:50:40 2025 -0400 @@ -0,0 +1,121 @@ +process { + withName: 'SEQKIT_SEQ' { + ext.args = [ + params.fq_filter_by_len ? "-m ${params.fq_filter_by_len}" : '' + ].join(' ').trim() + } + + // withName: 'SAMTOOLS_FASTQ' { + // ext.args = (params.fq_single_end ? '-F 4' : '-f 2') + // } + + if (params.fastp_run) { + withName: 'FASTP' { + ext.args = addParamsToSummary( + loadThisFunction("${params.toolshelp}${params.fs}fastp.nf").fastpHelp(params).helpparams + ) + } + } + + if (params.kmaalign_run) { + withName: 'KMA_ALIGN' { + ext.args = addParamsToSummary( + loadThisFunction("${params.toolshelp}${params.fs}kmaalign.nf").kmaalignHelp(params).helpparams + ) + } + } + + if (params.seqkit_grep_run) { + withName: 'SEQKIT_GREP' { + ext.args = addParamsToSummary( + loadThisFunction("${params.toolshelp}${params.fs}seqkitgrep.nf").seqkitgrepHelp(params).helpparams + ) + } + } + + if (params.salmonidx_run){ + withName: 'SALMON_INDEX' { + ext.args = addParamsToSummary( + loadThisFunction("${params.toolshelp}${params.fs}salmonidx.nf").salmonidxHelp(params).helpparams + ) + } + + withName: 'SALMON_QUANT' { + errorStrategy = 'ignore' + ext.args = '--minAssignedFrags 1' + } + } + + if (params.sourmashsketch_run) { + withName: 'SOURMASH_SKETCH' { + ext.args = addParamsToSummary( + loadThisFunction("${params.toolshelp}${params.fs}sourmashsketch.nf").sourmashsketchHelp(params).helpparams + ) + } + } + + if (params.sourmashgather_run) { + withName: 'SOURMASH_GATHER' { + ext.args = addParamsToSummary( + loadThisFunction("${params.toolshelp}${params.fs}sourmashgather.nf").sourmashgatherHelp(params).helpparams + ) + + if (params.sfhpy_run) { + ext.args2 = addParamsToSummary( + loadThisFunction("${params.toolshelp}${params.fs}sfhpy.nf").sfhpyHelp(params).helpparams + ) + } + } + } + + // if (params.sourmashtaxmetagenome_run) { + // withName: 'SOURMASH_TAX_METAGENOME' { + // ext.args = addParamsToSummary( + // loadThisFunction("${params.toolshelp}${params.fs}sourmashtaxmetagenome.nf").sourmashtaxmetagenomeHelp(params).helpparams + // ) + // } + // } + + if (params.gsalkronapy_run) { + withName: 'NOWAYOUT_RESULTS' { + ext.args = addParamsToSummary( + loadThisFunction("${params.toolshelp}${params.fs}gsalkronapy.nf").gsalkronapyHelp(params).helpparams + ) + } + } + + if (params.krona_ktIT_run) { + withName: 'KRONA_KTIMPORTTEXT' { + ext.args = addParamsToSummary( + loadThisFunction("${params.toolshelp}${params.fs}kronaktimporttext.nf").kronaktimporttextHelp(params).helpparams + ) + } + } +} + +// Method to instantiate a new function parser +// Need to refactor using ScriptParser... another day +def loadThisFunction (func_file) { + GroovyShell grvy_sh = new GroovyShell() + def func = grvy_sh.parse(new File ( func_file ) ) + return func +} + +// Method to add relevant final parameters to summary log +def addParamsToSummary(Map params_to_add = [:]) { + + if (!params_to_add.isEmpty()) { + def not_null_params_to_add = params_to_add.findAll { + it.value.clivalue != null && + it.value.clivalue != '[:]' && + it.value.clivalue != '' + } + + params.logtheseparams += not_null_params_to_add.keySet().toList() + + return not_null_params_to_add.collect { + "${it.value.cliflag} ${it.value.clivalue.toString().replaceAll(/(?:^\s+|\s+$)/, '')}" + }.join(' ').trim() + } + return 1 +} \ No newline at end of file
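addParamsToSummary() above builds each process's ext.args string from the helpparams map returned by the corresponding tool-help file under params.toolshelp. A minimal sketch of the assumed cliflag/clivalue entry shape follows; the entries are hypothetical and only mirror the filter/collect/join logic, not the params.logtheseparams bookkeeping.

```groovy
// Hypothetical helpparams entries; the real maps come from the
// *Help(params).helpparams calls in the tool-help files.
def helpparams = [
    fastp_l: [ cliflag: '-l', clivalue: 35   ],
    fastp_q: [ cliflag: '-q', clivalue: 30   ],
    fastp_x: [ cliflag: '-x', clivalue: ''   ],   // dropped: empty value
    fastp_A: [ cliflag: '-A', clivalue: null ]    // dropped: null value
]

// Same filter/collect/join logic as addParamsToSummary() above.
def ext_args = helpparams
    .findAll { it.value.clivalue != null && it.value.clivalue != '[:]' && it.value.clivalue != '' }
    .collect { "${it.value.cliflag} ${it.value.clivalue.toString().trim()}" }
    .join(' ')

assert ext_args == '-l 35 -q 30'
```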
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/workflows/nowayout.nf	Mon Mar 31 14:50:40 2025 -0400
@@ -0,0 +1,340 @@
+// Define any required imports for this specific workflow
+import java.nio.file.Paths
+import java.util.zip.GZIPInputStream
+import java.io.FileInputStream
+import nextflow.file.FileHelper
+
+
+// Include any necessary methods
+include { \
+    summaryOfParams; stopNow; fastqEntryPointHelp; sendMail; \
+    addPadding; wrapUpHelp } from "${params.routines}"
+include { fastpHelp } from "${params.toolshelp}${params.fs}fastp"
+include { kmaalignHelp } from "${params.toolshelp}${params.fs}kmaalign"
+include { seqkitgrepHelp } from "${params.toolshelp}${params.fs}seqkitgrep"
+include { salmonidxHelp } from "${params.toolshelp}${params.fs}salmonidx"
+include { sourmashsketchHelp } from "${params.toolshelp}${params.fs}sourmashsketch"
+include { sourmashgatherHelp } from "${params.toolshelp}${params.fs}sourmashgather"
+include { sfhpyHelp } from "${params.toolshelp}${params.fs}sfhpy"
+include { gsalkronapyHelp } from "${params.toolshelp}${params.fs}gsalkronapy"
+include { kronaktimporttextHelp } from "${params.toolshelp}${params.fs}kronaktimporttext"
+
+// Exit if help requested before any subworkflows
+if (params.help) {
+    log.info help()
+    exit 0
+}
+
+
+// Include any necessary modules and subworkflows
+include { PROCESS_FASTQ } from "${params.subworkflows}${params.fs}process_fastq"
+include { FASTP } from "${params.modules}${params.fs}fastp${params.fs}main"
+include { KMA_ALIGN } from "${params.modules}${params.fs}kma${params.fs}align${params.fs}main"
+include { OTF_GENOME } from "${params.modules}${params.fs}otf_genome${params.fs}main"
+include { SEQKIT_GREP } from "${params.modules}${params.fs}seqkit${params.fs}grep${params.fs}main"
+include { SALMON_INDEX } from "${params.modules}${params.fs}salmon${params.fs}index${params.fs}main"
+include { SALMON_QUANT } from "${params.modules}${params.fs}salmon${params.fs}quant${params.fs}main"
+include { SOURMASH_SKETCH } from "${params.modules}${params.fs}sourmash${params.fs}sketch${params.fs}main"
+include { SOURMASH_SKETCH \
+    as REDUCE_DB_IDX } from "${params.modules}${params.fs}sourmash${params.fs}sketch${params.fs}main"
+include { SOURMASH_GATHER } from "${params.modules}${params.fs}sourmash${params.fs}gather${params.fs}main"
+include { NOWAYOUT_RESULTS } from "${params.modules}${params.fs}nowayout_results${params.fs}main"
+include { KRONA_KTIMPORTTEXT } from "${params.modules}${params.fs}krona${params.fs}ktimporttext${params.fs}main"
+include { DUMP_SOFTWARE_VERSIONS } from "${params.modules}${params.fs}custom${params.fs}dump_software_versions${params.fs}main"
+include { MULTIQC } from "${params.modules}${params.fs}multiqc${params.fs}main"
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    INPUTS AND ANY CHECKS FOR THE NOWAYOUT WORKFLOW
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+def reads_platform = 0
+reads_platform += (params.input ? 1 : 0)
+
+if (reads_platform < 1) {
+    stopNow("Please provide at least one absolute path to an input folder that contains\n" +
+        "FASTQ files using the --input option.\n" +
+        "Ex: --input (Illumina or Generic short reads in FASTQ format)")
+}
+
+params.fastp_adapter_fasta ? checkMetadataExists(params.fastp_adapter_fasta, 'Adapter sequences FASTA') : null
+checkMetadataExists(params.lineages_csv, 'Lineages CSV')
+checkMetadataExists(params.kmaalign_idx, 'KMA Indices')
+checkMetadataExists(params.ref_fna, 'FASTA reference')
+
+ch_sourmash_lin = file( params.lineages_csv )
+
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    RUN THE NOWAYOUT WORKFLOW
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+workflow NOWAYOUT {
+    main:
+        log.info summaryOfParams()
+
+        PROCESS_FASTQ()
+
+        PROCESS_FASTQ.out.versions
+            .set { software_versions }
+
+        PROCESS_FASTQ.out.processed_reads
+            .set { ch_processed_reads }
+
+        ch_processed_reads
+            .map { meta, fastq ->
+                meta.get_kma_hit_accs = true
+                meta.salmon_decoys = params.dummyfile
+                meta.salmon_lib_type = (params.salmonalign_libtype ?: false)
+                meta.kma_t_db = params.kmaalign_idx
+                [ meta, fastq ]
+            }
+            .filter { meta, fastq ->
+                fq_file = ( fastq.getClass().toString() =~ /ArrayList/ ? fastq : [ fastq ] )
+                fq_gzip = new GZIPInputStream( new FileInputStream( fq_file[0].toAbsolutePath().toString() ) )
+                fq_gzip.read() != -1
+            }
+            .set { ch_processed_reads }
+
+        FASTP( ch_processed_reads )
+
+        FASTP.out.json
+            .map { meta, json ->
+                json
+            }
+            .collect()
+            .set { ch_multiqc }
+
+        KMA_ALIGN(
+            FASTP.out.passed_reads
+                .map { meta, fastq ->
+                    [meta, fastq, []]
+                }
+        )
+
+        OTF_GENOME(
+            KMA_ALIGN.out.hits
+                .join(KMA_ALIGN.out.frags)
+        )
+
+        OTF_GENOME.out.reads_extracted
+            .filter { meta, fasta ->
+                fa_file = ( fasta.getClass().toString() =~ /ArrayList/ ? fasta : [ fasta ] )
+                fa_gzip = new GZIPInputStream( new FileInputStream( fa_file[0].toAbsolutePath().toString() ) )
+                fa_gzip.read() != -1
+            }
+            .set { ch_mito_aln_reads }
+
+        SEQKIT_GREP(
+            KMA_ALIGN.out.hits
+                .filter { meta, mapped_refs ->
+                    patterns = file( mapped_refs )
+                    patterns.size() > 0
+                }
+                .map { meta, mapped_refs ->
+                    [meta, params.ref_fna, mapped_refs]
+                }
+        )
+
+        SALMON_INDEX( SEQKIT_GREP.out.fastx )
+
+        SALMON_QUANT(
+            ch_mito_aln_reads
+                .join( SALMON_INDEX.out.idx )
+        )
+
+        REDUCE_DB_IDX(
+            SEQKIT_GREP.out.fastx,
+            true,
+            false,
+            'db'
+        )
+
+        SOURMASH_SKETCH(
+            ch_mito_aln_reads,
+            false,
+            false,
+            'query'
+        )
+
+        SOURMASH_GATHER(
+            SOURMASH_SKETCH.out.signatures
+                .join( REDUCE_DB_IDX.out.signatures ),
+            [], [], [], []
+        )
+
+        // SOURMASH_TAX_METAGENOME(
+        //     SOURMASH_GATHER.out.result
+        //         .groupTuple(by: [0])
+        //         .map { meta, csv ->
+        //             [ meta, csv, ch_sourmash_lin ]
+        //         }
+        // )
+
+        // SOURMASH_TAX_METAGENOME.out.csv
+        //     .map { meta, csv ->
+        //         csv
+        //     }
+        //     .set { ch_lin_csv }
+
+        // SOURMASH_TAX_METAGENOME.out.tsv
+        //     .tap { ch_lin_krona }
+        //     .map { meta, tsv ->
+        //         tsv
+        //     }
+        //     .tap { ch_lin_tsv }
+
+        SOURMASH_GATHER.out.result
+            .groupTuple(by: [0])
+            .map { meta, csv ->
+                [ csv ]
+            }
+            .concat(
+                SALMON_QUANT.out.results
+                    .map { meta, salmon_res ->
+                        [ salmon_res ]
+                    }
+            )
+            .concat(
+                SOURMASH_GATHER.out.failed
+                    .map { meta, failed ->
+                        [ failed ]
+                    }
+            )
+            .concat( OTF_GENOME.out.failed )
+            .collect()
+            .flatten()
+            .collect()
+            .set { ch_gene_abn }
+
+        NOWAYOUT_RESULTS( ch_gene_abn, ch_sourmash_lin )
+
+        NOWAYOUT_RESULTS.out.tsv
+            .flatten()
+            .filter { tsv -> tsv.toString() =~ /.*${params.krona_res_suffix}$/ }
+            .map { tsv ->
+                meta = [:]
+                meta.id = "${params.cfsanpipename}_${params.pipeline}_krona"
+                [ meta, tsv ]
+            }
+            .groupTuple(by: [0])
+            .set { ch_lin_krona }
+
+        // ch_lin_tsv
+        //     .mix( ch_lin_csv )
+        //     .collect()
+        //     .set { ch_lin_summary }
+
+        // SOURMASH_TAX_METAGENOME.out.txt
+        //     .map { meta, txt ->
+        //         txt
+        //     }
+        //     .collect()
+        //     .set { ch_lin_kreport }
+
+        // NOWAYOUT_RESULTS(
+        //     ch_lin_summary
+        //         .concat( SOURMASH_GATHER.out.failed )
+        //         .concat( OTF_GENOME.out.failed )
+        //         .collect()
+        // )
+
+        KRONA_KTIMPORTTEXT( ch_lin_krona )
+
+        DUMP_SOFTWARE_VERSIONS(
+            software_versions
+                .mix (
+                    FASTP.out.versions,
+                    KMA_ALIGN.out.versions,
+                    SEQKIT_GREP.out.versions,
+                    REDUCE_DB_IDX.out.versions,
+                    SOURMASH_SKETCH.out.versions,
+                    SOURMASH_GATHER.out.versions,
+                    SALMON_INDEX.out.versions,
+                    SALMON_QUANT.out.versions,
+                    NOWAYOUT_RESULTS.out.versions,
+                    KRONA_KTIMPORTTEXT.out.versions
+                )
+                .unique()
+                .collectFile(name: 'collected_versions.yml')
+        )
+
+        DUMP_SOFTWARE_VERSIONS.out.mqc_yml
+            .concat(
+                ch_multiqc,
+                NOWAYOUT_RESULTS.out.mqc_yml
+            )
+            .collect()
+            .flatten()
+            .collect()
+            .set { ch_multiqc }
+
+        MULTIQC( ch_multiqc )
+}
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    ON COMPLETE, SHOW GORY DETAILS OF ALL PARAMS WHICH WILL BE HELPFUL TO DEBUG
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+workflow.onComplete {
+    if (workflow.success) {
+        sendMail()
+    }
+}
+
+workflow.onError {
+    sendMail()
+}
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    METHOD TO CHECK METADATA EXISTENCE
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+def checkMetadataExists(file_path, msg) {
+    file_path_obj = file( file_path )
+
+    if (msg.toString().find(/(?i)KMA/)) {
+        if (!file_path_obj.parent.exists() || file_path_obj.parent.size() == 0) {
+            stopNow("Please check if your ${msg}\n" +
+                "[ ${file_path} ]\nexists and that the files are not of size 0.")
+        }
+    }
+    else if (!file_path_obj.exists() || file_path_obj.size() == 0) {
+        stopNow("Please check if your ${msg} file\n" +
+            "[ ${file_path} ]\nexists and is not of size 0.")
+    }
+}
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    HELP TEXT METHODS FOR THE NOWAYOUT WORKFLOW
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+def help() {
+
+    Map helptext = [:]
+
+    helptext.putAll (
+        fastqEntryPointHelp() +
+        fastpHelp(params).text +
+        kmaalignHelp(params).text +
+        seqkitgrepHelp(params).text +
+        salmonidxHelp(params).text +
+        sourmashsketchHelp(params).text +
+        sourmashgatherHelp(params).text +
+        sfhpyHelp(params).text +
+        gsalkronapyHelp(params).text +
+        kronaktimporttextHelp(params).text +
+        wrapUpHelp()
+    )
+
+    return addPadding(helptext)
+}
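The two filter{} blocks in the NOWAYOUT workflow above drop samples whose gzipped FASTQ or FASTA is empty by trying to read a single byte through a GZIPInputStream. A small standalone Groovy sketch of that check, with a hypothetical file name and the stream explicitly closed, is shown below.

```groovy
// Standalone sketch of the "skip empty gzipped file" check used in the
// filter{} blocks above; read() returns -1 only when the decompressed
// stream is empty. The file name here is hypothetical.
import java.util.zip.GZIPInputStream

def fq = new File('sampleA_R1_001.fastq.gz')
boolean nonEmpty = false

new GZIPInputStream(new FileInputStream(fq)).withCloseable { gz ->
    nonEmpty = (gz.read() != -1)
}

println(nonEmpty ? "keep ${fq.name}" : "drop ${fq.name}")
```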
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/hfp_nowayout.xml	Mon Mar 31 14:50:40 2025 -0400
@@ -0,0 +1,199 @@
+<tool id="hfp_nowayout" name="nowayout" version="0.5.0+galaxy24">
+    <description>An automated workflow to identify mitochondrial reads and classify Eukaryotes.</description>
+    <requirements>
+        <container type="docker">quay.io/biocontainers/nextflow:24.10.4--hdfd78af_0</container>
+    </requirements>
+    <version_command>nextflow -version</version_command>
+    <command detect_errors="exit_code"><![CDATA[
+        input_path=\$(pwd)"/cpipes-input";
+        mkdir -p "\${input_path}" || exit 1;
+        #import re
+        #if (str($input_read_type_cond.input_read_type) == "single_long"):
+            #for _, $unpaired in enumerate($input_read_type_cond.input):
+                #set read1 = str($unpaired.name)
+                #if not str($unpaired.name).endswith(('.fastq', '.fastq.gz')):
+                    #set read1_ext = re.sub('fastqsanger', 'fastq', str($unpaired.ext))
+                    #set read1 = str($unpaired.name) + str('.') + $read1_ext
+                #end if
+                ln -sf '$unpaired' "\${input_path}/$read1";
+            #end for
+        #elif (str($input_read_type_cond.input_read_type) == "paired"):
+            #for _, $pair in enumerate($input_read_type_cond.input_pair)
+                #set read_R1 = re.sub('\:forward', '_forward', str($pair.forward.name))
+                #set read_R2 = re.sub('\:reverse', '_reverse', str($pair.reverse.name))
+                #set read_R1_ext = re.sub('fastqsanger', 'fastq', str($pair.forward.ext))
+                #set read_R2_ext = re.sub('fastqsanger', 'fastq', str($pair.reverse.ext))
+                #if not str($pair.forward.name).endswith(('.fastq', '.fastq.gz')):
+                    #set read_R1 = $read_R1 + str('.') + $read_R1_ext
+                #end if
+                #if not str($pair.reverse.name).endswith(('.fastq', '.fastq.gz')):
+                    #set read_R2 = $read_R2 + str('.') + $read_R2_ext
+                #end if
+                ln -sf '$pair.forward' "\${input_path}/$read_R1";
+                ln -sf '$pair.reverse' "\${input_path}/$read_R2";
+            #end for
+        #end if
+        $__tool_directory__/0.5.0/cpipes
+            --pipeline nowayout
+            --input \${input_path}
+            --output cpipes-output
+            --fq_suffix '${input_read_type_cond.fq_suffix}'
+        #if (str($input_read_type_cond.input_read_type) == "single_long"):
+            --fq_single_end true
+        #elif (str($input_read_type_cond.input_read_type) == "paired"):
+            --fq_single_end false --fq2_suffix '${input_read_type_cond.fq2_suffix}'
+        #end if
+            --db_mode $nowo_db_mode
+            --nowo_thresholds $nowo_thresholds
+            --fq_filename_delim '${fq_filename_delim}'
+            --fq_filename_delim_idx $fq_filename_delim_idx
+            -profile gxkubernetes;
+        mv './cpipes-output/nowayout-multiqc/multiqc_report.html' './multiqc_report.html' || exit 1;
+        mv './cpipes-output/krona_ktimporttext/CPIPES_nowayout_krona.html' './CPIPES_nowayout_krona.html' || exit 1;
+        rm -rf ./cpipes-output || exit 1;
+        rm -rf ./work || exit 1;
+    ]]></command>
+    <inputs>
+        <conditional name="input_read_type_cond">
+            <param name="input_read_type" type="select" label="Select the read collection type">
+                <option value="single_long" selected="true">Single-End short reads</option>
+                <option value="paired">Paired-End short reads</option>
+            </param>
+            <when value="single_long">
+                <param name="input" type="data_collection" collection_type="list" format="fastq,fastq.gz"
+                    label="Dataset list of unpaired short reads or long reads" />
+                <param name="fq_suffix" value=".fastq.gz" type="text" label="Suffix of the Single-End FASTQ"/>
+            </when>
+            <when value="paired">
+                <param name="input_pair" type="data_collection" collection_type="list:paired" format="fastq,fastq.gz" label="List of Dataset pairs" />
+                <param name="fq_suffix" value="_R1_001.fastq.gz" type="text" label="Suffix of the R1 FASTQ"
+                    help="For any data sets downloaded from NCBI into Galaxy, change this to the _forward.fastq.gz suffix."/>
+                <param name="fq2_suffix" value="_R2_001.fastq.gz" type="text" label="Suffix of the R2 FASTQ"
+                    help="For any data sets downloaded from NCBI into Galaxy, change this to the _reverse.fastq.gz suffix."/>
+            </when>
+        </conditional>
+        <param name="nowo_db_mode" type="select" label="Select the database to use with nowayout"
+            help="Please see the section below about the available databases.">
+            <option value="mitomine" selected="true">mitomine</option>
+            <option value="cytox1">cytox1</option>
+            <option value="voucher">voucher</option>
+            <option value="ganoderma">ganoderma</option>
+            <option value="listeria">listeria</option>
+        </param>
+        <param name="nowo_thresholds" type="select" label="Select the type of base quality thresholds to be set with nowayout"
+            help="The default value sets the strictest thresholds, which tend to filter out most of the false-positive hits.">
+            <option value="strict" selected="true">strict</option>
+            <option value="relax">relax</option>
+        </param>
+        <param name="fq_filename_delim" type="text" value="_" label="File name delimiter by which samples are grouped together (--fq_filename_delim)"
+            help="This is the delimiter by which samples are grouped together for display in the final MultiQC report. For example, if your input data sets are mango_replicate1.fastq.gz, mango_replicate2.fastq.gz, orange_replicate1_maryland.fastq.gz and orange_replicate2_maryland.fastq.gz, then to create the two samples mango and orange, the value for --fq_filename_delim would be _ (underscore) and the value for --fq_filename_delim_idx would be 1, since you want to group by the first word (i.e. mango or orange) after splitting the file name on _ (underscore)."/>
+        <param name="fq_filename_delim_idx" type="integer" value="1" label="File name delimiter index (--fq_filename_delim_idx)" />
+    </inputs>
+    <outputs>
+        <data name="krona_chart" format="html" label="nowayout: Krona Chart on ${on_string}" from_work_dir="CPIPES_nowayout_krona.html"/>
+        <data name="multiqc_report" format="html" label="nowayout: MultiQC Report on ${on_string}" from_work_dir="multiqc_report.html"/>
+    </outputs>
+    <tests>
+        <!--Test 01: long reads-->
+        <test expect_num_outputs="2">
+            <param name="input">
+                <collection type="list">
+                    <element name="FAL11127.fastq.gz" value="FAL11127.fastq.gz" />
+                    <element name="FAL11341.fastq.gz" value="FAL11341.fastq.gz" />
+                    <element name="FAL11342.fastq.gz" value="FAL11342.fastq.gz" />
+                </collection>
+            </param>
+            <param name="fq_suffix" value=".fastq.gz"/>
+            <output name="multiqc_report" file="multiqc_report.html" ftype="html" compare="sim_size"/>
+            <!-- <output name="assembled_mags" file="FAL11127.assembly_filtered.contigs.fasta" ftype="fasta" compare="sim_size"/> -->
+        </test>
+    </tests>
+    <help><![CDATA[
+
+.. class:: infomark
+
+**Purpose**
+
+nowayout is a mitochondrial metagenomics classifier for Eukaryotes.
+It uses a custom kma database to identify mitochondrial reads and
+performs read classification, which is then reinforced by a further
+round of classification with sourmash.
+
+It is written in Nextflow and is part of the modular data analysis pipelines (CFSAN PIPELINES or CPIPES for short) at HFP.
+
+
+----
+
+.. class:: infomark
+
+**Databases**
+
+  - mitomine: A large database that works in almost all scenarios.
+  - cytox1: A collection of non-redundant COXI genes from NCBI.
+  - voucher: A collection of non-redundant voucher sequences from NCBI.
+  - ganoderma: A collection of non-redundant mtDNA sequences of Ganoderma fungi.
+  - listeria: A collection of organelle sequences and other rRNA genes for Listeria.
+
+
+----
+
+.. class:: infomark
+
+**Testing and Validation**
+
+The CPIPES - nowayout Nextflow pipeline has been wrapped to make it work in Galaxy.
+It takes either a paired or an unpaired list of short-read data sets as input and generates a MultiQC report
+containing relative abundances in the context of the number of mitochondrial reads identified. It also
+generates a Krona chart for each sample. The pipeline has been tested on multiple internal insect
+mixture samples. All of the original testing and validation was done on the command line on the
+HFP Reedling HPC Cluster.
+
+
+----
+
+.. class:: infomark
+
+**Please note**
+
+ ::
+
+  - nowayout only works on Illumina short reads (paired or unpaired).
+  - nowayout uses a custom kma database named mitomine.
+  - The custom database will be incrementally augmented and refined over time.
+  - mitomine stats:
+      Contains ~ 2.93M non-redundant mitochondrial and voucher sequences.
+      Represents ~ 717K unique species.
+  - Other databases are also available but will seldom be updated.
+
+----
+
+.. class:: infomark
+
+**Outputs**
+
+The main output file is:
+
+ ::
+
+  - MultiQC Report: Contains a brief summary report, including the number of mitochondrial reads identified
+                    per sample and relative abundances in the context of the total number of mitochondrial reads
+                    identified.
+                    Please note that due to MultiQC customizations, the preview (eye icon) will not
+                    work within Galaxy for the MultiQC report. Please download the file by clicking
+                    on the floppy icon and view it in your browser on your local desktop/workstation.
+                    You can export the tables and plots from the downloaded MultiQC report.
+
+    ]]></help>
+    <citations>
+        <citation type="bibtex">
+            @article{nowayout,
+            author = {Konganti, Kranti},
+            year = {2025},
+            month = {May},
+            title = {nowayout: An automated mitochondrial read classifier for Eukaryotes.},
+            journal = {Manuscript in preparation},
+            doi = {10.3389/xxxxxxxxxxxxxxxxxx},
+            url = {https://xxxxxxx/articles/10.3389/xxxxxxxxxxxx/full}}
+        </citation>
+    </citations>
+</tool>
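The --fq_filename_delim and --fq_filename_delim_idx help above describes how replicate FASTQ files are collapsed into samples for the MultiQC report. A short Groovy sketch of that grouping rule, using the mango/orange example file names from the help text (this is an illustration, not the pipeline's actual implementation), is shown below.

```groovy
// Hypothetical sketch of the sample grouping described by
// --fq_filename_delim / --fq_filename_delim_idx in the tool help above.
def delim = '_'
def idx   = 1
def fastqs = [
    'mango_replicate1.fastq.gz',
    'mango_replicate2.fastq.gz',
    'orange_replicate1_maryland.fastq.gz',
    'orange_replicate2_maryland.fastq.gz'
]

// Split each file name on the delimiter and keep the first `idx` field(s)
// as the sample name, so replicates collapse into one MultiQC sample.
def samples = fastqs.groupBy { it.tokenize(delim)[0..<idx].join(delim) }

assert samples.keySet() == ['mango', 'orange'] as Set
```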