--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/LICENSE.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,98 @@
+# CPIPES (CFSAN PIPELINES)
+
+## The modular pipeline repository at CFSAN, FDA
+
+**CPIPES** (CFSAN PIPELINES) is a collection of modular pipelines based on **NEXTFLOW**,
+mostly for bioinformatics data analysis at **CFSAN, FDA.**
+
+---
+
+### **LICENSES**
+
+\
+&nbsp;
+
+**CPIPES** is licensed under:
+
+```text
+MIT License
+
+In the U.S.A. Public Domain; elsewhere Copyright (c) 2022 U.S. Food and Drug Administration
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+```
+
+\
+&nbsp;
+
+Portions of **CPIPES** are built on modified versions of many tools, scripts and libraries from [nf-core/modules](https://github.com/nf-core/modules) and [nf-core/rnaseq](https://github.com/nf-core/rna-seq), which are originally licensed under:
+
+```text
+MIT License
+
+Copyright (c) Philip Ewels
+Copyright (c) Phil Ewels, Rickard Hammarén
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+```
+
+\
+&nbsp;
+
+In addition, the **MultiQC** report uses [DataTables](https://datatables.net), which is licensed under:
+
+```text
+MIT License
+
+Copyright (C) 2008-2022, SpryMedia Ltd.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+```
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,62 @@
+<p align="center">
+    <img src="./assets/nowayout-icon.png" width="20%" height="20%" />
+</p>
+
+---
+
+`nowayout` is a **super-fast** automated software pipeline for taxonomic classification of Eukaryotic mitochondrial reads. It uses a custom database to first identify mitochondrial reads and then performs read classification on those identified reads. This pipeline has been used specifically for detecting and identifying insects or insect fragments in foods, and as a verification tool for labeling claims when insects are used as foods. `nowayout` can also be used to detect any Eukaryotic DNA from shotgun metagenomic datasets. The pipeline is under active development, and research is currently under way to resolve ambiguous read assignments.
+
+`nowayout` currently works on **Illumina** short reads and will support **Oxford Nanopore** long reads in the future.
+
+It is written in **Nextflow** and is part of the modular data analysis pipelines at **HFP**.
+
+\
+&nbsp;
+
+## Workflows
+
+**CPIPES**:
+
+- `nowayout`       : [README](./readme/nowayout.md).
+
+\
+&nbsp;
+
+### Citing `nowayout`
+
+---
+A manuscript is in preparation. Until it is published, please cite our **GitHub** repo.
+
+>
+>**nowayout: an automated pipeline for taxonomic classification of Eukaryotic mitochondrial reads (<https://github.com/CFSAN-Biostatistics/nowayout>).**
+>
+>Kranti Konganti, Monica Pava-Ripoll, Amanda Windsor, Christopher Grim, Mark Mammel and Padmini Ramachandran
+>
+
+\
+&nbsp;
+
+### Future work
+
+---
+
+- Incorporation of custom algorithms and classification methods to deal with ambiguous read assignments.
+- Incorporation of methods to support processing of Oxford Nanopore reads.
+
+\
+&nbsp;
+
+### Caveats
+
+---
+
+- The main workflow has been used for **research purposes** only.
+- Analysis results should be interpreted with caution.
+
+\
+&nbsp;
+
+### Disclaimer
+
+---
+**HFP, FDA** assumes no responsibility whatsoever for use by other parties of the Software, its source code, documentation or compiled or uncompiled executables, and makes no guarantees, expressed or implied, about its quality, reliability, or any other characteristic. Further, **HFP, FDA** makes no representations that the use of the Software will not infringe any patent or proprietary rights of third parties. The use of this code in no way implies endorsement by the **HFP, FDA** or confers any advantage in regulatory decisions.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/assets/adaptors.fa	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,1194 @@
+>gnl|uv|NGB00360.1 Illumina PCR Primer
+AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB00362.1 Illumina Paired End PCR Primer 2.0
+CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
+>gnl|uv|NGB00363.1 Illumina Multiplexing PCR Primer 2.0
+GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00364.1 Illumina Multiplexing PCR Primer Index 1
+CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTC
+>gnl|uv|NGB00365.1 Illumina Multiplexing PCR Primer Index 2
+CAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTC
+>gnl|uv|NGB00366.1 Illumina Multiplexing PCR Primer Index 3
+CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTC
+>gnl|uv|NGB00367.1 Illumina Multiplexing PCR Primer Index 4
+CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTC
+>gnl|uv|NGB00368.1 Illumina Multiplexing PCR Primer Index 5
+CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTC
+>gnl|uv|NGB00369.1 Illumina Multiplexing PCR Primer Index 6
+CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTC
+>gnl|uv|NGB00370.1 Illumina Multiplexing PCR Primer Index 7
+CAAGCAGAAGACGGCATACGAGATGATCTGGTGACTGGAGTTC
+>gnl|uv|NGB00371.1 Illumina Multiplexing PCR Primer Index 8
+CAAGCAGAAGACGGCATACGAGATTCAAGTGTGACTGGAGTTC
+>gnl|uv|NGB00372.1 Illumina Multiplexing PCR Primer Index 9
+CAAGCAGAAGACGGCATACGAGATCTGATCGTGACTGGAGTTC
+>gnl|uv|NGB00373.1 Illumina Multiplexing PCR Primer Index 10
+CAAGCAGAAGACGGCATACGAGATAAGCTAGTGACTGGAGTTC
+>gnl|uv|NGB00374.1 Illumina Multiplexing PCR Primer Index 11
+CAAGCAGAAGACGGCATACGAGATGTAGCCGTGACTGGAGTTC
+>gnl|uv|NGB00375.1 Illumina Multiplexing PCR Primer Index 12
+CAAGCAGAAGACGGCATACGAGATTACAAGGTGACTGGAGTTC
+>gnl|uv|NGB00376.1 Illumina Gex PCR Primer 2
+AATGATACGGCGACCACCGACAGGTTCAGAGTTCTACAGTCCGA
+>gnl|uv|NGB00377.1 Illumina DpnII Gex Sequencing Primer
+CGACAGGTTCAGAGTTCTACAGTCCGACGATC
+>gnl|uv|NGB00378.1 Illumina NlaIII Gex Sequencing Primer
+CCGACAGGTTCAGAGTTCTACAGTCCGACATG
+>gnl|uv|NGB00379.1 Illumina 3' RNA Adapter
+TCGTATGCCGTCTTCTGCTTGTT
+>gnl|uv|NGB00380.1 Illumina Small RNA 3' Adapter
+AATCTCGTATGCCGTCTTCTGCTTGC
+>gnl|uv|NGB00385.1 454 FLX linker
+GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC
+>gnl|uv|NGB00414.1 454 Life Sciences GS FLX Titanium Primer A-key
+CGTATCGCCTCCCTCGCGCCATCAG
+>gnl|uv|NGB00415.1 454 Life Sciences GS FLX Titanium Primer B-key
+CTATGCGCCTTGCCAGCCCGCTCAG
+>gnl|uv|NGB00416.1 454 Life Sciences GS FLX Titanium MID Adaptor B
+CCTATCCCCTGTGTGCCTTGGCAGTCTCAG
+>gnl|uv|NGB00417.1 454 Life Sciences GS FLX Titanium MID-1 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACGAGTGCGT
+>gnl|uv|NGB00418.1 454 Life Sciences GS FLX Titanium MID-2 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACGCTCGACA
+>gnl|uv|NGB00419.1 454 Life Sciences GS FLX Titanium MID-3 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGACGCACTC
+>gnl|uv|NGB00420.1 454 Life Sciences GS FLX Titanium MID-4 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCACTGTAG
+>gnl|uv|NGB00421.1 454 Life Sciences GS FLX Titanium MID-5 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATCAGACACG
+>gnl|uv|NGB00422.1 454 Life Sciences GS FLX Titanium MID-6 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATATCGCGAG
+>gnl|uv|NGB00423.1 454 Life Sciences GS FLX Titanium MID-7 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTGTCTCTA
+>gnl|uv|NGB00424.1 454 Life Sciences GS FLX Titanium MID-8 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTCGCGTGTC
+>gnl|uv|NGB00425.1 454 Life Sciences GS FLX Titanium MID-10 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTCTATGCG
+>gnl|uv|NGB00426.1 454 Life Sciences GS FLX Titanium MID-11 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGATACGTCT
+>gnl|uv|NGB00427.1 454 Life Sciences GS FLX Titanium MID-13 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCATAGTAGTG
+>gnl|uv|NGB00428.1 454 Life Sciences GS FLX Titanium MID-14 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAGAGATAC
+>gnl|uv|NGB00429.1 454 Life Sciences GS FLX Titanium MID-15 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATACGACGTA
+>gnl|uv|NGB00430.1 454 Life Sciences GS FLX Titanium MID-16 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCACGTACTA
+>gnl|uv|NGB00431.1 454 Life Sciences GS FLX Titanium MID-17 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTCTAGTAC
+>gnl|uv|NGB00432.1 454 Life Sciences GS FLX Titanium MID-18 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTACGTAGC
+>gnl|uv|NGB00433.1 454 Life Sciences GS FLX Titanium MID-19 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTACTACTC
+>gnl|uv|NGB00434.1 454 Life Sciences GS FLX Titanium MID-20 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACGACTACAG
+>gnl|uv|NGB00435.1 454 Life Sciences GS FLX Titanium MID-21 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTAGACTAG
+>gnl|uv|NGB00436.1 454 Life Sciences GS FLX Titanium MID-22 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTACGAGTATG
+>gnl|uv|NGB00437.1 454 Life Sciences GS FLX Titanium MID-23 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTACTCTCGTG
+>gnl|uv|NGB00438.1 454 Life Sciences GS FLX Titanium MID-24 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGAGACGAG
+>gnl|uv|NGB00439.1 454 Life Sciences GS FLX Titanium MID-25 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGTCGCTCG
+>gnl|uv|NGB00440.1 454 Life Sciences GS FLX Titanium MID-26 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACATACGCGT
+>gnl|uv|NGB00441.1 454 Life Sciences GS FLX Titanium MID-27 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACGCGAGTAT
+>gnl|uv|NGB00442.1 454 Life Sciences GS FLX Titanium MID-28 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACTACTATGT
+>gnl|uv|NGB00443.1 454 Life Sciences GS FLX Titanium MID-29 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACTGTACAGT
+>gnl|uv|NGB00444.1 454 Life Sciences GS FLX Titanium MID-30 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGACTATACT
+>gnl|uv|NGB00445.1 454 Life Sciences GS FLX Titanium MID-31 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCGTCGTCT
+>gnl|uv|NGB00446.1 454 Life Sciences GS FLX Titanium MID-32 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTACGCTAT
+>gnl|uv|NGB00447.1 454 Life Sciences GS FLX Titanium MID-33 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATAGAGTACT
+>gnl|uv|NGB00448.1 454 Life Sciences GS FLX Titanium MID-34 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCACGCTACGT
+>gnl|uv|NGB00449.1 454 Life Sciences GS FLX Titanium MID-35 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGTAGACGT
+>gnl|uv|NGB00450.1 454 Life Sciences GS FLX Titanium MID-36 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGACGTGACT
+>gnl|uv|NGB00451.1 454 Life Sciences GS FLX Titanium MID-37 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTACACACACT
+>gnl|uv|NGB00452.1 454 Life Sciences GS FLX Titanium MID-38 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTACACGTGAT
+>gnl|uv|NGB00453.1 454 Life Sciences GS FLX Titanium MID-39 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTACAGATCGT
+>gnl|uv|NGB00454.1 454 Life Sciences GS FLX Titanium MID-40 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTACGCTGTCT
+>gnl|uv|NGB00455.1 454 Life Sciences GS FLX Titanium MID-41 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGTGTAGAT
+>gnl|uv|NGB00456.1 454 Life Sciences GS FLX Titanium MID-42 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGATCACGT
+>gnl|uv|NGB00457.1 454 Life Sciences GS FLX Titanium MID-43 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGCACTAGT
+>gnl|uv|NGB00458.1 454 Life Sciences GS FLX Titanium MID-44 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTAGCGACT
+>gnl|uv|NGB00459.1 454 Life Sciences GS FLX Titanium MID-45 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTATACTAT
+>gnl|uv|NGB00460.1 454 Life Sciences GS FLX Titanium MID-46 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGACGTATGT
+>gnl|uv|NGB00461.1 454 Life Sciences GS FLX Titanium MID-47 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTGAGTAGT
+>gnl|uv|NGB00462.1 454 Life Sciences GS FLX Titanium MID-48 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACAGTATATA
+>gnl|uv|NGB00463.1 454 Life Sciences GS FLX Titanium MID-49 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACGCGATCGA
+>gnl|uv|NGB00464.1 454 Life Sciences GS FLX Titanium MID-50 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACTAGCAGTA
+>gnl|uv|NGB00465.1 454 Life Sciences GS FLX Titanium MID-51 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCTCACGTA
+>gnl|uv|NGB00466.1 454 Life Sciences GS FLX Titanium MID-52 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTATACATA
+>gnl|uv|NGB00467.1 454 Life Sciences GS FLX Titanium MID-53 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTCGAGAGA
+>gnl|uv|NGB00468.1 454 Life Sciences GS FLX Titanium MID-54 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTGCTACGA
+>gnl|uv|NGB00469.1 454 Life Sciences GS FLX Titanium MID-55 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGATCGTATA
+>gnl|uv|NGB00470.1 454 Life Sciences GS FLX Titanium MID-56 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGCAGTACGA
+>gnl|uv|NGB00471.1 454 Life Sciences GS FLX Titanium MID-57 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGCGTATACA
+>gnl|uv|NGB00472.1 454 Life Sciences GS FLX Titanium MID-58 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTACAGTCA
+>gnl|uv|NGB00473.1 454 Life Sciences GS FLX Titanium MID-59 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTACTCAGA
+>gnl|uv|NGB00474.1 454 Life Sciences GS FLX Titanium MID-60 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTACGCTCTA
+>gnl|uv|NGB00475.1 454 Life Sciences GS FLX Titanium MID-61 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTATAGCGTA
+>gnl|uv|NGB00476.1 454 Life Sciences GS FLX Titanium MID-62 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTACGTCATCA
+>gnl|uv|NGB00477.1 454 Life Sciences GS FLX Titanium MID-63 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGTCGCATA
+>gnl|uv|NGB00478.1 454 Life Sciences GS FLX Titanium MID-64 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTATATATACA
+>gnl|uv|NGB00479.1 454 Life Sciences GS FLX Titanium MID-65 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTATGCTAGTA
+>gnl|uv|NGB00480.1 454 Life Sciences GS FLX Titanium MID-66 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCACGCGAGA
+>gnl|uv|NGB00481.1 454 Life Sciences GS FLX Titanium MID-67 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGATAGTGA
+>gnl|uv|NGB00482.1 454 Life Sciences GS FLX Titanium MID-68 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGCTGCGTA
+>gnl|uv|NGB00483.1 454 Life Sciences GS FLX Titanium MID-69 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGACGTCA
+>gnl|uv|NGB00484.1 454 Life Sciences GS FLX Titanium MID-70 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGAGTCAGTA
+>gnl|uv|NGB00485.1 454 Life Sciences GS FLX Titanium MID-71 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTAGTGTGA
+>gnl|uv|NGB00486.1 454 Life Sciences GS FLX Titanium MID-72 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTCACACGA
+>gnl|uv|NGB00487.1 454 Life Sciences GS FLX Titanium MID-73 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTCGTCGCA
+>gnl|uv|NGB00488.1 454 Life Sciences GS FLX Titanium MID-74 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACACATACGC
+>gnl|uv|NGB00489.1 454 Life Sciences GS FLX Titanium MID-75 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACAGTCGTGC
+>gnl|uv|NGB00490.1 454 Life Sciences GS FLX Titanium MID-76 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACATGACGAC
+>gnl|uv|NGB00491.1 454 Life Sciences GS FLX Titanium MID-77 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACGACAGCTC
+>gnl|uv|NGB00492.1 454 Life Sciences GS FLX Titanium MID-78 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACGTCTCATC
+>gnl|uv|NGB00493.1 454 Life Sciences GS FLX Titanium MID-79 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACTCATCTAC
+>gnl|uv|NGB00494.1 454 Life Sciences GS FLX Titanium MID-80 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACTCGCGCAC
+>gnl|uv|NGB00495.1 454 Life Sciences GS FLX Titanium MID-81 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGAGCGTCAC
+>gnl|uv|NGB00496.1 454 Life Sciences GS FLX Titanium MID-82 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCGACTAGC
+>gnl|uv|NGB00497.1 454 Life Sciences GS FLX Titanium MID-83 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTAGTGATC
+>gnl|uv|NGB00498.1 454 Life Sciences GS FLX Titanium MID-84 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTGACACAC
+>gnl|uv|NGB00499.1 454 Life Sciences GS FLX Titanium MID-85 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTGTATGTC
+>gnl|uv|NGB00500.1 454 Life Sciences GS FLX Titanium MID-86 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATAGATAGAC
+>gnl|uv|NGB00501.1 454 Life Sciences GS FLX Titanium MID-87 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATATAGTCGC
+>gnl|uv|NGB00502.1 454 Life Sciences GS FLX Titanium MID-88 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATCTACTGAC
+>gnl|uv|NGB00503.1 454 Life Sciences GS FLX Titanium MID-89 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCACGTAGATC
+>gnl|uv|NGB00504.1 454 Life Sciences GS FLX Titanium MID-90 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCACGTGTCGC
+>gnl|uv|NGB00505.1 454 Life Sciences GS FLX Titanium MID-91 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCATACTCTAC
+>gnl|uv|NGB00506.1 454 Life Sciences GS FLX Titanium MID-92 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGACACTATC
+>gnl|uv|NGB00507.1 454 Life Sciences GS FLX Titanium MID-93 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAGACGCGC
+>gnl|uv|NGB00508.1 454 Life Sciences GS FLX Titanium MID-94 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTATGCGAC
+>gnl|uv|NGB00509.1 454 Life Sciences GS FLX Titanium MID-95 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTCGATCTC
+>gnl|uv|NGB00510.1 454 Life Sciences GS FLX Titanium MID-96 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTACGACTGC
+>gnl|uv|NGB00511.1 454 Life Sciences GS FLX Titanium MID-97 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAGTCACTC
+>gnl|uv|NGB00512.1 454 Life Sciences GS FLX Titanium MID-98 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTCTACGCTC
+>gnl|uv|NGB00513.1 454 Life Sciences GS FLX Titanium MID-99 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGTACATAC
+>gnl|uv|NGB00514.1 454 Life Sciences GS FLX Titanium MID-100 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGACTGCAC
+>gnl|uv|NGB00515.1 454 Life Sciences GS FLX Titanium MID-101 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGCGCGCGC
+>gnl|uv|NGB00516.1 454 Life Sciences GS FLX Titanium MID-102 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGCTCTATC
+>gnl|uv|NGB00517.1 454 Life Sciences GS FLX Titanium MID-103 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTATAGACATC
+>gnl|uv|NGB00518.1 454 Life Sciences GS FLX Titanium MID-104 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTATGATACGC
+>gnl|uv|NGB00519.1 454 Life Sciences GS FLX Titanium MID-105 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCACTCATAC
+>gnl|uv|NGB00520.1 454 Life Sciences GS FLX Titanium MID-106 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCATCGAGTC
+>gnl|uv|NGB00521.1 454 Life Sciences GS FLX Titanium MID-107 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGAGCTCTC
+>gnl|uv|NGB00522.1 454 Life Sciences GS FLX Titanium MID-108 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGCAGACAC
+>gnl|uv|NGB00523.1 454 Life Sciences GS FLX Titanium MID-109 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGTCTCGC
+>gnl|uv|NGB00524.1 454 Life Sciences GS FLX Titanium MID-110 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGAGTGACGC
+>gnl|uv|NGB00525.1 454 Life Sciences GS FLX Titanium MID-111 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGATGTGTAC
+>gnl|uv|NGB00526.1 454 Life Sciences GS FLX Titanium MID-112 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGCTATAGAC
+>gnl|uv|NGB00527.1 454 Life Sciences GS FLX Titanium MID-113 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGCTCGCTAC
+>gnl|uv|NGB00528.1 454 Life Sciences GS FLX Titanium MID-114 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACGTGCAGCG
+>gnl|uv|NGB00529.1 454 Life Sciences GS FLX Titanium MID-115 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGACTCACAGAG
+>gnl|uv|NGB00530.1 454 Life Sciences GS FLX Titanium MID-116 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGACTCAGCG
+>gnl|uv|NGB00531.1 454 Life Sciences GS FLX Titanium MID-117 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGAGAGTGTG
+>gnl|uv|NGB00532.1 454 Life Sciences GS FLX Titanium MID-118 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCTATCGCG
+>gnl|uv|NGB00533.1 454 Life Sciences GS FLX Titanium MID-119 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTCTGACTG
+>gnl|uv|NGB00534.1 454 Life Sciences GS FLX Titanium MID-120 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTGAGCTCG
+>gnl|uv|NGB00535.1 454 Life Sciences GS FLX Titanium MID-121 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATAGCTCTCG
+>gnl|uv|NGB00536.1 454 Life Sciences GS FLX Titanium MID-122 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATCACGTGCG
+>gnl|uv|NGB00537.1 454 Life Sciences GS FLX Titanium MID-123 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATCGTAGCAG
+>gnl|uv|NGB00538.1 454 Life Sciences GS FLX Titanium MID-124 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATCGTCTGTG
+>gnl|uv|NGB00539.1 454 Life Sciences GS FLX Titanium MID-125 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATGTACGATG
+>gnl|uv|NGB00540.1 454 Life Sciences GS FLX Titanium MID-126 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATGTGTCTAG
+>gnl|uv|NGB00541.1 454 Life Sciences GS FLX Titanium MID-127 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCACACGATAG
+>gnl|uv|NGB00542.1 454 Life Sciences GS FLX Titanium MID-128 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCACTCGCACG
+>gnl|uv|NGB00543.1 454 Life Sciences GS FLX Titanium MID-129 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGACGTCTG
+>gnl|uv|NGB00544.1 454 Life Sciences GS FLX Titanium MID-130 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGTACTGCG
+>gnl|uv|NGB00545.1 454 Life Sciences GS FLX Titanium MID-131 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGACAGCGAG
+>gnl|uv|NGB00546.1 454 Life Sciences GS FLX Titanium MID-132 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGATCTGTCG
+>gnl|uv|NGB00547.1 454 Life Sciences GS FLX Titanium MID-133 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGCGTGCTAG
+>gnl|uv|NGB00548.1 454 Life Sciences GS FLX Titanium MID-134 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGCTCGAGTG
+>gnl|uv|NGB00549.1 454 Life Sciences GS FLX Titanium MID-135 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTGATGACG
+>gnl|uv|NGB00550.1 454 Life Sciences GS FLX Titanium MID-136 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTATGTACAG
+>gnl|uv|NGB00551.1 454 Life Sciences GS FLX Titanium MID-137 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTCGATATAG
+>gnl|uv|NGB00552.1 454 Life Sciences GS FLX Titanium MID-138 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTCGCACGCG
+>gnl|uv|NGB00553.1 454 Life Sciences GS FLX Titanium MID-139 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGCGTCACG
+>gnl|uv|NGB00554.1 454 Life Sciences GS FLX Titanium MID-140 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGTGCGTCG
+>gnl|uv|NGB00555.1 454 Life Sciences GS FLX Titanium MID-141 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGCATACTG
+>gnl|uv|NGB00556.1 454 Life Sciences GS FLX Titanium MID-142 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTATACATGTG
+>gnl|uv|NGB00557.1 454 Life Sciences GS FLX Titanium MID-143 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTATCACTCAG
+>gnl|uv|NGB00558.1 454 Life Sciences GS FLX Titanium MID-144 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTATCTGATAG
+>gnl|uv|NGB00559.1 454 Life Sciences GS FLX Titanium MID-145 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGTGACATG
+>gnl|uv|NGB00560.1 454 Life Sciences GS FLX Titanium MID-146 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGATCGAG
+>gnl|uv|NGB00561.1 454 Life Sciences GS FLX Titanium MID-147 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGACATCTCG
+>gnl|uv|NGB00562.1 454 Life Sciences GS FLX Titanium MID-148 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGAGCTAGAG
+>gnl|uv|NGB00563.1 454 Life Sciences GS FLX Titanium MID-149 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGATAGAGCG
+>gnl|uv|NGB00564.1 454 Life Sciences GS FLX Titanium MID-150 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGCGTGTGCG
+>gnl|uv|NGB00565.1 454 Life Sciences GS FLX Titanium MID-151 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGCTAGTCAG
+>gnl|uv|NGB00566.1 454 Life Sciences GS FLX Titanium MID-152 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTATCACAG
+>gnl|uv|NGB00567.1 454 Life Sciences GS FLX Titanium MID-153 Adaptor A
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTGCGCGTG
+>gnl|uv|NGB00568.1 454 GS FLX Titanium Rapid Library Adaptor A universal segment
+CCATCTCATCCCTGCGTGTCTCCGACGACT
+>gnl|uv|NGB00569.1 454 GS FLX Titanium Rapid Library Adaptor B universal segment
+NGTCGNCGTCTCTCAAGGCACACAGGGGATAGG
+>gnl|uv|NGB00099.1 CLONTECH GenomeWalker Adaptor
+GTAATACGACTCACTATAGGGCACGCGTGGTCGACGGCCCGGGCTGGT
+>gnl|uv|NGB00361.2 Illumina PCR Primer (Oligonucleotide sequence copyright 2007-2009 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT
+>gnl|uv|NGB00623.1 ABI SOLiD P1 Adaptor
+AACCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT
+>gnl|uv|NGB00624.1 ABI SOLiD P2 Adaptor
+AGAGAATGAGGAACCCGGGGCAGTT
+>gnl|uv|NGB00625.1 ABI SOLiD P2-T Adaptor
+AGAGAATGAGGAACCCGGGGCAGCC
+>gnl|uv|NGB00626.1 ABI SOLiD Internal Adaptor
+CTGCTGTACCGTACATCCGCCTTGGCCGTACAGCAG
+>gnl|uv|NGB00627.1 ABI SOLiD P1-T Adaptor
+GGCCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT
+>gnl|uv|NGB00628.1 ABI SOLiD Barcode Adaptor T-001
+CTGCCCCGGGTTCCTCATTCTCTGTGTAAGAGGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00629.1 ABI SOLiD Barcode Adaptor T-002
+CTGCCCCGGGTTCCTCATTCTCTAGGGAGTGGTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00630.1 ABI SOLiD Barcode Adaptor T-003
+CTGCCCCGGGTTCCTCATTCTCTATAGGTTATACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00631.1 ABI SOLiD Barcode Adaptor T-004
+CTGCCCCGGGTTCCTCATTCTCTGGATGCGGTCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00632.1 ABI SOLiD Barcode Adaptor T-005
+CTGCCCCGGGTTCCTCATTCTCTGTGGTGTAAGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00633.1 ABI SOLiD Barcode Adaptor T-006
+CTGCCCCGGGTTCCTCATTCTCTGCGAGGGACACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00634.1 ABI SOLiD Barcode Adaptor T-007
+CTGCCCCGGGTTCCTCATTCTCTGGGTTATGCCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00635.1 ABI SOLiD Barcode Adaptor T-008
+CTGCCCCGGGTTCCTCATTCTCTGAGCGAGGATCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00636.1 ABI SOLiD Barcode Adaptor T-009
+CTGCCCCGGGTTCCTCATTCTCTAGGTTGCGACCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00637.1 ABI SOLiD Barcode Adaptor T-010
+CTGCCCCGGGTTCCTCATTCTCTGCGGTAAGCTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00638.1 ABI SOLiD Barcode Adaptor T-011
+CTGCCCCGGGTTCCTCATTCTCTGTGCGACACGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00639.1 ABI SOLiD Barcode Adaptor T-012
+CTGCCCCGGGTTCCTCATTCTCTAAGAGGAAAACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00640.1 ABI SOLiD Barcode Adaptor T-013
+CTGCCCCGGGTTCCTCATTCTCTGCGGTAAGGCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00641.1 ABI SOLiD Barcode Adaptor T-014
+CTGCCCCGGGTTCCTCATTCTCTGTGCGGCAGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00642.1 ABI SOLiD Barcode Adaptor T-015
+CTGCCCCGGGTTCCTCATTCTCTGAGTTGAATGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00643.1 ABI SOLiD Barcode Adaptor T-016
+CTGCCCCGGGTTCCTCATTCTCTGGGAGACGTTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00644.1 ABI SOLiD Barcode Adaptor T-017
+CTGCCCCGGGTTCCTCATTCTCTGGCTCACCGCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00645.1 ABI SOLiD Barcode Adaptor T-018
+CTGCCCCGGGTTCCTCATTCTCTAGGCGGATGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00646.1 ABI SOLiD Barcode Adaptor T-019
+CTGCCCCGGGTTCCTCATTCTCTATGGTAACTGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00647.1 ABI SOLiD Barcode Adaptor T-020
+CTGCCCCGGGTTCCTCATTCTCTGTCAAGCTTTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00648.1 ABI SOLiD Barcode Adaptor T-021
+CTGCCCCGGGTTCCTCATTCTCTGTGCGGTTCCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00649.1 ABI SOLiD Barcode Adaptor T-022
+CTGCCCCGGGTTCCTCATTCTCTGAGAAGATGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00650.1 ABI SOLiD Barcode Adaptor T-023
+CTGCCCCGGGTTCCTCATTCTCTGCGGTGCTTGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00651.1 ABI SOLiD Barcode Adaptor T-024
+CTGCCCCGGGTTCCTCATTCTCTGGGTCGGTATCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00652.1 ABI SOLiD Barcode Adaptor T-025
+CTGCCCCGGGTTCCTCATTCTCTAACATGATGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00653.1 ABI SOLiD Barcode Adaptor T-026
+CTGCCCCGGGTTCCTCATTCTCTCGGGAGCCCGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00654.1 ABI SOLiD Barcode Adaptor T-027
+CTGCCCCGGGTTCCTCATTCTCTCAGCAAACTTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00655.1 ABI SOLiD Barcode Adaptor T-028
+CTGCCCCGGGTTCCTCATTCTCTAGCTTACTACCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00656.1 ABI SOLiD Barcode Adaptor T-029
+CTGCCCCGGGTTCCTCATTCTCTGAATCTAGGGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00657.1 ABI SOLiD Barcode Adaptor T-030
+CTGCCCCGGGTTCCTCATTCTCTGTAGCGAAGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00658.1 ABI SOLiD Barcode Adaptor T-031
+CTGCCCCGGGTTCCTCATTCTCTGCTGGTGCGTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00659.1 ABI SOLiD Barcode Adaptor T-032
+CTGCCCCGGGTTCCTCATTCTCTGGTTGGGTGCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00660.1 ABI SOLiD Barcode Adaptor T-033
+CTGCCCCGGGTTCCTCATTCTCTCGTTGGATACCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00661.1 ABI SOLiD Barcode Adaptor T-034
+CTGCCCCGGGTTCCTCATTCTCTTCGTTAAAGGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00662.1 ABI SOLiD Barcode Adaptor T-035
+CTGCCCCGGGTTCCTCATTCTCTAAGCGTAGGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00663.1 ABI SOLiD Barcode Adaptor T-036
+CTGCCCCGGGTTCCTCATTCTCTGTTCTCACATCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00664.1 ABI SOLiD Barcode Adaptor T-037
+CTGCCCCGGGTTCCTCATTCTCTCTGTTATACCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00665.1 ABI SOLiD Barcode Adaptor T-038
+CTGCCCCGGGTTCCTCATTCTCTGTCGTCTTAGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00666.1 ABI SOLiD Barcode Adaptor T-039
+CTGCCCCGGGTTCCTCATTCTCTTATCGTGAGTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00667.1 ABI SOLiD Barcode Adaptor T-040
+CTGCCCCGGGTTCCTCATTCTCTAAAAGGGTTACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00668.1 ABI SOLiD Barcode Adaptor T-041
+CTGCCCCGGGTTCCTCATTCTCTTGTGGGATTGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00669.1 ABI SOLiD Barcode Adaptor T-042
+CTGCCCCGGGTTCCTCATTCTCTGAATGTACTACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00670.1 ABI SOLiD Barcode Adaptor T-043
+CTGCCCCGGGTTCCTCATTCTCTCGCTAGGGTTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00671.1 ABI SOLiD Barcode Adaptor T-044
+CTGCCCCGGGTTCCTCATTCTCTAAGGATGATCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00672.1 ABI SOLiD Barcode Adaptor T-045
+CTGCCCCGGGTTCCTCATTCTCTGTACTTGGCTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00673.1 ABI SOLiD Barcode Adaptor T-046
+CTGCCCCGGGTTCCTCATTCTCTGGTCGTCGAACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00674.1 ABI SOLiD Barcode Adaptor T-047
+CTGCCCCGGGTTCCTCATTCTCTGAGGGATGGCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00675.1 ABI SOLiD Barcode Adaptor T-048
+CTGCCCCGGGTTCCTCATTCTCTGCCGTAAGTGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00676.1 ABI SOLiD Barcode Adaptor T-049
+CTGCCCCGGGTTCCTCATTCTCTATGTCATAAGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00677.1 ABI SOLiD Barcode Adaptor T-050
+CTGCCCCGGGTTCCTCATTCTCTGAAGGCTTGCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00678.1 ABI SOLiD Barcode Adaptor T-051
+CTGCCCCGGGTTCCTCATTCTCTAAGCAGGAGTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00679.1 ABI SOLiD Barcode Adaptor T-052
+CTGCCCCGGGTTCCTCATTCTCTGTAATTGTAACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00680.1 ABI SOLiD Barcode Adaptor T-053
+CTGCCCCGGGTTCCTCATTCTCTGTCATCAAGTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00681.1 ABI SOLiD Barcode Adaptor T-054
+CTGCCCCGGGTTCCTCATTCTCTAAAAGGCGGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00682.1 ABI SOLiD Barcode Adaptor T-055
+CTGCCCCGGGTTCCTCATTCTCTAGCTTAAGCGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00683.1 ABI SOLiD Barcode Adaptor T-056
+CTGCCCCGGGTTCCTCATTCTCTGCATGTCACCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00684.1 ABI SOLiD Barcode Adaptor T-057
+CTGCCCCGGGTTCCTCATTCTCTCTAGTAAGAACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00685.1 ABI SOLiD Barcode Adaptor T-058
+CTGCCCCGGGTTCCTCATTCTCTTAAAGTGGCGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00686.1 ABI SOLiD Barcode Adaptor T-059
+CTGCCCCGGGTTCCTCATTCTCTAAGTAATGTCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00687.1 ABI SOLiD Barcode Adaptor T-060
+CTGCCCCGGGTTCCTCATTCTCTGTGCCTCGGTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00688.1 ABI SOLiD Barcode Adaptor T-061
+CTGCCCCGGGTTCCTCATTCTCTAAGATTATCGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00689.1 ABI SOLiD Barcode Adaptor T-062
+CTGCCCCGGGTTCCTCATTCTCTAGGTGAGGGTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00690.1 ABI SOLiD Barcode Adaptor T-063
+CTGCCCCGGGTTCCTCATTCTCTGCGGGTTCGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00691.1 ABI SOLiD Barcode Adaptor T-064
+CTGCCCCGGGTTCCTCATTCTCTGTGCTACACCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00692.1 ABI SOLiD Barcode Adaptor T-065
+CTGCCCCGGGTTCCTCATTCTCTGGGATCAAGCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00693.1 ABI SOLiD Barcode Adaptor T-066
+CTGCCCCGGGTTCCTCATTCTCTGATGTAATGTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00694.1 ABI SOLiD Barcode Adaptor T-067
+CTGCCCCGGGTTCCTCATTCTCTGTCCTTAGGGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00695.1 ABI SOLiD Barcode Adaptor T-068
+CTGCCCCGGGTTCCTCATTCTCTGCATTGACGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00696.1 ABI SOLiD Barcode Adaptor T-069
+CTGCCCCGGGTTCCTCATTCTCTGATATGCTTTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00697.1 ABI SOLiD Barcode Adaptor T-070
+CTGCCCCGGGTTCCTCATTCTCTGCCCTACAGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00698.1 ABI SOLiD Barcode Adaptor T-071
+CTGCCCCGGGTTCCTCATTCTCTACAGGGAACGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00699.1 ABI SOLiD Barcode Adaptor T-072
+CTGCCCCGGGTTCCTCATTCTCTAAGTGAATACCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00700.1 ABI SOLiD Barcode Adaptor T-073
+CTGCCCCGGGTTCCTCATTCTCTGCAATGACGTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00701.1 ABI SOLiD Barcode Adaptor T-074
+CTGCCCCGGGTTCCTCATTCTCTAGGACGCTGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00702.1 ABI SOLiD Barcode Adaptor T-075
+CTGCCCCGGGTTCCTCATTCTCTGTATCTGGGCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00703.1 ABI SOLiD Barcode Adaptor T-076
+CTGCCCCGGGTTCCTCATTCTCTAAGTTTTAGGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00704.1 ABI SOLiD Barcode Adaptor T-077
+CTGCCCCGGGTTCCTCATTCTCTATCTGGTCTTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00705.1 ABI SOLiD Barcode Adaptor T-078
+CTGCCCCGGGTTCCTCATTCTCTGGCAATCATCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00706.1 ABI SOLiD Barcode Adaptor T-079
+CTGCCCCGGGTTCCTCATTCTCTAGTAGAATTACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00707.1 ABI SOLiD Barcode Adaptor T-080
+CTGCCCCGGGTTCCTCATTCTCTGTTTACGGTGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00708.1 ABI SOLiD Barcode Adaptor T-081
+CTGCCCCGGGTTCCTCATTCTCTGAACGTCATTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00709.1 ABI SOLiD Barcode Adaptor T-082
+CTGCCCCGGGTTCCTCATTCTCTGTGAAGGGAGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00710.1 ABI SOLiD Barcode Adaptor T-083
+CTGCCCCGGGTTCCTCATTCTCTGGATGGCGTACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00711.1 ABI SOLiD Barcode Adaptor T-084
+CTGCCCCGGGTTCCTCATTCTCTGCGGATGAACCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00712.1 ABI SOLiD Barcode Adaptor T-085
+CTGCCCCGGGTTCCTCATTCTCTGGAAAGCGTTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00713.1 ABI SOLiD Barcode Adaptor T-086
+CTGCCCCGGGTTCCTCATTCTCTAGTACCAGGACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00714.1 ABI SOLiD Barcode Adaptor T-087
+CTGCCCCGGGTTCCTCATTCTCTATAGCAAAGCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00715.1 ABI SOLiD Barcode Adaptor T-088
+CTGCCCCGGGTTCCTCATTCTCTGTTGATCATGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00716.1 ABI SOLiD Barcode Adaptor T-089
+CTGCCCCGGGTTCCTCATTCTCTAGGCTGTCTACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00717.1 ABI SOLiD Barcode Adaptor T-090
+CTGCCCCGGGTTCCTCATTCTCTGTGACCTACTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00718.1 ABI SOLiD Barcode Adaptor T-091
+CTGCCCCGGGTTCCTCATTCTCTGCGTATTGGGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00719.1 ABI SOLiD Barcode Adaptor T-092
+CTGCCCCGGGTTCCTCATTCTCTAAGGGATTACCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00720.1 ABI SOLiD Barcode Adaptor T-093
+CTGCCCCGGGTTCCTCATTCTCTGTTACGATGCCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00721.1 ABI SOLiD Barcode Adaptor T-094
+CTGCCCCGGGTTCCTCATTCTCTATGGGTGTTTCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00722.1 ABI SOLiD Barcode Adaptor T-095
+CTGCCCCGGGTTCCTCATTCTCTGAGTCCGGCACTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00723.1 ABI SOLiD Barcode Adaptor T-096
+CTGCCCCGGGTTCCTCATTCTCTAATCGAAGAGCTGCTGTACGGCCAAGGCGT
+>gnl|uv|NGB00724.1 ABI SOLiD Barcode Adaptor A
+GCTGTACGGCCAAGGCGCAGCAGCATG
+>gnl|uv|NGB00727.1 Illumina Nextera PCR primer i5 index N501 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACTAGATCGCTCGTCGGCAGCGTC
+>gnl|uv|NGB00728.1 Illumina Nextera PCR primer i5 index N502 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACCTCTCTATTCGTCGGCAGCGTC
+>gnl|uv|NGB00729.1 Illumina Nextera PCR primer i5 index N503 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACTATCCTCTTCGTCGGCAGCGTC
+>gnl|uv|NGB00730.1 Illumina Nextera PCR primer i5 index N504 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACAGAGTAGATCGTCGGCAGCGTC
+>gnl|uv|NGB00731.1 Illumina Nextera PCR primer i5 index N505 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACGTAAGGAGTCGTCGGCAGCGTC
+>gnl|uv|NGB00732.1 Illumina Nextera PCR primer i5 index N506 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACACTGCATATCGTCGGCAGCGTC
+>gnl|uv|NGB00733.1 Illumina Nextera PCR primer i5 index N507 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACAAGGAGTATCGTCGGCAGCGTC
+>gnl|uv|NGB00734.1 Illumina Nextera PCR primer i5 index N508 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACCTAAGCCTTCGTCGGCAGCGTC
+>gnl|uv|NGB00735.1 Illumina Nextera PCR primer i7 index N701 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTCTCGTGGGCTCGG
+>gnl|uv|NGB00736.1 Illumina Nextera PCR primer i7 index N702 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCTAGTACGGTCTCGTGGGCTCGG
+>gnl|uv|NGB00737.1 Illumina Nextera PCR primer i7 index N703 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTTCTGCCTGTCTCGTGGGCTCGG
+>gnl|uv|NGB00738.1 Illumina Nextera PCR primer i7 index N704 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGCTCAGGAGTCTCGTGGGCTCGG
+>gnl|uv|NGB00739.1 Illumina Nextera PCR primer i7 index N705 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATAGGAGTCCGTCTCGTGGGCTCGG
+>gnl|uv|NGB00740.1 Illumina Nextera PCR primer i7 index N706 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCATGCCTAGTCTCGTGGGCTCGG
+>gnl|uv|NGB00741.1 Illumina Nextera PCR primer i7 index N707 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGTAGAGAGGTCTCGTGGGCTCGG
+>gnl|uv|NGB00742.1 Illumina Nextera PCR primer i7 index N708 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCCTCTCTGGTCTCGTGGGCTCGG
+>gnl|uv|NGB00743.1 Illumina Nextera PCR primer i7 index N709 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATAGCGTAGCGTCTCGTGGGCTCGG
+>gnl|uv|NGB00744.1 Illumina Nextera PCR primer i7 index N710 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCAGCCTCGGTCTCGTGGGCTCGG
+>gnl|uv|NGB00745.1 Illumina Nextera PCR primer i7 index N711 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTGCCTCTTGTCTCGTGGGCTCGG
+>gnl|uv|NGB00746.1 Illumina Nextera PCR primer i7 index N712 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTCCTCTACGTCTCGTGGGCTCGG
+>gnl|uv|NGB00747.1 Illumina TruSeq DNA HT and RNA HT i5 index D501 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACTATAGCCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB00748.1 Illumina TruSeq DNA HT and RNA HT i5 index D502 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACATAGAGGCACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB00749.1 Illumina TruSeq DNA HT and RNA HT i5 index D503 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACCCTATCCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB00750.1 Illumina TruSeq DNA HT and RNA HT i5 index D504 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACGGCTCTGAACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB00751.1 Illumina TruSeq DNA HT and RNA HT i5 index D505 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACAGGCGAAGACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB00752.1 Illumina TruSeq DNA HT and RNA HT i5 index D506 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACTAATCTTAACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB00753.1 Illumina TruSeq DNA HT and RNA HT i5 index D507 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACCAGGACGTACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB00754.1 Illumina TruSeq DNA HT and RNA HT i5 index D508 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACGTACTGACACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB00755.1 Illumina TruSeq DNA HT and RNA HT i7 index D701 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACATTACTCGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00756.1 Illumina TruSeq DNA HT and RNA HT i7 index D702 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCGGAGAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00757.1 Illumina TruSeq DNA HT and RNA HT i7 index D703 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGCTCATTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00758.1 Illumina TruSeq DNA HT and RNA HT i7 index D704 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGATTCCATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00759.1 Illumina TruSeq DNA HT and RNA HT i7 index D705 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACATTCAGAAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00760.1 Illumina TruSeq DNA HT and RNA HT i7 index D706 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACGAATTCGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00761.1 Illumina TruSeq DNA HT and RNA HT i7 index D707 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTGAAGCTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00762.1 Illumina TruSeq DNA HT and RNA HT i7 index D708 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACTAATGCGCATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00763.1 Illumina TruSeq DNA HT and RNA HT i7 index D709 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGGCTATGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00764.1 Illumina TruSeq DNA HT and RNA HT i7 index D710 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCGCGAAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00765.1 Illumina TruSeq DNA HT and RNA HT i7 index D711 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACTCTCGCGCATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00766.1 Illumina TruSeq DNA HT and RNA HT i7 index D712 adapter (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGCGATAGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00767.1 Illumina TruSeq Adapter Index 1 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00768.1 Illumina TruSeq Adapter Index 2 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00769.1 Illumina TruSeq Adapter Index 3 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00770.1 Illumina TruSeq Adapter Index 4 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00771.1 Illumina TruSeq Adapter Index 5 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00772.1 Illumina TruSeq Adapter Index 6 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00773.1 Illumina TruSeq Adapter Index 7 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00774.1 Illumina TruSeq Adapter Index 8 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACACTTGAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00775.1 Illumina TruSeq Adapter Index 9 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00776.1 Illumina TruSeq Adapter Index 10 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00777.1 Illumina TruSeq Adapter Index 11 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCTACATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00778.1 Illumina TruSeq Adapter Index 12 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTTGTAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00779.1 Illumina TruSeq Adapter Index 13 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00780.1 Illumina TruSeq Adapter Index 14 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00781.1 Illumina TruSeq Adapter Index 15 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACATGTCAGAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00782.1 Illumina TruSeq Adapter Index 16 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACCCGTCCCGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00783.1 Illumina TruSeq Adapter Index 18 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTCCGCACATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00784.1 Illumina TruSeq Adapter Index 19 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTGAAACGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00785.1 Illumina TruSeq Adapter Index 20 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTGGCCTTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00786.1 Illumina TruSeq Adapter Index 21 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTTTCGGAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00787.1 Illumina TruSeq Adapter Index 22 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGTACGTAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00788.1 Illumina TruSeq Adapter Index 23 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTGGATATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00789.1 Illumina TruSeq Adapter Index 25 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACACTGATATATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00790.1 Illumina TruSeq Adapter Index 27 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GATCGGAAGAGCACACGTCTGAACTCCAGTCACATTCCTTTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB00791.1 Illumina TruSeq Small RNA Sample Prep Kit Stop Oligo (STP) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GAAUUCCACCACGUUCCCGUGG
+>gnl|uv|NGB00792.1 Illumina TruSeq Small RNA Sample Prep Kit RNA RT Primer (RTP) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+GCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00793.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer (RP1) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGA
+>gnl|uv|NGB00794.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 1 (RPI1) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00795.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 2 (RPI2) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00796.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 3 (RPI3) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00797.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 4 (RPI4) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00798.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 5 (RPI5) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00799.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 6 (RPI6) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00800.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 7 (RPI7) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGATCTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00801.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 8 (RPI8) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTCAAGTGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00802.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 9 (RPI9) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCTGATCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00803.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 10 (RPI10) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATAAGCTAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00804.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 11 (RPI11) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGTAGCCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00805.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 12 (RPI12) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTACAAGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00806.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 13 (RPI13) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTTGACTGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00807.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 14 (RPI14) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGGAACTGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00808.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 15 (RPI15) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTGACATGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00809.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 16 (RPI16) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGGACGGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00810.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 17 (RPI17) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCTCTACGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00811.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 18 (RPI18) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGCGGACGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00812.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 19 (RPI19) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTTTCACGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00813.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 20 (RPI20) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGGCCACGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00814.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 21 (RPI21) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCGAAACGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00815.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 22 (RPI22) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCGTACGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00816.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 23 (RPI23) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCCACTCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00817.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 24 (RPI24) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGCTACCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00818.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 25 (RPI25) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATATCAGTGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00819.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 26 (RPI26) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGCTCATGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00820.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 27 (RPI27) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATAGGAATGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00821.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 28 (RPI28) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCTTTTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00822.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 29 (RPI29) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTAGTTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00823.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 30 (RPI30) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCCGGTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00824.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 31 (RPI31) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATATCGTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00825.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 32 (RPI32) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTGAGTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00826.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 33 (RPI33) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCGCCTGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00827.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 34 (RPI34) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGCCATGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00828.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 35 (RPI35) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATAAAATGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00829.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 36 (RPI36) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTGTTGGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00830.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 37 (RPI37) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATATTCCGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00831.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 38 (RPI38) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATAGCTAGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00832.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 39 (RPI39) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGTATAGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00833.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 40 (RPI40) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTCTGAGGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00834.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 41 (RPI41) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGTCGTCGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00835.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 42 (RPI42) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCGATTAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00836.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 43 (RPI43) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGCTGTAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00837.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 44 (RPI44) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATATTATAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00838.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 45 (RPI45) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATGAATGAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00839.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 46 (RPI46) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTCGGGAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00840.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 47 (RPI47) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATCTTCGAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00841.1 Illumina TruSeq Small RNA Sample Prep Kit RNA PCR Primer Index 48 (RPI48) (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CAAGCAGAAGACGGCATACGAGATTGCCGAGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA
+>gnl|uv|NGB00844.1 Epicentre Biotechnologies Nextera DNA Sample Prep Kit Adaptor (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+AATGATACGGCGACCACCGAGATCTACACGCCTCCCTCGCGCCATCAG
+>gnl|uv|NGB00845.1 Epicentre Biotechnologies Nextera DNA Sample Prep Kit Adaptor, following the barcode (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+CGGTCTGCCTTGCCAGCCCGCTCAG
+>gnl|uv|NGB00846.1 NEBNext Adaptor for Illumina
+GATCGGAAGAGCACACGTCTGAACTCCAGTCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB00847.1 NEBNext Index 1 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00848.1 NEBNext Index 2 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00849.1 NEBNext Index 3 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00850.1 NEBNext Index 4 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00851.1 NEBNext Index 5 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00852.1 NEBNext Index 6 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00853.1 NEBNext Index 7 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATGATCTGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00854.1 NEBNext Index 8 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATTCAAGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00855.1 NEBNext Index 9 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATCTGATCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00856.1 NEBNext Index 10 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATAAGCTAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00857.1 NEBNext Index 11 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATGTAGCCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00858.1 NEBNext Index 12 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATTACAAGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00859.1 NEBNext Index 13 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATTGTTGACTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00860.1 NEBNext Index 14 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATACGGAACTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00861.1 NEBNext Index 15 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATTCTGACATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00862.1 NEBNext Index 16 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATCGGGACGGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00863.1 NEBNext Index 18 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATGTGCGGACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00864.1 NEBNext Index 19 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATCGTTTCACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00865.1 NEBNext Index 20 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATAAGGCCACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00866.1 NEBNext Index 21 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATTCCGAAACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00867.1 NEBNext Index 22 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATTACGTACGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00868.1 NEBNext Index 23 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATATCCACTCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00869.1 NEBNext Index 25 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATATATCAGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00870.1 NEBNext Index 27 Primer for Illumina
+CAAGCAGAAGACGGCATACGAGATAAAGGAATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
+>gnl|uv|NGB00871.1 Ion Xpress A Adapter
+AACCATCTCATCCCTGCGTGTCTCCGACTCAG
+>gnl|uv|NGB00872.1 Ion Xpress P1 Adapter
+AACCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT
+>gnl|uv|NGB00873.1 Ion Xpress Barcode 1 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAAGGTAACGAT
+>gnl|uv|NGB00874.1 Ion Xpress Barcode 2 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAAGGAGAACGAT
+>gnl|uv|NGB00875.1 Ion Xpress Barcode 3 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAAGAGGATTCGAT
+>gnl|uv|NGB00876.1 Ion Xpress Barcode 4 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTACCAAGATCGAT
+>gnl|uv|NGB00877.1 Ion Xpress Barcode 5 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGAAGGAACGAT
+>gnl|uv|NGB00878.1 Ion Xpress Barcode 6 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGCAAGTTCGAT
+>gnl|uv|NGB00879.1 Ion Xpress Barcode 7 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCGTGATTCGAT
+>gnl|uv|NGB00880.1 Ion Xpress Barcode 8 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCCGATAACGAT
+>gnl|uv|NGB00881.1 Ion Xpress Barcode 9 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGAGCGGAACGAT
+>gnl|uv|NGB00882.1 Ion Xpress Barcode 10 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGACCGAACGAT
+>gnl|uv|NGB00883.1 Ion Xpress Barcode 11 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTCGAATCGAT
+>gnl|uv|NGB00884.1 Ion Xpress Barcode 12 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGGTGGTTCGAT
+>gnl|uv|NGB00885.1 Ion Xpress Barcode 13 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTAACGGACGAT
+>gnl|uv|NGB00886.1 Ion Xpress Barcode 14 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTGGAGTGTCGAT
+>gnl|uv|NGB00887.1 Ion Xpress Barcode 15 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTAGAGGTCGAT
+>gnl|uv|NGB00888.1 Ion Xpress Barcode 16 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGGATGACGAT
+>gnl|uv|NGB00889.1 Ion Xpress Barcode 17 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTATTCGTCGAT
+>gnl|uv|NGB00890.1 Ion Xpress Barcode 18 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGGCAATTGCGAT
+>gnl|uv|NGB00891.1 Ion Xpress Barcode 19 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTAGTCGGACGAT
+>gnl|uv|NGB00892.1 Ion Xpress Barcode 20 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGATCCATCGAT
+>gnl|uv|NGB00893.1 Ion Xpress Barcode 21 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGCAATTACGAT
+>gnl|uv|NGB00894.1 Ion Xpress Barcode 22 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCGAGACGCGAT
+>gnl|uv|NGB00895.1 Ion Xpress Barcode 23 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGCCACGAACGAT
+>gnl|uv|NGB00896.1 Ion Xpress Barcode 24 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAACCTCATTCGAT
+>gnl|uv|NGB00897.1 Ion Xpress Barcode 25 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCCTGAGATACGAT
+>gnl|uv|NGB00898.1 Ion Xpress Barcode 26 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTACAACCTCGAT
+>gnl|uv|NGB00899.1 Ion Xpress Barcode 27 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAACCATCCGCGAT
+>gnl|uv|NGB00900.1 Ion Xpress Barcode 28 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGATCCGGAATCGAT
+>gnl|uv|NGB00901.1 Ion Xpress Barcode 29 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGACCACTCGAT
+>gnl|uv|NGB00902.1 Ion Xpress Barcode 30 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAGGTTATCGAT
+>gnl|uv|NGB00903.1 Ion Xpress Barcode 31 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCAAGCTGCGAT
+>gnl|uv|NGB00904.1 Ion Xpress Barcode 32 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTTACACACGAT
+>gnl|uv|NGB00905.1 Ion Xpress Barcode 33 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCTCATTGAACGAT
+>gnl|uv|NGB00906.1 Ion Xpress Barcode 34 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGCATCGTTCGAT
+>gnl|uv|NGB00907.1 Ion Xpress Barcode 35 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAAGCCATTGTCGAT
+>gnl|uv|NGB00908.1 Ion Xpress Barcode 36 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAAGGAATCGTCGAT
+>gnl|uv|NGB00909.1 Ion Xpress Barcode 37 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTTGAGAATGTCGAT
+>gnl|uv|NGB00910.1 Ion Xpress Barcode 38 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGGAGGACGGACGAT
+>gnl|uv|NGB00911.1 Ion Xpress Barcode 39 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAACAATCGGCGAT
+>gnl|uv|NGB00912.1 Ion Xpress Barcode 40 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGACATAATCGAT
+>gnl|uv|NGB00913.1 Ion Xpress Barcode 41 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCCACTTCGCGAT
+>gnl|uv|NGB00914.1 Ion Xpress Barcode 42 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCACGAATCGAT
+>gnl|uv|NGB00915.1 Ion Xpress Barcode 43 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTTGACACCGCGAT
+>gnl|uv|NGB00916.1 Ion Xpress Barcode 44 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTGGAGGCCAGCGAT
+>gnl|uv|NGB00917.1 Ion Xpress Barcode 45 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGGAGCTTCCTCGAT
+>gnl|uv|NGB00918.1 Ion Xpress Barcode 46 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCAGTCCGAACGAT
+>gnl|uv|NGB00919.1 Ion Xpress Barcode 47 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTAAGGCAACCACGAT
+>gnl|uv|NGB00920.1 Ion Xpress Barcode 48 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCTAAGAGACGAT
+>gnl|uv|NGB00921.1 Ion Xpress Barcode 49 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTAACATAACGAT
+>gnl|uv|NGB00922.1 Ion Xpress Barcode 50 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGGACAATGGCGAT
+>gnl|uv|NGB00923.1 Ion Xpress Barcode 51 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTGAGCCTATTCGAT
+>gnl|uv|NGB00924.1 Ion Xpress Barcode 52 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCCGCATGGAACGAT
+>gnl|uv|NGB00925.1 Ion Xpress Barcode 53 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGGCAATCCTCGAT
+>gnl|uv|NGB00926.1 Ion Xpress Barcode 54 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCCGGAGAATCGCGAT
+>gnl|uv|NGB00927.1 Ion Xpress Barcode 55 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCACCTCCTCGAT
+>gnl|uv|NGB00928.1 Ion Xpress Barcode 56 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGCATTAATTCGAT
+>gnl|uv|NGB00929.1 Ion Xpress Barcode 57 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGGCAACGGCGAT
+>gnl|uv|NGB00930.1 Ion Xpress Barcode 58 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTAGAACACGAT
+>gnl|uv|NGB00931.1 Ion Xpress Barcode 59 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTTGATGTTCGAT
+>gnl|uv|NGB00932.1 Ion Xpress Barcode 60 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTAGCTCTTCGAT
+>gnl|uv|NGB00933.1 Ion Xpress Barcode 61 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCACTCGGATCGAT
+>gnl|uv|NGB00934.1 Ion Xpress Barcode 62 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCCTGCTTCACGAT
+>gnl|uv|NGB00935.1 Ion Xpress Barcode 63 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCCTTAGAGTTCGAT
+>gnl|uv|NGB00936.1 Ion Xpress Barcode 64 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGAGTTCCGACGAT
+>gnl|uv|NGB00937.1 Ion Xpress Barcode 65 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTGGCACATCGAT
+>gnl|uv|NGB00938.1 Ion Xpress Barcode 66 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCCGCAATCATCGAT
+>gnl|uv|NGB00939.1 Ion Xpress Barcode 67 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCCTACCAGTCGAT
+>gnl|uv|NGB00940.1 Ion Xpress Barcode 68 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCAAGAAGTTCGAT
+>gnl|uv|NGB00941.1 Ion Xpress Barcode 69 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTCAATTGGCGAT
+>gnl|uv|NGB00942.1 Ion Xpress Barcode 70 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCCTACTGGTCGAT
+>gnl|uv|NGB00943.1 Ion Xpress Barcode 71 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTGAGGCTCCGACGAT
+>gnl|uv|NGB00944.1 Ion Xpress Barcode 72 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAAGGCCACACGAT
+>gnl|uv|NGB00945.1 Ion Xpress Barcode 73 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGCCTGTCGAT
+>gnl|uv|NGB00946.1 Ion Xpress Barcode 74 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGATCGGTTCGAT
+>gnl|uv|NGB00947.1 Ion Xpress Barcode 75 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCAGGAATACGAT
+>gnl|uv|NGB00948.1 Ion Xpress Barcode 76 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGGAAGAACCTCGAT
+>gnl|uv|NGB00949.1 Ion Xpress Barcode 77 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAAGCGATTCGAT
+>gnl|uv|NGB00950.1 Ion Xpress Barcode 78 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGCCAATTCTCGAT
+>gnl|uv|NGB00951.1 Ion Xpress Barcode 79 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCCTGGTTGTCGAT
+>gnl|uv|NGB00952.1 Ion Xpress Barcode 80 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGAAGGCAGGCGAT
+>gnl|uv|NGB00953.1 Ion Xpress Barcode 81 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCCTGCCATTCGCGAT
+>gnl|uv|NGB00954.1 Ion Xpress Barcode 82 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTGGCATCTCGAT
+>gnl|uv|NGB00955.1 Ion Xpress Barcode 83 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAGGACATTCGAT
+>gnl|uv|NGB00956.1 Ion Xpress Barcode 84 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTTCCATAACGAT
+>gnl|uv|NGB00957.1 Ion Xpress Barcode 85 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCCAGCCTCAACGAT
+>gnl|uv|NGB00958.1 Ion Xpress Barcode 86 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTTGGTTATTCGAT
+>gnl|uv|NGB00959.1 Ion Xpress Barcode 87 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTGGCTGGACGAT
+>gnl|uv|NGB00960.1 Ion Xpress Barcode 88 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCCGAACACTTCGAT
+>gnl|uv|NGB00961.1 Ion Xpress Barcode 89 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTGAATCTCGAT
+>gnl|uv|NGB00962.1 Ion Xpress Barcode 90 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAACCACGGCGAT
+>gnl|uv|NGB00963.1 Ion Xpress Barcode 91 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGGAAGGATGCGAT
+>gnl|uv|NGB00964.1 Ion Xpress Barcode 92 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAGGAACCGCGAT
+>gnl|uv|NGB00965.1 Ion Xpress Barcode 93 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCTTGTCCAATCGAT
+>gnl|uv|NGB00966.1 Ion Xpress Barcode 94 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCGACAAGCGAT
+>gnl|uv|NGB00967.1 Ion Xpress Barcode 95 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGCGGACAGATCGAT
+>gnl|uv|NGB00968.1 Ion Xpress Barcode 96 A Adapter
+CCATCTCATCCCTGCGTGTCTCCGACTCAGTTAAGCGGTCGAT
+>gnl|uv|NGB00969.1 Illumina Single End Adapter 1 (Oligonucleotide sequence copyright 2007-2012 Illumina, Inc. All rights reserved.)
+ACACTCTTTCCCTACACGACGCTGTTCCATCT
+>gnl|uv|NGB00970.1 ABI SOLiD SAGE Dynabeads Oligo-dT EcoP Primer
+CTGATCTAGAGGTACCGGATCCCAGCAGTTTTTTTTTTTTTTTTTTTTTTTTT
+>gnl|uv|NGB00971.1 ABI SOLiD SAGE Adapter A
+CTGCCCCGGGTTCCTCATTCTCTCAGCAGCATG
+>gnl|uv|NGB00972.1 Pacific Biosciences Blunt Adapter
+ATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGAT
+>gnl|uv|NGB00973.1 Pacific Biosciences C2 Primer
+AAAAAAAAAAAAAAAAAATTAACGGAGGAGGAGGA
+>gnl|uv|NGB00982.1 Universal primer-dN6
+GCCGGAGCTCTGCAGAATTCNNNNNN
+>gnl|uv|NGB00983.1 Whole Transcriptome Amplification 5'-end tag
+GTGGTGTGTTGGGTGTGTTTGGNNNNNNNNN
+>gnl|uv|NGB01026.1 SISPA primer FR20RV
+GCCGGAGCTCTGCAGATATC
+>gnl|uv|NGB01029.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT1
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGTTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01030.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT2
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTTTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01031.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT3
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCATATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01032.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT4
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCACTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01033.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT5
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01034.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT6
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01035.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT7
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCTGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01036.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT8
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACACTTGATGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01037.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT9
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGCGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01038.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT10
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01039.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT11
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCTACAGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01040.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT12
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCTTGTACTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01041.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT13
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGGTTGTTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01042.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT14
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCTCGGTTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01043.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT15
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAAGCGTTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01044.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT16
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCGTCTTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01045.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT17
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGTACCTTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01046.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT18
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTCTGTGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01047.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT19
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCTGCTGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01048.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT20
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTGGAGGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01049.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT21
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCGAGCGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01050.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT22
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGATACGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01051.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT99
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGTGCTACCATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01052.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT101
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGGTTGGACATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01053.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT25
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGCGATCTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01054.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT26
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTCCTGCTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01055.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT27
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGTGACTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01056.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT28
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTACAGGATATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01057.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT29
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCTCAATATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01058.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT30
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGTGGTTGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01059.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT31
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGTCTTGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01060.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT32
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTCCATTGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01061.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT33
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCGAAGTGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01062.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT34
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAACGCTGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01063.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT35
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTGGTATGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01064.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT36
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGAACTGGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01065.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT102
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCACAACATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01066.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT38
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCTCACGGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01067.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT39
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCAGGAGGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01068.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT40
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAAGTTCGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01069.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT41
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCAGTCGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01070.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT42
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGTATGCGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01071.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT43
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCATTGAGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01072.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT44
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGGCTCAGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01073.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT45
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTATGCCAGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01074.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT46
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCAGATTCATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01075.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT47
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTACTAGTCATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01076.1 Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT48
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTCAGCTCATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01077.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D701
+AATGATACGGCGACCACCGAGATCTACACATTACTCGACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01078.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D702
+AATGATACGGCGACCACCGAGATCTACACTCCGGAGAACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01079.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D703
+AATGATACGGCGACCACCGAGATCTACACCGCTCATTACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01080.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D704
+AATGATACGGCGACCACCGAGATCTACACGAGATTCCACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01081.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D705
+AATGATACGGCGACCACCGAGATCTACACATTCAGAAACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01082.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D706
+AATGATACGGCGACCACCGAGATCTACACGAATTCGTACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01083.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D707
+AATGATACGGCGACCACCGAGATCTACACCTGAAGCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01084.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D708
+AATGATACGGCGACCACCGAGATCTACACTAATGCGCACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01085.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D709
+AATGATACGGCGACCACCGAGATCTACACCGGCTATGACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01086.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D710
+AATGATACGGCGACCACCGAGATCTACACTCCGCGAAACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01087.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D711
+AATGATACGGCGACCACCGAGATCTACACTCTCGCGCACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01088.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D712
+AATGATACGGCGACCACCGAGATCTACACAGCGATAGACACTCTTTCCCTACACGACGCTCTTCCGATCT
+>gnl|uv|NGB01089.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D501
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTATAGCCTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01090.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D502
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACATAGAGGCATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01091.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D503
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCCTATCCTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01092.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D504
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCTCTGAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01093.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D505
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACAGGCGAAGATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01094.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D506
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAATCTTAATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01095.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D507
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGGACGTATCTCGTATGCCGTCTTCTGCTTG
+>gnl|uv|NGB01096.1 Rubicon Genomics ThruPLEX DNA-seq dual-index D508
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGTACTGACATCTCGTATGCCGTCTTCTGCTTG
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/assets/dummy_file.txt	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,1 @@
+DuMmY
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/assets/dummy_file2.txt	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,1 @@
+DuMmY
Binary file 0.5.0/assets/nowayout-icon.png has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/check_samplesheet.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,188 @@
+#!/usr/bin/env python3
+
+import argparse
+import errno
+import os
+import sys
+
+
+def parse_args(args=None):
+    Description = "Reformat samplesheet file and check its contents."
+    Epilog = "Example usage: python check_samplesheet.py <FILE_IN> <FILE_OUT>"
+
+    parser = argparse.ArgumentParser(description=Description, epilog=Epilog)
+    parser.add_argument("FILE_IN", help="Input samplesheet file.")
+    parser.add_argument("FILE_OUT", help="Output file.")
+    return parser.parse_args(args)
+
+
+def make_dir(path):
+    if len(path) > 0:
+        try:
+            os.makedirs(path)
+        except OSError as exception:
+            if exception.errno != errno.EEXIST:
+                raise exception
+
+
+def print_error(error, context="Line", context_str=""):
+    error_str = f"ERROR: Please check samplesheet -> {error}"
+    if context != "" and context_str != "":
+        error_str = f"ERROR: Please check samplesheet -> {error}\n{context.strip()}: '{context_str.strip()}'"
+    print(error_str)
+    sys.exit(1)
+
+
+def check_samplesheet(file_in, file_out):
+    """
+    Check that the samplesheet has the following structure:
+
+    sample,fq1,fq2,strandedness
+    SAMPLE_PE,SAMPLE_PE_RUN1_1.fastq.gz,SAMPLE_PE_RUN1_2.fastq.gz,forward
+    SAMPLE_PE,SAMPLE_PE_RUN2_1.fastq.gz,SAMPLE_PE_RUN2_2.fastq.gz,forward
+    SAMPLE_SE,SAMPLE_SE_RUN1_1.fastq,,forward
+    SAMPLE_SE,SAMPLE_SE_RUN1_2.fastq.gz,,forward
+
+    For an example see:
+    https://github.com/nf-core/test-datasets/blob/rnaseq/samplesheet/v3.1/samplesheet_test.csv
+    """
+
+    sample_mapping_dict = {}
+    with open(file_in, "r", encoding="utf-8-sig") as fin:
+
+        ## Check header
+        MIN_COLS = 3
+        HEADER = ["sample", "fq1", "fq2", "strandedness"]
+        header = [x.strip('"') for x in fin.readline().strip().split(",")]
+        if header[: len(HEADER)] != HEADER:
+            print(
+                f"ERROR: Please check samplesheet header -> {','.join(header)} != {','.join(HEADER)}"
+            )
+            sys.exit(1)
+
+        ## Check sample entries
+        for line in fin:
+            if line.strip():
+                lspl = [x.strip().strip('"') for x in line.strip().split(",")]
+
+                ## Check valid number of columns per row
+                if len(lspl) < len(HEADER):
+                    print_error(
+                        f"Invalid number of columns (minimum = {len(HEADER)})!",
+                        "Line",
+                        line,
+                    )
+
+                num_cols = len([x for x in lspl if x])
+                if num_cols < MIN_COLS:
+                    print_error(
+                        f"Invalid number of populated columns (minimum = {MIN_COLS})!",
+                        "Line",
+                        line,
+                    )
+
+                ## Check sample name entries
+                sample, fq1, fq2, strandedness = lspl[: len(HEADER)]
+                if sample.find(" ") != -1:
+                    print(
+                        f"WARNING: Spaces have been replaced by underscores for sample: {sample}"
+                    )
+                    sample = sample.replace(" ", "_")
+                if not sample:
+                    print_error("Sample entry has not been specified!", "Line", line)
+
+                ## Check FastQ file extension
+                for fastq in [fq1, fq2]:
+                    if fastq:
+                        if fastq.find(" ") != -1:
+                            print_error("FastQ file contains spaces!", "Line", line)
+                        # if not fastq.endswith(".fastq.gz") and not fastq.endswith(".fq.gz"):
+                        #     print_error(
+                        #         "FastQ file does not have extension '.fastq.gz' or '.fq.gz'!",
+                        #         "Line",
+                        #         line,
+                        #     )
+
+                ## Check strandedness
+                strandednesses = ["unstranded", "forward", "reverse"]
+                if strandedness:
+                    if strandedness not in strandednesses:
+                        print_error(
+                            f"Strandedness must be one of '{', '.join(strandednesses)}'!",
+                            "Line",
+                            line,
+                        )
+                else:
+                    print_error(
+                        f"Strandedness has not been specified! Must be one of {', '.join(strandednesses)}.",
+                        "Line",
+                        line,
+                    )
+
+                ## Auto-detect paired-end/single-end
+                sample_info = []  ## [single_end, fq1, fq2, strandedness]
+                if sample and fq1 and fq2:  ## Paired-end short reads
+                    sample_info = ["0", fq1, fq2, strandedness]
+                elif sample and fq1 and not fq2:  ## Single-end short reads
+                    sample_info = ["1", fq1, fq2, strandedness]
+                else:
+                    print_error(
+                        "Invalid combination of columns provided!", "Line", line
+                    )
+
+                ## Create sample mapping dictionary = {sample: [[ single_end, fq1, fq2, strandedness ]]}
+                if sample not in sample_mapping_dict:
+                    sample_mapping_dict[sample] = [sample_info]
+                else:
+                    if sample_info in sample_mapping_dict[sample]:
+                        print_error(
+                            "Samplesheet contains duplicate rows!", "Line", line
+                        )
+                    else:
+                        sample_mapping_dict[sample].append(sample_info)
+
+    ## Write validated samplesheet with appropriate columns
+    if len(sample_mapping_dict) > 0:
+        out_dir = os.path.dirname(file_out)
+        make_dir(out_dir)
+        with open(file_out, "w") as fout:
+            fout.write(
+                ",".join(["sample", "single_end", "fq1", "fq2", "strandedness"]) + "\n"
+            )
+            for sample in sorted(sample_mapping_dict.keys()):
+
+                ## Check that multiple runs of the same sample are of the same datatype i.e. single-end / paired-end
+                if not all(
+                    x[0] == sample_mapping_dict[sample][0][0]
+                    for x in sample_mapping_dict[sample]
+                ):
+                    print_error(
+                        "Multiple runs of a sample must be of the same datatype i.e. single-end or paired-end!",
+                        "Sample",
+                        sample,
+                    )
+
+                ## Check that multiple runs of the same sample are of the same strandedness
+                if not all(
+                    x[-1] == sample_mapping_dict[sample][0][-1]
+                    for x in sample_mapping_dict[sample]
+                ):
+                    print_error(
+                        "Multiple runs of a sample must have the same strandedness!",
+                        "Sample",
+                        sample,
+                    )
+
+                for idx, val in enumerate(sample_mapping_dict[sample]):
+                    fout.write(",".join([f"{sample}_T{idx+1}"] + val) + "\n")
+    else:
+        print_error("No entries to process!", "Samplesheet", file_in)
+
+
+def main(args=None):
+    args = parse_args(args)
+    check_samplesheet(args.FILE_IN, args.FILE_OUT)
+
+
+if __name__ == "__main__":
+    sys.exit(main())
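
The row-level checks in `check_samplesheet` above can be illustrated with a minimal standalone sketch. This is not the script itself: the helper name `classify_row` is hypothetical, it covers only the column-count, strandedness, and paired-end/single-end checks, and it raises `ValueError` where the script calls `sys.exit(1)`.

```python
# Hypothetical helper mirroring part of the checks in check_samplesheet.py.
# It is an illustration only and is not part of the changeset.
HEADER = ["sample", "fq1", "fq2", "strandedness"]
STRANDEDNESSES = ["unstranded", "forward", "reverse"]


def classify_row(line):
    """Return [sample, single_end, fq1, fq2, strandedness] for one CSV row."""
    fields = [x.strip().strip('"') for x in line.strip().split(",")]
    if len(fields) < len(HEADER):
        raise ValueError(f"Invalid number of columns (minimum = {len(HEADER)})")

    sample, fq1, fq2, strandedness = fields[: len(HEADER)]
    # Spaces in sample names are replaced by underscores, as in the script.
    sample = sample.replace(" ", "_")
    if not sample or not fq1:
        raise ValueError("Invalid combination of columns provided")
    if strandedness not in STRANDEDNESSES:
        raise ValueError(
            "Strandedness must be one of " + ", ".join(STRANDEDNESSES)
        )

    # "0" marks paired-end rows (fq2 present), "1" marks single-end rows.
    single_end = "0" if fq2 else "1"
    return [sample, single_end, fq1, fq2, strandedness]


paired = classify_row(
    "SAMPLE_PE,SAMPLE_PE_RUN1_1.fastq.gz,SAMPLE_PE_RUN1_2.fastq.gz,forward"
)
single = classify_row("SAMPLE SE,SAMPLE_SE_RUN1_1.fastq,,forward")
print(paired)
print(single)
```

The real script additionally groups rows by sample, rejects duplicate rows, and appends a `_T<n>` suffix to each run when writing the output file.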
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/create_fasta_and_lineages.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,532 @@
+#!/usr/bin/env python3
+
+import argparse
+import gzip
+import inspect
+import logging
+import os
+import pprint
+import re
+import shutil
+import ssl
+import tempfile
+from html.parser import HTMLParser
+from urllib.request import urlopen
+
+from Bio import SeqIO
+from Bio.Seq import Seq
+from Bio.SeqRecord import SeqRecord
+
+
+# Multiple inheritance for pretty printing of help text.
+class MultiArgFormatClasses(
+    argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter
+):
+    pass
+
+
+# HTMLParser override class to get fna.gz and gbff.gz
+class NCBIHTMLParser(HTMLParser):
+    def __init__(self, *, convert_charrefs: bool = True) -> None:
+        super().__init__(convert_charrefs=convert_charrefs)
+        self.reset()
+        self.href_data = list()
+
+    def handle_data(self, data):
+        self.href_data.append(data)
+
+
+# Download organelle FASTA and GenBank file.
+def dl_mito_seqs_and_flat_files(url: str, suffix: re.Pattern, out: os.PathLike) -> os.PathLike:
+    """
+    Method to save .fna.gz and .gbff.gz files for the
+    RefSeq mitochondrion release.
+    """
+    contxt = ssl.create_default_context()
+    contxt.check_hostname = False
+    contxt.verify_mode = ssl.CERT_NONE
+
+    if url is None:
+        logging.error(
+            "Please provide the base URL where .fna.gz and .gbff.gz"
+            + "\nfiles for RefSeq mitochondrion can be found."
+        )
+        exit(1)
+
+    if os.path.exists(out):
+        for file in os.listdir(out):
+            file_path = os.path.join(out, file)
+
+            if suffix.match(file_path) and os.path.getsize(file_path) > 0:
+                logging.info(
+                    f"The required mitochondrion file(s)\n[{os.path.basename(file_path)}]"
+                    + " already exists.\nSkipping download from NCBI..."
+                    + "\nPlease use -f to delete and overwrite."
+                )
+                return file_path
+    else:
+        os.makedirs(out)
+
+    html_parser = NCBIHTMLParser()
+    logging.info(f"Finding latest NCBI RefSeq mitochondrion release at:\n{url}")
+
+    with urlopen(url, context=contxt) as response:
+        with tempfile.NamedTemporaryFile(delete=False) as tmp_html_file:
+            shutil.copyfileobj(response, tmp_html_file)
+
+    with open(tmp_html_file.name, "r") as html:
+        html_parser.feed("".join(html.readlines()))
+
+    file = suffix.search("".join(html_parser.href_data)).group(0)
+    file_url = "/".join([url, file + ".gz"])
+    file_at = os.path.join(out, file)
+
+    logging.info(f"Found NCBI RefSeq mitochondrion file(s):\n{file_url}")
+
+    logging.info(f"Saving to:\n{file_at}")
+
+    with tempfile.NamedTemporaryFile(delete=False) as tmp_gz:
+        with urlopen(file_url, context=contxt) as response:
+            tmp_gz.write(response.read())
+
+    with open(file_at, "w") as fh:
+        with gzip.open(tmp_gz.name, "rb") as web_gz:
+            fh.write(web_gz.read().decode("utf-8"))
+
+    html.close()
+    tmp_gz.close()
+    tmp_html_file.close()
+    os.unlink(tmp_gz.name)
+    os.unlink(tmp_html_file.name)
+    fh.close()
+    web_gz.close()
+    response.close()
+
+    return file_at
+
+
+def get_lineages(csv: os.PathLike, cols: list) -> tuple:
+    """
+    Parse the output from the `ncbitax2lin` tool and return a dict of
+    lineages keyed by strain (or by species when strain is empty),
+    together with the number of raw records read.
+    """
+    lineages = dict()
+    if csv is None or not (os.path.exists(csv) and os.path.getsize(csv) > 0):
+        logging.error(
+            f"The CSV file [{os.path.basename(csv)}] is empty or does not exist!"
+        )
+        exit(1)
+
+    logging.info(f"Indexing {os.path.basename(csv)}...")
+
+    with open(csv, "r") as csv_fh:
+        header_cols = csv_fh.readline().strip().split(",")
+        user_req_cols = [
+            tcol_i for tcol_i, tcol in enumerate(header_cols) if tcol in cols
+        ]
+        cols_not_found = [tcol for tcol in cols if tcol not in header_cols]
+        raw_recs = 0
+
+        if len(cols_not_found) > 0:
+            logging.error(
+                f"The following columns do not exist in the"
+                + f"\nCSV file [ {os.path.basename(csv)} ]:\n"
+                + ", ".join(cols_not_found)
+            )
+            exit(1)
+        elif len(user_req_cols) > 9:
+            logging.error(
+                f"Only a total of 9 columns are needed!"
+                + "\ntax_id,kingdom,phylum,class,order,family,genus,species,strain"
+            )
+            exit(1)
+
+        for tax in csv_fh:
+            raw_recs += 1
+            lcols = tax.strip().split(",")
+
+            if bool(lcols[user_req_cols[8]]):
+                lineages[lcols[user_req_cols[8]]] = ",".join(
+                    [re.sub(r"[\,\"]", "", lcols[l]) for l in user_req_cols[1:]]
+                )
+            elif bool(lcols[user_req_cols[7]]):
+                lineages[lcols[user_req_cols[7]]] = ",".join(
+                    [re.sub(r"[\,\"]", "", lcols[l]) for l in user_req_cols[1:8]]
+                    + [str()]
+                )
+
+            # if lcols[7] == "Rondeletia bicolor Goode & Bean":
+            #     print(lineages[lcols[8]])
+            #     exit(0)
+
+    csv_fh.close()
+    return lineages, raw_recs
+
+
+def from_genbank(gbk: os.PathLike, min_len: int) -> dict:
+    """
+    Parse a GenBank file and return a mapping of
+    accession (with version) to organism name.
+    """
+    accs2orgs = dict()
+    sanitize_pat = re.compile(r"[\,\"]")
+
+    if not (os.path.exists(gbk) and os.path.getsize(gbk) > 0):
+        logging.error(
+            f"The GenBank file [{os.path.basename(gbk)}] does not exist"
+            + "\nor is of size 0."
+        )
+        exit(1)
+
+    logging.info(f"Indexing {os.path.basename(gbk)}...")
+
+    # a = open("./_accs", "w")
+    try:
+        for record in SeqIO.parse(gbk, "genbank"):
+            if len(record.seq) < min_len:
+                continue
+            else:
+                # a.write(f"{record.id}\n")
+                accs2orgs[record.id] = sanitize_pat.sub(
+                    "", record.annotations["organism"]
+                )
+    except Exception as e:
+        logging.error(f"Error occurred around GenBank ID: {record.id}")
+        logging.error(f"Error: {e}")
+        exit(1)
+
+    return accs2orgs
+
+
+def from_genbank_alt(gbk: os.PathLike) -> dict:
+    """
+    Parse a GenBank file and return a mapping of
+    accession (with version) to organism name without
+    using BioPython's GenBank Scanner.
+    """
+    accs2orgs = dict()
+    accs = dict()
+    orgs = dict()
+    acc = False
+    acc_pat = re.compile(r"^VERSION\s+(.+)")
+    org_pat = re.compile(r"^\s+ORGANISM\s+(.+)")
+    sanitize_pat = re.compile(r"[\,\"]")
+
+    if not (os.path.exists(gbk) and os.path.getsize(gbk) > 0):
+        logging.error(
+            f"The GenBank file [{os.path.basename(gbk)}] does not exist"
+            + "\nor is of size 0."
+        )
+        exit(1)
+
+    logging.info(
+        f"Indexing {os.path.basename(gbk)} without using\nBioPython's GenBank Scanner..."
+    )
+
+    with open(gbk, "r") as gbk_fh:
+        for line in gbk_fh:
+            line = line.rstrip()
+            if line.startswith("VERSION") and acc_pat.match(line):
+                acc = acc_pat.match(line).group(1)
+                accs[acc] = 1
+            if org_pat.match(line):
+                if acc and acc not in orgs.keys():
+                    orgs[acc] = sanitize_pat.sub("", org_pat.match(line).group(1))
+                elif acc and acc in orgs.keys():
+                    logging.error(f"Duplicate VERSION line: {acc}")
+                    exit(1)
+        if len(accs.keys()) != len(orgs.keys()):
+            logging.error(
+                f"Got unequal number of organisms ({len(orgs.keys())})\n"
+                + f"and accessions ({len(accs.keys())})"
+            )
+            exit(1)
+        else:
+            for acc in accs.keys():
+                if acc not in orgs.keys():
+                    logging.error(f"ORGANISM not found for accession: {acc}")
+                    exit(1)
+                accs2orgs[acc] = orgs[acc]
+
+    gbk_fh.close()
+    return accs2orgs
+
+
+def write_fasta(seq: str, id: str, basedir: os.PathLike, suffix: str) -> None:
+    """
+    Write sequence with no description to specified file.
+    """
+    SeqIO.write(
+        SeqRecord(Seq(seq), id=id, description=str()),
+        os.path.join(basedir, id + suffix),
+        "fasta",
+    )
+
+
+# Main
+def main() -> None:
+    """
+    This script:
+        1. Downloads the RefSeq mitochondrial GenBank and FASTA format files.
+        2. Takes as input the .csv.gz or .csv file generated by `ncbitax2lin`.
+
+    and then generates a folder containing individual FASTA sequence files
+    per organelle, and a corresponding lineage file in CSV format.
+    """
+
+    # Set logging.
+    logging.basicConfig(
+        format="\n"
+        + "=" * 55
+        + "\n%(asctime)s - %(levelname)s\n"
+        + "=" * 55
+        + "\n%(message)s\n\n",
+        level=logging.DEBUG,
+    )
+
+    # Debug print.
+    ppp = pprint.PrettyPrinter(width=55)
+    prog_name = os.path.basename(inspect.stack()[0].filename)
+
+    parser = argparse.ArgumentParser(
+        prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses
+    )
+
+    required = parser.add_argument_group("required arguments")
+
+    required.add_argument(
+        "-csv",
+        dest="csv",
+        default=False,
+        required=True,
+        help="Absolute UNIX path to .csv or .csv.gz file which is generated "
+        + "\nby the `ncbitax2lin` tool.",
+    )
+    parser.add_argument(
+        "-cols",
+        dest="lineage_cols",
+        default="tax_id,domain,phylum,class,order,family,genus,species,strain",
+        required=False,
+        help="Taxonomic lineage will be built using these columns from the output of"
+        + "\n`ncbitax2lin` tool.",
+    )
+    parser.add_argument(
+        "-url",
+        dest="url",
+        default="https://ftp.ncbi.nlm.nih.gov/refseq/release/mitochondrion",
+        required=False,
+        help="Base URL from which NCBI RefSeq mitochondrion files will be\ndownloaded.",
+    )
+    parser.add_argument(
+        "-out",
+        dest="out_folder",
+        default=os.path.join(os.getcwd(), "organelles"),
+        required=False,
+        help="By default, the output is written to this folder.",
+    )
+    parser.add_argument(
+        "-f",
+        dest="force_write_out",
+        default=False,
+        action="store_true",
+        required=False,
+        help="Force overwrite output directory contents.",
+    )
+    parser.add_argument(
+        "--fna-suffix",
+        dest="fna_suffix",
+        default=".fna",
+        required=False,
+        help="Suffix of the individual organelle FASTA files that will be saved.",
+    )
+    parser.add_argument(
+        "-ml",
+        dest="fa_min_len",
+        default=200,
+        required=False,
+        help="Minimum length of the FASTA sequence for it to be considered for"
+        + "\nfurther processing.",
+    )
+    parser.add_argument(
+        "--gen-per-fa",
+        dest="gen_per_fa",
+        default=False,
+        required=False,
+        action="store_true",
+        help="Generate per sequence FASTA file.",
+    )
+    parser.add_argument(
+        "--alt-gb-parser",
+        dest="alt_gb_parser",
+        default=False,
+        required=False,
+        action="store_true",
+        help="Use alternate GenBank parser instead of BioPython's.",
+    )
+
+    # Parse defaults
+    args = parser.parse_args()
+    csv = args.csv
+    out = args.out_folder
+    overwrite = args.force_write_out
+    fna_suffix = args.fna_suffix
+    url = args.url
+    tax_cols = args.lineage_cols
+    gen_per_fa = args.gen_per_fa
+    alt_gb_parser = args.alt_gb_parser
+    min_len = int(args.fa_min_len)
+    tcols_pat = re.compile(r"^[\w\,]+?\w$")
+    mito_fna_suffix = re.compile(r".*?\.genomic\.fna")
+    mito_gbff_suffix = re.compile(r".*?\.genomic\.gbff")
+    final_lineages = os.path.join(out, "lineages.csv")
+    lineages_not_found = os.path.join(out, "lineages_not_found.csv")
+    base_fasta_dir = os.path.join(out, "fasta")
+
+    # Basic checks
+    if not overwrite and os.path.exists(out):
+        logging.warning(
+            f"Output destination [{os.path.basename(out)}] already exists!"
+            + "\nPlease use -f to delete and overwrite."
+        )
+    elif overwrite and os.path.exists(out):
+        logging.info(f"Overwrite requested. Deleting {os.path.basename(out)}...")
+        shutil.rmtree(out)
+
+    if not tcols_pat.match(tax_cols):
+        logging.error(
+            f"Supplied columns' names {tax_cols} should only have words (alphanumeric) separated by a comma."
+        )
+        exit(1)
+    else:
+        tax_cols = re.sub("\n", "", tax_cols).split(",")
+
+    # Get .fna and .gbk files
+    fna = dl_mito_seqs_and_flat_files(url, mito_fna_suffix, out)
+    gbk = dl_mito_seqs_and_flat_files(url, mito_gbff_suffix, out)
+
+    # Get taxonomy from ncbitax2lin
+    lineages, raw_recs = get_lineages(csv, tax_cols)
+
+    # Get parsed organisms and latest accession from GenBank file.
+    if alt_gb_parser:
+        accs2orgs = from_genbank_alt(gbk)
+    else:
+        accs2orgs = from_genbank(gbk, min_len)
+
+    # Finally, read FASTA and create individual FASTA if lineage exists.
+    logging.info(f"Creating new sequences and lineages...")
+
+    l_fh = open(final_lineages, "w")
+    ln_fh = open(lineages_not_found, "w")
+    l_fh.write(
+        "identifiers,superkingdom,phylum,class,order,family,genus,species,strain\n"
+    )
+    ln_fh.write("fna_id,gbk_org\n")
+    passed_lookup = 0
+    failed_lookup = 0
+    gbk_recs_missing = 0
+    skipped_len_short = 0
+
+    if gen_per_fa and not os.path.exists(base_fasta_dir):
+        os.makedirs(base_fasta_dir)
+
+    for record in SeqIO.parse(fna, "fasta"):
+        if len(record.seq) < min_len:
+            skipped_len_short += 1
+            continue
+        elif record.id in accs2orgs.keys():
+            org_words = accs2orgs[record.id].split(" ")
+        else:
+            gbk_recs_missing += 1
+            continue
+
+        genus_species = (
+            " ".join(org_words[0:2]) if len(org_words) > 2 else " ".join(org_words[0:])
+        )
+
+        if gen_per_fa:
+            write_fasta(record.seq, record.id, base_fasta_dir, fna_suffix)
+
+        if record.id in accs2orgs.keys() and accs2orgs[record.id] in lineages.keys():
+            l_fh.write(",".join([record.id, lineages[accs2orgs[record.id]]]) + "\n")
+            passed_lookup += 1
+        elif record.id in accs2orgs.keys() and genus_species in lineages.keys():
+            if len(org_words) > 2:
+                l_fh.write(
+                    ",".join(
+                        [
+                            record.id,
+                            lineages[genus_species].rstrip(","),
+                            accs2orgs[record.id],
+                        ]
+                    )
+                    + "\n"
+                )
+            else:
+                l_fh.write(",".join([record.id, lineages[genus_species]]) + "\n")
+            passed_lookup += 1
+        else:
+            if len(org_words) > 2:
+                l_fh.write(
+                    ",".join(
+                        [
+                            record.id,
+                            "",
+                            "",
+                            "",
+                            "",
+                            "",
+                            org_words[0],
+                            org_words[0] + " " + org_words[1],
+                            accs2orgs[record.id],
+                        ]
+                    )
+                    + "\n"
+                )
+            else:
+                l_fh.write(
+                    ",".join(
+                        [
+                            record.id,
+                            "",
+                            "",
+                            "",
+                            "",
+                            "",
+                            org_words[0],
+                            accs2orgs[record.id],
+                            "",
+                        ]
+                    )
+                    + "\n"
+                )
+            ln_fh.write(",".join([record.id, accs2orgs[record.id]]) + "\n")
+            failed_lookup += 1
+
+    logging.info(
+        f"No. of raw records present in `ncbitax2lin` [{os.path.basename(csv)}]: {raw_recs}"
+        + f"\nNo. of valid records collected from `ncbitax2lin` [{os.path.basename(csv)}]: {len(lineages.keys())}"
+        + f"\nNo. of sequences skipped (Sequence length < {min_len}): {skipped_len_short}"
+        + f"\nNo. of records in FASTA [{os.path.basename(fna)}]: {passed_lookup + failed_lookup}"
+        + f"\nNo. of records in GenBank [{os.path.basename(gbk)}]: {len(accs2orgs.keys())}"
+        + f"\nNo. of FASTA records for which new lineages were created: {passed_lookup}"
+        + f"\nNo. of FASTA records for which only genus, species and/or strain information was created: {failed_lookup}"
+        + f"\nNo. of FASTA records for which no GenBank records exist: {gbk_recs_missing}"
+    )
+
+    if (passed_lookup + failed_lookup) != len(accs2orgs.keys()):
+        logging.error(
+            f"The number of GenBank records indexed [{len(accs2orgs.keys())}]"
+            + f"\nis not equal to number of lineages created [{passed_lookup + failed_lookup}]!"
+        )
+        exit(1)
+    else:
+        logging.info("Successfully created lineages and FASTA records! Done!!")
+
+    l_fh.close()
+    ln_fh.close()
+
+
+if __name__ == "__main__":
+    main()
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/create_mqc_data_table.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,153 @@
+#!/usr/bin/env python
+
+import os
+import sys
+from textwrap import dedent
+
+import yaml
+
+
+def main():
+    """
+    Takes a tab-delimited text file with a mandatory header
+    row and generates an HTML table.
+    """
+
+    args = sys.argv
+    if len(args) < 2 or len(args) >= 4:
+        print(
+            f"\nAt least one argument specifying the *.tblsum file is required.\n"
+            + "No more than 2 command-line arguments should be passed.\n"
+        )
+        exit(1)
+
+    table_sum_on = str(args[1]).lower()
+    table_sum_on_file = table_sum_on + ".tblsum.txt"
+    cell_colors = f"{table_sum_on}.cellcolors.yml"
+
+    if len(args) == 3:
+        description = str(args[2])
+    else:
+        description = "The results table shown here is a collection from all samples."
+
+    if os.path.exists(cell_colors) and os.path.getsize(cell_colors) > 0:
+        with open(cell_colors, "r") as cc_yml:
+            cell_colors = yaml.safe_load(cc_yml)
+    else:
+        cell_colors = dict()
+
+    if not (
+        os.path.exists(table_sum_on_file) and os.path.getsize(table_sum_on_file) > 0
+    ):
+        exit(0)
+
+    with open(table_sum_on_file, "r") as tbl:
+        header = tbl.readline()
+        header_cols = header.strip().split("\t")
+
+        html = [
+            dedent(
+                f"""<script type="text/javascript">
+                    $(document).ready(function () {{
+                        $('#cpipes-process-custom-res-{table_sum_on}').DataTable({{
+                            scrollX: true,
+                            fixedColumns: true, dom: 'Bfrtip',
+                            buttons: [
+                                'copy',
+                                {{
+                                    extend: 'print',
+                                    title: 'CPIPES: MultiQC Report: {table_sum_on}'
+                                }},
+                                {{
+                                    extend: 'excel',
+                                    filename: '{table_sum_on}_results',
+                                }},
+                                {{
+                                    extend: 'csv',
+                                    filename: '{table_sum_on}_results',
+                                }}
+                            ]
+                        }});
+                    }});
+                </script>
+                <div class="table-responsive">
+                <style>
+                #cpipes-process-custom-res-{table_sum_on} tr:nth-child(even) {{
+                    background-color: #f2f2f2;
+                }}
+                </style>
+                <table class="table" style="width:100%" id="cpipes-process-custom-res-{table_sum_on}">
+                <thead>
+                <tr>"""
+            )
+        ]
+
+        for header_col in header_cols:
+            html.append(
+                dedent(
+                    f"""
+                        <th> {header_col} </th>"""
+                )
+            )
+
+        html.append(
+            dedent(
+                """
+                </tr>
+                </thead>
+                <tbody>"""
+            )
+        )
+
+        for row in tbl:
+            html.append("<tr>\n")
+            data_cols = row.strip().split("\t")
+            if len(header_cols) != len(data_cols):
+                print(
+                    f"\nWARN: Number of header columns ({len(header_cols)}) and data "
+                    + f"columns ({len(data_cols)}) are not equal!\nWill append empty columns!\n"
+                )
+                if len(header_cols) > len(data_cols):
+                    data_cols += [" "] * (len(header_cols) - len(data_cols))
+                    print(len(data_cols))
+                else:
+                    header_cols += [" "] * (len(data_cols) - len(header_cols))
+
+            html.append(
+                dedent(
+                    f"""
+                        <td><samp>{data_cols[0]}</samp></td>
+                    """
+                )
+            )
+
+            for data_col in data_cols[1:]:
+                data_col_w_color = f"""<td>{data_col}</td>
+                """
+                if (
+                    table_sum_on in cell_colors.keys()
+                    and data_col in cell_colors[table_sum_on].keys()
+                ):
+                    data_col_w_color = f"""<td style="background-color: {cell_colors[table_sum_on][data_col]}">{data_col}</td>
+                    """
+                html.append(dedent(data_col_w_color))
+            html.append("</tr>\n")
+        html.append("</tbody>\n")
+        html.append("</table>\n")
+        html.append("</div>\n")
+
+        mqc_yaml = {
+            "id": f"{table_sum_on.upper()}_collated_table",
+            "section_name": f"{table_sum_on.upper()}",
+            "section_href": f"https://github.com/CFSAN-Biostatistics/nowayout",
+            "plot_type": "html",
+            "description": f"{description}",
+            "data": ("").join(html),
+        }
+
+        with open(f"{table_sum_on.lower()}_mqc.yml", "w") as html_mqc:
+            yaml.dump(mqc_yaml, html_mqc, default_flow_style=False)
+
+
+if __name__ == "__main__":
+    main()
Binary file 0.5.0/bin/dataformat has changed
Binary file 0.5.0/bin/datasets has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/fasta_join.pl	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,88 @@
+#!/usr/bin/env perl
+
+# Kranti Konganti
+# Takes in a gzipped multi-fasta file
+# and joins contigs by 10 N's
+
+use strict;
+use warnings;
+use Cwd;
+use Bio::SeqIO;
+use Getopt::Long;
+use File::Find;
+use File::Basename;
+use File::Spec::Functions;
+
+my ( $in_dir, $out_dir, $suffix, @uncatted_genomes );
+
+GetOptions(
+    'in_dir=s'  => \$in_dir,
+    'out_dir=s' => \$out_dir,
+    'suffix=s'  => \$suffix
+) or die usage();
+
+$in_dir  = getcwd            if ( !defined $in_dir );
+$out_dir = getcwd            if ( !defined $out_dir );
+$suffix  = '_genomic.fna.gz' if ( !defined $suffix );
+
+find(
+    {
+        wanted => sub {
+            push @uncatted_genomes, $File::Find::name if ( $_ =~ m/$suffix$/ );
+        }
+    },
+    $in_dir
+);
+
+if ( $out_dir ne getcwd && !-d $out_dir ) {
+    mkdir($out_dir) || die "\nCannot create directory $out_dir: $!\n\n";
+}
+
+open( my $geno_path, '>genome_paths.txt' )
+  || die "\nCannot open file genome_paths.txt: $!\n\n";
+
+foreach my $uncatted_genome_path (@uncatted_genomes) {
+    my $catted_genome_header = '>' . basename( $uncatted_genome_path, $suffix );
+    $catted_genome_header =~ s/(GC[AF]\_\d+\.\d+)\_*.*/$1/;
+
+    my $catted_genome =
+      catfile( $out_dir, $catted_genome_header . '_scaffolded' . $suffix );
+
+    $catted_genome =~ s/\/\>(GC[AF])/\/$1/;
+
+    print $geno_path "$catted_genome\n";
+
+    open( my $fh, "gunzip -c $uncatted_genome_path |" )
+      || die "\nCannot create pipe for $uncatted_genome_path: $!\n\n";
+
+    open( my $fho, '|-', "gzip -c > $catted_genome" )
+      || die "\nCannot pipe to gzip: $!\n\n";
+
+    my $seq_obj = Bio::SeqIO->new(
+        -fh     => $fh,
+        -format => 'Fasta'
+    );
+
+    my $joined_seq = '';
+    while ( my $seq = $seq_obj->next_seq ) {
+        $joined_seq = $joined_seq . 'NNNNNNNNNN' . $seq->seq;
+    }
+
+    $joined_seq =~ s/NNNNNNNNNN$//;
+    $joined_seq =~ s/^NNNNNNNNNN//;
+
+    # $joined_seq =~ s/.{80}\K/\n/g;
+    # $joined_seq =~ s/\n$//;
+    print $fho $catted_genome_header, "\n", $joined_seq, "\n";
+
+    $seq_obj->close();
+    close $fh;
+    close $fho;
+}
+
+sub usage {
+    print
+"\nUsage: $0 [-in IN_DIR] [-ou OUT_DIR] [-su Filename Suffix for Header]\n\n";
+    exit;
+}
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/fastq_dir_to_samplesheet.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,177 @@
+#!/usr/bin/env python3
+
+import os
+import sys
+import glob
+import argparse
+import re
+
+
+def parse_args(args=None):
+    Description = "Generate samplesheet from a directory of FastQ files."
+    Epilog = "Example usage: python fastq_dir_to_samplesheet.py <FASTQ_DIR> <SAMPLESHEET_FILE>"
+
+    parser = argparse.ArgumentParser(description=Description, epilog=Epilog)
+    parser.add_argument("FASTQ_DIR", help="Folder containing raw FastQ files.")
+    parser.add_argument("SAMPLESHEET_FILE", help="Output samplesheet file.")
+    parser.add_argument(
+        "-st",
+        "--strandedness",
+        type=str,
+        dest="STRANDEDNESS",
+        default="unstranded",
+        help="Value for 'strandedness' in samplesheet. Must be one of 'unstranded', 'forward', 'reverse'.",
+    )
+    parser.add_argument(
+        "-r1",
+        "--read1_extension",
+        type=str,
+        dest="READ1_EXTENSION",
+        default="_R1_001.fastq.gz",
+        help="File extension for read 1.",
+    )
+    parser.add_argument(
+        "-r2",
+        "--read2_extension",
+        type=str,
+        dest="READ2_EXTENSION",
+        default="_R2_001.fastq.gz",
+        help="File extension for read 2.",
+    )
+    parser.add_argument(
+        "-se",
+        "--single_end",
+        dest="SINGLE_END",
+        action="store_true",
+        help="Single-end information will be auto-detected but this option forces paired-end FastQ files to be treated as single-end so only read 1 information is included in the samplesheet.",
+    )
+    parser.add_argument(
+        "-sn",
+        "--sanitise_name",
+        dest="SANITISE_NAME",
+        action="store_true",
+        help="Whether to further sanitise FastQ file name to get sample id. Used in conjunction with --sanitise_name_delimiter and --sanitise_name_index.",
+    )
+    parser.add_argument(
+        "-sd",
+        "--sanitise_name_delimiter",
+        type=str,
+        dest="SANITISE_NAME_DELIMITER",
+        default="_",
+        help="Delimiter to use to sanitise sample name.",
+    )
+    parser.add_argument(
+        "-si",
+        "--sanitise_name_index",
+        type=int,
+        dest="SANITISE_NAME_INDEX",
+        default=1,
+        help="After splitting FastQ file name by --sanitise_name_delimiter all elements before this index (1-based) will be joined to create final sample name.",
+    )
+    return parser.parse_args(args)
+
+
+def fastq_dir_to_samplesheet(
+    fastq_dir,
+    samplesheet_file,
+    strandedness="unstranded",
+    read1_extension="_R1_001.fastq.gz",
+    read2_extension="_R2_001.fastq.gz",
+    single_end=False,
+    sanitise_name=False,
+    sanitise_name_delimiter="_",
+    sanitise_name_index=1,
+):
+    def sanitize_sample(path, extension):
+        """Retrieve sample id from filename"""
+        sample = os.path.basename(path).replace(extension, "")
+        if sanitise_name:
+            if sanitise_name_index > 0:
+                sample = sanitise_name_delimiter.join(
+                    os.path.basename(path).split(sanitise_name_delimiter)[
+                        :sanitise_name_index
+                    ]
+                )
+            # elif sanitise_name_index == -1:
+            #     sample = os.path.basename(path)[ :os.path.basename(path).index('.') ]
+        return sample
+
+    def get_fastqs(extension):
+        """
+        Needs to be sorted to ensure R1 and R2 are in the same order
+        when merging technical replicates. Glob is not guaranteed to produce
+        sorted results.
+        See also https://stackoverflow.com/questions/6773584/how-is-pythons-glob-glob-ordered
+        """
+        abs_fq_files = glob.glob(os.path.join(fastq_dir, f"**", f"*{extension}"), recursive=True)
+        return sorted(
+            [
+                fq for fq in abs_fq_files if re.match('^((?!undetermined|unclassified|downloads).)*$', fq, flags=re.IGNORECASE)
+            ]
+        )
+
+    read_dict = {}
+
+    ## Get read 1 files
+    for read1_file in get_fastqs(read1_extension):
+        sample = sanitize_sample(read1_file, read1_extension)
+        if sample not in read_dict:
+            read_dict[sample] = {"R1": [], "R2": []}
+        read_dict[sample]["R1"].append(read1_file)
+
+    ## Get read 2 files
+    if not single_end:
+        for read2_file in get_fastqs(read2_extension):
+            sample = sanitize_sample(read2_file, read2_extension)
+            read_dict[sample]["R2"].append(read2_file)
+
+    ## Write to file
+    if len(read_dict) > 0:
+        out_dir = os.path.dirname(samplesheet_file)
+        if out_dir and not os.path.exists(out_dir):
+            os.makedirs(out_dir)
+
+        with open(samplesheet_file, "w") as fout:
+            header = ["sample", "fq1", "fq2", "strandedness"]
+            fout.write(",".join(header) + "\n")
+            for sample, reads in sorted(read_dict.items()):
+                for idx, read_1 in enumerate(reads["R1"]):
+                    read_2 = ""
+                    if idx < len(reads["R2"]):
+                        read_2 = reads["R2"][idx]
+                    sample_info = ",".join([sample, read_1, read_2, strandedness])
+                    fout.write(f"{sample_info}\n")
+    else:
+        error_str = (
+            "\nWARNING: No FastQ files found so samplesheet has not been created!\n\n"
+        )
+        error_str += "Please check the values provided for the:\n"
+        error_str += "  - Path to the directory containing the FastQ files\n"
+        error_str += "  - '--read1_extension' parameter\n"
+        error_str += "  - '--read2_extension' parameter\n"
+        print(error_str)
+        sys.exit(1)
+
+
+def main(args=None):
+    args = parse_args(args)
+
+    strandedness = "unstranded"
+    if args.STRANDEDNESS in ["unstranded", "forward", "reverse"]:
+        strandedness = args.STRANDEDNESS
+
+    fastq_dir_to_samplesheet(
+        fastq_dir=args.FASTQ_DIR,
+        samplesheet_file=args.SAMPLESHEET_FILE,
+        strandedness=strandedness,
+        read1_extension=args.READ1_EXTENSION,
+        read2_extension=args.READ2_EXTENSION,
+        single_end=args.SINGLE_END,
+        sanitise_name=args.SANITISE_NAME,
+        sanitise_name_delimiter=args.SANITISE_NAME_DELIMITER,
+        sanitise_name_index=args.SANITISE_NAME_INDEX,
+    )
+
+
+if __name__ == "__main__":
+    sys.exit(main())
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/gen_otf_genome.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,227 @@
+#!/usr/bin/env python3
+
+# Kranti Konganti
+
+import argparse
+import glob
+import gzip
+import inspect
+import logging
+import os
+import pprint
+import re
+
+# Set logging.
+logging.basicConfig(
+    format="\n"
+    + "=" * 55
+    + "\n%(asctime)s - %(levelname)s\n"
+    + "=" * 55
+    + "\n%(message)s\n\n",
+    level=logging.DEBUG,
+)
+
+# Debug print.
+ppp = pprint.PrettyPrinter(width=50, indent=4)
+
+
+# Multiple inheritance for pretty printing of help text.
+class MultiArgFormatClasses(
+    argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter
+):
+    pass
+
+
+def main() -> None:
+    """
+    This script works only in the context of a Nextflow workflow.
+    It takes:
+        1. A text file containing accessions or FASTA IDs, one per line and
+            then,
+        2. Optionally, searches for a genome FASTA file in gzipped format in specified
+            search path, where the prefix of the filename is the accession or
+            FASTA ID from 1. and then, creates a new concatenated gzipped genome FASTA
+            file with all the genomes in the text file from 1.
+        3. Creates a new gzipped FASTA file with the reads aligned to the accessions
+            in the text file from 1.
+    """
+
+    prog_name = os.path.basename(inspect.stack()[0].filename)
+
+    parser = argparse.ArgumentParser(
+        prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses
+    )
+
+    required = parser.add_argument_group("required arguments")
+
+    required.add_argument(
+        "-txt",
+        dest="accs_txt",
+        default=False,
+        required=True,
+        help="Absolute UNIX path to .txt file containing accessions or\n"
+        + "FASTA IDs, one per line.",
+    )
+    required.add_argument(
+        "-op",
+        dest="out_prefix",
+        default="CATTED_GENOMES",
+        required=True,
+        help="Set the output file prefix for .fna.gz and .txt\n" + "files.",
+    )
+    parser.add_argument(
+        "-gd",
+        dest="genomes_dir",
+        default=False,
+        required=False,
+        help="Absolute UNIX path to a directory containing\n"
+        + "gzipped genome FASTA files or a file.\n",
+    )
+    parser.add_argument(
+        "-gds",
+        dest="genomes_dir_suffix",
+        default="_scaffolded_genomic.fna.gz",
+        required=False,
+        help="Genome FASTA file suffix to search for\nin the directory mentioned using\n-gd.",
+    )
+    parser.add_argument(
+        "-query",
+        dest="id_is_query",
+        default=False,
+        action="store_true",
+        required=False,
+        help="In the produced FASTA file, use the KMA query ID as the FASTA ID\n"
+        + "instead of the template ID.",
+    )
+    parser.add_argument(
+        "-txts",
+        dest="accs_suffix",
+        default="_template_hits.txt",
+        required=False,
+        help="The suffix of the file supplied with -txt option. It is assumed that the\n"
+        + "sample name is present in the file supplied with -txt option and the suffix\n"
+        + "will be stripped and stored in a file that logs samples which have no hits.",
+    )
+    parser.add_argument(
+        "-frag_delim",
+        dest="frag_delim",
+        default="\t",
+        required=False,
+        help="The delimiter by which the fields are separated in *_frag.gz file.",
+    )
+
+    args = parser.parse_args()
+    accs_txt = args.accs_txt
+    genomes_dir = args.genomes_dir
+    genomes_dir_suffix = args.genomes_dir_suffix
+    id_is_query = args.id_is_query
+    out_prefix = args.out_prefix
+    accs_suffix = args.accs_suffix
+    frag_delim = args.frag_delim
+    accs_seen = dict()
+    cat_genomes_gz = os.path.join(os.getcwd(), out_prefix + "_" + genomes_dir_suffix)
+    cat_genomes_gz = re.sub("__", "_", str(cat_genomes_gz))
+    frags_gz = os.path.join(os.getcwd(), out_prefix + ".frag.gz")
+    cat_reads_gz = os.path.join(os.getcwd(), out_prefix + "_aln_reads.fna.gz")
+    cat_reads_gz = re.sub("__", "_", cat_reads_gz)
+
+    if (
+        accs_txt
+        and os.path.exists(cat_genomes_gz)
+        and os.path.getsize(cat_genomes_gz) > 0
+    ):
+        logging.error(
+            "A concatenated genome FASTA file,\n"
+            + f"{os.path.basename(cat_genomes_gz)} already exists in:\n"
+            + f"{os.getcwd()}\n"
+            + "Please remove or move it as we will not "
+            + "overwrite it."
+        )
+        exit(1)
+
+    if accs_txt and (not os.path.exists(accs_txt) or not os.path.getsize(accs_txt) > 0):
+        logging.error("File,\n" + f"{accs_txt}\ndoes not exist " + "or is empty!")
+        failed_sample_name = re.sub(accs_suffix, "", os.path.basename(accs_txt))
+        with open(
+            os.path.join(os.getcwd(), "_".join([out_prefix, "FAILED.txt"])), "w"
+        ) as failed_sample_fh:
+            failed_sample_fh.write(f"{failed_sample_name}\n")
+        failed_sample_fh.close()
+        exit(0)
+
+    # ppp.pprint(mash_hits)
+    empty_lines = 0
+    empty_lines_msg = ""
+
+    with open(accs_txt, "r") as accs_txt_fh:
+        for line in accs_txt_fh:
+            if not line.strip():
+                empty_lines += 1
+                continue
+            else:
+                line = line.strip()
+
+            if line in accs_seen.keys():
+                continue
+            else:
+                accs_seen[line] = 1
+    accs_txt_fh.close()
+
+    if genomes_dir:
+        if not os.path.isdir(genomes_dir):
+            logging.error("UNIX path\n" + f"{genomes_dir}\n" + "does not exist!")
+            exit(1)
+        if len(glob.glob(os.path.join(genomes_dir, "*" + genomes_dir_suffix))) <= 0:
+            logging.error(
+                "Genomes directory\n"
+                + f"{genomes_dir}"
+                + "\ndoes not seem to have any\n"
+                + f"files ending with suffix: {genomes_dir_suffix}"
+            )
+            exit(1)
+
+        with open(cat_genomes_gz, "wb") as genomes_out_gz:
+            for line in accs_seen.keys():
+                genome_file = os.path.join(genomes_dir, line + genomes_dir_suffix)
+
+                if not os.path.exists(genome_file) or os.path.getsize(genome_file) <= 0:
+                    logging.error(
+                        f"Genome file {os.path.basename(genome_file)} does not\n"
+                        + "exist or is empty!"
+                    )
+                    exit(1)
+                else:
+                    with open(genome_file, "rb") as genome_file_h:
+                        genomes_out_gz.writelines(genome_file_h.readlines())
+                    genome_file_h.close()
+        genomes_out_gz.close()
+
+    if (
+        len(accs_seen.keys()) > 0
+        and os.path.exists(frags_gz)
+        and os.path.getsize(frags_gz) > 0
+    ):
+        with gzip.open(
+            cat_reads_gz, "wt", encoding="utf-8", compresslevel=6
+        ) as cat_reads_gz_fh:
+            with gzip.open(frags_gz, "rb", compresslevel=6) as fragz_gz_fh:
+                fasta_id = 7 if id_is_query else 6
+                for frag_line in fragz_gz_fh:
+                    frag_lines = frag_line.decode("utf-8").strip().split(frag_delim)
+                    # Per KMA specification, 6=template, 7=query, 1=read
+                    cat_reads_gz_fh.write(f">{frag_lines[fasta_id]}\n{frag_lines[0]}\n")
+            fragz_gz_fh.close()
+        cat_reads_gz_fh.close()
+
+        if empty_lines > 0:
+            empty_lines_msg = f"Skipped {empty_lines} empty line(s).\n"
+
+        logging.info(
+            empty_lines_msg
+            + f"File {os.path.basename(cat_genomes_gz)}\n"
+            + f"written in:\n{os.getcwd()}\nDone! Bye!"
+        )
+
+
+if __name__ == "__main__":
+    main()
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/gen_per_species_fa_from_bold.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,437 @@
+#!/usr/bin/env python3
+
+import argparse
+import gzip
+import inspect
+import logging
+import os
+import pprint
+import re
+import shutil
+from collections import defaultdict
+from typing import BinaryIO, TextIO, Union
+
+from Bio import SeqIO
+from Bio.Seq import Seq
+from Bio.SeqRecord import SeqRecord
+
+
+# Multiple inheritance for pretty printing of help text.
+class MultiArgFormatClasses(
+    argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter
+):
+    pass
+
+
+def get_lineages(csv: os.PathLike, cols: list) -> tuple:
+    """
+    Parse the output from the `ncbitax2lin` tool and return
+    a dict of lineages keyed by strain name (or by species
+    name if strain is empty), plus the raw record count.
+    """
+    lineages = dict()
+    if csv is None or not (os.path.exists(csv) and os.path.getsize(csv) > 0):
+        logging.error(
+            f"The CSV file [{os.path.basename(csv)}] is empty or does not exist!"
+        )
+        exit(1)
+
+    logging.info(f"Indexing {os.path.basename(csv)}...")
+
+    with open(csv, "r") as csv_fh:
+        header_cols = csv_fh.readline().strip().split(",")
+        user_req_cols = [
+            tcol_i for tcol_i, tcol in enumerate(header_cols) if tcol in cols
+        ]
+        cols_not_found = [tcol for tcol in cols if tcol not in header_cols]
+        raw_recs = 0
+
+        if len(cols_not_found) > 0:
+            logging.error(
+                f"The following columns do not exist in the"
+                + f"\nCSV file [ {os.path.basename(csv)} ]:\n"
+                + ", ".join(cols_not_found)
+            )
+            exit(1)
+        elif len(user_req_cols) != 9:
+            logging.error(
+                "Exactly 9 columns are needed!"
+                + "\ntax_id,superkingdom,phylum,class,order,family,genus,species,strain"
+            )
+            exit(1)
+
+        for tax in csv_fh:
+            raw_recs += 1
+            lcols = tax.strip().split(",")
+
+            if bool(lcols[user_req_cols[8]]):
+                lineages[lcols[user_req_cols[8]]] = ",".join(
+                    [lcols[l] for l in user_req_cols[1:]]
+                )
+            elif bool(lcols[user_req_cols[7]]):
+                lineages[lcols[user_req_cols[7]]] = ",".join(
+                    [lcols[l] for l in user_req_cols[1:8]] + [str()]
+                )
+
+    csv_fh.close()
+    return lineages, raw_recs
+
+
+def write_fasta(recs: list, basedir: os.PathLike, name: str, suffix: str) -> None:
+    """
+    Write sequence with no description to a specified file.
+    """
+    SeqIO.write(
+        recs,
+        os.path.join(basedir, name + suffix),
+        "fasta",
+    )
+
+
+def check_and_get_cols(pat: re.Pattern, cols: str, delim: str) -> list:
+    """
+    Check if header column matches the pattern and return
+    columns.
+    """
+    if not pat.match(cols):
+        logging.error(
+            f"Supplied columns' names {cols} should only have words"
+            f"\n(alphanumeric) separated by: {delim}."
+        )
+        exit(1)
+    else:
+        cols = re.sub("\n", "", cols).split(delim)
+
+    return cols
+
+
+def parse_tsv(fh: Union[TextIO, BinaryIO], tcols: list, delim: str) -> tuple:
+    """
+    Parse the TSV file and produce the required per
+    species FASTA's.
+    """
+    records, sp2accs = (defaultdict(list), defaultdict(list))
+    header = fh.readline().strip().split(delim)
+    raw_recs = 0
+
+    if not all(col in header for col in tcols):
+        logging.error(
+            "The following columns were not found in the"
+            + f"\nheader row of file {os.path.basename(fh.name)}\n"
+            + "\n".join([ele for ele in tcols if ele not in header])
+        )
+        exit(1)
+    id_i, genus_i, species_i, strain_i, seq_i = [
+        header.index(col) for col in tcols
+    ]
+
+    for record in fh:
+        raw_recs += 1
+
+        id = record.strip().split(delim)[id_i]
+        genus = record.strip().split(delim)[genus_i]
+        species = re.sub(r"[\/\\]+", "-", record.strip().split(delim)[species_i])
+        strain = record.strip().split(delim)[strain_i]
+        seq = re.sub(r"[^ATGC]+", "", record.strip().split(delim)[seq_i], flags=re.IGNORECASE)
+
+        if re.match(r"None|Null", species, re.IGNORECASE):
+            continue
+
+        # print(id)
+        # print(genus)
+        # print(species)
+        # print(strain)
+        # print(seq)
+
+        records.setdefault(species, []).append(
+            SeqRecord(Seq(seq), id=id, description=str())
+        )
+        sp2accs.setdefault(species, []).append(id)
+
+    logging.info(f"Collected FASTA records for {len(records.keys())} species'.")
+    fh.close()
+    return records, sp2accs, raw_recs
+
+
+# Main
+def main() -> None:
+    """
+    This script takes:
+        1. The TSV file from BOLD systems,
+        2. A .csv file generated by `ncbitax2lin`,
+
+    and then generates a folder containing individual FASTA sequence files
+    per species. This is only possible if the full taxonomy of the barcode
+    sequence is present in the FASTA header.
+    """
+
+    # Set logging.
+    logging.basicConfig(
+        format="\n"
+        + "=" * 55
+        + "\n%(asctime)s - %(levelname)s\n"
+        + "=" * 55
+        + "\n%(message)s\r\r",
+        level=logging.DEBUG,
+    )
+
+    # Debug print.
+    ppp = pprint.PrettyPrinter(width=55)
+    prog_name = os.path.basename(inspect.stack()[0].filename)
+
+    parser = argparse.ArgumentParser(
+        prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses
+    )
+
+    required = parser.add_argument_group("required arguments")
+
+    required.add_argument(
+        "-tsv",
+        dest="tsv",
+        default=False,
+        required=True,
+        help="Absolute UNIX path to the TSV file from BOLD systems"
+        + "\nin uncompressed TXT format.",
+    )
+    required.add_argument(
+        "-csv",
+        dest="csv",
+        default=False,
+        required=True,
+        help="Absolute UNIX path to .csv or .csv.gz file which is generated "
+        + "\nby the `ncbitax2lin` tool.",
+    )
+    parser.add_argument(
+        "-out",
+        dest="out_folder",
+        default=os.path.join(os.getcwd(), "species"),
+        required=False,
+        help="By default, the output is written to this\nfolder.",
+    )
+    parser.add_argument(
+        "-f",
+        dest="force_write_out",
+        default=False,
+        action="store_true",
+        required=False,
+        help="Force overwrite output directory contents.",
+    )
+    parser.add_argument(
+        "-suffix",
+        dest="fna_suffix",
+        default=".fna",
+        required=False,
+        help="Suffix of the individual species FASTA files\nthat will be saved.",
+    )
+    parser.add_argument(
+        "-ccols",
+        dest="csv_cols",
+        default="tax_id,superkingdom,phylum,class,order,family,genus,species,strain",
+        required=False,
+        help="Taxonomic lineage will be built using these columns from the output of"
+        + "\n`ncbitax2lin`\ntool.",
+    )
+    parser.add_argument(
+        "-ccols-sep",
+        dest="csv_delim",
+        default=",",
+        required=False,
+        help="The delimiter of the fields in the CSV file.",
+    )
+    parser.add_argument(
+        "-tcols",
+        dest="tsv_cols",
+        default="processid\tgenus\tspecies\tsubspecies\tnucraw",
+        required=False,
+        help="For each species, the nucleotide sequences will be\naggregated.",
+    )
+    parser.add_argument(
+        "-tcols-sep",
+        dest="tsv_delim",
+        default="\t",
+        required=False,
+        help="The delimiter of the fields in the TSV file.",
+    )
+
+    # Parse defaults
+    args = parser.parse_args()
+    tsv = args.tsv
+    csv = args.csv
+    csep = args.csv_delim
+    tsep = args.tsv_delim
+    csv_cols = args.csv_cols
+    tsv_cols = args.tsv_cols
+    out = args.out_folder
+    overwrite = args.force_write_out
+    fna_suffix = args.fna_suffix
+    ccols_pat = re.compile(rf"^[\w{re.escape(csep)}]+?\w$")
+    tcols_pat = re.compile(rf"^[\w{re.escape(tsep)}]+?\w$")
+    final_lineages = os.path.join(out, "lineages.csv")
+    lineages_not_found = os.path.join(out, "lineages_not_found.csv")
+    base_fasta_dir = os.path.join(out, "fasta")
+
+    # Basic checks
+    if not overwrite and os.path.exists(out):
+        logging.warning(
+            f"Output destination [{os.path.basename(out)}] already exists!"
+            + "\nPlease use -f to delete and overwrite."
+        )
+    elif overwrite and os.path.exists(out):
+        logging.info(f"Overwrite requested. Deleting {os.path.basename(out)}...")
+        shutil.rmtree(out)
+
+    # Validate user requested columns
+    passed_ccols = check_and_get_cols(ccols_pat, csv_cols, csep)
+    passed_tcols = check_and_get_cols(tcols_pat, tsv_cols, tsep)
+
+    # Get taxonomy from ncbitax2lin
+    lineages, raw_recs = get_lineages(csv, passed_ccols)
+
+    # Finally, read BOLD tsv if lineage exists.
+    logging.info("Creating new sequences per species...")
+
+    if not os.path.exists(out):
+        os.makedirs(out)
+
+    try:
+        gz_fh = gzip.open(tsv, "rt")
+        records, sp2accs, traw_recs = parse_tsv(gz_fh, passed_tcols, tsep)
+    except gzip.BadGzipFile:
+        logging.info(f"Input TSV file {os.path.basename(tsv)} is not in\nGZIP format.")
+        txt_fh = open(tsv, "r")
+        records, sp2accs, traw_recs = parse_tsv(txt_fh, passed_tcols, tsep)
+
+    passed_tax_check = 0
+    failed_tax_check = 0
+    fasta_recs_written = 0
+    l_fh = open(final_lineages, "w")
+    ln_fh = open(lineages_not_found, "w")
+    l_fh.write(
+        "identifiers,superkingdom,phylum,class,order,family,genus,species,strain\n"
+    )
+    ln_fh.write("fna_id,parsed_org\n")
+
+    if not os.path.exists(base_fasta_dir):
+        os.makedirs(base_fasta_dir)
+
+    for genus_species in records.keys():
+        fasta_recs_written += len(records[genus_species])
+        write_fasta(
+            records[genus_species],
+            base_fasta_dir,
+            "_".join(genus_species.split(" ")),
+            fna_suffix,
+        )
+        org_words = genus_species.split(" ")
+
+        for id in sp2accs[genus_species]:
+            if genus_species in lineages.keys():
+                this_line = ",".join([id, lineages[genus_species]]) + "\n"
+
+                if len(org_words) > 2:
+                    this_line = (
+                        ",".join(
+                            [id, lineages[genus_species].rstrip(","), genus_species]
+                        )
+                        + "\n"
+                    )
+
+                l_fh.write(this_line)
+                passed_tax_check += 1
+            else:
+                this_line = (
+                    ",".join(
+                        [
+                            id,
+                            "",
+                            "",
+                            "",
+                            "",
+                            "",
+                            org_words[0],
+                            genus_species,
+                            "",
+                        ]
+                    )
+                    + "\n"
+                )
+                if len(org_words) > 2:
+                    this_line = (
+                        ",".join(
+                            [
+                                id,
+                                "",
+                                "",
+                                "",
+                                "",
+                                "",
+                                org_words[0],
+                                org_words[0] + " " + org_words[1],
+                                genus_species,
+                            ]
+                        )
+                        + "\n"
+                    )
+                l_fh.write(this_line)
+                ln_fh.write(",".join([id, genus_species]) + "\n")
+                failed_tax_check += 1
+
+    logging.info(
+        f"No. of raw records present in `ncbitax2lin` [{os.path.basename(csv)}]: {raw_recs}"
+        + f"\nNo. of valid records collected from `ncbitax2lin` [{os.path.basename(csv)}]: {len(lineages.keys())}"
+        + f"\nNo. of raw records in TSV [{os.path.basename(tsv)}]: {traw_recs}"
+        + f"\nNo. of valid records in TSV [{os.path.basename(tsv)}]: {passed_tax_check + failed_tax_check}"
+        + f"\nNo. of FASTA records for which new lineages were created: {passed_tax_check}"
+        + f"\nNo. of FASTA records for which only genus, species and/or strain information were created: {failed_tax_check}"
+    )
+
+    if (passed_tax_check + failed_tax_check) != fasta_recs_written:
+        logging.error(
+            f"The number of input FASTA records [{fasta_recs_written}]"
+            + f"\nis not equal to number of lineages created [{passed_tax_check + failed_tax_check}]!"
+        )
+        exit(1)
+    else:
+        logging.info("Successfully created lineages and FASTA records! Done!!")
+
+
+if __name__ == "__main__":
+    main()
+
+# ~/apps/nowayout/bin/gen_per_species_fa_from_bold.py -tsv BOLD_Public.05-Feb-2024.tsv -csv ../tax.csv
+
+# =======================================================
+# 2024-02-08 21:37:28,541 - INFO
+# =======================================================
+# Indexing tax.csv...
+
+# =======================================================
+# 2024-02-08 21:38:06,567 - INFO
+# =======================================================
+# Creating new squences per species...
+
+# =======================================================
+# 2024-02-08 21:38:06,572 - INFO
+# =======================================================
+# Input TSV file BOLD_Public.05-Feb-2024.tsv is not in
+# GZIP format.
+
+# =======================================================
+# 2024-02-08 22:01:04,554 - INFO
+# =======================================================
+# Collected FASTA records for 497421 species'.
+
+# =======================================================
+# 2024-02-08 22:24:35,000 - INFO
+# =======================================================
+# No. of raw records present in `ncbitax2lin` [tax.csv]: 2550767
+# No. of valid records collected from `ncbitax2lin` [tax.csv]: 2134980
+# No. of raw records in TSV [BOLD_Public.05-Feb-2024.tsv]: 9735210
+# No. of valid records in TSV [BOLD_Public.05-Feb-2024.tsv]: 4988323
+# No. of FASTA records for which new lineages were created: 4069202
+# No. of FASTA records for which only genus, species and/or strain information were created: 919121
+
+# =======================================================
+# 2024-02-08 22:24:35,001 - INFO
+# =======================================================
+# Succesfully created lineages and FASTA records! Done!!
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/gen_per_species_fa_from_lin.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,289 @@
+#!/usr/bin/env python3
+
+import argparse
+import gzip
+import inspect
+import logging
+import os
+import pprint
+import re
+import shutil
+from collections import defaultdict
+from typing import BinaryIO, TextIO, Union
+
+from Bio import SeqIO
+from Bio.Seq import Seq
+from Bio.SeqRecord import SeqRecord
+
+
+# Multiple inheritance for pretty printing of help text.
+class MultiArgFormatClasses(
+    argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter
+):
+    pass
+
+
+def get_lineages(csv: os.PathLike) -> dict:
+    """
+    Parse the lineages.csv file and return a dict that
+    maps each accession to its sanitized species name.
+    """
+    lineages = dict()
+    if csv is None or not (os.path.exists(csv) and os.path.getsize(csv) > 0):
+        logging.error(
+            f"The CSV file [{os.path.basename(csv)}] is empty or does not exist!"
+        )
+        exit(1)
+
+    logging.info(f"Indexing {os.path.basename(csv)}...")
+
+    with open(csv, "r") as csv_fh:
+        _ = csv_fh.readline().strip().split(",")
+        for line in csv_fh:
+            cols = line.strip().split(",")
+
+            if len(cols) < 9:
+                logging.error(
+                    f"The CSV file {os.path.basename(csv)} should have a mandatory 9 columns."
+                    + "\n\nEx: identifiers,superkingdom,phylum,class,order,family,genus,species,strain"
+                    + "\nAB211151.1,Eukaryota,Arthropoda,Malacostraca,Decapoda,Majidae,Chionoecetes,Chionoecetes opilio,"
+                    + f"\n\nGot:\n{line}"
+                )
+                exit(1)
+
+            lineages[cols[0]] = re.sub(r"\W+", "-", "_".join(cols[7].split(" ")))
+
+    csv_fh.close()
+    return lineages
+
+
+def get_unique_dir(file_num: int, k=3) -> str:
+    """
+    Return a unique directory name to manage
+    hundreds of thousands of files.
+    """
+    dir_name = list()
+    for i in range(k - 1, -1, -1):
+        letter = chr(ord("A") + (file_num // (26**i)) % 26)  # Start from ASCII 65 (A).
+        dir_name.append(letter)
+    return "".join(dir_name)
+
+
+def write_fasta(
+    recs: list, basedir: os.PathLike, name: str, suffix: str, name_re: re.Pattern
+) -> None:
+    """
+    Write sequence with no description to a specified file.
+    """
+    sanitized_name = name_re.sub("", name)
+
+    if not os.path.exists(basedir):
+        print(f"Writing into {os.path.basename(basedir)}")
+        os.makedirs(basedir)
+
+    SeqIO.write(
+        recs,
+        os.path.join(basedir, sanitized_name + suffix),
+        "fasta",
+    )
+
+
+def parse_fasta(fh: Union[TextIO, BinaryIO], sp2accs: dict) -> dict:
+    """
+    Parse the sequences and create per species FASTA record.
+    """
+    records = defaultdict()
+
+    for record in SeqIO.parse(fh, "fasta"):
+
+        id = record.id
+        seq = record.seq
+
+        if id in sp2accs.keys():
+            records.setdefault(sp2accs[id], []).append(
+                SeqRecord(Seq(seq), id=id, description=str())
+            )
+        else:
+            print(f"Lineage row does not exist for accession: {id}")
+
+    logging.info(f"Collected FASTA records for {len(records.keys())} species'.")
+    fh.close()
+    return records
+
+
+# Main
+def main() -> None:
+    """
+    This script takes:
+        1. The FASTA file and,
+        2. Takes the corresponding lineages.csv file and,
+
+    then generates a folder containing individual FASTA sequence files
+    per species.
+    """
+
+    # Set logging.
+    logging.basicConfig(
+        format="\n"
+        + "=" * 55
+        + "\n%(asctime)s - %(levelname)s\n"
+        + "=" * 55
+        + "\n%(message)s\r\r",
+        level=logging.DEBUG,
+    )
+
+    # Debug print.
+    ppp = pprint.PrettyPrinter(width=55)
+    prog_name = os.path.basename(inspect.stack()[0].filename)
+
+    parser = argparse.ArgumentParser(
+        prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses
+    )
+
+    required = parser.add_argument_group("required arguments")
+
+    required.add_argument(
+        "-fa",
+        dest="fna",
+        default=False,
+        required=True,
+        help="Absolute UNIX path to the FASTA file that corresponds"
+        + "\nto the lineages.csv file.",
+    )
+    required.add_argument(
+        "-csv",
+        dest="csv",
+        default=False,
+        required=True,
+        help="Absolute UNIX path to lineages.csv which has a guaranteed 9 "
+        + "\ncolumns with the first being an accession.",
+    )
+    parser.add_argument(
+        "-out",
+        dest="out_folder",
+        default=os.path.join(os.getcwd(), "species"),
+        required=False,
+        help="By default, the output is written to this\nfolder.",
+    )
+    parser.add_argument(
+        "-f",
+        dest="force_write_out",
+        default=False,
+        action="store_true",
+        required=False,
+        help="Force overwrite output directory contents.",
+    )
+    parser.add_argument(
+        "-suffix",
+        dest="fna_suffix",
+        default=".fna",
+        required=False,
+        help="Suffix of the individual species FASTA files\nthat will be saved.",
+    )
+    parser.add_argument(
+        "-dlen",
+        dest="dir_name_len",
+        default=3,
+        type=int,
+        help="Length of the unique directory name\nthat will be generated to manage FASTA files.",
+    )
+    parser.add_argument(
+        "-nfiles",
+        dest="num_files_per_dir",
+        default=1000,
+        type=int,
+        help="Number of FASTA files per unique directory.",
+    )
+
+    # Parse defaults
+    args = parser.parse_args()
+    csv = args.csv
+    fna = args.fna
+    out = args.out_folder
+    overwrite = args.force_write_out
+    fna_suffix = args.fna_suffix
+    dir_name_len = args.dir_name_len
+    num_files_per_dir = args.num_files_per_dir
+    name_re = re.compile(r"^\W+")
+
+    # Basic checks
+    if not overwrite and os.path.exists(out):
+        logging.warning(
+            f"Output destination [{os.path.basename(out)}] already exists!"
+            + "\nPlease use -f to delete and overwrite."
+        )
+    elif overwrite and os.path.exists(out):
+        logging.info(f"Overwrite requested. Deleting {os.path.basename(out)}...")
+        shutil.rmtree(out)
+
+    # Get accession-to-species mapping from lineages.csv
+    lineages = get_lineages(csv)
+
+    logging.info("Creating new sequences per species...")
+
+    if not os.path.exists(out):
+        os.makedirs(out)
+
+    try:
+        gz_fh = gzip.open(fna, "rt")
+        fa_recs = parse_fasta(gz_fh, lineages)
+    except gzip.BadGzipFile:
+        logging.info(
+            f"Input FASTA file {os.path.basename(fna)} is not in\nGZIP format."
+        )
+        txt_fh = open(fna, "r")
+        fa_recs = parse_fasta(txt_fh, lineages)
+    finally:
+        logging.info("Assigned FASTA records per species...")
+
+    logging.info("Writing FASTA records per species...")
+
+    species_list = list(fa_recs.keys())
+    for i in range(0, len(species_list)):
+        sp = species_list[i]
+        unique_out_dir = os.path.join(
+            out, get_unique_dir(i // num_files_per_dir, k=dir_name_len)
+        )
+        write_fasta(fa_recs[sp], unique_out_dir, sp, fna_suffix, name_re)
+
+
+if __name__ == "__main__":
+    main()
+
+# ~/apps/nowayout/bin/gen_per_species_fa_from_bold.py -tsv BOLD_Public.05-Feb-2024.tsv -csv ../tax.csv
+
+# =======================================================
+# 2024-02-08 21:37:28,541 - INFO
+# =======================================================
+# Indexing tax.csv...
+
+# =======================================================
+# 2024-02-08 21:38:06,567 - INFO
+# =======================================================
+# Creating new squences per species...
+
+# =======================================================
+# 2024-02-08 21:38:06,572 - INFO
+# =======================================================
+# Input TSV file BOLD_Public.05-Feb-2024.tsv is not in
+# GZIP format.
+
+# =======================================================
+# 2024-02-08 22:01:04,554 - INFO
+# =======================================================
+# Collected FASTA records for 497421 species'.
+
+# =======================================================
+# 2024-02-08 22:24:35,000 - INFO
+# =======================================================
+# No. of raw records present in `ncbitax2lin` [tax.csv]: 2550767
+# No. of valid records collected from `ncbitax2lin` [tax.csv]: 2134980
+# No. of raw records in TSV [BOLD_Public.05-Feb-2024.tsv]: 9735210
+# No. of valid records in TSV [BOLD_Public.05-Feb-2024.tsv]: 4988323
+# No. of FASTA records for which new lineages were created: 4069202
+# No. of FASTA records for which only genus, species and/or strain information were created: 919121
+
+# =======================================================
+# 2024-02-08 22:24:35,001 - INFO
+# =======================================================
+# Succesfully created lineages and FASTA records! Done!!
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/gen_salmon_tph_and_krona_tsv.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,523 @@
+#!/usr/bin/env python3
+
+# Kranti Konganti
+# 03/06/2024
+
+import argparse
+import glob
+import inspect
+import logging
+import os
+import pprint
+import re
+from collections import defaultdict
+
+
+# Multiple inheritance for pretty printing of help text.
+class MultiArgFormatClasses(
+    argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter
+):
+    pass
+
+
+# Main
+def main() -> None:
+    """
+    The successful execution of this script requires access to a properly formatted
+    lineages.csv file which has no more than 9 columns.
+
+    It takes the lineages.csv file, the *_hits.csv results from `sourmash gather`
+    mentioned with -smres option and and a root parent directory of the
+    `salmon quant` results mentioned with -sal option and generates a final
+    results table with the TPM values and a .krona.tsv file for each sample
+    to be used by KronaTools.
+    """
+    # Set logging.
+    logging.basicConfig(
+        format="\n"
+        + "=" * 55
+        + "\n%(asctime)s - %(levelname)s\n"
+        + "=" * 55
+        + "\n%(message)s\n\n",
+        level=logging.DEBUG,
+    )
+
+    # Debug print.
+    ppp = pprint.PrettyPrinter(width=55)
+    prog_name = inspect.stack()[0].filename
+
+    parser = argparse.ArgumentParser(
+        prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses
+    )
+
+    required = parser.add_argument_group("required arguments")
+
+    required.add_argument(
+        "-sal",
+        dest="salmon_res_dir",
+        default=False,
+        required=True,
+        help="Absolute UNIX path to the parent directory that contains the\n"
+        + "`salmon quant` results. For example, if path to\n"
+        + "`quant.sf` is in /hpc/john_doe/test/salmon_res/sampleA/quant.sf, then\n"
+        + "use this command-line option as:\n"
+        + "-sal /hpc/john_doe/test/salmon_res",
+    )
+    required.add_argument(
+        "-lin",
+        dest="lin",
+        default=False,
+        required=True,
+        help="Absolute UNIX Path to the lineages CSV file.\n"
+        + "This file should have only 9 columns.",
+    )
+    required.add_argument(
+        "-smres",
+        dest="sm_res_dir",
+        default=False,
+        required=True,
+        help="Absolute UNIX path to the parent directory that contains the\n"
+        + "filtered `sourmash gather` results. For example, if path to\n"
+        + "`sampleA.csv` is in /hpc/john_doe/test/sourmash_gather/sampleA.csv,\n"
+        + "then use this command-line option as:\n"
+        + "-smres /hpc/john_doe/test/sourmash_gather",
+    )
+    parser.add_argument(
+        "-op",
+        dest="out_prefix",
+        default="nowayout.tblsum",
+        required=False,
+        help="Set the output file(s) prefix for output(s) generated\n"
+        + "by this program.",
+    )
+    parser.add_argument(
+        "-sf",
+        dest="scale_down_factor",
+        default=float(10000),
+        required=False,
+        help="Set the scaling factor by which TPM values are scaled\ndown.",
+    )
+    parser.add_argument(
+        "-smres-suffix",
+        dest="sm_res_suffix",
+        default="_hits.csv",
+        required=False,
+        help="Find the `sourmash gather` result files ending in this\nsuffix.",
+    )
+    parser.add_argument(
+        "-failed-suffix",
+        dest="failed_suffix",
+        default="_FAILED.txt",
+        required=False,
+        help="Find the sample names which failed classification stored\n"
+        + "inside the files ending in this suffix.",
+    )
+    parser.add_argument(
+        "-num-lin-cols",
+        dest="num_lin_cols",
+        default=int(9),
+        type=int,
+        help="Number of columns expected in the lineages CSV file.",
+    )
+    parser.add_argument(
+        "-lin-acc-regex",
+        dest="lin_acc_regex",
+        default=re.compile(r"\w+[\-\.]{1}[0-9]+"),
+        type=re.compile,
+        help="The pattern of the lineage's accession.",
+    )
+
+    args = parser.parse_args()
+    salmon_res_dir = args.salmon_res_dir
+    sm_res_dir = args.sm_res_dir
+    sm_res_suffix = args.sm_res_suffix
+    failed_suffix = args.failed_suffix
+    out_prefix = args.out_prefix
+    lin = args.lin
+    num_lin_cols = args.num_lin_cols
+    acc_pat = args.lin_acc_regex
+    scale_down = float(args.scale_down_factor)
+    no_hit = "Unclassified"
+    no_hit_reads = "reads mapped to the database"
+    tpm_const = float(1000000.0000000000)
+    round_to = 10
+    all_samples = set()
+    (
+        lineage2sample,
+        unclassified2sample,
+        lineage2sm,
+        sm2passed,
+        reads_total,
+        per_taxon_reads,
+        lineages,
+    ) = (
+        defaultdict(defaultdict),
+        defaultdict(defaultdict),
+        defaultdict(defaultdict),
+        defaultdict(defaultdict),
+        defaultdict(defaultdict),
+        defaultdict(defaultdict),
+        defaultdict(int),
+    )
+
+    salmon_comb_res = os.path.join(os.getcwd(), out_prefix + ".txt")
+    # salmon_comb_res_reads_mapped = os.path.join(
+    #     os.getcwd(), re.sub(".tblsum", "_reads_mapped.tblsum", out_prefix) + ".txt"
+    # )
+    salmon_comb_res_indiv_reads_mapped = os.path.join(
+        os.getcwd(),
+        re.sub(r"\.tblsum", "_indiv_reads_mapped.tblsum", out_prefix) + ".txt",
+    )
+    salmon_res_files = glob.glob(
+        os.path.join(salmon_res_dir, "*", "quant.sf"), recursive=True
+    )
+    sample_res_files_failed = glob.glob(
+        os.path.join(salmon_res_dir, "*" + failed_suffix), recursive=True
+    )
+    sm_res_files = glob.glob(
+        os.path.join(sm_res_dir, "*" + sm_res_suffix), recursive=True
+    )
+
+    # Basic checks
+    if lin and not (os.path.exists(lin) and os.path.getsize(lin) > 0):
+        logging.error(
+            "The lineages file,\n"
+            + f"{os.path.basename(lin)} does not exist or is empty!"
+        )
+        exit(1)
+
+    if salmon_res_dir:
+        if not os.path.isdir(salmon_res_dir):
+            logging.error("UNIX path\n" + f"{salmon_res_dir}\n" + "does not exist!")
+            exit(1)
+        if len(salmon_res_files) <= 0:
+            with open(salmon_comb_res, "w") as salmon_comb_res_fh, open(
+                salmon_comb_res_indiv_reads_mapped, "w"
+            ) as salmon_comb_res_indiv_reads_mapped_fh:
+                salmon_comb_res_fh.write(f"Sample\n{no_hit} reads in all samples\n")
+                salmon_comb_res_indiv_reads_mapped_fh.write(
+                    f"Sample\nNo {no_hit_reads} from all samples\n"
+                )
+            salmon_comb_res_fh.close()
+            salmon_comb_res_indiv_reads_mapped_fh.close()
+            exit(0)
+
+    # Only proceed if lineages.csv exists.
+    if lin and os.path.exists(lin) and os.path.getsize(lin) > 0:
+        lin_fh = open(lin, "r")
+        _ = lin_fh.readline()
+
+        # Index lineages.csv
+        for line in lin_fh:
+            cols = line.strip().split(",")
+
+            if len(cols) < num_lin_cols:
+                logging.error(
+                    f"The file {os.path.basename(lin)} seems to\n"
+                    + f"be malformed. It contains fewer than the required {num_lin_cols} columns."
+                )
+                exit(1)
+
+            if cols[0] in lineages.keys():
+                continue
+                # logging.info(
+                #     f"There is a duplicate accession [{cols[0]}]"
+                #     + f" in the lineages file {os.path.basename(lin)}!"
+                # )
+            elif acc_pat.match(cols[0]):
+                lineages[cols[0]] = ",".join(cols[1:])
+
+        lin_fh.close()
+
+        # Index each sample's filtered sourmash results.
+        for sm_res_file in sm_res_files:
+            sample_name = re.sub(sm_res_suffix, "", os.path.basename(sm_res_file))
+
+            with open(sm_res_file, "r") as sm_res_fh:
+                _ = sm_res_fh.readline()
+                for line in sm_res_fh:
+                    acc = acc_pat.findall(line.strip().split(",")[9])
+
+                    if len(acc) == 0:
+                        logging.error(
+                            f"Got empty lineage accession: {acc}"
+                            + f"\nRow elements: {line.strip().split(',')}"
+                        )
+                        exit(1)
+                    if len(acc) > 1:
+                        logging.warning(
+                            f"Got more than one lineage accession: {acc}"
+                            + f"\nRow elements: {line.strip().split(',')}"
+                        )
+                        logging.warning(f"Considering first element: {acc[0]}")
+                    if acc[0] not in lineages.keys():
+                        logging.error(
+                            f"The lineage accession {acc[0]} is not found in {os.path.basename(lin)}"
+                        )
+                        exit(1)
+                    lineage2sm[lineages[acc[0]]].setdefault(sample_name, 1)
+                    sm2passed["sourmash_passed"].setdefault(sample_name, 1)
+            sm_res_fh.close()
+
+        # Index each sample's salmon results.
+        for salmon_res_file in salmon_res_files:
+            sample_name = re.match(
+                r"(^.+?)((\_salmon\_res)|(\.salmon))$",
+                os.path.basename(os.path.dirname(salmon_res_file)),
+            )[1]
+            salmon_meta_json = os.path.join(
+                os.path.dirname(salmon_res_file), "aux_info", "meta_info.json"
+            )
+
+            if (
+                not os.path.exists(salmon_meta_json)
+                or not os.path.getsize(salmon_meta_json) > 0
+            ):
+                logging.error(
+                    "The file\n"
+                    + f"{salmon_meta_json}\ndoes not exist or is empty!\n"
+                    + "Did `salmon quant` fail?"
+                )
+                exit(1)
+
+            if (
+                not os.path.exists(salmon_res_file)
+                or not os.path.getsize(salmon_res_file) > 0
+            ):
+                logging.error(
+                    "The file\n"
+                    + f"{salmon_res_file}\ndoes not exist or is empty!\n"
+                    + "Did `salmon quant` fail?"
+                )
+                exit(1)
+
+            # Initiate all_tpm, rem_tpm and reads_mapped
+            # all_tpm
+            reads_total[sample_name].setdefault("all_tpm", []).append(float(0.0))
+            # rem_tpm
+            reads_total[sample_name].setdefault("rem_tpm", []).append(float(0.0))
+            # reads_mapped
+            reads_total[sample_name].setdefault("reads_mapped", []).append(float(0.0))
+
+            with open(salmon_res_file, "r") as salmon_res_fh:
+                for line in salmon_res_fh.readlines():
+                    if re.match(r"^Name.+", line):
+                        continue
+                    cols = line.strip().split("\t")
+                    ref_acc = cols[0]
+                    tpm = cols[3]
+                    num_reads_mapped = cols[4]
+
+                    (
+                        reads_total[sample_name]
+                        .setdefault("all_tpm", [])
+                        .append(
+                            round(float(tpm), round_to),
+                        )
+                    )
+
+                    (
+                        reads_total[sample_name]
+                        .setdefault("reads_mapped", [])
+                        .append(
+                            round(float(num_reads_mapped), round_to),
+                        )
+                    )
+
+                    if lineages[ref_acc] in lineage2sm.keys():
+                        (
+                            lineage2sample[lineages[ref_acc]]
+                            .setdefault(sample_name, [])
+                            .append(round(float(tpm), round_to))
+                        )
+                        (
+                            per_taxon_reads[sample_name]
+                            .setdefault(lineages[ref_acc], [])
+                            .append(round(float(num_reads_mapped)))
+                        )
+                    else:
+                        (
+                            reads_total[sample_name]
+                            .setdefault("rem_tpm", [])
+                            .append(
+                                round(float(tpm), round_to),
+                            )
+                        )
+
+            salmon_res_fh.close()
+
+        # Index each sample's complete failure results, i.e., 100% unclassified.
+        for sample_res_file_failed in sample_res_files_failed:
+            sample_name = re.sub(
+                failed_suffix, "", os.path.basename(sample_res_file_failed)
+            )
+            with open(sample_res_file_failed, "r") as no_calls_fh:
+                for line in no_calls_fh.readlines():
+                    if line in ["\n", "\r\n", "\r"]:
+                        continue
+                    unclassified2sample[sample_name].setdefault(no_hit, tpm_const)
+            no_calls_fh.close()
+
+        # Finally, write all results.
+        for sample in sorted(reads_total.keys()) + sorted(unclassified2sample.keys()):
+            all_samples.add(sample)
+
+        # Check if sourmash results exist but salmon `quant` failed
+        # and if so, set the sample to 100% Unclassified as well.
+        for sample in sm2passed["sourmash_passed"].keys():
+            if sample not in all_samples:
+                unclassified2sample[sample].setdefault(no_hit, tpm_const)
+                all_samples.add(sample)
+
+        # Write total number of reads mapped to nowayout database.
+        # with open(salmon_comb_res_reads_mapped, "w") as nowo_reads_mapped_fh:
+        #     nowo_reads_mapped_fh.write(
+        #         "\t".join(
+        #             [
+        #                 "Sample",
+        #                 "All reads",
+        #                 "Classified reads",
+        #                 "Unclassified reads (Reads failed thresholds )",
+        #             ]
+        #         )
+        #     )
+
+        #     for sample in all_samples:
+        #         if sample in reads_total.keys():
+        #             nowo_reads_mapped_fh.write(
+        #                 "\n"
+        #                 + "\t".join(
+        #                     [
+        #                         f"\n{sample}",
+        #                         f"{int(sum(reads_total[sample]['reads_mapped']))}",
+        #                         f"{int(reads_total[sample]['reads_mapped'])}",
+        #                         f"{int(reads_total[sample]['rem_tpm'])}",
+        #                     ],
+        #                 )
+        #             )
+        #         else:
+        #             nowo_reads_mapped_fh.write(f"\n{sample}\t{int(0.0)}")
+        # nowo_reads_mapped_fh.close()
+
+        # Write scaled down TPM values for each sample.
+        with open(salmon_comb_res, "w") as salmon_comb_res_fh, open(
+            salmon_comb_res_indiv_reads_mapped, "w"
+        ) as salmon_comb_res_indiv_reads_mapped_fh:
+            salmon_comb_res_fh.write("Lineage\t" + "\t".join(all_samples) + "\n")
+            salmon_comb_res_indiv_reads_mapped_fh.write(
+                "Lineage\t" + "\t".join(all_samples) + "\n"
+            )
+
+            # Write *.krona.tsv header for all samples.
+            for sample in all_samples:
+                krona_fh = open(
+                    os.path.join(salmon_res_dir, sample + ".krona.tsv"), "w"
+                )
+                krona_fh.write(
+                    "\t".join(
+                        [
+                            "fraction",
+                            "superkingdom",
+                            "phylum",
+                            "class",
+                            "order",
+                            "family",
+                            "genus",
+                            "species",
+                        ]
+                    )
+                )
+                krona_fh.close()
+
+            # Write the TPM values (TPM/scale_down) for valid lineages.
+            for lineage in lineage2sm.keys():
+                salmon_comb_res_fh.write(lineage)
+                salmon_comb_res_indiv_reads_mapped_fh.write(lineage)
+
+                for sample in all_samples:
+                    krona_fh = open(
+                        os.path.join(salmon_res_dir, sample + ".krona.tsv"), "a"
+                    )
+
+                    if sample in unclassified2sample.keys():
+                        salmon_comb_res_fh.write("\t0.0")
+                        salmon_comb_res_indiv_reads_mapped_fh.write("\t0")
+                    elif sample in lineage2sample[lineage].keys():
+                        reads = sum(per_taxon_reads[sample][lineage])
+                        tpm = sum(lineage2sample[lineage][sample])
+                        tph = round(tpm / scale_down, round_to)
+                        lineage2sample[sample].setdefault("hits_tpm", []).append(
+                            float(tpm)
+                        )
+
+                        salmon_comb_res_fh.write(f"\t{tph}")
+                        salmon_comb_res_indiv_reads_mapped_fh.write(f"\t{reads}")
+                        krona_lin_row = lineage.split(",")
+
+                        if len(krona_lin_row) > num_lin_cols - 1:
+                            logging.error(
+                                "Taxonomy columns are more than 8 for the following lineage: "
+                                + f"{krona_lin_row}"
+                            )
+                            exit(1)
+                        else:
+                            krona_fh.write(
+                                "\n"
+                                + str(round((tpm / tpm_const), round_to))
+                                + "\t"
+                                + "\t".join(krona_lin_row[:-1])
+                            )
+                    else:
+                        salmon_comb_res_fh.write("\t0.0")
+                        salmon_comb_res_indiv_reads_mapped_fh.write("\t0")
+                    krona_fh.close()
+
+                salmon_comb_res_fh.write("\n")
+                salmon_comb_res_indiv_reads_mapped_fh.write("\n")
+
+            # Finally write TPH (TPM/scale_down) for Unclassified
+            # Row = Unclassified / No reads mapped to the database ...
+            salmon_comb_res_fh.write(f"{no_hit}")
+            salmon_comb_res_indiv_reads_mapped_fh.write(f"Total {no_hit_reads}")
+
+            for sample in all_samples:
+                krona_ufh = open(
+                    os.path.join(salmon_res_dir, sample + ".krona.tsv"), "a"
+                )
+                # krona_ufh.write("\t")
+                if sample in unclassified2sample.keys():
+                    salmon_comb_res_fh.write(
+                        f"\t{round((unclassified2sample[sample][no_hit] / scale_down), round_to)}"
+                    )
+                    salmon_comb_res_indiv_reads_mapped_fh.write("\t0")
+                    krona_ufh.write(
+                        f"\n{round((unclassified2sample[sample][no_hit] / tpm_const), round_to)}"
+                    )
+                else:
+                    trace_tpm = tpm_const - sum(reads_total[sample]["all_tpm"])
+                    trace_tpm = float(f"{trace_tpm:.{round_to}f}")
+                    if trace_tpm <= 0:
+                        trace_tpm = float(0.0)
+                    tph_unclassified = float(
+                        f"{(sum(reads_total[sample]['rem_tpm']) + trace_tpm) / scale_down:.{round_to}f}"
+                    )
+                    krona_unclassified = float(
+                        f"{(sum(reads_total[sample]['rem_tpm']) + trace_tpm) / tpm_const:.{round_to}f}"
+                    )
+                    salmon_comb_res_fh.write(f"\t{tph_unclassified}")
+                    salmon_comb_res_indiv_reads_mapped_fh.write(
+                        f"\t{int(sum(sum(per_taxon_reads[sample].values(), [])))}"
+                    )
+                    krona_ufh.write(f"\n{krona_unclassified}")
+                krona_ufh.write("\t" + "\t".join(["unclassified"] * (num_lin_cols - 2)))
+                krona_ufh.close()
+
+        salmon_comb_res_fh.close()
+        salmon_comb_res_indiv_reads_mapped_fh.close()
+        # ppp.pprint(lineage2sample)
+        # ppp.pprint(lineage2sm)
+        # ppp.pprint(reads_total)
+
+
+if __name__ == "__main__":
+    main()
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/gen_sim_abn_table.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,191 @@
+#!/usr/bin/env python3
+
+# Kranti Konganti
+
+import argparse
+import glob
+import inspect
+import logging
+import os
+import pprint
+import re
+from collections import defaultdict
+
+
+# Multiple inheritance for pretty printing of help text.
+class MultiArgFormatClasses(
+    argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter
+):
+    pass
+
+
+# Main
+def main() -> None:
+    """
+    This script will take the final taxonomic classification files and create a
+    global relative abundance type file in the current working directory. The
+    relative abundance type files should be in CSV or TSV format and should have
+    the lineage or taxonomy in first column and samples in the subsequent columns.
+    """
+    # Set logging.
+    logging.basicConfig(
+        format="\n"
+        + "=" * 55
+        + "\n%(asctime)s - %(levelname)s\n"
+        + "=" * 55
+        + "\n%(message)s\n\n",
+        level=logging.DEBUG,
+    )
+
+    # Debug print.
+    ppp = pprint.PrettyPrinter(width=55)
+    prog_name = inspect.stack()[0].filename
+
+    parser = argparse.ArgumentParser(
+        prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses
+    )
+
+    required = parser.add_argument_group("required arguments")
+
+    required.add_argument(
+        "-abn",
+        dest="rel_abn_dir",
+        default=False,
+        required=True,
+        help="Absolute UNIX path to the parent directory that contains the\n"
+        + "abundance type files.",
+    )
+    parser.add_argument(
+        "-op",
+        dest="out_prefix",
+        default="nowayout.tblsum",
+        required=False,
+        help="Set the output file(s) prefix for output(s) generated\nby this program.",
+    )
+    parser.add_argument(
+        "-header",
+        dest="header",
+        action="store_true",
+        default=True,
+        required=False,
+        help="Whether the relative abundance files have a header.",
+    )
+    parser.add_argument(
+        "-filepat",
+        dest="file_pat",
+        default="*.lineage_summary.tsv",
+        required=False,
+        help="Files will be searched by this pattern for merged output generation\nby this program.",
+    )
+    parser.add_argument(
+        "-failedfilepat",
+        dest="failed_file_pat",
+        default="*FAILED.txt",
+        required=False,
+        help="Failed sample files will be searched by this pattern for merged output generation\nby this program.",
+    )
+    parser.add_argument(
+        "-delim",
+        dest="delim",
+        default="\t",
+        required=False,
+        help="The delimiter by which the fields are separated in the file.",
+    )
+
+    args = parser.parse_args()
+    rel_abn_dir = args.rel_abn_dir
+    is_header = args.header
+    out_prefix = args.out_prefix
+    file_pat = args.file_pat
+    failed_file_pat = args.failed_file_pat
+    delim = args.delim
+    suffix = re.sub(r"^\*", "", file_pat)
+    rel_abn_comb = os.path.join(os.getcwd(), out_prefix + ".txt")
+    rel_abn_files = glob.glob(os.path.join(rel_abn_dir, file_pat))
+    failed_rel_abn_files = glob.glob(os.path.join(rel_abn_dir, failed_file_pat))
+    empty_results = "Relative abundance results did not pass thresholds"
+    sample2lineage, seen_lineage = (defaultdict(defaultdict), defaultdict(int))
+
+    if len(rel_abn_files) == 0:
+        logging.info(
+            "Unable to find any files with .tsv extension.\nNow trying .csv extension."
+        )
+        rel_abn_files = glob.glob(os.path.join(rel_abn_dir, "*.csv"))
+        delim = ","
+
+    if len(failed_rel_abn_files) == 0:
+        logging.info(
+            f"Unable to find any files with pattern {failed_file_pat}.\n"
+            + "The failed samples will not appear in the final aggregate file."
+        )
+
+    if rel_abn_dir:
+        if not os.path.isdir(rel_abn_dir):
+            logging.error("UNIX path\n" + f"{rel_abn_dir}\n" + "does not exist!")
+            exit(1)
+        if len(rel_abn_files) <= 0:
+            with open(rel_abn_comb, "w") as rel_abn_comb_fh:
+                rel_abn_comb_fh.write(f"Sample\n{empty_results} in any samples\n")
+            rel_abn_comb_fh.close()
+            exit(0)
+
+        for failed_rel_abn in failed_rel_abn_files:
+            with open(failed_rel_abn, "r") as failed_fh:
+                sample2lineage[failed_fh.readline().strip()].setdefault(
+                    "unclassified", []
+                ).append(float("1.0"))
+            failed_fh.close()
+
+        for rel_abn_file in rel_abn_files:
+            sample_name = re.match(r"(^.+?)\..*$", os.path.basename(rel_abn_file))[1]
+
+            with open(rel_abn_file, "r") as rel_abn_fh:
+                if is_header:
+                    sample_names = rel_abn_fh.readline().strip().split(delim)[1:]
+                    if len(sample_names) > 2:
+                        logging.error(
+                            "The individual relative abundance file has more "
+                            + "\nthan 1 sample. This is rare in the context of running the "
+                            + "\nnowayout Nextflow workflow."
+                        )
+                        exit(1)
+                    elif len(sample_names) < 2:
+                        sample_name = re.sub(suffix, "", os.path.basename(rel_abn_file))
+                        logging.info(
+                            "Seems like there is no sample name in the lineage summary file."
+                            + f"\nTherefore, sample name has been extracted from file name: {sample_name}."
+                        )
+                    else:
+                        sample_name = sample_names[0]
+
+                for line in rel_abn_fh.readlines():
+                    cols = line.strip().split(delim)
+                    lineage = cols[0]
+                    abn = cols[1]
+                    sample2lineage[sample_name].setdefault(lineage, []).append(
+                        float(abn)
+                    )
+                    seen_lineage[lineage] = 1
+
+        with open(rel_abn_comb, "w") as rel_abn_comb_fh:
+            samples = sorted(sample2lineage.keys())
+            rel_abn_comb_fh.write(f"Lineage{delim}" + delim.join(samples) + "\n")
+
+            for lineage in sorted(seen_lineage.keys()):
+                rel_abn_comb_fh.write(lineage)
+                for sample in samples:
+                    if lineage in sample2lineage[sample].keys():
+                        rel_abn_comb_fh.write(
+                            delim
+                            + "".join(
+                                [str(abn) for abn in sample2lineage[sample][lineage]]
+                            )
+                        )
+                    else:
+                        rel_abn_comb_fh.write(f"{delim}0.0")
+                rel_abn_comb_fh.write("\n")
+        rel_abn_comb_fh.close()
+
+
+if __name__ == "__main__":
+    main()
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/remove_dup_fasta_ids.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,201 @@
+#!/usr/bin/env python3
+
+import argparse
+import gzip
+import inspect
+import logging
+import os
+import pprint
+import shutil
+from typing import BinaryIO, TextIO, Union
+
+from Bio import SeqIO
+from Bio.Seq import Seq
+from Bio.SeqRecord import SeqRecord
+from genericpath import isdir
+
+
+# Multiple inheritance for pretty printing of help text.
+class MultiArgFormatClasses(
+    argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter
+):
+    pass
+
+
+def write_fasta(seq: str, id: str, fh: Union[TextIO, BinaryIO]) -> None:
+    """
+    Write sequence with no description to specified file.
+    """
+    SeqIO.write(
+        SeqRecord(Seq(seq), id=id, description=str()),
+        fh,
+        "fasta",
+    )
+
+
+# Main
+def main() -> None:
+    """
+    This script takes:
+        1. A FASTA file in gzip or non-gzip (ASCII TXT) format
+
+    and then generates a new FASTA file with duplicate FASTA IDs replaced
+    with a unique ID.
+    """
+
+    # Set logging.
+    logging.basicConfig(
+        format="\n"
+        + "=" * 55
+        + "\n%(asctime)s - %(levelname)s\n"
+        + "=" * 55
+        + "\n%(message)s\n\n",
+        level=logging.DEBUG,
+    )
+
+    # Debug print.
+    ppp = pprint.PrettyPrinter(width=55)
+    prog_name = os.path.basename(inspect.stack()[0].filename)
+
+    parser = argparse.ArgumentParser(
+        prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses
+    )
+
+    required = parser.add_argument_group("required arguments")
+
+    required.add_argument(
+        "-fna",
+        dest="fna",
+        default=False,
+        required=True,
+        help="Absolute UNIX path to .fna or .fna.gz file.",
+    )
+    parser.add_argument(
+        "-lin",
+        dest="lineages",
+        default=False,
+        required=False,
+        help="Absolute UNIX path to lineages.csv file for which"
+        + "\nthe duplicate IDs will be made unique corresponding to"
+        + "\nthe FASTA IDs",
+    )
+    parser.add_argument(
+        "-outdir",
+        dest="out_folder",
+        default=os.getcwd(),
+        required=False,
+        help="By default, the output is written to this\nfolder.",
+    )
+    parser.add_argument(
+        "-f",
+        dest="force_write_out",
+        default=False,
+        action="store_true",
+        required=False,
+        help="Force overwrite the output file.",
+    )
+    parser.add_argument(
+        "--fna-suffix",
+        dest="fna_suffix",
+        default=".fna",
+        required=False,
+        help="Suffix of the output FASTA file.",
+    )
+
+    # Parse defaults
+    args = parser.parse_args()
+    fna = args.fna
+    lineages = args.lineages
+    outdir = args.out_folder
+    overwrite = args.force_write_out
+    fna_suffix = args.fna_suffix
+    new_fna = os.path.join(
+        outdir, os.path.basename(fna).split(".")[0] + "_dedup_ids" + fna_suffix
+    )
+    lin_header = False
+    new_lin = False
+    seen_ids = dict()
+    seen_lineages = dict()
+
+    # Basic checks
+    if not overwrite and os.path.exists(new_fna):
+        logging.error(
+            f"Output destination [{os.path.basename(new_fna)}] already exists!\nPlease use -f to delete and overwrite."
+        )
+        exit(1)
+    elif overwrite and os.path.exists(new_fna):
+        logging.info(f"Overwrite requested. Deleting {os.path.basename(new_fna)}...")
+        if os.path.isdir(new_fna):
+            shutil.rmtree(new_fna)
+        else:
+            os.remove(new_fna)
+
+    # Prepare for writing
+    new_fna_fh = open(new_fna, "+at")
+
+    # If lineages file is mentioned, index it.
+    if lineages and os.path.exists(lineages) and os.path.getsize(lineages) > 0:
+        new_lin = os.path.join(os.getcwd(), os.path.basename(lineages) + "_dedup.csv")
+        new_lin_fh = open(new_lin, "w")
+        with open(lineages, "r") as l_fh:
+            lin_header = l_fh.readline()
+            for line in l_fh:
+                cols = line.strip().split(",")
+                if len(cols) < 9:
+                    logging.error(
+                        f"The row in the lineages file {os.path.basename(lineages)}"
+                        + f"\ndoes not have the 9 required columns: {len(cols)}"
+                        + f"\n\n{lin_header.strip()}\n{line.strip()}"
+                    )
+                    exit(1)
+                elif len(cols) > 9:
+                    logging.info(
+                        f"The row in the lineages file {os.path.basename(lineages)}"
+                        + f"\nhas more than the 9 required columns: {len(cols)}"
+                        + f"\nRetaining only the first 9 of the following {len(cols)} columns."
+                        + f"\n\n{lin_header.strip()}\n{line.strip()}"
+                    )
+
+                if cols[0] not in seen_lineages.keys():
+                    seen_lineages[cols[0]] = ",".join(cols[1:9])
+
+        new_lin_fh.write(lin_header)
+        l_fh.close()
+
+    # Read FASTA and create unique FASTA IDs.
+    logging.info("Creating new FASTA with unique IDs.")
+    try:
+        fna_fh = gzip.open(fna, "rt")
+        _ = fna_fh.readline()
+        fna_fh.seek(0)  # Rewind: the probe above consumed the first header line.
+    except gzip.BadGzipFile:
+        logging.info(
+            f"Input FASTA file {os.path.basename(fna)} is not in\nGZIP format.\nAttempting text parsing."
+        )
+        fna_fh = open(fna, "r")
+
+    for record in SeqIO.parse(fna_fh, format="fasta"):
+        seq_id = record.id
+
+        if record.id not in seen_ids.keys():
+            seen_ids[record.id] = 1
+        else:
+            seen_ids[record.id] += 1
+
+        if seen_ids[seq_id] > 1:
+            seq_id = str(record.id) + str(seen_ids[record.id])
+
+        if new_lin:
+            new_lin_fh.write(",".join([seq_id, seen_lineages[record.id]]) + "\n")
+
+        write_fasta(record.seq, seq_id, new_fna_fh)
+
+    if new_lin:
+        new_lin_fh.close()
+
+    logging.info("Done!")
+
+
+if __name__ == "__main__":
+
+    main()
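Note on the ID de-duplication above: a minimal Python sketch of the suffixing scheme used in the FASTA loop, where the first occurrence of an ID is kept as-is and each repeat gets its occurrence count appended. The helper name `dedup_ids` is hypothetical and is not part of the script.

```python
from collections import Counter


def dedup_ids(ids):
    """Yield each ID, appending the occurrence count to repeats."""
    seen = Counter()
    for seq_id in ids:
        seen[seq_id] += 1
        # First occurrence is unchanged; the n-th repeat becomes "<id><n>".
        yield seq_id if seen[seq_id] == 1 else f"{seq_id}{seen[seq_id]}"


print(list(dedup_ids(["A", "B", "A", "A"])))  # ['A', 'B', 'A2', 'A3']
```

As in the script, a generated ID such as `A2` is not checked against IDs that already exist in the input, so a collision is possible if the input itself contains `A2`.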
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/bin/sourmash_filter_hits.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,193 @@
+#!/usr/bin/env python3
+
+# Kranti Konganti
+
+import argparse
+import gzip
+import inspect
+import logging
+import os
+import pprint
+import re
+
+# Set logging.
+logging.basicConfig(
+    format="\n" + "=" * 55 + "\n%(asctime)s - %(levelname)s\n" + "=" * 55 + "\n%(message)s\n\n",
+    level=logging.DEBUG,
+)
+
+# Debug print.
+ppp = pprint.PrettyPrinter(width=50, indent=4)
+
+# Multiple inheritance for pretty printing of help text.
+class MultiArgFormatClasses(argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter):
+    pass
+
+
+def write_failures(prefix: str, file: os.PathLike) -> None:
+    with open(file, "w") as outfile_failed_fh:
+        outfile_failed_fh.write(f"{prefix}\n")
+    outfile_failed_fh.close()
+
+
+def main() -> None:
+    """
+    This script will take the CSV output of `sourmash search` and `sourmash gather`
+    and will return a column's value filtered by requested column name and its value
+    """
+
+    prog_name = os.path.basename(inspect.stack()[0].filename)
+
+    parser = argparse.ArgumentParser(
+        prog=prog_name, description=main.__doc__, formatter_class=MultiArgFormatClasses
+    )
+
+    required = parser.add_argument_group("required arguments")
+
+    required.add_argument(
+        "-csv",
+        dest="csv",
+        default=False,
+        required=True,
+        help="Absolute UNIX path to CSV file containing output from\n"
+        + "`sourmash gather` or `sourmash search`",
+    )
+    required.add_argument(
+        "-extract",
+        dest="extract",
+        required=False,
+        default="name",
+        help="Extract this column's value which matches the filters.\n"
+        + "Controlled by -fcn and -fcv.",
+    )
+    parser.add_argument(
+        "-all",
+        dest="alllines",
+        required=False,
+        default=False,
+        action="store_true",
+        help="Instead of just the column value, print entire row.",
+    )
+    parser.add_argument(
+        "-fcn",
+        dest="filter_col_name",
+        default="f_match",
+        required=False,
+        help="Column name by which the filtering of rows\nshould be applied.",
+    )
+    parser.add_argument(
+        "-fcv",
+        dest="filter_col_val",
+        default="0",
+        required=False,
+        help="Only rows where the column (defined by -fcn)\nsatisfies this value\n"
+        + "will be considered. This can be numeric, regex\nor a string value.",
+    )
+    parser.add_argument(
+        "-gt",
+        dest="gt",
+        default=True,
+        required=False,
+        action="store_true",
+        help="Apply greater than or equal to condition on\nnumeric values of -fcn column.",
+    )
+    parser.add_argument(
+        "-lt",
+        dest="lt",
+        default=False,
+        required=False,
+        action="store_true",
+        help="Apply less than or equal to condition on\nnumeric values of -fcn column.",
+    )
+
+    args = parser.parse_args()
+    csv = args.csv
+    ex = args.extract
+    all_lines = args.alllines
+    fcn = args.filter_col_name
+    fcv = args.filter_col_val
+    gt = args.gt
+    lt = args.lt
+    hits = set()
+    hit_lines = set()
+    empty_lines = 0
+
+    outfile_prefix = re.sub(r"(^.*?)\.csv\.gz", r"\1", os.path.basename(csv))
+    outfile_failed = os.path.join(os.getcwd(), "_".join([outfile_prefix, "FAILED.txt"]))
+
+    if csv and (not os.path.exists(csv) or not os.path.getsize(csv) > 0):
+        logging.error(
+            "The CSV file,\n" + f"{os.path.basename(csv)} does not exist or\nis of size zero."
+        )
+        write_failures(outfile_prefix, outfile_failed)
+        exit(0)
+
+    if all_lines:
+        outfile = os.path.join(os.getcwd(), "_".join([outfile_prefix, "hits.csv"]))
+    else:
+        outfile = os.path.join(os.getcwd(), "_".join([outfile_prefix, "template_hits.txt"]))
+
+    with gzip.open(csv, "rb") as csv_fh:
+        header_cols = dict(
+            [
+                (col, ele)
+                for ele, col in enumerate(csv_fh.readline().decode("utf-8").strip().split(","))
+            ]
+        )
+
+        if fcn not in header_cols or ex not in header_cols:
+            logging.error(
+                f"The header row in file\n{os.path.basename(csv)}\n"
+                + "does not have one or both of the columns named:\n"
+                + f"-fcn: {fcn} and -extract: {ex}"
+            )
+            exit(1)
+
+        for line in csv_fh:
+            line = line.decode("utf-8")
+
+            if line in ["\n", "\r\n"]:
+                empty_lines += 1
+                continue
+
+            cols = [x.strip() for x in line.strip().split(",")]
+            investigate = float(format(float(cols[header_cols[fcn]]), '.10f'))
+            fcv = float(fcv)
+
+            if re.match(r"[\d\.]+", str(investigate)):
+                if lt and investigate <= fcv:  # -lt overrides -gt, which defaults to True
+                    hits.add(cols[header_cols[ex]])
+                    hit_lines.add(line.strip())
+                elif not lt and gt and investigate >= fcv:
+                    hits.add(cols[header_cols[ex]])
+                    hit_lines.add(line.strip())
+            elif investigate == fcv:
+                hits.add(cols[header_cols[ex]])
+                hit_lines.add(line.strip())
+
+        csv_fh.close()
+
+        if len(hits) >= 1:
+            with open(outfile, "w") as outfile_fh:
+                outfile_fh.write(",".join(header_cols.keys()) + "\n")
+                if all_lines:
+                    outfile_fh.write("\n".join(hit_lines) + "\n")
+                else:
+                    outfile_fh.write("\n".join(hits) + "\n")
+            outfile_fh.close()
+        else:
+            write_failures(outfile_prefix, outfile_failed)
+
+        if empty_lines > 0:
+            empty_lines_msg = f"Skipped {empty_lines} empty line(s).\n"
+
+            logging.info(
+                empty_lines_msg
+                + f"File {os.path.basename(csv)}\n"
+                + f"written in:\n{os.getcwd()}\nDone! Bye!"
+            )
+        exit(0)
+
+
+if __name__ == "__main__":
+    main()
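Note on the row filter above: a minimal Python sketch of the same numeric threshold filter, written with the standard `csv` module rather than `split(",")`, on the assumption that some field values (for example, a `name`) may be quoted and contain commas. The function `filter_hits`, its parameters, and the demo table are hypothetical and are not part of `sourmash_filter_hits.py`.

```python
import csv
import gzip
import io


def filter_hits(csv_gz_bytes, filter_col="f_match", threshold=0.0,
                extract="name", less_than=False):
    """Return unique `extract` values from rows that pass the numeric filter."""
    hits = []
    with gzip.open(io.BytesIO(csv_gz_bytes), "rt", newline="") as fh:
        for row in csv.DictReader(fh):
            value = float(row[filter_col])
            # Exactly one comparison is applied: <= if less_than, else >=.
            keep = value <= threshold if less_than else value >= threshold
            if keep and row[extract] not in hits:
                hits.append(row[extract])
    return hits


# Hypothetical two-row table, gzip-compressed in memory.
demo = gzip.compress(b'name,f_match\n"genomeA, complete",0.75\ngenomeB,0.10\n')
print(filter_hits(demo, threshold=0.5))
```

Applying only one of the two comparisons per run keeps the "less than" mode from also admitting rows that satisfy the "greater than" condition.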
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/conf/base.config	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,58 @@
+plugins {
+    id 'nf-amazon'
+}
+
+params {
+    fs = File.separator
+    cfsanpipename = 'CPIPES'
+    center = 'CFSAN, FDA.'
+    libs = "${projectDir}${params.fs}lib"
+    modules = "${projectDir}${params.fs}modules"
+    projectconf = "${projectDir}${params.fs}conf"
+    assetsdir = "${projectDir}${params.fs}assets"
+    subworkflows = "${projectDir}${params.fs}subworkflows"
+    workflows = "${projectDir}${params.fs}workflows"
+    workflowsconf = "${workflows}${params.fs}conf"
+    routines = "${libs}${params.fs}routines"
+    toolshelp = "${libs}${params.fs}help"
+    swmodulepath = "${params.fs}nfs${params.fs}software${params.fs}modules"
+    tracereportsdir = "${launchDir}${params.fs}${cfsanpipename}-${params.pipeline}${params.fs}nextflow-reports"
+    dummyfile = "${projectDir}${params.fs}assets${params.fs}dummy_file.txt"
+    dummyfile2 = "${projectDir}${params.fs}assets${params.fs}dummy_file2.txt"
+    max_cpus = 10
+    linewidth = 80
+    pad = 32
+    pipeline = null
+    help = null
+    input = null
+    output = null
+    metadata = null
+    publish_dir_mode = "copy"
+    publish_dir_overwrite = true
+    user_email = null
+}
+
+dag {
+    enabled = true
+    file = "${params.tracereportsdir}${params.fs}${params.pipeline}_dag.html"
+    overwrite = true
+}
+
+report {
+    enabled = true
+    file = "${params.tracereportsdir}${params.fs}${params.pipeline}_exec_report.html"
+    overwrite = true
+}
+
+trace {
+    enabled = true
+    file = "${params.tracereportsdir}${params.fs}${params.pipeline}_exec_trace.txt"
+    overwrite = true
+}
+
+timeline {
+    enabled = true
+    file = "${params.tracereportsdir}${params.fs}${params.pipeline}_exec_timeline.html"
+    overwrite = true
+}
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/conf/computeinfra.config	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,171 @@
+standard {
+    process.executor = 'local'
+    process.cpus = 1
+    params.enable_conda = false
+    params.enable_module = true
+    singularity.enabled = false
+    docker.enabled = false
+}
+
+stdkondagac {
+    process.executor = 'local'
+    process.cpus = 4
+    params.enable_conda = true
+    conda.enabled = true
+    conda.useMicromamba = true
+    params.enable_module = false
+    singularity.enabled = false
+    docker.enabled = false
+}
+
+stdcingularitygac {
+    process.executor = 'local'
+    process.cpus = 4
+    params.enable_conda = false
+    params.enable_module = false
+    singularity.enabled = true
+    singularity.autoMounts = true
+    singularity.runOptions = "-B ${params.input} -B ${params.bcs_root_dbdir}"
+    docker.enabled = false
+}
+
+raven {
+    process.executor = 'slurm'
+    process.queue = 'prod'
+    process.memory = '10GB'
+    process.cpus = 4
+    params.enable_conda = false
+    params.enable_module = true
+    singularity.enabled = false
+    docker.enabled = false
+    clusterOptions = '--signal B:USR2'
+}
+
+eprod {
+    process.executor = 'slurm'
+    process.queue = 'lowmem,midmem,bigmem'
+    process.memory = '10GB'
+    process.cpus = 4
+    params.enable_conda = false
+    params.enable_module = true
+    singularity.enabled = false
+    docker.enabled = false
+    clusterOptions = '--signal B:USR2'
+}
+
+eprodkonda {
+    process.executor = 'slurm'
+    process.queue = 'lowmem,midmem,bigmem'
+    process.memory = '10GB'
+    process.cpus = 4
+    params.enable_conda = true
+    conda.enabled = true
+    conda.useMicromamba = true
+    params.enable_module = false
+    singularity.enabled = false
+    singularity.autoMounts = true
+    singularity.runOptions = "-B ${params.input} -B ${params.bcs_root_dbdir}"
+    docker.enabled = false
+    clusterOptions = '--signal B:USR2'
+}
+
+eprodcingularity {
+    process.executor = 'slurm'
+    process.queue = 'lowmem,midmem,bigmem'
+    process.memory = '10GB'
+    process.cpus = 4
+    params.enable_conda = false
+    params.enable_module = false
+    singularity.enabled = true
+    singularity.autoMounts = true
+    singularity.runOptions = "-B ${params.input} -B ${params.bcs_root_dbdir}"
+    docker.enabled = false
+    clusterOptions = '--signal B:USR2'
+}
+
+cingularity {
+    process.executor = 'slurm'
+    process.queue = 'prod'
+    process.memory = '10GB'
+    process.cpus = 4
+    singularity.enabled = true
+    singularity.autoMounts = true
+    singularity.runOptions = "-B ${params.input} -B ${params.bcs_root_dbdir}"
+    docker.enabled = false
+    params.enable_conda = false
+    params.enable_module = false
+    clusterOptions = '--signal B:USR2'
+}
+
+cingularitygac {
+    process.executor = 'slurm'
+    executor.$slurm.exitReadTimeout = 120000
+    process.queue = 'centriflaken'
+    process.cpus = 4
+    singularity.enabled = true
+    singularity.autoMounts = true
+    singularity.runOptions = "-B ${params.input} -B ${params.bcs_root_dbdir}"
+    docker.enabled = false
+    params.enable_conda = false
+    params.enable_module = false
+    clusterOptions = '-n 1 --signal B:USR2'
+}
+
+konda {
+    process.executor = 'slurm'
+    process.queue = 'prod'
+    process.memory = '10GB'
+    process.cpus = 4
+    singularity.enabled = false
+    docker.enabled = false
+    params.enable_conda = true
+    conda.enabled = true
+    conda.useMicromamba = true
+    params.enable_module = false
+    clusterOptions = '--signal B:USR2'
+}
+
+kondagac {
+    process.executor = 'slurm'
+    executor.$slurm.exitReadTimeout = 120000
+    process.queue = 'centriflaken'
+    process.cpus = 4
+    singularity.enabled = false
+    docker.enabled = false
+    params.enable_conda = true
+    conda.enabled = true
+    conda.useMicromamba = true
+    params.enable_module = false
+    clusterOptions = '-n 1 --signal B:USR2'
+}
+
+cfsanawsbatch {
+    process.executor = 'awsbatch'
+    process.queue = 'cfsan-nf-batch-job-queue'
+    aws.batch.cliPath = '/home/ec2-user/miniconda/bin/aws'
+    aws.batch.region = 'us-east-1'
+    aws.batch.volumes = ['/hpc/db:/hpc/db:ro', '/hpc/scratch:/hpc/scratch:rw']
+    singularity.enabled = false
+    singularity.autoMounts = true
+    docker.enabled = true
+    params.enable_conda = false
+    conda.enabled = false
+    conda.useMicromamba = false
+    params.enable_module = false
+}
+
+gxkubernetes {
+    process.executor = 'k8s'
+    k8s.namespace = 'galaxy'
+    k8s.serviceAccount = 'default'
+    k8s.pod = [
+        [volumeClaim: 's3-centriflaken-claim', mountPath: '/galaxy/cfsan-centriflaken-db'],
+        [volumeClaim: 's3-nextflow-claim', mountPath: '/galaxy/nf-work-dirs'],
+        [volumeClaim: 'galaxy-galaxy-pvc', mountPath: '/galaxy/server/database'],
+        [priorityClassName: 'galaxy-job-priority']
+    ]
+    singularity.enabled = false
+    docker.enabled = true
+    params.enable_conda = false
+    params.enable_module = false
+}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/conf/fastq.config	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,9 @@
+params {
+    fq_filter_by_len = "4000"
+    fq_suffix = ".fastq.gz"
+    fq2_suffix = false
+    fq_strandedness = "unstranded"
+    fq_single_end = false
+    fq_filename_delim = "_"
+    fq_filename_delim_idx = "1"
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/conf/logtheseparams.config	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,17 @@
+params {
+    logtheseparams = [
+        "${params.metadata}" ? 'metadata' : null,
+        "${params.input}" ? 'input' : null,
+        "${params.output}" ? 'output' : null,
+        "${params.fq_suffix}" ? 'fq_suffix' : null,
+        "${params.fq2_suffix}" ? 'fq2_suffix' : null,
+        "${params.fq_strandedness}" ? 'fq_strandedness' : null,
+        "${params.fq_single_end}" ? 'fq_single_end' : null,
+        "${params.fq_filter_by_len}" ? 'fq_filter_by_len' : null,
+        "${params.fq_filename_delim}" ? 'fq_filename_delim' : null,
+        "${params.fq_filename_delim_idx}" ? 'fq_filename_delim_idx' : null,
+        'enable_conda',
+        'enable_module',
+        'max_cpus'
+    ]
+}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/conf/manifest.config	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,8 @@
+manifest {
+    author = 'Kranti.Konganti@fda.hhs.gov'
+    homePage = 'https://cfsan-git.fda.gov/Kranti.Konganti/cpipes'
+    name = 'CPIPES'
+    version = '0.8.0'
+    nextflowVersion = '>=23.04'
+    description = 'Modular Nextflow pipelines at CFSAN, FDA.'
+}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/conf/modules.config	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,122 @@
+process {
+    publishDir = [
+        path: {
+            "${task.process.tokenize(':')[-1].toLowerCase()}" == "multiqc" ?
+                "${params.output}${params.fs}${params.pipeline.toLowerCase()}-${task.process.tokenize(':')[-1].toLowerCase()}" :
+                "${params.output}${params.fs}${task.process.tokenize(':')[-1].toLowerCase()}"
+        },
+        mode: params.publish_dir_mode,
+        overwrite: params.publish_dir_overwrite,
+        saveAs: { filename -> filename =~ /^versions.yml|.+?_mqc.*/ ? null : filename }
+    ]
+
+    errorStrategy = {
+        ![0].contains(task.exitStatus) ? dynamic_retry(task.attempt, 10) : 'finish'
+    }
+
+    maxRetries = 1
+    resourceLabels = {[
+        process: task.process,
+        memoryRequested: task.memory.toString(),
+        cpusRequested: task.cpus.toString()
+    ]}
+
+    withLabel: 'process_femto' {
+        cpus = { 1 * task.attempt }
+        memory = { 1.GB * task.attempt }
+        time = { 1.h * task.attempt }
+    }
+
+    withLabel: 'process_pico' {
+        cpus = { min_cpus(2) * task.attempt }
+        memory = { 4.GB * task.attempt }
+        time = { 2.h * task.attempt }
+    }
+
+    withLabel: 'process_nano' {
+        cpus = { min_cpus(4) * task.attempt }
+        memory = { 8.GB * task.attempt }
+        time = { 4.h * task.attempt }
+    }
+
+    withLabel: 'process_micro' {
+        cpus = { min_cpus(8) * task.attempt }
+        memory = { 16.GB * task.attempt }
+        time = { 8.h * task.attempt }
+    }
+
+    withLabel: 'process_only_mem_low' {
+        cpus = { 1 * task.attempt }
+        memory = { 60.GB * task.attempt }
+        time = { 20.h * task.attempt }
+    }
+
+    withLabel: 'process_only_mem_medium' {
+        cpus = { 1 * task.attempt }
+        memory = { 100.GB * task.attempt }
+        time = { 30.h * task.attempt }
+    }
+
+    withLabel: 'process_only_mem_high' {
+        cpus = { 1 * task.attempt }
+        memory = { 128.GB * task.attempt }
+        time = { 60.h * task.attempt }
+    }
+
+    withLabel: 'process_low' {
+        cpus = { min_cpus(10) * task.attempt }
+        memory = { 60.GB * task.attempt }
+        time = { 20.h * task.attempt }
+    }
+
+    withLabel: 'process_medium' {
+        cpus = { min_cpus(10) * task.attempt }
+        memory = { 100.GB * task.attempt }
+        time = { 30.h * task.attempt }
+    }
+
+    withLabel: 'process_high' {
+        cpus = { min_cpus(10) * task.attempt }
+        memory = { 128.GB * task.attempt }
+        time = { 60.h * task.attempt }
+    }
+
+    withLabel: 'process_higher' {
+        cpus = { min_cpus(10) * task.attempt }
+        memory = { 256.GB * task.attempt }
+        time = { 60.h * task.attempt }
+    }
+
+    withLabel: 'process_gigantic' {
+        cpus = { min_cpus(10) * task.attempt }
+        memory = { 512.GB * task.attempt }
+        time = { 60.h * task.attempt }
+    }
+}
+
+if ( (params.input || params.metadata ) && params.pipeline ) {
+    try {
+        includeConfig "${params.workflowsconf}${params.fs}process${params.fs}${params.pipeline}.process.config"
+    } catch (Exception e) {
+        System.err.println('-'.multiply(params.linewidth) + "\n" +
+            "\033[0;31m${params.cfsanpipename} - ERROR\033[0m\n" +
+            '-'.multiply(params.linewidth) + "\n" + "\033[0;31mCould not load " +
+            "default pipeline's process configuration. Please provide a pipeline \n" +
+            "name using the --pipeline option.\n\033[0m" + '-'.multiply(params.linewidth) + "\n")
+        System.exit(1)
+    }
+}
+
+// Function will return after sleeping for some time.
+// Sleep time increases exponentially by task attempt.
+def dynamic_retry(task_retry_num, factor_by) {
+    // sleep(Math.pow(2, task_retry_num.toInteger()) * factor_by.toInteger() as long)
+    sleep(Math.pow(1.27, task_retry_num.toInteger()) as long)
+    return 'retry'
+}
+
+// Function that caps the requested number of CPU
+// cores at the user-defined maximum (params.max_cpus).
+def min_cpus(cores) {
+    return Math.min(cores as int, "${params.max_cpus}" as int)
+}
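Note on the resource labels above: a small Python sketch of how the `cpus` closures appear to evaluate, assuming `max_cpus = 10` as set in `base.config`. The function `cpus_for` is a hypothetical name used only for illustration.

```python
def min_cpus(cores, max_cpus=10):
    """Cap the requested core count at max_cpus (mirrors the config helper)."""
    return min(int(cores), int(max_cpus))


def cpus_for(label_cores, attempt, max_cpus=10):
    """Cores requested for a label on a given task attempt."""
    # The cap is applied first, then the result is scaled by the attempt.
    return min_cpus(label_cores, max_cpus) * attempt


for attempt in (1, 2):
    print(attempt, cpus_for(8, attempt))
```

Because the cap is applied before multiplying by `task.attempt`, a retried task can request more cores than `max_cpus` (16 for the 8-core label on the second attempt in this sketch).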
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/conf/multiqc/nowayout_mqc.yml	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,65 @@
+title: CPIPES Report
+intro_text: >
+    CPIPES (CFSAN PIPELINES) is a modular bioinformatics data analysis project at CFSAN, FDA based on NEXTFLOW DSL2.
+report_comment: >
+    This report has been generated by the <a href="https://github.com/CFSAN-Biostatistics/sequoia/blob/master/readme/Workflow_Name_Placeholder.md" target="_blank">CPIPES - Workflow_Name_Placeholder</a>
+    analysis pipeline. Only certain tables and plots are reported here. For complete results, please refer to the analysis pipeline output directory.
+report_header_info:
+    - CPIPES Version: CPIPES_Version_Placeholder
+    - Workflow: Workflow_Name_Placeholder
+    - Workflow Version: Workflow_Version_Placeholder
+    - Conceived By: "Kranti Konganti"
+    - Input Directory: Workflow_Input_Placeholder
+    - Output Directory: Workflow_Output_Placeholder
+show_analysis_paths: False
+show_analysis_time: False
+disable_version_detection: true
+report_section_order:
+    kraken:
+        order: -994
+    NOWAYOUT_collated_table:
+        order: -995
+    NOWAYOUT_INDIV_READS_MAPPED_collated_table:
+        order: -996
+    fastp:
+        order: -997
+    fastqc:
+        order: -998
+    software_versions:
+        order: -999
+
+export_plots: true
+
+# Run only these modules
+run_modules:
+    - fastqc
+    - fastp
+    - kraken
+    - custom_content
+
+module_order:
+    - kraken:
+          name: "SOURMASH TAX METAGENOME"
+          href: "https://sourmash.readthedocs.io/en/latest/command-line.html#sourmash-tax-metagenome-summarize-metagenome-content-from-gather-results"
+          doi: "10.21105/joss.00027"
+          info: >
+              section of the report shows how <b>reads</b> are approximately classified.
+              Please note that the plot title below is shown as
+              <b>Kraken2: Top taxa</b> since <code>kreport</code> format was used
+              to create Kraken-style reports with <code>sourmash tax metagenome</code>.
+          path_filters:
+              - "*.kreport.txt"
+    - fastqc:
+          name: "FastQC"
+          info: >
+              section of the report shows FastQC results <b>before</b> adapter trimming
+              on SE reads or on merged PE reads.
+          path_filters:
+              - "*_fastqc.zip"
+    - fastp:
+          name: "fastp"
+          info: >
+              section of the report shows read statistics <b>before</b> and <b>after</b> adapter trimming
+              with <code>fastp</code> on SE reads or on merged PE reads.
+          path_filters:
+              - "*.fastp.json"
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/cpipes	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,56 @@
+#!/usr/bin/env nextflow
+
+/*
+----------------------------------------------------------------------------------------
+    cfsan-dev/cpipes
+----------------------------------------------------------------------------------------
+    NAME          : CPIPES
+    DESCRIPTION   : Modular Nextflow pipelines at CFSAN, FDA.
+    GITLAB        : https://xxxxxxxxxx/Kranti.Konganti/cpipes-framework
+    JIRA          : https://xxxxxxxxxx/jira/projects/CPIPES/
+    CONTRIBUTORS  : Kranti Konganti
+----------------------------------------------------------------------------------------
+*/
+
+// Enable DSL 2
+nextflow.enable.dsl = 2
+
+// Default routines for MAIN
+include { pipelineBanner; stopNow; } from "${params.routines}"
+
+// Our banner for CPIPES
+log.info pipelineBanner()
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    INCLUDE ALL WORKFLOWS
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+switch ("${params.pipeline}") {
+    case "nowayout":
+        include { NOWAYOUT } from "${params.workflows}${params.fs}${params.pipeline}"
+        break
+    default:
+        stopNow("PLEASE MENTION A PIPELINE NAME. Ex: --pipeline nowayout")
+}
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    RUN ALL WORKFLOWS
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+workflow {
+    switch ("${params.pipeline}") {
+        case "nowayout":
+            NOWAYOUT()
+            break
+    }
+}
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    THE END
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/dbcheck	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,129 @@
+#!/usr/bin/env bash
+
+##########################################################
+# Constants
+##########################################################
+GREEN=$(tput setaf 2)
+RED=$(tput setaf 1)
+CYAN=$(tput setaf 6)
+CLRESET=$(tput sgr0)
+prog_name="nowayout"
+dbBuild="03062025"
+SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
+dbPath="${SCRIPT_DIR}/assets/dbfiles"
+taxonomyPath="$dbPath/taxonomy"
+
+usage()
+{
+    echo
+    echo usage: "$0" [-h]
+    echo
+    echo "Check for species presence in ${prog_name} database(s)."
+    echo
+    echo 'Example usage:'
+    echo
+    echo 'dbcheck -l'
+    echo 'dbcheck -g Cathartus'
+    echo 'dbcheck -d mitomine -g Cathartus'
+    echo 'dbcheck -d mitomine -s "Cathartus quadriculus"'
+    echo
+    echo 'Options:'
+    echo " -l        : List ${prog_name} databases"
+    echo ' -d        : Search this database. Default: mitomine2.'
+    echo ' -g        : Genus to search for.'
+    echo ' -s        : "Genus Species" to search for.'
+    echo ' -h        : Show this help message and exit'
+    echo
+    echo "$1"
+}
+
+while getopts ":d:g:s:l" OPT; do
+    case "${OPT}" in
+        l)
+            listdb="list"
+            ;;
+        d)
+            dbname=${OPTARG}
+            ;;
+        g)
+            genus=${OPTARG}
+            ;;
+        s)
+            species=${OPTARG}
+            ;;
+        ?)
+            usage
+            exit 0
+            ;;
+    esac
+done
+
+
+
+if [ -n "$listdb" ]; then
+    num_dbs=$(find -L "$taxonomyPath" -type d | tail -n+2 | wc -l)
+    echo "=============================================="
+
+    db_num="1"
+    find -L "$taxonomyPath" -type d | tail -n+2 | while read -r db; do
+        dbName=$(basename "$db")
+        echo "${db_num}. $dbName"
+        db_num=$(( db_num + 1 ))
+    done
+    echo "=============================================="
+    echo "Number of ${prog_name} databases: $num_dbs"
+    echo "=============================================="
+
+    exit 0
+fi
+
+
+
+if [ -z "$dbname" ]; then
+    dbname="mitomine2"
+fi
+
+if [[ -n "$genus" && -n "$species" ]]; then
+    usage "ERROR: Only one of -g or -s needs to be defined!"
+    exit 1
+elif [ -n "$genus" ]; then
+    check="$genus"
+elif [ -n "$species" ]; then
+    check="$species"
+else
+    check=""
+fi
+
+if [ -z "$check" ]; then
+    usage "ERROR: -g or -s is required!"
+    exit 1
+fi
+
+lineages="$taxonomyPath/$dbname/lineages.csv"
+
+echo
+echo -e "Checking ${dbname} for ${CYAN}${check}${CLRESET}...\nPlease wait..."
+echo
+
+num=$(grep -F ",$check," "$lineages" | cut -f1 -d, | sort -u | wc -l)
+num_species=$(tail -n+2 "$lineages" | cut -f8 -d, | sort -u | wc -l)
+num_entries=$(tail -n+2 "$lineages" | wc -l)
+
+echo "$dbname brief stats"
+echo "=============================================="
+echo "DB Build: $dbBuild"
+echo "Number of unique species: $num_species"
+echo "Number of accessions in database: $num_entries"
+echo "=============================================="
+
+
+if [ "$num" -gt 0 ]; then
+    echo
+    echo "${GREEN}$check is present in ${dbname}${CLRESET}."
+    echo "Number of accessions representing $check: $num"
+    echo "=============================================="
+else
+    echo "${RED}$check is absent in ${dbname}${CLRESET}."
+    echo -e "No worries. Please request the developer of\n${prog_name} to augment the database!"
+    echo "=============================================="
+fi
\ No newline at end of file
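Note on the lookup above: a minimal Python sketch of what the `grep -F ... | cut -f1 -d, | sort -u | wc -l` pipeline computes, namely the number of distinct first-column accessions among lines containing `,<query>,`. The function name `count_accessions` and the sample rows are hypothetical placeholders, not real database content.

```python
def count_accessions(lines, check):
    """Count distinct first-column IDs on lines containing ',<check>,'."""
    needle = f",{check},"
    return len({line.split(",", 1)[0] for line in lines if needle in line})


# Hypothetical lineage rows: accession followed by eight rank columns.
rows = [
    "acc1,d,p,c,o,f,GenusA,GenusA alpha,",
    "acc2,d,p,c,o,f,GenusA,GenusA beta,",
    "acc1,d,p,c,o,f,GenusA,GenusA alpha,",
    "acc3,d,p,c,o,f,GenusB,GenusB gamma,",
]
print(count_accessions(rows, "GenusA"))        # 2
print(count_accessions(rows, "GenusA alpha"))  # 1
```

The match is a fixed-string, comma-delimited comparison, so a query only counts when it equals an entire field that is followed by another column.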
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/fastp.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,280 @@
+// Help text for fastp within CPIPES.
+
+def fastpHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'fastp_run': [
+            clihelp: 'Run fastp tool. Default: ' +
+                (params.fastp_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'fastp_failed_out': [
+            clihelp: 'Specify whether to store reads that cannot pass the filters. ' +
+                "Default: ${params.fastp_failed_out}",
+            cliflag: null,
+            clivalue: null
+        ],
+        'fastp_merged_out': [
+            clihelp: 'Specify whether to store merged output or not. ' +
+                "Default: ${params.fastp_merged_out}",
+            cliflag: null,
+            clivalue: null
+        ],
+        'fastp_overlapped_out': [
+            clihelp: 'For each read pair, output the overlapped region if it has no mismatched base. ' +
+                "Default: ${params.fastp_overlapped_out}",
+            cliflag: '--overlapped_out',
+            clivalue: (params.fastp_overlapped_out ?: '')
+        ],
+        'fastp_6': [
+            clihelp: "Indicate that the input is using phred64 scoring (it'll be converted to phred33, " +
+                'so the output will still be phred33). ' +
+                "Default: ${params.fastp_6}",
+            cliflag: '-6',
+            clivalue: (params.fastp_6 ? ' ' : '')
+        ],
+        'fastp_reads_to_process': [
+            clihelp: 'Specify how many reads/pairs are to be processed. Default value 0 means ' +
+                'process all reads. ' +
+                "Default: ${params.fastp_reads_to_process}",
+            cliflag: '--reads_to_process',
+            clivalue: (params.fastp_reads_to_process ?: '')
+        ],
+        'fastp_fix_mgi_id': [
+            clihelp: 'The MGI FASTQ ID format is not compatible with many BAM operation tools, ' +
+                'enable this option to fix it. ' +
+                "Default: ${params.fastp_fix_mgi_id}",
+            cliflag: '--fix_mgi_id',
+            clivalue: (params.fastp_fix_mgi_id ? ' ' : '')
+        ],
+        'fastp_A': [
+            clihelp: 'Disable adapter trimming, which is otherwise enabled by default. ' +
+                "Default: ${params.fastp_A}",
+            cliflag: '-A',
+            clivalue: (params.fastp_A ? ' ' : '')
+        ],
+        'fastp_adapter_fasta': [
+            clihelp: 'Specify a FASTA file to trim both read1 and read2 (if PE) by all the sequences ' +
+                'in this FASTA file. ' +
+                "Default: ${params.fastp_adapter_fasta}",
+            cliflag: '--adapter_fasta',
+            clivalue: (params.fastp_adapter_fasta ?: '')
+        ],
+        'fastp_f': [
+            clihelp: 'Trim how many bases in front of read1. ' +
+                "Default: ${params.fastp_f}",
+            cliflag: '-f',
+            clivalue: (params.fastp_f ?: '')
+        ],
+        'fastp_t': [
+            clihelp: 'Trim how many bases at the end of read1. ' +
+                "Default: ${params.fastp_t}",
+            cliflag: '-t',
+            clivalue: (params.fastp_t ?: '')
+        ],
+        'fastp_b': [
+            clihelp: 'Max length of read1 after trimming. ' +
+                "Default: ${params.fastp_b}",
+            cliflag: '-b',
+            clivalue: (params.fastp_b ?: '')
+        ],
+        'fastp_F': [
+            clihelp: 'Trim how many bases in front of read2. ' +
+                "Default: ${params.fastp_F}",
+            cliflag: '-F',
+            clivalue: (params.fastp_F ?: '')
+        ],
+        'fastp_T': [
+            clihelp: 'Trim how many bases at the end of read2. ' +
+                "Default: ${params.fastp_T}",
+            cliflag: '-T',
+            clivalue: (params.fastp_T ?: '')
+        ],
+        'fastp_B': [
+            clihelp: 'Max length of read2 after trimming. ' +
+                "Default: ${params.fastp_B}",
+            cliflag: '-B',
+            clivalue: (params.fastp_B ?: '')
+        ],
+        'fastp_dedup': [
+            clihelp: 'Enable deduplication to drop the duplicated reads/pairs. ' +
+                "Default: ${params.fastp_dedup}",
+            cliflag: '--dedup',
+            clivalue: (params.fastp_dedup ? ' ' : '')
+        ],
+        'fastp_dup_calc_accuracy': [
+            clihelp: 'Accuracy level to calculate duplication (1~6), higher level uses more memory ' +
+                '(1G, 2G, 4G, 8G, 16G, 24G). Default 1 for no-dedup mode, and 3 for dedup mode. ' +
+                "Default: ${params.fastp_dup_calc_accuracy}",
+            cliflag: '--dup_calc_accuracy',
+            clivalue: (params.fastp_dup_calc_accuracy ?: '')
+        ],
+        'fastp_poly_g_min_len': [
+            clihelp: 'The minimum length to detect polyG in the read tail. ' +
+                "Default: ${params.fastp_poly_g_min_len}",
+            cliflag: '--poly_g_min_len',
+            clivalue: (params.fastp_poly_g_min_len ?: '')
+        ],
+        'fastp_G': [
+            clihelp: 'Disable polyG tail trimming. ' +
+                "Default: ${params.fastp_G}",
+            cliflag: '-G',
+            clivalue: (params.fastp_G ? ' ' : '')
+        ],
+        'fastp_x': [
+            clihelp: "Enable polyX trimming in 3' ends. " +
+                "Default: ${params.fastp_x}",
+            cliflag: '-x',
+            clivalue: (params.fastp_x ? ' ' : '')
+        ],
+        'fastp_poly_x_min_len': [
+            clihelp: 'The minimum length to detect polyX in the read tail. ' +
+                "Default: ${params.fastp_poly_x_min_len}",
+            cliflag: '--poly_x_min_len',
+            clivalue: (params.fastp_poly_x_min_len ?: '')
+        ],
+        'fastp_cut_front': [
+            clihelp: "Move a sliding window from front (5') to tail, drop the bases in the window " +
+                'if its mean quality < threshold, stop otherwise. ' +
+                "Default: ${params.fastp_cut_front}",
+            cliflag: '--cut_front',
+            clivalue: (params.fastp_cut_front ? ' ' : '')
+        ],
+        'fastp_cut_tail': [
+            clihelp: "Move a sliding window from tail (3') to front, drop the bases in the window " +
+                'if its mean quality < threshold, stop otherwise. ' +
+                "Default: ${params.fastp_cut_tail}",
+            cliflag: '--cut_tail',
+            clivalue: (params.fastp_cut_tail ? ' ' : '')
+        ],
+        'fastp_cut_right': [
+            clihelp: "Move a sliding window from front to tail, drop the bases in the window and the right part " +
+                'if its mean quality < threshold, and then stop. ' +
+                "Default: ${params.fastp_cut_right}",
+            cliflag: '--cut_right',
+            clivalue: (params.fastp_cut_right ? ' ' : '')
+        ],
+        'fastp_W': [
+            clihelp: "Sliding window size shared by --fastp_cut_front, --fastp_cut_tail and " +
+                '--fastp_cut_right. ' +
+                "Default: ${params.fastp_W}",
+            cliflag: '--cut_window_size',
+            clivalue: (params.fastp_W ?: '')
+        ],
+        'fastp_M': [
+            clihelp: "The mean quality requirement shared by --fastp_cut_front, --fastp_cut_tail and " +
+                '--fastp_cut_right. ' +
+                "Default: ${params.fastp_M}",
+            cliflag: '--cut_mean_quality',
+            clivalue: (params.fastp_M ?: '')
+        ],
+        'fastp_q': [
+            clihelp: 'The quality value below which a base is not qualified. ' +
+                "Default: ${params.fastp_q}",
+            cliflag: '-q',
+            clivalue: (params.fastp_q ?: '')
+        ],
+        'fastp_u': [
+            clihelp: 'What percent of bases are allowed to be unqualified. ' +
+                "Default: ${params.fastp_u}",
+            cliflag: '-u',
+            clivalue: (params.fastp_u ?: '')
+        ],
+        'fastp_n': [
+            clihelp: "How many N's can a read have. " +
+                "Default: ${params.fastp_n}",
+            cliflag: '-n',
+            clivalue: (params.fastp_n ?: '')
+        ],
+        'fastp_e': [
+            clihelp: "If a read's average quality is below this value, then it is discarded. " +
+                "Default: ${params.fastp_e}",
+            cliflag: '-e',
+            clivalue: (params.fastp_e ?: '')
+        ],
+        'fastp_l': [
+            clihelp: 'Reads shorter than this length will be discarded. ' +
+                "Default: ${params.fastp_l}",
+            cliflag: '-l',
+            clivalue: (params.fastp_l ?: '')
+        ],
+        'fastp_max_len': [
+            clihelp: 'Reads longer than this length will be discarded. ' +
+                "Default: ${params.fastp_max_len}",
+            cliflag: '--length_limit',
+            clivalue: (params.fastp_max_len ?: '')
+        ],
+        'fastp_y': [
+            clihelp: 'Enable low complexity filter. The complexity is defined as the percentage ' +
+                'of bases that are different from its next base (base[i] != base[i+1]). ' +
+                "Default: ${params.fastp_y}",
+            cliflag: '-y',
+            clivalue: (params.fastp_y ? ' ' : '')
+        ],
+        'fastp_Y': [
+            clihelp: 'The threshold for low complexity filter (0~100). Ex: A value of 30 means ' +
+                '30% complexity is required. ' +
+                "Default: ${params.fastp_Y}",
+            cliflag: '-Y',
+            clivalue: (params.fastp_Y ?: '')
+        ],
+        'fastp_U': [
+            clihelp: 'Enable Unique Molecular Identifier (UMI) pre-processing. ' +
+                "Default: ${params.fastp_U}",
+            cliflag: '-U',
+            clivalue: (params.fastp_U ? ' ' : '')
+        ],
+        'fastp_umi_loc': [
+            clihelp: 'Specify the location of UMI, can be one of ' +
+                'index1/index2/read1/read2/per_index/per_read. ' +
+                "Default: ${params.fastp_umi_loc}",
+            cliflag: '--umi_loc',
+            clivalue: (params.fastp_umi_loc ?: '')
+        ],
+        'fastp_umi_len': [
+            clihelp: 'If the UMI is in read1 or read2, its length should be provided. ' +
+                "Default: ${params.fastp_umi_len}",
+            cliflag: '--umi_len',
+            clivalue: (params.fastp_umi_len ?: '')
+        ],
+        'fastp_umi_prefix': [
+            clihelp: 'If specified, an underscore will be used to connect prefix and UMI ' +
+                '(i.e. prefix=UMI, UMI=AATTCG, final=UMI_AATTCG). ' +
+                "Default: ${params.fastp_umi_prefix}",
+            cliflag: '--umi_prefix',
+            clivalue: (params.fastp_umi_prefix ?: '')
+        ],
+        'fastp_umi_skip': [
+            clihelp: 'If the UMI is in read1 or read2, fastp can skip several bases following the UMI. ' +
+                "Default: ${params.fastp_umi_skip}",
+            cliflag: '--umi_skip',
+            clivalue: (params.fastp_umi_skip ?: '')
+        ],
+        'fastp_p': [
+            clihelp: 'Enable overrepresented sequence analysis. ' +
+                "Default: ${params.fastp_p}",
+            cliflag: '-p',
+            clivalue: (params.fastp_p ? ' ' : '')
+        ],
+        'fastp_P': [
+            clihelp: 'One in this many reads will be sampled for overrepresentation analysis ' +
+                '(1~10000), smaller is slower. ' +
+                "Default: ${params.fastp_P}",
+            cliflag: '-P',
+            clivalue: (params.fastp_P ?: '')
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
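The two `clivalue` conventions used in the help maps (a single space for enabled boolean switches, the value itself for valued options, and an empty string when unset) can be sketched as follows. This is a minimal illustration only: the `params` map and the `buildArgs` closure are hypothetical stand-ins, and the code that actually consumes `tool.helpparams` in CPIPES is not shown in this changeset.

```groovy
// Hypothetical stand-in for the pipeline's `params` object.
def params = [fastp_dedup: true, fastp_G: false, fastp_l: 35, fastp_adapter_fasta: null]

// Boolean switches yield ' ' or ''; valued options yield the value or ''.
def toolspecs = [
    'fastp_dedup': [cliflag: '--dedup', clivalue: (params.fastp_dedup ? ' ' : '')],
    'fastp_G': [cliflag: '-G', clivalue: (params.fastp_G ? ' ' : '')],
    'fastp_l': [cliflag: '-l', clivalue: (params.fastp_l ?: '')],
    'fastp_adapter_fasta': [cliflag: '--adapter_fasta', clivalue: (params.fastp_adapter_fasta ?: '')]
]

// Hypothetical helper (not part of CPIPES): keep only entries whose
// clivalue is non-empty, then join flag and value into one string.
def buildArgs = { Map specs ->
    specs.values()
        .findAll { it.cliflag != null && it.clivalue.toString() != '' }
        .collect { (it.cliflag + ' ' + it.clivalue).trim() }
        .join(' ')
}

def cmdArgs = buildArgs(toolspecs)
println cmdArgs
```

Under these assumptions, an enabled switch contributes only its flag, a valued option contributes flag and value, and unset options contribute nothing.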
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/gsalkronapy.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,58 @@
+// Help text for `gen_salmon_tph_and_krona_tsv.py` (gsalkronapy) within CPIPES.
+
+def gsalkronapyHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'gsalkronapy_run': [
+            clihelp: 'Run the `gen_salmon_tph_and_krona_tsv.py` script. Default: ' +
+                (params.gsalkronapy_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'gsalkronapy_sf': [
+            clihelp: 'Set the scaling factor by which TPM values ' +
+                'are scaled down.' +
+                " Default: ${params.gsalkronapy_sf}",
+            cliflag: '-sf',
+            clivalue: (params.gsalkronapy_sf ?: '')
+        ],
+        'gsalkronapy_smres_suffix': [
+            clihelp: 'Find the `sourmash gather` result files ' +
+                'ending in this suffix.' +
+                " Default: ${params.gsalkronapy_smres_suffix}",
+            cliflag: '-smres-suffix',
+            clivalue: (params.gsalkronapy_smres_suffix ?: '')
+        ],
+        'gsalkronapy_failed_suffix': [
+            clihelp: 'Find the sample names which failed classification stored ' +
+                'inside the files ending in this suffix.' +
+                " Default: ${params.gsalkronapy_failed_suffix}",
+            cliflag: '-failed-suffix',
+            clivalue: (params.gsalkronapy_failed_suffix ?: '')
+        ],
+        'gsalkronapy_num_lin_cols': [
+            clihelp: 'Number of columns expected in the lineages CSV file. ' +
+                "Default: ${params.gsalkronapy_num_lin_cols}",
+            cliflag: '-num-lin-cols',
+            clivalue: (params.gsalkronapy_num_lin_cols ?: '')
+        ],
+        'gsalkronapy_lin_regex': [ // NOTE: the body below repeats 'gsalkronapy_num_lin_cols' (help text, flag and value).
+            clihelp: 'Number of columns expected in the lineages CSV file. ' +
+                " Default: ${params.gsalkronapy_num_lin_cols}",
+            cliflag: '-num-lin-cols',
+            clivalue: (params.gsalkronapy_num_lin_cols ?: '')
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
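Because each help file repeats the same entry shape many times, copy-paste slips (two keys mapped to the same `cliflag`) are easy to miss. A small check along the following lines could flag them. It is a sketch only: `findDuplicateFlags` is a hypothetical helper, and the map is trimmed to the fields the check needs.

```groovy
// Hypothetical helper: return every cliflag that is assigned to more than
// one toolspecs key, together with the keys that share it.
def findDuplicateFlags = { Map specs ->
    specs.findAll { k, v -> v.cliflag != null }
        .groupBy { k, v -> v.cliflag }
        .findAll { flag, entries -> entries.size() > 1 }
        .collectEntries { flag, entries -> [(flag): entries.keySet().toList()] }
}

// Reduced copy of the entries above, keeping only the cliflag field.
def toolspecs = [
    'gsalkronapy_run': [cliflag: null],
    'gsalkronapy_sf': [cliflag: '-sf'],
    'gsalkronapy_num_lin_cols': [cliflag: '-num-lin-cols'],
    'gsalkronapy_lin_regex': [cliflag: '-num-lin-cols']
]

def dups = findDuplicateFlags(toolspecs)
println dups
```

Entries with a `null` flag (the `*_run` switches) are skipped, since they are not passed to the underlying tool.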
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/gsatpy.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,32 @@
+// Help text for gen_sim_abn_table.py (gsat) within CPIPES.
+
+def gsatpyHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'gsatpy_run': [
+            clihelp: 'Run the gen_sim_abn_table.py script. Default: ' +
+                (params.gsatpy_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'gsatpy_header': [
+            clihelp: 'Do the taxonomic summary result files have ' +
+                'a header line? ' +
+                "Default: ${params.gsatpy_header}",
+            cliflag: '-header',
+            clivalue: (params.gsatpy_header ? ' ' : '')
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/kmaalign.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,200 @@
+// Help text for kma align within CPIPES.
+
+def kmaalignHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'kmaalign_run': [
+            clihelp: 'Run kma tool. Default: ' +
+                (params.kmaalign_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'kmaalign_int': [
+            clihelp: 'Input file has interleaved reads. ' +
+                "Default: ${params.kmaalign_int}",
+            cliflag: '-int',
+            clivalue: (params.kmaalign_int ? ' ' : '')
+        ],
+        'kmaalign_ef': [
+            clihelp: 'Output additional features. ' +
+                "Default: ${params.kmaalign_ef}",
+            cliflag: '-ef',
+            clivalue: (params.kmaalign_ef ? ' ' : '')
+        ],
+        'kmaalign_vcf': [
+            clihelp: 'Output vcf file. 2 to apply FT. ' +
+                "Default: ${params.kmaalign_vcf}",
+            cliflag: '-vcf',
+            clivalue: (params.kmaalign_vcf ? ' ' : '')
+        ],
+        'kmaalign_sam': [
+            clihelp: 'Output SAM, 4/2096 for mapped/aligned. ' +
+                "Default: ${params.kmaalign_sam}",
+            cliflag: '-sam',
+            clivalue: (params.kmaalign_sam ? ' ' : '')
+        ],
+        'kmaalign_nc': [
+            clihelp: 'No consensus file. ' +
+                "Default: ${params.kmaalign_nc}",
+            cliflag: '-nc',
+            clivalue: (params.kmaalign_nc ? ' ' : '')
+        ],
+        'kmaalign_na': [
+            clihelp: 'No aln file. ' +
+                "Default: ${params.kmaalign_na}",
+            cliflag: '-na',
+            clivalue: (params.kmaalign_na ? ' ' : '')
+        ],
+        'kmaalign_nf': [
+            clihelp: 'No frag file. ' +
+                "Default: ${params.kmaalign_nf}",
+            cliflag: '-nf',
+            clivalue: (params.kmaalign_nf ? ' ' : '')
+        ],
+        'kmaalign_a': [
+            clihelp: 'Output all template mappings. ' +
+                "Default: ${params.kmaalign_a}",
+            cliflag: '-a',
+            clivalue: (params.kmaalign_a ? ' ' : '')
+        ],
+        'kmaalign_and': [
+            clihelp: 'Use both -mrs and p-value on consensus. ' +
+                "Default: ${params.kmaalign_and}",
+            cliflag: '-and',
+            clivalue: (params.kmaalign_and ? ' ' : '')
+        ],
+        'kmaalign_oa': [
+            clihelp: 'Use neither -mrs nor p-value on consensus. ' +
+                "Default: ${params.kmaalign_oa}",
+            cliflag: '-oa',
+            clivalue: (params.kmaalign_oa ? ' ' : '')
+        ],
+        'kmaalign_bc': [
+            clihelp: 'Minimum support to call bases. ' +
+                "Default: ${params.kmaalign_bc}",
+            cliflag: '-bc',
+            clivalue: (params.kmaalign_bc ?: '')
+        ],
+        'kmaalign_bcNano': [
+            clihelp: 'Altered indel calling for ONT data. ' +
+                "Default: ${params.kmaalign_bcNano}",
+            cliflag: '-bcNano',
+            clivalue: (params.kmaalign_bcNano ? ' ' : '')
+        ],
+        'kmaalign_bcd': [
+            clihelp: 'Minimum depth to call bases. ' +
+                "Default: ${params.kmaalign_bcd}",
+            cliflag: '-bcd',
+            clivalue: (params.kmaalign_bcd ?: '')
+        ],
+        'kmaalign_bcg': [
+            clihelp: 'Maintain insignificant gaps. ' +
+                "Default: ${params.kmaalign_bcg}",
+            cliflag: '-bcg',
+            clivalue: (params.kmaalign_bcg ? ' ' : '')
+        ],
+        'kmaalign_ID': [
+            clihelp: 'Minimum consensus ID. ' +
+                "Default: ${params.kmaalign_ID}",
+            cliflag: '-ID',
+            clivalue: (params.kmaalign_ID ?: '')
+        ],
+        'kmaalign_md': [
+            clihelp: 'Minimum depth. ' +
+                "Default: ${params.kmaalign_md}",
+            cliflag: '-md',
+            clivalue: (params.kmaalign_md ?: '')
+        ],
+        'kmaalign_dense': [
+            clihelp: 'Skip insertion in consensus. ' +
+                "Default: ${params.kmaalign_dense}",
+            cliflag: '-dense',
+            clivalue: (params.kmaalign_dense ? ' ' : '')
+        ],
+        'kmaalign_ref_fsa': [
+            clihelp: 'Use Ns on indels. ' +
+                "Default: ${params.kmaalign_ref_fsa}",
+            cliflag: '-ref_fsa',
+            clivalue: (params.kmaalign_ref_fsa ? ' ' : '')
+        ],
+        'kmaalign_Mt1': [
+            clihelp: 'Map everything to one template. ' +
+                "Default: ${params.kmaalign_Mt1}",
+            cliflag: '-Mt1',
+            clivalue: (params.kmaalign_Mt1 ? ' ' : '')
+        ],
+        'kmaalign_1t1': [
+            clihelp: 'Map one query to one template. ' +
+                "Default: ${params.kmaalign_1t1}",
+            cliflag: '-1t1',
+            clivalue: (params.kmaalign_1t1 ? ' ' : '')
+        ],
+        'kmaalign_mrs': [
+            clihelp: 'Minimum relative alignment score. ' +
+                "Default: ${params.kmaalign_mrs}",
+            cliflag: '-mrs',
+            clivalue: (params.kmaalign_mrs ?: '')
+        ],
+        'kmaalign_mrc': [
+            clihelp: 'Minimum query coverage. ' +
+                "Default: ${params.kmaalign_mrc}",
+            cliflag: '-mrc',
+            clivalue: (params.kmaalign_mrc ?: '')
+        ],
+        'kmaalign_mp': [
+            clihelp: 'Minimum phred score of trailing and leading bases. ' +
+                "Default: ${params.kmaalign_mp}",
+            cliflag: '-mp',
+            clivalue: (params.kmaalign_mp ?: '')
+        ],
+        'kmaalign_mq': [
+            clihelp: 'Set the minimum mapping quality. ' +
+                "Default: ${params.kmaalign_mq}",
+            cliflag: '-mq',
+            clivalue: (params.kmaalign_mq ?: '')
+        ],
+        'kmaalign_eq': [
+            clihelp: 'Minimum average quality score. ' +
+                "Default: ${params.kmaalign_eq}",
+            cliflag: '-eq',
+            clivalue: (params.kmaalign_eq ?: '')
+        ],
+        'kmaalign_5p': [
+            clihelp: 'Trim 5 prime by this many bases. ' +
+                "Default: ${params.kmaalign_5p}",
+            cliflag: '-5p',
+            clivalue: (params.kmaalign_5p ?: '')
+        ],
+        'kmaalign_3p': [
+            clihelp: 'Trim 3 prime by this many bases. ' +
+                "Default: ${params.kmaalign_3p}",
+            cliflag: '-3p',
+            clivalue: (params.kmaalign_3p ?: '')
+        ],
+        'kmaalign_apm': [
+            clihelp: 'Sets both -pm and -fpm. ' +
+                "Default: ${params.kmaalign_apm}",
+            cliflag: '-apm',
+            clivalue: (params.kmaalign_apm ?: '')
+        ],
+        'kmaalign_cge': [
+            clihelp: 'Set CGE penalties and rewards. ' +
+                "Default: ${params.kmaalign_cge}",
+            cliflag: '-cge',
+            clivalue: (params.kmaalign_cge ? ' ' : '')
+        ],
+
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
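One property of the `params.x ?: ''` idiom used for valued options is worth keeping in mind: under Groovy truth, `0` is falsy, so an explicit zero collapses to the empty string just like an unset parameter. The sketch below demonstrates this with hypothetical local variables. Whether it matters depends on the tool: for an option whose own default is already `0` it is harmless.

```groovy
// A numeric option explicitly set to zero (hypothetical value).
def zero = 0

// Elvis operator: falls back to '' because 0 is falsy under Groovy truth.
def viaElvis = (zero ?: '')

// Explicit null check: keeps 0 and only falls back when the value is unset.
def explicit = (zero != null ? zero : '')

println "elvis='${viaElvis}' explicit='${explicit}'"
```

The explicit null check is one possible alternative when a literal zero must reach the command line. It is shown here as an option, not as what CPIPES does.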
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/kraken2.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,72 @@
+// Help text for kraken2 within CPIPES.
+
+def kraken2Help(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'kraken2_db': [
+            clihelp: "Absolute path to kraken database. Default: ${params.kraken2_db}",
+            cliflag: '--db',
+            clivalue: null
+        ],
+        'kraken2_confidence': [
+            clihelp: 'Confidence score threshold which must be ' +
+                "between 0 and 1. Default: ${params.kraken2_confidence}",
+            cliflag: '--confidence',
+            clivalue: (params.kraken2_confidence ?: '')
+        ],
+        'kraken2_quick': [
+            clihelp: "Quick operation (use first hit or hits). Default: ${params.kraken2_quick}",
+            cliflag: '--quick',
+            clivalue: (params.kraken2_quick ? ' ' : '')
+        ],
+        'kraken2_use_mpa_style': [
+            clihelp: "Report output like Kraken 1's " +
+                "kraken-mpa-report. Default: ${params.kraken2_use_mpa_style}",
+            cliflag: '--use-mpa-style',
+            clivalue: (params.kraken2_use_mpa_style ? ' ' : '')
+        ],
+        'kraken2_minimum_base_quality': [
+            clihelp: 'Minimum base quality used in classification ' +
+                "which is only effective with FASTQ input. Default: ${params.kraken2_minimum_base_quality}",
+            cliflag: '--minimum-base-quality',
+            clivalue: (params.kraken2_minimum_base_quality ?: '')
+        ],
+        'kraken2_report_zero_counts': [
+            clihelp: 'Report counts for ALL taxa, even if counts are zero. ' +
+                "Default: ${params.kraken2_report_zero_counts}",
+            cliflag: '--report-zero-counts',
+            clivalue: (params.kraken2_report_zero_counts ? ' ' : '')
+        ],
+        'kraken2_report_minimizer_data': [
+            clihelp: 'Report minimizer and distinct minimizer count' +
+                ' information in addition to normal Kraken report. ' +
+                "Default: ${params.kraken2_report_minimizer_data}",
+            cliflag: '--report-minimizer-data',
+            clivalue: (params.kraken2_report_minimizer_data ? ' ' : '')
+        ],
+        'kraken2_use_names': [
+            clihelp: 'Print scientific names instead of just taxids. ' +
+                "Default: ${params.kraken2_use_names}",
+            cliflag: '--use-names',
+            clivalue: (params.kraken2_use_names ? ' ' : '')
+        ],
+        'kraken2_extract_bug': [
+            clihelp: 'Extract the reads or contigs belonging to this bug. ' +
+                "Default: ${params.kraken2_extract_bug}",
+            cliflag: null,
+            clivalue: null
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/kronaktimporttext.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,44 @@
+// Help text for ktImportText (krona) within CPIPES.
+
+def kronaktimporttextHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'krona_ktIT_run': [
+            clihelp: 'Run the ktImportText (ktIT) from krona. Default: ' +
+                (params.krona_ktIT_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'krona_ktIT_n': [
+            clihelp: 'Name of the highest level. ' +
+                "Default: ${params.krona_ktIT_n}",
+            cliflag: '-n',
+            clivalue: (params.krona_ktIT_n ?: '')
+        ],
+        'krona_ktIT_q': [
+            clihelp: 'Input file(s) do not have a field for quantity. ' +
+                "Default: ${params.krona_ktIT_q}",
+            cliflag: '-q',
+            clivalue: (params.krona_ktIT_q ? ' ' : '')
+        ],
+        'krona_ktIT_c': [
+            clihelp: 'Combine data from each file, rather than creating separate datasets ' +
+                'within the chart. ' +
+                "Default: ${params.krona_ktIT_c}",
+            cliflag: '-c',
+            clivalue: (params.krona_ktIT_c ? ' ' : '')
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/salmonidx.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,91 @@
+// Help text for salmon index within CPIPES.
+
+def salmonidxHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'salmonidx_run': [
+            clihelp: 'Run `salmon index` tool. Default: ' +
+                (params.salmonidx_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'salmonidx_k': [
+            clihelp: 'The size of k-mers that should be used for the ' +
+                "quasi index. Default: ${params.salmonidx_k}",
+            cliflag: '-k',
+            clivalue: (params.salmonidx_k ?: '')
+        ],
+        'salmonidx_gencode': [
+            clihelp: 'This flag will expect the input transcript FASTA ' +
+                'to be in GENCODE format, and will split the transcript ' +
+                'name at the first `|` character. These reduced names ' +
+                'will be used in the output and when looking for these ' +
+                'transcripts in a gene to transcript GTF.' +
+                " Default: ${params.salmonidx_gencode}",
+            cliflag: '--gencode',
+            clivalue: (params.salmonidx_gencode ? ' ' : '')
+        ],
+        'salmonidx_features': [
+            clihelp: 'This flag will expect the input reference to be in the ' +
+                'tsv file format, and will split the feature name at the first ' +
+                '`tab` character. These reduced names will be used in the output ' +
+                'and when looking for the sequence of the features.' +
+                " Default: ${params.salmonidx_features}",
+            cliflag: '--features',
+            clivalue: (params.salmonidx_features ? ' ' : '')
+        ],
+        'salmonidx_keepDuplicates': [
+            clihelp: 'This flag will disable the default indexing behavior of ' +
+                'discarding sequence-identical duplicate transcripts. If this ' +
+                'flag is passed then duplicate transcripts that appear in the ' +
+                'input will be retained and quantified separately.' +
+                " Default: ${params.salmonidx_keepDuplicates}",
+            cliflag: '--keepDuplicates',
+            clivalue: (params.salmonidx_keepDuplicates ? ' ' : '')
+        ],
+        'salmonidx_keepFixedFasta': [
+            clihelp: 'Retain the fixed fasta file (without short ' +
+                'transcripts and duplicates, clipped, etc.) generated ' +
+                "during indexing. Default: ${params.salmonidx_keepFixedFasta}",
+            cliflag: '--keepFixedFasta',
+            clivalue: (params.salmonidx_keepFixedFasta ? ' ' : '')
+        ],
+        'salmonidx_filterSize': [
+            clihelp: 'The size of the Bloom filter that will be used ' +
+                'by TwoPaCo during indexing. The filter will be of ' +
+                'size 2^{filterSize}. A value of -1 means that the ' +
+                'filter size will be automatically set based on the ' +
+                'number of distinct k-mers in the input, as estimated by ' +
+                "nthll. Default: ${params.salmonidx_filterSize}",
+            cliflag: '--filterSize',
+            clivalue: (params.salmonidx_filterSize ?: '')
+        ],
+        'salmonidx_sparse': [
+            clihelp: 'Build the index using a sparse sampling of k-mer ' +
+                'positions. This will require less memory (especially ' +
+                'during quantification), but will take longer to construct ' +
+                'and can slow down mapping / alignment.' +
+                " Default: ${params.salmonidx_sparse}",
+            cliflag: '--sparse',
+            clivalue: (params.salmonidx_sparse ? ' ' : '')
+        ],
+        'salmonidx_n': [
+            clihelp: 'Do not clip poly-A tails from the ends of target ' +
+                "sequences. Default: ${params.salmonidx_n}",
+            cliflag: '-n',
+            clivalue: (params.salmonidx_n ? ' ' : '')
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/seqkitgrep.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,75 @@
+// Help text for seqkit `grep` within CPIPES.
+
+def seqkitgrepHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'seqkit_grep_run': [
+            clihelp: 'Run the seqkit `grep` tool. Default: ' +
+                (params.seqkit_grep_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'seqkit_grep_n': [
+            clihelp: 'Match by full name instead of just ID. ' +
+                "Default: " + (params.seqkit_grep_n ?: 'undefined'),
+            cliflag: '--seqkit_grep_n',
+            clivalue: (params.seqkit_grep_n ? ' ' : '')
+        ],
+        'seqkit_grep_s': [
+            clihelp: 'Search subseq on seq, both positive and negative ' +
+                'strands are searched, and mismatches are allowed using the --seqkit_grep_m flag. ' +
+                "Default: " + (params.seqkit_grep_s ?: 'undefined'),
+            cliflag: '--seqkit_grep_s',
+            clivalue: (params.seqkit_grep_s ? ' ' : '')
+        ],
+        'seqkit_grep_c': [
+            clihelp: 'Input is a circular genome. ' +
+                "Default: " + (params.seqkit_grep_c ?: 'undefined'),
+            cliflag: '--seqkit_grep_c',
+            clivalue: (params.seqkit_grep_c ? ' ' : '')
+        ],
+        'seqkit_grep_C': [
+            clihelp: 'Just print a count of matching records. With the ' +
+                '--seqkit_grep_v flag, count non-matching records. ' +
+                "Default: " + (params.seqkit_grep_C ?: 'undefined'),
+            cliflag: '--seqkit_grep_C',
+            clivalue: (params.seqkit_grep_C ? ' ' : '')
+        ],
+        'seqkit_grep_i': [
+            clihelp: 'Ignore case while using seqkit grep. ' +
+                "Default: " + (params.seqkit_grep_i ?: 'undefined'),
+            cliflag: '--seqkit_grep_i',
+            clivalue: (params.seqkit_grep_i ? ' ' : '')
+        ],
+        'seqkit_grep_v': [
+            clihelp: 'Invert the match i.e. select non-matching records. ' +
+                "Default: " + (params.seqkit_grep_v ?: 'undefined'),
+            cliflag: '--seqkit_grep_v',
+            clivalue: (params.seqkit_grep_v ? ' ' : '')
+        ],
+        'seqkit_grep_m': [
+            clihelp: 'Maximum mismatches when matching by sequence. ' +
+                "Default: " + (params.seqkit_grep_m ?: 'undefined'),
+            cliflag: '--seqkit_grep_m',
+            clivalue: (params.seqkit_grep_m ?: '')
+        ],
+        'seqkit_grep_r': [
+            clihelp: 'Input patterns are regular expressions. ' +
+                "Default: " + (params.seqkit_grep_r ?: 'undefined'),
+            cliflag: '--seqkit_grep_r',
+            clivalue: (params.seqkit_grep_r ? ' ' : '')
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/seqkitrmdup.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,61 @@
+// Help text for seqkit rmdup within CPIPES.
+
+def seqkitrmdupHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'seqkit_rmdup_run': [
+            clihelp: 'Remove duplicate sequences using seqkit rmdup. Default: ' +
+                (params.seqkit_rmdup_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'seqkit_rmdup_n': [
+            clihelp: 'Match and remove duplicate sequences by full name instead of just ID. ' +
+                "Default: ${params.seqkit_rmdup_n}",
+            cliflag: '-n',
+            clivalue: (params.seqkit_rmdup_n ? ' ' : '')
+        ],
+        'seqkit_rmdup_s': [
+            clihelp: 'Match and remove duplicate sequences by sequence content. ' +
+                "Default: ${params.seqkit_rmdup_s}",
+            cliflag: '-s',
+            clivalue: (params.seqkit_rmdup_s ? ' ' : '')
+        ],
+        'seqkit_rmdup_d': [
+            clihelp: 'Save the duplicated sequences to a file. ' +
+                "Default: ${params.seqkit_rmdup_d}",
+            cliflag: null,
+            clivalue: null
+        ],
+        'seqkit_rmdup_D': [
+            clihelp: 'Save the number and list of duplicated sequences to a file. ' +
+                "Default: ${params.seqkit_rmdup_D}",
+            cliflag: null,
+            clivalue: null
+        ],
+        'seqkit_rmdup_i': [
+            clihelp: 'Ignore case while using seqkit rmdup. ' +
+                "Default: ${params.seqkit_rmdup_i}",
+            cliflag: '-i',
+            clivalue: (params.seqkit_rmdup_i ? ' ' : '')
+        ],
+        'seqkit_rmdup_P': [
+            clihelp: "Only consider positive strand (i.e. 5') when comparing by sequence content. " +
+                "Default: ${params.seqkit_rmdup_P}",
+            cliflag: '-P',
+            clivalue: (params.seqkit_rmdup_P ? ' ' : '')
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/sfhpy.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,58 @@
+// Help text for sourmash_filter_hits.py (sfhpy) within CPIPES.
+def sfhpyHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'sfhpy_run': [
+            clihelp: 'Run the sourmash_filter_hits.py ' +
+                'script. Default: ' +
+                (params.sfhpy_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'sfhpy_fcn': [
+            clihelp: 'Column name by which filtering of rows should be applied. ' +
+                "Default: ${params.sfhpy_fcn}",
+            cliflag: '-fcn',
+            clivalue: (params.sfhpy_fcn ?: '')
+        ],
+        'sfhpy_fcv': [
+            clihelp: 'Remove genomes whose match with the query FASTQ is less than ' +
+                'this much. ' +
+                "Default: ${params.sfhpy_fcv}",
+            cliflag: '-fcv',
+            clivalue: (params.sfhpy_fcv ?: '')
+        ],
+        'sfhpy_gt': [
+            clihelp: 'Apply greater than or equal to condition on numeric values of ' +
+                '--sfhpy_fcn column. ' +
+                "Default: ${params.sfhpy_gt}",
+            cliflag: '-gt',
+            clivalue: (params.sfhpy_gt ? ' ' : '')
+        ],
+        'sfhpy_lt': [
+            clihelp: 'Apply less than or equal to condition on numeric values of ' +
+                '--sfhpy_fcn column. ' +
+                "Default: ${params.sfhpy_lt}",
+            cliflag: '-lt',
+            clivalue: (params.sfhpy_lt ? ' ' : '')
+        ],
+        'sfhpy_all': [
+            clihelp: 'Instead of just the column value, print entire row. ' +
+                "Default: ${params.sfhpy_all}",
+            cliflag: '-all',
+            clivalue: (params.sfhpy_all ? ' ' : '')
+        ],
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/sourmashgather.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,86 @@
+// Help text for sourmash gather within CPIPES.
+
+def sourmashgatherHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'sourmashgather_run': [
+            clihelp: 'Run `sourmash gather` tool. Default: ' +
+                (params.sourmashgather_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'sourmashgather_n': [
+            clihelp: 'Number of results to report. ' +
+                'By default, will terminate at --sourmashgather_thr_bp value. ' +
+                "Default: ${params.sourmashgather_n}",
+            cliflag: '-n',
+            clivalue: (params.sourmashgather_n ?: '')
+        ],
+        'sourmashgather_thr_bp': [
+            clihelp: 'Reporting threshold (in bp) for estimated overlap with remaining query. ' +
+                "Default: ${params.sourmashgather_thr_bp}",
+            cliflag: '--threshold-bp',
+            clivalue: (params.sourmashgather_thr_bp ?: '')
+        ],
+        'sourmashgather_ani_ci': [
+            clihelp: 'Output confidence intervals for ANI estimates. ' +
+                "Default: ${params.sourmashgather_ani_ci}",
+            cliflag: '--estimate-ani-ci',
+            clivalue: (params.sourmashgather_ani_ci ? ' ' : '')
+        ],
+        'sourmashgather_k': [
+            clihelp: 'The k-mer size to select. ' +
+                "Default: ${params.sourmashgather_k}",
+            cliflag: '-k',
+            clivalue: (params.sourmashgather_k ?: '')
+        ],
+        'sourmashgather_dna': [
+            clihelp: 'Choose DNA signature. ' +
+                "Default: ${params.sourmashgather_dna}",
+            cliflag: '--dna',
+            clivalue: (params.sourmashgather_dna ? ' ' : '')
+        ],
+        'sourmashgather_rna': [
+            clihelp: 'Choose RNA signature. ' +
+                "Default: ${params.sourmashgather_rna}",
+            cliflag: '--rna',
+            clivalue: (params.sourmashgather_rna ? ' ' : '')
+        ],
+        'sourmashgather_nuc': [
+            clihelp: 'Choose Nucleotide signature. ' +
+                "Default: ${params.sourmashgather_nuc}",
+            cliflag: '--nucleotide',
+            clivalue: (params.sourmashgather_nuc ? ' ' : '')
+        ],
+        'sourmashgather_scaled': [
+            clihelp: 'Scaled value should be between 100 and 1e6. ' +
+                "Default: ${params.sourmashgather_scaled}",
+            cliflag: '--scaled',
+            clivalue: (params.sourmashgather_scaled ?: '')
+        ],
+        'sourmashgather_inc_pat': [
+            clihelp: 'Search only signatures that match this pattern in name, filename, or md5. ' +
+                "Default: ${params.sourmashgather_inc_pat}",
+            cliflag: '--include-db-pattern',
+            clivalue: (params.sourmashgather_inc_pat ?: '')
+        ],
+        'sourmashgather_exc_pat': [
+            clihelp: 'Search only signatures that do not match this pattern in name, filename, or md5. ' +
+                "Default: ${params.sourmashgather_exc_pat}",
+            cliflag: '--exclude-db-pattern',
+            clivalue: (params.sourmashgather_exc_pat ?: '')
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/sourmashsearch.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,134 @@
+// Help text for sourmash search within CPIPES.
+
+def sourmashsearchHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'sourmashsearch_run': [
+            clihelp: 'Run `sourmash search` tool. Default: ' +
+                (params.sourmashsearch_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'sourmashsearch_n': [
+            clihelp: 'Number of results to report. ' +
+                'By default, will terminate at --sourmashsearch_thr value. ' +
+                "Default: ${params.sourmashsearch_n}",
+            cliflag: '-n',
+            clivalue: (params.sourmashsearch_n ?: '')
+        ],
+        'sourmashsearch_thr': [
+            clihelp: 'Reporting threshold (similarity) to return results. ' +
+                "Default: ${params.sourmashsearch_thr}",
+            cliflag: '--threshold',
+            clivalue: (params.sourmashsearch_thr ?: '')
+        ],
+        'sourmashsearch_contain': [
+            clihelp: 'Score based on containment rather than similarity. ' +
+                "Default: ${params.sourmashsearch_contain}",
+            cliflag: '--containment',
+            clivalue: (params.sourmashsearch_contain ? ' ' : '')
+        ],
+        'sourmashsearch_maxcontain': [
+            clihelp: 'Score based on max containment rather than similarity. ' +
+                "Default: ${params.sourmashsearch_maxcontain}",
+            cliflag: '--max-containment',
+            clivalue: (params.sourmashsearch_maxcontain ? ' ' : '')
+        ],
+        'sourmashsearch_ignoreabn': [
+            clihelp: 'Do NOT use k-mer abundances if present. ' +
+                "Default: ${params.sourmashsearch_ignoreabn}",
+            cliflag: '--ignore-abundance',
+            clivalue: (params.sourmashsearch_ignoreabn ? ' ' : '')
+        ],
+        'sourmashsearch_ani_ci': [
+            clihelp: 'Output confidence intervals for ANI estimates. ' +
+                "Default: ${params.sourmashsearch_ani_ci}",
+            cliflag: '--estimate-ani-ci',
+            clivalue: (params.sourmashsearch_ani_ci ? ' ' : '')
+        ],
+        'sourmashsearch_k': [
+            clihelp: 'The k-mer size to select. ' +
+                "Default: ${params.sourmashsearch_k}",
+            cliflag: '-k',
+            clivalue: (params.sourmashsearch_k ?: '')
+        ],
+        'sourmashsearch_protein': [
+            clihelp: 'Choose a protein signature. ' +
+                "Default: ${params.sourmashsearch_protein}",
+            cliflag: '--protein',
+            clivalue: (params.sourmashsearch_protein ? ' ' : '')
+        ],
+        'sourmashsearch_noprotein': [
+            clihelp: 'Do not choose a protein signature. ' +
+                "Default: ${params.sourmashsearch_noprotein}",
+            cliflag: '--no-protein',
+            clivalue: (params.sourmashsearch_noprotein ? ' ' : '')
+        ],
+        'sourmashsearch_dayhoff': [
+            clihelp: 'Choose Dayhoff-encoded amino acid signatures. ' +
+                "Default: ${params.sourmashsearch_dayhoff}",
+            cliflag: '--dayhoff',
+            clivalue: (params.sourmashsearch_dayhoff ? ' ' : '')
+        ],
+        'sourmashsearch_nodayhoff': [
+            clihelp: 'Do not choose Dayhoff-encoded amino acid signatures. ' +
+                "Default: ${params.sourmashsearch_nodayhoff}",
+            cliflag: '--no-dayhoff',
+            clivalue: (params.sourmashsearch_nodayhoff ? ' ' : '')
+        ],
+        'sourmashsearch_hp': [
+            clihelp: 'Choose hydrophobic-polar-encoded amino acid signatures. ' +
+                "Default: ${params.sourmashsearch_hp}",
+            cliflag: '--hp',
+            clivalue: (params.sourmashsearch_hp ? ' ' : '')
+        ],
+        'sourmashsearch_nohp': [
+            clihelp: 'Do not choose hydrophobic-polar-encoded amino acid signatures. ' +
+                "Default: ${params.sourmashsearch_nohp}",
+            cliflag: '--no-hp',
+            clivalue: (params.sourmashsearch_nohp ? ' ' : '')
+        ],
+        'sourmashsearch_dna': [
+            clihelp: 'Choose DNA signature. ' +
+                "Default: ${params.sourmashsearch_dna}",
+            cliflag: '--dna',
+            clivalue: (params.sourmashsearch_dna ? ' ' : '')
+        ],
+        'sourmashsearch_nodna': [
+            clihelp: 'Do not choose DNA signature. ' +
+                "Default: ${params.sourmashsearch_nodna}",
+            cliflag: '--no-dna',
+            clivalue: (params.sourmashsearch_nodna ? ' ' : '')
+        ],
+        'sourmashsearch_scaled': [
+            clihelp: 'Scaled value should be between 100 and 1e6. ' +
+                "Default: ${params.sourmashsearch_scaled}",
+            cliflag: '--scaled',
+            clivalue: (params.sourmashsearch_scaled ?: '')
+        ],
+        'sourmashsearch_inc_pat': [
+            clihelp: 'Search only signatures that match this pattern in name, filename, or md5. ' +
+                "Default: ${params.sourmashsearch_inc_pat}",
+            cliflag: '--include-db-pattern',
+            clivalue: (params.sourmashsearch_inc_pat ?: '')
+        ],
+        'sourmashsearch_exc_pat': [
+            clihelp: 'Search only signatures that do not match this pattern in name, filename, or md5. ' +
+                "Default: ${params.sourmashsearch_exc_pat}",
+            cliflag: '--exclude-db-pattern',
+            clivalue: (params.sourmashsearch_exc_pat ?: '')
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/sourmashsketch.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,61 @@
+// Help text for sourmash sketch dna within CPIPES.
+
+def sourmashsketchHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'sourmashsketch_run': [
+            clihelp: 'Run `sourmash sketch dna` tool. Default: ' +
+                (params.sourmashsketch_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'sourmashsketch_mode': [
+            clihelp: "Select which type of signatures to be created: dna, protein, fromfile or translate. "
+                + "Default: ${params.sourmashsketch_mode}",
+            cliflag: "${params.sourmashsketch_mode}",
+            clivalue: ' '
+        ],
+        'sourmashsketch_p': [
+            clihelp: 'Signature parameters to use. ' +
+                "Default: ${params.sourmashsketch_p}",
+            cliflag: '-p',
+            clivalue: (params.sourmashsketch_p ?: '')
+        ],
+        'sourmashsketch_file': [
+            clihelp: '<path>  A text file containing a list of sequence files to load. ' +
+                "Default: ${params.sourmashsketch_file}",
+            cliflag: '--from-file',
+            clivalue: (params.sourmashsketch_file ?: '')
+        ],
+        'sourmashsketch_f': [
+            clihelp: 'Recompute signatures even if the file exists. ' +
+                "Default: ${params.sourmashsketch_f}",
+            cliflag: '-f',
+            clivalue: (params.sourmashsketch_f ? ' ' : '')
+        ],
+        'sourmashsketch_name': [
+            clihelp: 'Name the signature generated from each file after the first record in the file. ' +
+                "Default: ${params.sourmashsketch_name}",
+            cliflag: '--name-from-first',
+            clivalue: (params.sourmashsketch_name ? ' ' : '')
+        ],
+        'sourmashsketch_randomize': [
+            clihelp: 'Shuffle the list of input files randomly. ' +
+                "Default: ${params.sourmashsketch_randomize}",
+            cliflag: '--randomize',
+            clivalue: (params.sourmashsketch_randomize ? ' ' : '')
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/help/sourmashtaxmetagenome.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,69 @@
+// Help text for sourmash tax metagenome within CPIPES.
+
+def sourmashtaxmetagenomeHelp(params) {
+
+    Map tool = [:]
+    Map toolspecs = [:]
+    tool.text = [:]
+    tool.helpparams = [:]
+
+    toolspecs = [
+        'sourmashtaxmetagenome_run': [
+            clihelp: 'Run `sourmash tax metagenome` tool. Default: ' +
+                (params.sourmashtaxmetagenome_run ?: false),
+            cliflag: null,
+            clivalue: null
+        ],
+        'sourmashtaxmetagenome_t': [
+            clihelp: "Taxonomy CSV file. "
+                + "Default: ${params.sourmashtaxmetagenome_t}",
+            cliflag: '-t',
+            clivalue: (params.sourmashtaxmetagenome_t ?: '')
+        ],
+        'sourmashtaxmetagenome_r': [
+            clihelp: 'For non-default output formats: Summarize genome'
+                + ' taxonomy at this rank and above. Note that the taxonomy CSV must'
+                + ' contain lineage information at this rank.'
+                + " Default: ${params.sourmashtaxmetagenome_r}",
+            cliflag: '-r',
+            clivalue: (params.sourmashtaxmetagenome_r ?: '')
+        ],
+        'sourmashtaxmetagenome_F': [
+            clihelp: 'Choose output format. ' +
+                "Default: ${params.sourmashtaxmetagenome_F}",
+            cliflag: '--output-format',
+            clivalue: (params.sourmashtaxmetagenome_F ?: '')
+        ],
+        'sourmashtaxmetagenome_f': [
+            clihelp: 'Continue past errors in taxonomy database loading. ' +
+                "Default: ${params.sourmashtaxmetagenome_f}",
+            cliflag: '-f',
+            clivalue: (params.sourmashtaxmetagenome_f ? ' ' : '')
+        ],
+        'sourmashtaxmetagenome_kfi': [
+            clihelp: 'Do not split identifiers on whitespace. ' +
+                "Default: ${params.sourmashtaxmetagenome_kfi}",
+            cliflag: '--keep-full-identifiers',
+            clivalue: (params.sourmashtaxmetagenome_kfi ? ' ' : '')
+        ],
+        'sourmashtaxmetagenome_kiv': [
+            clihelp: 'After splitting identifiers do not remove accession versions. ' +
+                "Default: ${params.sourmashtaxmetagenome_kiv}",
+            cliflag: '--keep-identifier-versions',
+            clivalue: (params.sourmashtaxmetagenome_kiv ? ' ' : '')
+        ],
+        'sourmashtaxmetagenome_fomt': [
+            clihelp: 'Fail quickly if taxonomy is not available for an identifier. ' +
+                "Default: ${params.sourmashtaxmetagenome_fomt}",
+            cliflag: '--fail-on-missing-taxonomy',
+            clivalue: (params.sourmashtaxmetagenome_fomt ? ' ' : '')
+        ]
+    ]
+
+    toolspecs.each {
+        k, v -> tool.text['--' + k] = "${v.clihelp}"
+        tool.helpparams[k] = [ cliflag: "${v.cliflag}", clivalue: v.clivalue ]
+    }
+
+    return tool
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/lib/routines.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,397 @@
+// Hold methods to print:
+//      1. Colored logo.
+//      2. Summary of parameters.
+//      3. Single dashed line.
+//      4. Double dashed line.
+//
+
+import groovy.json.JsonSlurper
+import groovy.util.ConfigSlurper
+// import nextflow.config.ConfigParser
+// import nextflow.config.ConfigBuilder
+// import groovy.json.JsonOutput
+
+// ASCII logo
+def pipelineBanner() {
+
+    def padding = (params.pad) ?: 30
+    Map fgcolors = getANSIColors()
+
+    def banner = [
+        name: "${fgcolors.magenta}${workflow.manifest.name}${fgcolors.reset}",
+        author: "${fgcolors.cyan}${workflow.manifest.author}${fgcolors.reset}",
+        // workflow: "${fgcolors.magenta}${params.pipeline}${fgcolors.reset}",
+        version:  "${fgcolors.green}${workflow.manifest.version}${fgcolors.reset}",
+        center: "${fgcolors.green}${params.center}${fgcolors.reset}",
+        pad: padding
+    ]
+
+    manifest = addPadding(banner)
+
+    return """${fgcolors.white}${dashedLine(type: '=')}${fgcolors.magenta}
+             (o)
+  ___  _ __   _  _ __    ___  ___
+ / __|| '_ \\ | || '_ \\  / _ \\/ __|
+| (__ | |_) || || |_) ||  __/\\__ \\
+ \\___|| .__/ |_|| .__/  \\___||___/
+      | |       | |
+      |_|       |_|${fgcolors.reset}
+${dashedLine()}
+${fgcolors.blue}A collection of modular pipelines at CFSAN, FDA.${fgcolors.reset}
+${dashedLine()}
+${manifest}
+${dashedLine(type: '=')}
+""".stripIndent()
+}
+
+// Add padding to keys so that
+// they indent nicely on the
+// terminal
+def addPadding(values) {
+
+    def pad = (params.pad) ?: 30
+    values.pad = pad
+
+    def padding = values.pad.toInteger()
+    def nocapitalize = values.nocapitalize
+    def stopnow = values.stopNow
+    def help = values.help
+
+    values.removeAll {
+        k, v -> [
+            'nocapitalize',
+            'pad',
+            'stopNow',
+            'help'
+        ].contains(k)
+    }
+
+    values.keySet().each { k ->
+        v = values[k]
+        s = params.linewidth - (pad + 5)
+        if (v.toString().size() > s && !stopnow) {
+            def sen = ''
+            // v.toString().findAll(/.{1,${s}}\b(?:\W*|\s*)/).each {
+            //     sen += ' '.multiply(padding + 2) + it + '\n'
+            // }
+            v.toString().eachMatch(/.{1,${s}}(?=.*)\b|\w+/) {
+                sen += ' '.multiply(padding + 2) + it.trim() + '\n'
+            }
+            values[k] = (
+                help ? sen.replaceAll(/^(\n|\s)*/, '') : sen.trim()
+            )
+        } else {
+            values[k] = (help ? v + "\n" : v)
+        }
+        k = k.replaceAll(/\./, '_')
+    }
+
+    return values.findResults {
+        k, v -> nocapitalize ?
+            k.padRight(padding) + ': ' + v :
+            k.capitalize().padRight(padding) + ': ' + v
+    }.join("\n")
+}
+
+// Method for error messages
+def stopNow(msg) {
+
+    Map fgcolors = getANSIColors()
+    Map errors = [:]
+
+    if (msg == null) {
+        msg = "Unknown error"
+    }
+
+    errors['stopNow'] = true
+    errors["${params.cfsanpipename} - ${params.pipeline} - ERROR"] = """
+${fgcolors.reset}${dashedLine()}
+${fgcolors.red}${msg}${fgcolors.reset}
+${dashedLine()}
+""".stripIndent()
+    // println dashedLine() // defaults to stdout
+    // log.info addPadding(errors) // prints to stdout
+    exit 1, "\n" + dashedLine() +
+        "${fgcolors.red}\n" + addPadding(errors)
+}
+
+// Method to validate 4 required parameters
+// if input for entry point is FASTQ files
+def validateParamsForFASTQ() {
+    switch (params) {
+        case { params.metadata == null && params.input == null }:
+            stopNow("Either a metadata CSV file with 5 required columns\n" +
+                "in order: sample, fq1, fq2, strandedness, single_end or an\n" +
+                "input directory of only FASTQ files (gzipped or unzipped) should be provided\n" +
+                "using the --metadata or --input option.\n" +
+                "Neither of these two options was provided!")
+            break
+        case { params.metadata != null && params.input != null }:
+            stopNow("Either metadata or input directory of FASTQ files\n" +
+                "should be provided using --metadata or --input options.\n" +
+                "Using both these options is not allowed!")
+            break
+        case { params.output == null }:
+            stopNow("Please specify an output directory to store all results " +
+                "using the --output option!")
+            break
+    }
+    return 1
+}
+
+// Method to print summary of parameters
+// before running
+def summaryOfParams() {
+
+    // def pipeline_specific_config = pipeline_specific_config = new ConfigParser().setIgnoreIncludes(true).parse(
+    //     file("${params.workflowsconf}${params.fs}${params.pipeline}.config").text
+    // )
+    def pipeline_specific_config = new ConfigSlurper().parse(
+        file("${params.workflowsconf}${params.fs}${params.pipeline}.config").text
+    )
+
+    Map fgcolors = getANSIColors()
+    Map globalparams = [:]
+    Map localparams = params.subMap(
+        pipeline_specific_config.params.keySet().toList() + params.logtheseparams
+    )
+
+    if (localparams !instanceof Map) {
+        stopNow("Need a Map of parameters. We got: " + localparams.getClass())
+    }
+
+    if (localparams.size() != 0) {
+        localparams['nocapitalize'] = true
+        globalparams['nocapitalize'] = true
+        globalparams['nextflow_version'] = "${nextflow.version}"
+        globalparams['nextflow_build'] = "${nextflow.build}"
+        globalparams['nextflow_timestamp'] = "${nextflow.timestamp}"
+        globalparams['workflow_projectDir'] = "${workflow.projectDir}"
+        globalparams['workflow_launchDir'] = "${workflow.launchDir}"
+        globalparams['workflow_workDir'] = "${workflow.workDir}"
+        globalparams['workflow_container'] = "${workflow.container}"
+        globalparams['workflow_containerEngine'] = "${workflow.containerEngine}"
+        globalparams['workflow_runName'] = "${workflow.runName}"
+        globalparams['workflow_sessionId'] = "${workflow.sessionId}"
+        globalparams['workflow_profile'] = "${workflow.profile}"
+        globalparams['workflow_start'] = "${workflow.start}"
+        globalparams['workflow_commandLine'] = "${workflow.commandLine}"
+        return """${dashedLine()}
+Summary of the current workflow (${fgcolors.magenta}${params.pipeline}${fgcolors.reset}) parameters
+${dashedLine()}
+${addPadding(localparams)}
+${dashedLine()}
+${fgcolors.cyan}N E X T F L O W${fgcolors.reset} - ${fgcolors.magenta}${params.cfsanpipename}${fgcolors.reset} - Runtime metadata
+${dashedLine()}
+${addPadding(globalparams)}
+${dashedLine()}""".stripIndent()
+    }
+    return 1
+}
+
+// Method to return a
+// dashed line of either '-'
+// type or '=' type
+def dashedLine(Map defaults = [:]) {
+
+    Map fgcolors = getANSIColors()
+    def line = [color: 'white', type: '-']
+
+    if (!defaults.isEmpty()) {
+        line.putAll(defaults)
+    }
+
+    return fgcolors."${line.color}" +
+        "${line.type}".multiply(params.linewidth) +
+        fgcolors.reset
+}
+
+// Return slurped keys parsed from JSON
+def slurpJson(file) {
+    def slurped = null
+    def jsonInst = new JsonSlurper()
+
+    try {
+        slurped = jsonInst.parse(new File ("${file}"))
+    }
+    catch (Exception e) {
+        log.error 'Please check your JSON schema. Invalid JSON file: ' + file
+    }
+
+    // Declare globals for the nanofactory
+    // workflow.
+    return [keys: slurped?.keySet()?.toList(), cparams: slurped]
+}
+
+// Default help text in a map if the entry point
+// to a pipeline is FASTQ files.
+def fastqEntryPointHelp() {
+
+    Map helptext = [:]
+    Map fgcolors = getANSIColors()
+
+    helptext['Workflow'] =  "${fgcolors.magenta}${params.pipeline}${fgcolors.reset}"
+    helptext['Author'] =  "${fgcolors.cyan}${params.workflow_built_by}${fgcolors.reset}"
+    helptext['Version'] = "${fgcolors.green}${params.workflow_version}${fgcolors.reset}\n"
+    helptext['Usage'] = "cpipes --pipeline ${params.pipeline} [options]\n"
+    helptext['Required'] = ""
+    helptext['--input'] = "Absolute path to directory containing FASTQ files. " +
+        "The directory should contain only FASTQ files as all the " +
+        "files within the mentioned directory will be read. " +
+        "Ex: --input /path/to/fastq_pass"
+    helptext['--output'] = "Absolute path to directory where all the pipeline " +
+        "outputs should be stored. Ex: --output /path/to/output"
+    helptext['Other options'] = ""
+    helptext['--metadata'] = "Absolute path to metadata CSV file containing five " +
+        "mandatory columns: sample,fq1,fq2,strandedness,single_end. The fq1 and fq2 " +
+        "columns contain absolute paths to the FASTQ files. This option can be used in place " +
+        "of the --input option, though this is rarely needed. Ex: --metadata samplesheet.csv"
+    helptext['--fq_suffix'] = "The suffix of FASTQ files (Unpaired reads or R1 reads or Long reads) if " +
+        "an input directory is mentioned via --input option. Default: ${params.fq_suffix}"
+    helptext['--fq2_suffix'] = "The suffix of FASTQ files (Paired-end reads or R2 reads) if an input directory is mentioned via " +
+        "--input option. Default: ${params.fq2_suffix}"
+    helptext['--fq_filter_by_len'] = "Remove FASTQ reads that are less than this many bases. " +
+        "Default: ${params.fq_filter_by_len}"
+    helptext['--fq_strandedness'] = "The strandedness of the sequencing run. This is mostly needed " +
+        "if your sequencing run is RNA-SEQ. For most of the other runs, it is probably safe to use " +
+        "unstranded for the option. Default: ${params.fq_strandedness}"
+    helptext['--fq_single_end'] = "SINGLE-END information will be auto-detected but this option forces " +
+        "PAIRED-END FASTQ files to be treated as SINGLE-END so only read 1 information is included in " +
+        "auto-generated samplesheet. Default: ${params.fq_single_end}"
+    helptext['--fq_filename_delim'] = "Delimiter by which the file name is split to obtain sample name. " +
+        "Default: ${params.fq_filename_delim}"
+    helptext['--fq_filename_delim_idx'] = "After splitting FASTQ file name by using the --fq_filename_delim option," +
+        " all elements before this index (1-based) will be joined to create final sample name." +
+        " Default: ${params.fq_filename_delim_idx}"
+
+    return helptext
+}
+
+// Show concise help text if configured within the main workflow.
+def conciseHelp(def tool = null) {
+    Map fgcolors = getANSIColors()
+
+    tool ?= "fastp"
+    tools = tool?.tokenize(',')
+
+    return """
+${dashedLine()}
+Show configurable CLI options for each tool within ${fgcolors.magenta}${params.pipeline}${fgcolors.reset}
+${dashedLine()}
+Ex: cpipes --pipeline ${params.pipeline} --help
+""" + (tools.size() > 1 ? "Ex: cpipes --pipeline ${params.pipeline} --help ${tools[0]}"
+    + """
+Ex: cpipes --pipeline ${params.pipeline} --help ${tools[0]},${tools[1]}
+${dashedLine()}""".stripIndent() : """Ex: cpipes --pipeline ${params.pipeline} --help ${tool}
+${dashedLine()}""".stripIndent())
+
+}
+
+// Wrap help text with the following options
+def wrapUpHelp() {
+
+    return [
+        'Help options' : "",
+        '--help': "Display this message.\n",
+        'help': true,
+        'nocapitalize': true
+    ]
+}
+
+// Method to send email on workflow complete.
+def sendMail() {
+
+    if (params.user_email == null) {
+        return 1
+    }
+
+    def pad = (params.pad) ?: 30
+    def contact_emails = [
+        stakeholder: (params.workflow_blueprint_by ?: 'Not defined'),
+        author: (params.workflow_built_by ?: 'Not defined')
+    ]
+    def msg = """
+${pipelineBanner()}
+${summaryOfParams()}
+${params.cfsanpipename} - ${params.pipeline}
+${dashedLine()}
+Please check the following directory for N E X T F L O W
+reports. You can view the HTML files directly by double clicking
+them on your workstation.
+${dashedLine()}
+${params.tracereportsdir}
+${dashedLine()}
+Please send any bug reports to CFSAN Dev Team or the author or
+the stakeholder of the current pipeline.
+${dashedLine()}
+Error messages (if any)
+${dashedLine()}
+${workflow.errorMessage}
+${workflow.errorReport}
+${dashedLine()}
+Contact emails
+${dashedLine()}
+${addPadding(contact_emails)}
+${dashedLine()}
+Thank you for using ${params.cfsanpipename} - ${params.pipeline}!
+${dashedLine()}
+""".stripIndent()
+
+    def mail_cmd = [
+        'sendmail',
+        '-f', 'noreply@gmail.com',
+        '-F', 'noreply',
+        '-t', "${params.user_email}"
+    ]
+
+    def email_subject = "${params.cfsanpipename} - ${params.pipeline}"
+    Map fgcolors = getANSIColors()
+
+    if (workflow.success) {
+        email_subject += ' completed successfully!'
+    }
+    else {
+        email_subject += ' has failed!'
+    }
+
+    try {
+        ['env', 'bash'].execute() << """${mail_cmd.join(' ')}
+Subject: ${email_subject}
+Mime-Version: 1.0
+Content-Type: text/html
+<pre>
+${msg.replaceAll(/\x1b\[[0-9;]*m/, '')}
+</pre>
+""".stripIndent()
+    } catch (all) {
+        def warning_msg = "${fgcolors.yellow}${params.cfsanpipename} - ${params.pipeline} - WARNING"
+            .padRight(pad) + ':'
+        log.info """
+${dashedLine()}
+${warning_msg}
+${dashedLine()}
+Could not send mail with the sendmail command!
+${dashedLine()}
+""".stripIndent()
+    }
+    return 1
+}
+
+// Set ANSI colors for any and all
+// STDOUT or STDERR
+def getANSIColors() {
+
+    Map fgcolors = [:]
+
+    fgcolors['reset']   = "\033[0m"
+    fgcolors['black']   = "\033[0;30m"
+    fgcolors['red']     = "\033[0;31m"
+    fgcolors['green']   = "\033[0;32m"
+    fgcolors['yellow']  = "\033[0;33m"
+    fgcolors['blue']    = "\033[0;34m"
+    fgcolors['magenta'] = "\033[0;35m"
+    fgcolors['cyan']    = "\033[0;36m"
+    fgcolors['white']   = "\033[0;37m"
+
+    return fgcolors
+}
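For reference, `sendMail()` strips the ANSI color codes produced by `getANSIColors()` from the message body with the pattern `\x1b\[[0-9;]*m` before sending it as HTML. The same step can be sketched in Python as follows. This is only an illustration and not part of the changeset; the names `ANSI_SGR` and `strip_ansi` are hypothetical.

```python
import re

# Same pattern as in sendMail(): ESC, '[', any run of digits/semicolons, then 'm'.
# It covers the color sequences defined in getANSIColors(), e.g. "\033[0;33m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text):
    """Return text with ANSI color (SGR) escape sequences removed."""
    return ANSI_SGR.sub("", text)


if __name__ == "__main__":
    yellow, reset = "\033[0;33m", "\033[0m"
    print(strip_ansi(f"{yellow}WARNING{reset}: could not send mail"))
```

Only sequences ending in `m` are matched, so other terminal control sequences (cursor movement, for example) would pass through unchanged.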
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/bwa/mem/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,50 @@
+process BWA_MEM {
+    tag "$meta.id"
+    label 'process_micro'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}bwa${params.fs}0.7.17" : null)
+    conda (params.enable_conda ? "bioconda::bwa=0.7.17 conda-forge::perl" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/bwa:0.7.17--he4a0461_11' :
+        'quay.io/biocontainers/bwa:0.7.17--he4a0461_11' }"
+
+    input:
+        tuple val(meta), path(reads), path(index)
+        val index2
+
+    output:
+        tuple val(meta), path("*.sam"), emit: aligned_sam
+        path  "versions.yml"          , emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args   = task.ext.args ?: ''
+        def args2  = task.ext.args2 ?: ''
+        def prefix = task.ext.prefix ?: "${meta.id}"
+        def this_index = (index ?: index2)
+        """
+
+        if [ "${params.fq_single_end}" = "false" ]; then
+            bwa mem \\
+                $args \\
+                -t $task.cpus \\
+                $this_index \\
+                ${reads[0]} ${reads[1]} > ${prefix}.aligned.sam
+        else
+            bwa mem \\
+                $args \\
+                -t $task.cpus \\
+                -a \\
+                $this_index \\
+                $reads > ${prefix}.aligned.sam
+
+        fi
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            bwa: \$(echo \$(bwa 2>&1) | sed 's/^.*Version: //; s/Contact:.*\$//')
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/cat/fastq/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,96 @@
+# NextFlow DSL2 Module
+
+```bash
+CAT_FASTQ
+```
+
+## Description
+
+Concatenates a list of FASTQ files per sample (`id:`). Produces two files per sample if `single_end` is `false` in the metadata Groovy Map.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `tuple`
+
+Takes in the following tuple of metadata (`meta`) and a list of FASTQ files of input type `path` (`reads`) to be concatenated.
+
+Ex:
+
+```groovy
+[ [id: 'sample1', single_end: true], ['/data/sample1/f_L001.fq', '/data/sample1/f_L002.fq'] ]
+```
+
+\
+&nbsp;
+
+#### `meta`
+
+Type: Groovy Map
+
+A Groovy Map containing the metadata about the FASTQ file.
+
+Ex:
+
+```groovy
+[ id: 'FAL00870', strandedness: 'unstranded', single_end: true ]
+```
+
+\
+&nbsp;
+
+#### `reads`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to a list of FASTQ files.
+
+\
+&nbsp;
+
+#### `args`
+
+Type: Groovy String
+
+String of optional command-line arguments to be passed to the tool. This can be mentioned in `process` scope within `withName:process_name` block using `ext.args` option within your `nextflow.config` file.
+
+Ex:
+
+```groovy
+withName: 'CAT_FASTQ' {
+    ext.args = '--genome_size 5.5m'
+}
+```
+
+\
+&nbsp;
+
+### `output:`
+
+___
+
+Type: `tuple`
+
+Outputs a tuple of metadata (`meta` from `input:`) and list of concatenated FASTQ files (`catted_reads`).
+
+\
+&nbsp;
+
+#### `catted_reads`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the concatenated FASTQ files per sample (`id:`).
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/cat/fastq/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,89 @@
+process CAT_FASTQ {
+    tag "$meta.id"
+    label 'process_micro'
+
+    conda (params.enable_conda ? "conda-forge::sed=4.7 conda-forge::gzip" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img' :
+        'biocontainers/biocontainers:v1.2.0_cv1' }"
+
+    input:
+        tuple val(meta), path(reads, stageAs: "input*/*")
+
+    output:
+        tuple val(meta), path("*.merged.fastq.gz"), emit: catted_reads
+        path "versions.yml"                       , emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args = task.ext.args ?: ''
+        def prefix = task.ext.prefix ?: "${meta.id}"
+        def readList = reads.collect{ it.toString() }
+        def is_in_gz = readList[0].endsWith('.gz')
+        def gz_or_ungz = (is_in_gz ? '' : ' | gzip')
+        def pigz_or_ungz = (is_in_gz ? '' : " | pigz -p ${task.cpus}")
+        if (meta.single_end) {
+            if (readList.size > 1) {
+                """
+                zcmd="gzip"
+                zver=""
+
+                if type pigz > /dev/null 2>&1; then
+                    cat ${readList.join(' ')} ${pigz_or_ungz} > ${prefix}.merged.fastq.gz
+                    zcmd="pigz"
+                    zver=\$( echo \$( \$zcmd --version 2>&1 ) | sed -e '1!d' | sed "s/\$zcmd //" )
+                else
+                    cat ${readList.join(' ')} ${gz_or_ungz} > ${prefix}.merged.fastq.gz
+                    zcmd="gzip"
+
+                    if [ "${workflow.containerEngine}" != "null" ]; then
+                        zver=\$( echo \$( \$zcmd --help 2>&1 ) | sed -e '1!d; s/ (.*\$//' )
+                    else
+                        zver=\$( echo \$( \$zcmd --version 2>&1 ) | sed "s/^.*(\$zcmd) //; s/\$zcmd //; s/ Copyright.*\$//" )
+                    fi
+                fi
+
+                cat <<-END_VERSIONS > versions.yml
+                "${task.process}":
+                    cat: \$( echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//' )
+                    \$zcmd: \$zver
+                END_VERSIONS
+                """
+            }
+        } else {
+            if (readList.size > 2) {
+                def read1 = []
+                def read2 = []
+                readList.eachWithIndex{ v, ix -> ( ix & 1 ? read2 : read1 ) << v }
+                """
+                zcmd="gzip"
+                zver=""
+
+                if type pigz > /dev/null 2>&1; then
+                    cat ${read1.join(' ')} ${pigz_or_ungz} > ${prefix}_1.merged.fastq.gz
+                    cat ${read2.join(' ')} ${pigz_or_ungz} > ${prefix}_2.merged.fastq.gz
+                    zcmd="pigz"
+                    zver=\$( echo \$( \$zcmd --version 2>&1 ) | sed -e '1!d' | sed "s/\$zcmd //" )
+                else
+                    cat ${read1.join(' ')} ${gz_or_ungz} > ${prefix}_1.merged.fastq.gz
+                    cat ${read2.join(' ')} ${gz_or_ungz} > ${prefix}_2.merged.fastq.gz
+                    zcmd="gzip"
+
+                    if [ "${workflow.containerEngine}" != "null" ]; then
+                        zver=\$( echo \$( \$zcmd --help 2>&1 ) | sed -e '1!d; s/ (.*\$//' )
+                    else
+                        zver=\$( echo \$( \$zcmd --version 2>&1 ) | sed "s/^.*(\$zcmd) //; s/\$zcmd //; s/ Copyright.*\$//" )
+                    fi
+                fi
+
+                cat <<-END_VERSIONS > versions.yml
+                "${task.process}":
+                    cat: \$( echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//' )
+                    \$zcmd: \$zver
+                END_VERSIONS
+                """
+            }
+        }
+}
\ No newline at end of file
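For reference, the paired-end branch of `CAT_FASTQ` assigns files to `read1` and `read2` by index parity (`ix & 1 ? read2 : read1`). The Python sketch below illustrates that split; it is not part of the changeset, the function name `split_paired` and the file names are hypothetical, and it assumes, as the module does, that the staged list alternates R1, R2, R1, R2.

```python
def split_paired(read_list):
    """Split an interleaved [R1, R2, R1, R2, ...] list into (read1, read2)."""
    read1, read2 = [], []
    for ix, path in enumerate(read_list):
        # Odd indices go to read2, even indices to read1, as in the Groovy closure.
        (read2 if ix & 1 else read1).append(path)
    return read1, read2


if __name__ == "__main__":
    reads = [
        "s1_L001_R1.fastq.gz", "s1_L001_R2.fastq.gz",
        "s1_L002_R1.fastq.gz", "s1_L002_R2.fastq.gz",
    ]
    r1, r2 = split_paired(reads)
    print(r1)
    print(r2)
```

The split is purely positional: if the input list is not ordered as alternating mates, files end up in the wrong group, so the ordering has to be guaranteed upstream.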
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/cat/tables/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,88 @@
+# NextFlow DSL2 Module
+
+```bash
+TABLE_SUMMARY
+```
+
+## Description
+
+Concatenates a list of tables (CSV or TAB delimited) in `.txt` or `.csv` format. The table files to be concatenated **must** have a header as the header from one of the table files will be used as the header for the concatenated result table file.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `tuple`
+
+Takes in the following tuple of `val` table key (`table_sum_on`) and a list of table files of input type `path` (`tables`) to be concatenated. For this module to work, a `bin` directory with the script `create_mqc_data_table.py` should be present where the NextFlow script using this DSL2 module will be run. This `python` script will convert the aggregated table to `.yml` format to be used with `multiqc`.
+
+Ex:
+
+```groovy
+[ ['ectyper'], ['/data/sample1/f1_ectyper.txt', '/data/sample2/f2_ectyper.txt'] ]
+```
+
+\
+&nbsp;
+
+#### `table_sum_on`
+
+Type: `val`
+
+A single key defining what tables are being concatenated. For example, if all the `ectyper` results are being concatenated for all samples, then this can be `ectyper`.
+
+Ex:
+
+```groovy
+[ ['ectyper'], ['/data/sample1/f1_ectyper.txt', '/data/sample2/f2_ectyper.txt'] ]
+```
+
+\
+&nbsp;
+
+#### `tables`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to a list of tables (files) to be concatenated.
+
+\
+&nbsp;
+
+### `output:`
+
+___
+
+Type: `tuple`
+
+Outputs a tuple of table key (`table_sum_on` from `input:`) and list of concatenated table files (`tblsummed`).
+
+\
+&nbsp;
+
+#### `tblsummed`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the concatenated table files per table key (Ex: `ectyper`).
+
+\
+&nbsp;
+
+#### `mqc_yml`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing table contents in `YAML` format which can be used to inject this table as part of the `multiqc` report.
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/cat/tables/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,58 @@
+process TABLE_SUMMARY {
+    tag "$table_sum_on"
+    label 'process_low'
+
+    // Requires `pyyaml` which does not have a dedicated container but is in the MultiQC container
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null)
+    conda (params.enable_conda ? "conda-forge::python=3.9 conda-forge::pyyaml conda-forge::coreutils" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/multiqc:1.11--pyhdfd78af_0' :
+        'quay.io/biocontainers/multiqc:1.11--pyhdfd78af_0' }"
+
+    input:
+    tuple val(table_sum_on), path(tables)
+
+    output:
+    tuple val(table_sum_on), path("*.tblsum.txt"), emit: tblsummed
+    path "*_mqc.yml"                             , emit: mqc_yml
+    path "versions.yml"                          , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when || tables
+
+    script:
+    def args = task.ext.args ?: ''
+    def onthese = tables.collect().join('\\n')
+    """
+    filenum="1"
+    header=""
+
+    echo -e "$onthese" | while read -r file; do
+
+        if [ "\${filenum}" == "1" ]; then
+            header=\$( head -n1 "\${file}" )
+            echo -e "\${header}" > ${table_sum_on}.tblsum.txt
+        fi
+
+        tail -n+2 "\${file}" >> ${table_sum_on}.tblsum.txt
+
+        filenum=\$((filenum+1))
+    done
+
+    create_mqc_data_table.py $table_sum_on ${workflow.manifest.name}
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        bash: \$( bash --version 2>&1 | sed '1!d; s/^.*version //; s/ (.*\$//' )
+        python: \$( python --version | sed 's/Python //g' )
+    END_VERSIONS
+
+    headver=\$( head --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' )
+    tailver=\$( tail --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' )
+
+    cat <<-END_VERSIONS >> versions.yml
+        head: \$headver
+        tail: \$tailver
+    END_VERSIONS
+    """
+}
\ No newline at end of file
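For reference, the shell loop in `TABLE_SUMMARY` writes the header of the first table once (`head -n1`) and then appends every table without its first line (`tail -n+2`). A minimal Python sketch of the same idea follows. It is an illustration only and not part of the changeset; `concat_tables` is a hypothetical name, and it takes table contents as strings rather than file paths to stay self-contained.

```python
def concat_tables(tables):
    """Concatenate tables (as text), keeping only the first table's header."""
    merged = []
    for filenum, text in enumerate(tables, start=1):
        lines = text.splitlines()
        if filenum == 1:
            merged.extend(lines[:1])  # header from the first table only
        merged.extend(lines[1:])      # data rows of every table
    return "\n".join(merged) + "\n"


if __name__ == "__main__":
    t1 = "sample\tserotype\ns1\tO157:H7\n"
    t2 = "sample\tserotype\ns2\tO26:H11\n"
    print(concat_tables([t1, t2]), end="")
```

As in the module, the tables are assumed to share the same columns in the same order; nothing checks that the headers of later tables match the first.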
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/custom/dump_software_versions/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,57 @@
+# NextFlow DSL2 Module
+
+```bash
+DUMP_SOFTWARE_VERSIONS
+```
+
+## Description
+
+Given a `YAML` format file, produces a final `.yml` file which has unique entries and a corresponding `_mqc.yml` file for use with `multiqc`.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `path`
+
+Takes in an input of type `path` (`versions`) pointing to the file used to produce a final `.yml` file without duplicate entries and a `_mqc.yml` file. Generally, this file is created by mixing `versions` from various run-time channels and passing the result to this module to produce the final software versions list.
+
+Ex:
+
+```groovy
+[ '/hpc/scratch/test/work/9b/e7bf7e28806419c1c9a571dacd1f67/versions.yml' ]
+```
+
+\
+&nbsp;
+
+### `output:`
+
+___
+
+#### `yml`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to a `YAML` file with software versions.
+
+\
+&nbsp;
+
+#### `mqc_yml`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `_mqc.yml` file, which can be used to produce a software versions table with `multiqc`.
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/custom/dump_software_versions/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,26 @@
+process DUMP_SOFTWARE_VERSIONS {
+    tag "${params.pipeline} software versions"
+    label 'process_pico'
+
+    // Requires `pyyaml` which does not have a dedicated container but is in the MultiQC container
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null)
+    conda (params.enable_conda ? "conda-forge::python=3.9 conda-forge::pyyaml" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/mulled-v2-ca258a039fcd88610bc4e297b13703e8be53f5ca:d638c4f85566099ea0c74bc8fddc6f531fe56753-0' :
+        'quay.io/biocontainers/mulled-v2-ca258a039fcd88610bc4e297b13703e8be53f5ca:d638c4f85566099ea0c74bc8fddc6f531fe56753-0' }"
+
+    input:
+    path versions
+
+    output:
+    path "software_versions.yml"    , emit: yml
+    path "software_versions_mqc.yml", emit: mqc_yml
+    path "versions.yml"             , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    template 'dumpsoftwareversions.py'
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/custom/dump_software_versions/templates/dumpsoftwareversions.py	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,103 @@
+#!/usr/bin/env python
+
+import platform
+import subprocess
+from textwrap import dedent
+
+import yaml
+
+
+def _make_versions_html(versions):
+    html = [
+        dedent(
+            """\\
+            <link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/v/dt/jszip-2.5.0/dt-1.12.1/b-2.2.3/b-colvis-2.2.3/b-html5-2.2.3/b-print-2.2.3/fc-4.1.0/r-2.3.0/sc-2.0.6/sb-1.3.3/sp-2.0.1/datatables.min.css"/>
+            <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/pdfmake/0.1.36/pdfmake.min.js"></script>
+            <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/pdfmake/0.1.36/vfs_fonts.js"></script>
+            <script type="text/javascript" src="https://cdn.datatables.net/v/dt/jszip-2.5.0/dt-1.12.1/b-2.2.3/b-colvis-2.2.3/b-html5-2.2.3/b-print-2.2.3/fc-4.1.0/r-2.3.0/sc-2.0.6/sb-1.3.3/sp-2.0.1/datatables.min.js"></script>
+            <style>
+            #cpipes-software-versions tbody:nth-child(even) {
+                background-color: #f2f2f2;
+            }
+            </style>
+            <table class="table" style="width:100%" id="cpipes-software-versions">
+                <thead>
+                    <tr>
+                        <th> Process Name </th>
+                        <th> Software </th>
+                        <th> Version  </th>
+                    </tr>
+                </thead>
+            """
+        )
+    ]
+    for process, tmp_versions in sorted(versions.items()):
+        html.append("<tbody>")
+        for i, (tool, version) in enumerate(sorted(tmp_versions.items())):
+            html.append(
+                dedent(
+                    f"""\\
+                    <tr>
+                        <td><samp>{process if (i == 0) else ''}</samp></td>
+                        <td><samp>{tool}</samp></td>
+                        <td><samp>{version}</samp></td>
+                    </tr>
+                    """
+                )
+            )
+        html.append("</tbody>")
+    html.append("</table>")
+    return "\\n".join(html)
+
+
+versions_this_module = {}
+versions_this_module["${task.process}"] = {
+    "python": platform.python_version(),
+    "yaml": yaml.__version__,
+}
+
+subprocess.run("perl -i -p -e 's/(sourmash:).*\s(.+)/\$1 \$2/' $versions", shell=True)
+
+with open("$versions") as f:
+    versions_by_process = yaml.load(f, Loader=yaml.BaseLoader)
+    versions_by_process.update(versions_this_module)
+
+# aggregate versions by the module name (derived from fully-qualified process name)
+versions_by_module = {}
+for process, process_versions in versions_by_process.items():
+    module = process.split(":")[-1]
+    try:
+        assert versions_by_module[module] == process_versions, (
+            "We assume that software versions are the same between all modules. "
+            "If you see this error-message it means you discovered an edge-case "
+            "and should open an issue in nf-core/tools. "
+        )
+    except KeyError:
+        versions_by_module[module] = process_versions
+
+versions_by_module["CPIPES"] = {
+    "Nextflow": "$workflow.nextflow.version",
+    "$workflow.manifest.name": "$workflow.manifest.version",
+    "${params.pipeline}": "${params.workflow_version}",
+}
+
+versions_mqc = {
+    "id": "software_versions",
+    "section_name": "${workflow.manifest.name} Software Versions",
+    "section_href": "https://cfsan-git.fda.gov/Kranti.Konganti/${workflow.manifest.name.toLowerCase()}",
+    "plot_type": "html",
+    "description": "Collected at run time from the software output (STDOUT/STDERR).",
+    "data": _make_versions_html(versions_by_module),
+}
+
+with open("software_versions.yml", "w") as f:
+    yaml.dump(versions_by_module, f, default_flow_style=False)
+
+# print('sed -i -e "' + "s%'%%g" + '" *.yml')
+subprocess.run('sed -i -e "' + "s%'%%g" + '" software_versions.yml', shell=True)
+
+with open("software_versions_mqc.yml", "w") as f:
+    yaml.dump(versions_mqc, f, default_flow_style=False)
+
+with open("versions.yml", "w") as f:
+    yaml.dump(versions_this_module, f, default_flow_style=False)
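For reference, the template collapses fully qualified process names (for example `WORKFLOW:SUBWORKFLOW:FASTP`) to their module name by taking the last `:`-separated component, and it expects every process that maps to the same module to report identical versions. The sketch below isolates that step in plain Python. It is an illustration only and not part of the changeset; `aggregate_by_module` is a hypothetical name, and it raises `ValueError` where the template uses `assert`.

```python
def aggregate_by_module(versions_by_process):
    """Group version dicts by module name (last ':'-separated component)."""
    versions_by_module = {}
    for process, process_versions in versions_by_process.items():
        module = process.split(":")[-1]
        if module not in versions_by_module:
            versions_by_module[module] = process_versions
        elif versions_by_module[module] != process_versions:
            # Same module seen twice with different versions.
            raise ValueError(f"Conflicting versions for module {module}")
    return versions_by_module


if __name__ == "__main__":
    collected = {
        "WF:QC:FASTP": {"fastp": "0.23.2"},
        "WF:TRIM:FASTP": {"fastp": "0.23.2"},
        "WF:FASTQC": {"fastqc": "0.11.9"},
    }
    print(aggregate_by_module(collected))
```

Raising an explicit exception rather than asserting keeps the check active even when Python runs with assertions disabled (`-O`); whether that matters depends on how the template is invoked.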
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/fastp/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,96 @@
+process FASTP {
+    tag "$meta.id"
+    label 'process_low'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}fastp${params.fs}0.23.2" : null)
+    conda (params.enable_conda ? "bioconda::fastp=0.23.2 conda-forge::isa-l" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/fastp:0.23.2--h79da9fb_0' :
+        'quay.io/biocontainers/fastp:0.23.2--h79da9fb_0' }"
+
+    input:
+        tuple val(meta), path(reads)
+
+    output:
+        tuple val(meta), path('*.fastp.fastq.gz') , emit: passed_reads, optional: true
+        tuple val(meta), path('*.fail.fastq.gz')  , emit: failed_reads, optional: true
+        tuple val(meta), path('*.merged.fastq.gz'), emit: merged_reads, optional: true
+        tuple val(meta), path('*.json')           , emit: json
+        tuple val(meta), path('*.html')           , emit: html
+        tuple val(meta), path('*.log')            , emit: log
+        path "versions.yml"                       , emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args = task.ext.args ?: ''
+        def prefix = task.ext.prefix ?: "${meta.id}"
+        def fail_fastq = params.fastp_failed_out && meta.single_end ? "--failed_out ${prefix}.fail.fastq.gz" : params.fastp_failed_out && !meta.single_end ? "--unpaired1 ${prefix}_1.fail.fastq.gz --unpaired2 ${prefix}_2.fail.fastq.gz" : ''
+        // Added soft-links to original fastqs for consistent naming in MultiQC
+        // Use single ended for interleaved. Add --interleaved_in in config.
+        if ( task.ext.args?.contains('--interleaved_in') ) {
+            """
+            [ ! -f  ${prefix}.fastq.gz ] && ln -sf $reads ${prefix}.fastq.gz
+
+            fastp \\
+                --stdout \\
+                --in1 ${prefix}.fastq.gz \\
+                --thread $task.cpus \\
+                --json ${prefix}.fastp.json \\
+                --html ${prefix}.fastp.html \\
+                $fail_fastq \\
+                $args \\
+                2> ${prefix}.fastp.log \\
+            | gzip -c > ${prefix}.fastp.fastq.gz
+
+            cat <<-END_VERSIONS > versions.yml
+            "${task.process}":
+                fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g")
+            END_VERSIONS
+            """
+        } else if (meta.single_end) {
+            """
+            [ ! -f  ${prefix}.fastq.gz ] && ln -sf $reads ${prefix}.fastq.gz
+
+            fastp \\
+                --in1 ${prefix}.fastq.gz \\
+                --out1  ${prefix}.fastp.fastq.gz \\
+                --thread $task.cpus \\
+                --json ${prefix}.fastp.json \\
+                --html ${prefix}.fastp.html \\
+                $fail_fastq \\
+                $args \\
+                2> ${prefix}.fastp.log
+
+            cat <<-END_VERSIONS > versions.yml
+            "${task.process}":
+                fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g")
+            END_VERSIONS
+            """
+        } else {
+            def merge_fastq = params.fastp_merged_out ? "-m --merged_out ${prefix}.merged.fastq.gz" : ''
+            """
+            [ ! -f  ${prefix}_1.fastq.gz ] && ln -sf ${reads[0]} ${prefix}_1.fastq.gz
+            [ ! -f  ${prefix}_2.fastq.gz ] && ln -sf ${reads[1]} ${prefix}_2.fastq.gz
+            fastp \\
+                --in1 ${prefix}_1.fastq.gz \\
+                --in2 ${prefix}_2.fastq.gz \\
+                --out1 ${prefix}_1.fastp.fastq.gz \\
+                --out2 ${prefix}_2.fastp.fastq.gz \\
+                --json ${prefix}.fastp.json \\
+                --html ${prefix}.fastp.html \\
+                $fail_fastq \\
+                $merge_fastq \\
+                --thread $task.cpus \\
+                --detect_adapter_for_pe \\
+                $args \\
+                2> ${prefix}.fastp.log
+
+            cat <<-END_VERSIONS > versions.yml
+            "${task.process}":
+                fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g")
+            END_VERSIONS
+            """
+        }
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/fastqc/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,113 @@
+# NextFlow DSL2 Module
+
+```bash
+FASTQC
+```
+
+## Description
+
+Runs the `fastqc` tool on reads in FASTQ format. Produces an HTML report file and a `.zip` file containing plots and the data used to produce the plots.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `tuple`
+
+Takes in the following tuple of metadata (`meta`) and a list of reads of type `path` (`reads`) per sample (`id:`).
+
+Ex:
+
+```groovy
+[
+    [ id: 'FAL00870',
+       strandedness: 'unstranded',
+       single_end: true,
+       centrifuge_x: '/hpc/db/centrifuge/2022-04-12/ab'
+    ],
+    '/hpc/scratch/test/FAL000870/f1.merged.fq.gz'
+]
+```
+
+\
+&nbsp;
+
+#### `meta`
+
+Type: Groovy Map
+
+A Groovy Map containing the metadata about the FASTQ file.
+
+Ex:
+
+```groovy
+[
+    id: 'FAL00870',
+    strandedness: 'unstranded',
+    single_end: true
+]
+```
+
+\
+&nbsp;
+
+#### `reads`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to FASTQ files on which `fastqc` should be run.
+
+\
+&nbsp;
+
+#### `args`
+
+Type: Groovy String
+
+String of optional command-line arguments to be passed to the tool. This can be mentioned in `process` scope within `withName:process_name` block using `ext.args` option within your `nextflow.config` file.
+
+Ex:
+
+```groovy
+withName: 'FASTQC' {
+    ext.args = '--nano'
+}
+```
+
+### `output:`
+
+___
+
+Type: `tuple`
+
+Outputs a tuple of metadata (`meta` from `input:`) and list of `fastqc` result files.
+
+\
+&nbsp;
+
+#### `html`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `fastqc` report file in HTML format per sample (`id:`).
+
+\
+&nbsp;
+
+#### `zip`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the zipped `fastqc` results per sample (`id:`).
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/fastqc/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,48 @@
+process FASTQC {
+    tag "$meta.id"
+    label 'process_low'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}fastqc${params.fs}0.11.9" : null)
+    conda (params.enable_conda ? "conda-forge::perl bioconda::fastqc=0.11.9" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0' :
+        'quay.io/biocontainers/fastqc:0.11.9--0' }"
+
+    input:
+    tuple val(meta), path(reads)
+
+    output:
+    tuple val(meta), path("*.html"), emit: html
+    tuple val(meta), path("*.zip") , emit: zip
+    path  "versions.yml"           , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    // Add soft-links to original FastQs for consistent naming in pipeline
+    def prefix = task.ext.prefix ?: "${meta.id}"
+    if (meta.single_end) {
+        """
+        [ ! -f  ${prefix}.fastq.gz ] && ln -s $reads ${prefix}.fastq.gz
+        fastqc $args --threads $task.cpus ${prefix}.fastq.gz
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
+        END_VERSIONS
+        """
+    } else {
+        """
+        [ ! -f  ${prefix}_1.fastq.gz ] && ln -s ${reads[0]} ${prefix}_1.fastq.gz
+        [ ! -f  ${prefix}_2.fastq.gz ] && ln -s ${reads[1]} ${prefix}_2.fastq.gz
+        fastqc $args --threads $task.cpus ${prefix}_1.fastq.gz ${prefix}_2.fastq.gz
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
+        END_VERSIONS
+        """
+    }
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/gen_samplesheet/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,55 @@
+# NextFlow DSL2 Module
+
+```bash
+GEN_SAMPLESHEET
+```
+
+## Description
+
+Generates a sample sheet in CSV format that contains the fields required to construct a Groovy Map of metadata. It requires as input an absolute UNIX path to a folder containing only FASTQ files. This module requires the `fastq_dir_to_samplesheet.py` script to be present in the `bin` folder of the directory from which the NextFlow script including this module is executed.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `val`
+
+Takes in the absolute UNIX path to a folder containing only FASTQ files (`inputdir`).
+
+Ex:
+
+```groovy
+'/hpc/scratch/test/reads'
+```
+
+\
+&nbsp;
+
+### `output:`
+
+___
+
+Type: `path`
+
+NextFlow output of type `path` pointing to auto-generated CSV sample sheet (`csv`).
+
+\
+&nbsp;
+
+#### `csv`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to auto-generated CSV sample sheet for all FASTQ files present in the folder given by NextFlow input type of `val` (`inputdir`).
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/gen_samplesheet/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,41 @@
+process GEN_SAMPLESHEET {
+    tag "${inputdir.simpleName}"
+    label "process_pico"
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null)
+    conda (params.enable_conda ? "conda-forge::python=3.9.5" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/python:3.9--1' :
+        'quay.io/biocontainers/python:3.9--1' }"
+
+    input:
+        val inputdir
+
+    output:
+        path '*.csv'       , emit: csv
+        path 'versions.yml', emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    // This script (fastq_dir_to_samplesheet.py) is distributed
+    // as part of the pipeline nf-core/rnaseq/bin/. MIT License.
+    script:
+        def this_script_args = (params.fq_single_end ? ' -se' : '')
+        this_script_args += (params.fq_suffix ? " -r1 '${params.fq_suffix}'" : '')
+        this_script_args += (params.fq2_suffix ? " -r2 '${params.fq2_suffix}'" : '')
+
+        """
+        fastq_dir_to_samplesheet.py -sn \\
+            -st '${params.fq_strandedness}' \\
+            -sd '${params.fq_filename_delim}' \\
+            -si ${params.fq_filename_delim_idx} \\
+            ${this_script_args} \\
+            ${inputdir} autogen_samplesheet.csv
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            python: \$( python --version | sed 's/Python //g' )
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/kma/align/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,135 @@
+# NextFlow DSL2 Module
+
+```bash
+KMA_ALIGN
+```
+
+## Description
+
+Runs the `kma` aligner on input FASTQ files with a pre-formatted `kma` index.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `tuple`
+
+Takes in the following tuple of metadata (`meta`), a list of reads of type `path` (`reads`) and a corresponding `kma` pre-formatted index folder per sample (`id:`).
+
+Ex:
+
+```groovy
+[
+    [
+        id: 'FAL00870',
+        strandedness: 'unstranded',
+        single_end: false
+    ],
+    [
+        '/hpc/scratch/test/f1.R1.fq.gz',
+        '/hpc/scratch/test/f1.R2.fq.gz'
+    ],
+    '/path/to/kma/index/folder'
+]
+```
+
+\
+&nbsp;
+
+#### `meta`
+
+Type: Groovy Map
+
+A Groovy Map containing the metadata about the FASTQ file.
+
+Ex:
+
+```groovy
+[
+    id: 'FAL00870',
+    strandedness: 'unstranded',
+    single_end: true
+]
+```
+
+\
+&nbsp;
+
+#### `reads`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to the FASTQ files (single-end or paired-end) on which `kma` should be run.
+
+\
+&nbsp;
+
+#### `index`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to folder containing `kma` index files.
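+
+One way to assemble this input tuple is to join a reads channel with the output of `KMA_INDEX` on the sample `id`. The following is a minimal sketch and is not taken from the `cpipes` workflows: the channel names `ch_reads` and `ch_kma_align_in` are hypothetical, and it assumes both channels carry a `meta` map with the same `id`.
+
+```groovy
+// Hypothetical wiring: ch_reads is assumed to emit [ meta, reads ].
+ch_reads
+    .map { meta, reads -> [ meta.id, meta, reads ] }
+    .join( KMA_INDEX.out.idx.map { meta, idx -> [ meta.id, idx ] } )
+    .map { id, meta, reads, idx -> [ meta, reads, idx ] }
+    .set { ch_kma_align_in }
+
+KMA_ALIGN ( ch_kma_align_in )
+```
+
+Joining on `meta.id` rather than on the whole `meta` map avoids mismatches when the two maps carry different keys.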
+
+\
+&nbsp;
+
+#### `args`
+
+Type: Groovy String
+
+String of optional command-line arguments to be passed to the tool. This can be mentioned in `process` scope within `withName:process_name` block using `ext.args` option within your `nextflow.config` file.
+
+Ex:
+
+```groovy
+withName: 'KMA_ALIGN' {
+    ext.args = '-mint2'
+}
+```
+
+### `output:`
+
+___
+
+Type: `tuple`
+
+Outputs a tuple of metadata (`meta` from `input:`) and `kma` result files.
+
+\
+&nbsp;
+
+#### `res`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.res` file from `kma` per sample (`id:`).
+
+\
+&nbsp;
+
+#### `mapstat`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.mapstat` file from `kma` per sample (`id:`). Optional: `true`
+
+\
+&nbsp;
+
+#### `hits`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to a `*_template_hits.txt` file containing only hit IDs. Optional: `true`
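+
+Judging from the accompanying `main.nf`, this file is only written when the `meta` map carries a truthy `get_kma_hit_accs` key. A sketch of such a `meta` map, reusing the sample values from the examples above:
+
+```groovy
+[
+    id: 'FAL00870',
+    strandedness: 'unstranded',
+    single_end: true,
+    get_kma_hit_accs: true // read by KMA_ALIGN to enable *_template_hits.txt
+]
+```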
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/kma/align/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,73 @@
+process KMA_ALIGN {
+    tag "$meta.id"
+    label 'process_low'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}kma${params.fs}1.4.4" : null)
+    conda (params.enable_conda ? "conda-forge::libgcc-ng bioconda::kma=1.4.3" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/kma:1.4.3--h7132678_1':
+        'quay.io/biocontainers/kma:1.4.3--h7132678_1' }"
+
+    input:
+        tuple val(meta), path(reads), path(index)
+
+    output:
+        path "${meta.id}_kma_res"
+        tuple val(meta), path("${meta.id}_kma_res${params.fs}*.res")              , emit: res
+        tuple val(meta), path("${meta.id}_kma_res${params.fs}*.mapstat")          , emit: mapstat, optional: true
+        tuple val(meta), path("${meta.id}_kma_res${params.fs}*.frag.gz")          , emit: frags, optional: true
+        tuple val(meta), path("${meta.id}_kma_res${params.fs}*_template_hits.txt"), emit: hits, optional: true
+        path "versions.yml"                                                       , emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args = task.ext.args ?: ''
+        def prefix = task.ext.prefix ?: "${meta.id}"
+        def reads_in = (meta.single_end ? "-i $reads" : "-ipe ${reads[0]} ${reads[1]}")
+        def db = (meta.kma_t_db ?: "${index}")
+        def db_basename = (meta.kma_t_db ? '' : "${params.fs}${index.baseName}")
+        def get_hit_accs = (meta.get_kma_hit_accs ? 'true' : 'false')
+        def res_dir = prefix + '_kma_res'
+        reads_in = (params.kmaalign_int ? "-int $reads" : "$reads_in")
+        """
+        mkdir -p $res_dir || exit 1
+        kma \\
+            $args \\
+            -t_db $db$db_basename \\
+            -t $task.cpus \\
+            -o $res_dir${params.fs}$prefix \\
+            $reads_in
+
+        if [ "$get_hit_accs" == "true" ]; then
+            grep -v '^#' $res_dir${params.fs}${prefix}.res | \\
+                grep -E -o '^[[:alnum:]]+\\-*\\_*[[:alnum:]]*\\.*[0-9]+' > $res_dir${params.fs}${prefix}_template_hits.txt || true
+        fi
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            kma: \$( kma -v | sed -e 's%KMA-%%' )
+        END_VERSIONS
+
+        mkdirver=""
+        cutver=""
+        grepver=""
+
+        if [ "${workflow.containerEngine}" != "null" ]; then
+            mkdirver=\$( mkdir --help 2>&1 | sed -e '1!d; s/ (.*\$//' |  cut -f1-2 -d' ' )
+            cutver="\$mkdirver"
+            grepver="\$mkdirver"
+        else
+            mkdirver=\$( mkdir --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' )
+            cutver=\$( cut --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' )
+            grepver=\$( echo \$(grep --version 2>&1) | sed 's/^.*(GNU grep) //; s/ Copyright.*\$//' )
+        fi
+
+        cat <<-END_VERSIONS >> versions.yml
+            mkdir: \$mkdirver
+            cut: \$cutver
+            grep: \$grepver
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/kma/index/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,86 @@
+# NextFlow DSL2 Module
+
+```bash
+KMA_INDEX
+```
+
+## Description
+
+Run the `kma index` command on input FASTA files.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `tuple`
+
+Takes in the following tuple of metadata (`meta`) and a FASTA file of type `path` (`fasta`) per sample (`id:`).
+
+Ex:
+
+```groovy
+[
+    [
+        id: 'FAL00870',
+    ],
+    '/path/to/FAL00870_contigs.fasta'
+]
+```
+
+\
+&nbsp;
+
+#### `meta`
+
+Type: Groovy Map
+
+A Groovy Map containing the metadata about the FASTA file.
+
+Ex:
+
+```groovy
+[
+    id: 'FAL00870'
+]
+```
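+
+The accompanying `main.nf` also reads an optional `kmaindex_t_db` key from `meta` and, when it is set, passes its value to `kma index` as `-t_db`. A sketch of such a `meta` map, where the path is only a placeholder:
+
+```groovy
+[
+    id: 'FAL00870',
+    kmaindex_t_db: '/path/to/existing/kma/db' // placeholder; forwarded as `-t_db`
+]
+```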
+
+\
+&nbsp;
+
+#### `fasta`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to the FASTA file on which the `kma index` command should be run.
+
+\
+&nbsp;
+
+### `output:`
+
+___
+
+Type: `tuple`
+
+Outputs a tuple of metadata (`meta` from `input:`) and a folder containing `kma index` files.
+
+\
+&nbsp;
+
+#### `idx`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the folder containing `kma index` files per sample (`id:`).
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/kma/index/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,57 @@
+process KMA_INDEX {
+    tag "$meta.id"
+    label 'process_nano'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}kma${params.fs}1.4.4" : null)
+    conda (params.enable_conda ? "conda-forge::libgcc-ng bioconda::kma=1.4.3 conda-forge::coreutils" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/kma:1.4.3--h7132678_1':
+        'quay.io/biocontainers/kma:1.4.3--h7132678_1' }"
+
+    input:
+        tuple val(meta), path(fasta)
+
+    output:
+        tuple val(meta), path("${meta.id}_kma_idx"), emit: idx
+        path "versions.yml"                        , emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args = task.ext.args ?: ''
+        def prefix = task.ext.prefix ?: "${meta.id}_kma_idx"
+        def add_to_db = (meta.kmaindex_t_db ? "-t_db ${meta.kmaindex_t_db}" : '')
+        """
+        mkdir -p $prefix && cd $prefix || exit 1
+        kma \\
+            index \\
+            $args \\
+            $add_to_db \\
+            -i ../$fasta \\
+            -o $prefix
+        cd .. || exit 1
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            kma: \$( kma -v | sed -e 's%KMA-%%' )
+        END_VERSIONS
+
+        mkdirver=""
+        cutver=""
+
+        if [ "${workflow.containerEngine}" != "null" ]; then
+            mkdirver=\$( mkdir --help 2>&1 | sed -e '1!d; s/ (.*\$//' |  cut -f1-2 -d' ' )
+            cutver="\$mkdirver"
+        else
+            mkdirver=\$( mkdir --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' )
+            cutver=\$( cut --version 2>&1 | sed '1!d; s/^.*(GNU coreutils//; s/) //;' )
+        fi
+
+        cat <<-END_VERSIONS >> versions.yml
+            mkdir: \$mkdirver
+            cut: \$cutver
+            cd: \$( bash --version 2>&1 | sed '1!d; s/^.*version //; s/ (.*\$//' )
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/krona/ktimporttext/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,39 @@
+process KRONA_KTIMPORTTEXT {
+    tag "$meta.id"
+    label 'process_nano'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}krona${params.fs}2.8.1" : null)
+    conda (params.enable_conda ? "conda-forge::curl bioconda::krona=2.8.1" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/krona:2.8.1--pl5321hdfd78af_1':
+        'quay.io/biocontainers/krona:2.8.1--pl5321hdfd78af_1' }"
+
+    input:
+        tuple val(meta), path(report)
+
+    output:
+        tuple val(meta), path ('*.html'), emit: html
+        path "versions.yml"             , emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args = task.ext.args ?: ''
+        def prefix = task.ext.prefix ?: "${meta.id}"
+        def krona_suffix = params.krona_res_suffix ?: '.krona.tsv'
+        def reports = report.collect {
+            it = it.toString() + ',' + it.toString().replaceAll(/(.*)${krona_suffix}$/, /$1/)
+        }.sort().join(' ')
+        """
+        ktImportText  \\
+            $args \\
+            -o ${prefix}.html \\
+            $reports
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            krona: \$( echo \$(ktImportText 2>&1) | sed 's/^.*KronaTools //g; s/- ktImportText.*\$//g')
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/multiqc/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,67 @@
+# NextFlow DSL2 Module
+
+```bash
+MULTIQC
+```
+
+## Description
+
+Generate an aggregated [**MultiQC**](https://multiqc.info/) report. This particular module **will only work** within the framework of `cpipes`, because it uses many `cpipes`-related UNIX absolute paths to store and retrieve **MultiQC**-related configuration files and `cpipes` context-aware metadata. It also uses a custom logo with filename `FDa-Logo-Blue---medium-01.png`, which should be located inside an `assets` folder from where the NextFlow script including this module will be executed.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `path`
+
+Takes in NextFlow input type of `path` which points to many log files that **MultiQC** should parse.
+
+Ex:
+
+```groovy
+[ '/data/sample1/centrifuge/cent_output.txt', '/data/sample1/kraken/kraken_output.txt' ]
+```
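+
+The accompanying `main.nf` appends `task.ext.args` to the `multiqc` command line, so extra options can be supplied from `nextflow.config` in the same way as for the other modules. A minimal sketch, assuming the `--export` option of **MultiQC** (static plot export, which is presumably what populates the optional `plots` output):
+
+```groovy
+withName: 'MULTIQC' {
+    // Assumption: `--export` asks MultiQC to also write static plot images.
+    ext.args = '--export'
+}
+```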
+
+\
+&nbsp;
+
+### `output:`
+
+___
+
+#### `report`
+
+Type: `path`
+
+Outputs a NextFlow output type of `path` pointing to the location of **MultiQC** final HTML report.
+
+\
+&nbsp;
+
+#### `data`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the data files folder generated by **MultiQC** which were used to generate plots and HTML report.
+
+\
+&nbsp;
+
+#### `plots`
+
+Type: `path`
+Optional: `true`
+
+NextFlow output type of `path` pointing to the plots folder generated by **MultiQC**.
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/multiqc/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,51 @@
+process MULTIQC {
+    label 'process_micro'
+    tag 'MultiQC'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}multiqc${params.fs}1.19" : null)
+    conda (params.enable_conda ? 'conda-forge::python=3.11 conda-forge::spectra conda-forge::lzstring conda-forge::imp bioconda::multiqc=1.19' : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/multiqc:1.19--pyhdfd78af_0' :
+        'quay.io/biocontainers/multiqc:1.19--pyhdfd78af_0' }"
+
+    input:
+        path multiqc_files
+
+    output:
+        path "*multiqc*"
+        path "*multiqc_report.html", emit: report
+        path "*_data"              , emit: data
+        path "*_plots"             , emit: plots, optional: true
+        path "versions.yml"        , emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args = task.ext.args ?: ''
+        """
+        cp ${params.projectconf}${params.fs}multiqc${params.fs}${params.pipeline}_mqc.yml cpipes_mqc_config.yml
+        sed -i -e 's/Workflow_Name_Placeholder/${params.pipeline}/g; s/Workflow_Version_Placeholder/${params.workflow_version}/g' cpipes_mqc_config.yml
+        sed -i -e 's/CPIPES_Version_Placeholder/${workflow.manifest.version}/g; s%Workflow_Output_Placeholder%${params.output}%g' cpipes_mqc_config.yml
+        sed -i -e 's%Workflow_Input_Placeholder%${params.input}%g' cpipes_mqc_config.yml
+
+        multiqc --interactive -c cpipes_mqc_config.yml -f $args .
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            multiqc: \$( multiqc --version | sed -e "s/multiqc, version //g" )
+        END_VERSIONS
+
+        sedver=""
+
+        if [ "${workflow.containerEngine}" != "null" ]; then
+            sedver=\$( sed --help 2>&1 | sed -e '1!d; s/ (.*\$//' )
+        else
+            sedver=\$( echo \$(sed --version 2>&1) | sed 's/^.*(GNU sed) //; s/ Copyright.*\$//' )
+        fi
+
+        cat <<-END_VERSIONS >> versions.yml
+            sed: \$sedver
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/nowayout_results/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,45 @@
+process NOWAYOUT_RESULTS {
+    tag "nowayout aggregate"
+    label "process_pico"
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null)
+    conda (params.enable_conda ? 'conda-forge::python=3.11 conda-forge::spectra conda-forge::lzstring conda-forge::imp bioconda::multiqc=1.19' : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/multiqc:1.19--pyhdfd78af_0' :
+        'quay.io/biocontainers/multiqc:1.19--pyhdfd78af_0' }"
+
+    input:
+        path pass_and_fail_rel_abn_files
+        path lineage_csv
+
+    output:
+        path '*.tblsum.txt', emit: mqc_txt, optional: true
+        path '*_mqc.json'  , emit: mqc_json, optional: true
+        path '*_mqc.yml'   , emit: mqc_yml, optional: true
+        path '*.tsv'       , emit: tsv, optional: true
+        path 'versions.yml', emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args = task.ext.args ?: ''
+        """
+        gen_salmon_tph_and_krona_tsv.py \\
+            $args \\
+            -sal "." \\
+            -smres "." \\
+            -lin $lineage_csv
+
+        create_mqc_data_table.py \\
+            "nowayout" "The results shown here are <code>salmon quant</code> TPM values scaled down by a factor of ${params.gsalkronapy_sf}."
+
+        create_mqc_data_table.py \\
+            "nowayout_indiv_reads_mapped" "The results shown here are the number of reads mapped (post threshold filters) per taxon to the <code>nowayout</code>'s custom <code>${params.db_mode}</code> database for each sample."
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            python: \$( python --version | sed 's/Python //g' )
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/otf_genome/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,39 @@
+process OTF_GENOME {
+    tag "$meta.id"
+    label "process_nano"
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null)
+    conda (params.enable_conda ? "conda-forge::python=3.10.4" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/python:3.10.4' :
+        'quay.io/biocontainers/python:3.10.4' }"
+
+    input:
+        tuple val(meta), path(kma_hits), path(kma_fragz)
+
+    output:
+        tuple val(meta), path('*_scaffolded_genomic.fna.gz'), emit: genomes_fasta, optional: true
+        tuple val(meta), path('*_aln_reads.fna.gz')         , emit: reads_extracted, optional: true
+        path '*FAILED.txt'                                  , emit: failed, optional: true
+        path 'versions.yml'                                 , emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args = task.ext.args ?: ''
+        def prefix = task.ext.prefix ?: "${meta.id}"
+        args += (kma_hits ? " -txt ${kma_hits}" : '')
+        args += (params.tuspy_gd ? " -gd ${params.tuspy_gd}" : '')
+        args += (prefix ? " -op ${prefix}" : '')
+
+        """
+        gen_otf_genome.py \\
+            $args
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            python: \$( python --version | sed 's/Python //g' )
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/salmon/index/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,88 @@
+# NextFlow DSL2 Module
+
+```bash
+SALMON_INDEX
+```
+
+## Description
+
+Run `salmon index` command on input FASTA file.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `tuple`
+
+Takes in the following tuple of metadata (`meta`) and a FASTA file of type `path` (`genome_fasta`) per sample (`id:`).
+
+Ex:
+
+```groovy
+[
+    [
+        id: 'FAL00870'
+    ],
+    [
+        '/hpc/scratch/test/FAL00870_contigs.fasta',
+    ]
+]
+```
+
+\
+&nbsp;
+
+#### `meta`
+
+Type: Groovy Map
+
+A Groovy Map containing the metadata about the genome FASTA file.
+
+Ex:
+
+```groovy
+[
+    id: 'FAL00870'
+]
+```
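+
+The accompanying `main.nf` additionally reads a `salmon_decoys` key from `meta`. It adds `--decoys` to the `salmon index` command only when that file exists and its name does not start with `dummy_file`. A sketch of a `meta` map with decoys, where the path is only a placeholder:
+
+```groovy
+[
+    id: 'FAL00870',
+    salmon_decoys: '/path/to/decoys.txt' // placeholder; forwarded as `--decoys`
+]
+```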
+
+\
+&nbsp;
+
+#### `genome_fasta`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to the FASTA file (gzipped or unzipped) on which `salmon index` should be run.
+
+\
+&nbsp;
+
+### `output:`
+
+___
+
+Type: `tuple`
+
+Outputs a tuple of metadata (`meta` from `input:`) and a folder containing `salmon index` result files.
+
+\
+&nbsp;
+
+#### `idx`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `salmon index` result files per sample (`id:`).
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/salmon/index/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,40 @@
+process SALMON_INDEX {
+    tag "$meta.id"
+    label "process_micro"
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}salmon${params.fs}1.10.0" : null)
+    conda (params.enable_conda ? 'conda-forge::libgcc-ng bioconda::salmon=1.10.1' : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/salmon:1.10.1--h7e5ed60_1' :
+        'quay.io/biocontainers/salmon:1.10.1--h7e5ed60_1' }"
+
+    input:
+        tuple val(meta), path(genome_fasta)
+
+    output:
+        tuple val(meta), path("${meta.id}_salmon_idx"), emit: idx
+        path "versions.yml"                           , emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args = task.ext.args ?: ''
+        def prefix = task.ext.prefix ?: "${meta.id}_salmon_idx"
+        def decoys_file = file( meta.salmon_decoys )
+        def decoys = !("${decoys_file.simpleName}" ==~ 'dummy_file.*') && decoys_file.exists() ? "--decoys ${meta.salmon_decoys}" : ''
+        """
+        salmon \\
+            index \\
+            $decoys \\
+            --threads $task.cpus \\
+            $args \\
+            --index $prefix \\
+            --transcripts $genome_fasta
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            salmon: \$(echo \$(salmon --version) | sed -e "s/salmon //g")
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/salmon/quant/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,118 @@
+# NextFlow DSL2 Module
+
+```bash
+SALMON_QUANT
+```
+
+## Description
+
+Run `salmon quant` in `reads` or `alignments` mode. The inputs can be either the alignment (Ex: `.bam`) files or read (Ex: `.fastq.gz`) files.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `tuple`
+
+Takes in the following tuple of metadata (`meta`) and either an alignment file or reads file and a `salmon index` or a transcript FASTA file per sample (`id:`).
+
+Ex:
+
+```groovy
+[
+    [
+        id: 'FAL00870',
+        strandedness: 'unstranded',
+        single_end: true
+    ],
+    [
+        '/hpc/scratch/test/FAL00870_R1.fastq.gz'
+    ],
+    [
+        '/hpc/scratch/test/salmon_idx_for_FAL00870'
+    ]
+]
+```
+
+\
+&nbsp;
+
+#### `meta`
+
+Type: Groovy Map
+
+A Groovy Map containing the metadata about the input setup for `salmon quant`.
+
+Ex:
+
+```groovy
+[
+    id: 'FAL00870',
+    strandedness: 'unstranded',
+    single_end: true
+]
+```
+
+\
+&nbsp;
+
+#### `reads_or_bam`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to either an alignment file (Ex: `.bam`) or a reads file (Ex: `.fastq.gz`) on which `salmon quant` should be run.
+
+\
+&nbsp;
+
+#### `index_or_tr_fasta`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to either a folder containing `salmon index` files or a transcript FASTA file.
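+
+According to the accompanying `main.nf`, alignment-based mode is selected through the `salmon_alignment_mode` key of `meta`, and the library type can be set through `salmon_lib_type`. A sketch of an input tuple for that mode, where both file paths are placeholders:
+
+```groovy
+[
+    [
+        id: 'FAL00870',
+        single_end: true,
+        salmon_alignment_mode: true, // switches the command to `-a <bam> -t <fasta>`
+        salmon_lib_type: 'SF'        // must be a library type accepted by main.nf
+    ],
+    '/path/to/FAL00870.bam',         // placeholder alignment file
+    '/path/to/transcripts.fasta'     // placeholder transcript FASTA
+]
+```
+
+When `salmon_lib_type` is absent, the module derives the library type from `single_end` and `strandedness` instead.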
+
+\
+&nbsp;
+
+#### `args`
+
+Type: Groovy String
+
+String of optional command-line arguments to be passed to the tool. This can be mentioned in `process` scope within `withName:process_name` block using `ext.args` option within your `nextflow.config` file.
+
+Ex:
+
+```groovy
+withName: 'SALMON_QUANT' {
+    ext.args = '--vbPrior 0.02'
+}
+```
+
+### `output:`
+
+___
+
+Type: `tuple`
+
+Outputs a tuple of metadata (`meta` from `input:`) and a folder containing `salmon quant` result files.
+
+\
+&nbsp;
+
+#### `results`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `salmon quant` result files per sample (`id:`).
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/salmon/quant/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,75 @@
+process SALMON_QUANT {
+    tag "$meta.id"
+    label "process_micro"
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}salmon${params.fs}1.10.0" : null)
+    conda (params.enable_conda ? 'conda-forge::libgcc-ng bioconda::salmon=1.10.1' : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/salmon:1.10.1--h7e5ed60_1' :
+        'quay.io/biocontainers/salmon:1.10.1--h7e5ed60_1' }"
+    input:
+        tuple val(meta), path(reads_or_bam), path(index_or_tr_fasta)
+
+    output:
+        tuple val(meta), path("${meta.id}_salmon_res"), emit: results
+        path  "versions.yml"                          , emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args = task.ext.args   ?: ''
+        def prefix   = task.ext.prefix ?: "${meta.id}_salmon_res"
+        def reference   = "--index $index_or_tr_fasta"
+        def lib_type = (meta.salmon_lib_type ?: '')
+        def alignment_mode = (meta.salmon_alignment_mode ?: '')
+        def gtf = (meta.salmon_gtf ? "--geneMap ${meta.salmon_gtf}" : '')
+        def input_reads =(meta.single_end || !reads_or_bam[1] ? "-r $reads_or_bam" : "-1 ${reads_or_bam[0]} -2 ${reads_or_bam[1]}")
+
+        // Use path(reads_or_bam) to point to BAM and path(index_or_tr_fasta) to point to transcript fasta
+        // if using salmon DSL2 module in alignment-based mode.
+        // By default, this module will be run in selective-alignment-based mode of salmon.
+        if (alignment_mode) {
+            reference   = "-t $index_or_tr_fasta"
+            input_reads = "-a $reads_or_bam"
+        }
+
+        def strandedness_opts = [
+            'A', 'U', 'SF', 'SR',
+            'IS', 'IU' , 'ISF', 'ISR',
+            'OS', 'OU' , 'OSF', 'OSR',
+            'MS', 'MU' , 'MSF', 'MSR'
+        ]
+
+        def strandedness =  'A'
+
+        if (lib_type) {
+            if (strandedness_opts.contains(lib_type)) {
+                strandedness = lib_type
+            } else {
+                log.info "[Salmon Quant] Invalid library type specified '--libType=${lib_type}', defaulting to auto-detection with '--libType=A'."
+            }
+        } else {
+            strandedness = meta.single_end ? 'U' : 'IU'
+            if (meta.strandedness == 'forward') {
+                strandedness = meta.single_end ? 'SF' : 'ISF'
+            } else if (meta.strandedness == 'reverse') {
+                strandedness = meta.single_end ? 'SR' : 'ISR'
+            }
+        }
+        """
+        salmon quant \\
+            --threads $task.cpus \\
+            --libType=$strandedness \\
+            $gtf \\
+            $args \\
+            -o $prefix \\
+            $reference \\
+            $input_reads
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            salmon: \$(echo \$(salmon --version) | sed -e "s/salmon //g")
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/samplesheet_check/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,55 @@
+# NextFlow DSL2 Module
+
+```bash
+SAMPLESHEET_CHECK
+```
+
+## Description
+
+Checks the validity of the sample sheet in CSV format to make sure that all mandatory fields are present. This module generally follows the `GEN_SAMPLESHEET` module as part of the `cpipes` pipelines to make sure that all fields of the columns are properly formatted to be used as a Groovy Map for `meta`, which is of input type `val`. This module requires the `check_samplesheet.py` script to be present in the `bin` folder from where the NextFlow script including this module will be executed.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `path`
+
+Takes in the absolute UNIX path to the sample sheet in CSV format (`samplesheet`).
+
+Ex:
+
+```groovy
+'/hpc/scratch/test/reads/output/gen_samplesheet/autogen_samplesheet.csv'
+```
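+
+Since this module usually runs right after `GEN_SAMPLESHEET`, the two can be chained through the `csv` output. A minimal sketch, in which the follow-up `splitCsv` step and the channel name `ch_samplesheet_rows` are hypothetical and only illustrate one way to consume the validated sheet:
+
+```groovy
+// GEN_SAMPLESHEET emits the auto-generated sheet on its `csv` output.
+SAMPLESHEET_CHECK ( GEN_SAMPLESHEET.out.csv )
+
+// Hypothetical follow-up: read the validated sheet row by row.
+SAMPLESHEET_CHECK.out.csv
+    .splitCsv( header: true, sep: ',' )
+    .set { ch_samplesheet_rows }
+```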
+
+\
+&nbsp;
+
+### `output:`
+
+___
+
+Type: `path`
+
+NextFlow output of type `path` pointing to properly formatted CSV sample sheet (`csv`).
+
+\
+&nbsp;
+
+#### `csv`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the validated CSV sample sheet written by `check_samplesheet.py`.
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/samplesheet_check/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,32 @@
+process SAMPLESHEET_CHECK {
+    tag "$samplesheet"
+    label "process_femto"
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}python${params.fs}3.8.1" : null)
+    conda (params.enable_conda ? "conda-forge::python=3.9.5" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/python:3.9--1' :
+        'quay.io/biocontainers/python:3.9--1' }"
+
+    input:
+        path samplesheet
+
+    output:
+        path '*.csv'       , emit: csv
+        path "versions.yml", emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script: // This script is bundled with the pipeline, in nf-core/rnaseq/bin/
+        """
+        check_samplesheet.py \\
+            $samplesheet \\
+            samplesheet.valid.csv
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            python: \$( python --version | sed 's/Python //g' )
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/samtools/fastq/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,53 @@
+process SAMTOOLS_FASTQ {
+    tag "$meta.id"
+    label 'process_micro'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}samtools${params.fs}1.13" : null)
+    conda (params.enable_conda ? "bioconda::samtools=1.18 bioconda::htslib=1.18 conda-forge::bzip2" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/samtools:1.18--h50ea8bc_1' :
+        'quay.io/biocontainers/samtools:1.18--h50ea8bc_1' }"
+
+    input:
+        tuple val(meta), path(input)
+        val(interleave)
+
+    output:
+        tuple val(meta), path("*_{1,2}.fastq.gz")      , optional:true, emit: fastq
+        tuple val(meta), path("mapped_refs.txt")       , optional:true, emit: mapped_refs
+        tuple val(meta), path("*_interleaved.fastq")   , optional:true, emit: interleaved
+        tuple val(meta), path("*_singleton.fastq.gz")  , optional:true, emit: singleton
+        tuple val(meta), path("*_other.fastq.gz")      , optional:true, emit: other
+        path  "versions.yml"                           , emit: versions
+
+    when:
+        task.ext.when == null || task.ext.when
+
+    script:
+        def args = task.ext.args ?: ''
+        def prefix = task.ext.prefix ?: "${meta.id}"
+        def output = ( interleave && ! meta.single_end ) ? "> ${prefix}_interleaved.fastq" :
+            meta.single_end ? "-1 ${prefix}_1.fastq.gz -s ${prefix}_singleton.fastq.gz" :
+            "-1 ${prefix}_1.fastq.gz -2 ${prefix}_2.fastq.gz -s ${prefix}_singleton.fastq.gz"
+        """
+        samtools \\
+            fastq \\
+            $args \\
+            --threads ${task.cpus-1} \\
+            -0 ${prefix}_other.fastq.gz \\
+            $input \\
+            $output
+
+        samtools \\
+            view \\
+            ${task.ext.args2 ?: ''} \\
+            --threads ${task.cpus-1} \\
+            $input \\
+            | grep -v '*' | cut -f3 | sort -u > mapped_refs.txt
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
+        END_VERSIONS
+        """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/seqkit/grep/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,113 @@
+# NextFlow DSL2 Module
+
+```bash
+SEQKIT_GREP
+```
+
+## Description
+
+Run `seqkit grep` command on reads in FASTQ format. Produces a filtered FASTQ file as per the filter strategy in the supplied input file.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `tuple`
+
+Takes in the following tuple of metadata (`meta`) and a list of reads of type `path` (`reads`) per sample (`id:`).
+
+Ex:
+
+```groovy
+[
+    [ id: 'FAL00870',
+       strandedness: 'unstranded',
+       single_end: true,
+       centrifuge_x: '/hpc/db/centrifuge/2022-04-12/ab'
+    ],
+    '/hpc/scratch/test/FAL000870/f1.merged.fq.gz'
+]
+```
+
+\
+&nbsp;
+
+#### `meta`
+
+Type: Groovy Map
+
+A Groovy Map containing the metadata about the FASTQ file.
+
+Ex:
+
+```groovy
+[
+    id: 'FAL00870',
+    strandedness: 'unstranded',
+    single_end: true
+]
+```
+
+\
+&nbsp;
+
+#### `reads`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to FASTQ files on which `seqkit grep` should be run.
+
+\
+&nbsp;
+
+#### `pattern_file`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to the pattern file which has the patterns, one per line, by which FASTQ sequence ids should be searched and whose reads will be extracted.
+
+\
+&nbsp;
+
+#### `args`
+
+Type: Groovy String
+
+String of optional command-line arguments to be passed to the tool. Set it in the `process` scope of your `nextflow.config` file, inside a `withName:process_name` block, using the `ext.args` option.
+
+Ex:
+
+```groovy
+withName: 'SEQKIT_GREP' {
+    ext.args = '--only-positive-strand'
+}
+```
+
+### `output:`
+
+___
+
+Type: `tuple`
+
+Outputs a tuple of metadata (`meta` from `input:`) and a filtered, gzipped FASTQ file.
+
+\
+&nbsp;
+
+#### `fastx`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the FASTQ format filtered gzipped file per sample (`id:`).
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/seqkit/grep/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,90 @@
+process SEQKIT_GREP {
+    tag "$meta.id"
+    label 'process_low'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}seqkit${params.fs}2.2.0" : null)
+    conda (params.enable_conda ? "bioconda::seqkit=2.2.0 conda-forge::sed=4.7 conda-forge::coreutils" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/seqkit:2.1.0--h9ee0642_0':
+        'quay.io/biocontainers/seqkit:2.1.0--h9ee0642_0' }"
+
+    input:
+    tuple val(meta), path(reads), path(pattern_file)
+
+    output:
+    tuple val(meta), path("*.gz"), emit: fastx
+    path "versions.yml"          , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+    def num_read_files = reads.toList().size()
+    def extension = "fastq"
+    if ("$reads" ==~ /.+\.fasta|.+\.fasta.gz|.+\.fa|.+\.fa.gz|.+\.fas|.+\.fas.gz|.+\.fna|.+\.fna.gz/) {
+        extension = "fasta"
+    }
+
+    if (meta.single_end || num_read_files == 1) {
+        """
+        pattern_file_contents=\$(sed '1!d' $pattern_file)
+        if [ "\$pattern_file_contents" != "DuMmY" ]; then
+            cut -f1 -d " " $pattern_file > ${prefix}.seqids.txt
+            additional_args="-f ${prefix}.seqids.txt $args"
+        else
+            additional_args="$args"
+        fi
+
+        seqkit \\
+            grep \\
+            -j $task.cpus \\
+            -o ${prefix}.seqkit-grep.${extension}.gz \\
+            \$additional_args \\
+            $reads
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            seqkit: \$( seqkit | sed '3!d; s/Version: //' )
+        END_VERSIONS
+        """
+    } else {
+        """
+        pattern_file_contents=\$(sed '1!d' $pattern_file)
+        if [ "\$pattern_file_contents" != "DuMmY" ]; then
+            additional_args="-f $pattern_file $args"
+        else
+            additional_args="$args"
+        fi
+
+        seqkit \\
+            grep \\
+            -j $task.cpus \\
+            -o ${prefix}.R1.seqkit-grep.${extension}.gz \\
+            \$additional_args \\
+            ${reads[0]}
+
+        seqkit \\
+            grep \\
+            -j $task.cpus \\
+            -o ${prefix}.R2.seqkit-grep.${extension}.gz \\
+            \$additional_args \\
+            ${reads[1]}
+
+        seqkit \\
+            pair \\
+            -j $task.cpus \\
+            -1 ${prefix}.R1.seqkit-grep.${extension}.gz \\
+            -2 ${prefix}.R2.seqkit-grep.${extension}.gz
+
+        rm ${prefix}.R1.seqkit-grep.${extension}.gz
+        rm ${prefix}.R2.seqkit-grep.${extension}.gz
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            seqkit: \$( seqkit | sed '3!d; s/Version: //' )
+        END_VERSIONS
+        """
+    }
+}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/seqkit/seq/README.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,104 @@
+# NextFlow DSL2 Module
+
+```bash
+SEQKIT_SEQ
+```
+
+## Description
+
+Run `seqkit seq` command on reads in FASTQ format. Produces a filtered FASTQ file as per the filter strategy mentioned using the `ext.args` within the process scope.
+
+\
+&nbsp;
+
+### `input:`
+
+___
+
+Type: `tuple`
+
+Takes in the following tuple of metadata (`meta`) and a list of reads of type `path` (`reads`) per sample (`id:`).
+
+Ex:
+
+```groovy
+[
+    [ id: 'FAL00870',
+       strandedness: 'unstranded',
+       single_end: true,
+       centrifuge_x: '/hpc/db/centrifuge/2022-04-12/ab'
+    ],
+    '/hpc/scratch/test/FAL000870/f1.merged.fq.gz'
+]
+```
+
+\
+&nbsp;
+
+#### `meta`
+
+Type: Groovy Map
+
+A Groovy Map containing the metadata about the FASTQ file.
+
+Ex:
+
+```groovy
+[
+    id: 'FAL00870',
+    strandedness: 'unstranded',
+    single_end: true
+]
+```
+
+\
+&nbsp;
+
+#### `reads`
+
+Type: `path`
+
+NextFlow input type of `path` pointing to FASTQ files on which `seqkit seq` should be run.
+
+\
+&nbsp;
+
+#### `args`
+
+Type: Groovy String
+
+String of optional command-line arguments to be passed to the tool. Set it in the `process` scope of your `nextflow.config` file, inside a `withName:process_name` block, using the `ext.args` option.
+
+Ex:
+
+```groovy
+withName: 'SEQKIT_SEQ' {
+    ext.args = '--max-len 4000'
+}
+```
+
+### `output:`
+
+___
+
+Type: `tuple`
+
+Outputs a tuple of metadata (`meta` from `input:`) and a filtered, gzipped FASTQ file.
+
+\
+&nbsp;
+
+#### `fastx`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the FASTQ format filtered gzipped file per sample (`id:`).
+
+\
+&nbsp;
+
+#### `versions`
+
+Type: `path`
+
+NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/seqkit/seq/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,75 @@
+process SEQKIT_SEQ {
+    tag "$meta.id"
+    label 'process_micro'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}seqkit${params.fs}2.2.0" : null)
+    conda (params.enable_conda ? "bioconda::seqkit=2.2.0" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/seqkit:2.1.0--h9ee0642_0':
+        'quay.io/biocontainers/seqkit:2.1.0--h9ee0642_0' }"
+
+    input:
+    tuple val(meta), path(reads)
+
+    output:
+    tuple val(meta), path("*.gz"), emit: fastx
+    path "versions.yml"          , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+
+    def extension = "fastq"
+    if ("$reads" ==~ /.+\.fasta|.+\.fasta.gz|.+\.fa|.+\.fa.gz|.+\.fas|.+\.fas.gz|.+\.fna|.+\.fna.gz/) {
+        extension = "fasta"
+    }
+
+    if (meta.single_end) {
+        """
+        seqkit \\
+            seq \\
+            -j $task.cpus \\
+            -o ${prefix}.seqkit-seq.${extension}.gz \\
+            $args \\
+            $reads
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            seqkit: \$( seqkit | sed '3!d; s/Version: //' )
+        END_VERSIONS
+        """
+    } else {
+        """
+        seqkit \\
+            seq \\
+            -j $task.cpus \\
+            -o ${prefix}.R1.seqkit-seq.${extension}.gz \\
+            $args \\
+            ${reads[0]}
+
+        seqkit \\
+            seq \\
+            -j $task.cpus \\
+            -o ${prefix}.R2.seqkit-seq.${extension}.gz \\
+            $args \\
+            ${reads[1]}
+
+        seqkit \\
+            pair \\
+            -j $task.cpus \\
+            -1 ${prefix}.R1.seqkit-seq.${extension}.gz \\
+            -2 ${prefix}.R2.seqkit-seq.${extension}.gz
+
+        rm ${prefix}.R1.seqkit-seq.${extension}.gz
+        rm ${prefix}.R2.seqkit-seq.${extension}.gz
+
+        cat <<-END_VERSIONS > versions.yml
+        "${task.process}":
+            seqkit: \$( seqkit | sed '3!d; s/Version: //' )
+        END_VERSIONS
+        """
+    }
+}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/sourmash/gather/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,60 @@
+process SOURMASH_GATHER {
+    tag "$meta.id"
+    label 'process_nano'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}sourmash${params.fs}4.6.1" : null)
+    conda (params.enable_conda ? "conda-forge::python bioconda::sourmash=4.6.1" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/sourmash:4.6.1--hdfd78af_0':
+        'quay.io/biocontainers/sourmash:4.6.1--hdfd78af_0' }"
+
+    input:
+    tuple val(meta), path(signature), path(database)
+    val save_unassigned
+    val save_matches_sig
+    val save_prefetch
+    val save_prefetch_csv
+
+    output:
+    tuple val(meta), path("*_hits.csv")          , emit: result       , optional: true
+    tuple val(meta), path("*_unassigned.sig.zip"), emit: unassigned   , optional: true
+    tuple val(meta), path("*_matches.sig.zip")   , emit: matches      , optional: true
+    tuple val(meta), path("*_prefetch.sig.zip")  , emit: prefetch     , optional: true
+    tuple val(meta), path("*_prefetch.csv.gz")   , emit: prefetchcsv  , optional: true
+    tuple val(meta), path("*FAILED.txt")         , emit: failed       , optional: true
+    path "versions.yml"                          , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args        = task.ext.args ?: ''
+    def args2       = task.ext.args2 ?: ''
+    def prefix      = task.ext.prefix ?: "${meta.id}"
+    def unassigned  = save_unassigned   ? "--output-unassigned ${prefix}_unassigned.sig.zip" : ''
+    def matches     = save_matches_sig  ? "--save-matches ${prefix}_matches.sig.zip"         : ''
+    def prefetch    = save_prefetch     ? "--save-prefetch ${prefix}_prefetch.sig.zip"       : ''
+    def prefetchcsv = save_prefetch_csv ? "--save-prefetch-csv ${prefix}_prefetch.csv.gz"    : ''
+
+    """
+    sourmash gather \\
+        $args \\
+        --output ${prefix}.csv.gz \\
+        ${unassigned} \\
+        ${matches} \\
+        ${prefetch} \\
+        ${prefetchcsv} \\
+        ${signature} \\
+        ${database}
+
+    sourmash_filter_hits.py \\
+        $args2 \\
+        -csv ${prefix}.csv.gz
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        sourmash: \$(echo \$(sourmash --version 2>&1) | sed 's/^sourmash //' )
+        python: \$( python --version | sed 's/Python //g' )
+    END_VERSIONS
+    """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/sourmash/search/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,55 @@
+process SOURMASH_SEARCH {
+    tag "$meta.id"
+    label 'process_micro'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}sourmash${params.fs}4.6.1" : null)
+    conda (params.enable_conda ? "conda-forge::python bioconda::sourmash=4.6.1" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/sourmash:4.6.1--hdfd78af_0':
+        'quay.io/biocontainers/sourmash:4.6.1--hdfd78af_0' }"
+
+    input:
+    tuple val(meta), path(signature), path(database)
+    val save_matches_sig
+
+    output:
+    tuple val(meta), path("*.csv.gz")                   , emit: result       , optional: true
+    tuple val(meta), path("*_scaffolded_genomic.fna.gz"), emit: genomes_fasta, optional: true
+    tuple val(meta), path("*_matches.sig.zip")          , emit: matches      , optional: true
+    path "*FAILED.txt"                                  , emit: failed       , optional: true
+    path "versions.yml"                                 , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args        = task.ext.args ?: ''
+    def args2       = task.ext.args2 ?: ''
+    def prefix      = task.ext.prefix ?: "${meta.id}"
+    def matches     = save_matches_sig  ? "--save-matches ${prefix}_matches.sig.zip" : ''
+    def gd          = params.tuspy_gd   ? "-gd ${params.tuspy_gd}"                   : ''
+
+    """
+    sourmash search \\
+        $args \\
+        --output ${prefix}.csv.gz \\
+        ${matches} \\
+        ${signature} \\
+        ${database}
+
+    sourmash_filter_hits.py \\
+        $args2 \\
+        -csv ${prefix}.csv.gz
+
+    gen_otf_genome.py \\
+        $gd \\
+        -op ${prefix} \\
+        -txt ${prefix}_template_hits.txt
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        sourmash: \$(echo \$(sourmash --version 2>&1) | sed 's/^sourmash //' )
+        python: \$( python --version | sed 's/Python //g' )
+    END_VERSIONS
+    """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/sourmash/sketch/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,43 @@
+process SOURMASH_SKETCH {
+    tag "$meta.id"
+    label 'process_nano'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}sourmash${params.fs}4.6.1" : null)
+    conda (params.enable_conda ? "conda-forge::python bioconda::sourmash=4.6.1" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/sourmash:4.6.1--hdfd78af_0':
+        'quay.io/biocontainers/sourmash:4.6.1--hdfd78af_0' }"
+
+    input:
+    tuple val(meta), path(sequence)
+    val singleton
+    val merge
+    val db_or_query
+
+    output:
+    tuple val(meta), path("*.{query,db}.sig"), emit: signatures
+    path "versions.yml"                      , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    // required defaults for the tool to run, but can be overridden
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+    def merge_sig = merge ? "--merge ${meta.id}" : ''
+    def singleton = singleton ? '--singleton' : ''
+    """
+    sourmash sketch \\
+        $args \\
+        $merge_sig \\
+        $singleton \\
+        --output "${prefix}.${db_or_query}.sig" \\
+        $sequence
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        sourmash: \$(echo \$(sourmash --version 2>&1) | sed 's/^sourmash //' )
+    END_VERSIONS
+    """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/modules/sourmash/tax/metagenome/main.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,40 @@
+process SOURMASH_TAX_METAGENOME {
+    tag "$meta.id"
+    label 'process_nano'
+
+    module (params.enable_module ? "${params.swmodulepath}${params.fs}sourmash${params.fs}4.6.1" : null)
+    conda (params.enable_conda ? "conda-forge::python bioconda::sourmash=4.6.1" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/sourmash:4.6.1--hdfd78af_0':
+        'quay.io/biocontainers/sourmash:4.6.1--hdfd78af_0' }"
+
+    input:
+    tuple val(meta), path(csv), path(lineage)
+
+    output:
+    tuple val(meta), path("*.txt"), emit: txt, optional: true
+    tuple val(meta), path("*.tsv"), emit: tsv, optional: true
+    tuple val(meta), path("*.csv"), emit: csv, optional: true
+    path "versions.yml"           , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    // required defaults for the tool to run, but can be overridden
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+    def output_format = args.findAll(/(--output-format\s+[\w\,]+)\s*/).join("").replaceAll(/\,/, / --output-format /)
+    args = args.replaceAll(/--output-format\s+[\w\,]+\s*/, /${output_format}/)
+    """
+    sourmash tax metagenome \\
+        $args \\
+        -g $csv \\
+        --output-base $prefix \\
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        sourmash: \$(echo \$(sourmash --version 2>&1) | sed 's/^sourmash //' )
+    END_VERSIONS
+    """
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/nextflow.config	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,41 @@
+// Main driver script
+manifest.mainScript = 'cpipes'
+
+def fs = File.separator
+def pd = "${projectDir}"
+
+// Global parameters
+includeConfig "${pd}${fs}conf${fs}manifest.config"
+includeConfig "${pd}${fs}conf${fs}base.config"
+
+// Include FASTQ config to prepare for a case when the entry point is
+// FASTQ metadata CSV or FASTQ input directory
+includeConfig "${pd}${fs}conf${fs}fastq.config"
+
+if (params.pipeline != null) {
+    try {
+        includeConfig "${params.workflowsconf}${fs}${params.pipeline}.config"
+    } catch (Exception e) {
+        System.err.println('-'.multiply(params.linewidth) + "\n" +
+            "\033[0;31m${params.cfsanpipename} - ERROR\033[0m\n" +
+            '-'.multiply(params.linewidth) + "\n" + "\033[0;31mCould not load " +
+            "default pipeline configuration. Please provide a pipeline \n" +
+            "name using the --pipeline option.\n\033[0m" + '-'.multiply(params.linewidth) + "\n")
+        System.exit(1)
+    }
+}
+
+// Include modules' config last.
+includeConfig "${pd}${fs}conf${fs}logtheseparams.config"
+includeConfig "${pd}${fs}conf${fs}modules.config"
+
+// Nextflow runtime profiles
+conda.cacheDir = "${pd}${fs}kondagac_cache"
+singularity.cacheDir = "${pd}${fs}cingularitygac_cache"
+
+// Clean up after successful run
+// cleanup = true
+
+profiles {
+    includeConfig "${pd}${fs}conf${fs}computeinfra.config"
+}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/readme/nowayout.md	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,775 @@
+<p align="center">
+    <img src="../assets/nowayout-icon.png" width="20%" height="20%" />
+</p>
+
+---
+
+`nowayout` is a **super-fast** automated software pipeline for taxonomic classification of Eukaryotic mitochondrial reads. It uses a custom database to first identify mitochondrial reads and performs read classification on those identified reads.
+
+---
+
+<!-- TOC -->
+
+- [Minimum Requirements](#minimum-requirements)
+- [HFP GalaxyTrakr](#hfp-galaxytrakr)
+- [Usage and Examples](#usage-and-examples)
+  - [Databases](#databases)
+  - [Input](#input)
+  - [Output](#output)
+  - [Preset filters](#preset-filters)
+  - [Computational resources](#computational-resources)
+  - [Runtime profiles](#runtime-profiles)
+  - [your_institution.config](#your_institutionconfig)
+- [Test Run](#test-run)
+- [nowayout CLI Help](#nowayout-cli-help)
+
+<!-- /TOC -->
+
+\
+&nbsp;
+
+## Minimum Requirements
+
+1. [Nextflow version 25.04.6](https://github.com/nextflow-io/nextflow/releases/download/v25.04.6/nextflow).
+    - Make the `nextflow` binary executable (`chmod 755 nextflow`) and also make sure that it is made available in your `$PATH`.
+    - If your existing `JAVA` install does not support the newest **Nextflow** version, you can try **Amazon**'s `JAVA` (OpenJDK):  [Corretto](https://docs.aws.amazon.com/corretto/latest/corretto-21-ug/downloads-list.html).
+2. One of `micromamba` (version `1.5.9`), `docker` or `singularity`, installed and made available in your `$PATH`.
+    - Running the workflow via `micromamba` software provisioning is **preferred** as it does not require any `sudo` or `admin` privileges or any other configurations with respect to the various container providers.
+    - To install `micromamba` for your system type, please follow these [installation steps](https://mamba.readthedocs.io/en/latest/installation/micromamba-installation.html#linux-and-macos) and make sure that the `micromamba` binary is made available in your `$PATH`.
+    - Just the `curl` step is sufficient to download the binary as far as running the workflows is concerned.
+    - Once you have finished the installation, **it is important that you downgrade `micromamba` to version `1.5.9`**.
+    - First check your installed version. If it is anything other than `1.5.9`, do the downgrade.
+
+        ```bash
+        micromamba --version
+        micromamba self-update --version 1.5.9 -c conda-forge
+        ```
+
+3. Minimum of 10 CPU cores and about 60 GB of memory for the main workflow steps. More memory may be required if your **FASTQ** files are big.
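
The version check described in step 2 can be sketched as a small guard script. This is only an illustration: the helper name `needs_downgrade` is hypothetical, and the script assumes that `micromamba --version` prints a bare version string such as `1.5.9`. It prints the downgrade command rather than running it.

```bash
#!/usr/bin/env bash
# Hypothetical guard: report whether the installed micromamba differs
# from the version this README asks for.
required="1.5.9"

# Succeeds (exit 0) when the given version string is not the required one.
needs_downgrade() {
    [ "$1" != "$required" ]
}

if command -v micromamba >/dev/null 2>&1; then
    # Assumption: the command prints only the version number.
    current="$(micromamba --version)"
    if needs_downgrade "$current"; then
        echo "Found micromamba ${current}; run:"
        echo "  micromamba self-update --version ${required} -c conda-forge"
    else
        echo "micromamba ${required} is already installed."
    fi
else
    echo "micromamba was not found in PATH."
fi
```

Printing the command instead of executing it keeps the check safe to run repeatedly on shared systems.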
+
+\
+&nbsp;
+
+## HFP GalaxyTrakr
+
+The `nowayout` pipeline **will** be made available for use on the newest version of [Galaxy instance supported by HFP, FDA](https://galaxytrakr.org/) (`version >= 24.x`). Please check this space for announcements in this regard.
+
+Please note that the pipeline on [HFP GalaxyTrakr](https://galaxytrakr.org) in most cases may be a version older than the one on **GitHub** due to testing prioritization.
+
+\
+&nbsp;
+
+## Usage and Examples
+
+Clone or download this repository and then call `cpipes`.
+
+```bash
+cpipes --pipeline nowayout [options]
+```
+
+Alternatively, you can use `nextflow` to directly pull and run the pipeline.
+
+```bash
+nextflow pull CFSAN-Biostatistics/nowayout
+nextflow list
+nextflow info CFSAN-Biostatistics/nowayout
+nextflow run CFSAN-Biostatistics/nowayout --pipeline nowayout --help
+```
+
+\
+&nbsp;
+
+### Databases
+
+---
+
+The successful run of the workflow requires proper setup of the custom database files:
+
+- `nowayout_dbs`: [Download](https://cfsan-pub-xfer.s3.amazonaws.com/Kranti.Konganti/nowayout/nowayout_dbs.tar.bz2) (~ 22 GB).
+
+Once you have downloaded the databases, uncompress and set the **UNIX symbolic link** to the database folders in [assets](../assets/) folder as follows:
+
+```bash
+mkdir assets/dbfiles
+cd assets/dbfiles
+ln -s /path/to/nowayout_dbs/kma kma
+ln -s /path/to/nowayout_dbs/reference reference
+ln -s /path/to/nowayout_dbs/taxonomy taxonomy
+```
+
+That's it!
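
Before starting a run, it may help to confirm that the three links resolve to real directories. The following is a minimal sketch, not part of the pipeline: the function name `check_db_links` is hypothetical, and it assumes the `assets/dbfiles` layout created by the commands above.

```bash
#!/usr/bin/env bash
# Hypothetical check: every expected database link must resolve to a directory.
# `-d` follows symbolic links, so a dangling link is reported as missing.
check_db_links() {
    local dbdir="${1:-assets/dbfiles}" name missing=0
    for name in kma reference taxonomy; do
        if [ ! -d "${dbdir}/${name}" ]; then
            echo "missing or broken link: ${dbdir}/${name}" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Run from the repository root; the path is the one used in this README.
if check_db_links "assets/dbfiles" 2>/dev/null; then
    echo "All database links resolve."
else
    echo "One or more database links need attention."
fi
```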
+
+\
+&nbsp;
+
+### Input
+
+---
+
+The input to the workflow is a folder containing compressed (`.gz`) FASTQ files of long reads or short reads. Please note that the sample grouping happens automatically by the file name of the FASTQ file. If for example, a single sample is sequenced across multiple sequencing lanes, you can choose to group those FASTQ files into one sample by using the `--fq_filename_delim` and `--fq_filename_delim_idx` options. By default, `--fq_filename_delim` is set to `_` (underscore) and `--fq_filename_delim_idx` is set to 1.
+
+For example, if the directory contains FASTQ files as shown below:
+
+- KB-01_apple_L001_R1.fastq.gz
+- KB-01_apple_L001_R2.fastq.gz
+- KB-01_apple_L002_R1.fastq.gz
+- KB-01_apple_L002_R2.fastq.gz
+- KB-02_mango_L001_R1.fastq.gz
+- KB-02_mango_L001_R2.fastq.gz
+- KB-02_mango_L002_R1.fastq.gz
+- KB-02_mango_L002_R2.fastq.gz
+
+Then, to create 2 sample groups, `apple` and `mango`, we split the file name by the delimiter (underscore in this case, which is the default) and group by the first 2 words (`--fq_filename_delim_idx 2`).
+
+It goes without saying that all the FASTQ files should follow a uniform naming pattern, so that the `--fq_filename_delim` and `--fq_filename_delim_idx` options do not have any adverse effect when the sample metadata sheet is collected and created.
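
The grouping rule can be previewed on file names alone before launching a run. The sketch below is an approximation, not the pipeline's own code: the function name `sample_key` is hypothetical, and it assumes the sample key is simply the first `idx` delimiter-separated fields of the file name.

```bash
#!/usr/bin/env bash
# Hypothetical helper mimicking --fq_filename_delim / --fq_filename_delim_idx:
# keep the first <idx> fields of the file name, split on <delim>.
sample_key() {
    local filename="$1" delim="${2:-_}" idx="${3:-1}"
    printf '%s\n' "$filename" | cut -d "$delim" -f "1-${idx}"
}

# Preview the sample groups for the example file names listed above.
for fq in KB-01_apple_L001_R1.fastq.gz KB-01_apple_L002_R1.fastq.gz \
          KB-02_mango_L001_R1.fastq.gz KB-02_mango_L002_R1.fastq.gz; do
    sample_key "$fq" _ 2
done | sort -u
```

Under that assumption, the four lanes collapse into two keys, one containing `apple` and one containing `mango`. A file name that does not follow the pattern would show up as an extra key in this preview.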
+
+\
+&nbsp;
+
+### Output
+
+---
+
+All the outputs for each step are stored inside the folder specified with the `--output` option. The `multiqc_report.html` file inside the `nowayout-multiqc` folder contains a brief consolidated report and can be opened in any browser on your local workstation.
+
+Please note that the percentage relative abundances seen are relative to the total number of mitochondrial reads and not the total number of reads per sample.
+
+\
+&nbsp;
+
+### Preset filters
+
+---
+
+There are three preset threshold filters that are available with the pipeline: `--nowo_thresholds strict`, `--nowo_thresholds mild` and `--nowo_thresholds relax`. Use these options for exploration of results via multiple runs on the same input dataset. The default is `strict` thresholds.
+
+\
+&nbsp;
+
+### Computational resources
+
+---
+
+The `nowayout` workflow requires a minimum of 10 CPU cores and 60 GB of memory to finish successfully.
+
+\
+&nbsp;
+
+### Runtime profiles
+
+---
+
+You can use different run time profiles to suit your specific compute environment. For example, you can run the workflow locally on your machine or on a grid computing infrastructure.
+
+\
+&nbsp;
+
+Example:
+
+```bash
+cd /data/scratch/$USER
+mkdir nf-cpipes
+cd nf-cpipes
+cpipes \
+    --pipeline nowayout \
+    --input /path/to/fastq_pass_dir \
+    --output /path/to/where/output/should/go \
+    -profile your_institution
+```
+
+The above command runs the pipeline and stores the output at the location given by the `--output` flag. The **NEXTFLOW** reports are always stored in the current working directory from which `cpipes` is run. For the above command, a directory called `CPIPES-nowayout` would hold all the **NEXTFLOW** related logs, reports and trace files.
+
+\
+&nbsp;
+
+### `your_institution.config`
+
+---
+
+In the above example, we can see that we have mentioned the run time profile as `your_institution`. For this to work, add the following lines at the end of [`computeinfra.config`](../conf/computeinfra.config) file which should be located inside the `conf` folder. For example, if your institution uses **SGE** or **UNIVA** for grid computing instead of **SLURM** and has a job queue named `normal.q`, then add these lines:
+
+\
+&nbsp;
+
+```groovy
+your_institution {
+    process.executor = 'sge'
+    process.queue = 'normal.q'
+    singularity.enabled = false
+    singularity.autoMounts = true
+    docker.enabled = false
+    params.enable_conda = true
+    conda.enabled = true
+    conda.useMicromamba = true
+    params.enable_module = false
+}
+```
+
+In the above example, by default, all the software provisioning choices are disabled except `conda`. You can also choose to remove the `process.queue` line altogether and the `nowayout` workflow will request the appropriate memory and number of CPU cores automatically, which ranges from 1 CPU, 1 GB and 1 hour for job completion up to 10 CPU cores, 1 TB and 120 hours for job completion.
+
+\
+&nbsp;
+
+### Cloud computing
+
+---
+
+You can run the workflow in the cloud (works only with proper set up of AWS resources). Add new run time profiles with required parameters per [Nextflow docs](https://www.nextflow.io/docs/latest/executor.html):
+
+\
+&nbsp;
+
+Example:
+
+```groovy
+my_aws_batch {
+    process.executor = 'awsbatch'
+    process.queue = 'my-batch-queue'
+    aws.batch.cliPath = '/home/ec2-user/miniconda/bin/aws'
+    aws.batch.region = 'us-east-1'
+    singularity.enabled = false
+    singularity.autoMounts = true
+    docker.enabled = true
+    params.enable_conda = false
+    params.enable_module = false
+}
+```
+
+\
+&nbsp;
+
+## Test Run
+
+After you make sure that you have all the [minimum requirements](#minimum-requirements) to run the workflow, you can try `nowayout` on a test dataset.
+
+- Download input reads [from S3](https://cfsan-pub-xfer.s3.amazonaws.com/Kranti.Konganti/nowayout/nowayout_test_reads.tar.bz2) (~ 8 GB).
+  - This dataset was part of the research for detecting and identifying insects or insect fragments in food, an essential component of food safety and regulatory monitoring. Insects such as **_Plodia interpunctella_** (Indian meal moth) and _**Tribolium castaneum**_ (red flour beetle) were intentionally spiked into wheat flour at varying concentrations to create benchmark samples. These serve as reference materials to test and validate molecular detection workflows.
+- Download pre-formatted databases (**MANDATORY**) [from S3](https://cfsan-pub-xfer.s3.amazonaws.com/Kranti.Konganti/nowayout/nowayout_dbs.tar.bz2) (~ 22 GB).
+- After successful download, untar and add **symbolic links** in [assets](../assets) folder as described in the [Databases](#databases) section.
+- It is always a best practice to use absolute UNIX paths and real destinations of symbolic links during pipeline execution. For example, find out the real path(s) of your absolute UNIX path(s) and use that for the `--input` and `--output` options of the pipeline.
+
+  ```bash
+  realpath /hpc/scratch/user/input/srr
+  ```
+
+- Now run the workflow by ignoring quality values since these are simulated base qualities:
+
+    ```bash
+    cpipes \
+        --pipeline nowayout \
+        --input /path/to/nowayout_test_reads \
+        --output /path/to/nowayout_test_output \
+        --fq_single_end true \
+        -profile stdkondagac \
+        -resume
+    ```
+
+- After a successful run of the workflow, your **MultiQC** report should look something like [this](https://cfsan-pub-xfer.s3.us-east-1.amazonaws.com/Kranti.Konganti/nowayout/CPIPES-Report_multiqc_report.html).
+
+- `nowayout` also automatically generates [Krona](https://github.com/marbl/Krona) charts. The **Krona** chart for the above test run should look something like [this](https://cfsan-pub-xfer.s3.us-east-1.amazonaws.com/Kranti.Konganti/nowayout/CPIPES_nowayout_krona.html).
+
+Please note that the run time profile `stdkondagac` will run jobs locally using `micromamba` for software provisioning. The first time you run the command, a new folder called `kondagac_cache` will be created and subsequent runs should use this `conda` cache.
+
+\
+&nbsp;
+
+## `nowayout` CLI Help
+
+```text
+cpipes --pipeline nowayout --help
+
+ N E X T F L O W   ~  version 24.10.4
+
+Launching `/home/user/nowayout/cpipes` [sleepy_pauling] DSL2 - revision: 55d6f63710
+
+================================================================================
+             (o)
+  ___  _ __   _  _ __    ___  ___
+ / __|| '_ \ | || '_ \  / _ \/ __|
+| (__ | |_) || || |_) ||  __/\__ \
+ \___|| .__/ |_|| .__/  \___||___/
+      | |       | |
+      |_|       |_|
+--------------------------------------------------------------------------------
+A collection of modular pipelines at CFSAN, FDA.
+--------------------------------------------------------------------------------
+Name                            : CPIPES
+Author                          : Kranti.Konganti@fda.hhs.gov
+Version                         : 0.8.0
+Center                          : CFSAN, FDA.
+================================================================================
+
+Workflow                        : nowayout
+
+Author                          : Kranti Konganti
+
+Version                         : 0.5.0
+
+
+Usage                           : cpipes --pipeline nowayout [options]
+
+
+Required                        :
+
+--input                         : Absolute path to directory containing FASTQ
+                                  files. The directory should contain only
+                                  FASTQ files as all the files within the
+                                  mentioned directory will be read. Ex: --
+                                  input /path/to/fastq_pass
+
+--output                        : Absolute path to directory where all the
+                                  pipeline outputs should be stored. Ex: --
+                                  output /path/to/output
+
+Other options                   :
+
+--metadata                      : Absolute path to metadata CSV file
+                                  containing five mandatory columns: sample,
+                                  fq1,fq2,strandedness,single_end. The fq1
+                                  and fq2 columns contain absolute paths to
+                                  the FASTQ files. This option can be used in
+                                  place of --input option. This is rare. Ex
+                                  : --metadata samplesheet.csv
+
+--fq_suffix                     : The suffix of FASTQ files (Unpaired reads
+                                  or R1 reads or Long reads) if an input
+                                  directory is mentioned via --input option.
+                                  Default: _R1_001.fastq.gz
+
+--fq2_suffix                    : The suffix of FASTQ files (Paired-end reads
+                                  or R2 reads) if an input directory is
+                                  mentioned via --input option. Default:
+                                  _R2_001.fastq.gz
+
+--fq_filter_by_len              : Remove FASTQ reads that are less than this
+                                  many bases. Default: 0
+
+--fq_strandedness               : The strandedness of the sequencing run.
+                                  This is mostly needed if your sequencing
+                                  run is RNA-SEQ. For most of the other runs
+                                  , it is probably safe to use unstranded for
+                                  the option. Default: unstranded
+
+--fq_single_end                 : SINGLE-END information will be auto-
+                                  detected but this option forces PAIRED-END
+                                  FASTQ files to be treated as SINGLE-END so
+                                  only read 1 information is included in auto
+                                  -generated samplesheet. Default: false
+
+--fq_filename_delim             : Delimiter by which the file name is split
+                                  to obtain sample name. Default: _
+
+--fq_filename_delim_idx         : After splitting FASTQ file name by using
+                                  the --fq_filename_delim option, all
+                                  elements before this index (1-based) will
+                                  be joined to create final sample name.
+                                  Default: 1
+
+--fastp_run                     : Run fastp tool. Default: true
+
+--fastp_failed_out              : Specify whether to store reads that cannot
+                                  pass the filters. Default: false
+
+--fastp_merged_out              : Specify whether to store merged output or
+                                  not. Default: false
+
+--fastp_overlapped_out          : For each read pair, output the overlapped
+                                  region if it has no mismatched base.
+                                  Default: false
+
+--fastp_6                       : Indicate that the input is using phred64
+                                  scoring (it'll be converted to phred33, so
+                                  the output will still be phred33). Default
+                                  : false
+
+--fastp_reads_to_process        : Specify how many reads/pairs are to be
+                                  processed. Default value 0 means process
+                                  all reads. Default: 0
+
+--fastp_fix_mgi_id              : The MGI FASTQ ID format is not compatible
+                                  with many BAM operation tools, enable this
+                                  option to fix it. Default: false
+
+--fastp_A                       : Disable adapter trimming. On by default.
+                                  Default: false
+
+--fastp_adapter_fasta           : Specify a FASTA file to trim both read1 and
+                                  read2 (if PE) by all the sequences in this
+                                  FASTA file. Default: false
+
+--fastp_f                       : Trim how many bases in front of read1.
+                                  Default: 0
+
+--fastp_t                       : Trim how many bases at the end of read1.
+                                  Default: 0
+
+--fastp_b                       : Max length of read1 after trimming. Default
+                                  : 0
+
+--fastp_F                       : Trim how many bases in front of read2.
+                                  Default: 0
+
+--fastp_T                       : Trim how many bases at the end of read2.
+                                  Default: 0
+
+--fastp_B                       : Max length of read2 after trimming. Default
+                                  : 0
+
+--fastp_dedup                   : Enable deduplication to drop the duplicated
+                                  reads/pairs. Default: true
+
+--fastp_dup_calc_accuracy       : Accuracy level to calculate duplication (1~
+                                  6), higher level uses more memory (1G, 2G,
+                                  4G, 8G, 16G, 24G). Default 1 for no-dedup
+                                  mode, and 3 for dedup mode. Default: 6
+
+--fastp_poly_g_min_len          : The minimum length to detect polyG in the
+                                  read tail. Default: 10
+
+--fastp_G                       : Disable polyG tail trimming. Default: true
+
+--fastp_x                       : Enable polyX trimming in 3' ends. Default:
+                                  false
+
+--fastp_poly_x_min_len          : The minimum length to detect polyX in the
+                                  read tail. Default: 10
+
+--fastp_cut_front               : Move a sliding window from front (5') to
+                                  tail, drop the bases in the window if its
+                                  mean quality < threshold, stop otherwise.
+                                  Default: true
+
+--fastp_cut_tail                : Move a sliding window from tail (3') to
+                                  front, drop the bases in the window if its
+                                  mean quality < threshold, stop otherwise.
+                                  Default: false
+
+--fastp_cut_right               : Move a sliding window from tail, drop the
+                                  bases in the window and the right part if
+                                  its mean quality < threshold, and then stop
+                                  . Default: true
+
+--fastp_W                       : Sliding window size shared by --
+                                  fastp_cut_front, --fastp_cut_tail and --
+                                  fastp_cut_right. Default: 20
+
+--fastp_M                       : The mean quality requirement shared by --
+                                  fastp_cut_front, --fastp_cut_tail and --
+                                  fastp_cut_right. Default: 30
+
+--fastp_q                       : The quality value below which a base is
+                                  not qualified. Default: 30
+
+--fastp_u                       : What percent of bases are allowed to be
+                                  unqualified. Default: 40
+
+--fastp_n                       : How many N's can a read have. Default: 5
+
+--fastp_e                       : If the full reads' average quality is below
+                                  this value, then it is discarded. Default
+                                  : 0
+
+--fastp_l                       : Reads shorter than this length will be
+                                  discarded. Default: 35
+
+--fastp_max_len                 : Reads longer than this length will be
+                                  discarded. Default: 0
+
+--fastp_y                       : Enable low complexity filter. The
+                                  complexity is defined as the percentage of
+                                  bases that are different from its next base
+                                  (base[i] != base[i+1]). Default: true
+
+--fastp_Y                       : The threshold for low complexity filter (0~
+                                  100). Ex: A value of 30 means 30%
+                                  complexity is required. Default: 30
+
+--fastp_U                       : Enable Unique Molecular Identifier (UMI)
+                                  pre-processing. Default: false
+
+--fastp_umi_loc                 : Specify the location of UMI, can be one of
+                                  index1/index2/read1/read2/per_index/
+                                  per_read. Default: false
+
+--fastp_umi_len                 : If the UMI is in read1 or read2, its length
+                                  should be provided. Default: false
+
+--fastp_umi_prefix              : If specified, an underline will be used to
+                                  connect prefix and UMI (i.e. prefix=UMI,
+                                  UMI=AATTCG, final=UMI_AATTCG). Default:
+                                  false
+
+--fastp_umi_skip                : If the UMI is in read1 or read2, fastp can
+                                  skip several bases following the UMI.
+                                  Default: false
+
+--fastp_p                       : Enable overrepresented sequence analysis.
+                                  Default: true
+
+--fastp_P                       : One in this many number of reads will be
+                                  computed for overrepresentation analysis (1
+                                  ~10000), smaller is slower. Default: 20
+
+--kmaalign_run                  : Run kma tool. Default: true
+
+--kmaalign_int                  : Input file has interleaved reads.  Default
+                                  : false
+
+--kmaalign_ef                   : Output additional features. Default: false
+
+--kmaalign_vcf                  : Output vcf file. 2 to apply FT. Default:
+                                  false
+
+--kmaalign_sam                  : Output SAM, 4/2096 for mapped/aligned.
+                                  Default: false
+
+--kmaalign_nc                   : No consensus file. Default: true
+
+--kmaalign_na                   : No aln file. Default: true
+
+--kmaalign_nf                   : No frag file. Default: false
+
+--kmaalign_a                    : Output all template mappings. Default:
+                                  false
+
+--kmaalign_and                  : Use both -mrs and p-value on consensus.
+                                  Default: true
+
+--kmaalign_oa                   : Use neither -mrs or p-value on consensus.
+                                  Default: false
+
+--kmaalign_bc                   : Minimum support to call bases. Default:
+                                  false
+
+--kmaalign_bcNano               : Altered indel calling for ONT data. Default
+                                  : false
+
+--kmaalign_bcd                  : Minimum depth to call bases. Default: false
+
+--kmaalign_bcg                  : Maintain insignificant gaps. Default: false
+
+--kmaalign_ID                   : Minimum consensus ID. Default: 85.0
+
+--kmaalign_md                   : Minimum depth. Default: false
+
+--kmaalign_dense                : Skip insertion in consensus. Default: false
+
+--kmaalign_ref_fsa              : Use Ns on indels. Default: false
+
+--kmaalign_Mt1                  : Map everything to one template. Default:
+                                  false
+
+--kmaalign_1t1                  : Map one query to one template. Default:
+                                  false
+
+--kmaalign_mrs                  : Minimum relative alignment score. Default:
+                                  0.99
+
+--kmaalign_mrc                  : Minimum query coverage. Default: 0.99
+
+--kmaalign_mp                   : Minimum phred score of trailing and leading
+                                  bases. Default: 30
+
+--kmaalign_mq                   : Set the minimum mapping quality. Default:
+                                  false
+
+--kmaalign_eq                   : Minimum average quality score. Default: 30
+
+--kmaalign_5p                   : Trim 5 prime by this many bases. Default:
+                                  false
+
+--kmaalign_3p                   : Trim 3 prime by this many bases. Default:
+                                  false
+
+--kmaalign_apm                  : Sets both -pm and -fpm. Default: false
+
+--kmaalign_cge                  : Set CGE penalties and rewards. Default:
+                                  false
+
+--seqkit_grep_run               : Run the seqkit `grep` tool. Default: true
+
+--seqkit_grep_n                 : Match by full name instead of just ID.
+                                  Default: undefined
+
+--seqkit_grep_s                 : Search subseq on seq, both positive and
+                                  negative strand are searched, and mismatch
+                                  allowed using flag --seqkit_grep_m. Default
+                                  : undefined
+
+--seqkit_grep_c                 : Input is circular genome. Default: undefined
+
+--seqkit_grep_C                 : Just print a count of matching records.
+                                  With the --seqkit_grep_v flag, count non-
+                                  matching records. Default: undefined
+
+--seqkit_grep_i                 : Ignore case while using seqkit grep.
+                                  Default: undefined
+
+--seqkit_grep_v                 : Invert the match i.e. select non-matching
+                                  records. Default: undefined
+
+--seqkit_grep_m                 : Maximum mismatches when matching by
+                                  sequence. Default: undefined
+
+--seqkit_grep_r                 : Input patterns are regular expressions.
+                                  Default: undefined
+
+--salmonidx_run                 : Run `salmon index` tool. Default: true
+
+--salmonidx_k                   : The size of k-mers that should be used for
+                                  the  quasi index. Default: false
+
+--salmonidx_gencode             : This flag will expect the input transcript
+                                  FASTA to be in GENCODE format, and will
+                                  split the transcript name at the first `|`
+                                  character. These reduced names will be used
+                                  in the output and when looking for these
+                                  transcripts in a gene to transcript GTF.
+                                  Default: false
+
+--salmonidx_features            : This flag will expect the input reference
+                                  to be in the tsv file format, and will
+                                  split the feature name at the first `tab`
+                                  character. These reduced names will be used
+                                  in the output and when looking for the
+                                  sequence of the features. GTF. Default:
+                                  false
+
+--salmonidx_keepDuplicates      : This flag will disable the default indexing
+                                  behavior of discarding sequence-identical
+                                  duplicate transcripts. If this flag is
+                                  passed then duplicate transcripts that
+                                  appear in the input will be retained and
+                                  quantified separately. Default: true
+
+--salmonidx_keepFixedFasta      : Retain the fixed fasta file (without short
+                                  transcripts and duplicates, clipped, etc.)
+                                  generated during indexing. Default: false
+
+--salmonidx_filterSize          : The size of the Bloom filter that will be
+                                  used by TwoPaCo during indexing. The filter
+                                  will be of size 2^{filterSize}. A value of
+                                  -1 means that the filter size will be
+                                  automatically set based on the number of
+                                  distinct k-mers in the input, as estimated
+                                  by nthll. Default: false
+
+--salmonidx_sparse              : Build the index using a sparse sampling of
+                                  k-mer positions. This will require less
+                                  memory (especially during quantification),
+                                  but will take longer to construct and can
+                                  slow down mapping / alignment. Default:
+                                  false
+
+--salmonidx_n                   : Do not clip poly-A tails from the ends of
+                                  target sequences. Default: true
+
+--sourmashsketch_run            : Run `sourmash sketch dna` tool. Default:
+                                  true
+
+--sourmashsketch_mode           : Select which type of signatures to be
+                                  created: dna, protein, fromfile or
+                                  translate. Default: dna
+
+--sourmashsketch_p              : Signature parameters to use. Default: '
+                                  abund,scaled=100,k=71
+
+--sourmashsketch_file           : <path>  A text file containing a list of
+                                  sequence files to load. Default: false
+
+--sourmashsketch_f              : Recompute signatures even if the file
+                                  exists. Default: false
+
+--sourmashsketch_name           : Name the signature generated from each file
+                                  after the first record in the file.
+                                  Default: false
+
+--sourmashsketch_randomize      : Shuffle the list of input files randomly.
+                                  Default: false
+
+--sourmashgather_run            : Run `sourmash gather` tool. Default: true
+
+--sourmashgather_n              : Number of results to report. By default,
+                                  will terminate at --sourmashgather_thr_bp
+                                  value. Default: false
+
+--sourmashgather_thr_bp         : Reporting threshold (in bp) for estimated
+                                  overlap with remaining query. Default: 100
+
+--sourmashgather_ani_ci         : Output confidence intervals for ANI
+                                  estimates. Default: true
+
+--sourmashgather_k              : The k-mer size to select. Default: 71
+
+--sourmashgather_dna            : Choose DNA signature. Default: true
+
+--sourmashgather_rna            : Choose RNA signature. Default: false
+
+--sourmashgather_nuc            : Choose Nucleotide signature. Default: false
+
+--sourmashgather_scaled         : Scaled value should be between 100 and 1e6
+                                  . Default: false
+
+--sourmashgather_inc_pat        : Search only signatures that match this
+                                  pattern in name, filename, or md5. Default
+                                  : false
+
+--sourmashgather_exc_pat        : Search only signatures that do not match
+                                  this pattern in name, filename, or md5.
+                                  Default: false
+
+--sfhpy_run                     : Run the sourmash_filter_hits.py script.
+                                  Default: true
+
+--sfhpy_fcn                     : Column name by which filtering of rows
+                                  should be applied. Default: f_match
+
+--sfhpy_fcv                     : Remove genomes whose match with the query
+                                  FASTQ is less than this much. Default: 0.8
+
+--sfhpy_gt                      : Apply greater than or equal to condition
+                                  on numeric values of --sfhpy_fcn column.
+                                  Default: true
+
+--sfhpy_lt                      : Apply less than or equal to condition on
+                                  numeric values of --sfhpy_fcn column.
+                                  Default: false
+
+--sfhpy_all                     : Instead of just the column value, print
+                                  entire row. Default: true
+
+--gsalkronapy_run               : Run the `gen_salmon_tph_and_krona_tsv.py`
+                                  script. Default: true
+
+--gsalkronapy_sf                : Set the scaling factor by which TPM values
+                                  are scaled down. Default: 10000
+
+--gsalkronapy_smres_suffix      : Find the `sourmash gather` result files
+                                  ending in this suffix. Default: false
+
+--gsalkronapy_failed_suffix     : Find the sample names which failed
+                                  classification stored inside the files
+                                  ending in this suffix. Default: false
+
+--gsalkronapy_num_lin_cols      : Number of columns expected in the lineages
+                                  CSV file.  Default: false
+
+--gsalkronapy_lin_regex         : Number of columns expected in the lineages
+                                  CSV file.  Default: false
+
+--krona_ktIT_run                : Run the ktImportText (ktIT) from krona.
+                                  Default: true
+
+--krona_ktIT_n                  : Name of the highest level. Default: all
+
+--krona_ktIT_q                  : Input file(s) do not have a field for
+                                  quantity. Default: false
+
+--krona_ktIT_c                  : Combine data from each file, rather than
+                                  creating separate datasets within the chart
+                                  . Default: false
+
+Help options                    :
+
+--help                          : Display this message.
+```
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/subworkflows/process_fastq.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,144 @@
+// Include any necessary methods and modules
+include { stopNow; validateParamsForFASTQ } from "${params.routines}"
+include { GEN_SAMPLESHEET                 } from "${params.modules}${params.fs}gen_samplesheet${params.fs}main"
+include { SAMPLESHEET_CHECK               } from "${params.modules}${params.fs}samplesheet_check${params.fs}main"
+include { CAT_FASTQ                       } from "${params.modules}${params.fs}cat${params.fs}fastq${params.fs}main"
+include { SEQKIT_SEQ                      } from "${params.modules}${params.fs}seqkit${params.fs}seq${params.fs}main"
+
+// Validate 4 required workflow parameters if
+// FASTQ files are the input for the
+// entry point.
+validateParamsForFASTQ()
+
+// Start the subworkflow
+workflow PROCESS_FASTQ {
+    main:
+        versions = Channel.empty()
+        input_ch = Channel.empty()
+        reads = Channel.empty()
+
+        def input = file( (params.input ?: params.metadata) )
+
+        if (params.input) {
+            def fastq_files = []
+
+            if (params.fq_suffix == null) {
+                stopNow("We need to know what suffix the FASTQ files end with inside the\n" +
+                    "directory. Please use the --fq_suffix option to indicate the file\n" +
+                    "suffix by which the files are to be collected to run the pipeline on.")
+            }
+
+            if (params.fq_strandedness == null) {
+                stopNow("We need to know if the FASTQ files inside the directory\n" +
+                    "are sequenced using stranded or non-stranded sequencing. This is generally\n" +
+                    "required if the sequencing experiment is RNA-SEQ. For almost all of the other\n" +
+                    "cases, you can probably use the --fq_strandedness unstranded option to indicate\n" +
+                    "that the reads are unstranded.")
+            }
+
+            if (params.fq_filename_delim == null || params.fq_filename_delim_idx == null) {
+                stopNow("We need to know the delimiter of the filename of the FASTQ files.\n" +
+                    "By default the filename delimiter is _ (underscore). This delimiter character\n" +
+                    "is used to split and assign a group name. The group name can be controlled by\n" +
+                    "using the --fq_filename_delim_idx option (1-based). For example, if the FASTQ\n" +
+                    "filename is WT_REP1_001.fastq, then to create a group WT, use the following\n" +
+                    "options: --fq_filename_delim _ --fq_filename_delim_idx 1")
+            }
+
+            if (!input.exists()) {
+                stopNow("The input directory,\n${params.input}\ndoes not exist!")
+            }
+
+            input.eachFileRecurse {
+                it.name.endsWith("${params.fq_suffix}") ? fastq_files << it : fastq_files << null
+            }
+
+            if (fastq_files.findAll{ it != null }.size() == 0) {
+                stopNow("The input directory,\n${params.input}\nis empty or does not " +
+                    "have FASTQ files ending with the suffix: ${params.fq_suffix}")
+            }
+
+            GEN_SAMPLESHEET( Channel.fromPath(params.input, type: 'dir') )
+            GEN_SAMPLESHEET.out.csv.set{ input_ch }
+            versions.mix( GEN_SAMPLESHEET.out.versions )
+                .set { versions }
+        } else if (params.metadata) {
+            if (!input.exists()) {
+                stopNow("The metadata CSV file,\n${params.metadata}\ndoes not exist!")
+            }
+
+            if (input.size() <= 0) {
+                stopNow("The metadata CSV file,\n${params.metadata}\nis empty!")
+            }
+
+            Channel.fromPath(params.metadata, type: 'file')
+                .set { input_ch }
+        }
+
+        SAMPLESHEET_CHECK( input_ch )
+            .csv
+            .splitCsv( header: true, sep: ',')
+            .map { create_fastq_channel(it) }
+            .groupTuple(by: [0])
+            .branch {
+                meta, fastq ->
+                    single   : fastq.size() == 1
+                        return [ meta, fastq.flatten() ]
+                    multiple : fastq.size() > 1
+                        return [ meta, fastq.flatten() ]
+            }
+            .set { reads }
+
+        CAT_FASTQ( reads.multiple )
+            .catted_reads
+            .mix( reads.single )
+            .set { processed_reads }
+
+        if (params.fq_filter_by_len.toInteger() > 0) {
+            SEQKIT_SEQ( processed_reads )
+                .fastx
+                .set { processed_reads }
+
+            versions.mix( SEQKIT_SEQ.out.versions.first().ifEmpty(null) )
+                .set { versions }
+        }
+
+        versions.mix(
+            SAMPLESHEET_CHECK.out.versions,
+            CAT_FASTQ.out.versions.first().ifEmpty(null)
+        )
+        .set { versions }
+
+    emit:
+        processed_reads
+        versions
+}
+
+// Function to get list of [ meta, [ fq1, fq2 ] ]
+def create_fastq_channel(LinkedHashMap row) {
+
+    def meta = [:]
+    meta.id           = row.sample
+    meta.single_end   = row.single_end.toBoolean()
+    meta.strandedness = row.strandedness
+    meta.id = meta.id.split(params.fq_filename_delim)[0..params.fq_filename_delim_idx.toInteger() - 1]
+        .join(params.fq_filename_delim)
+    meta.id = (meta.id =~ /\./ ? meta.id.take(meta.id.indexOf('.')) : meta.id)
+
+    def array = []
+
+    if (!file(row.fq1).exists()) {
+        stopNow("Please check input metadata CSV. The following Read 1 FASTQ file does not exist!" +
+            "\n${row.fq1}")
+    }
+    if (meta.single_end) {
+        array = [ meta, [ file(row.fq1) ] ]
+    } else {
+        if (!file(row.fq2).exists()) {
+            stopNow("Please check input metadata CSV. The following Read 2 FASTQ file does not exist!" +
+                "\n${row.fq2}")
+        }
+        array = [ meta, [ file(row.fq1), file(row.fq2) ] ]
+    }
+    return array
+}
\ No newline at end of file
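Reviewer note on `create_fastq_channel` in `process_fastq.nf`: the sample name is derived by splitting on `--fq_filename_delim`, keeping the first `--fq_filename_delim_idx` elements, and then truncating at the first `.`. The following is a minimal plain-Groovy sketch of that logic for experimenting outside the pipeline. The helper name `deriveSampleId` and its parameter names are hypothetical and not part of CPIPES. The defaults mirror the documented defaults (`_` and `1`).

```groovy
// Hypothetical standalone helper mirroring the sample-name logic in
// create_fastq_channel. Names here are illustrative, not CPIPES API.
def deriveSampleId(String sample, String delim = '_', int idx = 1) {
    // String.split treats `delim` as a regular expression, as in the original code.
    def parts = sample.split(delim)
    // Keep the first `idx` elements (1-based count) and re-join them.
    // An `idx` larger than parts.size() raises an exception, as in the original.
    def id = parts[0..(idx - 1)].join(delim)
    // Drop everything from the first '.' onward, if there is one.
    return id.contains('.') ? id.take(id.indexOf('.')) : id
}

// Usage, based on the WT_REP1_001.fastq example from the error message.
assert deriveSampleId('WT_REP1_001.fastq') == 'WT'
assert deriveSampleId('WT_REP1_001.fastq', '_', 2) == 'WT_REP1'
assert deriveSampleId('sampleA.fastq.gz') == 'sampleA'
```

Because the delimiter is passed to `split` as a regular expression, a delimiter such as `.` or `|` would presumably need escaping on the command line. That is an inference from Groovy's `String.split` behavior, not something the source states.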
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/subworkflows/prodka.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,45 @@
+// Include any necessary methods and modules
+include { PRODIGAL                } from "${params.modules}${params.fs}prodigal${params.fs}main"
+include { PROKKA                  } from "${params.modules}${params.fs}prokka${params.fs}main"
+
+// Start the subworkflow
+workflow PRODKA {
+    take:
+        trained_asm
+        predict_asm
+
+    main:
+        PRODIGAL(
+            trained_asm,
+            (params.prodigal_f ?: 'gbk')
+        )
+
+        PROKKA(
+            predict_asm
+                .join(PRODIGAL.out.proteins)
+                .join(PRODIGAL.out.trained)
+        )
+
+        PRODIGAL.out.versions
+            .mix( PROKKA.out.versions )
+            .set{ versions }
+    emit:
+        prodigal_gene_annots     = PRODIGAL.out.gene_annotations
+        prodigal_fna             = PRODIGAL.out.cds
+        prodigal_faa             = PRODIGAL.out.proteins
+        prodigal_all_gene_annots = PRODIGAL.out.all_gene_annotations
+        prodigal_trained         = PRODIGAL.out.trained
+        prokka_gff               = PROKKA.out.gff
+        prokka_gbk               = PROKKA.out.gbk
+        prokka_fna               = PROKKA.out.fna
+        prokka_sqn               = PROKKA.out.sqn
+        prokka_ffn               = PROKKA.out.ffn
+        prokka_fsa               = PROKKA.out.fsa
+        prokka_faa               = PROKKA.out.faa
+        prokka_tbl               = PROKKA.out.tbl
+        prokka_err               = PROKKA.out.err
+        prokka_log               = PROKKA.out.log
+        prokka_txt               = PROKKA.out.txt
+        prokka_tsv               = PROKKA.out.tsv
+        versions
+}
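Reviewer note on `PRODKA`: the chained `.join()` calls pair each assembly with its Prodigal outputs by the first tuple element, and by default Nextflow's `join` drops items that have no match. The following is a minimal plain-Groovy sketch of that key-based pairing, assuming every channel item is a `[key, file]` tuple. The real item shapes are defined in the PRODIGAL and PROKKA modules, which are not shown here. `joinByKey` and all sample values are hypothetical, and string keys stand in for the `meta` maps.

```groovy
// Hypothetical emulation of Nextflow's `join` (key at index 0, unmatched
// items dropped) using plain lists instead of channels.
def joinByKey(List left, List right) {
    // Index the right-hand items by key, keeping everything after the key.
    def rightByKey = right.collectEntries { [(it[0]): it.drop(1)] }
    // Keep only left items with a matching key and append the right-hand values.
    return left.findAll { rightByKey.containsKey(it[0]) }
               .collect { it + rightByKey[it[0]] }
}

// Assumed tuple shapes, for illustration only.
def predictAsm = [['s1', 's1.fna'], ['s2', 's2.fna']]
def proteins   = [['s1', 's1.faa'], ['s2', 's2.faa']]
def trained    = [['s2', 's2.trn'], ['s1', 's1.trn']]

def prokkaInput = joinByKey(joinByKey(predictAsm, proteins), trained)
assert prokkaInput == [['s1', 's1.fna', 's1.faa', 's1.trn'],
                       ['s2', 's2.fna', 's2.faa', 's2.trn']]
```

In the sketch the output follows the order of the left-hand list. Real channels make no such ordering guarantee, which is why matching by key rather than by position matters.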
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/workflows/conf/nowayout.config	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,180 @@
+params {
+    workflow_conceived_by = 'Kranti Konganti'
+    workflow_built_by = 'Kranti Konganti'
+    workflow_version = '0.5.0'
+    db_mode = 'mitomine2'
+    db_root = '/server/galaxy/tool-data/nowayout-db'
+    nowo_thresholds = 'strict'
+    fastp_run = true
+    fastp_failed_out = false
+    fastp_merged_out = false
+    fastp_overlapped_out = false
+    fastp_6 = false
+    fastp_reads_to_process = 0
+    fastp_fix_mgi_id = false
+    fastp_A = false
+    fastp_use_custom_adapters = false
+    fastp_adapter_fasta = (params.fastp_use_custom_adapters ? "${projectDir}"
+        + File.separator
+        + 'assets'
+        + File.separator
+        + 'adaptors.fa' : false)
+    fastp_f = 0
+    fastp_t = 0
+    fastp_b = 0
+    fastp_F = 0
+    fastp_T = 0
+    fastp_B = 0
+    fastp_dedup = true
+    fastp_dup_calc_accuracy = 6
+    fastp_poly_g_min_len = 10
+    fastp_G = true
+    fastp_x = false
+    fastp_poly_x_min_len = 10
+    fastp_cut_front = true
+    fastp_cut_tail = false
+    fastp_cut_right = true
+    fastp_W = 20
+    fastp_M = 30
+    fastp_q = 30
+    fastp_u = 40
+    fastp_n = 5
+    fastp_e = 0
+    fastp_l = 35
+    fastp_max_len = 0
+    fastp_y = true
+    fastp_Y = 30
+    fastp_U = false
+    fastp_umi_loc = false
+    fastp_umi_len = false
+    fastp_umi_prefix = false
+    fastp_umi_skip = false
+    fastp_p = true
+    fastp_P = 20
+    kmaalign_run = true
+    kmaalign_idx = ("${params.db_root}"
+        + File.separator
+        + "kma"
+        + File.separator
+        + "${params.db_mode}")
+    kmaalign_ignorequals = false
+    kmaalign_int = false
+    kmaalign_ef = false
+    kmaalign_vcf = false
+    kmaalign_sam = false
+    kmaalign_nc = true
+    kmaalign_na = true
+    kmaalign_nf = false
+    kmaalign_a = false
+    kmaalign_and = true
+    kmaalign_oa = false
+    kmaalign_bc = false
+    kmaalign_bcNano = false
+    kmaalign_bcd = false
+    kmaalign_bcg = false
+    kmaalign_ID = (params.nowo_thresholds =~ /strict|mild/ ? 85.0 : 50.0)
+    kmaalign_md = false
+    kmaalign_dense = false
+    kmaalign_ref_fsa = false
+    kmaalign_Mt1 = false
+    kmaalign_1t1 = false
+    kmaalign_mrs = (params.nowo_thresholds ==~ /strict/ ? 0.99 : 0.90)
+    kmaalign_mrc = (params.nowo_thresholds ==~ /strict/ ? 0.99 : 0.90)
+    kmaalign_mp = (params.nowo_thresholds ==~ /strict/ ? 30 : 20)
+    kmaalign_eq = (params.nowo_thresholds ==~ /strict/ ? 30 : 20)
+    kmaalign_mrs = (params.nowo_thresholds ==~ /mild/ ? 0.90 : params.kmaalign_mrs)
+    kmaalign_mrc = (params.nowo_thresholds ==~ /mild/ ? 0.90 : params.kmaalign_mrc)
+    kmaalign_mp = (params.nowo_thresholds ==~ /mild/ ? 20 : params.kmaalign_mp)
+    kmaalign_eq = (params.nowo_thresholds ==~ /mild/ ? 20 : params.kmaalign_eq)
+    kmaalign_mp = (params.kmaalign_ignorequals ? 0 : params.kmaalign_mp)
+    kmaalign_eq = (params.kmaalign_ignorequals ? 0 : params.kmaalign_eq)
+    kmaalign_mq = false
+    kmaalign_5p = false
+    kmaalign_3p = false
+    kmaalign_apm = false
+    kmaalign_cge = false
+    tuspy_gd = false
+    seqkit_grep_run = true
+    seqkit_grep_n = false
+    seqkit_grep_s = false
+    seqkit_grep_c = false
+    seqkit_grep_C = false
+    seqkit_grep_i = false
+    seqkit_grep_v = false
+    seqkit_grep_m = false
+    seqkit_grep_r = false
+    salmonidx_run = true
+    salmonidx_k = false
+    salmonidx_gencode = false
+    salmonidx_features = false
+    salmonidx_keepDuplicates = true
+    salmonidx_keepFixedFasta = false
+    salmonidx_filterSize = false
+    salmonidx_sparse = false
+    salmonidx_n = true
+    salmonidx_decoys = false
+    salmonalign_libtype = 'SF'
+    ref_fna = ("${params.db_root}"
+        + File.separator
+        + "reference"
+        + File.separator
+        + "${params.db_mode}"
+        + ".fna")
+    sourmash_k = (params.nowo_thresholds ==~ /strict/ ? 71 : 51)
+    sourmash_scale = (params.nowo_thresholds ==~ /strict/ ? 100 : 100)
+    sourmashsketch_run = true
+    sourmashsketch_mode = 'dna'
+    sourmashsketch_file = false
+    sourmashsketch_f = false
+    sourmashsketch_name = false
+    sourmashsketch_p = "'abund,scaled=${params.sourmash_scale},k=${params.sourmash_k}'"
+    sourmashsketch_randomize = false
+    sourmashgather_run = (params.sourmashsketch_run ?: false)
+    sourmashgather_n = false
+    sourmashgather_thr_bp = (params.nowo_thresholds ==~ /strict/ ? 100 : 100)
+    sourmashgather_ignoreabn = false
+    sourmashgather_prefetch = false
+    sourmashgather_noprefetch = false
+    sourmashgather_ani_ci = true
+    sourmashgather_k = "${params.sourmash_k}"
+    sourmashgather_protein = false
+    sourmashgather_rna = false
+    sourmashgather_nuc = false
+    sourmashgather_noprotein = false
+    sourmashgather_dayhoff = false
+    sourmashgather_nodayhoff = false
+    sourmashgather_hp = false
+    sourmashgather_nohp = false
+    sourmashgather_dna = true
+    sourmashgather_nodna = false
+    sourmashgather_scaled = false
+    sourmashgather_inc_pat = false
+    sourmashgather_exc_pat = false
+    sfhpy_run = true
+    sfhpy_fcn = 'f_match'
+    sfhpy_fcv = (params.nowo_thresholds ==~ /strict/ ? "0.8" : "0.5")
+    sfhpy_gt = true
+    sfhpy_lt = false
+    sfhpy_all = true
+    lineages_csv = ("${params.db_root}"
+        + File.separator
+        + "taxonomy"
+        + File.separator
+        + "${params.db_mode}"
+        + File.separator
+        + "lineages.csv")
+    gsalkronapy_run = true
+    gsalkronapy_sf = 10000
+    gsalkronapy_smres_suffix = false
+    gsalkronapy_failed_suffix = false
+    gsalkronapy_num_lin_cols = false
+    gsalkronapy_lin_regex = false
+    krona_ktIT_run = true
+    krona_ktIT_n = 'all'
+    krona_ktIT_q = false
+    krona_ktIT_c = false
+    krona_res_suffix = '.krona.tsv'
+    fq_filter_by_len = 0
+    fq_suffix = (params.fq_single_end ? '.fastq.gz' : '_R1_001.fastq.gz')
+    fq2_suffix = '_R2_001.fastq.gz'
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/workflows/conf/process/nowayout.process.config	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,121 @@
+process {
+    withName: 'SEQKIT_SEQ' {
+        ext.args = [
+            params.fq_filter_by_len ? "-m ${params.fq_filter_by_len}" : ''
+        ].join(' ').trim()
+    }
+
+    // withName: 'SAMTOOLS_FASTQ' {
+    //     ext.args = (params.fq_single_end ? '-F 4' : '-f 2')
+    // }
+
+    if (params.fastp_run) {
+        withName: 'FASTP' {
+            ext.args = addParamsToSummary(
+                loadThisFunction("${params.toolshelp}${params.fs}fastp.nf").fastpHelp(params).helpparams
+            )
+        }
+    }
+
+    if (params.kmaalign_run) {
+        withName: 'KMA_ALIGN' {
+            ext.args = addParamsToSummary(
+                loadThisFunction("${params.toolshelp}${params.fs}kmaalign.nf").kmaalignHelp(params).helpparams
+            )
+        }
+    }
+
+    if (params.seqkit_grep_run) {
+        withName: 'SEQKIT_GREP' {
+            ext.args = addParamsToSummary(
+                loadThisFunction("${params.toolshelp}${params.fs}seqkitgrep.nf").seqkitgrepHelp(params).helpparams
+            )
+        }
+    }
+
+    if (params.salmonidx_run){
+        withName: 'SALMON_INDEX' {
+            ext.args = addParamsToSummary(
+                loadThisFunction("${params.toolshelp}${params.fs}salmonidx.nf").salmonidxHelp(params).helpparams
+            )
+        }
+
+        withName: 'SALMON_QUANT' {
+            errorStrategy = 'ignore'
+            ext.args = '--minAssignedFrags 1'
+        }
+    }
+
+    if (params.sourmashsketch_run) {
+        withName: 'SOURMASH_SKETCH' {
+            ext.args = addParamsToSummary(
+                loadThisFunction("${params.toolshelp}${params.fs}sourmashsketch.nf").sourmashsketchHelp(params).helpparams
+            )
+        }
+    }
+
+    if (params.sourmashgather_run) {
+        withName: 'SOURMASH_GATHER' {
+            ext.args = addParamsToSummary(
+                loadThisFunction("${params.toolshelp}${params.fs}sourmashgather.nf").sourmashgatherHelp(params).helpparams
+            )
+
+            if (params.sfhpy_run) {
+                ext.args2 = addParamsToSummary(
+                    loadThisFunction("${params.toolshelp}${params.fs}sfhpy.nf").sfhpyHelp(params).helpparams
+                )
+            }
+        }
+    }
+
+    // if (params.sourmashtaxmetagenome_run) {
+    //     withName: 'SOURMASH_TAX_METAGENOME' {
+    //         ext.args = addParamsToSummary(
+    //             loadThisFunction("${params.toolshelp}${params.fs}sourmashtaxmetagenome.nf").sourmashtaxmetagenomeHelp(params).helpparams
+    //         )
+    //     }
+    // }
+
+    if (params.gsalkronapy_run) {
+        withName: 'NOWAYOUT_RESULTS' {
+            ext.args = addParamsToSummary(
+                loadThisFunction("${params.toolshelp}${params.fs}gsalkronapy.nf").gsalkronapyHelp(params).helpparams
+            )
+        }
+    }
+
+    if (params.krona_ktIT_run) {
+        withName: 'KRONA_KTIMPORTTEXT' {
+            ext.args = addParamsToSummary(
+                loadThisFunction("${params.toolshelp}${params.fs}kronaktimporttext.nf").kronaktimporttextHelp(params).helpparams
+            )
+        }
+    }
+}
+
+// Method to instantiate a new function parser
+// Need to refactor using ScriptParser... another day
+def loadThisFunction (func_file) {
+    GroovyShell grvy_sh = new GroovyShell()
+    def func = grvy_sh.parse(new File ( func_file ) )
+    return func
+}
+
+// Method to add relevant final parameters to summary log
+def addParamsToSummary(Map params_to_add = [:]) {
+
+    if (!params_to_add.isEmpty()) {
+        def not_null_params_to_add = params_to_add.findAll {
+            it.value.clivalue != null &&
+                it.value.clivalue != '[:]' &&
+                it.value.clivalue != ''
+        }
+
+        params.logtheseparams += not_null_params_to_add.keySet().toList()
+
+        return not_null_params_to_add.collect {
+            "${it.value.cliflag} ${it.value.clivalue.toString().replaceAll(/(?:^\s+|\s+$)/, '')}"
+        }.join(' ').trim()
+    }
+    return 1
+}
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/0.5.0/workflows/nowayout.nf	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,349 @@
+// Define any required imports for this specific workflow
+import java.nio.file.Paths
+import java.util.zip.GZIPInputStream
+import java.io.FileInputStream
+import nextflow.file.FileHelper
+
+
+// Include any necessary methods
+include { \
+    summaryOfParams; stopNow; fastqEntryPointHelp; sendMail; \
+    addPadding; wrapUpHelp           } from "${params.routines}"
+include { fastpHelp                  } from "${params.toolshelp}${params.fs}fastp"
+include { kmaalignHelp               } from "${params.toolshelp}${params.fs}kmaalign"
+include { seqkitgrepHelp             } from "${params.toolshelp}${params.fs}seqkitgrep"
+include { salmonidxHelp              } from "${params.toolshelp}${params.fs}salmonidx"
+include { sourmashsketchHelp         } from "${params.toolshelp}${params.fs}sourmashsketch"
+include { sourmashgatherHelp         } from "${params.toolshelp}${params.fs}sourmashgather"
+include { sfhpyHelp                  } from "${params.toolshelp}${params.fs}sfhpy"
+include { gsalkronapyHelp            } from "${params.toolshelp}${params.fs}gsalkronapy"
+include { kronaktimporttextHelp      } from "${params.toolshelp}${params.fs}kronaktimporttext"
+
+// Exit if help requested before any subworkflows
+if (params.help) {
+    log.info help()
+    exit 0
+}
+
+
+// Include any necessary modules and subworkflows
+include { PROCESS_FASTQ           } from "${params.subworkflows}${params.fs}process_fastq"
+include { FASTP                   } from "${params.modules}${params.fs}fastp${params.fs}main"
+include { KMA_ALIGN               } from "${params.modules}${params.fs}kma${params.fs}align${params.fs}main"
+include { OTF_GENOME              } from "${params.modules}${params.fs}otf_genome${params.fs}main"
+include { SEQKIT_GREP             } from "${params.modules}${params.fs}seqkit${params.fs}grep${params.fs}main"
+include { SALMON_INDEX            } from "${params.modules}${params.fs}salmon${params.fs}index${params.fs}main"
+include { SALMON_QUANT            } from "${params.modules}${params.fs}salmon${params.fs}quant${params.fs}main"
+include { SOURMASH_SKETCH         } from "${params.modules}${params.fs}sourmash${params.fs}sketch${params.fs}main"
+include { SOURMASH_SKETCH \
+    as REDUCE_DB_IDX              } from "${params.modules}${params.fs}sourmash${params.fs}sketch${params.fs}main"
+include { SOURMASH_GATHER         } from "${params.modules}${params.fs}sourmash${params.fs}gather${params.fs}main"
+include { NOWAYOUT_RESULTS        } from "${params.modules}${params.fs}nowayout_results${params.fs}main"
+include { KRONA_KTIMPORTTEXT      } from "${params.modules}${params.fs}krona${params.fs}ktimporttext${params.fs}main"
+include { DUMP_SOFTWARE_VERSIONS  } from "${params.modules}${params.fs}custom${params.fs}dump_software_versions${params.fs}main"
+include { MULTIQC                 } from "${params.modules}${params.fs}multiqc${params.fs}main"
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    INPUTS AND ANY CHECKS FOR THE NOWAYOUT WORKFLOW
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+def reads_platform = 0
+reads_platform += (params.input ? 1 : 0)
+
+if (reads_platform < 1) {
+    stopNow("Please provide the absolute path to an input folder that contains\n" +
+            "FASTQ files using the --input option.\n" +
+            "Ex: --input (Illumina or Generic short reads in FASTQ format)")
+}
+
+params.fastp_adapter_fasta ? checkMetadataExists(params.fastp_adapter_fasta, 'Adapter sequences FASTA') : null
+checkMetadataExists(params.lineages_csv, 'Lineages CSV')
+checkMetadataExists(params.kmaalign_idx, 'KMA Indices')
+checkMetadataExists(params.ref_fna, 'FASTA reference')
+
+ch_sourmash_lin = file( params.lineages_csv )
+
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    RUN THE NOWAYOUT WORKFLOW
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+workflow NOWAYOUT {
+    main:
+        log.info summaryOfParams()
+
+        PROCESS_FASTQ()
+
+        PROCESS_FASTQ.out.versions
+            .set { software_versions }
+
+        PROCESS_FASTQ.out.processed_reads
+            .set { ch_processed_reads }
+
+        ch_processed_reads
+            .map { meta, fastq ->
+                meta.get_kma_hit_accs = true
+                meta.salmon_decoys = params.dummyfile
+                meta.salmon_lib_type = (params.salmonalign_libtype ?: false)
+                meta.kma_t_db = params.kmaalign_idx
+                [ meta, fastq ]
+            }
+            .filter { meta, fastq ->
+                fq_file = ( fastq.getClass().toString() =~ /ArrayList/ ? fastq : [ fastq ] )
+                fq_gzip = new GZIPInputStream( new FileInputStream( fq_file[0].toAbsolutePath().toString() ) )
+                fq_gzip.read() != -1
+            }
+            .set { ch_processed_reads }
+
+        FASTP( ch_processed_reads )
+
+        FASTP.out.json
+            .map { meta, json ->
+                json
+            }
+            .collect()
+            .set { ch_multiqc }
+
+        KMA_ALIGN(
+            FASTP.out.passed_reads
+                .map { meta, fastq ->
+                    [meta, fastq, []]
+                }
+        )
+
+        OTF_GENOME(
+            KMA_ALIGN.out.hits
+                .join(KMA_ALIGN.out.frags)
+        )
+
+        OTF_GENOME.out.reads_extracted
+            .filter { meta, fasta ->
+                fa_file = ( fasta.getClass().toString() =~ /ArrayList/ ? fasta : [ fasta ] )
+                fa_gzip = new GZIPInputStream( new FileInputStream( fa_file[0].toAbsolutePath().toString() ) )
+                fa_gzip.read() != -1
+            }
+            .set { ch_mito_aln_reads }
+
+        SEQKIT_GREP(
+            KMA_ALIGN.out.hits
+                .filter { meta, mapped_refs ->
+                    patterns = file( mapped_refs )
+                    patterns.size() > 0
+                }
+                .map { meta, mapped_refs ->
+                    [meta, params.ref_fna, mapped_refs]
+                }
+        )
+
+        SALMON_INDEX( SEQKIT_GREP.out.fastx )
+
+        SALMON_QUANT(
+            ch_mito_aln_reads
+                .join( SALMON_INDEX.out.idx )
+        )
+
+        REDUCE_DB_IDX(
+            SEQKIT_GREP.out.fastx,
+            true,
+            false,
+            'db'
+        )
+
+        SOURMASH_SKETCH(
+            ch_mito_aln_reads,
+            false,
+            false,
+            'query'
+        )
+
+        SOURMASH_GATHER(
+            SOURMASH_SKETCH.out.signatures
+                .join( REDUCE_DB_IDX.out.signatures ),
+                [], [], [], []
+        )
+
+        // SOURMASH_TAX_METAGENOME(
+        //     SOURMASH_GATHER.out.result
+        //         .groupTuple(by: [0])
+        //         .map { meta, csv ->
+        //             [ meta, csv, ch_sourmash_lin ]
+        //         }
+        // )
+
+        // SOURMASH_TAX_METAGENOME.out.csv
+        //     .map { meta, csv ->
+        //         csv
+        //     }
+        //     .set { ch_lin_csv }
+
+        // SOURMASH_TAX_METAGENOME.out.tsv
+        //     .tap { ch_lin_krona }
+        //     .map { meta, tsv ->
+        //         tsv
+        //     }
+        //     .tap { ch_lin_tsv }
+
+        SOURMASH_GATHER.out.result
+            .groupTuple(by: [0])
+            .map { meta, csv ->
+                [ csv ]
+            }
+            .concat(
+                SALMON_QUANT.out.results
+                    .map { meta, salmon_res ->
+                        [ salmon_res ]
+                    }
+            )
+            .concat(
+                SOURMASH_GATHER.out.failed
+                    .map { meta, failed ->
+                        [ failed ]
+                    }
+            )
+            .concat( OTF_GENOME.out.failed )
+            .collect()
+            .flatten()
+            .collect()
+            .set { ch_gene_abn }
+
+        NOWAYOUT_RESULTS( ch_gene_abn, ch_sourmash_lin )
+
+        NOWAYOUT_RESULTS.out.tsv
+            .flatten()
+            .filter { tsv -> tsv.toString() =~ /.*${params.krona_res_suffix}$/ }
+            .map { tsv ->
+                    meta = [:]
+                    meta.id = "${params.cfsanpipename}_${params.pipeline}_krona"
+                    [ meta, tsv ]
+            }
+            .groupTuple(by: [0])
+            .set { ch_lin_krona }
+
+        // ch_lin_tsv
+        //     .mix( ch_lin_csv )
+        //     .collect()
+        //     .set { ch_lin_summary }
+
+        // SOURMASH_TAX_METAGENOME.out.txt
+        //     .map { meta, txt ->
+        //         txt
+        //     }
+        //     .collect()
+        //     .set { ch_lin_kreport }
+
+        // NOWAYOUT_RESULTS(
+        //     ch_lin_summary
+        //         .concat( SOURMASH_GATHER.out.failed )
+        //         .concat( OTF_GENOME.out.failed )
+        //         .collect()
+        // )
+
+        KRONA_KTIMPORTTEXT( ch_lin_krona )
+
+        DUMP_SOFTWARE_VERSIONS(
+            software_versions
+                .mix (
+                    FASTP.out.versions,
+                    KMA_ALIGN.out.versions,
+                    SEQKIT_GREP.out.versions,
+                    REDUCE_DB_IDX.out.versions,
+                    SOURMASH_SKETCH.out.versions,
+                    SOURMASH_GATHER.out.versions,
+                    SALMON_INDEX.out.versions,
+                    SALMON_QUANT.out.versions,
+                    NOWAYOUT_RESULTS.out.versions,
+                    KRONA_KTIMPORTTEXT.out.versions
+                )
+                .unique()
+                .collectFile(name: 'collected_versions.yml')
+        )
+
+        DUMP_SOFTWARE_VERSIONS.out.mqc_yml
+            .concat(
+                ch_multiqc,
+                NOWAYOUT_RESULTS.out.mqc_yml
+            )
+            .collect()
+            .flatten()
+            .collect()
+            .set { ch_multiqc }
+
+        MULTIQC( ch_multiqc )
+}
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    ON COMPLETE, SHOW GORY DETAILS OF ALL PARAMS WHICH WILL BE HELPFUL TO DEBUG
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+workflow.onComplete {
+    if (workflow.success) {
+        sendMail()
+    }
+}
+
+workflow.onError {
+    sendMail()
+}
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    METHOD TO CHECK METADATA EXISTENCE
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+def checkMetadataExists(file_path, msg) {
+    file_path_obj = file( file_path )
+
+    if (msg.toString().find(/(?i)KMA/)) {
+        if (!file_path_obj.parent.exists()) {
+            stopNow("Please check if your ${msg}\n" +
+                "[ ${file_path} ]\nexists and that the files are not of size 0.")
+        }
+
+        // Check if db files within parent path are empty.
+        file_path_obj.parent.eachFileRecurse {
+            if (it.size() == 0) {
+                stopNow("For ${msg}, within\n" +
+                "[ ${file_path} ],\nthe following file is of size 0: ${it.name}")
+            }
+        }
+
+    }
+    else if (!file_path_obj.exists() || file_path_obj.size() == 0) {
+        stopNow("Please check if your ${msg} file\n" +
+            "[ ${file_path} ]\nexists and is not of size 0.")
+    }
+}
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    HELP TEXT METHODS FOR NOWAYOUT WORKFLOW
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+def help() {
+
+    Map helptext = [:]
+
+    helptext.putAll (
+        fastqEntryPointHelp() +
+        fastpHelp(params).text +
+        kmaalignHelp(params).text +
+        seqkitgrepHelp(params).text +
+        salmonidxHelp(params).text +
+        sourmashsketchHelp(params).text +
+        sourmashgatherHelp(params).text +
+        sfhpyHelp(params).text +
+        gsalkronapyHelp(params).text +
+        kronaktimporttextHelp(params).text +
+        wrapUpHelp()
+    )
+
+    return addPadding(helptext)
+}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/hfp_nowayout.xml	Fri May 29 13:37:56 2026 +0000
@@ -0,0 +1,201 @@
+<tool id="hfp_nowayout_awsbatch" name="nowayout" version="0.5.0+awsbatch">
+    <description>An automated workflow to identify Mitochondrial reads and classify Eukaryotes.</description>
+    <requirements>
+        <container type="docker">quay.io/galaxytrakr/mulled-v2-ebd88135862aa647eeae73d4d8e6ea8ec81245cd:v5.0</container>
+    </requirements>
+    <version_command>nextflow -version</version_command>
+    <command detect_errors="exit_code"><![CDATA[
+    export MAMBA_ROOT_PREFIX="/server/galaxy/data/nextflow-micromamba-cache";
+    export NXF_HOME=\$(pwd)"/.nextflow-home";
+	input_path=\$(pwd)"/cpipes-input";
+    mkdir -p "\${input_path}" || exit 1;
+    #import re
+    #if (str($input_read_type_cond.input_read_type) == "single_long"):
+	    #for _, $unpaired in enumerate($input_read_type_cond.input):
+            #set read1 = str($unpaired.name)
+            #if not str($unpaired.name).endswith(('.fastq', '.fastq.gz')):
+                #set read1_ext = re.sub('fastqsanger', 'fastq', str($unpaired.ext))
+                #set read1 = str($unpaired.name) + str('.') + $read1_ext
+            #end if
+            ln -sf '$unpaired' "\${input_path}/$read1";
+	    #end for
+    #elif (str($input_read_type_cond.input_read_type) == "paired"):
+	    #for _, $pair in enumerate($input_read_type_cond.input_pair)
+            #set read_R1 = re.sub('\:forward', '_forward', str($pair.forward.name))
+            #set read_R2 = re.sub('\:reverse', '_reverse', str($pair.reverse.name))
+            #set read_R1_ext = re.sub('fastqsanger', 'fastq', str($pair.forward.ext))
+            #set read_R2_ext = re.sub('fastqsanger', 'fastq', str($pair.reverse.ext))
+            #if not str($pair.forward.name).endswith(('.fastq', '.fastq.gz')):
+	            #set read_R1 = $read_R1 + str('.') + $read_R1_ext
+            #end if
+            #if not str($pair.reverse.name).endswith(('.fastq', '.fastq.gz')):
+                #set read_R2 = $read_R2 + str('.') + $read_R2_ext
+            #end if
+	        ln -sf '$pair.forward' "\${input_path}/$read_R1";
+	        ln -sf '$pair.reverse' "\${input_path}/$read_R2";
+	    #end for
+    #end if
+	$__tool_directory__/0.5.0/cpipes
+    --pipeline nowayout
+    --input \${input_path}
+	--output cpipes-output
+    --fq_suffix '${input_read_type_cond.fq_suffix}'
+    #if (str($input_read_type_cond.input_read_type) == "single_long"):
+        --fq_single_end true
+    #elif (str($input_read_type_cond.input_read_type) == "paired"):
+        --fq_single_end false --fq2_suffix '${input_read_type_cond.fq2_suffix}'
+    #end if
+    --db_mode $nowo_db_mode
+    --nowo_thresholds $nowo_thresholds
+	--fq_filename_delim '${fq_filename_delim}'
+	--fq_filename_delim_idx $fq_filename_delim_idx
+	-profile stdkondagac;
+    mv './cpipes-output/nowayout-multiqc/CPIPES-Report_multiqc_report.html' './multiqc_report.html' || exit 1;
+    if [ -e './cpipes-output/krona_ktimporttext/CPIPES_nowayout_krona.html' ]; then mv './cpipes-output/krona_ktimporttext/CPIPES_nowayout_krona.html' './CPIPES_nowayout_krona.html'; else echo '<html><h1>No mitochondrial reads detected in any of the samples</h1></html>' > './CPIPES_nowayout_krona.html'; fi;
+    rm -rf ./cpipes-output || exit 1;
+    rm -rf ./work || exit 1;
+    ]]></command>
+    <inputs>
+        <conditional name="input_read_type_cond">
+            <param name="input_read_type" type="select" label="Select the read collection type">
+                <option value="single_long" selected="true">Single-End short reads</option>
+                <option value="paired">Paired-End short reads</option>
+            </param>
+            <when value="single_long">
+                <param name="input" type="data_collection" collection_type="list" format="fastq,fastq.gz"
+                    label="Dataset list of unpaired short reads or long reads" />
+                <param name="fq_suffix" value=".fastq.gz" type="text" label="Suffix of the Single-End FASTQ"/>
+            </when>
+            <when value="paired">
+                <param name="input_pair" type="data_collection" collection_type="list:paired" format="fastq,fastq.gz" label="List of Dataset pairs" />
+                <param name="fq_suffix" value="_R1_001.fastq.gz" type="text" label="Suffix of the R1 FASTQ"
+                    help="For any data sets downloaded from NCBI into Galaxy, change this suffix to _forward.fastq.gz."/>
+                <param name="fq2_suffix" value="_R2_001.fastq.gz" type="text" label="Suffix of the R2 FASTQ"
+                    help="For any data sets downloaded from NCBI into Galaxy, change this suffix to _reverse.fastq.gz."/>
+            </when>
+        </conditional>
+        <param name="nowo_db_mode" type="select" label="Select the database with nowayout"
+            help="Please see below about different databases.">
+            <option value="mitomine2" selected="true">mitomine2</option>
+            <option value="mitomine">mitomine</option>
+            <option value="cytox1">cytox1</option>
+            <option value="voucher">voucher</option>
+            <option value="ganoderma">ganoderma</option>
+            <option value="listeria">listeria</option>
+        </param>
+        <param name="nowo_thresholds" type="select" label="Enter the type of base quality thresholds to be set with nowayout"
+            help="The default value sets the strictest thresholds, which tend to filter out most of the false positive hits.">
+            <option value="strict" selected="true">strict</option>
+            <option value="relax">relax</option>
+        </param>
+        <param name="fq_filename_delim" type="text" value="_" label="File name delimiter by which samples are grouped together (--fq_filename_delim)"
+            help="This is the delimiter by which samples are grouped together for display in the final MultiQC report. For example, if your input data sets are mango_replicate1.fastq.gz, mango_replicate2.fastq.gz, orange_replicate1_maryland.fastq.gz, orange_replicate2_maryland.fastq.gz, then to create 2 samples mango and orange, the value for --fq_filename_delim would be _ (underscore) and the value for --fq_filename_delim_idx would be 1, since you want to group by the first word (i.e. mango or orange) after splitting the filename based on _ (underscore)."/>
+        <param name="fq_filename_delim_idx" type="integer" value="1" label="File name delimiter index (--fq_filename_delim_idx)" />
+    </inputs>
+    <outputs>
+        <data name="krona_chart" format="html" label="nowayout: Krona Chart on ${on_string}" from_work_dir="CPIPES_nowayout_krona.html"/>
+        <data name="multiqc_report" format="html" label="nowayout: MultiQC Report on ${on_string}" from_work_dir="multiqc_report.html"/>
+    </outputs>
+    <tests>
+        <!--Test 01: long reads-->
+        <test expect_num_outputs="2">
+            <param name="input">
+                <collection type="list">
+                    <element name="FAL11127.fastq.gz" value="FAL11127.fastq.gz" />
+                    <element name="FAL11341.fastq.gz" value="FAL11341.fastq.gz" />
+                    <element name="FAL11342.fastq.gz" value="FAL11342.fastq.gz" />
+                </collection>
+            </param>
+            <param name="fq_suffix" value=".fastq.gz"/>
+            <output name="multiqc_report" file="multiqc_report.html" ftype="html" compare="sim_size"/>
+            <!-- <output name="assembled_mags" file="FAL11127.assembly_filtered.contigs.fasta" ftype="fasta" compare="sim_size"/> -->
+        </test>
+    </tests>
+    <help><![CDATA[
+
+.. class:: infomark
+
+**Purpose**
+
+nowayout is a mitochondrial metagenomics classifier for Eukaryotes.
+It uses a custom kma database to identify mitochondrial reads,
+classifies those reads, and then reinforces the classification
+using sourmash.
+
+It is written in Nextflow and is part of the modular data analysis pipelines (CFSAN PIPELINES or CPIPES for short) at HFP.
+
+
+----
+
+.. class:: infomark
+
+**Databases**
+
+    - *mitomine2*: Big database that works in almost all scenarios.
+    - *cytox1*: Collection of only non-redundant COXI genes from NCBI.
+    - *voucher*: Collection of only non-redundant voucher sequences from NCBI.
+    - *ganoderma*: Collection of only non-redundant mtDNA sequences of Ganoderma fungi.
+    - *listeria*: Collection of organelle sequences and other rRNA genes for Listeria.
+
+
+----
+
+.. class:: infomark
+
+**Testing and Validation**
+
+The CPIPES - nowayout Nextflow pipeline has been wrapped to make it work in Galaxy.
+It takes a list of either paired or unpaired short reads as input and generates a MultiQC report
+that contains relative abundances in the context of the number of mitochondrial reads identified. It also
+generates a Krona chart for each sample. The pipeline has been tested on multiple internal insect
+mixture samples. All the original testing and validation was done on the command line on the
+HFP Reedling HPC Cluster.
+
+
+----
+
+.. class:: infomark
+
+**Please note**
+
+    - *nowayout* only works on Illumina short reads (paired or unpaired).
+    - *nowayout* uses a custom kma database named *mitomine*.
+    - The custom database will be incrementally augmented and refined over time.
+    - *mitomine* stats:
+        - Contains ~ 2.93M non-redundant mitochondrial and voucher sequences.
+        - Represents ~ 717K unique species.
+    - Other databases are also available but will seldom be updated.
+
+----
+
+.. class:: infomark
+
+**Outputs**
+
+The main output file is a:
+
+    ::
+
+        - MultiQC Report: Contains a brief summary report including individual Mitochondrial reads identified
+                          per sample and relative abundances in context of the total number of Mitochondrial reads
+                          identified.
+
+                          Please note that due to MultiQC customizations, the preview (eye icon) will not
+                          work within Galaxy for the MultiQC report. Please download the file by clicking
+                          on the floppy icon and view it in your browser on your local desktop/workstation.
+                          You can export the tables and plots from the downloaded MultiQC report.
+
+  ]]></help>
+    <citations>
+        <citation type="bibtex">
+            @article{nowayout,
+            author = {Konganti, Kranti},
+            year = {2025},
+            month = {May},
+            title = {nowayout: An automated mitochondrial read classifier for Eukaryotes.},
+            journal = {Manuscript in preparation},
+            doi = {10.3389/xxxxxxxxxxxxxxxxxx},
+            url = {https://xxxxxxx/articles/10.3389/xxxxxxxxxxxx/full}}
+        </citation>
+    </citations>
+</tool>