view 0.3.0/modules/centrifuge/extract/README.md @ 92:295c2597a475

"planemo upload"
author kkonganti
date Tue, 19 Jul 2022 10:07:24 -0400
# NextFlow DSL2 Module

```bash
CENTRIFUGE_EXTRACT
```

## Description

Extract FASTQ reads given a FASTQ file originally used with the `centrifuge` tool and a taxon of interest. This specific module uses only GNU Coreutils to create the list of FASTQ read ids that need to be extracted. See also the `CENTRIFUGE_PROCESS` module, which uses a `python` script to generate the FASTQ read ids.
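
As a rough illustration of the idea (not the module's actual script), the sketch below collects the tax ids for a taxon from the report file and then pulls the matching read ids from the `centrifuge` output file. It assumes the default tab-separated layouts with a header line, where the report columns start with `name` and `taxID` and the output columns start with `readID`, `seqID` and `taxID`. The function name `extract_read_ids` and the name-matching rule are hypothetical, and `grep` is used alongside the coreutils tools for brevity.

```bash
# Hypothetical helper: print the unique read ids assigned to a taxon.
#   $1 = taxon name (fixed string, matched case-insensitively against the report)
#   $2 = centrifuge report file (from --report-file)
#   $3 = centrifuge output file (from -S)
extract_read_ids() {
    taxon="$1"; report="$2"; output="$3"
    tab=$(printf '\t')
    tmpdir=$(mktemp -d)

    # Report: skip the header, keep name + taxID, select rows matching the
    # taxon, and reduce them to a sorted list of unique tax ids.
    tail -n +2 "$report" | cut -f1,2 | grep -i -F -- "$taxon" \
        | cut -f2 | LC_ALL=C sort -u > "$tmpdir/taxids"

    # Output: skip the header and sort by the taxID column (3rd field),
    # because join needs both inputs sorted on the join field.
    tail -n +2 "$output" | LC_ALL=C sort -t "$tab" -k3,3 > "$tmpdir/hits"

    # Join tax ids against the hits and print only the read id column.
    # A read can appear on several lines, so de-duplicate at the end.
    LC_ALL=C join -t "$tab" -1 1 -2 3 -o 2.1 "$tmpdir/taxids" "$tmpdir/hits" \
        | LC_ALL=C sort -u

    rm -rf "$tmpdir"
}
```

For example, `extract_read_ids 'Salmonella' report.txt output.txt > read-ids.txt` would write one read id per line (the file names here are placeholders).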

\
 

### `input:`

___

Type: `tuple`

Takes in the following 2 tuples:

- A tuple of metadata (`meta`) and a `path` to the `centrifuge` output file (`centrifuge_output`) per sample (`id:`).

- A tuple of metadata (`meta`) and a `path` to the `centrifuge` report file (`centrifuge_report`) per sample (`id:`).

Ex:

```groovy
[
    [
        id: 'FAL00870',
        strandedness: 'unstranded',
        single_end: true,
        centrifuge_x: '/hpc/db/centrifuge/2022-04-12/ab'
    ],
    '/hpc/scratch/test/FAL000870/f1.merged.cent_out.output.txt'
]

[
    [
        id: 'FAL00870',
        strandedness: 'unstranded',
        single_end: true,
        centrifuge_x: '/hpc/db/centrifuge/2022-04-12/ab'
    ],
    '/hpc/scratch/test/FAL000870/f1.merged.cent_out.report.txt'
]
```

\
 

#### `meta`

Type: Groovy Map

A Groovy Map containing the metadata about the FASTQ file.

Ex:

```groovy
[ 
    id: 'FAL00870',
    strandedness: 'unstranded',
    single_end: true,
    centrifuge_x: '/hpc/db/centrifuge/2022-04-12/ab'
]
```

\
 

#### `centrifuge_report`

Type: `path`

NextFlow input type of `path` pointing to `centrifuge` report file generated using `--report-file` option of `centrifuge` tool.

\
 

#### `centrifuge_output`

Type: `path`

NextFlow input type of `path` pointing to `centrifuge` output file generated using `-S` option of `centrifuge` tool.

\
 

### `output:`

___

Type: `tuple`

Outputs a tuple of metadata (`meta` from `input:`) and a file listing the extracted FASTQ read ids.

\
 

#### `extracted`

Type: `path`

NextFlow output type of `path` pointing to the file of extracted FASTQ read ids belonging to a particular taxon (`*.extract-centrifuge-bug-ids.txt`).
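
Downstream, a read id list like this can be used to pull the matching records out of the original FASTQ file. The following is a minimal sketch, assuming an uncompressed FASTQ with exactly four lines per record and read ids that equal the first whitespace-delimited token of each header line (without the leading `@`). The function name `filter_fastq_by_ids` is hypothetical, and the sketch uses `awk`, so it does not describe how this module or pipeline performs the extraction.

```bash
# Hypothetical helper: print the FASTQ records whose read id is listed.
#   $1 = ids file (one read id per line)
#   $2 = uncompressed FASTQ file with 4 lines per record
filter_fastq_by_ids() {
    ids="$1"; fastq="$2"
    awk '
        # First file: remember every listed read id.
        NR == FNR { keep[$1]; next }
        # Second file: on each header line (1st of every 4 lines), strip the
        # leading "@" and decide whether this record should be printed.
        FNR % 4 == 1 { id = substr($1, 2); printing = (id in keep) }
        # Print all four lines of a selected record.
        printing
    ' "$ids" "$fastq"
}
```

Deciding on the header line only, by position, avoids misreading quality lines that happen to start with `@`.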

\
 

#### `versions`

Type: `path`

NextFlow output type of `path` pointing to the `.yml` file storing software versions for this process.