# HG changeset patch
# User galaxytrakr
# Date 1774310136 0
# Node ID d7f68b3cde397511aaf268db1bd3092ea6395a2b
# Parent 5ecb94ab82c39e90820e2287db09b39e20f3d536
planemo upload for repository https://github.com/CFSAN-Biostatistics/galaxytrakr-tools commit e9adf514c1b6b341c9e5bf8cc5a41c79b738d48e

diff -r 5ecb94ab82c3 -r d7f68b3cde39 aws_sra.xml
--- a/aws_sra.xml Mon Mar 23 23:34:21 2026 +0000
+++ b/aws_sra.xml Mon Mar 23 23:55:36 2026 +0000
@@ -1,5 +1,5 @@
-
- Fetches a single SRA run from AWS and converts it to FASTQ
+
+ Fetches one or more SRA runs from AWS S3 and converts them to FASTQ
 awscli
@@ -10,77 +10,74 @@
 fasterq-dump --version
 accessions.txt &&
- ## 1. Create temporary directories
- mkdir -p sra_cache fastq_out &&
+ ## Loop over each clean accession
+ for acc in $(cat accessions.txt);
+ do
+ echo "Processing accession: $acc" &&
- ## 2. Download the file from S3 using the discovered path format (no .sra)
- aws s3 cp --no-sign-request 's3://sra-pub-run-odp/sra/${acc}/${acc}' ./sra_cache/ &&
+ ## 1. Create unique directories for this accession
+ mkdir -p sra_cache_${acc} fastq_out_${acc} &&
+
+ ## 2. Download the file from S3 using aws s3 cp
+ aws s3 cp --no-sign-request "s3://sra-pub-run-odp/sra/${acc}/${acc}" ./sra_cache_${acc}/ &&
- ## 3. Convert with fasterq-dump, using the correct argument order
- fasterq-dump --outdir ./fastq_out --temp . --threads \${GALAXY_SLOTS:-4} --split-files ./sra_cache/${acc} &&
+ ## 3. Convert with fasterq-dump
+ fasterq-dump --outdir ./fastq_out_${acc} --temp . --threads \${GALAXY_SLOTS:-4} --split-files ./sra_cache_${acc}/${acc} &&
- ## 4. Compress with pigz
- pigz -p \${GALAXY_SLOTS:-4} ./fastq_out/*.fastq &&
+ ## 4. Compress with pigz
+ pigz -p \${GALAXY_SLOTS:-4} ./fastq_out_${acc}/*.fastq &&
- ## 5. Move the final outputs to their Galaxy dataset paths
- #if str($layout) == 'paired'
- mv ./fastq_out/${acc}_1.fastq.gz '$output_r1' &&
- mv ./fastq_out/${acc}_2.fastq.gz '$output_r2'
- #else
- # Be explicit about the single-end filename, removing the wildcard
- mv ./fastq_out/${acc}.fastq.gz '$output_r1'
- #end if
+ ## 5. Move outputs for collection discovery
+ #if str($layout) == 'paired'
+ mv ./fastq_out_${acc}/${acc}_1.fastq.gz '$output_r1.files_path/${acc}_1.fastq.gz' &&
+ mv ./fastq_out_${acc}/${acc}_2.fastq.gz '$output_r2.files_path/${acc}_2.fastq.gz'
+ #else
+ mv ./fastq_out_${acc}/${acc}.fastq.gz '$output_r1.files_path/${acc}.fastq.gz'
+ #end if &&
+
+ ## 6. Clean up
+ rm -rf sra_cache_${acc} fastq_out_${acc}
+ done
 ]]>
-
-
+
+
-
-
+
+
+
+
+
 layout == 'paired'
-
+
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+