diff CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/opt/bbmap-39.01-1/kmercoverage.sh @ 69:33d812a61356
planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author | jpayne |
---|---|
date | Tue, 18 Mar 2025 17:55:14 -0400 |
parents | |
children |
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/opt/bbmap-39.01-1/kmercoverage.sh	Tue Mar 18 17:55:14 2025 -0400
@@ -0,0 +1,102 @@
+#!/bin/bash
+
+usage(){
+echo "
+Written by Brian Bushnell
+Last modified May 23, 2014
+
+*** DEPRECATED: This should still work but is no longer maintained. ***
+
+Description:  Annotates reads with their kmer depth.
+
+Usage:        kmercoverage in=<input> out=<read output> hist=<histogram output>
+
+Input parameters:
+in2=null            Second input file for paired reads
+extra=null          Additional files to use for input (generating hash table) but not for output
+fastareadlen=2^31   Break up FASTA reads longer than this.  Can be useful when processing scaffolded genomes
+tablereads=-1       Use at most this many reads when building the hashtable (-1 means all)
+kmersample=1        Process every nth kmer, and skip the rest
+readsample=1        Process every nth read, and skip the rest
+
+Output parameters:
+hist=null           Specify a file to output the depth histogram
+histlen=10000       Max depth displayed on histogram
+reads=-1            Only process this number of reads, then quit (-1 means all)
+sampleoutput=t      Use sampling on output as well as input (not used if sample rates are 1)
+printcoverage=f     Only print coverage information instead of reads
+useheader=f         Append coverage info to the read's header
+minmedian=0         Don't output reads with median coverage below this
+minaverage=0        Don't output reads with average coverage below this
+zerobin=f           Set to true if you want kmers with a count of 0 to go in the 0 bin instead of the 1 bin in histograms.
+                    Default is false, to prevent confusion about how there can be 0-count kmers.
+                    The reason is that based on the 'minq' and 'minprob' settings, some kmers may be excluded from the bloom filter.
+
+Hashing parameters:
+k=31                Kmer length (values under 32 are most efficient, but arbitrarily high values are supported)
+cbits=8             Bits per cell in bloom filter; must be 2, 4, 8, 16, or 32.  Maximum kmer depth recorded is 2^cbits.
+                    Large values decrease accuracy for a fixed amount of memory.
+hashes=4            Number of times a kmer is hashed.  Higher is slower.
+                    Higher is MORE accurate if there is enough memory, and LESS accurate if there is not enough memory.
+prefilter=f         True is slower, but generally more accurate; filters out low-depth kmers from the main hashtable.
+prehashes=2         Number of hashes for prefilter.
+passes=1            More passes can sometimes increase accuracy by iteratively removing low-depth kmers
+minq=7              Ignore kmers containing bases with quality below this
+minprob=0.5         Ignore kmers with overall probability of correctness below this
+threads=X           Spawn exactly X hashing threads (default is number of logical processors).  Total active threads may exceed X by up to 4.
+
+Java Parameters:
+-Xmx                This will set Java's memory usage, overriding autodetection.
+                    -Xmx20g will specify 20 gigs of RAM, and -Xmx200m will specify 200 megs.
+                    The max is typically 85% of physical memory.
+-eoom               This flag will cause the process to exit if an
+                    out-of-memory exception occurs.  Requires Java 8u92+.
+-da                 Disable assertions.
+
+Please contact Brian Bushnell at bbushnell@lbl.gov if you encounter any problems.
+"
+}
+
+#This block allows symlinked shellscripts to correctly set classpath.
+pushd . > /dev/null
+DIR="${BASH_SOURCE[0]}"
+while [ -h "$DIR" ]; do
+	cd "$(dirname "$DIR")"
+	DIR="$(readlink "$(basename "$DIR")")"
+done
+cd "$(dirname "$DIR")"
+DIR="$(pwd)/"
+popd > /dev/null
+
+#DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )/"
+CP="$DIR""current/"
+
+z="-Xmx1g"
+z2="-Xms1g"
+set=0
+
+if [ -z "$1" ] || [[ $1 == -h ]] || [[ $1 == --help ]]; then
+	usage
+	exit
+fi
+
+calcXmx () {
+	source "$DIR""/calcmem.sh"
+	setEnvironment
+	parseXmx "$@"
+	if [[ $set == 1 ]]; then
+		return
+	fi
+	freeRam 3200m 84
+	z="-Xmx${RAM}m"
+	z2="-Xms${RAM}m"
+}
+calcXmx "$@"
+
+kmercoverage() {
+	local CMD="java $EA $EOOM $z -cp $CP jgi.KmerCoverage prefilter=true bits=16 interleaved=false $@"
+	echo $CMD >&2
+	eval $CMD
+}
+
+kmercoverage "$@"
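Note on the symlink-handling block: the script follows a chain of symlinks from `${BASH_SOURCE[0]}` to the real file so that `CP` points at the `current/` directory next to the actual script rather than next to the link. The sketch below isolates that loop in a function for illustration only. `resolve_dir` is a hypothetical name that does not exist in BBMap, and two details are assumptions that differ from the committed script: it runs in a subshell instead of using `pushd`/`popd`, and it prints the physical path with `pwd -P` rather than `pwd`.

```shell
#!/bin/bash
# Sketch only: resolve the directory of a possibly symlinked script.
# resolve_dir is a hypothetical helper, not part of BBMap or kmercoverage.sh.
resolve_dir() {
    local target="$1"
    (
        # The subshell keeps the caller's working directory unchanged,
        # which is what pushd/popd achieve in the original script.
        while [ -h "$target" ]; do
            # Relative link targets are interpreted from the link's directory,
            # so move there before reading the link.
            cd "$(dirname "$target")" || exit 1
            target="$(readlink "$(basename "$target")")"
        done
        cd "$(dirname "$target")" || exit 1
        # Assumption: print the physical path; the original uses plain pwd.
        printf '%s/\n' "$(pwd -P)"
    )
}
```

Inside a wrapper this could be called as `DIR="$(resolve_dir "${BASH_SOURCE[0]}")"`, after which `CP="${DIR}current/"` would be built as in the diff.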