#!/bin/bash

usage(){
echo "
Written by Brian Bushnell
Last modified May 23, 2014

*** DEPRECATED: This should still work but is no longer maintained. ***

Description:  Annotates reads with their kmer depth.

Usage:  kmercoverage in= out= hist=

Input parameters:
in2=null            Second input file for paired reads
extra=null          Additional files to use for input (generating hash table) but not for output
fastareadlen=2^31   Break up FASTA reads longer than this.  Can be useful when processing scaffolded genomes
tablereads=-1       Use at most this many reads when building the hashtable (-1 means all)
kmersample=1        Process every nth kmer, and skip the rest
readsample=1        Process every nth read, and skip the rest

Output parameters:
hist=null           Specify a file to output the depth histogram
histlen=10000       Max depth displayed on histogram
reads=-1            Only process this number of reads, then quit (-1 means all)
sampleoutput=t      Use sampling on output as well as input (not used if sample rates are 1)
printcoverage=f     Only print coverage information instead of reads
useheader=f         Append coverage info to the read's header
minmedian=0         Don't output reads with median coverage below this
minaverage=0        Don't output reads with average coverage below this
zerobin=f           Set to true if you want kmers with a count of 0 to go in the 0 bin instead of the 1 bin in histograms.
                    Default is false, to prevent confusion about how there can be 0-count kmers.
                    The reason is that based on the 'minq' and 'minprob' settings, some kmers may be excluded from the bloom filter.

Hashing parameters:
k=31                Kmer length (values under 32 are most efficient, but arbitrarily high values are supported)
cbits=8             Bits per cell in bloom filter; must be 2, 4, 8, 16, or 32.  Maximum kmer depth recorded is 2^cbits.
                    Large values decrease accuracy for a fixed amount of memory.
hashes=4            Number of times a kmer is hashed.  Higher is slower.
                    Higher is MORE accurate if there is enough memory, and LESS accurate if there is not enough memory.
prefilter=f         True is slower, but generally more accurate; filters out low-depth kmers from the main hashtable.
prehashes=2         Number of hashes for prefilter.
passes=1            More passes can sometimes increase accuracy by iteratively removing low-depth kmers
minq=7              Ignore kmers containing bases with quality below this
minprob=0.5         Ignore kmers with overall probability of correctness below this
threads=X           Spawn exactly X hashing threads (default is number of logical processors).  Total active threads may exceed X by up to 4.

Java Parameters:
-Xmx                This will set Java's memory usage, overriding autodetection.
                    -Xmx20g will specify 20 gigs of RAM, and -Xmx200m will specify 200 megs.
                    The max is typically 85% of physical memory.
-eoom               This flag will cause the process to exit if an
                    out-of-memory exception occurs.  Requires Java 8u92+.
-da                 Disable assertions.

Please contact Brian Bushnell at bbushnell@lbl.gov if you encounter any problems.
"
}

#This block allows symlinked shellscripts to correctly set classpath.
pushd . > /dev/null
DIR="${BASH_SOURCE[0]}"
while [ -h "$DIR" ]; do
    cd "$(dirname "$DIR")"
    DIR="$(readlink "$(basename "$DIR")")"
done
cd "$(dirname "$DIR")"
DIR="$(pwd)/"
popd > /dev/null

#DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )/"
CP="$DIR""current/"

z="-Xmx1g"
z2="-Xms1g"
set=0

if [ -z "$1" ] || [[ $1 == -h ]] || [[ $1 == --help ]]; then
    usage
    exit
fi

calcXmx () {
    source "$DIR""/calcmem.sh"
    setEnvironment
    parseXmx "$@"
    if [[ $set == 1 ]]; then
        return
    fi
    freeRam 3200m 84
    z="-Xmx${RAM}m"
    z2="-Xms${RAM}m"
}
calcXmx "$@"

kmercoverage() {
    local CMD="java $EA $EOOM $z -cp $CP jgi.KmerCoverage prefilter=true bits=16 interleaved=false $@"
    echo $CMD >&2
    eval $CMD
}

kmercoverage "$@"