annotate CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/opt/mummer-3.23/README @ 69:33d812a61356

planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author jpayne
date Tue, 18 Mar 2025 17:55:14 -0400
rev   line source
jpayne@69 1 -=- MUMmer3.x README -=-
jpayne@69 2
jpayne@69 3 ** NOTE **
jpayne@69 4 A comprehensive HTML user manual is available in the docs/web/manual
jpayne@69 5 subdirectory or at http://mummer.sourceforge.net/manual
jpayne@69 6
jpayne@69 7 MUMmer is now an open source package! Please contact us if you would like
jpayne@69 8 to contribute to the MUMmer project. For more information or the latest
jpayne@69 9 release please visit the MUMmer homepage at http://mummer.sourceforge.net
jpayne@69 10
jpayne@69 11 Please refer to the INSTALL file for installation instructions. This file
jpayne@69 12 contains brief descriptions of all executables in the base directory and
jpayne@69 13 general information about the MUMmer package.
jpayne@69 14
jpayne@69 15
jpayne@69 16
jpayne@69 17 -- DESCRIPTION --
jpayne@69 18 MUMmer is a system for rapidly aligning entire genomes. The current
jpayne@69 19 version (release 3.0) can find all maximal exact matches of at least 20 base pairs between
jpayne@69 20 two bacterial genomes of ~5 million base pairs each in 20 seconds, using 90 MB
jpayne@69 21 of memory, on a typical 1.8 GHz Linux desktop computer. MUMmer can also align
jpayne@69 22 incomplete genomes; it handles the 100s or 1000s of contigs from a shotgun
jpayne@69 23 sequencing project with ease, and will align them to another set of contigs or
jpayne@69 24 a genome, using the nucmer utility included with the system. The promer
jpayne@69 25 utility takes this a step further by generating alignments based upon the
jpayne@69 26 six-frame translations of both input sequences. promer permits the alignment
jpayne@69 27 of genomes for which the proteins are similar but the DNA sequence is too
jpayne@69 28 divergent to detect similarity. See the nucmer and promer readme files in the
jpayne@69 29 "docs/" subdirectory for more details. MUMmer is open source, so all we ask
jpayne@69 30 is that you cite our most recent paper in any publications that use this
jpayne@69 31 system:
jpayne@69 32
jpayne@69 33 (Version 3.0 described)
jpayne@69 34 Versatile and open software for comparing large genomes.
jpayne@69 35 S. Kurtz, A. Phillippy, A.L. Delcher,
jpayne@69 36 M. Smoot, M. Shumway, C. Antonescu, and S.L. Salzberg.
jpayne@69 37 Genome Biology (2004), 5:R12.
jpayne@69 38
jpayne@69 39 (Version 2.1 described)
jpayne@69 40 Fast algorithms for large-scale genome alignment and comparison.
jpayne@69 41 A.L. Delcher, A. Phillippy, J. Carlton, and S.L. Salzberg.
jpayne@69 42 Nucleic Acids Research 30:11 (2002), 2478-2483.
jpayne@69 43
jpayne@69 44 (Version 1.0 described)
jpayne@69 45 Alignment of Whole Genomes.
jpayne@69 46 A.L. Delcher, S. Kasif,
jpayne@69 47 R.D. Fleischmann, J. Peterson, O. White, and S.L. Salzberg.
jpayne@69 48 Nucleic Acids Research, 27:11 (1999), 2369-2376.
jpayne@69 49
jpayne@69 50
jpayne@69 51 -- RUNNING MUMmer3.0 --
jpayne@69 52 MUMmer3.0 comprises many utilities and scripts. For general
jpayne@69 53 purposes, the scripts "run-mummer1", "run-mummer3", "nucmer", and "promer"
jpayne@69 54 will be all that is needed. See their descriptions in the "RUNNING THE MUMmer
jpayne@69 55 SCRIPTS" section, or refer to their individual documentation in the "docs/"
jpayne@69 56 subdirectory. Refer to the "RUNNING THE MUMmer UTILITIES" section for a brief
jpayne@69 57 description of all of the utilities in this directory.
jpayne@69 58
jpayne@69 59 Simple use case:
jpayne@69 60 Given a file containing a single reference sequence (ref.seq) in
jpayne@69 61 FastA format and another file containing multiple sequences in FastA
jpayne@69 62 format (qry.seq), type the following at the command line:
jpayne@69 63
jpayne@69 64 './nucmer -p <prefix> ref.seq qry.seq'
jpayne@69 65
jpayne@69 66 To produce the following files:
jpayne@69 67 <prefix>.delta
jpayne@69 68
jpayne@69 69 or
jpayne@69 70
jpayne@69 71 './run-mummer3 ref.seq qry.seq <prefix>'
jpayne@69 72
jpayne@69 73 To produce the following files:
jpayne@69 74 <prefix>.out
jpayne@69 75 <prefix>.gaps
jpayne@69 76 <prefix>.align
jpayne@69 77 <prefix>.errorsgaps
jpayne@69 78
jpayne@69 79 Please read the utility-specific documentation in the "docs/" subdirectory
jpayne@69 80 for descriptions of these files and information on how to change the
jpayne@69 81 alignment parameters for the scripts (minimum match length, etc.), or see
jpayne@69 82 the notes below in the "RUNNING THE MUMmer SCRIPTS" section for a brief
jpayne@69 83 explanation.
jpayne@69 84
jpayne@69 85 To see a simple gnuplot output, if you have gnuplot installed, run
jpayne@69 86 the perl script 'mummerplot' on the output files. This script can be run
jpayne@69 87 on mummer output (.out), or nucmer/promer output (.delta). Edit the
jpayne@69 88 <prefix>.gp file that is created to change colors, line thicknesses, etc. or
jpayne@69 89 explore the <prefix>.[fr]plot file to see the data collection.
jpayne@69 90
jpayne@69 91 './mummerplot -p <prefix> <prefix>.out'
jpayne@69 92
jpayne@69 93 Or you can use the web viewer for completed microbial genomes:
jpayne@69 94 http://www.tigr.org/CMR
jpayne@69 95
jpayne@69 96
jpayne@69 97
jpayne@69 98 -- RUNNING THE MUMmer SCRIPTS --
jpayne@69 99 Because of MUMmer's modular design, it may be necessary to use a number
jpayne@69 100 of separate programs to produce the desired output. The MUMmer scripts
jpayne@69 101 attempt to simplify this process by wrapping various utilities into packages
jpayne@69 102 that can perform standard alignment requests. Listed below are brief
jpayne@69 103 descriptions and usage definitions for these scripts. Please refer to the
jpayne@69 104 "docs/" subdirectory for a more detailed description of each script.
jpayne@69 105
jpayne@69 106
jpayne@69 107 ** nucmer **
jpayne@69 108
jpayne@69 109 DESCRIPTION:
jpayne@69 110 nucmer is for the all-vs-all comparison of nucleotide sequences
jpayne@69 111 contained in multi-FastA data files. It is best used for highly
jpayne@69 112 similar sequences that may have large rearrangements. Common use
jpayne@69 113 cases are: comparing two unfinished shotgun sequencing assemblies,
jpayne@69 114 mapping an unfinished sequencing assembly to a finished genome, and
jpayne@69 115 comparing two fairly similar genomes that may have large
jpayne@69 116 rearrangements and duplications. Please refer to "docs/nucmer.README"
jpayne@69 117 for more information regarding this script and its output, or type
jpayne@69 118 'nucmer -h' for a list of its options.
jpayne@69 119
jpayne@69 120 USAGE:
jpayne@69 121 nucmer [options] <reference> <query>
jpayne@69 122
jpayne@69 123 [options] type 'nucmer -h' for a list of options.
jpayne@69 124 <reference> specifies the multi-FastA sequence file that contains
jpayne@69 125 the reference sequences, to be aligned with the queries.
jpayne@69 126 <query> specifies the multi-FastA sequence file that contains
jpayne@69 127 the query sequences, to be aligned with the references.
jpayne@69 128
jpayne@69 129 OUTPUT:
jpayne@69 130 out.delta the delta encoded alignments between the reference and
jpayne@69 131 query sequences. This file can be parsed with any of
jpayne@69 132 the show-* programs which are described in the "RUNNING
jpayne@69 133 THE MUMmer UTILITIES" section.
jpayne@69 134
jpayne@69 135 NOTES:
jpayne@69 136 All output coordinates reference the forward strand of the involved
jpayne@69 137 sequence, regardless of the match direction. Also, nucmer now uses
jpayne@69 138 only matches that are unique in the reference sequence by default;
jpayne@69 139 use the '--mum' or '--maxmatch' options to change this behavior.
jpayne@69 140
jpayne@69 141
jpayne@69 142 ** promer **
jpayne@69 143
jpayne@69 144 DESCRIPTION:
jpayne@69 145 promer is for the protein level, all-vs-all comparison of nucleotide
jpayne@69 146 sequences contained in multi-FastA data files. The nucleotide input
jpayne@69 147 files are translated in all 6 reading frames and then aligned to one
jpayne@69 148 another via the same methods as nucmer. It is best used for highly
jpayne@69 149 divergent sequences that may have moderate to high similarity on the
jpayne@69 150 protein level. Common use cases are: identifying syntenic regions
jpayne@69 151 between highly divergent genomes, comparative genome annotation (i.e.
jpayne@69 152 using an already annotated genome to help in the annotation of a
jpayne@69 153 newly sequenced genome), and the general comparison of two fairly
jpayne@69 154 divergent genomes that have large rearrangements and may only be
jpayne@69 155 similar on the protein level. Please refer to "docs/promer.README"
jpayne@69 156 for more information regarding this script and its output, or type
jpayne@69 157 'promer -h' for a list of its options.
jpayne@69 158
jpayne@69 159 USAGE:
jpayne@69 160 promer [options] <reference> <query>
jpayne@69 161
jpayne@69 162 [options] type 'promer -h' for a list of options.
jpayne@69 163 <reference> specifies the multi-FastA sequence file that contains
jpayne@69 164 the reference sequences, to be aligned with the queries.
jpayne@69 165 <query> specifies the multi-FastA sequence file that contains
jpayne@69 166 the query sequences, to be aligned with the references.
jpayne@69 167
jpayne@69 168 OUTPUT:
jpayne@69 169 out.delta the delta encoded alignments between the reference and
jpayne@69 170 query sequences. This file can be parsed with any of
jpayne@69 171 the show-* programs which are described in the "RUNNING
jpayne@69 172 THE MUMmer UTILITIES" section.
jpayne@69 173
jpayne@69 174 NOTES:
jpayne@69 175 All output coordinates reference the forward strand of the involved
jpayne@69 176 sequence, regardless of the match direction, and are measured in
jpayne@69 177 nucleotides with the exception of the delta integers which are
jpayne@69 178 measured in amino acids (1 delta int = 3 nucleotides). Also, promer
jpayne@69 179 now uses only matches that are unique in the reference sequence by
jpayne@69 180 default; use the '--mum' or '--maxmatch' options to change this
jpayne@69 181 behavior.
jpayne@69 182
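The six-frame translation described for promer can be illustrated with a minimal Python sketch. This is not promer's implementation: the function names are hypothetical, and the use of the standard genetic code (with 'X' for codons containing ambiguous bases) is an assumption made for illustration.

```python
# Illustrative sketch only; not the code promer uses internally.
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame number (+1, +2, +3, -1, -2, -3) to its amino acid string."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames
```

Because each amino acid corresponds to three nucleotides, a position measured in amino acids within one frame maps to a nucleotide span three times as long, which is why the delta integers in promer output are scaled by 3 relative to the nucleotide coordinates.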
jpayne@69 183
jpayne@69 184 ** run-mummer1 **
jpayne@69 185
jpayne@69 186 DESCRIPTION:
jpayne@69 187 This script is taken directly from MUMmer1.0 and is best used to
jpayne@69 188 align two sequences in which there is high similarity and no
jpayne@69 189 rearrangements. A common use case is aligning two finished bacterial
jpayne@69 190 chromosomes. Please refer to "docs/run-mummer1.README" for the
jpayne@69 191 original documentation for this script and its output.
jpayne@69 192
jpayne@69 193 USAGE:
jpayne@69 194 run-mummer1 <seq1> <seq2> <tag> [-r]
jpayne@69 195
jpayne@69 196 <seq1> specifies the file with the first sequence in FastA format.
jpayne@69 197 No more than one sequence is allowed.
jpayne@69 198 <seq2> specifies the file with the second sequence in FastA format.
jpayne@69 199 No more than one sequence is allowed.
jpayne@69 200 <tag> specifies the prefix to be used for the output files.
jpayne@69 201 [-r] is an optional parameter that will reverse complement the
jpayne@69 202 second sequence.
jpayne@69 203
jpayne@69 204 OUTPUT:
jpayne@69 205 out.align the out.gaps file interspersed with the alignments
jpayne@69 206 of the gaps.
jpayne@69 207 out.errorsgaps the out.gaps file with an extra column stating the
jpayne@69 208 number of errors contained in each gap.
jpayne@69 209 out.gaps an ordered (clustered) list of matches with position
jpayne@69 210 information, and gap distances between each match.
jpayne@69 211 out.out a list of all maximal unique matches between the two
jpayne@69 212 input sequences ordered by their start position in the
jpayne@69 213 second sequence.
jpayne@69 214
jpayne@69 215 NOTES:
jpayne@69 216 All output coordinates reference their respective strand. This means
jpayne@69 217 that if the -r switch is active, coordinates that reference the
jpayne@69 218 second sequence will be relative to the reverse complement of the
jpayne@69 219 second sequence. Please use nucmer or promer if this coordinate
jpayne@69 220 system is confusing.
jpayne@69 221 Eventually, this script's components will be rewritten to work
jpayne@69 222 with the new MUMmer format standards and phased out in favor of the
jpayne@69 223 new components and wrapping script.
jpayne@69 224
jpayne@69 225
jpayne@69 226 ** run-mummer3 **
jpayne@69 227
jpayne@69 228 DESCRIPTION:
jpayne@69 229 This script is the improved version of the MUMmer1.0 run-mummer1
jpayne@69 230 script. It uses a new clustering algorithm that appropriately
jpayne@69 231 handles multiple sequence rearrangements and inversions. Because
jpayne@69 232 of this, it can handle more divergent sequences better than
jpayne@69 233 run-mummer1. In addition, it allows a multi-FastA query file for
jpayne@69 234 1-vs-many sequence comparisons. Please refer to
jpayne@69 235 "docs/run-mummer3.README" for more detailed documentation of this
jpayne@69 236 script and its output.
jpayne@69 237
jpayne@69 238 USAGE:
jpayne@69 239 run-mummer3 <reference> <query> <prefix>
jpayne@69 240
jpayne@69 241 <reference> specifies the file with the reference sequence in FastA
jpayne@69 242 format. No more than one sequence is allowed.
jpayne@69 243 <query> specifies the multi-FastA sequence file that contains
jpayne@69 244 the query sequences.
jpayne@69 245 <prefix> specifies the file prefix for the output files.
jpayne@69 246
jpayne@69 247 OUTPUT:
jpayne@69 248 out.align the out.gaps file interspersed with the alignments
jpayne@69 249 of the gaps.
jpayne@69 250 out.errorsgaps the out.gaps file with an extra column stating the
jpayne@69 251 number of errors contained in each gap.
jpayne@69 252 out.gaps an ordered (clustered) list of matches with position
jpayne@69 253 information, and gap distances between each match.
jpayne@69 254 out.out a list of all maximal unique matches between the two
jpayne@69 255 input sequences ordered by their start position in the
jpayne@69 256 second sequence.
jpayne@69 257
jpayne@69 258 NOTES:
jpayne@69 259 All output coordinates reference their respective strand. This means
jpayne@69 260 that for all reverse matches, the coordinates that reference the
jpayne@69 261 query sequence will be relative to the reverse complement of the
jpayne@69 262 query sequence. Please use nucmer or promer if this coordinate
jpayne@69 263 system is confusing.
jpayne@69 264
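If reverse-match coordinates must be compared with forward-strand positions, the conversion is simple arithmetic. The sketch below assumes 1-based, inclusive coordinates and a known query sequence length; the function name is hypothetical and is not part of the MUMmer package.

```python
def to_forward_coord(rc_coord, seq_length):
    """Map a 1-based position on the reverse complement to the forward strand.

    Assumes 1-based coordinates: position 1 on the reverse complement is the
    last base of the forward sequence.
    """
    if not 1 <= rc_coord <= seq_length:
        raise ValueError("coordinate lies outside the sequence")
    return seq_length - rc_coord + 1
```

For example, in a 100-base query, position 10 on the reverse complement corresponds to position 91 on the forward strand.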
jpayne@69 265
jpayne@69 266 ** dnadiff **
jpayne@69 267
jpayne@69 268 DESCRIPTION:
jpayne@69 269 This script is a wrapper around nucmer that builds an
jpayne@69 270 alignment using default parameters, and runs many of nucmer's
jpayne@69 271 helper scripts to process the output and report alignment
jpayne@69 272 statistics, SNPs, breakpoints, etc. It is designed for
jpayne@69 273 evaluating the sequence and structural similarity of two
jpayne@69 274 highly similar sequence sets, e.g. comparing two different
jpayne@69 275 assemblies of the same organism, or comparing two strains of
jpayne@69 276 the same species. Please refer to "docs/dnadiff.README" for
jpayne@69 277 more information regarding this script and its output, or type
jpayne@69 278 'dnadiff -h' for a list of its options.
jpayne@69 279
jpayne@69 280 USAGE: dnadiff [options] <reference> <query>
jpayne@69 281 or dnadiff [options] -d <delta file>
jpayne@69 282
jpayne@69 283 <reference> Set the input reference multi-FASTA filename
jpayne@69 284 <query> Set the input query multi-FASTA filename
jpayne@69 285 or
jpayne@69 286 <delta file> Unfiltered .delta alignment file from nucmer
jpayne@69 287
jpayne@69 288 OUTPUT:
jpayne@69 289 .report - Summary of alignments, differences and SNPs
jpayne@69 290 .delta - Standard nucmer alignment output
jpayne@69 291 .1delta - 1-to-1 alignment from delta-filter -1
jpayne@69 292 .mdelta - M-to-M alignment from delta-filter -m
jpayne@69 293 .1coords - 1-to-1 coordinates from show-coords -THrcl .1delta
jpayne@69 294 .mcoords - M-to-M coordinates from show-coords -THrcl .mdelta
jpayne@69 295 .snps - SNPs from show-snps -rlTHC .1delta
jpayne@69 296 .rdiff - Classified ref breakpoints from show-diff -rH .mdelta
jpayne@69 297 .qdiff - Classified qry breakpoints from show-diff -qH .mdelta
jpayne@69 298 .unref - Unaligned reference IDs and lengths (if applicable)
jpayne@69 299 .unqry - Unaligned query IDs and lengths (if applicable)
jpayne@69 300
jpayne@69 301 NOTES:
jpayne@69 302 The report file generated by this script can be useful for
jpayne@69 303 comparing the differences between two similar genomes or
jpayne@69 304 assemblies. The other outputs generated by this script are in
jpayne@69 305 unlabeled tabular format, so please refer to the utility
jpayne@69 306 specific documentation for interpreting them. A full
jpayne@69 307 description of the report file is given in "docs/dnadiff.README".
jpayne@69 308
jpayne@69 309
jpayne@69 310 -- RUNNING THE MUMmer UTILITIES --
jpayne@69 311 The MUMmer package consists of various utilities that can interact with
jpayne@69 312 the 'mummer' program. 'mummer' performs all maximal and maximal unique
jpayne@69 313 matching, and all other utilities were designed to process the input and
jpayne@69 314 output of this program and its related scripts, in order to extract
jpayne@69 315 additional information from the output. Listed below are the descriptions
jpayne@69 316 and usage definitions for these utilities.
jpayne@69 317
jpayne@69 318
jpayne@69 319 ** annotate **
jpayne@69 320
jpayne@69 321 DESCRIPTION:
jpayne@69 322 This program reads the output of the 'gaps' program and adds alignment
jpayne@69 323 information to it. It is part of the original MUMmer1.0 pipeline and
jpayne@69 324 can only be used on the output of the 'gaps' program.
jpayne@69 325
jpayne@69 326 USAGE:
jpayne@69 327 annotate <gapsfile> <seq2>
jpayne@69 328
jpayne@69 329 <gapsfile> the output of the 'gaps' program.
jpayne@69 330 <seq2> the file containing the second sequence in the comparison.
jpayne@69 331
jpayne@69 332 OUTPUT:
jpayne@69 333 stdout the 'gaps' output interspersed with the alignments of
jpayne@69 334 the gaps between adjacent MUMs. An alignment of a
jpayne@69 335 gap comes after the second MUM defining the gap, and
jpayne@69 336 alignment errors are marked with a '^' character.
jpayne@69 337 witherrors.gaps the 'gaps' output with an appended column that lists
jpayne@69 338 the number of alignment errors for each gap.
jpayne@69 339
jpayne@69 340 NOTES:
jpayne@69 341 This program will eventually be dropped in favor of the combineMUMs
jpayne@69 342 or nucmer match extenders, but persists for the time being.
jpayne@69 343
jpayne@69 344
jpayne@69 345 ** combineMUMs **
jpayne@69 346
jpayne@69 347 DESCRIPTION:
jpayne@69 348 This program reads the output of the 'mgaps' program and adds alignment
jpayne@69 349 information to it. It is part of the MUMmer3.0 pipeline and can only be
jpayne@69 350 used on the output of the 'mgaps' program. The -D option alters this
jpayne@69 351 behavior and only outputs the positions of difference, e.g. SNPs.
jpayne@69 352
jpayne@69 353 USAGE:
jpayne@69 354 combineMUMs [options] <reference> <query> <mgapsfile>
jpayne@69 355
jpayne@69 356 [options] type 'combineMUMs -h' for a list of options.
jpayne@69 357 <reference> the FastA reference file used in the comparison.
jpayne@69 358 <query> the multi-FastA query file used in the comparison.
jpayne@69 359 <mgapsfile> the output of the 'mgaps' program run on the match
jpayne@69 360 list produced by 'mummer' for the reference and query
jpayne@69 361 files.
jpayne@69 362
jpayne@69 363 OUTPUT:
jpayne@69 364 stdout the 'mgaps' output interspersed with the alignments
jpayne@69 365 of the gaps between adjacent MUMs. An alignment of a
jpayne@69 366 gap comes after the second MUM defining the gap, and
jpayne@69 367 alignment errors are marked with a '^' character. At
jpayne@69 368 the end of each cluster is a summary line (keyword
jpayne@69 369 "Region") noting the bounds of the cluster in the
jpayne@69 370 reference and query sequences, the total number of
jpayne@69 371 errors for the region, the length of the region and
jpayne@69 372 the percent error of the region.
jpayne@69 373 witherrors.gaps the 'mgaps' output with an appended column that lists
jpayne@69 374 the number of alignment errors for each gap.
jpayne@69 375
jpayne@69 376
jpayne@69 377 ** delta-filter **
jpayne@69 378
jpayne@69 379 DESCRIPTION:
jpayne@69 380
jpayne@69 381 This program filters a delta alignment file produced by either
jpayne@69 382 nucmer or promer, leaving only the desired alignments which
jpayne@69 383 are output to stdout in the same delta format as the
jpayne@69 384 input. Its primary function is the LIS algorithm which
jpayne@69 385 calculates the longest increasing subset of alignments. This
jpayne@69 386 allows for the calculation of a global set of alignments
jpayne@69 387 (i.e. 1-to-1 and mutually consistent order) with the -g option
jpayne@69 388 or locally consistent with -1 or -m. Reference sequences can
jpayne@69 389 be mapped to query sequences with -r, or queries to references
jpayne@69 390 with -q. This allows the user to exclude chance and repeat
jpayne@69 391 induced alignments, leaving only the "best" alignments between
jpayne@69 392 the two data sets. Filtering can also be performed on length,
jpayne@69 393 identity, and uniqueness.
jpayne@69 394
jpayne@69 395 USAGE:
jpayne@69 396 delta-filter [options] <deltafile>
jpayne@69 397
jpayne@69 398 [options] type 'delta-filter -h' for a list of options.
jpayne@69 399 <deltafile> the .delta output file from either nucmer or promer.
jpayne@69 400
jpayne@69 401 OUTPUT:
jpayne@69 402 stdout The same delta alignment format as output by nucmer and promer.
jpayne@69 403
jpayne@69 404 NOTES:
jpayne@69 405 For most cases the -m option is recommended, however -1 is
jpayne@69 406 useful for applications that require a 1-to-1 mapping, such as
jpayne@69 407 SNP finding. Use the -q option for mapping query contigs to
jpayne@69 408 their best reference location.
jpayne@69 409
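The idea behind a longest increasing subset of alignments can be sketched as a weighted chaining problem. The Python below is an illustration under simplifying assumptions, not delta-filter's implementation: it handles forward-strand alignments only, weights each alignment by its reference length alone, and uses a simple quadratic dynamic program. The tuple layout and function name are hypothetical.

```python
def longest_increasing_subset(alignments):
    """Pick the heaviest chain of alignments that is ordered in both sequences.

    alignments: list of (ref_start, ref_end, qry_start, qry_end) tuples,
    forward strand, inclusive coordinates. An alignment may follow another
    only if it starts after the other ends in both reference and query.
    """
    ordered = sorted(alignments)
    n = len(ordered)
    if n == 0:
        return []
    # score[i] is the best total length of a chain ending at alignment i.
    score = [a[1] - a[0] + 1 for a in ordered]
    prev = [None] * n
    for i in range(n):
        own = ordered[i][1] - ordered[i][0] + 1
        for j in range(i):
            follows_in_ref = ordered[j][1] < ordered[i][0]
            follows_in_qry = ordered[j][3] < ordered[i][2]
            if follows_in_ref and follows_in_qry and score[j] + own > score[i]:
                score[i] = score[j] + own
                prev[i] = j
    # Trace back from the best-scoring chain end.
    best = max(range(n), key=score.__getitem__)
    chain = []
    while best is not None:
        chain.append(ordered[best])
        best = prev[best]
    return chain[::-1]
```

An alignment whose query position is out of order relative to its neighbors, such as a repeat-induced hit, is left out of the chain because including it would break the increasing order.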
jpayne@69 410
jpayne@69 411 ** exact-tandems **
jpayne@69 412
jpayne@69 413 DESCRIPTION:
jpayne@69 414 This script finds exact tandem repeats in a specified FastA sequence
jpayne@69 415 file. It is a post-processor for 'repeat-match' and provides a simple
jpayne@69 416 interface and output for tandem repeat detection.
jpayne@69 417
jpayne@69 418 USAGE:
jpayne@69 419 exact-tandems <file> <min match>
jpayne@69 420
jpayne@69 421 <file> the single sequence in FastA format to search for repeats.
jpayne@69 422 <min match> the minimum match length for the tandems.
jpayne@69 423
jpayne@69 424 OUTPUT:
jpayne@69 425 stdout 4 columns, the start of the tandem repeat, the total extent
jpayne@69 426 of the repeat region, the length of each repetitive unit, and
jpayne@69 427 the total copies of the repetitive unit involved.
jpayne@69 428
jpayne@69 429
jpayne@69 430 ** gaps **
jpayne@69 431
jpayne@69 432 DESCRIPTION:
jpayne@69 433 This program reads a list of unique matches between two strings and
jpayne@69 434 outputs the longest consistent set of matches, followed by all the
jpayne@69 435 other matches. It is part of the MUMmer1.0 pipeline. The output of the
jpayne@69 436 'mummer' program must be processed (to strip all non-match lines)
jpayne@69 437 before it can be passed to this program.
jpayne@69 438
jpayne@69 439 USAGE:
jpayne@69 440 gaps <seq1> [-r] < <matchlist>
jpayne@69 441
jpayne@69 442 <seq1> The first sequence file that the match list represents.
jpayne@69 443 <matchlist> A simple list of matches and NO header lines or other
jpayne@69 444 mumbo jumbo. The columns of the match list should be
jpayne@69 445 start in the reference, start in the query, and length
jpayne@69 446 of the match.
jpayne@69 447 [-r] Simply puts the string "reverse" on the header of the
jpayne@69 448 output so 'annotate' knows to reverse the second
jpayne@69 449 sequence.
jpayne@69 450
jpayne@69 451 OUTPUT:
jpayne@69 452 stdout an ordered set of the input matches, separated by headers.
jpayne@69 453 The first set is the longest consistent set of matches and
jpayne@69 454 the second set is all other matches.
jpayne@69 455
jpayne@69 456 NOTES:
jpayne@69 457 This program will eventually be rewritten to be interchangeable with
jpayne@69 458 'mgaps', so that it may be plugged into the nucmer or promer
jpayne@69 459 pipelines.
jpayne@69 460
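The pre-processing step that 'gaps' requires (stripping all non-match lines from 'mummer' output) could be done with a small filter such as the Python sketch below. It assumes that a match line consists of exactly three whitespace-separated integers (start in the reference, start in the query, match length) and that every other line is a header or blank; the function name is hypothetical.

```python
def strip_non_match_lines(lines):
    """Yield only the lines that look like three-column integer match records.

    Assumption: headers, blank lines, and anything else that is not exactly
    three non-negative integers should be dropped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) == 3 and all(field.isdigit() for field in fields):
            yield " ".join(fields)
```

The filtered lines can then be written to a file and supplied to 'gaps' on standard input, as shown in the USAGE above.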
jpayne@69 461
jpayne@69 462 ** mapview **
jpayne@69 463
jpayne@69 464 DESCRIPTION:
jpayne@69 465 mapview is a utility program for displaying sequence alignments as
jpayne@69 466 provided by MUMmer, nucmer or promer. This program takes the output
jpayne@69 467 from these alignment routines and converts it to a FIG, PDF or PS
jpayne@69 468 file for visual analysis. It can also break the output into multiple
jpayne@69 469 files for easier viewing and printing. Please refer to
jpayne@69 470 "docs/mapview.README" for a more detailed description and explanation.
jpayne@69 471
jpayne@69 472 USAGE:
jpayne@69 473 mapview [options] <coords file> [UTR coords] [CDS coords]
jpayne@69 474
jpayne@69 475 [options] type 'mapview -h' for a list of options.
jpayne@69 476 <coords file> show-coords output file
jpayne@69 477 [UTR coords] UTR coordinate file in GFF format
jpayne@69 478 [CDS coords] CDS coordinate file in GFF format
jpayne@69 479
jpayne@69 480 OUTPUT:
jpayne@69 481 The default output format is an xfig file; however, this can be changed
jpayne@69 482 to a PostScript or PDF file with the -f option. See 'mapview -h' for a
jpayne@69 483 list of available formatting options.
jpayne@69 484
jpayne@69 485 NOTES:
jpayne@69 486 To produce the coords file input, 'show-coords' must be run with the
jpayne@69 487 -r -l options. To reduce redundant matches in promer output, run
jpayne@69 488 show-coords with the -k option. To generate output formats other than
jpayne@69 489 xfig, the fig2dev utility must be available from the system path. For
jpayne@69 490 very large reference genomes, FIG format may be the only option that
jpayne@69 491 will allow the entire display to be stored in one file, as fig2dev has
jpayne@69 492 problems if the output is too large.
jpayne@69 493
jpayne@69 494
jpayne@69 495 ** mgaps **
jpayne@69 496
jpayne@69 497 DESCRIPTION:
jpayne@69 498 This program reads a list of matches between a single-FastA reference
jpayne@69 499 and a multi-FastA query file and outputs clusters of matches that lie
jpayne@69 500 on similar diagonals and within a reasonable distance. It is part of the
jpayne@69 501 MUMmer3.0 pipeline. The output of 'mummer' need not be processed
jpayne@69 502 before passing it to this program, so long as 'mummer' was run on a
jpayne@69 503 1-vs-many or 1-vs-1 dataset.
jpayne@69 504
jpayne@69 505 USAGE:
jpayne@69 506 mgaps [options] < <matchlist>
jpayne@69 507
jpayne@69 508 [options] type 'mgaps -h' for a list of options.
jpayne@69 509 <matchlist> A list of matches separated by their sequence FastA tags.
jpayne@69 510 The columns of the match list should be start in
jpayne@69 511 reference, start in query, and length of the match.
jpayne@69 512
jpayne@69 513 OUTPUT:
jpayne@69 514 stdout An ordered set of the input matches, separated by headers.
jpayne@69 515 Individual clusters are separated by a '#' character and
jpayne@69 516 sets of clusters from different sequences are separated by
jpayne@69 517 the FastA header tag for the query sequence.
jpayne@69 518
jpayne@69 519 NOTES:
jpayne@69 520 It is often very helpful to adjust the clustering parameters. Check
jpayne@69 521 'mgaps -h' for the list of parameters and check the source for a
jpayne@69 522 better idea of how each parameter affects the result. Often, it is
jpayne@69 523 helpful to run this program a number of times with different
jpayne@69 524 parameters until the desired result is achieved.
jpayne@69 525
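To give a feel for what "similar diagonals and within a reasonable distance" means, here is a minimal Python sketch of diagonal-based clustering. It is not the mgaps algorithm: the parameter names (max_diag_diff, max_gap), their default values, and the pairwise linking rule are all assumptions made for illustration. Check 'mgaps -h' for the real parameters.

```python
def cluster_matches(matches, max_diag_diff=5, max_gap=1000):
    """Group matches lying on nearby diagonals and close together.

    matches: list of (ref_start, qry_start, length) tuples.
    A match's diagonal is ref_start - qry_start. Two matches are linked when
    their diagonals differ by at most max_diag_diff and the second starts no
    more than max_gap bases after the first ends in the reference.
    """
    ordered = sorted(matches)
    parent = list(range(len(ordered)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (r1, q1, l1) in enumerate(ordered):
        for j in range(i + 1, len(ordered)):
            r2, q2, _ = ordered[j]
            if r2 - (r1 + l1) > max_gap:
                break  # sorted by ref_start, so later matches are even farther
            if abs((r2 - q2) - (r1 - q1)) <= max_diag_diff:
                parent[find(j)] = find(i)

    clusters = {}
    for i, match in enumerate(ordered):
        clusters.setdefault(find(i), []).append(match)
    return sorted(clusters.values())
```

Loosening max_diag_diff or max_gap merges more matches into each cluster, while tightening them splits clusters apart, which is the trade-off the advice above about rerunning with different parameters refers to.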
jpayne@69 526
jpayne@69 527 ** mummer **
jpayne@69 528
jpayne@69 529 DESCRIPTION:
jpayne@69 530 This is the core program of the MUMmer package. It is the suffix-tree
jpayne@69 531 based match finding routine, and the main part of every MUMmer script.
jpayne@69 532 For a detailed manual describing how to use this program, please refer
jpayne@69 533 to "docs/maxmat3man.pdf" or in LaTeX format "docs/maxmat3man.tex". By
jpayne@69 534 default, 'mummer' now finds maximal matches regardless of their
jpayne@69 535 uniqueness. Limiting the output to only unique matches can be specified
jpayne@69 536 as a command line switch.
jpayne@69 537
jpayne@69 538 USAGE:
jpayne@69 539 mummer [options] <reference> <query> ...
jpayne@69 540
jpayne@69 541 [options] type 'mummer -help' for a list of options.
jpayne@69 542 <reference> specifies the single or multi-FastA sequence file that
jpayne@69 543 contains the reference sequence(s), to be aligned with
jpayne@69 544 the queries.
jpayne@69 545 <query> specifies the multi-FastA sequence file that contains
jpayne@69 546 the query sequences, to be aligned with the references.
jpayne@69 547 Multiple query files are allowed, up to 32.
jpayne@69 548
jpayne@69 549 OUTPUT:
jpayne@69 550 stdout a list of exact matches. Varies depending on input, refer to
jpayne@69 551 the manual specified in the description above.
jpayne@69 552
jpayne@69 553 NOTES:
jpayne@69 554 Many thanks to Stefan Kurtz for the latest mummer version. 'mummer'
jpayne@69 555 now behaves like the old 'mummer2' program by default. The -mum switch
jpayne@69 556 forces it to behave like 'mummer1', the -mumreference switch forces it
jpayne@69 557 to behave like 'mummer2' while the -maxmatch switch forces it to behave
jpayne@69 558 like the old 'max-match' program.
jpayne@69 559
jpayne@69 560
jpayne@69 561 ** mummerplot **
jpayne@69 562
jpayne@69 563 DESCRIPTION:
jpayne@69 564 mummerplot is a perl script that generates gnuplot scripts and data
jpayne@69 565 collections for plotting with the gnuplot utility. It can generate
jpayne@69 566 2-d dotplots and 1-d coverage plots for the output of mummer, nucmer,
jpayne@69 567 promer or show-tiling. It can also color dotplots with an identity
jpayne@69 568 color gradient.
jpayne@69 569
jpayne@69 570 USAGE:
jpayne@69 571 mummerplot [options] <matchfile>
jpayne@69 572
jpayne@69 573 [options] type 'mummerplot -h' for a list of options.
jpayne@69 574 <matchfile> the output of 'mummer', 'nucmer', 'promer', or
jpayne@69 575 'show-tiling'. 'mummerplot' will automatically determine
jpayne@69 576 the format of the data it was given and produce the plot
jpayne@69 577 accordingly.
jpayne@69 578
jpayne@69 579 OUTPUT:
jpayne@69 580 out.gp The gnuplot script; type 'gnuplot out.gp' to evaluate the
jpayne@69 581 script.
jpayne@69 582 out.fplot
jpayne@69 583 out.rplot
jpayne@69 584 out.hplot The forward, reverse and highlighted match information for
jpayne@69 585 plotting with gnuplot.
jpayne@69 586
jpayne@69 587 out.ps
jpayne@69 588 out.png The plotted image file, postscript or png depending on the
jpayne@69 589 selected terminal type.
jpayne@69 590
jpayne@69 591 NOTES:
jpayne@69 592 For alignments with multiple reference or query sequences, be sure to
jpayne@69 593 use the -r -q or -R -Q options to avoid overlaying multiple plots in
jpayne@69 594 the same space. For better looking color gradient plots, try the
jpayne@69 595 postscript terminal and avoid the png terminal.
jpayne@69 596
jpayne@69 597
jpayne@69 598 ** nucmer2xfig **
jpayne@69 599
jpayne@69 600 DESCRIPTION:
jpayne@69 601 Script for plotting nucmer hits against a reference sequence. See top
jpayne@69 602 of script for more information, or see if 'mummerplot' or 'mapview'
jpayne@69 603 has the functionality required as they are properly maintained.
jpayne@69 604
jpayne@69 605
jpayne@69 606 ** repeat-match **
jpayne@69 607
jpayne@69 608 DESCRIPTION:
jpayne@69 609 Finds exact repeats within a single sequence.
jpayne@69 610
jpayne@69 611 USAGE:
jpayne@69 612 repeat-match [options] <seq>
jpayne@69 613
jpayne@69 614 [options] type 'repeat-match -h' for a list of options.
jpayne@69 615 <seq> the single sequence in FastA format to search for repeats.
jpayne@69 616
jpayne@69 617 OUTPUT:
jpayne@69 618 stdout 3 columns, the start of the first copy of the repeat, the
jpayne@69 619 start of the second copy of the repeat, and the length of the
jpayne@69 620 repeat respectively.
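The three-column output described above is easy to post-process. The following is a minimal Python sketch, not part of MUMmer: the function name 'parse_repeat_match' and the sample rows are hypothetical, and the sketch assumes only that data rows consist of three whitespace-separated integers. Any row that does not fit that shape is skipped rather than interpreted.

```python
def parse_repeat_match(text):
    """Collect (start1, start2, length) triples from repeat-match-style output."""
    repeats = []
    for line in text.splitlines():
        fields = line.split()
        # Keep only rows made of exactly three integers; anything else
        # (titles, blank lines, rows with non-numeric fields) is skipped.
        if len(fields) != 3:
            continue
        try:
            start1, start2, length = (int(f) for f in fields)
        except ValueError:
            continue
        repeats.append((start1, start2, length))
    return repeats


# Hypothetical sample data, for illustration only.
sample = """\
Start1 Start2 Length
10 250 42
17 900 35
"""

# Pick the longest repeat by its third column.
longest = max(parse_repeat_match(sample), key=lambda r: r[2])
```

Sorting or filtering on the third column is then a one-liner, as the 'longest' example shows.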
jpayne@69 621
jpayne@69 622 NOTES:
jpayne@69 623 REPuter (freely available for universities) may be better suited for
jpayne@69 624 most repeat matching, but 'repeat-match' is open-source and has some
jpayne@69 625 functionality that REPuter does not so we include it along with the
jpayne@69 626 MUMmer package.
jpayne@69 627
jpayne@69 628
jpayne@69 629 ** show-aligns **
jpayne@69 630
jpayne@69 631 DESCRIPTION:
jpayne@69 632 This program parses the delta alignment output of nucmer and promer
jpayne@69 633 and displays all of the pairwise alignments from the two sequences
jpayne@69 634 specified on the command line.
jpayne@69 635
jpayne@69 636 USAGE:
jpayne@69 637 show-aligns [options] <deltafile> <IdR> <IdQ>
jpayne@69 638
jpayne@69 639 [options] type 'show-aligns -h' for a list of options.
jpayne@69 640 <deltafile> the .delta output file from either nucmer or promer.
jpayne@69 641 <IdR> the FastA header tag of the desired reference sequence.
jpayne@69 642 <IdQ> the FastA header tag of the desired query sequence.
jpayne@69 643
jpayne@69 644 OUTPUT:
jpayne@69 645 stdout each alignment header and footer describes the frame of the
jpayne@69 646 alignment in each sequence, and the start and finish
jpayne@69 647 (inclusive) of the alignment in each sequence. At the
jpayne@69 648 beginning of each line of aligned sequence are two numbers, the
jpayne@69 649 top is the coordinate of the first reference base on that line
jpayne@69 650 and the bottom is the coordinate of the first query base on
jpayne@69 651 that line. ALL coordinates reference the forward strand of the
jpayne@69 652 DNA sequence, even if it is a protein alignment. A gap caused
jpayne@69 653 by an insertion or deletion is filled with a '.' character.
jpayne@69 654 Errors in a DNA alignment are marked with a '^' below the
jpayne@69 655 error. Errors in an amino acid alignment are marked with a
jpayne@69 656 whitespace in the middle consensus line, while matches are
jpayne@69 657 marked with the consensus base and similarities are marked with
jpayne@69 658 a '+' in the consensus line.
jpayne@69 659
jpayne@69 660
jpayne@69 661 ** show-coords **
jpayne@69 662
jpayne@69 663 DESCRIPTION:
jpayne@69 664 This program parses the delta alignment output of nucmer and promer
jpayne@69 665 and displays the coordinates of the alignments, along with other
jpayne@69 666 useful information about them.
jpayne@69 667
jpayne@69 668 USAGE:
jpayne@69 669 show-coords [options] <deltafile>
jpayne@69 670
jpayne@69 671 [options] type 'show-coords -h' for a list of options.
jpayne@69 672 <deltafile> the .delta output file from either nucmer or promer.
jpayne@69 673
jpayne@69 674 OUTPUT:
jpayne@69 675 stdout run 'show-coords' without the -H option to see the column
jpayne@69 676 header tags. Here is a description of each tag. Note that
jpayne@69 677 some of the below tags do not apply to nucmer data, and that
jpayne@69 678 all coordinates are inclusive and relative to the forward DNA
jpayne@69 679 strand.
jpayne@69 680
jpayne@69 681 [S1] Start of the alignment region in the reference sequence.
jpayne@69 682
jpayne@69 683 [E1] End of the alignment region in the reference sequence.
jpayne@69 684
jpayne@69 685 [S2] Start of the alignment region in the query sequence.
jpayne@69 686
jpayne@69 687 [E2] End of the alignment region in the query sequence.
jpayne@69 688
jpayne@69 689 [LEN 1] Length of the alignment region in the reference sequence,
jpayne@69 690 measured in nucleotides.
jpayne@69 691
jpayne@69 692 [LEN 2] Length of the alignment region in the query sequence, measured
jpayne@69 693 in nucleotides.
jpayne@69 694
jpayne@69 695 [% IDY] Percent identity of the alignment, calculated as the
jpayne@69 696 (number of exact matches) / ([LEN 1] + insertions in the query).
jpayne@69 697
jpayne@69 698 [% SIM] Percent similarity of the alignment, calculated like the above
jpayne@69 699 value, but counting positive BLOSUM matrix scores instead of exact
jpayne@69 700 matches.
jpayne@69 701
jpayne@69 702 [% STP] Percent of stop codons of the alignment, calculated as
jpayne@69 703 (number of stop codons) / (([LEN 1] + insertions in the query) * 2).
jpayne@69 704
jpayne@69 705 [LEN R] Length of the reference sequence.
jpayne@69 706
jpayne@69 707 [LEN Q] Length of the query sequence.
jpayne@69 708
jpayne@69 709 [COV R] Percent coverage of the alignment on the reference sequence,
jpayne@69 710 calculated as [LEN 1] / [LEN R].
jpayne@69 711
jpayne@69 712 [COV Q] Percent coverage of the alignment on the query sequence,
jpayne@69 713 calculated as [LEN 2] / [LEN Q].
jpayne@69 714
jpayne@69 715 [FRM] Reading frame for the reference sequence and the reading frame
jpayne@69 716 for the query sequence respectively. This is one of the columns
jpayne@69 717 absent from the nucmer data, however, match direction can easily be
jpayne@69 718 determined by the start and end coordinates.
jpayne@69 719
jpayne@69 720 [TAGS] The reference FastA ID and the query FastA ID.
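The length, coverage and direction relationships among the tags above can be sketched in a few lines of Python. This is an illustration only, not code from MUMmer: the function names and the sample coordinates are hypothetical, and the sketch assumes that a region length is the inclusive span between its start and end coordinates, as implied by the note that all coordinates are inclusive.

```python
def region_length(start, end):
    """Length of an inclusive coordinate range, whichever way it runs."""
    return abs(end - start) + 1


def coverage_percent(aligned_len, seq_len):
    """[COV R] or [COV Q]: aligned length as a percentage of sequence length."""
    return 100.0 * aligned_len / seq_len


def same_strand(s1, e1, s2, e2):
    """True when reference and query coordinates run in the same direction."""
    return (e1 >= s1) == (e2 >= s2)


# Hypothetical alignment: reference 1001..1500, query 800..301.
len1 = region_length(1001, 1500)
len2 = region_length(800, 301)
cov_r = coverage_percent(len1, 10000)  # reference assumed to be 10000 bases long
reverse_match = not same_strand(1001, 1500, 800, 301)
```

The 'same_strand' helper reflects the remark under [FRM] that match direction for nucmer data can be read off the start and end coordinates.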
jpayne@69 721
jpayne@69 722 There is also an optional final column (turned on with the -w
jpayne@69 723 or -o option) that will contain some 'annotations'. The -o option will
jpayne@69 724 annotate alignments that represent overlaps between two sequences,
jpayne@69 725 while the -w option is antiquated and should no longer be used.
jpayne@69 726 Sometimes, nucmer or promer will extend adjacent clusters past one
jpayne@69 727 another, thus causing somewhat redundant output; this option will
jpayne@69 728 notify users of such rare occurrences.
jpayne@69 729
jpayne@69 730 NOTES:
jpayne@69 731 The -c and -l options are useful when comparing two sets of assembly
jpayne@69 732 contigs, in that these options help determine if an alignment spans an
jpayne@69 733 entire contig, or is just a partial hit to a different read. The -b
jpayne@69 734 option is useful when the user wishes to identify syntenic regions
jpayne@69 735 between two genomes, but is not particularly interested in the actual
jpayne@69 736 alignment similarity or appearance. This option also disregards match
jpayne@69 737 orientation, so it should not be used if this information is needed.
jpayne@69 738
jpayne@69 739
jpayne@69 740 ** show-diff **
jpayne@69 741
jpayne@69 742 DESCRIPTION:
jpayne@69 743 This program classifies alignment breakpoints for the
jpayne@69 744 quantification of macroscopic differences between two
jpayne@69 745 genomes. It takes a standard, unfiltered delta file as input,
jpayne@69 746 determines the best mapping between the two sequence sets, and
jpayne@69 747 reports on the breaks in that mapping.
jpayne@69 748
jpayne@69 749 USAGE:
jpayne@69 750 show-diff [options] <deltafile>
jpayne@69 751
jpayne@69 752 [options] type 'show-diff -h' for a list of options.
jpayne@69 753 <deltafile> the .delta output file from nucmer
jpayne@69 754
jpayne@69 755 OUTPUT:
jpayne@69 756 stdout Classified breakpoints are output one per line with
jpayne@69 757 the following types and column definitions. The first
jpayne@69 758 five columns of every row are seq ID, feature type,
jpayne@69 759 feature start, feature end, and feature length.
jpayne@69 760
jpayne@69 761 Feature Columns
jpayne@69 762
jpayne@69 763 IDR GAP gap-start gap-end gap-length-R gap-length-Q gap-diff
jpayne@69 764 IDR DUP dup-start dup-end dup-length
jpayne@69 765 IDR BRK gap-start gap-end gap-length
jpayne@69 766 IDR JMP gap-start gap-end gap-length
jpayne@69 767 IDR INV gap-start gap-end gap-length
jpayne@69 768 IDR SEQ gap-start gap-end gap-length prev-sequence next-sequence
jpayne@69 769
jpayne@69 770 Feature Types
jpayne@69 771
jpayne@69 772 [GAP] A gap between two mutually consistent ordered and
jpayne@69 773 oriented alignments. gap-length-R is the length of the
jpayne@69 774 alignment gap in the reference, gap-length-Q is the length of
jpayne@69 775 the alignment gap in the query, and gap-diff is the difference
jpayne@69 776 between the two gap lengths. If gap-diff is positive, sequence
jpayne@69 777 has been inserted in the reference. If gap-diff is negative,
jpayne@69 778 sequence has been deleted from the reference. If both
jpayne@69 779 gap-length-R and gap-length-Q are negative, the indel is a
jpayne@69 780 tandem duplication copy difference.
jpayne@69 781
jpayne@69 782 [DUP] A duplicated sequence in the reference that occurs more
jpayne@69 783 times in the reference than in the query. The coordinate
jpayne@69 784 columns specify the bounds and length of the
jpayne@69 785 duplication. These features are often bookended by BRK
jpayne@69 786 features if there is unique sequence bounding the duplication.
jpayne@69 787
jpayne@69 788 [BRK] An insertion in the reference of unknown origin, that
jpayne@69 789 indicates no query sequence aligns to the sequence bounded by
jpayne@69 790 gap-start and gap-end. Often found around DUP elements or at
jpayne@69 791 the beginning or end of sequences.
jpayne@69 792
jpayne@69 793 [JMP] A relocation event, where the consistent ordering of
jpayne@69 794 alignments is disrupted. The coordinate columns specify the
jpayne@69 795 breakpoints of the relocation in the reference, and the
jpayne@69 796 gap-length between them. A negative gap-length indicates the
jpayne@69 797 relocation occurred around a repetitive sequence, and a
jpayne@69 798 positive length indicates unique sequence between the
jpayne@69 799 alignments.
jpayne@69 800
jpayne@69 801 [INV] The same as a relocation event, except that both the
jpayne@69 802 ordering and orientation of the alignments are disrupted. Note
jpayne@69 803 that for JMP and INV, generally two features will be output,
jpayne@69 804 one for the beginning of the inverted region, and another for
jpayne@69 805 the end of the inverted region.
jpayne@69 806
jpayne@69 807 [SEQ] A translocation event that requires jumping to a new
jpayne@69 808 query sequence in order to continue aligning to the
jpayne@69 809 reference. If each input sequence is a chromosome, these
jpayne@69 810 features correspond to inter-chromosomal translocations.
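The GAP rules above can be expressed as a small Python function. This is a sketch under stated assumptions rather than the logic used by 'show-diff' itself: the function name 'describe_gap' is hypothetical, and gap-diff is assumed to be gap-length-R minus gap-length-Q, which is consistent with a positive value meaning inserted sequence in the reference.

```python
def describe_gap(gap_len_r, gap_len_q):
    """Interpret a GAP feature from its two gap lengths.

    Returns (description, is_tandem), where is_tandem is True when both
    gap lengths are negative.
    """
    # Assumption: gap-diff is reference gap length minus query gap length.
    gap_diff = gap_len_r - gap_len_q
    is_tandem = gap_len_r < 0 and gap_len_q < 0
    if gap_diff > 0:
        description = "insertion in reference"
    elif gap_diff < 0:
        description = "deletion from reference"
    else:
        description = "no length difference"
    return description, is_tandem


# Hypothetical gap lengths, for illustration only.
print(describe_gap(120, 20))
print(describe_gap(-10, -40))
```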
jpayne@69 811
jpayne@69 812 NOTES:
jpayne@69 813 The estimated number of features of a given type represents the
jpayne@69 814 number of breakpoints classified as bordering that feature. For
jpayne@69 815 inversions, for example, there will be a breakpoint at both the
jpayne@69 816 beginning and the end of each inversion, so the feature counts
jpayne@69 817 are roughly double the number of inversion events. In
jpayne@69 818 addition, all counts are estimates and do not represent the
jpayne@69 819 exact number of each evolutionary event.
jpayne@69 820
jpayne@69 821 Summing the fifth column (ignoring negative values) yields an
jpayne@69 822 estimate of the total inserted sequence in the
jpayne@69 823 reference. Summing the fifth column after removing DUP
jpayne@69 824 features yields an estimate of the total amount of unique
jpayne@69 825 (unaligned) sequence in the reference. Note that unaligned
jpayne@69 826 sequences are not counted, and could represent additional
jpayne@69 827 "unique" sequences. Use the 'dnadiff' script if you must
jpayne@69 828 recover this information. Finally, the -q option switches
jpayne@69 829 references for queries, and uses the query coordinates for the
jpayne@69 830 analysis.
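The two fifth-column sums described in these notes can be sketched in Python as follows. This is an illustration, not part of MUMmer: the function name and the sample rows are hypothetical, rows are assumed to be whitespace-delimited, and negative values are assumed to be ignored in both sums.

```python
def inserted_and_unique(lines):
    """Return (total_inserted, total_unique) from show-diff-style rows.

    total_inserted sums the fifth column over all features;
    total_unique does the same after dropping DUP features.
    """
    total_inserted = 0
    total_unique = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # not a feature row
        feature = fields[1]
        try:
            length = int(fields[4])
        except ValueError:
            continue  # fifth column is not a number, e.g. a header row
        if length < 0:
            continue  # negative values are ignored
        total_inserted += length
        if feature != "DUP":
            total_unique += length
    return total_inserted, total_unique


# Hypothetical rows: seq ID, feature type, start, end, length, extras.
rows = [
    "ref1 GAP 100 150 51 10 41",
    "ref1 DUP 400 499 100",
    "ref1 JMP 900 880 -19",
    "ref1 BRK 1 30 30",
]
totals = inserted_and_unique(rows)
```

With the sample rows, the JMP row is skipped for its negative length and the DUP row contributes only to the first total.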
jpayne@69 831
jpayne@69 832
jpayne@69 833 ** show-snps **
jpayne@69 834
jpayne@69 835 DESCRIPTION:
jpayne@69 836 This program reports polymorphisms contained in a delta encoded
jpayne@69 837 alignment file output by either nucmer or promer. It catalogs
jpayne@69 838 all of the single nucleotide polymorphisms (SNPs) and
jpayne@69 839 insertions/deletions within the delta file
jpayne@69 840 alignments. Polymorphisms are reported one per line, in a
jpayne@69 841 delimited fashion similar to show-coords. Pairing this program
jpayne@69 842 with the appropriate MUMmer tools can create an easy to use
jpayne@69 843 SNP pipeline for the rapid identification of putative SNPs
jpayne@69 844 between any two sequence sets.
jpayne@69 845
jpayne@69 846 USAGE:
jpayne@69 847 show-snps [options] <deltafile>
jpayne@69 848
jpayne@69 849 [options] type 'show-snps -h' for a list of options.
jpayne@69 850 <deltafile> the .delta output file from either nucmer or promer.
jpayne@69 851
jpayne@69 852 OUTPUT:
jpayne@69 853 stdout Standard output has column headers with the following
jpayne@69 854 meanings. Not all columns will be output by default;
jpayne@69 855 see 'show-snps -h' for the switches that control the output.
jpayne@69 856
jpayne@69 857 [P1] SNP position in the reference.
jpayne@69 858
jpayne@69 859 [SUB] Character in the reference.
jpayne@69 860
jpayne@69 861 [SUB] Character in the query.
jpayne@69 862
jpayne@69 863 [P2] SNP position in the query.
jpayne@69 864
jpayne@69 865 [BUFF] Distance from this SNP to the nearest mismatch (end of
jpayne@69 866 alignment, indel, SNP, etc) in the same alignment.
jpayne@69 867
jpayne@69 868 [DIST] Distance from this SNP to the nearest sequence end.
jpayne@69 869
jpayne@69 870 [R] Number of repeat alignments which cover this reference
jpayne@69 871 position, >0 means repetitive sequence.
jpayne@69 872
jpayne@69 873 [Q] Number of repeat alignments which cover this query
jpayne@69 874 position, >0 means repetitive sequence.
jpayne@69 875
jpayne@69 876 [LEN R] Length of the reference sequence.
jpayne@69 877
jpayne@69 878 [LEN Q] Length of the query sequence.
jpayne@69 879
jpayne@69 880 [CTX R] Surrounding context sequence in the reference.
jpayne@69 881
jpayne@69 882 [CTX Q] Surrounding context sequence in the query.
jpayne@69 883
jpayne@69 884 [FRM] Reading frame for the reference sequence and the
jpayne@69 885 reading frame for the query sequence respectively. Simply
jpayne@69 886 'forward' 1, or 'reverse' -1 for nucmer data.
jpayne@69 887
jpayne@69 888 [TAGS] The reference FastA ID and the query FastA ID.
jpayne@69 889
jpayne@69 890 NOTES:
jpayne@69 891 It is often helpful to run this with the -C option to ensure
jpayne@69 892 that SNPs are reported only from uniquely aligned regions.
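The [BUFF], [R] and [Q] columns can also be used to filter SNPs after the fact. The Python sketch below is an illustration only: it works on records that have already been parsed into dictionaries keyed by column tag, and both the function name and the 'min_buff' threshold of 20 are hypothetical choices, not MUMmer defaults.

```python
def confident_snps(snps, min_buff=20):
    """Keep SNPs outside repeats and away from other mismatches.

    Each record is a dict keyed by column tag, e.g. "P1", "BUFF", "R", "Q".
    """
    return [
        s for s in snps
        # R > 0 or Q > 0 means the position lies in repetitive sequence.
        if s["R"] == 0 and s["Q"] == 0 and s["BUFF"] >= min_buff
    ]


# Hypothetical records, for illustration only.
records = [
    {"P1": 1050, "P2": 980, "BUFF": 35, "R": 0, "Q": 0},
    {"P1": 2210, "P2": 2140, "BUFF": 3, "R": 0, "Q": 0},
    {"P1": 4800, "P2": 4702, "BUFF": 60, "R": 1, "Q": 0},
]
kept = confident_snps(records)
```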
jpayne@69 893
jpayne@69 894
jpayne@69 895 ** show-tiling **
jpayne@69 896
jpayne@69 897 DESCRIPTION:
jpayne@69 898 This program attempts to construct a tiling path out of the query
jpayne@69 899 contigs as mapped to the reference sequences. Given the delta
jpayne@69 900 alignment information of a few long reference sequences and many small
jpayne@69 901 query contigs, 'show-tiling' will determine the best location on a
jpayne@69 902 reference for each contig. Note that each contig may only be tiled
jpayne@69 903 once, so repetitive regions may cause this program some difficulty.
jpayne@69 904 This program is useful for aiding in the scaffolding and closure of an
jpayne@69 905 unfinished set of contigs, if a suitable, high similarity, reference
jpayne@69 906 genome is available. Or, if using promer, 'show-tiling' will help
jpayne@69 907 in the identification of syntenic regions and their contigs' mapping
jpayne@69 908 to the references.
jpayne@69 909
jpayne@69 910 USAGE:
jpayne@69 911 show-tiling [options] <deltafile>
jpayne@69 912
jpayne@69 913 [options] type 'show-tiling -h' for a list of options.
jpayne@69 914 <deltafile> the .delta output file from either nucmer or promer.
jpayne@69 915
jpayne@69 916 OUTPUT:
jpayne@69 917 stdout Standard output has 8 columns: start in reference, end in
jpayne@69 918 reference, gap between this contig and the next, length of this
jpayne@69 919 contig, alignment coverage of this contig, average percent
jpayne@69 920 identity of the alignments for this contig, orientation of this
jpayne@69 921 contig, contig ID. All matches to a reference are headed by the
jpayne@69 922 FASTA tag of that reference. Output with the -a option is the
jpayne@69 923 same as 'show-coords -cl' when run on nucmer data.
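The eight-column layout described above can be read into named records with a short Python sketch. It is an illustration rather than a reference parser: the 'Tile' type, the function name and the sample text are hypothetical, reference header lines are assumed to begin with '>', and the orientation column is kept as an uninterpreted string.

```python
from collections import namedtuple

Tile = namedtuple(
    "Tile",
    "ref start end gap length coverage identity orientation contig",
)


def parse_tiling(text):
    """Parse show-tiling-style output into a list of Tile records."""
    tiles = []
    ref = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumed header form: '>' followed by the reference ID.
            parts = line[1:].split()
            ref = parts[0] if parts else None
            continue
        f = line.split()
        if len(f) != 8:
            continue  # not an 8-column tiling row
        tiles.append(Tile(
            ref, int(f[0]), int(f[1]), int(f[2]), int(f[3]),
            float(f[4]), float(f[5]), f[6], f[7],
        ))
    return tiles


# Hypothetical sample, for illustration only.
sample = """\
>ref1
1 1200 50 1200 100.00 98.50 + contigA
1251 2000 -10 750 99.00 97.20 - contigB
"""
tiles = parse_tiling(sample)
```

Each record carries the reference it was tiled against, so tiles can be grouped by 'ref' when ordering contigs for scaffolding.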
jpayne@69 924
jpayne@69 925 NOTES:
jpayne@69 926 When run with the -x option, 'show-tiling' will produce an XML output
jpayne@69 927 format that can be accepted by TIGR's open source scaffolding software
jpayne@69 928 'Bambus' as contig linking information.
jpayne@69 929
jpayne@69 930
jpayne@69 931 -- CONTACT INFORMATION --
jpayne@69 932
jpayne@69 933 Please address questions and bug reports to: <mummer-help@lists.sourceforge.net>
jpayne@69 934
jpayne@69 935 Last Revised May 12, 2005