jpayne@69: -=- run-mummer1 (MUMmer1.0) README -=-
jpayne@69: 
jpayne@69: ** NOTE **
jpayne@69: This manual is outdated; please refer to the HTML documentation included in
jpayne@69: this distribution or at:
jpayne@69: 
jpayne@69:    http://mummer.sourceforge.net
jpayne@69:    http://mummer.sourceforge.net/manual
jpayne@69:    http://mummer.sourceforge.net/examples
jpayne@69: 
jpayne@69: This is the README for the original MUMmer 1.0 system, which is
jpayne@69: included in the MUMmer 3 package.  It is slower and uses more memory,
jpayne@69: but it does have some slightly different functions and output so we
jpayne@69: make it available.  
jpayne@69: 
jpayne@69: 
jpayne@69: 
jpayne@69: MUMmer 1.0 code and documentation are copyright (c) 1999 by The Institute
jpayne@69: for Genomic Research.  The principal architect for the system was
jpayne@69: Arthur Delcher.
jpayne@69: 
jpayne@69: This directory contains the source code for the MUMmer program for
jpayne@69: aligning long DNA sequences.  If you use this code in any publication,
jpayne@69: please cite the following:
jpayne@69:    A.L. Delcher, S. Kasif, R.D. Fleischmann, J. Peterson,
jpayne@69:    O. White, and S.L. Salzberg.  Alignment of whole genomes.
jpayne@69:    Nucleic Acids Research, 27:11 (1999), 2369-2376.
jpayne@69: 
jpayne@69: MUMmer works on Unix only.  This README file is the only documentation
jpayne@69: besides the code itself.  Since it is free, we cannot provide any
jpayne@69: other support.  The system is very easy to use, but you need to invest
jpayne@69: a few minutes up front reading this file and figuring out how to
jpayne@69: interpret the output.  If you discover bugs, we would be interested in
jpayne@69: hearing about them so we can correct them.  Address bug reports to
jpayne@69: <mummer-help@lists.sourceforge.net>. The system uses a LOT of RAM - if
jpayne@69: it crashes for that reason, that's not a bug.  We recommend at least
jpayne@69: 512MB to align most pairs of bacterial genomes, and 1GB or more may be
jpayne@69: required.
jpayne@69: 
jpayne@69: To use this system, first compile it by typing 'make' at the
jpayne@69: command line.  There is a script, 'run-mummer1.csh', that runs all
jpayne@69: the steps of aligning two genomes.  The script takes these arguments:
jpayne@69: 
jpayne@69:      run-mummer1.csh <genome1> <genome2> <tag> [-r]
jpayne@69: 
jpayne@69: The two genomes must be DNA sequences in FASTA format.  Multi-FASTA
jpayne@69: files don't work.  The tag is used to create 4 output files,
jpayne@69: all with 'tag' as a prefix.  -r  will reverse complement <genome2>.
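
Because multi-FASTA files don't work, it can help to check each input
before running the script.  The following is a minimal sketch, not part of
the MUMmer distribution; the function names are hypothetical, and it assumes
that every record in a FASTA file begins with a line starting with '>'.

```python
def count_fasta_records(lines):
    """Count FASTA header lines (lines starting with '>').

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    return sum(1 for line in lines if line.startswith(">"))


def is_single_fasta(lines):
    """Return True if the input holds exactly one FASTA record."""
    return count_fasta_records(lines) == 1
```

To check a file on disk, pass an open handle, for example
`is_single_fasta(open("genome1.fasta"))`, where genome1.fasta is a
placeholder file name.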
jpayne@69: 
jpayne@69: Of the four output files, two are intended for your inspection.
jpayne@69: <tag>.align lists the alignment of all the MUMs (maximal unique
jpayne@69: matches - read the paper for definitions), and includes the longest
jpayne@69: ascending sequence of MUMs first.  This sequence is the best alignment
jpayne@69: of the two genomes found by the program.  In between the MUMs are gaps,
jpayne@69: which are aligned using a Smith-Waterman implementation of our own.
jpayne@69: Those Smith-Waterman alignments are contained in the file <tag>.align.
jpayne@69: This can be a very long file if there are lots of gaps.  If either
jpayne@69: of the two gaps is too long (over 5000bp), then the alignment is
jpayne@69: not performed.
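
To illustrate what a "longest ascending sequence of MUMs" means, here is a
small sketch.  It is not the algorithm in gaps.cc: it is a simple quadratic
dynamic program that assumes each MUM is a (pos1, pos2, length) tuple and
that the best chain is the one with the most MUMs whose positions increase
in both genomes.  The real program may weight or break ties differently.

```python
def longest_ascending_mums(mums):
    """Return the longest chain of MUMs ascending in both genomes.

    mums: list of (pos1, pos2, length) tuples (assumed layout).
    """
    ordered = sorted(mums)          # sort by position in genome 1
    n = len(ordered)
    if n == 0:
        return []
    best = [1] * n                  # best[i]: longest chain ending at MUM i
    prev = [-1] * n                 # prev[i]: predecessor of MUM i in that chain
    for i in range(n):
        for j in range(i):
            ascending = (ordered[j][0] < ordered[i][0]
                         and ordered[j][1] < ordered[i][1])
            if ascending and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j
    # Walk back from the end of the longest chain.
    end = max(range(n), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(ordered[end])
        end = prev[end]
    chain.reverse()
    return chain
```

A MUM whose genome 2 position falls out of order, such as (50, 40, 25)
between (10, 100, 20) and (90, 150, 30), is left out of the chain.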
jpayne@69: 
jpayne@69: IMPORTANT: the performance of the program can critically depend on the
jpayne@69: minimum MUM length you use.  The default is 20bp.  If you want to 
jpayne@69: change it, do the following:  edit the file run-mummer1.csh. Add a new
jpayne@69: length switch to the 'mummer' call.
jpayne@69: 
jpayne@69: The other file - one that we often spend lots of time analyzing - is
jpayne@69: <tag>.errorsgaps.  The lines in that file are the MUMs, for example:
jpayne@69:    46989   271588     23    none   3262   3304    1022
jpayne@69:    47013   271612     24    none      1      1       1
jpayne@69: Columns 1 and 2 are the positions of a MUM in genomes 1 and 2.  Column 3
jpayne@69: is the length of the MUM (23bp, then 24bp).  Column 4 is the overlap
jpayne@69: from the previous MUM - usually this will be 'none' except when there
jpayne@69: are repeats present.  Columns 5 and 6 show the gaps from the *end*
jpayne@69: of the previous MUM.  Column 7 shows the number of errors - indels
jpayne@69: or mismatches - in the S-W alignment of the gap before this MUM.
jpayne@69: Hence in the two lines shown here, the second line is a MUM of 24bp,
jpayne@69: and it follows the previous MUM after a gap of just 1bp in each genome.
jpayne@69: This indicates a single nucleotide polymorphism.
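
The column layout above can be read with a short script.  This is a sketch
under assumptions, not a tool shipped with MUMmer: the names MumLine,
parse_mum_line and is_snp_candidate are hypothetical, a MUM line is assumed
to have exactly seven whitespace-separated fields, and any field that is not
a number (such as 'none') is stored as None.  Lines that do not fit this
shape are skipped.

```python
from typing import NamedTuple, Optional


class MumLine(NamedTuple):
    pos1: int                # column 1: MUM position in genome 1
    pos2: int                # column 2: MUM position in genome 2
    length: int              # column 3: MUM length in bp
    overlap: Optional[int]   # column 4: overlap with previous MUM, None if 'none'
    gap1: Optional[int]      # column 5: gap in genome 1 from end of previous MUM
    gap2: Optional[int]      # column 6: gap in genome 2 from end of previous MUM
    errors: Optional[int]    # column 7: errors in the alignment of the gap


def _to_int(field):
    """Convert a field to int, or None if it is not a number."""
    try:
        return int(field)
    except ValueError:
        return None


def parse_mum_line(line):
    """Parse one MUM line; return None for lines that do not fit."""
    fields = line.split()
    if len(fields) != 7:
        return None
    pos1, pos2, length = (_to_int(f) for f in fields[:3])
    if pos1 is None or pos2 is None or length is None:
        return None
    overlap, gap1, gap2, errors = (_to_int(f) for f in fields[3:])
    return MumLine(pos1, pos2, length, overlap, gap1, gap2, errors)


def is_snp_candidate(mum):
    """True if a 1bp gap in each genome with one error precedes this MUM."""
    return mum.gap1 == 1 and mum.gap2 == 1 and mum.errors == 1
```

For the two example lines, the second is flagged as an SNP candidate and the
first is not.  The positions are also consistent: 46989 + 23 + 1 = 47013.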
jpayne@69: 
jpayne@69: Files in this directory:
jpayne@69: 
jpayne@69:    annotate.cc  Adds alignment info to gaps file produced by  gaps
jpayne@69:      or  gaps2 .  Reads frag info from the file named on the '>'
jpayne@69:      lines of the gap file.  The word "reverse" after that name
jpayne@69:      will make that sequence get reverse-complemented.
jpayne@69:      Also produces file  witherrors.gaps  which is the same as the
jpayne@69:      gaps input file, but with an extra column of number of
jpayne@69:      errors.
jpayne@69: 
jpayne@69:    gaps.cc  Finds longest consistent set of matches in list
jpayne@69:      produced by mummer1 program.
jpayne@69: 
jpayne@69:    run-mummer1.csh  Script to run alignment programs.  Format is:
jpayne@69:      run-mummer1.csh <genome1> <genome2> <tag> [-r]
jpayne@69:      <tag> will be used to make output files:  <tag>.out , <tag>.gaps ,
jpayne@69:      <tag>.errorsgaps and  <tag>.align .  -r  will reverse complement <genome2>
jpayne@69: 
jpayne@69: Email questions, comments or bug reports to: <mummer-help@lists.sourceforge.net>
jpayne@69: