-=- run-mummer1 (MUMmer1.0) README -=-

** NOTE **
This manual is outdated; please refer to the HTML documentation included in
this distribution or at:

    http://mummer.sourceforge.net
    http://mummer.sourceforge.net/manual
    http://mummer.sourceforge.net/examples

This is the README for the original MUMmer 1.0 system, which is
included in the MUMmer 3 package.  It is slower and uses more memory,
but it has some slightly different functions and output, so we
make it available.



MUMmer 1.0 code and documentation are copyright (c) 1999 by The Institute
for Genomic Research.  The principal architect for the system was
Arthur Delcher.

This directory contains the source code for the MUMmer program for
aligning long DNA sequences.  If you use this code in any publication,
please cite the following:
    A.L. Delcher, S. Kasif, R.D. Fleischmann, J. Peterson,
    O. White, and S.L. Salzberg.  Alignment of whole genomes.
    Nucleic Acids Research, 27:11 (1999), 2369-2376.

MUMmer works on Unix only.  This README file is the only documentation
besides the code itself.  Since it is free, we cannot provide any
other support.  The system is very easy to use, but you need to invest
a few minutes up front reading this file and figuring out how to
interpret the output.  If you discover bugs, we would be interested in
hearing about them so we can correct them.  Address bug reports to
.  The system uses a LOT of RAM - if
it crashes for that reason, that's not a bug.  We recommend at least
512 MB to align most pairs of bacterial genomes, and 1 GB or more may be
required.

To use this system, first compile it by typing 'make' at the
command line.  There is a script, 'run-mummer1.csh', that runs all
the steps of aligning two genomes.  The script takes these arguments:

    run-mummer1.csh [-r]

The two genomes must be DNA sequences in FASTA format.  Multi-FASTA
files don't work.  The tag is used to create 4 output files,
all with 'tag' as a prefix.  -flip will reverse complement .

Of the four output files, two are intended for your inspection.
.errorsgaps lists the alignment of all the MUMs (maximal unique
matches - read the paper for definitions), and includes the longest
ascending sequence of MUMs first.  This sequence is the best alignment
of the two genomes found by the program.  In between the MUMs are gaps,
which are aligned using our own Smith-Waterman implementation.
Those Smith-Waterman alignments are contained in the file .align.
This can be a very long file if there are lots of gaps.  If the gap in
either genome is too long (over 5000bp), then the alignment is
not performed.

IMPORTANT: the performance of the program can critically depend on the
minimum MUM length you use.  The default is 20bp.  If you want to
change it, edit the file run-mummer1.csh and add a new
length switch to the 'mummer' call.

The other file - one that we often spend lots of time analyzing - is
.errorsgaps.  The lines in that file are the MUMs, for example:
    46989   271588    23  none   3262   3304   1022
    47013   271612    24  none      1      1      1
Columns 1 and 2 are the positions of a MUM in genomes 1 and 2.  The
MUM on the second line is 24bp in length, shown in column 3.
Column 4 is the overlap
from the previous MUM - usually this will be 'none' except when there
are repeats present.  Columns 5 and 6 show the gaps, in genomes 1 and 2,
from the *end* of the previous MUM.  Column 7 shows the number of errors -
indels or mismatches - in the S-W alignment of the gap before this MUM.
Hence in the two lines shown here, the second line is a MUM of 24bp,
and it follows the previous MUM after a gap of just 1bp in each genome.
This indicates a single nucleotide polymorphism.

Files in this directory:

annotate.cc   Adds alignment info to the gaps file produced by gaps
   or gaps2.  Reads frag info from the file named on the '>'
   lines of the gap file.  The word "reverse" after that name
   will make that sequence get reverse-complemented.
   Also produces the file witherrors.gaps, which is the same as the
   gaps input file, but with an extra column giving the number of
   errors.

gaps.cc   Finds the longest consistent set of matches in the list
   produced by the mummer1 program.

run-mummer1.csh  Script to run alignment programs.  Format is:
       run-mummer1.csh [-flip]
   will be used to make output files: .out , .gaps
   and .align .  -r will reverse complement

Email questions, comments or bug reports to:
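
Example: reading the MUM lines of the .errorsgaps file

The seven-column MUM lines described earlier in this README are easy to
post-process with a short script.  The Python sketch below is not part of
MUMmer; the names MumLine, parse_mum_line and looks_like_snp are
hypothetical and exist only for this illustration.  It assumes the columns
are separated by whitespace, and that any non-numeric value in columns 4
to 7 (such as 'none') means "no value".  Lines that do not have seven
fields, such as '>' lines, are skipped.

```python
from typing import NamedTuple, Optional


class MumLine(NamedTuple):
    """One MUM line, following the column description in this README."""
    pos1: int                # column 1: position of the MUM in genome 1
    pos2: int                # column 2: position of the MUM in genome 2
    length: int              # column 3: MUM length in bp
    overlap: Optional[int]   # column 4: overlap with previous MUM ('none' -> None)
    gap1: Optional[int]      # column 5: gap in genome 1 from end of previous MUM
    gap2: Optional[int]      # column 6: gap in genome 2 from end of previous MUM
    errors: Optional[int]    # column 7: errors in the alignment of that gap


def _int_or_none(field: str) -> Optional[int]:
    # Assumption: anything that is not an integer means "no value".
    try:
        return int(field)
    except ValueError:
        return None


def parse_mum_line(line: str) -> Optional[MumLine]:
    """Return a MumLine, or None if the line is not a seven-column MUM line."""
    fields = line.split()
    if len(fields) != 7:
        return None
    try:
        pos1, pos2, length = (int(f) for f in fields[:3])
    except ValueError:
        return None
    overlap, gap1, gap2, errors = (_int_or_none(f) for f in fields[3:])
    return MumLine(pos1, pos2, length, overlap, gap1, gap2, errors)


def looks_like_snp(mum: MumLine) -> bool:
    """A 1bp gap in each genome with one error, as in the README example."""
    return mum.gap1 == 1 and mum.gap2 == 1 and mum.errors == 1


# The two example lines from this README.
sample = """\
46989 271588 23 none 3262 3304 1022
47013 271612 24 none 1 1 1
"""

mums = [m for m in map(parse_mum_line, sample.splitlines()) if m is not None]
for m in mums:
    note = "possible SNP" if looks_like_snp(m) else ""
    print(m.pos1, m.pos2, m.length, note)
```

To use it on real output, replace sample.splitlines() with the lines of
your own .errorsgaps file.  The SNP test is only the heuristic suggested
by the example above, so treat its results as candidates to inspect.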