comparison CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/opt/mummer-3.23/docs/run-mummer1.README @ 69:33d812a61356

planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author jpayne
date Tue, 18 Mar 2025 17:55:14 -0400
67:0e9998148a16 69:33d812a61356
1 -=- run-mummer1 (MUMmer1.0) README -=-
2
3 ** NOTE **
4 This manual is outdated, please refer to the HTML documentation included in
5 this distribution or at:
6
7 http://mummer.sourceforge.net
8 http://mummer.sourceforge.net/manual
9 http://mummer.sourceforge.net/examples
10
11 This is the README for the original MUMmer 1.0 system, which is
12 included in the MUMmer 3 package. It is slower and uses more memory,
13 but it does have some slightly different functions and output so we
14 make it available.
15
16
17
18 MUMmer 1.0 code and documentation are copyright (c) 1999 by The Institute
19 for Genomic Research. The principal architect for the system was
20 Arthur Delcher.
21
22 This directory contains the source code for the MUMmer program for
23 aligning long DNA sequences. If you use this code in any publication,
24 please cite the following:
25 A.L. Delcher, S. Kasif, R.D. Fleischmann, J. Peterson,
26 O. White, and S.L. Salzberg. Alignment of whole genomes.
27 Nucleic Acids Research, 27:11 (1999), 2369-2376.
28
29 MUMmer works on Unix only. This README file is the only documentation
30 besides the code itself. Since it is free, we cannot provide any
31 other support. The system is very easy to use, but you need to invest
32 a few minutes up front reading this file and figuring out how to
33 interpret the output. If you discover bugs, we would be interested in
34 hearing about them so we can correct them. Address bug reports to
35 <mummer-help@lists.sourceforge.net>. The system uses a LOT of RAM - if
36 it crashes for that reason, that's not a bug. We recommend at least
37 512 MB to align most pairs of bacterial genomes, and 1 GB or more may be
38 required.
39
40 To use this system, first compile it by typing 'make' at the
41 command line. There is a script, 'run-mummer1.csh', that runs all
42 the steps of aligning two genomes. The script takes these arguments:
43
44 run-mummer1.csh <genome1> <genome2> <tag> [-r]
45
46 The two genomes must be DNA sequences in FASTA format. Multi-FASTA
47 files don't work. The tag is used to create 4 output files,
48 all with 'tag' as a prefix. -r will reverse complement <genome2>.
49
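Because multi-FASTA input is not supported, it can help to check each
genome file before starting a long run. The following is only a minimal
sketch in Python, not part of MUMmer: the function names
count_fasta_records and is_single_fasta are hypothetical, and it assumes
that a record is any line starting with '>'.

```python
import io


def count_fasta_records(handle):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))


def is_single_fasta(handle):
    """Return True when the input holds exactly one FASTA record."""
    return count_fasta_records(handle) == 1


if __name__ == "__main__":
    # In-memory stand-ins for real genome files (made-up sequences).
    single = io.StringIO(">genome1\nACGTACGT\nTTGACA\n")
    multi = io.StringIO(">contig1\nACGT\n>contig2\nGGCC\n")
    print(is_single_fasta(single))  # True
    print(is_single_fasta(multi))   # False
```

With a real file, pass an open text handle, for example
is_single_fasta(open("genome1.fasta")), where the file name is a
placeholder.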
50 Of the four output files, two are intended for your inspection.
51 <tag>.errorsgaps lists the alignment of all the MUMs (maximal unique
52 matches - read the paper for definitions), and includes the longest
53 ascending sequence of MUMs first. This sequence is the best alignment
54 of the two genomes found by the program. In between the MUMs are gaps,
55 which are aligned using a Smith-Waterman implementation of our own.
56 Those Smith-Waterman alignments are contained in the file <tag>.align.
57 This can be a very long file if there are lots of gaps. If either
58 of the two gaps is too long (over 5000bp), then the alignment is
59 not performed.
60
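The idea of a "longest ascending sequence of MUMs" can be illustrated
with a plain longest-increasing-subsequence computation. This is a
sketch under assumptions, not the code in gaps.cc: it treats each MUM as
a hypothetical (pos1, pos2, length) tuple, sorts by position in genome 1,
and keeps the longest chain whose genome 2 positions strictly increase.
It ignores MUM lengths and overlaps, which the real program may handle
differently.

```python
from bisect import bisect_left


def longest_ascending_mums(mums):
    """Return the longest chain of MUMs ascending in both genomes.

    mums: iterable of (pos1, pos2, length) tuples (hypothetical layout).
    """
    ordered = sorted(mums)          # ascending by pos1
    tails = []                      # tails[k]: smallest pos2 ending a chain of k+1 MUMs
    tail_idx = []                   # index into `ordered` of that chain end
    prev = [None] * len(ordered)    # back-pointers for reconstructing the chain

    for i, (_, pos2, _) in enumerate(ordered):
        k = bisect_left(tails, pos2)
        if k == len(tails):
            tails.append(pos2)
            tail_idx.append(i)
        else:
            tails[k] = pos2
            tail_idx[k] = i
        prev[i] = tail_idx[k - 1] if k > 0 else None

    # Walk the back-pointers from the end of the longest chain.
    chain = []
    i = tail_idx[-1] if tail_idx else None
    while i is not None:
        chain.append(ordered[i])
        i = prev[i]
    chain.reverse()
    return chain


if __name__ == "__main__":
    # Made-up MUMs; the first one is out of order in genome 2.
    mums = [(100, 500, 25), (200, 150, 30), (300, 250, 22), (400, 900, 40)]
    print(longest_ascending_mums(mums))
    # [(200, 150, 30), (300, 250, 22), (400, 900, 40)]
```

The bisect-based approach runs in O(n log n) time, which matters when a
pair of genomes yields many thousands of MUMs.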
61 IMPORTANT: the performance of the program can critically depend on the
62 minimum MUM length you use. The default is 20bp. If you want to
63 change it, do the following: edit the file run-mummer1.csh. Add a new
64 length switch to the 'mummer' call.
65
66 The other file - one that we often spend lots of time analyzing - is
67 <tag>.errorsgaps. The lines in that file are the MUMs, for example:
68 46989 271588 23 none 3262 3304 1022
69 47013 271612 24 none 1 1 1
70 Columns 1 and 2 are the positions of a MUM in genomes 1 and 2. Column 3
71 is the MUM length (24bp on the second line). Column 4 is the overlap
72 from the previous MUM - usually this will be 'none' except when there
73 are repeats present. Columns 5 and 6 show the gaps from the *end*
74 of the previous MUM. Column 7 shows the number of errors - indels
75 or mismatches - in the S-W alignment of the gap before this MUM.
76 Hence in the two lines shown here, the second line is a MUM of 24bp,
77 and it follows the previous MUM after a gap of just 1bp in each genome.
78 This indicates a single nucleotide polymorphism.
79
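The column layout described above can be checked with a small parser.
This is a sketch only, not a MUMmer tool: the names MumLine,
parse_mum_line, gap_from_previous and looks_like_snp are hypothetical.
It assumes seven whitespace-separated fields per line, that columns 1
and 2 are MUM start positions, and that column 7 is always an integer
(lines whose gap was too long to align may look different).

```python
from typing import NamedTuple


class MumLine(NamedTuple):
    pos1: int      # column 1: position in genome 1
    pos2: int      # column 2: position in genome 2
    length: int    # column 3: MUM length in bp
    overlap: str   # column 4: kept as raw text, usually 'none'
    gap1: int      # column 5: gap in genome 1 from end of previous MUM
    gap2: int      # column 6: gap in genome 2 from end of previous MUM
    errors: int    # column 7: errors in the alignment of that gap


def parse_mum_line(line):
    """Split one MUM line into typed fields (assumes seven columns)."""
    f = line.split()
    return MumLine(int(f[0]), int(f[1]), int(f[2]), f[3],
                   int(f[4]), int(f[5]), int(f[6]))


def gap_from_previous(prev, cur):
    """Recompute columns 5 and 6 of `cur` from the end of `prev`."""
    end1 = prev.pos1 + prev.length - 1
    end2 = prev.pos2 + prev.length - 1
    return cur.pos1 - end1 - 1, cur.pos2 - end2 - 1


def looks_like_snp(m):
    """A 1bp gap in both genomes with one error suggests a SNP."""
    return m.gap1 == 1 and m.gap2 == 1 and m.errors == 1


if __name__ == "__main__":
    prev = parse_mum_line("46989 271588 23 none 3262 3304 1022")
    cur = parse_mum_line("47013 271612 24 none 1 1 1")
    print(gap_from_previous(prev, cur))  # (1, 1)
    print(looks_like_snp(cur))           # True
```

For the two example lines, the recomputed gaps agree with columns 5 and
6 of the second line, which is the single-nucleotide case described
above.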
80 Files in this directory:
81
82 annotate.cc Adds alignment info to gaps file produced by gaps
83 or gaps2 . Reads frag info from the file named on the '>'
84 lines of the gap file. The word "reverse" after that name
85 will make that sequence get reverse-complemented.
86 Also produces file witherrors.gaps which is the same as the
87 gaps input file, but with an extra column giving the number of
88 errors.
89
90 gaps.cc Finds longest consistent set of matches in list
91 produced by the mummer1 program.
92
93 run-mummer1.csh Script to run alignment programs. Format is:
94 run-mummer1.csh <genome1> <genome2> <tag> [-r]
95 <tag> will be used to make output files: <tag>.out, <tag>.gaps,
96 <tag>.errorsgaps and <tag>.align. -r will reverse complement <genome2>.
97
98 Email questions, comments or bug reports to: <mummer-help@lists.sourceforge.net>
99