comparison CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/opt/mummer-3.23/docs/mapview.README @ 69:33d812a61356

planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author jpayne
date Tue, 18 Mar 2025 17:55:14 -0400
67:0e9998148a16 69:33d812a61356
1
2 -----------------------------------------------------------------------
3 MapView Utility software
4 Version 1.0
5 Contact: <mummer-help@lists.sourceforge.net>
6 Web: http://mummer.sourceforge.net
7 -----------------------------------------------------------------------
8 LICENCE: open source, included with MUMmer 3.0 and above
9 USAGE: see section 4, below.
10
11 1. WHAT IS MAPVIEW?
12 ----------------
13
14 MapView is a utility program for displaying sequence alignments
15 as provided by NUCmer or PROmer. For further information regarding these
16 programs, please see the documentation and code at
17 http://mummer.sourceforge.net . MapView takes the output from
18 these programs and converts it to a FIG, PDF or PS file. It can
19 break the output into multiple files for easier viewing and printing.
20 Note that for very large reference genomes, FIG files viewed in the
21 xfig program (Unix) may be the only option that allows the entire
22 display to be stored in one file.
23
24 2. SYSTEM REQUIREMENTS
25 -------------------
26 - Perl interpreter version 5.0 or greater.
27 - fig2dev utility (see www.linux.org for transfig rpm package and
28 installation documentation)
29 - xfig viewer to visualize the FIG format (see www.linux.org regarding
30 xfig rpm package)
31 - Adobe Acrobat Reader for reading PDF formats (free from www.adobe.com)
32 - Ghostscript PostScript interpreter to view PDF and PostScript documents
33 (on www.linux.org, look for the 'gv' rpm package)
34
35 3. INPUT
36 -----
37
38 The input to MapView is the table generated by the "show-coords"
39 program in MUMmer. It is important to use the -r -l options in
40 show-coords in order to have the proper format for MapView. For PROmer
41 output, it can be very helpful to run show-coords with the -k option as
42 well, to reduce the redundant matches often found in highly similar
43 regions. However, this option does not always select the appropriate
44 reading frame.
45
46 Both PROmer and NUCmer write their output in a specific format that
47 can be found in the *.cluster and *.delta files. To translate this
48 output into a human-readable format, the "show-coords" program
49 parses the delta alignment output of either NUCmer or PROmer and
50 displays summary information for each alignment. (Note that
51 PROmer and NUCmer include command line options that allow them to
52 generate the same summary information without running "show-coords"
53 separately.) The output of show-coords is then used by MapView to
54 create a FIG, PDF or PS file.
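
The steps described above can be sketched as a small Python helper that
only builds the command lines and does not run them. The file names,
the "run1"-style prefix and the helper name build_pipeline are
hypothetical; the -r, -l and -k options of show-coords and the -f and -p
options of mapview are the ones named in this README, and -p for
nucmer/promer is assumed to set the output prefix as in MUMmer 3.

```python
from typing import List, Optional, Tuple

# Each step is (argv, stdout_path). stdout_path is None when the tool
# writes its own output files instead of printing to stdout.
Step = Tuple[List[str], Optional[str]]


def build_pipeline(ref: str, qry: str, prefix: str,
                   protein: bool = False) -> List[Step]:
    """Return the commands for aligner -> show-coords -> mapview.

    Nothing is executed here; the caller decides how to run each step.
    """
    aligner = "promer" if protein else "nucmer"
    delta = f"{prefix}.delta"
    coords = f"{prefix}.coords"

    show_coords = ["show-coords", "-r", "-l"]
    if protein:
        # -k can reduce redundant PROmer matches, but it does not always
        # select the appropriate reading frame.
        show_coords.append("-k")
    show_coords.append(delta)

    return [
        ([aligner, "-p", prefix, ref, qry], None),
        (show_coords, coords),  # show-coords prints its table to stdout
        (["mapview", "-f", "pdf", "-p", prefix, coords], None),
    ]
```

Returning argument lists rather than shell strings keeps quoting out of
the picture; each step could then be passed to subprocess.run, with the
second step's stdout redirected to the coords file.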
55
56 An example of the standard output of show-coords, which is used
57 directly as input for MapView, is below. This shows just the top
58 few lines of a large file created by aligning an assembly of
59 Drosophila pseudoobscura (165 million bases) to chromosome arm 2R of
60 Drosophila melanogaster:
61
62 /usr/local/db/euk/internal/d_melanogaster/na_arm2R_genomic_dmel_RELEASE3.FASTA celera_scaffs.fa
63 PROMER
64
65     [S1]     [E1] |     [S2]     [E2] |  [LEN 1]  [LEN 2] |  [% IDY]  [% SIM]  [% STP] |  [LEN R]  [LEN Q] |  [COV R]  [COV Q] | [FRM]  [TAGS]
66 ========================================================================================================================================================
67     2540     2806 |     3216     3473 |      267      258 |    46.67    50.00     2.78 | 20302755     8916 |     0.00     2.89 |  2  3  2R 3211358
68     2540     2806 |     1939     2196 |      267      258 |    46.67    51.11     2.22 | 20302755     2375 |     0.00    10.86 |  2  1  2R 3211430
69     2540     2893 |    20172    19852 |      354      321 |    39.52    45.16     3.23 | 20302755    25647 |     0.00     1.25 |  2 -1  2R 3215406
70     2806     2534 |     5291     5536 |      273      246 |    41.94    47.31     3.76 | 20302755    12414 |     0.00     1.98 | -3  2  2R 3211507
71 ....
72
73 For more information and an explanation of this format, please see
74 the MUMmer manual at http://mummer.sourceforge.net/manual
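
As an illustration of the layout shown in the example, the following
Python sketch splits one PROmer data row into named fields. It is not
part of MapView; the function name parse_promer_row and the dictionary
keys are hypothetical, and it assumes exactly the seven '|'-separated
column groups of the example, without any viewer line numbers in front.

```python
from typing import Dict, List, Union

Value = Union[int, float, List[int], List[str]]


def parse_promer_row(line: str) -> Dict[str, Value]:
    """Parse one data row laid out like the PROmer example above."""
    groups = [group.split() for group in line.split("|")]
    if len(groups) != 7:
        raise ValueError("expected 7 '|'-separated column groups")

    s1, e1 = groups[0]
    s2, e2 = groups[1]
    len1, len2 = groups[2]
    idy, sim, stp = groups[3]
    len_r, len_q = groups[4]
    cov_r, cov_q = groups[5]
    # The last group holds the two reading frames followed by the tags.
    frm_r, frm_q, *tags = groups[6]

    return {
        "s1": int(s1), "e1": int(e1),
        "s2": int(s2), "e2": int(e2),
        "len1": int(len1), "len2": int(len2),
        "pct_idy": float(idy), "pct_sim": float(sim), "pct_stp": float(stp),
        "len_r": int(len_r), "len_q": int(len_q),
        "cov_r": float(cov_r), "cov_q": float(cov_q),
        "frames": [int(frm_r), int(frm_q)],
        "tags": tags,
    }
```

Splitting on '|' first and on whitespace second means the sketch does
not depend on how many spaces separate the columns.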
75
76
77 4. USAGE
78 -----
79
80 USAGE: mapview [options] <coords file> [UTR coords] [CDS coords]
81
82 The optional UTR and CDS coordinates files, which are based on the
83 reference sequence, should be in GFF format. These contain
84 the coordinates of coding sequences and untranslated regions for
85 genes on the reference genome, and will be displayed graphically
86 if provided.
87
88 GFF format is a tab-delimited file format with the following columns:
89 <seq_ID> <source> <exon type> <start> <end> <score> <strand> <frame> <gene_name>
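
To make the column order concrete, here is a minimal Python sketch that
reads one tab-delimited line with the nine columns listed above. The
names GffRecord and parse_gff_line, and the sample values used to try
it, are hypothetical; only the column order comes from this README.

```python
from typing import NamedTuple


class GffRecord(NamedTuple):
    seq_id: str
    source: str
    exon_type: str
    start: int
    end: int
    score: str   # kept as text, since the column may hold a placeholder
    strand: str
    frame: str
    gene_name: str


def parse_gff_line(line: str) -> GffRecord:
    """Split one tab-delimited line into the nine columns listed above."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    (seq_id, source, exon_type, start, end,
     score, strand, frame, gene_name) = fields
    return GffRecord(seq_id, source, exon_type, int(start), int(end),
                     score, strand, frame, gene_name)
```

For example, parse_gff_line("2R\tannot\tCDS\t2540\t2806\t.\t+\t0\tgeneA")
yields a record whose seq_id is "2R", the value that would have to match
the reference sequence ID.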
90
91 Options:
92 -f <output format> : pdf, ps or fig. The default is "fig".
93
94 -x1 <left coord> -x2 <right coord> : only display the region on
95 the reference genome between positions x1 and x2. By default the
96 whole sequence will be displayed.
97
98 -d <no_bp> : the maximum distance (in bp) between two successive
99 matches for them to be linked. Default is 50000 bp. To explain:
100 the query sequence may contain multiple contigs. All matches from
101 the same contig are linked by drawing lines between each successive
102 pair of matches. If the matches occur too far apart, then this can
103 get very messy. Therefore we don't draw a line if the matches are
104 further apart than specified by this parameter. This is especially
105 important if the reference genome is very long and all the output
106 is stored in a single graphical file.
107
108 -m <mag> : set the magnification at which the figure is rendered to
109 mag. The default is 1.0; this is an option for fig2dev which is
110 used to transform the fig files to pdf or ps files.
111
112 -n <no of output files> : the default is 10. The purpose of this
113 parameter is to avoid making figures that are too 'large', in the
114 sense that they cannot be converted to PDF by fig2dev.
115
116 -p <file name> : the output file prefix.
117 By default the name of the output file(s) will be
118 PROMER_graph_<n>.fig, where <n> will be incremented for each output
119 file. If you choose "-p MyName", for example, then the name of the
120 first output file will be MyName_0.fig.
121
122 -h display this help;
123
124 -v verbosely list the files processed;
125
126 -g|ref If the input file is provided by 'mgaps', set the
127 reference sequence ID (as it appears in the first column
128 of the UTR/CDS coords file)
129
130 -I Display the name of query sequences
131
132 -Ir Display the name of reference genes
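
The linking rule behind the -d option can be illustrated with a short
Python sketch. It is not MapView's own code: the function name
linked_pairs is hypothetical, and it assumes that the distance between
two successive matches from the same contig is measured on the
reference, from the end of one match to the start of the next.

```python
from typing import List, Tuple

Match = Tuple[int, int]  # (start, end) on the reference, start <= end


def linked_pairs(matches: List[Match],
                 max_dist: int = 50000) -> List[Tuple[Match, Match]]:
    """Return the successive match pairs that would be joined by a line.

    `matches` are the matches of a single query contig. Pairs farther
    apart than `max_dist` bp are left unlinked.
    """
    ordered = sorted(matches)
    links = []
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = nxt[0] - prev[1]  # assumed distance definition
        if gap <= max_dist:
            links.append((prev, nxt))
    return links
```

With the default of 50000 bp, matches at (100, 200), (300, 400) and
(100000, 100500) give a single link between the first two, because the
third match starts 99600 bp after the second one ends.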
133
134
135 5. OUTPUT
136 ------
137 The output can be FIG, PDF, or PS files.
138 The program uses fig2dev to transform FIG files to PDF or PS.
139
140 If you supply UTR and CDS coords files, then the genes are displayed
141 first, along the top. Alternatively spliced genes are shown on
142 different rows, stacked vertically. The CDS regions (i.e., the
143 protein coding portions of exons) are displayed in light green and the
144 5' and 3' UTRs are in different colors. (For details, please
145 see the legend in the left corner below the graphic.)
146
147 The reference seq is displayed in light blue, and on a row immediately
148 below it are shown the alignment matches.
149
150 The alignment matches are displayed again in vertical positions
151 depending on the percent identity (PID) of each match, ranging from
152 50% to 100%. Matches with PID < 50% (if any are included in the input
153 file) are considered to have PID=50%. For better visualization, the
154 connecting lines between matches are colored differently, using
155 randomly chosen colors, from one query seq to the next. If
156 these connecting lines are crossed, it indicates that the sequence
157 has been reverse complemented to achieve the match; however, note that
158 if a sequence is similar at both the protein and DNA level, we often
159 detect matches in multiple reading frames. NUCmer and PROmer have options
160 to display only one match when matches occur in multiple frames, but they
161 don't always choose the correct orientation.
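
The treatment of percent identity described above can be sketched in
Python as follows. The clamping of values below 50% comes from the text;
the linear mapping to a relative height between 0 and 1, and the name
pid_to_height, are assumptions made for illustration and may not match
how MapView actually computes positions.

```python
def pid_to_height(pid: float, floor: float = 50.0,
                  ceiling: float = 100.0) -> float:
    """Map a percent identity to a relative height in [0, 1].

    Values below `floor` are treated as `floor`, as described for
    matches with PID < 50%. The linear scale is an assumption.
    """
    clamped = min(max(pid, floor), ceiling)
    return (clamped - floor) / (ceiling - floor)
```

Under this assumption a match with PID 46.67 sits at the bottom of the
band (height 0.0), one with PID 75 at the midpoint, and one with PID 100
at the top.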
162
163 6. KNOWN PROBLEMS
164 --------------
165
166 There is a known problem with PDF output. fig2dev has problems if
167 the FIG file is too big: it consistently exports such a file to a
168 PDF with errors. We recommend using the PS format for files that are
169 very big, or else breaking the files up using the -n option above.