comparison CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/opt/mummer-3.23/docs/mapview.README @ 69:33d812a61356
planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author | jpayne |
---|---|
date | Tue, 18 Mar 2025 17:55:14 -0400 |
parents | |
children |
67:0e9998148a16 | 69:33d812a61356 |
---|---|
1 | |
2 ----------------------------------------------------------------------- | |
3 MapView Utility software | |
4 Version 1.0 | |
5 Contact: <mummer-help@lists.sourceforge.net> | |
6 Web: http://mummer.sourceforge.net | |
7 ----------------------------------------------------------------------- | |
8 LICENCE: open source, included with MUMmer 3.0 and above | |
9 USAGE: see section 4, below. | |
10 | |
11 1. WHAT IS MAPVIEW? | |
12 ---------------- | |
13 | |
14 MapView is a utility program for displaying sequence alignments | |
15 as provided by NUCmer or PROmer. For further information regarding these | |
16 programs, please see the documentation and code at | |
17 http://mummer.sourceforge.net . MapView takes the output from | |
18 these programs and converts it to a FIG, PDF or PS file. It can | |
19 break the output into multiple files for easier viewing and printing. | |
20 Note that for very large reference genomes, FIG files viewed in the | |
21 xfig program (Unix) may be the only option that allows the entire | |
22 display to be stored in one file. | |
23 | |
24 2. SYSTEM REQUIREMENTS | |
25 ------------------- | |
26 - Perl interpreter version 5.0 or greater. | |
27 - fig2dev utility (see www.linux.org for transfig rpm package and | |
28 installation documentation) | |
29 - xfig viewer to visualize the FIG format (see www.linux.org regarding | |
30 xfig rpm package) | |
31 - Adobe Acrobat Reader for reading PDF formats (free from www.adobe.com) | |
32 - Ghostscript PostScript interpreter to view PDF and PostScript documents | |
33 (on www.linux.org, look for the 'gv' rpm package) | |
34 | |
35 3. INPUT | |
36 ----- | |
37 | |
38 The input to MapView is the table generated by the "show-coords" | |
39 program in MUMmer. It is important to use the -r -l options in | |
40 show-coords in order to have the proper format for MapView. For PROmer | |
41 output, it can be very helpful to run show-coords with the -k option as | |
42 well, to reduce the redundant matches often found in highly similar | |
43 regions. However, this option does not always select the appropriate | |
44 reading frame. | |
45 | |
46 Both PROmer and NUCmer write output in a specific format that | |
47 can be found in the *.cluster and *.delta files. To translate this | |
48 output into a human readable format, the "show-coords" program | |
49 parses the delta alignment output of either NUCmer or PROmer and | |
50 displays summary information for each alignment. (Note that | |
51 PROmer and NUCmer include command line options that allow them to | |
52 generate the same summary information without running "show-coords" | |
53 separately.) The output of show-coords is then used by MapView to | |
54 create a FIG, PDF or PS file. | |
55 | |
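The chain described above (PROmer or NUCmer, then show-coords, then MapView) can be sketched as three command lines. The Python sketch below only builds the argument lists and does not run anything. The file names, the "demo" prefix and the helper name build_pipeline are hypothetical. The "-p" prefix option passed to promer/nucmer is assumed from their usage messages, and saving the show-coords output to <prefix>.coords is an assumption, since show-coords writes to standard output.

```python
def build_pipeline(ref_fasta, qry_fasta, prefix="out", protein=True):
    """Return the three argument lists for aligner, show-coords and mapview."""
    aligner = "promer" if protein else "nucmer"
    # Assumption: "-p" sets the output prefix, so the delta file is <prefix>.delta.
    align_cmd = [aligner, "-p", prefix, ref_fasta, qry_fasta]

    # -r and -l are required by MapView; -k is only suggested for PROmer output.
    coords_cmd = ["show-coords", "-r", "-l"]
    if protein:
        coords_cmd.append("-k")
    coords_cmd.append(prefix + ".delta")

    # Assumption: the show-coords output was redirected to <prefix>.coords.
    mapview_cmd = ["mapview", "-f", "pdf", "-p", prefix, prefix + ".coords"]
    return align_cmd, coords_cmd, mapview_cmd


if __name__ == "__main__":
    for cmd in build_pipeline("ref.fa", "qry.fa", prefix="demo"):
        print(" ".join(cmd))
```

Each list could be passed to subprocess.run, with the standard output of the show-coords step written to the coords file.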
56 An example of the standard output of show-coords, which is used | |
57 directly as input for MapView, is below. This shows just the top | |
58 few lines of a large file created by aligning an assembly of | |
59 Drosophila pseudoobscura (165 million bases) to chromosome arm 2R of | |
60 Drosophila melanogaster: | |
61 | |
62 /usr/local/db/euk/internal/d_melanogaster/na_arm2R_genomic_dmel_RELEASE3.FASTA celera_scaffs.fa | |
63 PROMER | |
64 | |
65 [S1] [E1] | [S2] [E2] | [LEN 1] [LEN 2] | [% IDY] [% SIM] [% STP] | [LEN R] [LEN Q] | [COV R] [COV Q] | [FRM] [TAGS] | |
66 ======================================================================================================================================================== | |
67 2540 2806 | 3216 3473 | 267 258 | 46.67 50.00 2.78 | 20302755 8916 | 0.00 2.89 | 2 3 2R 3211358 | |
68 2540 2806 | 1939 2196 | 267 258 | 46.67 51.11 2.22 | 20302755 2375 | 0.00 10.86 | 2 1 2R 3211430 | |
69 2540 2893 | 20172 19852 | 354 321 | 39.52 45.16 3.23 | 20302755 25647 | 0.00 1.25 | 2 -1 2R 3215406 | |
70 2806 2534 | 5291 5536 | 273 246 | 41.94 47.31 3.76 | 20302755 12414 | 0.00 1.98 | -3 2 2R 3211507 | |
71 .... | |
72 | |
73 For more information and an explanation of this format, please see | |
74 the MUMmer manual http://mummer.sourceforge.net/manual | |
75 | |
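To make the column layout of the example above concrete, here is a minimal Python sketch that splits one PROmer data row into named fields. The dictionary keys and the function name parse_coords_row are hypothetical. The sketch assumes the PROmer layout shown above (seven groups separated by '|', with two frames and two tags in the last group) and is not how MapView itself reads the file.

```python
SAMPLE_ROW = ("2540 2806 | 3216 3473 | 267 258 | 46.67 50.00 2.78 | "
              "20302755 8916 | 0.00 2.89 | 2 3 2R 3211358")


def parse_coords_row(line):
    """Split one show-coords -r -l row (PROmer layout) into a dict."""
    groups = [g.split() for g in line.split("|")]
    if len(groups) != 7:
        raise ValueError("expected 7 '|'-separated groups, got %d" % len(groups))
    s1, e1 = groups[0]
    s2, e2 = groups[1]
    len1, len2 = groups[2]
    idy, sim, stp = groups[3]
    len_r, len_q = groups[4]
    cov_r, cov_q = groups[5]
    frm_r, frm_q, tag_r, tag_q = groups[6]
    return {
        "s1": int(s1), "e1": int(e1),          # match start/end on the reference
        "s2": int(s2), "e2": int(e2),          # match start/end on the query
        "len1": int(len1), "len2": int(len2),
        "pct_idy": float(idy), "pct_sim": float(sim), "pct_stp": float(stp),
        "len_r": int(len_r), "len_q": int(len_q),
        "cov_r": float(cov_r), "cov_q": float(cov_q),
        "ref_frame": int(frm_r), "qry_frame": int(frm_q),
        "ref_tag": tag_r, "qry_tag": tag_q,
    }


if __name__ == "__main__":
    print(parse_coords_row(SAMPLE_ROW))
```

NUCmer output has no [% SIM], [% STP] or [FRM] columns, so a parser for it would need a different unpacking of the fourth and last groups.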
76 | |
77 4. USAGE | |
78 ----- | |
79 | |
80 USAGE: mapview [options] <coords file> [UTR coords] [CDS coords] | |
81 | |
82 The optional UTR and CDS coordinate files, which are computed | |
83 based on the reference sequence, should be in GFF format. These contain | |
84 the coordinates of coding sequences and untranslated regions for | |
85 genes on the reference genome, and will be displayed graphically | |
86 if provided. | |
87 | |
88 GFF format is a tab-delimited file format with the following columns: | |
89 <seq_ID> <source> <exon type> <start> <end> <score> <strand> <frame> <gene_name> | |
90 | |
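As an illustration of the nine-column layout listed above, the following Python sketch reads one tab-delimited line into a named record. The sample line, the record type GffRecord and the function name parse_gff_line are hypothetical; only the column order comes from the description above.

```python
from collections import namedtuple

GffRecord = namedtuple(
    "GffRecord",
    "seq_id source exon_type start end score strand frame gene_name",
)

# Hypothetical example line, not taken from a real annotation file.
SAMPLE_GFF = "2R\texample\tCDS\t2540\t2806\t.\t+\t0\tgeneA"


def parse_gff_line(line):
    """Split one tab-delimited GFF line into its nine columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    rec = GffRecord(*fields)
    # Coordinates are converted to integers; the other columns stay as text.
    return rec._replace(start=int(rec.start), end=int(rec.end))


if __name__ == "__main__":
    print(parse_gff_line(SAMPLE_GFF))
```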
91 Options : | |
92 -f <output format> : pdf, ps or fig. The default is "fig". | |
93 | |
94 -x1 <left coord > -x2 <right coord> : only display the region on | |
95 the reference genome between positions x1 and x2. By default the | |
96 whole sequence will be displayed. | |
97 | |
98 -d <no_bp> : the maximum distance (in bp) between the matches for | |
99 which the matches will be linked. Default is 50000 bp. To explain: | |
100 the query sequence may contain multiple contigs. All matches from | |
101 the same contig are linked by drawing lines between each successive | |
102 pair of matches. If the matches occur too far apart, then this can | |
103 get very messy. Therefore we don't draw a line if the matches are | |
104 further apart than specified by this parameter. This is especially | |
105 important if the reference genome is very long and all the output | |
106 is stored in a single graphical file. | |
107 | |
108 -m <mag> : set the magnification at which the figure is rendered to | |
109 mag. The default is 1.0; this is an option for fig2dev which is | |
110 used to transform the fig files to pdf or ps files. | |
111 | |
112 -n <no of output files> : the default is 10. The purpose of this | |
113 parameter is to avoid making figures that are too 'large', in the | |
114 sense that they cannot be converted to PDF by fig2dev. | |
115 | |
116 -p <file name> : the output file prefix; | |
117 By default the name of the output file(s) will be | |
118 PROMER_graph_<n>.fig, where <n> will be incremented for each output | |
119 file. If you choose "-p MyName", for example, then the name of the | |
120 first output file will be MyName_0.fig. | |
121 | |
122 -h display this help; | |
123 | |
124 -v verbosely list the files processed; | |
125 | |
126 -g|ref If the input file is provided by 'mgaps', set the | |
127 reference sequence ID (as it appears in the first column | |
128 of the UTR/CDS coords file) | |
129 | |
130 -I Display the name of query sequences | |
131 | |
132 -Ir Display the name of reference genes | |
133 | |
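The linking rule described for the -d option can be sketched as follows. This is a minimal Python illustration, not MapView's actual code. It assumes that "successive" means sorted by start position on the reference and that the distance is measured from the end of one match to the start of the next; the tuple layout and the function name link_matches are hypothetical.

```python
def link_matches(matches, max_dist=50000):
    """Return pairs of successive matches from the same contig to be linked.

    matches: list of (contig_id, ref_start, ref_end) tuples.
    """
    by_contig = {}
    for m in matches:
        by_contig.setdefault(m[0], []).append(m)

    links = []
    for contig_matches in by_contig.values():
        contig_matches.sort(key=lambda m: m[1])  # order along the reference
        for prev, cur in zip(contig_matches, contig_matches[1:]):
            # Draw a line only if the gap does not exceed the -d limit.
            if cur[1] - prev[2] <= max_dist:
                links.append((prev, cur))
    return links


if __name__ == "__main__":
    demo = [("c1", 100, 200), ("c1", 300, 400),
            ("c1", 100000, 100100), ("c2", 250, 350)]
    print(link_matches(demo))
```

In the demo data, only the first two matches of contig c1 are linked; the third lies 99600 bp past the second, beyond the 50000 bp default.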
134 | |
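The output file naming described for the -p and -n options can be illustrated with a short Python sketch. It assumes that numbering starts at 0 for the default prefix as well, and that the extension follows the format chosen with -f; the function name output_names is hypothetical.

```python
def output_names(n_files=10, prefix="PROMER_graph", fmt="fig"):
    """List the output file names for a given prefix, count and format."""
    return ["%s_%d.%s" % (prefix, i, fmt) for i in range(n_files)]


if __name__ == "__main__":
    print(output_names(3, prefix="MyName"))
```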
135 5. OUTPUT | |
136 ------ | |
137 The output can be FIG, PDF, or PS files. | |
138 The program uses fig2dev to transform FIG files to PDF or PS. | |
139 | |
140 If you supply UTR and CDS coords files, then the genes are displayed | |
141 first, along the top. Alternatively spliced genes are shown on | |
142 different rows, stacked vertically. The CDS regions (i.e., the | |
143 protein coding portions of exons) are displayed in light green and the | |
144 5' and 3' UTRs are in different colors. (For details, please | |
145 see the legend in the left corner below the graphic.) | |
146 | |
147 The reference seq is displayed in light blue, and on a row immediately | |
148 below it are shown the alignment matches. | |
149 | |
150 The alignment matches are displayed again in vertical positions | |
151 depending on the percent identity (PID) of each match, ranging from | |
152 50% to 100%. Matches with PID < 50% (if any are included in the input | |
153 file) are considered to have PID=50%. For better visualization, the | |
154 connecting lines between matches are colored differently, using | |
155 randomly chosen colors, from one query seq to the next. If | |
156 these connecting lines are crossed, it indicates that the sequence | |
157 has been reverse complemented to achieve the match; however, note that | |
158 if a sequence is similar at both the protein and DNA level, we often | |
159 detect matches in multiple reading frames. NUCmer and PROmer have options | |
160 to display only one match when matches occur in multiple frames, but they | |
161 don't always choose the correct orientation. | |
162 | |
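The vertical placement by percent identity described above can be sketched as a simple clamp followed by a rescale. This Python sketch assumes a linear mapping from the 50% to 100% range onto a band of arbitrary height; the band height and the function name pid_to_height are hypothetical, and the actual drawing code may scale differently.

```python
def pid_to_height(pid, band_height=100.0):
    """Map a percent identity to a height within the PID band.

    Values below 50 are treated as 50, as described in the text.
    """
    clamped = min(max(pid, 50.0), 100.0)
    return (clamped - 50.0) / 50.0 * band_height


if __name__ == "__main__":
    for pid in (46.67, 50.0, 75.0, 100.0):
        print(pid, pid_to_height(pid))
```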
163 6. KNOWN PROBLEMS | |
164 -------------- | |
165 | |
166 There is a known problem with the PDF files. Fig2dev has problems if | |
167 the FIG file is too big. It will consistently export that file into a | |
168 PDF with errors. We recommend using the PS format for files that are | |
169 very big, or else breaking the files up using the -n option above. |