diff CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/opt/mummer-3.23/docs/mapview.README @ 69:33d812a61356

planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author jpayne
date Tue, 18 Mar 2025 17:55:14 -0400
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/opt/mummer-3.23/docs/mapview.README	Tue Mar 18 17:55:14 2025 -0400
@@ -0,0 +1,169 @@
+
+ -----------------------------------------------------------------------
+                   MapView Utility software
+                                         Version 1.0
+  Contact: <mummer-help@lists.sourceforge.net>
+  Web: http://mummer.sourceforge.net
+ -----------------------------------------------------------------------
+ LICENCE: open source, included with MUMmer 3.0 and above
+ USAGE: see section 4, below.
+
+ 1. WHAT IS MAPVIEW?
+   ----------------
+
+ MapView is a utility program for displaying sequence alignments
+ as provided by NUCmer or PROmer. For further information regarding these
+ programs, please see the documentation and code at
+ http://mummer.sourceforge.net .  MapView takes the output from
+ these programs and converts it to a FIG, PDF or PS file.  It can 
+ break the output into multiple files for easier viewing and printing.
+ Note that for very large reference genomes, FIG files viewed in the
+ xfig program (Unix) may be the only option that allows the entire
+ display to be stored in one file.
+
+ 2. SYSTEM REQUIREMENTS
+   -------------------
+   - PERL interpreter version 5.0 or greater.
+   - fig2dev utility (see www.linux.org for transfig rpm package and
+     installation documentation)
+   - xfig viewer to visualize the FIG format (see www.linux.org regarding 
+     xfig rpm package)
+   - Adobe Acrobat Reader for reading PDF formats (free from www.adobe.com)
+   - Ghostscript PostScript interpreter to view PDF and PostScript documents
+     (on www.linux.org, look for the 'gv' rpm package)
+       
+ 3. INPUT  
+    -----
+
+   The input to MapView is the table generated by the "show-coords"
+   program in MUMmer.  It is important to use the -r -l options in
+   show-coords in order to have the proper format for MapView. For PROmer
+   output, it can be very helpful to run show-coords with the -k option as
+   well, to reduce the redundant matches often found in highly similar
+   regions. However, this option does not always select the appropriate
+   reading frame.
+
+   Both PROmer and NUCmer write their output in a specific format that
+   can be found in the *.cluster and *.delta files. To translate this
+   output into a human-readable format, the "show-coords" program
+   parses the delta alignment output of either NUCmer or PROmer and
+   displays summary information for each alignment. (Note that
+   PROmer and NUCmer include command line options that allow them to
+   generate the same summary information without running "show-coords"
+   separately.)  The output of show-coords is then used by MapView to
+   create a FIG, PDF or PS file.
+
+   An example of the standard output of show-coords, which is used
+   directly as input for MapView, is below.  This shows just the top
+   few lines of a large file created by aligning an assembly of
+   Drosophila pseudoobscura (165 million bases) to chromosome arm 2R of
+   Drosophila melanogaster:
+
+ /usr/local/db/euk/internal/d_melanogaster/na_arm2R_genomic_dmel_RELEASE3.FASTA celera_scaffs.fa
+ PROMER
+
+    [S1]     [E1]  |     [S2]     [E2]  |  [LEN 1]  [LEN 2]  |  [% IDY]  [% SIM]  [% STP]  |  [LEN R]  [LEN Q]  |  [COV R]  [COV Q]  | [FRM]  [TAGS]
+ ========================================================================================================================================================
+    2540     2806  |     3216     3473  |      267      258  |    46.67    50.00     2.78  | 20302755     8916  |     0.00     2.89  |  2  3  2R    3211358
+    2540     2806  |     1939     2196  |      267      258  |    46.67    51.11     2.22  | 20302755     2375  |     0.00    10.86  |  2  1  2R    3211430
+    2540     2893  |    20172    19852  |      354      321  |    39.52    45.16     3.23  | 20302755    25647  |     0.00     1.25  |  2 -1  2R    3215406
+    2806     2534  |     5291     5536  |      273      246  |    41.94    47.31     3.76  | 20302755    12414  |     0.00     1.98  | -3  2  2R    3211507
+ ....
+
+ For more information and an explanation of this format, please see
+ the MUMmer manual at http://mummer.sourceforge.net/manual
+
+
+ 4. USAGE
+    -----
+
+ USAGE: mapview  [options]  <coords file>  [UTR coords]  [CDS coords]
+
+ The optional UTR and CDS coordinate files, which are based on the
+ reference sequence, should be in GFF format.  These contain
+ the coordinates of coding sequences and untranslated regions for 
+ genes on the reference genome, and will be displayed graphically
+ if provided.
+
+ GFF format is a tab-delimited file format with the following columns:
+   <seq_ID> <source> <exon type> <start> <end> <score> <strand> <frame> <gene_name>
+
+ Options :
+  -f <output format> : pdf, ps or fig.  The default is "fig".
+
+  -x1 <left coord> -x2 <right coord> : only display the region on
+  the reference genome between positions x1 and x2.  By default the
+  whole sequence will be displayed.
+
+  -d <no_bp> : the maximum distance (in bp) between the matches for
+  which the matches will be linked.  Default is 50000 bp.  To explain:
+  the query sequence may contain multiple contigs.  All matches from
+  the same contig are linked by drawing lines between each successive
+  pair of matches.  If the matches occur too far apart, then this can
+  get very messy.  Therefore we don't draw a line if the matches are
+  further apart than specified by this parameter.  This is especially
+  important if the reference genome is very long and all the output
+  is stored in a single graphical file.
+
+  -m <mag> : set the magnification at which the figure is rendered to
+   mag. The default is 1.0; this is an option for fig2dev which is
+   used to transform the fig files to pdf or ps files.
+
+  -n <no of output files> : the default is 10. The purpose of this 
+   parameter is to avoid making figures that are too 'large', in the
+   sense that they cannot be converted to PDF by fig2dev.
+
+  -p <file name> : the output file prefix.
+   By default the name of the output file(s) will be
+   PROMER_graph_<n>.fig, where <n> will be incremented for each output
+   file.  If you choose "-p MyName", for example, then the name of the
+   first output file will be MyName_0.fig.
+
+  -h  display this help;
+
+  -v  verbosely list the files processed;
+
+  -g|ref          If the input file is provided by 'mgaps', set the
+                  reference sequence ID (as it appears in the first column
+                  of the UTR/CDS coords file)
+
+  -I              Display the name of query sequences
+
+  -Ir             Display the name of reference genes
+
+
+ 5. OUTPUT
+   ------
+ The output can be FIG, PDF, or PS files.
+ The program uses fig2dev to transform FIG files to PDF or PS. 
+
+ If you supply UTR and CDS coords files, then the genes are displayed
+ first, along the top.  Alternatively spliced genes are shown on
+ different rows, stacked vertically.  The CDS regions (i.e., the
+ protein-coding portions of exons) are displayed in light green and the
+ 5' and 3' UTRs are in different colors. (For details, please
+ see the legend in the left corner below the graphic.)
+
+ The reference sequence is displayed in light blue, and the alignment
+ matches are shown on a row immediately below it.
+
+ The alignment matches are displayed again in vertical positions
+ depending on the percent identity (PID) of each match, ranging from
+ 50% to 100%.  Matches with PID < 50% (if any are included in the input
+ file) are treated as having PID = 50%.  For better visualization, the
+ connecting lines between matches are colored differently, using
+ randomly chosen colors, from one query seq to the next.  If 
+ these connecting lines are crossed, it indicates that the sequence
+ has been reverse complemented to achieve the match; however, note that
+ if a sequence is similar at both the protein and DNA level, we often
+ detect matches in multiple reading frames.  NUCmer and PROmer have options
+ to display only one match when matches occur in multiple frames, but they
+ don't always choose the correct orientation.
+
+ 6. KNOWN PROBLEMS
+   --------------
+
+ There is a known problem with the PDF files. fig2dev has problems if
+ the FIG file is too big: it will consistently export that file to a
+ PDF with errors. We recommend using the PS format for files that are
+ very big, or else breaking the files up using the -n option above.
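
The README added above describes three rules that may be easier to follow in code: the layout of a PROmer row in "show-coords -r -l" output (section 3), the treatment of matches with PID below 50% as PID = 50% (section 5), and the -d threshold that decides whether successive matches from the same query are linked (section 4). What follows is a minimal Python sketch of those rules, not MapView's own code. The names Match, parse_coords_row, clamp_pid and should_link are hypothetical. The gap definition in should_link (distance on the reference between the nearest ends of two matches) is an assumption, since the README does not say how the distance is measured.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """One alignment row; field names are hypothetical, not MapView's."""
    s1: int              # [S1] start on the reference
    e1: int              # [E1] end on the reference
    s2: int              # [S2] start on the query
    e2: int              # [E2] end on the query
    pct_identity: float  # [% IDY]
    ref_tag: str         # first entry of [TAGS]
    query_tag: str       # second entry of [TAGS]


def parse_coords_row(line: str) -> Match:
    """Parse one PROmer data row laid out as in section 3 of the README."""
    groups = [g.split() for g in line.split("|")]
    if len(groups) != 7:
        raise ValueError(f"expected 7 '|'-separated groups, got {len(groups)}")
    s1, e1 = map(int, groups[0])
    s2, e2 = map(int, groups[1])
    pct_identity = float(groups[3][0])
    # The last group holds the two frame values followed by the two tags.
    ref_tag, query_tag = groups[6][-2:]
    return Match(s1, e1, s2, e2, pct_identity, ref_tag, query_tag)


def clamp_pid(pid: float, floor: float = 50.0) -> float:
    """Treat PID below the floor as the floor, as section 5 describes."""
    return max(floor, min(100.0, pid))


def should_link(a: Match, b: Match, max_dist: int = 50000) -> bool:
    """Link two matches only if they share a query and lie close together.

    Assumption: the distance is the gap on the reference between the
    nearest ends of the two matches (0 if they overlap).
    """
    if a.query_tag != b.query_tag:
        return False
    a_lo, a_hi = min(a.s1, a.e1), max(a.s1, a.e1)
    b_lo, b_hi = min(b.s1, b.e1), max(b.s1, b.e1)
    gap = max(b_lo - a_hi, a_lo - b_hi, 0)
    return gap <= max_dist
```

The default of 50000 for max_dist mirrors the documented default of -d. Reference coordinates are normalised with min/max because PROmer rows can list S1 greater than E1, as in the last example row of section 3.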