Mercurial > repos > rliterman > csp2
view CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/lib/python3.8/site-packages/pybedtools-0.11.0.dist-info/METADATA @ 68:5028fdace37b
planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author | jpayne |
---|---|
date | Tue, 18 Mar 2025 16:23:26 -0400 |
parents | |
children |
line wrap: on
line source
Metadata-Version: 2.1 Name: pybedtools Version: 0.11.0 Summary: Wrapper around BEDTools for bioinformatics work Home-page: https://github.com/daler/pybedtools Download-URL: Maintainer: Ryan Dale Maintainer-email: ryan.dale@nih.gov License: MIT Classifier: Development Status :: 5 - Production/Stable Classifier: Intended Audience :: Science/Research Classifier: License :: OSI Approved :: MIT License Classifier: Topic :: Scientific/Engineering :: Bio-Informatics Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.6 Classifier: Programming Language :: Python :: 3.7 Classifier: Programming Language :: Python :: 3.8 Classifier: Topic :: Software Development :: Libraries :: Python Modules License-File: LICENSE.txt Requires-Dist: pysam Requires-Dist: numpy Overview -------- .. image:: https://badge.fury.io/py/pybedtools.svg?style=flat :target: https://badge.fury.io/py/pybedtools .. image:: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg :target: https://bioconda.github.io The `BEDTools suite of programs <http://bedtools.readthedocs.org/>`_ is widely used for genomic interval manipulation or "genome algebra". `pybedtools` wraps and extends BEDTools and offers feature-level manipulations from within Python. See full online documentation, including installation instructions, at https://daler.github.io/pybedtools/. The GitHub repo is at https://github.com/daler/pybedtools. Why `pybedtools`? ----------------- Here is an example to get the names of genes that are <5 kb away from intergenic SNPs: .. code-block:: python from pybedtools import BedTool snps = BedTool('snps.bed.gz') # [1] genes = BedTool('hg19.gff') # [1] intergenic_snps = snps.subtract(genes) # [2] nearby = genes.closest(intergenic_snps, d=True, stream=True) # [2, 3] for gene in nearby: # [4] if int(gene[-1]) < 5000: # [4] print gene.name # [4] Useful features shown here include: * `[1]` support for all BEDTools-supported formats (here gzipped BED and GFF) * `[2]` wrapping of all BEDTools programs and arguments (here, `subtract` and `closest` and passing the `-d` flag to `closest`); * `[3]` streaming results (like Unix pipes, here specified by `stream=True`) * `[4]` iterating over results while accessing feature data by index or by attribute access (here `[-1]` and `.name`). In contrast, here is the same analysis using shell scripting. Note that this requires knowledge in Perl, bash, and awk. The run time is identical to the `pybedtools` version above: .. code-block:: bash snps=snps.bed.gz genes=hg19.gff intergenic_snps=/tmp/intergenic_snps snp_fields=`zcat $snps | awk '(NR == 2){print NF; exit;}'` gene_fields=9 distance_field=$(($gene_fields + $snp_fields + 1)) intersectBed -a $snps -b $genes -v > $intergenic_snps closestBed -a $genes -b $intergenic_snps -d \ | awk '($'$distance_field' < 5000){print $9;}' \ | perl -ne 'm/[ID|Name|gene_id]=(.*?);/; print "$1\n"' rm $intergenic_snps See the `Shell script comparison <http://daler.github.io/pybedtools/sh-comparison.html>`_ in the docs for more details on this comparison, or keep reading the full documentation at http://daler.github.io/pybedtools.