pdbio

Pandas-based Data Handler for VCF, BED, and SAM Files

Primary language: Python. License: MIT.

Installation

$ pip install -U pdbio

Python API

Example of API call

from pprint import pprint
from pdbio.vcfdataframe import VcfDataFrame

vcf_path = 'test/example.vcf'
vcfdf = VcfDataFrame(path=vcf_path)

pprint(vcfdf.header)      # list of header lines
pprint(vcfdf.samples)     # list of samples
print(vcfdf.df)           # VCF dataframe

vcfdf.sort()              # sort by CHROM, POS, and the remaining columns
print(vcfdf.df)           # sorted dataframe
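
To make the pieces above concrete (header lines, sample names, and a sort by CHROM and POS), here is a minimal standard-library sketch of the same ideas. It does not use pdbio and is not pdbio's implementation. The in-memory VCF text, the `read_vcf` helper, and the plain lexical ordering of chromosome names are all assumptions made for illustration.

```python
import io

# Tiny in-memory VCF used only for illustration (not test/example.vcf).
VCF_TEXT = """\
##fileformat=VCFv4.2
##source=illustration
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB
chr2\t300\t.\tA\tG\t50\tPASS\tDP=12\tGT\t0/1\t1/1
chr1\t200\t.\tC\tT\t40\tPASS\tDP=8\tGT\t0/0\t0/1
chr1\t100\t.\tG\tA\t60\tPASS\tDP=20\tGT\t1/1\t0/1
"""


def read_vcf(handle):
    """Hypothetical helper: split VCF text into meta lines, columns, and rows."""
    header, columns, rows = [], [], []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##"):
            header.append(line)                      # meta-information lines
        elif line.startswith("#"):
            columns = line.lstrip("#").split("\t")   # column header line
        elif line:
            rows.append(dict(zip(columns, line.split("\t"))))
    return header, columns, rows


header, columns, rows = read_vcf(io.StringIO(VCF_TEXT))
samples = columns[9:]  # everything after the 9 fixed VCF columns is a sample

# Sort by chromosome name (lexically), then by numeric position.
rows.sort(key=lambda r: (r["CHROM"], int(r["POS"])))
positions = [(r["CHROM"], int(r["POS"])) for r in rows]

print(samples)
print(positions)
```

Note that a lexical sort places `chr10` before `chr2`. Whether `VcfDataFrame.sort()` orders chromosomes this way is not stated here, so check its output on your own data.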

Command-line interface

Example of commands

# Convert VCF data into sorted TSV data
$ pdbio vcf2csv --sort --tsv test/example.vcf

# Convert VCF data into expanded CSV data
$ pdbio vcf2csv --expand-info --expand-samples test/example.vcf

# Sort VCF data by CHROM, POS, and the remaining columns
$ pdbio vcfsort test/example.vcf

Run pdbio --help for more information.
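
As a rough illustration of what expanding the INFO column can mean, the sketch below splits a VCF INFO string into one field per key, using only the standard library. This is an assumption about the general idea behind `--expand-info`, not a description of pdbio's actual output. The `expand_info` helper and the `INFO_` column prefix are hypothetical names chosen for this example.

```python
def expand_info(info):
    """Hypothetical helper: parse 'DP=12;AF=0.5;DB' into a dict of fields."""
    fields = {}
    if info in ("", "."):          # "." marks a missing INFO value in VCF
        return fields
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True   # flag entries carry no value
    return fields


# Illustrative records; the values are made up for this example.
records = [
    {"CHROM": "chr1", "POS": "100", "INFO": "DP=20;AF=0.5"},
    {"CHROM": "chr1", "POS": "200", "INFO": "DP=8;DB"},
]

expanded = []
for rec in records:
    row = {k: v for k, v in rec.items() if k != "INFO"}
    for key, value in expand_info(rec["INFO"]).items():
        row["INFO_" + key] = value   # prefix is an assumed naming scheme
    expanded.append(row)

print(expanded)
```

Values are kept as strings here. Converting them to numbers would require the types declared in the VCF header, which this sketch does not read.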