ribiosBioMart is the 'sans Ensambl' middle ware to access the Ensembl BioMart data inside an plain MySQL/MariaDB database.
It aims to provide the same functions as the ws-based biomaRt R package
You need to install Bioconductor and the following R packages from the GitHub repository:
library(devtools)
install_github("Accio/ribios/ribiosBase")
install_github("Accio/ribios/ribiosBioMart")
Additionally, you need an operational MariaDB server instance populated with a copy of the latest Ensembl BioMart data.
library(ribiosBioMart)
library(RMySQL)
# Your acces data to your database server
# listLocalDatasets, useLocalMart, and getLocalBM will use this to create a
# single-use connection for each function call
conn <- EnsemblDBCredentials(host = "TODO",
port = 12345,
user = "TODO",
passwd = "TODO",
ensembl_version = 93)
# Alternatively: Use a direct connection for the entire session.
# You will need to manage / close it yourself
#conn <- dbConnect (MySQL (),
# user="TODO",
# password="TODO",
# dbname="ensembl_mart_93",
# host="TODO",
# port=12345)
ds <-listLocalDatasets(conn)
ds[1:5,]
localMart = useLocalMart(conn, dataset="hsapiens_gene_ensembl")
attrs = listLocalAttributes(localMart)
attrs[1:5,]
filters = listLocalFilters(localMart)
filters[1:5,]
affyids=c("202763_at","209310_s_at","207500_at")
getLocalBM(attributes = c('affy_hg_u133_plus_2', 'hgnc_symbol', 'chromosome_name',
'start_position', 'end_position', 'band'),
filters = 'affy_hg_u133_plus_2',
values = affyids,
mart = localMart)
The functionality of getLocalBM should be more or less the same as those of getBM in the biomaRt package
The getLocalBM does not support filters / attributes that require access to a database schema other than ensembl_mart_92. This package will therefore abort its execution with an error if the user tries to utilize one of the following attributes / filters:
- "name_2" ("qtl_feature.name_2")
- "qtl_region" ("qtl_feature.qtl_region")
- "go_parent_term" ("go_clos")
- "go_parent_name" ("go_name")
- "so_consequence_name" ("so_mini_parent_name")
- structure_gene_stable_id
- structure_transcript_stable_id
- structure_translation_stable_id
- structure_canonical_transcript_id
- structure_chrom_name
- structure_gene_chrom_start
- structure_gene_chrom_end
- structure_transcript_chrom_start
- structure_transcript_chrom_end
- structure_transcription_start_site
- structure_transcript_length
- structure_transcript_chrom_strand
- structure_external_gene_name
- structure_external_source_name
- structure_5_utr_start
- structure_5_utr_end
- structure_3_utr_start
- structure_3_utr_end
- structure_cds_length
- structure_cdna_length
- structure_peptide_length
- struct_transcript_count
- structure_translation_count
- structure_description
- structure_biotype
- structure_ensembl_exon_id
- structure_cds_start
- structure_cds_end
- homologs_ensembl_gene_id
- homologs_ensembl_transcript_id
- homologs_ensembl_peptide_id
- homologs_canonical_transcript_id
- homologs_chromosome_name
- homologs_start_position
- homologs_end_position
- homologs_strand
- homologs_band
- homologs_external_gene_name
- homologs_external_gene_source
- homologs_ensembl_CDS_length
- homologs_ensembl_cDNA_length
- homologs_ensembl_peptide_length
- homologs_transcript_count
- homologs_percentage_gc_content
- homologs_description
- snp_ensembl_gene_id
- snp_ensembl_transcript_id
- snp_ensembl_peptide_id
- sequence_canonical_transcript_id
- snp_chromosome_name
- snp_start_position
- snp_end_position
- snp_strand
- snp_band
- snp_external_gene_name
- snp_external_gene_source
- snp_ensembl_CDS_length
- snp_ensembl_cDNA_length
- snp_ensembl_peptide_length
- snp_transcript_count
- snp_percentage_gc_content
- snp_description
- transcript_exon_intron
- gene_exon_intron
- transcript_flank
- gene_flank
- coding_transcript_flank
- coding_gene_flank
- 5utr
- 3utr
- gene_exon
- cdna
- coding
- peptide
- upstream_flank
- downstream_flank
- sequence_gene_stable_id
- sequence_description
- sequence_external_gene_name
- sequence_external_source_name
- sequence_str_chrom_name
- sequence_gene_chrom_start
- sequence_gene_chrom_end
- sequence_gene_biotype
- sequence_family
- sequence_upi
- sequence_uniprot_swissprot_accession
- sequence_uniprot_sptrembl
- sequence_uniprot_sptrembl_predicted
- sequence_transcript_stable_id
- sequence_translation_stable_id
- sequence_canonical_transcript_id
- sequence_biotype
- sequence_transcript_biotype
- sequence_transcript_chrom_strand
- sequence_transcript_chrom_start
- sequence_transcript_chrom_end
- sequence_transcription_start_site
- sequence_peptide_length
- sequence_transcript_length
- sequence_cdna_length
- sequence_exon_stable_id
- sequence_type
- sequence_exon_chrom_start
- sequence_exon_chrom_end
- sequence_exon_chrom_strand
- sequence_rank
- sequence_phase
- sequence_end_phase
- sequence_cdna_coding_start
- sequence_cdna_coding_end
- sequence_genomic_coding_start
- sequence_genomic_coding_end
- sequence_constitutive
If you used a direct DBIConnection for the database access, don't forget to close your database connections
# dbDisconnect (conn)