jiwoongbio/Annomen

Annotate variant nomenclature

Perl

Annomen

Annotations

Gene name
Region type
- CDS
- utr5
- utr3
- intron
- non_coding_exon
- non_coding_intron
Mutation class
- silent
- missense
- inframe
- frameshift
- nonsense
- readthrough
- startcodon
- splicing
- junction
Strand
Splice distance
Transcript ID
Protein ID
Transcript variation nomenclature
Protein variation nomenclature

Requirements

Perl: https://www.perl.org
BioPerl: https://bioperl.org
- Bio::DB::Fasta
- Bio::SeqIO
EMBOSS: http://emboss.sourceforge.net or EMBOSS-6.6.0.reduced.tar.gz
- needle
- stretcher
Basic Linux commands: bash, rm, gzip, sort, echo, find, sed, awk, wget

You can use conda to install the requirements as follows:

conda create -n Annomen -c bioconda perl perl-bioperl emboss
conda install -n Annomen -c anaconda wget
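
Before running the scripts, it can help to confirm that the requirements listed above are actually reachable. The following is a minimal sketch, not part of Annomen itself: the function name `missing_commands` is hypothetical, and it assumes you run it in the environment you intend to use (for example after `conda activate Annomen`).

```shell
# Commands named in the Requirements section (needle and stretcher come from EMBOSS).
required_commands="perl needle stretcher bash rm gzip sort echo find sed awk wget"

# Hypothetical helper: print each given command name that is not found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Word splitting of $required_commands is intentional here.
missing=$(missing_commands $required_commands)
if [ -n "$missing" ]; then
    echo "Missing commands:" $missing
else
    echo "All required commands were found."
fi

# Check that the two BioPerl modules can be loaded (skipped if perl itself is missing).
if command -v perl >/dev/null 2>&1; then
    for module in Bio::DB::Fasta Bio::SeqIO; do
        perl -M"$module" -e 1 2>/dev/null || echo "Missing Perl module: $module"
    done
fi
```

Anything reported as missing can be installed with the conda commands above or from the linked project pages.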

Install

If you already have Git (https://git-scm.com) installed, you can get the latest development version using Git.

git clone https://github.com/jiwoongbio/Annomen.git

Usage

Prepare the annotation table file
- Execute Annomen_table.hg38.sh
```
./Annomen_table.hg38.sh
```
Annotate a variant file, either in VCF format or as tab-separated columns: chromosome, position, reference base, variant base
- Execute Annomen.hg38.sh
```
./Annomen.hg38.sh <input file>
```
- Example: annotating ClinVar variants
```
./clinvar.sh
```
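
For the tab-separated input described above, a small file can be built and sanity-checked before annotation. This is only a sketch under stated assumptions: the file name `example_variants.txt` and the helper `count_malformed_lines` are hypothetical, the two rows are made-up placeholders rather than real variants, and the `chr`-prefixed chromosome names are an assumption that may need to match the naming used by your annotation table.

```shell
# Hypothetical input file name.
input_file="example_variants.txt"

# Write two placeholder rows: chromosome, position, reference base, variant base.
# printf reuses the format string for each group of four arguments.
printf '%s\t%s\t%s\t%s\n' \
    chr1 1000 A G \
    chr2 2000 C T > "$input_file"

# Hypothetical helper: print how many lines do not have exactly four tab-separated columns.
count_malformed_lines() {
    awk -F'\t' 'NF != 4 { n++ } END { print n + 0 }' "$1"
}

if [ "$(count_malformed_lines "$input_file")" -eq 0 ]; then
    echo "$input_file has four columns on every line"
    # The file could then be passed to the script shown above:
    # ./Annomen.hg38.sh "$input_file"
fi
```

The check only validates the column layout; it does not verify that the reference bases match the genome.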