A package for working with nucleotides and amino acids in the Julia language.
Pkg.init() # Only needed the first time you install a Julia package
Pkg.add("BioSeq") # Install BioSeq.jl
using BioSeq # Load BioSeq
- 2-bit DNA sequence type DNA2Seq, for saving memory
  - Faster vectorized tests on DNA2Seq for calculating the GC percentage and for testing for A, C, T and G
- 8-bit bitstypes Nucleotide and AminoAcid
  - Vectors of these types can be used as DNA, RNA or protein sequences
    - Some string functions work on sequences:
      - Case conversions
      - Matching functions (search, replace and others)
        - IUPAC regex is available for matching functions
        - PROSITE patterns are available for matching functions
  - Alignments can be represented as matrices of these types
  - DArrays of these types can be used for parallel computation
  - Memory-mapped arrays of these types can be used for huge data
- 8-bit bit-level coding scheme for nucleotides
- Translation methods and genetic codes
- Tools for using IntSet/Set/Dict as alphabets
  - Common alphabets as IntSet, including extended IUPAC
  - Dicts for generating the complement of nucleotide sequences or for converting between the 3-letter and 1-letter alphabets for proteins
  - Test whether characters belong to an alphabet
  - Check that all characters of a sequence belong to an alphabet
  - Swap for alphabet conversions
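To give a feel for why a 2-bit representation saves memory, here is a minimal sketch in plain Julia that packs four bases into each byte and computes the GC fraction from the packed form. It does not use BioSeq and makes no claim about how DNA2Seq is implemented: the names pack2bit, base_at and gc_fraction and the A/C/G/T code assignment are hypothetical, chosen only for illustration, and the syntax targets current Julia (1.x) rather than the older release used elsewhere in this README.

```julia
# Hypothetical illustration only; none of these names belong to BioSeq.
# Each base is assigned an arbitrary 2-bit code (an assumption of this sketch).
const CODES = Dict('A' => 0x00, 'C' => 0x01, 'G' => 0x02, 'T' => 0x03)
const LETTERS = ('A', 'C', 'G', 'T')

# Pack a DNA string into a byte vector, four bases per byte.
function pack2bit(seq::AbstractString)
    packed = zeros(UInt8, cld(length(seq), 4))
    for (i, base) in enumerate(seq)
        shift = 2 * ((i - 1) % 4)                      # bit offset inside the byte
        packed[(i - 1) ÷ 4 + 1] |= CODES[base] << shift
    end
    return packed
end

# Read base number i (1-based) back from the packed vector.
function base_at(packed::Vector{UInt8}, i::Integer)
    shift = 2 * ((i - 1) % 4)
    code = (packed[(i - 1) ÷ 4 + 1] >> shift) & 0x03
    return LETTERS[code + 1]
end

# Fraction of G and C among the first n bases of the packed sequence.
function gc_fraction(packed::Vector{UInt8}, n::Integer)
    gc = count(i -> base_at(packed, i) in ('C', 'G'), 1:n)
    return gc / n
end

packed = pack2bit("GATTACAGATTACA")
length(packed)            # 14 bases fit in 4 bytes instead of 14
gc_fraction(packed, 14)   # 4 of the 14 bases are G or C
```

The sequence length has to be stored separately, because the last byte may be only partly filled.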
julia> using BioSeq
julia> const dna4alphabet = alphabet(nt"ACTG", false)
Case Insensitive Alphabet{Nucleotide} of 4 elements:
indice : 256-element Uint8 Array
alphabet : 4-element Nucleotide Array
alphabet indice[alphabet]
Nucleotide (Int64) Uint8 (Int64)
A (65) 0x01 (1)
C (67) 0x02 (2)
T (84) 0x03 (3)
G (71) 0x04 (4)
julia> dnaseq = repeat( nt"GATTACA" , 2 )
14-element Nucleotide Array:
G
A
T
T
A
C
A
G
A
T
T
A
C
A
julia> check(dnaseq, dna4alphabet)
true
julia> protseq = translate(dnaseq,1)
4-element AminoAcid Array:
D
Y
R
L
julia> if ismatch( prosite"<D-x-[RM]" , protseq )
           threeletters = swap(protseq, AMINO_1LETTER_TO_3 )
       end
4-element ASCIIString Array:
"ASP"
"TYR"
"ARG"
"LEU"
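The PROSITE pattern used in the session above, <D-x-[RM], reads as: at the N-terminus (<), a D, then any residue (x), then R or M. As a rough sketch of what such a pattern means, the plain-Julia function below rewrites a simple PROSITE pattern as an ordinary regular expression over strings. It is not BioSeq's implementation: prosite_to_regex is a hypothetical name, the sketch works on String rather than on AminoAcid arrays, it covers only the basic syntax (separators, x, [..], {..}, repeat counts and the < and > anchors), and it targets current Julia (1.x).

```julia
# Hypothetical helper, not part of BioSeq: translate a simple PROSITE
# pattern into a Julia Regex by rewriting one syntax element at a time.
function prosite_to_regex(pattern::AbstractString)
    s = replace(pattern, "-" => "")   # "-" only separates pattern elements
    s = replace(s, "{" => "[^")       # {P} means any residue except P
    s = replace(s, "}" => "]")
    s = replace(s, "(" => "{")        # x(2,4) means 2 to 4 repetitions
    s = replace(s, ")" => "}")
    s = replace(s, "x" => ".")        # x means any residue
    s = replace(s, "<" => "^")        # < anchors at the N-terminus
    s = replace(s, ">" => "\$")       # > anchors at the C-terminus
    return Regex(s)
end

re = prosite_to_regex("<D-x-[RM]")
occursin(re, "DYRL")    # true: D, any residue, then R, at the start
occursin(re, "ADYRL")   # false: the D is not at the N-terminus
```

Because the rewriting is purely textual, anchors placed inside brackets (such as [<M]) are not handled by this sketch.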
Fork the repository and send a pull request, or create a GitHub issue for bug reports and feature requests.