Pinned Repositories
uni-main
cec_clustering
ML-project
Proteins' Secondary Structure Prediction
rails_photo_gallery
VCF_converter
Next-generation sequencing (NGS) generates huge amounts of data. One format for storing information about sequences and their annotations is the Variant Call Format (VCF). This thesis describes the main features of the VCF file format, focusing on how it represents polymorphic sites in sequences, which play an important role in population studies, among other applications. The key element of the thesis is a converter from VCF to FASTA format: a Python script that can be run from the command line and can therefore be integrated into different pipelines. The appended manual describes all available options and explains the converter's functionality.
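To illustrate the general idea of a VCF-to-FASTA conversion, here is a minimal sketch. It is not the code from the VCF_converter repository: the function name vcf_to_fasta, its parameters, and the simplifications (single-nucleotide variants only, only the first allele of each genotype, one reference sequence, CHROM ignored) are assumptions made for this example.

```python
import io

NUCLEOTIDES = set("ACGTN")


def vcf_to_fasta(vcf_text, reference):
    """Return FASTA text with one record per sample.

    Hypothetical helper: each sample's sequence is the reference with the
    sample's first genotype allele substituted at every SNP position.
    """
    samples = []
    sequences = {}
    for line in io.StringIO(vcf_text):
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample names start at the tenth column of the header line.
            samples = fields[9:]
            sequences = {name: list(reference) for name in samples}
            continue
        pos, ref, alt, fmt = int(fields[1]), fields[3], fields[4], fields[8]
        alleles = [ref] + ([] if alt == "." else alt.split(","))
        if any(a.upper() not in NUCLEOTIDES for a in alleles):
            continue  # assumption: indels and symbolic alleles are skipped
        gt_index = fmt.split(":").index("GT")
        for name, call in zip(samples, fields[9:]):
            genotype = call.split(":")[gt_index].replace("|", "/")
            first = genotype.split("/")[0]
            if first == ".":
                continue  # missing call: keep the reference base
            sequences[name][pos - 1] = alleles[int(first)]  # POS is 1-based
    return "".join(
        f">{name}\n{''.join(seq)}\n" for name, seq in sequences.items()
    )
```

A command-line wrapper (for example with argparse) could read the VCF and reference from files and write the result to standard output, which is one way such a script might be chained into a pipeline.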
paulinapmk's Repositories
paulinapmk/cec_clustering
paulinapmk/ML-project
Proteins' Secondary Structure Prediction
paulinapmk/rails_photo_gallery
paulinapmk/VCF_converter