
Dataset repository

License: MIT

Manual annotation of tardigrades: datasets

This repository provides the data derived from the manual annotation of Ramazzottius varieornatus and Hypsibius dujardini DNA repair genes.


The repository contains two .aa files, each holding all the annotated proteins of one of the two tardigrade species analysed. It also contains two .seq files with the untranslated nucleotide sequences of the same datasets, and two .gff files with the annotation in GFF3 or GTF format. File names are prefixed by species:

  'rvar.*' -> Ramazzottius varieornatus sets
  'hduj.*' -> Hypsibius dujardini sets
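
As a rough illustration of how these files might be loaded, the sketch below assumes that the .aa and .seq files are plain multi-FASTA text and that the files are named exactly rvar.aa, rvar.seq, rvar.gff (and likewise for hduj). Neither assumption is confirmed here, and the helper names read_fasta and dataset_files are hypothetical, not part of this repository.

```python
from pathlib import Path
from typing import Iterator, Tuple


def read_fasta(path) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text file."""
    header = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header = line[1:]
                chunks = []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dataset_files(prefix: str, directory: str = ".") -> dict:
    """Map each extension to the path assumed for a species prefix."""
    # Assumption: files are named <prefix>.aa, <prefix>.seq, <prefix>.gff.
    return {ext: Path(directory) / f"{prefix}.{ext}" for ext in ("aa", "seq", "gff")}


if __name__ == "__main__":
    for prefix in ("rvar", "hduj"):
        files = dataset_files(prefix)
        if files["aa"].exists():
            count = sum(1 for _ in read_fasta(files["aa"]))
            print(prefix, count, "annotated proteins")
```

If the files turn out not to be FASTA, only read_fasta would need to change; the prefix-to-file mapping stays the same.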

Additionally, folders containing a single-FASTA file for each sequence are provided.

Please note that the names of individual proteins and sequences usually contain _hduj_, _hduj, or a variation of these (in some cases rvar or even dmel). This is NOT a reference to the species in which the sequence is annotated, but an internal label used to differentiate results obtained with different subject sequences.
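
To make the distinction concrete, here is a minimal sketch that takes the species from the file prefix and treats the tag inside a sequence name only as a subject-sequence label. The regular expression covers just the hduj, rvar and dmel tags delimited by underscores, so other variations would need extra patterns. The function names and the example sequence names are hypothetical.

```python
import re
from pathlib import Path

# Species is determined by the file prefix, never by the tag in a sequence name.
SPECIES_BY_PREFIX = {
    "rvar": "Ramazzottius varieornatus",
    "hduj": "Hypsibius dujardini",
}

# Assumption: tags appear as _hduj_, _hduj (at the end), or the same with rvar/dmel.
_TAG = re.compile(r"(?:^|_)(hduj|rvar|dmel)(?=_|$)")


def subject_tag(sequence_name: str):
    """Return the internal subject-sequence tag in a name, or None if absent."""
    match = _TAG.search(sequence_name)
    return match.group(1) if match else None


def species_from_file(file_name: str):
    """Return the species implied by the file prefix, or None if unknown."""
    prefix = Path(file_name).name.split(".", 1)[0]
    return SPECIES_BY_PREFIX.get(prefix)


if __name__ == "__main__":
    # "GENE_hduj_1" is a made-up name used only for illustration.
    print(species_from_file("rvar.aa"), "|", subject_tag("GENE_hduj_1"))
```

In this example a sequence read from rvar.aa belongs to R. varieornatus even though its name carries the hduj tag.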

Reference genomes

As indicated in the publication, we used assembly GCA_001949185.1 (Rvar_4.0) for R. varieornatus and assembly GCA_002082055.1 (nHd_3.1) for H. dujardini.

Please contact me with any questions or problems you encounter.