Use for relatively short sequences and small datasets
Vedasheersh opened this issue · 0 comments
Hi,
Firstly, I am not quite sure if this is a good place to ask; apologies if it is not. Regardless, I love all your implementations and their ease of use!
Question: Do you think this model would work for relatively short sequences (proteins, with the 20 amino acids as tokens) of around 1000 characters in length?
Also, my datasets are relatively small: around 20,000 data points with float labels. Basically, I am trying to use this model as a summarizer that captures long-range dependencies in each sequence and outputs a single floating-point number.
Because the dataset is small, I plan to use small dimensions and few layers, for a total of roughly 50k parameters.
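For a rough sense of what that budget allows, here is a back-of-the-envelope parameter count. It assumes a generic Transformer-style encoder (token embedding, self-attention and feed-forward blocks with biases, two layer norms per block, and a linear regression head), which may not match this repository's actual architecture. The function name `estimate_params` and the settings `dim=32`, `depth=4`, `ff_mult=4` are hypothetical, chosen only for illustration.

```python
def estimate_params(vocab_size: int, dim: int, depth: int, ff_mult: int = 4) -> int:
    """Rough parameter count for a generic Transformer-style encoder.

    Assumption: this is NOT the repository's real architecture, only a
    generic layout used to sanity-check a parameter budget.
    """
    # Token embedding table: one vector of size `dim` per amino acid.
    embedding = vocab_size * dim

    # Self-attention: Q, K, V and output projections, each dim x dim plus bias.
    attention = 4 * (dim * dim + dim)

    # Feed-forward: dim -> ff_mult * dim -> dim, with biases.
    hidden = ff_mult * dim
    feed_forward = (dim * hidden + hidden) + (hidden * dim + dim)

    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * dim)

    per_layer = attention + feed_forward + layer_norms

    # Regression head: pooled vector of size `dim` -> one float.
    head = dim + 1

    return embedding + depth * per_layer + head


# 20 amino acids as tokens, small width and depth.
print(estimate_params(vocab_size=20, dim=32, depth=4))  # 51489
```

Under these assumptions, a width of 32 with 4 layers already lands at about 50k parameters, so that is roughly the scale I have in mind.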
Would love to hear your thoughts!
Many many thanks!
Veda.