TeselaGen/openVectorEditor

Couldn't import a protein sequence and display it correctly

mega-bisharp opened this issue · 7 comments

My protein sequence from NCBI:

AXN76052.1
MRCMSELVVFKANELAISRYDLTEHETKLILCCVALLNPTIENPTRKERTVSFTYNQYAQMMNISRENAYGVLAKATRELMTRTVEIRNPLVKGFEIFQWTNYAKFSSEKLELVFSEEILPYLFQLKKFIKYNLEHVKSFENKYSMRIYEWLLKELTQKKTHKANIEISLDEFKFMLMLENNYHEFKRLNQWVLKPISKDLNTYSNMKLVVDKRGRPTDTLIFQVELDRQMDLVTELENNQIKMNGDKIPTTITSDSHLHNGLRKTLHDALTAKIQLTSFEAKFLSDMQSKYDLNGSFSWLTQKQRTTLENILAKYGRI
result:
image
image

problem:
image
I think the type here should be "PROTEIN", not "DNA".

@tnrich

tnrich commented

@mega-bisharp can you please attach the file as a ZIP file here? Thanks!

> @mega-bisharp can you please attach the file as a ZIP file here? Thanks!

These are my protein sequences. Thank you for your reply!
seqdump.zip
@tnrich

tnrich commented

@mega-bisharp that file doesn't have a file extension that would indicate that it is a protein. We would need to guess based on the sequence content, which can sometimes be risky.
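
To illustrate why guessing from content is risky, here is a minimal sketch of such a heuristic. It is not the OVE or bio-parsers implementation; `guessSequenceType` is a hypothetical helper, and the rule it uses (treat a sequence as protein only if it contains a letter that is not an IUPAC nucleotide code) is an assumption made for illustration.

```typescript
// Hypothetical helper for illustration only; not part of OVE or bio-parsers.
type SequenceType = "DNA" | "PROTEIN";

// Of the 20 standard amino-acid letters, only E, F, I, L, P and Q are not
// also IUPAC nucleotide codes (A C G T U R Y S W K M B D H V N).
const PROTEIN_ONLY = /[EFILPQ]/;

function guessSequenceType(sequence: string): SequenceType {
  // Normalize: upper-case and drop whitespace, digits, gaps, etc.
  const letters = sequence.toUpperCase().replace(/[^A-Z]/g, "");
  // Any protein-only letter rules out a nucleotide sequence.
  return PROTEIN_ONLY.test(letters) ? "PROTEIN" : "DNA";
}

console.log(guessSequenceType("MRCMSELVVFKANELAISRY")); // contains E, L, F, I
console.log(guessSequenceType("ATGCGTTAA"));
// A short peptide made only of letters that are also nucleotide
// ambiguity codes gets misclassified as DNA:
console.log(guessSequenceType("MRCMS"));
```

The last call shows the failure mode: the first five residues of the sequence in this issue, `MRCMS`, are all valid nucleotide ambiguity codes, so a content-based guess would label that fragment as DNA.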

tnrich commented

Also this is the new repo that OVE lives in - https://github.com/TeselaGen/tg-oss

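> @mega-bisharp that file doesn't have a file extension that would indicate that it is a protein. We would need to guess based on the sequence content, which can sometimes be risky.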

So, what's the correct file extension for a protein sequence? Thank you for your answer!
@tnrich

tnrich commented

@mega-bisharp I believe the format you're looking for is .faa since your data is in the fasta format:

image

I'll actually need to update the code here https://github.com/TeselaGen/tg-oss/ (that's the new repo for ove/bio-parsers, this one is deprecated now) in order to handle .faa files correctly. I'll do that now.

tnrich commented

@mega-bisharp ok, I've updated @teselagen/ove to v0.3.11 which should include automatic parsing of .faa files to protein. Let me know if that works for you :)
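
For readers wondering what extension-based detection might look like, here is a rough sketch. It is not the actual code in tg-oss; `sequenceTypeFromFilename` and `SeqKind` are hypothetical names, and the mapping assumes the common FASTA conventions (`.faa` for amino acids, `.fna` and `.ffn` for nucleotides), with generic extensions such as `.fasta` left undecided.

```typescript
// Hypothetical sketch; the real @teselagen/ove logic may differ.
type SeqKind = "DNA" | "PROTEIN";

function sequenceTypeFromFilename(filename: string): SeqKind | undefined {
  // Take the text after the last dot, ignoring any directory part.
  const match = /\.([^.\/\\]+)$/.exec(filename.toLowerCase());
  const ext = match ? match[1] : "";
  switch (ext) {
    case "faa": // FASTA amino acid
      return "PROTEIN";
    case "fna": // FASTA nucleic acid
    case "ffn": // FASTA nucleotide, coding regions
      return "DNA";
    default:
      // .fasta, .fa, .txt, or no extension: the name alone does not say.
      return undefined;
  }
}

console.log(sequenceTypeFromFilename("seqdump.faa"));
console.log(sequenceTypeFromFilename("seqdump.fasta"));
```

Under these assumptions, renaming a file to end in `.faa` is enough to mark it as protein, while a name with a generic or missing extension still leaves the type undetermined.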