genome name in fasta idline
lynnjo opened this issue · 2 comments
Hi all. Is there a way to have the genome name included in the ID line when AGC outputs a FASTA file?
For example: I make a query to get chr1 from different genomes. This query might look like:
agc getctg assemblies.agc chr1@LineA chr1@LineB chr1@LineC > fasta.out
AGC's output shows id lines of ">chr1" for all 3 of these, which makes it difficult to distinguish which sequence belongs to which genome. We are hoping to use AGC for our research project, and this is a scenario that will frequently be encountered.
Any suggestions?
AGC keeps FASTA comments. I would recommend encoding the sample/species information there, so that you can identify the source of each sequence later.
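As a rough illustration of that suggestion, the sketch below tags each header line of a FASTA file with the genome name before the file is added to the archive, and reads the tag back when parsing output. It is only a minimal example: the `sample=` key, the function names, and the file names are hypothetical choices for illustration and are not part of AGC.

```python
import re

# NOTE: the "sample=" key is an assumed convention, not something AGC defines.
SAMPLE_TAG = re.compile(r"\bsample=(\S+)")


def tag_header(line: str, sample: str) -> str:
    """Append 'sample=<name>' as a comment to a FASTA header line.

    Sequence lines and already-tagged headers are returned unchanged.
    """
    if not line.startswith(">"):
        return line
    header = line.rstrip("\r\n")
    if SAMPLE_TAG.search(header):
        return line
    return f"{header} sample={sample}\n"


def tag_fasta(src_path: str, dst_path: str, sample: str) -> None:
    """Copy a FASTA file, tagging every header with the given sample name."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(tag_header(line, sample))


def sample_of(header: str):
    """Return the sample name stored in a header comment, or None if absent."""
    match = SAMPLE_TAG.search(header)
    return match.group(1) if match else None
```

For example, `tag_fasta("LineA.fa", "LineA.tagged.fa", "LineA")` (hypothetical file names) would turn a header `>chr1` into `>chr1 sample=LineA`, and `sample_of(">chr1 sample=LineA")` would recover `LineA` from a header that still carries the comment.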
Thank you - we'll try updating our fasta files and the code that parses AGC output.