remove columns with gaps in first line?
Closed this issue · 1 comments
dstern commented
I am wondering if trimAl can remove columns where there is a gap only in the first sequence of an alignment.
This feature would simplify the use of user-defined multiple sequence alignments in AlphaFold2, which requires the first sequence to have no gaps.
nicodr97 commented
Hi @dstern ,
There is currently no feature that does what you suggest, although it can be achieved with the following workaround:
- Use the -complementary -colnumbering arguments only on the first sequence, either by separating it into a new file:
  trimal -in first_seq.fasta -nogaps -complementary -colnumbering
  or by using:
  trimal -in msa.fasta -selectseqs { 1-numseqs } -complementary -colnumbering
  where numseqs is the total number of sequences minus 1. You will get a list with the positions of the columns that are gaps in the first sequence (after #ColumnsMap).
- Pass this list to -selectcols { list_of_gap_positions } in this way:
  trimal -in msa.fasta -selectcols { list_of_gap_positions }
  You will have to remove the whitespace between the commas and the column numbers in the output of the previous step.
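The whitespace cleanup in the last step could be scripted. Below is a minimal Python sketch, assuming the trimAl output contains a single line that starts with #ColumnsMap followed by comma-separated column numbers. The function name selectcols_argument and the sample input are hypothetical and only for illustration; check them against the output your trimAl version actually prints.

```python
import re


def selectcols_argument(trimal_output: str) -> str:
    """Build the '{ ... }' value for -selectcols from trimAl's column map.

    Assumption: the column list sits on one line starting with '#ColumnsMap'.
    """
    marker = "#ColumnsMap"
    for line in trimal_output.splitlines():
        if line.startswith(marker):
            # Keep only the column numbers; this drops the marker,
            # the commas and any whitespace between them.
            numbers = re.findall(r"\d+", line[len(marker):])
            if not numbers:
                raise ValueError("the #ColumnsMap line lists no columns")
            # Re-join with bare commas, as -selectcols expects.
            return "{ " + ",".join(numbers) + " }"
    raise ValueError("no #ColumnsMap line found in the output")


if __name__ == "__main__":
    # Made-up example of what the first step might print.
    sample = "#ColumnsMap\t0, 3, 7"
    print("trimal -in msa.fasta -selectcols", selectcols_argument(sample))
```

The printed line can then be run as the second command of the workaround. Raising an error when no columns are listed avoids calling trimAl with an empty selection.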
I hope this helps!