inab/trimal

remove columns with gaps in first line?

Closed this issue · 1 comments

I am wondering if trimAl can remove columns where there is a gap only in the first sequence of an alignment.
This feature would simplify the use of user-defined multiple sequence alignments in AlphaFold2, which requires the first sequence to have no gaps.

Hi @dstern,
There is currently no feature that does what you suggest, but it can be achieved with the following workaround:

  1. Run trimAl with the -complementary -colnumbering arguments on the first sequence only, either by extracting it into a new file (trimal -in first_seq.fasta -nogaps -complementary -colnumbering) or by using trimal -in msa.fasta -selectseqs { 1-numseqs } -complementary -colnumbering, where numseqs is the total number of sequences minus 1. The output lists, after #ColumnsMap, the positions of the columns that are gaps in the first sequence.
  2. Pass this list to -selectcols, like this: trimal -in msa.fasta -selectcols { list_of_gap_positions }. You will first have to remove the whitespace between the commas and the column numbers in the output of the previous step.
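
The whitespace cleanup in step 2 can be scripted. Below is a minimal Python sketch, not part of trimAl. It assumes the -colnumbering output contains a line that starts with #ColumnsMap followed by comma-separated column numbers. The helper names selectcols_argument and first_sequence_gap_columns are hypothetical. The second helper is an optional cross-check that finds the gap columns directly from the first aligned sequence, using 0-based positions (assumed here to match trimAl's column numbering).

```python
import re


def selectcols_argument(colnumbering_output):
    """Build a '{ a,b,c }' string for -selectcols from -colnumbering output.

    Assumption: the output has a line starting with '#ColumnsMap'
    followed by comma-separated column numbers.
    """
    marker = "#ColumnsMap"
    for line in colnumbering_output.splitlines():
        if line.startswith(marker):
            # Collect every integer after the marker; joining them with
            # bare commas removes the whitespace between the numbers.
            columns = re.findall(r"\d+", line[len(marker):])
            return "{ " + ",".join(columns) + " }"
    raise ValueError("no #ColumnsMap line found in the given output")


def first_sequence_gap_columns(first_sequence, gap_chars="-"):
    """Return the 0-based positions where the first aligned sequence has a gap."""
    return [i for i, residue in enumerate(first_sequence) if residue in gap_chars]


if __name__ == "__main__":
    # Made-up example output, for illustration only.
    example_output = "#ColumnsMap\t2, 5, 6, 10\n"
    print(selectcols_argument(example_output))
    print(first_sequence_gap_columns("AC--G-T"))
```

The string returned by selectcols_argument can be pasted after -selectcols in the command from step 2. Comparing it with the result of first_sequence_gap_columns on the first sequence is a quick way to confirm that the right columns will be removed.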

I hope this helps!