4ureliek/Fasta

Problem fetching sequences based on headers in a separate file using fasta_FetchSeqs.pl

Closed this issue · 5 comments

I can only use either the -m or the -file flag, not both together. If I use -m followed by a file name, the script matches the file name itself against the input file. If I provide only the -file flag, it gives the error "ERROR: please provide a word or a file (-m); type -h to see usage".
How can I resolve this?

Hi,

Could you share examples of command lines that are not working? I re-checked using -m together with -file:
perl /Users/aurelie/Documents/Taf/00_bin/My/fasta/FetchSeqs.pl -in test.fa -m ids -file -v

Log printed with -v:
--- Script to fetch fasta sequences started (v2.3), with
- input file = test.fa
- extraction of sequences based on matching with headers in file ids
- extraction will be based on exact match between header and the word(s) set with -m
--- Extracting sequences...
--- Done, sequences extracted in test.extract.fa

With files being:
test.fa:
>2 test1
MAFSAEDVLK
>3 test1
MSIIGATRLQ
>4 test2
MNAKYDTDQG
>5 test3
MDSLNEVCYE

ids:
2
3

And this outputs what I expect:
test.extract.fa:
>2 test1
MAFSAEDVLK
>3 test1
MSIIGATRLQ
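
For readers who want to see the matching logic spelled out, here is a minimal Python sketch of what the example above does. It is not the code of fasta_FetchSeqs.pl; it assumes that "exact match" means comparing the first whitespace-delimited word of each header with the lines of the ID file, and that a leading ">" on an ID line is optional. The function names `read_ids` and `fetch` are hypothetical.

```python
import io

def read_ids(handle):
    """Collect IDs from a list with one ID per line; a leading '>' is optional (assumption)."""
    ids = set()
    for line in handle:
        word = line.strip().lstrip(">")
        if word:
            ids.add(word)
    return ids

def fetch(fasta_handle, wanted):
    """Return the FASTA records whose first header word is in `wanted` (assumed matching rule)."""
    out, keep = [], False
    for line in fasta_handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            header = line[1:].split()
            # Decide once per record whether its sequence lines are kept.
            keep = bool(header) and header[0] in wanted
        if keep and line:
            out.append(line)
    return "\n".join(out)

# Same data as test.fa and ids in the example above.
fasta = """>2 test1
MAFSAEDVLK
>3 test1
MSIIGATRLQ
>4 test2
MNAKYDTDQG
>5 test3
MDSLNEVCYE
"""
ids = "2\n3\n"

extracted = fetch(io.StringIO(fasta), read_ids(io.StringIO(ids)))
print(extracted)
```

Under these assumptions the sketch keeps the records with IDs 2 and 3, which is the same content as test.extract.fa.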

Hi,
Thanks for your reply.
The input files are the same; the problem is in the "-m ids -file -v" part of the command line. When I write "ids" after -m, the script literally searches for the word "ids" in the input file.

I am quite confused, because this is exactly what I tried. Can you send me example files and the exact, full command line that reproduce your issue?

My input file:
>test1_orgname
MAFSAEDVLK
>test2_org
MSIIGATRLQ
>test1.1_org_name
MAFSAEDVLK

ids.txt:
>test1_orgname
>test2_org
>test1.1_org_name

Command line:
.../script/FetchSeqs.pl -in input.fasta -m ids.txt -file -v

Here is what I get by running exactly what you sent me, and it behaves as expected:

perl fasta_FetchSeqs.pl -in input.fa -m ids -file -v
--- Script to fetch fasta sequences started (v2.3), with
- input file = input.fa
- extraction of sequences based on matching with headers in file ids
- extraction will be based on exact match between header and the word(s) set with -m
--- Extracting sequences...
--- Done, sequences extracted in input.extract.fa

input.extract.fa:

>test1_orgname
MAFSAEDVLK
>test2_org
MSIIGATRLQ
>test1.1_org_name
MAFSAEDVLK

Input files were as follows:
input.fa:

>test1_orgname
MAFSAEDVLK
>test2_org
MSIIGATRLQ
>test1.1_org_name
MAFSAEDVLK

ids:

>test1_orgname
>test2_org
>test1.1_org_name
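
When a run like this returns nothing, one way to narrow it down is to check which requested IDs have no matching header in the FASTA file. The Python sketch below is a hypothetical helper, not part of fasta_FetchSeqs.pl. It assumes the same matching rule as the examples in this thread (first word of the header, with an optional leading ">" on each ID line), and it also shows the reported symptom: if the file name is treated as the word to match, it is simply not found.

```python
def fasta_ids(fasta_text):
    """First word of every header line in a FASTA text."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                ids.add(words[0])
    return ids

def missing_ids(fasta_text, ids_text):
    """IDs requested in ids_text that have no header in fasta_text, in input order."""
    present = fasta_ids(fasta_text)
    wanted = [line.strip().lstrip(">") for line in ids_text.splitlines()]
    return [w for w in wanted if w and w not in present]

# Same data as input.fa and ids above.
input_fa = """>test1_orgname
MAFSAEDVLK
>test2_org
MSIIGATRLQ
>test1.1_org_name
MAFSAEDVLK
"""
ids_file_content = ">test1_orgname\n>test2_org\n>test1.1_org_name\n"

# IDs read from the file: every one has a header in input.fa.
from_file = missing_ids(input_fa, ids_file_content)
# File name used as the word itself: no header is called "ids.txt".
as_word = missing_ids(input_fa, "ids.txt")
print(from_file, as_word)
```

An empty list in the first case means the ID file and the FASTA headers agree, so any remaining problem would lie in how the command line is parsed rather than in the files.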