How to create the input file -t
Closed this issue · 3 comments
Tcvalenzuela commented
Hi, thanks for your program. From the RepeatMasker outputs, it is not clear to me which one I should use as -t.
RepeatMasker offers an optional -gff output; is that the file required?
For the test file rmsk.bed, can you clarify what column 5 is? Then maybe I can construct the file myself from the other RepeatMasker outputs.
Thank you!
tianxiongbb commented
Hi Tomas,
TEMP2 accepts a “bed”-like file as the repeat mask input. The 5th column doesn’t need to be meaningful; you can put any value between 0 and 255 in it.
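As a rough illustration of what such a line might look like, here is a minimal Python sketch that checks a six-column BED-like record. The column order (chromosome, start, end, name, score, strand) follows the converter script posted later in this thread; the helper name `is_valid_bed6`, the example record, and the restriction of the strand to "+" or "-" are assumptions for illustration, not something taken from the TEMP2 documentation.

```python
def is_valid_bed6(line):
    """Return True if a line looks like a six-column BED-like record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        return False
    chrom, start, end, name, score, strand = fields
    # Coordinates and score must be non-negative integers
    if not (start.isdigit() and end.isdigit() and score.isdigit()):
        return False
    # Column 5 (score) may hold any value between 0 and 255
    return int(start) < int(end) and 0 <= int(score) <= 255 and strand in ("+", "-")


# Hypothetical record: the score column is simply filled with 0
example = "chr2L\t100\t250\texampleTE\t0\t+"
print(is_valid_bed6(example))  # prints True
```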
Best,
Tianxiong Yu
Postdoctoral Research Assistant
Weng Lab, Albert Sherman Center AS5-1079
Program in Bioinformatics and Integrative Biology
UMass Chan Medical School
Tcvalenzuela commented
Thanks for your quick reply!
I coded a workaround that starts from the RepeatMasker output "$Name.fna.out". Maybe it is of some use to someone else; please convert all spaces to tabs first.
import argparse

parser = argparse.ArgumentParser(description="Convert a RepeatMasker .fna.out table to BED-like lines")
parser.add_argument("--RMfnaout", "-RepeatMasker_fnaout", required=True, help="out of RepeatMasker .fna.out")
arg = parser.parse_args()

with open(arg.RMfnaout) as filehandle:
    for line in filehandle:
        # split() separates on tabs and on runs of spaces alike
        fields = line.split()
        # Skip blank lines and the header lines (their first field is not a numeric score)
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        Chr, startChr, endChr = fields[4], fields[5], fields[6]
        TEname = fields[9]
        # Skip simple repeats such as "(TG)n" and A-rich/G-rich low-complexity hits
        if TEname.startswith(("(", "A-rich", "G-rich")):
            continue
        # RepeatMasker marks the complement strand with "C"
        strand = "-" if fields[8].startswith("C") else fields[8]
        # RepeatMasker starts are 1-based; BED starts are 0-based
        print("\t".join([Chr, str(int(startChr) - 1), endChr, TEname, "0", strand]))
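For anyone who wants a tab-separated copy of the RepeatMasker table, the space-to-tab conversion mentioned above could be sketched in Python as follows. The helper name `spaces_to_tabs` and the sample row are hypothetical, made up only to show the column positions the script reads.

```python
import re


def spaces_to_tabs(line):
    """Drop leading/trailing whitespace and collapse runs of spaces or tabs into single tabs."""
    return re.sub(r"[ \t]+", "\t", line.strip())


# Hypothetical row in the space-aligned layout of a .fna.out file
row = "  1234  10.5  0.0  1.2  chr1  101  350  (5000)  C  exampleTE  LTR/Gypsy  (0)  250  1  1"
fields = spaces_to_tabs(row).split("\t")
# Columns 4-6 hold sequence name, begin, end; 8 is the strand; 9 is the repeat name
print(fields[4], fields[5], fields[6], fields[8], fields[9])
```

Stripping the line first matters because the table is right-aligned, so a leading run of spaces would otherwise become an empty first column and shift every index by one.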
tianxiongbb commented
Awesome, thanks for the contribution, Tomas!