weng-lab/TEMP2

It successfully ran the test data, but failed with real data.


Hi Dr. Yu,

TEMP2 ran the test data successfully, but failed on my real data: the program was killed while running the step "Calculate frequency of each transposon insertion". Is this caused by one of the earlier errors, or does bedtools really need that much RAM? bedtools used 40 GB of RAM; if it is the latter, I will switch to a machine with more memory.
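(I assume the "Killed" message means the kernel's out-of-memory killer stopped the process. A check along the lines below should confirm whether that is the case, though the exact log wording varies between systems and may need root:)

```bash
# Look for OOM-killer entries in the kernel log around the time the
# TEMP2 step died (message format differs between distributions).
dmesg -T | grep -iE "out of memory|oom-kill|killed process" | tail
```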

Best wishes,

Dan


TEMP2 insertion -l /home/danding/practice/workspace/data/GALG1_1.fq.gz -r /home/danding/practice/workspace/data/GALG1_2.fq.gz -i /home/danding/practice/workspace/data/GALG1.sort.bam -g /home/danding/practice/workspace/newbegin/7b/7b_genome.fa -R /home/danding/practice/workspace/newbegin/cdhit/7b_final.fa -t ./7b.bed -o ./GALG1 -c 5
Testing required softwares:
bwa: /home/danding/miniconda3/bin/bwa
samtools: /home/danding/miniconda3/bin/samtools
bedtools: /home/danding/miniconda3/bin/bedtools
------ Start pipeline ------
get concordant-uniq-split reads Fri Mar 15 10:01:03 CST 2024
[bam_sort_core] merging from 0 files and 5 in-memory blocks...
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 161088 reads
check fragment length Fri Mar 15 10:15:41 CST 2024
insert size set to 95 quantile: 477
get mate seq of the uniq-unpaired Fri Mar 15 10:15:42 CST 2024
[bam_sort_core] merging from 0 files and 5 in-memory blocks...
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 629970 reads
map paired split uniqMappers and unpaired uniqMappers to transposons Fri Mar 15 10:16:35 CST 2024
merge fragments in genome and transposon Fri Mar 15 10:16:50 CST 2024
pass1 - making usageList (58 chroms): 4 millis
pass2 - checking and writing primary data (20159 records, 6 fields): 35 millis
merge support reads in the same direction within 477 - 150 Fri Mar 15 10:17:02 CST 2024
merge support reads in different direction within 2 X 477 - 150 Fri Mar 15 10:17:12 CST 2024

filter candidate insertions which overlap with the same transposon insertion or in high depth region Fri Mar 15 10:17:14 CST 2024

Differing number of BED fields encountered at line: 3. Exiting...
Differing number of BED fields encountered at line: 3. Exiting...
Differing number of BED fields encountered at line: 3. Exiting...

filter candidate insertions in high depth region Fri Mar 15 10:17:14 CST 2024
average read number for 200bp bins is 81.287, set read number cutoff to 406.435
Filtered insertion number: 11191 - 11191 (overlap rmsk) 0 (short insertion) - 0 (high depth) = 0
generate the overall distribution of transposon mapping reads, first map all reads to transposon Fri Mar 15 10:27:57 CST 2024
sam to bed and bedGraph, multiple mappers are divided by their map times Fri Mar 15 11:16:57 CST 2024
[bam_sort_core] merging from 1 files and 5 in-memory blocks...
estimate de novo insertion number for each transposon using singleton reads Fri Mar 15 11:21:10 CST 2024

generate distribution figures for singleton supporting reads Fri Mar 15 11:21:12 CST 2024

Error in read.table(Args[8], header = F, row.names = NULL) :
no lines available in input
Execution halted

filter unreliable singleton insertions, also filter 2p insertions overlapped with similar reference transposon copies Fri Mar 15 11:21:13 CST 2024
Calculate frequency of each transposon insertion Fri Mar 15 11:21:13 CST 2024

**[bam_sort_core] merging from 23 files and 5 in-memory blocks...**

**Killed**

get TSD, remove redundant insertions and recalculate de novo insertion rate Fri Mar 15 11:54:33 CST 2024


***** ERROR: Requested column 2, but database file - only has fields 1 - 0.
GALG1.t is empty
calculate de novo insertion rate per genome Fri Mar 15 11:54:33 CST 2024
clean tmp files Fri Mar 15 11:54:33 CST 2024
Done, Congras!!!🍺🍺🍺

Hi Yu,

Your intuition was correct. I modified the BED file and it worked.
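In case anyone else hits the same "Differing number of BED fields encountered" error: bedtools raises it when a BED file does not have the same number of tab-separated columns on every line, and in my case the culprit was the transposon annotation passed with -t (7b.bed in the command above). A rough check like the one below (not the exact commands I ran) points out the offending lines:

```bash
# Count how many tab-separated fields each line of the transposon
# annotation has; every line should report the same number.
awk -F'\t' '{print NF}' 7b.bed | sort | uniq -c

# Print the lines whose field count differs from the first line,
# so they can be fixed or trimmed to a uniform column count.
awk -F'\t' 'NR==1{n=NF} NF!=n' 7b.bed
```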
Thank you for your patience! Wish you a happy life! 🍺🍺🍺

Dan