vaquerizaslab/fanc

FANC aggregate generates weird results

wangfuzhou110 opened this issue · 1 comments

Hi,

I'm using FAN-C to perform APA analysis on some loop calls, but the plot generated by the command looks odd, as shown here.

image

Here is the command I used:

fanc aggregate -m mES.matrix.txt -p mES.pdf --loop-strength mES.strength.txt -w 210kb --pixels 21 mES_10kb_filtered_agg.fanc mES.apa.bedpe mES.matrix

After adding the -e flag, the output looks fine:

fanc aggregate -m mES.matrix.txt -p mES.pdf --loop-strength mES.strength.txt -w 210kb --pixels 21 -e mES_10kb_filtered_agg.fanc mES.apa.bedpe mES.matrix

image

I'm wondering if there is any way to produce a normal aggregate plot using FAN-C, without the -e flag? I don't want to use the O/E matrix for APA.

The FAN-C version I used was 0.9.21, and the output of fanc aggregate is listed below:

2023-04-05 11:47:29,106 INFO FAN-C version: 0.9.21
2023-04-05 11:47:29,150 INFO Detected BEDPE. Running pairwise region extraction
2023-04-05 11:51:15,619 INFO Checking region pair validity...
2023-04-05 11:51:15,636 INFO 0/15896 region pairs are invalid
Matrices 100% (15896 of 15896) |#########################################################| Elapsed Time: 0:08:34 Time:  0:08:34
2023-04-05 12:12:44,542 WARNING 39 region pairs invalid, most likely due to missing chromosome data
2023-04-05 12:12:44,542 INFO Checking region pair validity...
2023-04-05 12:12:44,713 INFO 48/47688 region pairs are invalid
Matrices 100% (47640 of 47640) |#########################################################| Elapsed Time: 0:16:21 Time:  0:16:21
/home/me/anaconda3/envs/3dgenome/lib/python3.8/site-packages/fanc/architecture/aggregate.py:804: UserWarning: Warning: converting a masked element to nan.
  value = float(np.nansum(m)/np.nansum(np.logical_not(m.mask)))
/home/me/anaconda3/envs/3dgenome/lib/python3.8/site-packages/fanc/architecture/aggregate.py:829: RuntimeWarning: divide by zero encountered in log2
  ratios.append(np.log2(r))

Any help or suggestions would be greatly appreciated. Thank you!

Hi, loops occur at different distances from the diagonal, often at TAD corners. Local signal intensity varies enormously with distance from the diagonal. Without the expected value transformation (-e), which largely removes the distance effect, most of the signal in the aggregate will come from the distance-dependent interaction decay rather than from the loops themselves.
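To make the distance effect concrete, here is a toy sketch in plain NumPy. It is not FAN-C's implementation: the function names, the 1/(distance + 1) decay, the loop positions, and the 1.5-fold enrichment are all assumptions chosen for illustration. It averages windows around a few simulated loops, once on the raw matrix and once after dividing each pixel by the mean of its diagonal, which is one simple way to compute an observed/expected matrix.

```python
import numpy as np


def expected_by_distance(matrix):
    """Mean contact value on each diagonal, indexed by distance in bins."""
    n = matrix.shape[0]
    return np.array([np.diagonal(matrix, offset=d).mean() for d in range(n)])


def observed_over_expected(matrix):
    """Divide every pixel by the mean of the diagonal it lies on."""
    expected = expected_by_distance(matrix)
    i, j = np.indices(matrix.shape)
    return matrix / expected[np.abs(i - j)]


def aggregate(matrix, anchors, half_width):
    """Average the square windows centred on each (row, column) anchor pair."""
    windows = [
        matrix[a - half_width:a + half_width + 1, b - half_width:b + half_width + 1]
        for a, b in anchors
    ]
    return np.mean(windows, axis=0)


# Toy contact matrix: pure distance decay, no biological structure (assumption).
n = 400
i, j = np.indices((n, n))
matrix = 1.0 / (np.abs(i - j) + 1.0)

# Hypothetical loops at different distances from the diagonal,
# each enriched 1.5-fold over the background decay.
loops = [(50, 70), (100, 140), (200, 280), (300, 315)]
for a, b in loops:
    matrix[a, b] *= 1.5
    matrix[b, a] *= 1.5

half_width = 5  # 11 x 11 pixel windows
raw = aggregate(matrix, loops, half_width)
oe = aggregate(observed_over_expected(matrix), loops, half_width)

# Position of the strongest pixel in each aggregate.
raw_peak = np.unravel_index(raw.argmax(), raw.shape)
oe_peak = np.unravel_index(oe.argmax(), oe.shape)
print("raw aggregate peak:", raw_peak)
print("O/E aggregate peak:", oe_peak)
```

In this toy setup the strongest pixel of the raw aggregate sits in the window corner closest to the diagonal, because the decay outweighs the loop enrichment, while the O/E aggregate peaks at the centre pixel where the loops are.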

Please also read this excellent paper for details on the aggregate plots:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5639698/