XiaoTaoWang/EagleC

Failed to download pre-trained models

shenlinyong opened this issue · 7 comments

Dear author, your software is great! However, I'm having trouble with the download address: https://www.dropbox.com/s/zcir6ivvwe928yv/5M-10M.zip?dl=0 cannot be accessed with wget. Could you send me the 5M-10M.zip data set?

Hi, since users in mainland China may have trouble accessing Dropbox, I have migrated the pre-trained models to another location. Can you install the latest EagleC version, 0.1.6 (probably by running pip install -U eaglec), and re-run the download-pretrained-models command?
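Before re-running the download command, it can help to confirm which EagleC version is actually installed in the active environment. The following is a minimal sketch, assuming the package is distributed under the name eaglec (as in the pip command above) and that its version string is purely numeric and dot-separated; the helper names are illustrative and not part of EagleC.

```python
from importlib import metadata  # standard library since Python 3.8


def version_tuple(version):
    """Turn a dotted numeric version such as '0.1.6' into (0, 1, 6)."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(installed, minimum):
    """Return True when the installed version is older than the minimum."""
    return version_tuple(installed) < version_tuple(minimum)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


current = installed_version("eaglec")
if current is None:
    print("eaglec is not installed in this environment")
elif needs_upgrade(current, "0.1.6"):
    print(f"eaglec {current} is older than 0.1.6; run: pip install -U eaglec")
else:
    print(f"eaglec {current} is recent enough")
```

Comparing tuples of integers rather than raw strings avoids the trap where "0.1.10" sorts before "0.1.6" alphabetically.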

Thank you very much for your help. I have successfully downloaded the pre-trained models. However, I encountered a new problem that occurs only on chromosome 1; the other chromosomes were OK.

# Cool URI at 5kb = ../cnv_normalization/fat1_5000.cool
# Cool URI at 10kb = ../cnv_normalization/fat1_10000.cool
# Cool URI at 50kb = ../cnv_normalization/fat1_50000.cool
# Balance Type = ICE
# Reference Genome = other
# Included Chromosomes = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31'
# Probability Cutoff for 5kb SVs = 0.8
# Probability Cutoff for 10kb SVs = 0.8
# Probability Cutoff for 50kb SVs = 0.99999
# Output File Prefix = fat1
# Output Format = full
# Log file name = fat1.log
root                      INFO    @ 07/21/22 10:52:19: Predict SVs at 5kb resolution ...
root                      INFO    @ 07/21/22 10:52:21: matched sequencing depth in human at 10Kb: 18284.073006654307
root                      INFO    @ 07/21/22 10:52:21: Load CNN models from /home/SLY68/anaconda3/envs/EagleC/lib/python3.8/site-packages/eaglec/data/bulk/5M-10M ...
root                      INFO    @ 07/21/22 10:52:24: Done
root                      INFO    @ 07/21/22 10:52:24: Interemediate results at the 5kb resolution will be cached to .fat1_5000.cool.16251373.ICE.None.100000.None
eaglec.scoreUtils         INFO    @ 07/21/22 11:11:30: (chr1, chr1): Total 7046038 candidates left after filtering
tensorflow                WARNING @ 07/21/22 11:12:51: 5 out of the last 1004 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7fcf47e13c10> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
tensorflow                WARNING @ 07/21/22 11:12:51: 6 out of the last 1005 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7fcf4814d790> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
......
eaglec.scoreUtils         INFO    @ 07/21/22 13:03:25: (chr2, chr2): Total 5683376 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 14:51:53: (chr3, chr3): Total 4170773 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 16:07:15: (chr4, chr4): Total 3426162 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 17:08:44: (chr5, chr5): Total 2318773 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 17:44:10: (chr6, chr6): Total 1428199 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 18:02:59: (chr7, chr7): Total 1597804 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 18:23:00: (chr8, chr8): Total 1259789 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 18:39:26: (chr9, chr9): Total 1057478 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 18:52:51: (chr10, chr10): Total 869493 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 19:04:01: (chr11, chr11): Total 908876 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 19:15:38: (chr12, chr12): Total 926099 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 19:27:22: (chr13, chr13): Total 827170 candidates left after filtering

The other chromosomes are OK; only chr1 triggers this message. Note that chr1 is many times larger than the other chromosomes. The _1_100kb.bed file has 1901776 lines.
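The "candidates left after filtering" lines in the log carry both a timestamp and a candidate count, so the size difference between chromosomes can be read straight from the log. Below is a minimal sketch, assuming the log format shown above; the helper names are illustrative and not part of EagleC. The gap between two consecutive messages covers everything EagleC did in between, so it is only a rough proxy for the time spent per chromosome.

```python
import re
from datetime import datetime

# Three lines copied from the log above; in practice, read them from fat1.log.
LOG_LINES = """\
eaglec.scoreUtils         INFO    @ 07/21/22 11:11:30: (chr1, chr1): Total 7046038 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 13:03:25: (chr2, chr2): Total 5683376 candidates left after filtering
eaglec.scoreUtils         INFO    @ 07/21/22 14:51:53: (chr3, chr3): Total 4170773 candidates left after filtering
""".splitlines()

PATTERN = re.compile(
    r"@ (\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}): "
    r"\((\w+), (\w+)\): Total (\d+) candidates left after filtering"
)


def parse_candidates(lines):
    """Extract (chrom1, chrom2, candidate_count, timestamp) from matching lines."""
    records = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%m/%d/%y %H:%M:%S")
            records.append(
                (match.group(2), match.group(3), int(match.group(4)), stamp)
            )
    return records


def summarize(records):
    """Pair each record with the minutes elapsed until the next record."""
    rows = []
    for cur, nxt in zip(records, records[1:]):
        minutes = (nxt[3] - cur[3]).total_seconds() / 60
        rows.append((cur[0], cur[2], round(minutes, 1)))
    return rows


for chrom, count, minutes in summarize(parse_candidates(LOG_LINES)):
    print(f"{chrom}: {count} candidates, {minutes} min until the next message")
```

Non-matching lines, such as the TensorFlow warnings, are skipped by the regular expression, so the whole log file can be passed in unfiltered.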

Here is my running script:

predictSV --hic-5k ../cnv_normalization/fat1_5000.cool --hic-10k ../cnv_normalization/fat1_10000.cool --hic-50k ../cnv_normalization/fat1_50000.cool -O fat1 --genome other -C 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 Z --balance-type ICE --output-format full --prob-cutoff-5k 0.8 --prob-cutoff-10k 0.8 --prob-cutoff-50k 0.99999 --logFile fat1.log

Also, I had trouble using TADLib, the great software that you developed:
XiaoTaoWang/TADLib#16 (comment)
Thank you very much for your help.

This is my run script:

hitad -O fat.txt --exclude chrW,chrM -d fat_meta_file --logFile fat.log

This is the error message:

tadlib.hitad.genomeLev    DEBUG   @ 07/21/22 00:36:49:   Cache Chrom object into /storage/SLY68/2022/hic/juicer/down_analysis/tad/tadlib/.hitad/tmp6zmfixrg20220721003649 ...
root                      INFO    @ 07/21/22 00:36:52: Done!
root                      DEBUG   @ 07/21/22 00:36:52: Learning HMM parameters for each dataset ...
tadlib.hitad.genomeLev    DEBUG   @ 07/21/22 00:36:52:   resolution: 1000, rep1
Traceback (most recent call last):
  File "/home/SLY68/anaconda3/envs/tadlib/bin/hitad", line 121, in run
    G.learning(cpu_core=args.cpu_core)
  File "/home/SLY68/anaconda3/envs/tadlib/lib/python3.7/site-packages/tadlib/hitad/genomeLev.py", line 199, in learning
    seqs = self.train_data(res, rep)
  File "/home/SLY68/anaconda3/envs/tadlib/lib/python3.7/site-packages/tadlib/hitad/genomeLev.py", line 175, in train_data
    tmpcache.minWindows(0, tmpcache.chromLen, tmpcache._dw)
  File "/home/SLY68/anaconda3/envs/tadlib/lib/python3.7/site-packages/tadlib/hitad/chromLev.py", line 282, in minWindows
    diff = up - down
ValueError: operands could not be broadcast together with shapes (0,) (1454,)
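The ValueError at the end of this traceback is NumPy's broadcasting check rejecting the subtraction: one operand is empty (shape (0,)) while the other has 1454 elements. As a rough illustration of that rule only, here is a pure-Python sketch of the shape compatibility test; broadcast_shape is a hypothetical helper written for this note, not a NumPy or TADLib function, and it does not explain why one array ended up empty inside minWindows.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible.

    Two dimensions are compatible when they are equal or when one of them is 1.
    Shapes are aligned from the trailing axis; missing axes count as 1.
    """
    result = []
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(
                "operands could not be broadcast together with shapes "
                f"{shape_a} {shape_b}"
            )
    return tuple(reversed(result))


# A length-1 array stretches to match 1454 elements.
print(broadcast_shape((1,), (1454,)))

# An empty array does not: 0 is neither equal to 1454 nor equal to 1.
try:
    broadcast_shape((0,), (1454,))
except ValueError as err:
    print(err)
```

In other words, a length-0 operand can only be combined with another length-0 or length-1 operand, which is why the subtraction fails as soon as one side is empty.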

Chromosome 1 also seems OK to me. In any case, this is just a warning message. Your program finished running without error, right? Regarding the TADLib issue, since I couldn't replicate the error, I have no idea why this happens. It would be great if you could share your cool file with me.

Hello, I am also having trouble downloading the pre-trained models. I am using version 0.1.7, and I run into this error when I try to download the models:

$ download-pretrained-models
--2022-08-08 22:37:35--  https://yuelab.fsm.northwestern.edu/share/eagleC/model/bulk.zip
Resolving yuelab.fsm.northwestern.edu (yuelab.fsm.northwestern.edu)... 165.124.83.33
Connecting to yuelab.fsm.northwestern.edu (yuelab.fsm.northwestern.edu)|165.124.83.33|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2022-08-08 22:37:36 ERROR 404: Not Found.

Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/EagleC/bin/download-pretrained-models", line 25, in <module>
    download_and_unzip(subfolder, weblinks[subfolder])
  File "/home/ubuntu/miniconda3/envs/EagleC/bin/download-pretrained-models", line 13, in download_and_unzip
    subprocess.check_call(' '.join(command), shell=True)
  File "/home/ubuntu/miniconda3/envs/EagleC/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'wget -O /home/ubuntu/miniconda3/envs/EagleC/lib/python3.8/site-packages/eaglec/data/bulk.zip -L https://yuelab.fsm.northwestern.edu/share/eagleC/model/bulk.zip' returned non-zero exit status 8.
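The "non-zero exit status 8" in this traceback is wget's own exit code, and the wget manual defines 8 as "Server issued an error response", which matches the 404 shown above. The sketch below maps wget's documented exit codes to their meanings and wraps a download call so that the failure reason is reported directly. The download helper is hypothetical and is not how download-pretrained-models is implemented; its wget arguments simply mirror the command shown in the traceback.

```python
import subprocess

# Exit statuses as listed in the wget manual.
WGET_EXIT_CODES = {
    0: "No problems occurred",
    1: "Generic error code",
    2: "Parse error",
    3: "File I/O error",
    4: "Network failure",
    5: "SSL verification failure",
    6: "Username/password authentication failure",
    7: "Protocol errors",
    8: "Server issued an error response",
}


def describe_wget_failure(returncode):
    """Translate a wget exit status into its documented meaning."""
    return WGET_EXIT_CODES.get(returncode, "Unknown exit status")


def download(url, dest):
    """Hypothetical wrapper: run wget and raise a readable error on failure."""
    try:
        # Passing a list avoids shell quoting issues with shell=True.
        subprocess.check_call(["wget", "-O", dest, "-L", url])
    except subprocess.CalledProcessError as err:
        reason = describe_wget_failure(err.returncode)
        raise RuntimeError(
            f"wget exited with status {err.returncode}: {reason}"
        ) from err


print(describe_wget_failure(8))
```

An exit status of 8 points at the server side (for example a moved or missing file) rather than at the local network or installation, which is consistent with the models having been relocated.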

Sorry, we were migrating the pre-trained models to another server. Could you upgrade EagleC to v0.1.8 (pip install -U eaglec) and try again?

It works now after updating the version! Thank you for your help!