kundajelab/chrombpnet

Invalid interval bounds when predicting bigwigs with ground truth

hdbeukel opened this issue · 5 comments

I have successfully gone through the tutorial and am now experimenting with the trained model to make predictions with chrombpnet pred_bw. It works fine until I provide a reference bigwig to be used as ground truth with -bw, in which case I get the following error:

Traceback (most recent call last):
  File "/opt/conda/bin/chrombpnet", line 33, in <module>
    sys.exit(load_entry_point('chrombpnet', 'console_scripts', 'chrombpnet')())
  File "/scratch/chrombpnet/chrombpnet/CHROMBPNET.py", line 56, in main
    predict_to_bigwig.main(args)
  File "/scratch/chrombpnet/chrombpnet/evaluation/make_bigwigs/predict_to_bigwig.py", line 163, in main
    compare_with_observed(args.bigwig, regions_df, regions, outputlen, 
  File "/scratch/chrombpnet/chrombpnet/evaluation/make_bigwigs/predict_to_bigwig.py", line 57, in compare_with_observed
    obs_data = data_utils.get_cts(regions_df,obs_bw,outputlen)
  File "/scratch/chrombpnet/chrombpnet/training/utils/data_utils.py", line 31, in get_cts
    vals.append(np.nan_to_num(bw.values(r['chr'], 
RuntimeError: Invalid interval bounds!

I am running chrombpnet with Apptainer using the provided Docker image:

apptainer exec  --nv -e \
                --no-mount /scratch \
                --bind ${data_dir}:/data \
                --bind ${model_dir}:/models \
                --bind ${out}:/output \
                $singularity_img_name \
                chrombpnet pred_bw \
                -bm /models/bias_model_scaled.h5 \
                -cm /models/chrombpnet.h5 \
                -cmb /models/chrombpnet_nobias.h5 \
                -r /data/CONTROL.mRp.clN_peaks.narrowPeak \
                -g /data/ath.fasta \
                -c /data/ath.chrom.sizes.txt \
                -op /output/ath_root_CONTROL \
                -bw /data/CONTROL.mRp.clN.bigWig \

If I remove the -bw option everything works fine, but I would like to get the additional metrics calculated based on the ground truth. Any idea what could be the reason for getting this error?

Hey,
I am getting a similar error, but at an earlier step, when trying to train the model with the chrombpnet pipeline command.
Have you found a solution?

Estimating enzyme shift in input file
Current estimated shift: +4/-5
awk -v OFS="\t" '{if ($6=="+"){print $1,$2+0,$3,$4,$5,$6} else if ($6=="-") {print $1,$2,$3+1,$4,$5,$6}}' | sort -k1,1 | bedtools genomecov -bg -5 -i stdin -g mm10.chrom.sizes | LC_COLLATE="C" sort -k1,1 -k2,2n
Making BedGraph (Filter chromosomes not in reference fasta)
Making Bigwig
Traceback (most recent call last):
  File "/opt/conda/bin/chrombpnet", line 33, in <module>
    sys.exit(load_entry_point('chrombpnet', 'console_scripts', 'chrombpnet')())
  File "/scratch/chrombpnet/chrombpnet/CHROMBPNET.py", line 23, in main
    pipelines.chrombpnet_train_pipeline(args)
  File "/scratch/chrombpnet/chrombpnet/pipelines.py", line 31, in chrombpnet_train_pipeline
    build_pwm_from_bigwig.main(args)
  File "/scratch/chrombpnet/chrombpnet/helpers/preprocessing/analysis/build_pwm_from_bigwig.py", line 56, in main
    bigwig_vals = np.nan_to_num(bw.values(args.chr,0,chr_size ))
RuntimeError: Invalid interval bounds!

I haven't found a solution yet. I was hoping to get some feedback here. If this issue is not resolved, I will not be able to let chrombpnet compute the metrics that evaluate my predictions against the ground truth, and I guess I will have to compute some metrics myself. I would really prefer to use the built-in metrics, though. Does any of the developers have an idea why we get this error?

Hey again,
I managed to fix my issue. One row in my peak file had the wrong coordinates (start greater than end).
I found it by trying to create a bigwig file independently of the chrombpnet pipeline: I used the bedGraphToBigWig tool, and its error message specified which line had the bad peak coordinates. I removed that line and now it runs fine.
Maybe this will work for you too? 🤞
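
If it helps, the same kind of check can be sketched in plain Python before running anything else. This is only a minimal sketch, not part of chrombpnet: the function name find_bad_intervals is made up for illustration, and it assumes a tab-separated BED/narrowPeak file with chromosome, start and end in the first three columns.

```python
def find_bad_intervals(lines):
    """Return (line_number, reason, line) for rows with suspicious coordinates.

    Hypothetical helper for illustration; assumes tab-separated rows with
    chromosome, start and end in the first three columns.
    """
    bad = []
    for n, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and common header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            bad.append((n, "fewer than 3 columns", line))
            continue
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            bad.append((n, "non-integer coordinates", line))
            continue
        if start < 0:
            bad.append((n, "negative start", line))
        elif start >= end:
            bad.append((n, "start is not smaller than end", line))
    return bad


# Usage (the path is a placeholder):
# with open("peaks.narrowPeak") as handle:
#     for n, reason, line in find_bad_intervals(handle):
#         print(f"line {n}: {reason}: {line}")
```

Any row it reports can then be fixed or removed from the peak file before rerunning the command.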

Hello,

Yes, one of the regions in your bed file is probably running past the chromosome bounds once it is extended to the 2114 bp input length. Do read the NOTE in the wiki README here: https://github.com/kundajelab/chrombpnet/wiki/Generate-prediction-bigwigs

I also agree with @marzamKI: this error occurs when your bed files have erroneous coordinates.
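
A rough way to look for regions near chromosome edges is sketched below. It rests on assumptions that are not confirmed in this thread: that each region is treated as a window of the input length (2114) centered on start + summit, with the summit offset taken from the 10th narrowPeak column, and that chromosome lengths come from the chrom sizes file. The function name find_out_of_bounds is hypothetical and is not a chrombpnet function.

```python
INPUT_LEN = 2114  # input length mentioned in this thread


def find_out_of_bounds(peak_lines, chrom_sizes, input_len=INPUT_LEN):
    """Return (line_number, reason) for windows that leave their chromosome.

    Assumption for illustration: the window is centered on start + summit,
    where summit is the 10th narrowPeak column.
    """
    flagged = []
    half = input_len // 2
    for n, line in enumerate(peak_lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # not a 10-column narrowPeak row; ignored in this sketch
        chrom, start, summit = fields[0], int(fields[1]), int(fields[9])
        center = start + summit
        lo, hi = center - half, center + half
        if chrom not in chrom_sizes:
            flagged.append((n, "chromosome not in sizes file"))
        elif lo < 0 or hi > chrom_sizes[chrom]:
            flagged.append((n, f"window [{lo}, {hi}) leaves {chrom}"))
    return flagged


def read_chrom_sizes(lines):
    """Parse 'name<TAB>length' rows into a dict."""
    sizes = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            sizes[parts[0]] = int(parts[1])
    return sizes
```

Rows flagged this way are candidates to drop from the regions file. If the tool expands regions differently, the window arithmetic above would need to be adjusted accordingly.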