Clodius aggregate with custom assembly: TypeError: Can't broadcast
liz-is opened this issue · 3 comments
Hi there,
I'm working with Drosophila data aligned to dm6 from FlyBase, which uses Ensembl-style chromosome names (i.e., no 'chr' prefix). Because negspy only includes UCSC-style assemblies, using --assembly dm6 gives errors like KeyError: 'X'.
So I'm using --chromsizes-filename to specify a file that contains chrom sizes for my genome version, restricted to the main chromosomes, since my bedgraph has already been filtered to contain only those. Here's the command I'm running and the output:
clodius aggregate bedgraph test_Rep1_10kb_corrected_pc.eigenvector.bed \
--output-file test_Rep1_10kb_corrected_pc.eigenvector.hitile \
--chromosome-col 1 --from-pos-col 2 --to-pos-col 3 --value-col 5 \
--chromsizes-filename dm6_chrom_sizes_sanitized.txt --nan-value nan --no-header
output file: test_Rep1_10kb_corrected_pc.eigenvector.hitile
assembly_size: 137547960
assembly: hg19
assembly size (max-length) 137547960
max-width 268435456
max_zoom: 18
chunk-size: 16777216
chrom-order [b'2L' b'2R' b'3L' b'3R' b'4' b'X' b'Y']
len(values): 110458336 16777216
line: X 1 120000 A 0.0 .
position: 1 progress: 0.00 elapsed: 8.87 remaining: 1220465716.46
len(data_buffers[curr_zoom]) 16777216
positions[curr_zoom]: 0
len(values): 93681120 16777216
line: X 1 120000 A 0.0 .
[some output removed]
Traceback (most recent call last):
File "/home/research/vaquerizas/liz/test/env/bin/clodius", line 11, in <module>
load_entry_point('clodius==0.10.8', 'console_scripts', 'clodius')()
File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/clodius/cli/aggregate.py", line 1322, in bedgraph
chromsizes_filename, zoom_step)
File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/clodius/cli/aggregate.py", line 938, in _bedgraph
values[:chunk_size], nan_values[:chunk_size]
File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/clodius/cli/aggregate.py", line 842, in add_values_to_data_buffers
dsets[curr_zoom][curr_pos:curr_pos+chunk_size] = curr_chunk
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/h5py/_hl/dataset.py", line 707, in __setitem__
for fspace in selection.broadcast(mshape):
File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/h5py/_hl/selections.py", line 299, in broadcast
raise TypeError("Can't broadcast %s -> %s" % (target_shape, self.mshape))
TypeError: Can't broadcast (16777216,) -> (3330232,)
Any suggestions would be appreciated! I was wondering whether this is also related to #87?
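One observation on the numbers in the log, offered as a guess rather than a diagnosis: the target shape in the error, 3330232, is exactly what is left of assembly_size after dividing by chunk-size. The sketch below only redoes that arithmetic with the values printed above. It assumes (the clodius source is not shown here) that the output dataset is assembly_size elements long, so the last slice is shorter than a full chunk.

```python
# Values copied from the log output above.
assembly_size = 137547960
chunk_size = 16777216

# How many full chunks fit, and how many positions are left over.
full_chunks, remainder = divmod(assembly_size, chunk_size)
print(full_chunks, remainder)

# ASSUMPTION: the dataset has assembly_size elements. Slicing past the end
# is silently truncated, so the final slice is shorter than chunk_size.
last_start = full_chunks * chunk_size
slice_len = len(range(assembly_size)[last_start:last_start + chunk_size])
print(slice_len)
```

If that assumption holds, the failing write is a full 16777216-element chunk being assigned to a 3330232-element slice, which matches the shapes in the TypeError.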
Hey, it sounds like you're doing everything right. Would you mind trying to convert to a bigWig and ingesting that instead?
https://docs.higlass.io/data_preparation.html#creating-bigwig-files
We need to either deprecate the clodius aggregate bedgraph functionality or change it to just output bigWig files.
Oh, I didn't realise it was possible to ingest bigwig files directly! Is that new, or did I just completely miss it? I'll give that a try then. It's also nice not to have to create the extra file :)
Ingesting bigWig files directly works well as long as I provide an appropriate chrom.sizes file as well, so I'll close this.
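For anyone who hits the same KeyError: 'X' style naming mismatch, here is a minimal sketch of a sanity check that the chromosome names in a bedgraph all appear in the chrom.sizes file. It is not part of clodius or HiGlass; the function names (read_chrom_names, bedgraph_chroms, missing_chroms) are hypothetical, and it assumes whitespace-delimited files with the chromosome name in the first column of the chrom.sizes file.

```python
def read_chrom_names(chromsizes_path):
    """Return chromosome names from a chrom.sizes file (name in column 1)."""
    names = []
    with open(chromsizes_path) as handle:
        for line in handle:
            fields = line.split()
            if fields:
                names.append(fields[0])
    return names


def bedgraph_chroms(bedgraph_path, chrom_col=1):
    """Return the set of chromosome names used in a headerless bedgraph.

    chrom_col is 1-based, mirroring the --chromosome-col option above.
    """
    seen = set()
    with open(bedgraph_path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= chrom_col:
                seen.add(fields[chrom_col - 1])
    return seen


def missing_chroms(bedgraph_path, chromsizes_path, chrom_col=1):
    """Names present in the bedgraph but absent from the chrom.sizes file."""
    known = set(read_chrom_names(chromsizes_path))
    return sorted(bedgraph_chroms(bedgraph_path, chrom_col) - known)
```

An empty list from missing_chroms means every chromosome in the bedgraph has a size entry. A non-empty list (for example 'chrX' when the sizes file only has 'X') points at a prefix mismatch.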