BIMSBbioinfo/janggu

String decoding error when running the CAGE prediction example

Closed this issue · 3 comments

Hello Wolfgang,

I've been trying to run the 03_cage_prediction Jupyter notebook example with the data provided in the 01_cage_downloads.ipynb file, but I'm getting a string decoding error during training. I'm not sure where the error comes from, since I'm not using my own data. I'm using Python 3.7, bedtools v2.30.0, TensorFlow 1.14.0, and Keras 2.2.5.

I've pasted the training output below, and the error is at the bottom of the output. I've omitted the training status and network architecture output:


Namespace(evaluate=False, inputpath='../data', order=2, outputpath='../')
####################
Test effect of scanning single or both strands and higher-order motifs
{'seq_dropout': 0.2, 'dnaflank': 350, 'nmotifs1': 10, 'motiflen': 15, 'pool1': 5, 'nmotifs2': 8, 'hypermotiflen': 5, 'dnaseflank': 200, 'inception': False, 'traincell': 'hepg2', 'trainrep': 'rep1', 'cageflank': 400, 'opt': 'amsgrad', 'epochs': 100, 'run': 1, 'val_chrom': 'chr2', 'order': 2, 'pretrained': False, 'stranded': 'double', 'inputs': 'dna_only'}

(Network training status is omitted to save space)

########################################
val_loss: 0.7199995372668806
val_mean_squared_error: 0.8274923270485967
loss: 0.6398879488643304
mean_squared_error: 0.7002699232404976
########################################
Model: "janggu"

(Network architecture info is omitted to save space)

=================================================================
Total params: 2,899
Trainable params: 2,863
Non-trainable params: 36


{'seq_dropout': 0.2, 'dnaflank': 350, 'nmotifs1': 10, 'motiflen': 15, 'pool1': 5, 'nmotifs2': 8, 'hypermotiflen': 5, 'dnaseflank': 200, 'inception': False, 'traincell': 'hepg2', 'trainrep': 'rep1', 'cageflank': 400, 'opt': 'amsgrad', 'epochs': 100, 'run': 1, 'val_chrom': 'chr2', 'order': 2, 'pretrained': False, 'stranded': 'double', 'inputs': 'epi_only'}

(Network training status is omitted to save space)

########################################
val_loss: 0.48243483295402373
val_mean_squared_error: 0.37687083796324977
loss: 0.4691087382861464
mean_squared_error: 0.38518023471576845
########################################
Model: "janggu"

(Network architecture info is omitted to save space)

Total params: 11
Trainable params: 7
Non-trainable params: 4


{'seq_dropout': 0.2, 'dnaflank': 350, 'nmotifs1': 10, 'motiflen': 15, 'pool1': 5, 'nmotifs2': 8, 'hypermotiflen': 5, 'dnaseflank': 200, 'inception': False, 'traincell': 'hepg2', 'trainrep': 'rep1', 'cageflank': 400, 'opt': 'amsgrad', 'epochs': 100, 'run': 1, 'val_chrom': 'chr2', 'order': 2, 'pretrained': True, 'stranded': 'double', 'inputs': 'epi_dna'}

Traceback (most recent call last):
  File "cage_prediction.py", line 328, in <module>
    res = objective(shared_space)
  File "cage_prediction.py", line 202, in objective
    dnam = Janggu.create_by_name('cage_promoters_dna_only')
  File "/Users/AHNSF9/Anaconda/anaconda3/envs/janggu1/lib/python3.7/site-packages/janggu/model.py", line 240, in create_by_name
    model = load_model(path, custom_objects=custom_objects)
  File "/Users/AHNSF9/Anaconda/anaconda3/envs/janggu1/lib/python3.7/site-packages/keras/engine/saving.py", line 458, in load_wrapper
    return load_function(*args, **kwargs)
  File "/Users/AHNSF9/Anaconda/anaconda3/envs/janggu1/lib/python3.7/site-packages/keras/engine/saving.py", line 550, in load_model
    model = _deserialize_model(h5dict, custom_objects, compile)
  File "/Users/AHNSF9/Anaconda/anaconda3/envs/janggu1/lib/python3.7/site-packages/keras/engine/saving.py", line 242, in _deserialize_model
    model_config = json.loads(model_config.decode('utf-8'))
AttributeError: 'str' object has no attribute 'decode'


Thanks in advance.

What h5py version are you using? Maybe try a different one. I've run into similar problems with h5py. Let us know if that helps.
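For anyone else checking their environment, here is a minimal sketch for printing the installed versions of the packages involved. The helper name `installed_version` is made up for illustration, and on Python 3.7 the fallback import assumes the `importlib_metadata` backport is installed.

```python
try:
    # Standard library on Python 3.8 and later.
    from importlib.metadata import version, PackageNotFoundError
except ImportError:
    # Assumption: on Python 3.7 the importlib_metadata backport is available.
    from importlib_metadata import version, PackageNotFoundError


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Packages mentioned in this thread; h5py is the one to look at first.
    for name in ("h5py", "keras", "tensorflow"):
        print(name, installed_version(name) or "not installed")
```

Running `pip list` or `conda list` in the same environment gives the same information.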

wkopp commented

Yes, it might be related to this issue: keras-team/keras#14265

I was using h5py 3.2.0. I don't see the error anymore after downgrading to 2.8.0. Thanks!
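For later readers, a rough sketch of why the downgrade probably helps. As far as I understand, h5py 3.x returns stored string attributes as `str` where 2.x returned `bytes`, so the `.decode('utf-8')` call in the Keras 2.2.5 loader shown in the traceback fails. The function names below are hypothetical, written only to illustrate the mismatch. They are not part of Keras, h5py, or janggu.

```python
import json


def h5py_returns_bytes(version_string):
    """Assumption: h5py releases before 3.0 hand back string attributes as bytes."""
    major = int(version_string.split(".")[0])
    return major < 3


def load_config(raw):
    """Parse a JSON model config whether it arrives as bytes or as str.

    The loader in the traceback calls raw.decode('utf-8') unconditionally,
    which raises AttributeError when raw is already a str.
    """
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    return json.loads(raw)


if __name__ == "__main__":
    # The two versions reported in this thread.
    for v in ("2.8.0", "3.2.0"):
        print(v, "bytes" if h5py_returns_bytes(v) else "str")

    # Both forms parse to the same config with the tolerant helper.
    print(load_config(b'{"class_name": "Model"}'))
    print(load_config('{"class_name": "Model"}'))
```

In practice, the simplest workaround with this Keras version is to pin h5py below 3 in the environment (for example `pip install "h5py<3"`), which matches the downgrade to 2.8.0 reported above.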