jcjohnson/torch-rnn

'Unsupported HDF5 version: 1.10.1' when running train.lua

ajakk opened this issue · 5 comments

ajakk commented

I can't start training because of an unsupported HDF5 version error. I installed the torch-hdf5 library following the instructions in the README. My H5 and JSON training data files are both in the data/ directory. The command I'm using and the error it throws are below.

> th train.lua -input_h5 data/trainingData.h5 -input_json data/trainingData.json
/home/jake/Downloads/torch/install/bin/luajit: ...ake/Downloads/torch/install/share/lua/5.1/trepl/init.lua:389: ...ake/Downloads/torch/install/share/lua/5.1/trepl/init.lua:389: .../jake/Downloads/torch/install/share/lua/5.1/hdf5/ffi.lua:71: Unsupported HDF5 version: 1.10.1
stack traceback:
    [C]: in function 'error'
        ...ake/Downloads/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
    train.lua:6: in main chunk
    [C]: in function 'dofile'
    ...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x7a76420520

ajakk commented

Derp. I'm running Arch Linux, and it seems this program requires an HDF5 library that predates the current one on the Arch repos. I've now gotten past this problem by using hdf5-1.8.14-1.

I fixed the problem by changing the version number that the check requires, so that it accepts HDF5 1.10.

Edit /Users//torch/install/share/lua/5.2/hdf5/ffi.lua and change
"if maj[0] ~= 1 or min[0] ~= 8 then"
to
"if maj[0] ~= 1 or min[0] ~= 10 then"

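The ffi.lua edit described above can also be scripted. This is a minimal sketch, not an official fix: `patch_hdf5_check` is a hypothetical helper name, and the path to ffi.lua is whatever your own Torch install uses. It only rewrites the check quoted in this thread. It does not make torch-hdf5 compatible with HDF5 1.10 beyond skipping the error, so reading some files may still fail.

```shell
# Hypothetical helper: rewrite the minor-version check quoted in the thread.
# $1 is the path to the installed hdf5/ffi.lua (the location varies per install).
patch_hdf5_check() {
    ffi_file="$1"
    # Stop if the expected line is not present, rather than editing blindly.
    grep -q 'min\[0\] ~= 8 then' "$ffi_file" || {
        echo "expected version check not found in $ffi_file" >&2
        return 1
    }
    # -i.bak edits the file in place and keeps the original as <file>.bak.
    sed -i.bak 's/min\[0\] ~= 8 then/min[0] ~= 10 then/' "$ffi_file"
}
```

Usage would look like `patch_hdf5_check ~/torch/install/share/lua/5.1/hdf5/ffi.lua`, with the path adjusted to your install. The untouched original stays next to it as ffi.lua.bak, so the change is easy to revert.
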
Sisim commented

My Ubuntu version was 18 LTS.
I downloaded the right version, 1.8.12, from here:
https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8/hdf5-1.8.12/obtain51812.html
Then I just edited
~torch/install/share/lua/5.1/hdf5/config.lua
to point to the new paths of
hdf5.h and libhdf5.so.
You can find both paths by searching the downloaded directory.
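
To locate the two files that config.lua needs to point at, something like the following may help. It is a sketch that assumes a standard `find`. `find_hdf5_paths` is a hypothetical helper name, and the directory argument is wherever you unpacked or installed HDF5 1.8.12.

```shell
# Hypothetical helper: list candidate header and shared-library paths
# under the directory where HDF5 1.8.12 was unpacked or installed.
find_hdf5_paths() {
    hdf5_root="$1"
    # The C header that torch-hdf5 reads.
    find "$hdf5_root" -type f -name 'hdf5.h'
    # libhdf5.so is often a symlink to a versioned file, so match both.
    find "$hdf5_root" \( -type f -o -type l \) -name 'libhdf5.so*'
}
```

Running `find_hdf5_paths /path/to/hdf5-1.8.12` prints one path per line. If several libhdf5.so* entries show up, they usually refer to the same library under different version suffixes.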

@Sisim You are a saviour! Your solution works!!!!

I did the same, but now I am encountering a new error. I searched for it but have not been able to fix it. My command and the error it produces are pasted below.

/data/tools/Basset-master/src/basset_train.lua -job pretrained_params.txt -save cd4_cnn out.h5
{
  conv_filter_sizes : 
    {
      1 : 19
      2 : 11
      3 : 7
    }
  weight_norm : 7
  momentum : 0.98
  learning_rate : 0.002
  hidden_units : 
    {
      1 : 1000
      2 : 1000
    }
  conv_filters : 
    {
      1 : 300
      2 : 200
      3 : 200
    }
  hidden_dropouts : 
    {
      1 : 0.3
      2 : 0.3
    }
  pool_width : 
    {
      1 : 3
      2 : 4
      3 : 4
    }
}
seq_len: 600, filter_size: 19, pad_width: 18	
seq_len: 200, filter_size: 11, pad_width: 10	
seq_len: 50, filter_size: 7, pad_width: 6	
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> output]
  (1): nn.SpatialConvolution(4 -> 300, 19x1, 1,1, 9,0)
  (2): nn.SpatialBatchNormalization (4D) (300)
  (3): nn.ReLU
  (4): nn.SpatialMaxPooling(3x1, 3,1)
  (5): nn.SpatialConvolution(300 -> 200, 11x1, 1,1, 5,0)
  (6): nn.SpatialBatchNormalization (4D) (200)
  (7): nn.ReLU
  (8): nn.SpatialMaxPooling(4x1, 4,1)
  (9): nn.SpatialConvolution(200 -> 200, 7x1, 1,1, 3,0)
  (10): nn.SpatialBatchNormalization (4D) (200)
  (11): nn.ReLU
  (12): nn.SpatialMaxPooling(4x1, 4,1)
  (13): nn.Reshape(2600)
  (14): nn.Linear(2600 -> 1000)
  (15): nn.BatchNormalization (2D) (1000)
  (16): nn.ReLU
  (17): nn.Dropout(0.300000)
  (18): nn.Linear(1000 -> 1000)
  (19): nn.BatchNormalization (2D) (1000)
  (20): nn.ReLU
  (21): nn.Dropout(0.300000)
  (22): nn.Linear(1000 -> 1)
  (23): nn.Sigmoid
}
/data/torch/install/bin/luajit: /data/torch/install/share/lua/5.1/hdf5/ffi.lua:335: Reading data of class ENUM(50331749) is unsupported
stack traceback:
	[C]: in function 'error'
	/data/juhi/torch/install/share/lua/5.1/hdf5/ffi.lua:335: in function '_getTorchType'
	/data/juhi/torch/install/share/lua/5.1/hdf5/dataset.lua:88: in function 'getTensorFactory'
	/data/juhi/torch/install/share/lua/5.1/hdf5/dataset.lua:138: in function 'partial'
	/data/juhi/tools/Basset-master/src/batcher.lua:39: in function 'next'
	/data/juhi/tools/Basset-master/src/convnet.lua:1009: in function 'train_epoch'
	/data/juhi/tools/Basset-master/src/basset_train.lua:156: in main chunk
	[C]: in function 'dofile'
	...juhi/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x004064f
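
The error above says torch-hdf5 was asked to read a dataset whose HDF5 datatype class is ENUM. One common source of such datasets is h5py, which stores NumPy boolean arrays as an HDF5 enum. Whether that applies to the file used here is an assumption. As a rough way to see which datasets are affected, the sketch below scans the header text printed by `h5dump -H` (from the HDF5 command-line tools, assumed to be installed). `list_enum_datasets` is a hypothetical helper name, and the parsing assumes the usual `DATASET "name" {` and `DATATYPE  H5T_ENUM {` layout of that output.

```shell
# Hypothetical helper: read `h5dump -H` header text on stdin and print the
# names of datasets whose datatype is an HDF5 enum.
list_enum_datasets() {
    awk '
        # Remember the most recent dataset name, without its quotes.
        $1 == "DATASET"   { name = $2; gsub(/"/, "", name) }
        # An enum datatype seen before the dataspace belongs to that dataset.
        /H5T_ENUM/ && name != "" { print name; name = "" }
        # The dataspace line follows the datatype, so stop tracking here.
        $1 == "DATASPACE" { name = "" }
    '
}
```

A typical call would be `h5dump -H your_data.h5 | list_enum_datasets`, where your_data.h5 stands for the training data file. Dataset names containing spaces are not handled by this simple parser.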