tmbdev/clstm

Error loading pretrained model

wanghaisheng opened this issue · 7 comments


>>>>>>> ./test-lstm
#: ntrain = 100000
#: ntest = 1000
#: gpu = -1
training 1:4:2 network to learn delay
.Stacked <<<0.0001 0.9 in 20 1 out 20 2>>>
.Stacked    inputs 20 1 1 Seq:[0.000000|0.400000|1.000000:20][-0.030502|-0.001325|0.025830:20]
.Stacked    outputs 20 2 1 Seq:[0.000045|0.500000|0.999955:40][-0.003537|0.000000|0.003537:40]
.Stacked.NPLSTM <<<0.0001 0.9 in 20 1 out 20 4>>>
.Stacked.NPLSTM    WCI 4 6 Bat:[-2.371926|-0.180369|1.806145:24][-0.042775|0.002258|0.046467:24]
.Stacked.NPLSTM    WGF 4 6 Bat:[-1.157534|0.080894|1.083409:24][-0.006257|0.002911|0.012813:24]
.Stacked.NPLSTM    WGI 4 6 Bat:[-0.212294|0.408031|2.245568:24][-0.002479|0.003996|0.014401:24]
.Stacked.NPLSTM    WGO 4 6 Bat:[-2.919670|0.557016|3.321400:24][-0.018948|0.007477|0.029906:24]
.Stacked.NPLSTM    inputs 20 1 1 Seq:[0.000000|0.400000|1.000000:20][-0.030502|-0.001325|0.025830:20]
.Stacked.NPLSTM    outputs 20 4 1 Seq:[-0.953226|0.038960|0.739090:80][-0.043424|0.000437|0.023548:80]
.Stacked.NPLSTM    ci 20 4 1 Seq:[-0.999453|-0.009550|0.961935:80][-0.009550|-0.000172|0.014207:80]
.Stacked.NPLSTM    gf 20 4 1 Seq:[0.103049|0.455415|0.836064:80][-0.001099|-0.000045|0.001491:80]
.Stacked.NPLSTM    gi 20 4 1 Seq:[0.691834|0.832398|0.977853:80][-0.000271|0.000066|0.000974:80]
.Stacked.NPLSTM    go 20 4 1 Seq:[0.063462|0.769145|0.996185:80][-0.001031|0.000088|0.004340:80]
.Stacked.NPLSTM    source 20 5 1 Seq:[-0.953226|0.096544|1.000000:100][-0.030502|-0.000562|0.027306:100]
.Stacked.NPLSTM    state 20 4 1 Seq:[-1.978872|-0.030874|1.210067:80][-0.010696|-0.000545|0.015035:80]
.Stacked.SoftmaxLayer <<<0.0001 0.9 in 20 4 out 20 2>>>
.Stacked.SoftmaxLayer    W1 2 5 Bat:[-6.143187|-0.003997|6.134448:10][-0.061927|-0.000000|0.061927:10]
.Stacked.SoftmaxLayer    inputs 20 4 1 Seq:[-0.953226|0.038960|0.739090:80][-0.043424|0.000437|0.023548:80]
.Stacked.SoftmaxLayer    outputs 20 2 1 Seq:[0.000045|0.500000|0.999955:40][-0.003537|0.000000|0.003537:40]
#: verbose = 0
OK (pre-save) 0.00620409
saving
loading
OK 0.00620409
nparams 106
OK (params) 0.00620409
OK (hacked-params) 0.5
OK (restored-params) 0.00620409

real    0m11.372s
user    0m11.368s
sys     0m0.005s

>>>>>>> ./test-xps-third.sh
#: ntrain = 400000
#: save_name = xps-total
#: report_time = 0
#: charsep =
got 899 files, 50 tests
#: load = xps-391100.clstm
.Stacked: 0.0001 0.9 in 0 48 out 0 5702
.Stacked.Parallel: 0.0001 0.9 in 0 48 out 0 200
.Stacked.Parallel.NPLSTM: 0.0001 0.9 in 0 48 out 0 100
.Stacked.Parallel.Reversed: 0.0001 0.9 in 0 48 out 0 100
.Stacked.Parallel.Reversed.NPLSTM: 0.0001 0.9 in 0 48 out 0 100
.Stacked.SoftmaxLayer: 0.0001 0.9 in 0 200 out 0 5702
#: start = -1
start 391101
#: test_every = 1000
#: save_every = 1000
#: report_every = 1
#: display_every = 1000
clstmocrtrain: clstm.cc:231: void ocropus::Codec::encode(ocropus::Classes&, const wstring&): Assertion `encoder->count(c) > 0' failed.
./test-xps-third.sh: line 20:  1284 Aborted                 (core dumped) ./clstmocrtrain xps-train-total xps-test-total

>>>>>>> echo TEST FAILED
TEST FAILED


Hi @wanghaisheng !

Please paste here the content of test-xps-third.sh.

You are using test-xps-third.sh to load a saved model. What commands did you use to create the model in the first place?

#!/bin/bash
set -ea
find ../pic_simsun_eachline_20_with_special_char -name '*.bin.png' | sort -r > xps-total-zh-char
sed 1,0d xps-total-zh-char > xps-train-total
sed 1,850d xps-total-zh-char > xps-test-total
report_every=1
save_every=1000
ntrain=400000
dewarp=center
display_every=1000
test_every=1000
display_every=1000
testset=xps-total-test.h5
hidden=800
lrate=1e-4
save_name=xps-total
report_time=
load=xps-391100.clstm
# gdb --ex run --args \
./clstmocrtrain xps-train-total  xps-test-total

Are you using the latest commit from master?

clstmocrtrain: clstm.cc:231: void ocropus::Codec::encode(ocropus::Classes&, const wstring&): Assertion `encoder->count(c) > 0' failed.
./test-xps-third.sh: line 20: 1284 Aborted (core dumped) ./clstmocrtrain xps-train-total xps-test-total
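
Reading the assertion literally, `Codec::encode` met a character `c` for which `encoder->count(c)` is 0, i.e. a character with no entry in the encoder. One hypothesis, not confirmed anywhere in this thread, is that the transcriptions now being trained on contain characters that the loaded model never saw. A rough way to test that idea is sketched below. It assumes that each `X.bin.png` in a file list has its transcription in `X.gt.txt`, that the files are UTF-8 and the shell runs in a UTF-8 locale, and that you still have the file list used to train the saved model. The names `chars_in_list` and `chars_missing` are made up for this sketch.

```shell
# chars_in_list LIST
# LIST holds one image path per line. Print every distinct character found in
# the matching ground-truth files, one character per line.
# Assumption: the ground truth for X.bin.png is stored in X.gt.txt.
chars_in_list() {
  while IFS= read -r img; do
    gt="${img%.bin.png}.gt.txt"
    if [ -f "$gt" ]; then
      cat "$gt"
    fi
  done < "$1" | grep -o . | LC_ALL=C sort -u
  # grep -o . splits the text into characters (needs a UTF-8 locale for
  # non-ASCII text); LC_ALL=C makes sort compare raw bytes, so distinct
  # characters are never merged by locale collation rules.
}

# chars_missing REF_LIST NEW_LIST
# Print the characters that occur in NEW_LIST's ground truth but not in
# REF_LIST's ground truth.
chars_missing() {
  _ref=$(mktemp)
  chars_in_list "$1" > "$_ref"
  # comm -13 keeps only the lines unique to the second input (stdin here).
  chars_in_list "$2" | LC_ALL=C comm -13 "$_ref" -
  rm -f "$_ref"
}

# Example call (old-train-list is a placeholder for the list used originally):
#   chars_missing old-train-list xps-train-total
```

If this prints anything, those characters are candidates for the one that trips the assertion. If it prints nothing, the hypothesis is probably wrong and the debugger route is the better bet.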

I suggest using a debugger to find the line in clstmocrtrain.cc that triggers this abort in clstm.cc.
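
The script already carries a commented-out `gdb --ex run --args` line, so one way to do this is to run the trainer under gdb in batch mode and print a backtrace once the assertion aborts. A minimal sketch follows. The helper name `backtrace_on_abort` is made up here, and the backtrace is only readable if clstmocrtrain was built with debug symbols.

```shell
# backtrace_on_abort CMD [ARGS...]
# Run CMD under gdb without an interactive prompt. When the program stops
# (for example on the SIGABRT raised by a failed assert), print the call
# stack and exit.
backtrace_on_abort() {
  gdb -batch -ex run -ex bt --args "$@"
}

# Intended use inside test-xps-third.sh, replacing the direct call, so the
# settings exported by `set -a` (load=, save_name=, ...) still reach the trainer:
#   backtrace_on_abort ./clstmocrtrain xps-train-total xps-test-total
```

The frame just above `ocropus::Codec::encode` in the backtrace should point at the calling line in clstmocrtrain.cc.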

Did you solve it? How?

Nope. I'm trying to train another model from scratch instead of loading an existing one.