IndexOutOfRangeException when using RNN
bratao opened this issue · 8 comments
Hello again!
Using today's version (4fad1b6), training with LSTM no longer converges, unlike yesterday's version, and it crashes when using the RNN.
The invoked command line was:
-mode train -trainfile wnnsharp-data-rsysyi.txt -modelfile rnnsharp.model -validfile wnnsharp-data-rsysyi.txt -ftrfile wnnsharp-config-qxnigh.txt -tagfile avaliable-tags.txt -modeltype 0 -layersize 50 -alpha 0.1 -crf 0 -maxiter 30 -savestep 200K -dir 0 -dropout 0
The error is:
Unhandled Exception: System.AggregateException: One or more errors occurred. ---> System.IndexOutOfRangeException: Index was outside the bounds of the array.
at RNNSharp.RNN.<>c__DisplayClass107_0.<matrixXvectorADD>b__0(Int32 i) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNN.cs:line 660
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object )
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Func`4 bodyWithLocal, Func`1 localInit, Action`1 localFinally)
at System.Threading.Tasks.Parallel.For(Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body)
at RNNSharp.RNN.matrixXvectorADD(SimpleLayer dest, SimpleLayer srcvec, Matrix`1 srcmatrix, Int32 DestSize, Int32 SrcSize, Int32 type) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNN.cs:line 685
at RNNSharp.SimpleRNN.computeHiddenLayer(State state, Boolean isTrain) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\SimpleRNN.cs:line 157
at RNNSharp.RNN.PredictSentence(Sequence pSequence, RunningMode runningMode) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNN.cs:line 267
at RNNSharp.RNN.TrainNet(DataSet trainingSet, Int32 iter) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNN.cs:line 536
at RNNSharp.RNNEncoder.Train() in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNNEncoder.cs:line 133
at RNNSharpConsole.Program.Main(String[] args) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharpConsole\Program.cs:line 284
With a debug build, the error is more descriptive:
Unhandled Exception: System.AggregateException: One or more errors occurred. ---> System.NotSupportedException: Vector<T>.Count cannot be called via reflection when intrinsics are enabled.
at System.Numerics.Vector`1.get_Count()
at RNNSharp.RNN.<>c__DisplayClass107_0.<matrixXvectorADD>b__0(Int32 i) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNN.cs:line 662
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object )
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Func`4 bodyWithLocal, Func`1 localInit, Action`1 localFinally)
at System.Threading.Tasks.Parallel.For(Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body)
at RNNSharp.RNN.matrixXvectorADD(SimpleLayer dest, SimpleLayer srcvec, Matrix`1 srcmatrix, Int32 DestSize, Int32 SrcSize, Int32 type) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNN.cs:line 658
at RNNSharp.SimpleRNN.computeHiddenLayer(State state, Boolean isTrain) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\SimpleRNN.cs:line 157
at RNNSharp.RNN.PredictSentence(Sequence pSequence, RunningMode runningMode) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNN.cs:line 265
at RNNSharp.RNN.TrainNet(DataSet trainingSet, Int32 iter) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNN.cs:line 536
at RNNSharp.RNNEncoder.Train() in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNNEncoder.cs:line 101
at RNNSharpConsole.Program.Train() in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharpConsole\Program.cs:line 492
at RNNSharpConsole.Program.Main(String[] args) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharpConsole\Program.cs:line 272
Thanks @bratao. This is a known bug, caused by a mismatch between the data size (the hidden layer size in your case) and the width of the CPU's SIMD registers. I will fix it by aligning the data.
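To illustrate the kind of mismatch described here: with `-layersize 50`, the hidden layer length is not a multiple of `Vector<float>.Count` (commonly 4 or 8, depending on the hardware), so a loop that always reads a full SIMD vector can run past the end of the array. The following is only a minimal sketch, not RNNSharp's actual code: `SimdSketch` and `Dot` are hypothetical names, and it handles the leftover elements with a scalar tail loop rather than the data-alignment (padding) fix mentioned above.

```csharp
using System;
using System.Numerics;

// Hypothetical helper for illustration; not part of RNNSharp.
static class SimdSketch
{
    // Dot product that is safe for lengths that are not a multiple
    // of the SIMD vector width.
    public static float Dot(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Arrays must have the same length.");

        int width = Vector<float>.Count;                  // elements per SIMD vector
        int lastBlock = a.Length - (a.Length % width);    // end of the last full vector
        float sum = 0f;
        int i = 0;

        // Vectorized part: every read here has 'width' elements available.
        for (; i < lastBlock; i += width)
        {
            var va = new Vector<float>(a, i);
            var vb = new Vector<float>(b, i);
            sum += Vector.Dot(va, vb);
        }

        // Scalar tail: the remaining (a.Length % width) elements.
        for (; i < a.Length; i++)
            sum += a[i] * b[i];

        return sum;
    }
}
```

Without the tail loop (or padding the arrays up to a multiple of the vector width), the `Vector<float>` constructor would be asked to read past the end of a 50-element array, which is consistent with the out-of-range exception in the trace above.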
Hi @zhongkaifu !!
This bug is showing up again after the latest set of commits =(
Could you please show me the call stack or other information for debugging? Is this the same call stack as before?
Sure,
One detail is that training works now. The error occurs during tagging.
In Debug I just get this:
Unhandled Exception: System.NotSupportedException: Vector<T>.Count cannot be called via reflection when intrinsics are enabled.
at System.Numerics.Vector`1.get_Count()
at RNNSharp.ModelSetting.DumpSetting() in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\ModelSetting.cs:line 51
at RNNSharpConsole.Program.Train() in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharpConsole\Program.cs:line 444
at RNNSharpConsole.Program.Main(String[] args) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharpConsole\Program.cs:line 277
In Release I get this error:
info,3/9/2016 5:21:05 PM Loading template feature set...
info,3/9/2016 5:21:05 PM Template feature size: 11173
info,3/9/2016 5:21:05 PM Template feature context size: 11173
info,3/9/2016 5:21:05 PM Get model type LSTM and direction FORWARD
info,3/9/2016 5:21:05 PM Model Structure: LSTM-RNN
info,3/9/2016 5:21:05 PM Loading LSTM-RNN model: rnnsharp.model
info,3/9/2016 5:21:05 PM Loading input2hidden weights...
info,3/9/2016 5:21:05 PM Loading LSTM-Weight: width:60, height:5432, vqSize:0...
info,3/9/2016 5:21:05 PM Loading hidden2output weights...
info,3/9/2016 5:21:05 PM Loading matrix. width: 60, height: 22, vqSize: 0
info,3/9/2016 5:21:05 PM CRF Model: False
Unhandled Exception: System.AggregateException: One or more errors occurred. ---> System.IndexOutOfRangeException: Index was outside the bounds of the array.
at RNNSharp.LSTMRNN.<>c__DisplayClass36_0.<computeHiddenLayer>b__0(Int32 j) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\LSTMRNN.cs:line 736
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object )
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Func`4 bodyWithLocal, Func`1 localInit, Action`1 localFinally)
at System.Threading.Tasks.Parallel.For(Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body)
at RNNSharp.LSTMRNN.computeHiddenLayer(State state, Boolean isTrain) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\LSTMRNN.cs:line 803
at RNNSharp.RNN.PredictSentence(Sequence pSequence, RunningMode runningMode) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNN.cs:line 277
at RNNSharp.RNNDecoder.Process(Sentence sent) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharp\RNNDecoder.cs:line 77
at RNNSharpConsole.Program.Test() in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharpConsole\Program.cs:line 372
at RNNSharpConsole.Program.Main(String[] args) in C:\sbuild\mine\backend_broka\Candidatos\RNNSharp-master\RNNSharpConsole\Program.cs:line 289
Thanks.
The second one should be the right exception; however, I cannot reproduce it on my machine. From your call stack, the exception comes from "LSTMCell cell_j = neuHidden[j]", where j was outside the bounds of neuHidden. Could you please set a breakpoint there and see what is happening?
    Parallel.For(0, L1 - 1, parallelOption, j =>
    {
        LSTMCell cell_j = neuHidden[j];
        //hidden(t-1) -> hidden(t)
        cell_j.previousCellState = cell_j.cellState;
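An index such as `j` can fall outside the array when the loop bound and the array come from different sources, for example a configured layer size versus weights loaded from a model file. Because `Parallel.For` wraps exceptions thrown in its body in an `AggregateException`, the cause is easier to see if the sizes are checked before the loop starts. The sketch below only illustrates that idea: `Cell`, `BoundsCheckSketch` and `StepHidden` are hypothetical stand-ins, not RNNSharp types, and the loop bound is simplified.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-ins for illustration; not RNNSharp's classes.
static class BoundsCheckSketch
{
    public sealed class Cell
    {
        public double CellState;
        public double PreviousCellState;
    }

    // Copies the current cell state into the previous one for each cell,
    // after verifying that the configured size fits the loaded array.
    public static void StepHidden(Cell[] cells, int configuredSize)
    {
        if (configuredSize < 0 || configuredSize > cells.Length)
        {
            // Fail early with a descriptive message instead of an
            // IndexOutOfRangeException buried inside an AggregateException.
            throw new InvalidOperationException(
                $"Configured layer size {configuredSize} does not match " +
                $"the {cells.Length} cells loaded from the model.");
        }

        Parallel.For(0, configuredSize, j =>
        {
            Cell cell = cells[j];
            // hidden(t-1) -> hidden(t)
            cell.PreviousCellState = cell.CellState;
        });
    }
}
```

A check like this would report a stale or mismatched model file directly, rather than as an index error deep inside the parallel loop.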
@zhongkaifu, sorry, my mistake!
There is a regression in the recent build: it no longer saves the model if you are not using a validation corpus. So I was using an older model, generated by an older version.
Fixing that solved it!
Thanks @bratao.
If a validation corpus isn't provided, the model should be saved whenever we get a better result on the training corpus. I will fix this problem.
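The intended behavior can be sketched roughly as follows. This is an assumption about the logic, not the actual fix: `CheckpointTracker` and `ShouldSave` are hypothetical names, and "better result" is modeled here as a lower error value.

```csharp
using System;

// Hypothetical sketch of save-on-improvement logic; not RNNSharp's code.
sealed class CheckpointTracker
{
    private double bestError = double.MaxValue;

    // Returns true when the model should be saved. The validation error
    // decides if it is available; otherwise the training error is used.
    public bool ShouldSave(double trainError, double? validError)
    {
        double error = validError ?? trainError;
        if (error < bestError)
        {
            bestError = error;   // remember the best result seen so far
            return true;
        }
        return false;
    }
}
```

With no validation corpus, `ShouldSave(trainError, null)` falls back to the training error, so a model is still written whenever training improves.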