HuwCampbell/grenade

index out of bounds -- in Data.Vector.Generic

sebeaumont opened this issue · 5 comments

Just tried shakespeare on a 100k Shakespeare sample and got the following:

seb@psi(0) [grenade](2646) 14:06:55> shakespeare ~/Data/misc/shakespeare.txt
TRAINING STEP WITH SIZE: 50
shakespeare: ./Data/Vector/Generic.hs:245 ((!)): index out of bounds (38,37)
CallStack (from HasCallStack):
  error, called at ./Data/Vector/Internal/Check.hs:87:5 in vector-0.12.0.1-3FWV4ejAWV0FsmvNvoLaed:Data.Vector.Internal.Check

What did I do wrong?

Wait, I might be a couple of commits behind. No, I'm on the master head.

Can someone try to reproduce? ghc-8.2.2 / resolver: lts-10.2

G'day,

I bashed out the shakespeare example and included some not-so-safe code for decoding the results back into characters, which is probably what's biting you here.

The problem happens when the number of unique characters in the input doesn't match the length of the vector.
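To make the failure mode concrete: the `(38,37)` in the error is index 38 requested from a vector of length 37. Below is a minimal sketch of the difference between a partial and a total lookup, assuming the `vector` and `containers` packages. The names `vocabulary`, `decodeUnsafe`, `decodeSafe` and `checkVocabulary` are made up for illustration and are not the functions in OneHot.hs.

```haskell
import qualified Data.Set as Set
import qualified Data.Vector as V

-- | Distinct characters of the corpus, in ascending order.
-- (Hypothetical helper, not grenade's API.)
vocabulary :: String -> V.Vector Char
vocabulary = V.fromList . Set.toAscList . Set.fromList

-- | Partial lookup: (V.!) throws "index out of bounds" when the
-- index is not smaller than the vector's length.
decodeUnsafe :: V.Vector Char -> Int -> Char
decodeUnsafe vocab i = vocab V.! i

-- | Total lookup: (V.!?) returns Nothing for an index outside the vector.
decodeSafe :: V.Vector Char -> Int -> Maybe Char
decodeSafe vocab i = vocab V.!? i

-- | Check up front that the corpus has exactly as many unique
-- characters as the width the network was built for.
checkVocabulary :: Int -> String -> Either String (V.Vector Char)
checkVocabulary expected corpus
  | found == expected = Right vocab
  | otherwise =
      Left ("expected " ++ show expected ++ " unique characters, found " ++ show found)
  where
    vocab = vocabulary corpus
    found = V.length vocab
```

Loaded in GHCi, `checkVocabulary 3 "hello"` gives a `Left` with a readable message instead of a crash part way through training, and `decodeSafe` turns a bad index into `Nothing`.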

I trained it with
https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt
so you might want to try that version of Shakespeare's texts.

The offending terrible code is in https://github.com/HuwCampbell/grenade/blob/master/src/Grenade/Utils/OneHot.hs

If you're interested, it should be quite possible to abstract the network shape to a parameter i, then use a withSomeNat pattern to make this safe for any input.
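A rough sketch of what that pattern could look like, using only `base`. It assumes `someNatVal` from `GHC.TypeLits` as the way to lift the runtime vocabulary size to the type level. `OneHotCodec`, `mkCodec`, `encode`, `decode` and `withVocabulary` are hypothetical stand-ins, not grenade's types or functions. A real version would build the network inside the continuation.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Data.List (elemIndex, nub, sort)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, SomeNat (..), natVal, someNatVal)

-- | Hypothetical codec whose width is the type-level number n.
newtype OneHotCodec (n :: Nat) = OneHotCodec [Char]

-- | Build a codec only if the vocabulary has exactly n entries.
mkCodec :: forall n. KnownNat n => [Char] -> Maybe (OneHotCodec n)
mkCodec vocab
  | toInteger (length vocab) == natVal (Proxy :: Proxy n) = Just (OneHotCodec vocab)
  | otherwise = Nothing

-- | One-hot encode a character as a list of width n.
encode :: forall n. KnownNat n => OneHotCodec n -> Char -> Maybe [Double]
encode (OneHotCodec vocab) c = do
  i <- elemIndex c vocab
  let width = fromInteger (natVal (Proxy :: Proxy n)) :: Int
  pure [if j == i then 1 else 0 | j <- [0 .. width - 1]]

-- | Decode an index back to a character, without a partial lookup.
decode :: OneHotCodec n -> Int -> Maybe Char
decode (OneHotCodec vocab) i
  | i >= 0 && i < length vocab = Just (vocab !! i)
  | otherwise = Nothing

-- | Count the unique characters at runtime, lift that count to a
-- type-level n, and run the continuation with a codec of that width.
withVocabulary :: String -> (forall n. KnownNat n => OneHotCodec n -> r) -> Maybe r
withVocabulary corpus k =
  case someNatVal (toInteger (length vocab)) of
    Just (SomeNat (_ :: Proxy n)) ->
      case mkCodec vocab :: Maybe (OneHotCodec n) of
        Just codec -> Just (k codec)
        Nothing -> Nothing
    Nothing -> Nothing
  where
    vocab = sort (nub corpus)
```

For example, `withVocabulary "hello" (\codec -> encode codec 'h')` picks a width of 4 from the input itself, so the width and the vocabulary cannot drift apart the way a hard-coded size can.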

Bit of a kludge here (a whole 30 seconds of work), but it should prevent the runtime error and allow people to adjust the size of the network to fit.

#49

Thanks Huw,
That gives me a leg up for generalising it for any input. I did get a problem in OneHot with some unrelated input so I'll give it a bash.