alex-petrenko/sample-factory

get_rnn_size vs custom ActorCritic cores

alex404 opened this issue · 1 comment

So I've been using my own custom ActorCritic class, and one thing I've noticed is that the get_rnn_size function plays a low-level role in initializing the learners. This makes it a bit tricky to create custom cores, because they have to play nice with the Config variables like use_rnn and rnn_size.

In practice this isn't too hard to work around, but intuitively it would be nicer if get_rnn_size could be e.g. a method of a Core. The problem is of course that get_rnn_size is accessed before the ActorCritics are initialized, so it's not entirely trivial to solve. Do you think this could be improved? If you point out a way to make this idiomatic with your code I can write it up.

Good point!

The get_rnn_size function (whose result should really be called "hidden state size") is called early in order to allocate the proper amount of space for the tensor that stores the hidden states of all agents.

I believe this function implements some hacky logic: we return rnn_size for GRU and 2*rnn_size for LSTM, or something like that.

I don't know what's the best way to work around that, but how's this for an idea:
make this function a method of a base "core" class, which by default just calls the current function. This would allow custom code to override it as part of the core API.

Then whenever you need get_rnn_size before the actor-critic initialization, maybe just instantiate a local version of the core really quickly just to call this method? Maybe not ideal, but it solves the problem?

Alternatively (and probably much easier to implement): add an additional parameter hidden_state_size, which is -1 by default. If it is -1, we call the existing function that calculates the hidden state size from rnn_size; otherwise we use the value of this parameter. Let me know if this works. I'll be happy to review the PR.