Cellular Automata Reinforcement Learning Environment.
In random network distillation, a predictor network tries to predict the output of a random network given a current observation. Poor prediction corresponds to novel stimuli and contributes to an exploration reward signal. ( Burda et al. 2018 )
See https://rivesunder.gitlab.io/rl/2019/08/24/breaking_rnd.html for experiments with random network distillation.