To compare the simulation of humanoid walking using Deep Deterministic Policy Gradient (DDPG), Normalized Advantage Function (NAF), and Kinect.
Reinforcement Learning and Leveraging Vision through an **RGB-D** Sensor
Actor - Activation Function - Linear
Critic - Activation Function - Sigmoid
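The network code behind these actor/critic pairings is not shown here. As a minimal sketch, assuming the named activation is applied to the output layer of each network (an assumption, not confirmed by the source), the three activations compared in these experiments behave as follows. The helper names `linear`, `sigmoid`, `ACTIVATIONS`, and `output_layer` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical helpers for illustration; the original network code is not shown.

def linear(x):
    """Identity activation: output is unbounded."""
    return x

def sigmoid(x):
    """Logistic activation: output lies in (0, 1). Written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

# tanh comes from the standard library; its output lies in (-1, 1).
ACTIVATIONS = {"linear": linear, "sigmoid": sigmoid, "tanh": math.tanh}

def output_layer(inputs, weights, bias, activation):
    """Weighted sum of the inputs followed by the named activation."""
    pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return ACTIVATIONS[activation](pre_activation)

# The same pre-activation value passed through each of the three activations.
for name in ("linear", "sigmoid", "tanh"):
    print(name, output_layer([1.0, 2.0], [0.5, -0.25], 0.3, name))
```

The practical difference is the output range: a linear output is unbounded, a sigmoid output is squashed into (0, 1), and a tanh output into (-1, 1).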
Actor - Activation Function - Linear
Critic - Activation Function - Tanh
import io
import base64
from IPython.display import HTML
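# Read the video as raw bytes; read-only mode is enough since the file is not modified.
with io.open('model_linear_tanh.mp4', 'rb') as f:
    video = f.read()
# Base64-encode the bytes so the video can be embedded as a data URI.
encoded = base64.b64encode(video)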
HTML(data='''<video alt="test" controls>
<source src="data:video/mp4;base64,{0}" type="video/mp4" />
</video>'''.format(encoded.decode('ascii')))
Actor - Activation Function - Sigmoid
Critic - Activation Function - Tanh
import io
import base64
from IPython.display import HTML
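# Read the video as raw bytes; read-only mode is enough since the file is not modified.
with io.open('model_sigmoid_tanh.mp4', 'rb') as f:
    video = f.read()
# Base64-encode the bytes so the video can be embedded as a data URI.
encoded = base64.b64encode(video)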
HTML(data='''<video alt="test" controls>
<source src="data:video/mp4;base64,{0}" type="video/mp4" />
</video>'''.format(encoded.decode('ascii')))
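The video cells above repeat the same read, encode, and embed steps for each file. One possible way to factor that out is a small helper that builds the `<video>` markup as a string. The name `video_tag` and its `mime` parameter are hypothetical, not part of the original notebook.

```python
import base64
from pathlib import Path

def video_tag(path, mime="video/mp4"):
    """Return an HTML <video> element with the file embedded as a base64 data URI.

    Hypothetical helper; it mirrors the markup used in the cells above.
    """
    # Read the file as raw bytes and base64-encode it to ASCII text.
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return (
        '<video alt="test" controls>\n'
        '<source src="data:{0};base64,{1}" type="{0}" />\n'
        '</video>'
    ).format(mime, encoded)
```

In a notebook cell, the result can be passed to `IPython.display.HTML`, for example `HTML(data=video_tag('model_sigmoid_tanh.mp4'))`. Embedding inline keeps the notebook self-contained, at the cost of a larger notebook file for long videos.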