homework_fall2019: A C repository from jcarreira

1) install package by running:

$ python setup.py develop

##############################################
##############################################

2)install mujoco:
$ cd ~
$ mkdir .mujoco
$ cd <location_of_your_mjkey.txt>
$ cp mjkey.txt ~/.mujoco/
$ cd <this_repo>/downloads
$ cp -r mjpro150 ~/.mujoco/

add the following to bottom of your bashrc:
export LD_LIBRARY_PATH=~/.mujoco/mjpro150/bin/

NOTE IF YOU'RE USING A MAC:
The provided mjpro150 folder is for Linux. 
Please download the OSX version yourself, from https://www.roboti.us/index.html

##############################################
##############################################

3)install other dependencies

-------------------

a) [PREFERRED] Option A:

i) install anaconda, if you don't already have it:
Download Anaconda2 (suggested v5.2 for linux): https://www.continuum.io/downloads
$ cd Downloads
$ bash Anaconda2-5.2.0-Linux-x86_64.sh #file name might be slightly different, but follows this format

Note that this install will modify the PATH variable in your bashrc.
You need to open a new terminal for that path change to take place (to be able to find 'conda' in the next step).

ii) create a conda env that will contain python 3:
$ conda create -n cs285_env python=3.5

iii) activate the environment (do this every time you open a new terminal and want to run code):
$ source activate cs285_env

iv) install the requirements into this conda env
$ pip install --user --requirement requirements.txt

v) allow your code to be able to see 'cs285'
$ cd <path_to_hw>
$ pip install -e .

Note: This conda environment requires activating it every time you open a new terminal (in order to run code), but the benefit is that the required dependencies for this codebase will not affect existing/other versions of things on your computer. This stand-alone environment will have everything that is necessary.

-------------------

b) Option B:

install dependencies locally, by running:
$ pip install -r requirements.txt

##############################################
##############################################

4) code:

Blanks to be filled in are marked with "TODO"
The following files have blanks in them:
- scripts/run_hw1_behavior_cloning.py
- infrastructure/rl_trainer.py
- agents/bc_agent.py
- policies/MLP_policy.py
- infrastructure/replay_buffer.py
- infrastructure/utils.py
- infrastructure/tf_utils.py

See the code + the hw pdf for more details.

##############################################
##############################################

5) run code: 

Run the following command for Section 1 (Behavior Cloning):

$ python cs285/scripts/run_hw1_behavior_cloning.py --expert_policy_file cs285/policies/experts/Ant.pkl --env_name Ant-v2 --exp_name test_bc_ant --n_iter 1 --expert_data cs285/expert_data/expert_data_Ant-v2.pkl

Run the following command for Section 2 (DAGGER):
(NOTE: the --do_dagger flag, and the higher value for n_iter)

$ python cs285/scripts/run_hw1_behavior_cloning.py --expert_policy_file cs285/policies/experts/Ant.pkl --env_name Ant-v2 --exp_name test_dagger_ant --n_iter 10 --do_dagger --expert_data cs285/expert_data/expert_data_Ant-v2.pkl

##############################################

6) visualize saved tensorboard event file:

$ cd cs285/data/<your_log_dir>
$ tensorboard --logdir .

Then, navigate to shown url to see scalar summaries as plots (in 'scalar' tab), as well as videos (in 'images' tab)
jcarreira/homework_fall2019