To download, simply clone the repository using `git clone --recursive https://github.com/MichalPospech/bc-project.git` so that the bundled `estool` library is included. Then install the local dependency using `pip install -e ./libs/estool`. A `log` folder must also be created.
An installation of MPI (required by `mpi4py`) is also needed; it can usually be installed through your distribution's package manager.
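For example, on Debian/Ubuntu-based systems something along the lines of `sudo apt install openmpi-bin libopenmpi-dev` should be sufficient (exact package names vary between distributions).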
On Windows, it is easiest to install `mpi4py` as follows:

- Download and install mpi_x64.Msi from the HPC Pack 2012 MS-MPI Redistributable Package
- Install a recent Visual Studio version with a C++ compiler
- Open a command prompt and run:

```
git clone https://github.com/mpi4py/mpi4py
cd mpi4py
python setup.py install
```
The program is run using MPI, therefore the running command is a bit more complex:

```
mpiexec -n NUM_CPU python -m mpi4py train.py -f CONFIG_FILE
```

where `NUM_CPU` must be at least 2 and at most the number of physical cores, and `CONFIG_FILE` is a path to a configuration file in the format described below.
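For example, a run with 5 processes on one of the bundled configurations might look like `mpiexec -n 5 python -m mpi4py train.py -f examples/config.json` (the configuration file name here is only illustrative; see the `examples` directory for the actual files).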
There are 4 log files:

- `hist` contains data from training (timestep, total time, avg, std, min and max for both reward and novelty)
- `best` contains the currently best parameters together with their reward
- `log` contains evaluation data (timestep, reward and parameters)
- the file with no extension contains the current parameters
The configuration file is in JSON format with the specification below; examples can be found in the `examples` directory, and a minimal sketch is shown after the following list.

- `num_worker_trial` - number of individuals PER WORKER
- `gamename` - name of the game, `slimevolley` or `cartpole_swingup`
- `algorithm` - algorithm parameters, see the algorithm section
- `num_episode` - number of episodes used to evaluate each solution
- `batch_mode` - how the `num_episode` values for each solution are aggregated, `min` or `mean`
- `eval_steps` - how often the current solution is evaluated
- `cap_time` - limit on the number of steps per episode (-1 for unlimited, default)
- `seed_start` - starting seed for the RNG
- `antitethic` - whether antithetic sampling should be used (default is `True`)
- `identifier` - identifier used to name the log files
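A minimal configuration sketch with these keys is shown below; the numeric values are purely illustrative (not tuned) and the nested `algorithm` object is described in the algorithm section that follows.

```json
{
    "num_worker_trial": 4,
    "gamename": "cartpole_swingup",
    "algorithm": {
        "name": "cmaes",
        "sigma_init": 0.1,
        "weight_decay": 0.01
    },
    "num_episode": 5,
    "batch_mode": "mean",
    "eval_steps": 25,
    "cap_time": -1,
    "seed_start": 0,
    "antitethic": true,
    "identifier": "cmaes_cartpole"
}
```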
The `algorithm` object is specified as follows:

- `name` - name of the algorithm used: `cmaes`, `ga`, `pepg`, `openes`, `nses`, `nsres` or `nsraes`

The remaining parameters depend on the chosen algorithm. If a parameter is not described below, it follows the usual naming for that algorithm.
Parameters for `cmaes`:

- `sigma_init` - initial step size as per the CMA-ES specification
- `weight_decay` - weight decay subtracted from rewards
Parameters for `ga`:

- `sigma_init` - initial sigma for the distribution
- `sigma_decay` - rate of sigma decay
- `sigma_limit` - limit for the decay
- `elite_ratio`
- `forget_best` - update the individual all the time, not only upon improvement
- `weight_decay` - weight decay subtracted from rewards
Parameters for `openes`:

- `optimizer` - see the optimizer section
- `sigma_init` - initial sigma for the distribution
- `sigma_decay` - rate of sigma decay
- `sigma_limit` - limit for the decay
- `forget_best` - update the individual all the time, not only upon improvement
- `weight_decay` - weight decay subtracted from rewards
- `rank_fitness` - use rank-normalised fitness
Parameters for `pepg`:

- `sigma_init` - initial sigma for the distribution
- `sigma_decay` - rate of sigma decay
- `sigma_limit` - limit for the decay
- `sigma_max_change`
- `learning_rate` - initial learning rate
- `learning_rate_decay` - rate of learning rate decay
- `learning_rate_limit` - limit for the learning rate
- `weight_decay` - weight decay subtracted from rewards
- `rank_fitness` - use rank-normalised fitness
Parameters for `nses`:

- `optimizer` - see the optimizer section
- `sigma`
- `metapopulation_size`
- `k`
Parameters for `nsres`:

- `optimizer` - see the optimizer section
- `sigma`
- `metapopulation_size`
- `k`
- `weight` - ratio of fitness and novelty
Parameters for `nsraes` (a configuration sketch follows this list):

- `optimizer` - see the optimizer section
- `sigma`
- `metapopulation_size`
- `k`
- `init_weight` - initial ratio of fitness and novelty
- `weight_change` - how much the ratio changes on each adjustment
- `weight_change_threshold` - how often the ratio changes
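As a sketch only, an `algorithm` object for `nsraes` might look like the following; the values are illustrative and the nested `optimizer` object is assumed to follow the optimizer section below.

```json
{
    "name": "nsraes",
    "optimizer": {
        "name": "adam",
        "stepsize": 0.01,
        "beta1": 0.99,
        "beta2": 0.999
    },
    "sigma": 0.1,
    "metapopulation_size": 5,
    "k": 10,
    "init_weight": 1.0,
    "weight_change": 0.05,
    "weight_change_threshold": 50
}
```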
The `optimizer` object is specified as follows:

- `name` - name of the optimizer used: `sgd`, `adam` or `sgdm`

All remaining parameters should be self-explanatory as per the respective algorithm definitions; a small example follows the lists below.
Parameters for `sgd`:

- `stepsize`

Parameters for `sgdm`:

- `stepsize`
- `momentum`

Parameters for `adam`:

- `stepsize`
- `beta1`
- `beta2`
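For instance, an `optimizer` object for `sgdm` could be written as follows (values are illustrative only):

```json
{
    "name": "sgdm",
    "stepsize": 0.01,
    "momentum": 0.9
}
```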