ECabc is a generic, small-scale feature tuning program based on the Artificial Bee Colony (ABC) algorithm by D. Karaboga, which imitates the foraging behavior of honey bees. ECabc optimizes a user-supplied function, called the fitness function, over a given set of variables known as the value set. The bee colony consists of three types of bees: employers, onlookers and scouts. An employer bee is an object that stores a set of values, the fitness score corresponding to those values, and the bee's probability of being picked by an onlooker bee. An onlooker bee is an object that chooses employer bees that have a high probability of being picked and calculates new positions for them. A scout bee creates a new set of random values, which is then assigned to a poorly performing employer bee as a replacement.
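The interplay of the three bee types can be illustrated with a bare-bones colony. This is a minimal sketch of the general idea, not ECabc's implementation: the function name `abc_minimize`, its parameters, the `1 / (1 + score)` weighting (which assumes non-negative scores) and the staleness `limit` are all assumptions made for illustration.

```python
import random


def abc_minimize(fitness, bounds, n_employers=10, limit=10, iterations=200, seed=0):
    """Minimize a non-negative `fitness` over the box `bounds` (illustrative only)."""
    rng = random.Random(seed)

    def random_position():
        return [rng.uniform(low, high) for low, high in bounds]

    positions = [random_position() for _ in range(n_employers)]
    scores = [fitness(p) for p in positions]
    stale = [0] * n_employers  # tries since each employer last improved

    def try_improve(i):
        # Move one coordinate of employer i relative to another random employer.
        k = rng.choice([j for j in range(n_employers) if j != i])
        d = rng.randrange(len(bounds))
        low, high = bounds[d]
        step = rng.uniform(-1.0, 1.0) * (positions[i][d] - positions[k][d])
        candidate = list(positions[i])
        candidate[d] = min(max(candidate[d] + step, low), high)
        score = fitness(candidate)
        # Greedy selection: keep the candidate only if it scores better.
        if score < scores[i]:
            positions[i], scores[i], stale[i] = candidate, score, 0
        else:
            stale[i] += 1

    best = min(range(n_employers), key=lambda j: scores[j])
    best_score, best_position = scores[best], list(positions[best])

    for _ in range(iterations):
        # Employer phase: every employer explores around its own position.
        for i in range(n_employers):
            try_improve(i)
        # Onlooker phase: employers with lower scores are picked more often.
        weights = [1.0 / (1.0 + s) for s in scores]
        for i in rng.choices(range(n_employers), weights=weights, k=n_employers):
            try_improve(i)
        # Remember the best employer before scouts overwrite anything.
        best = min(range(n_employers), key=lambda j: scores[j])
        if scores[best] < best_score:
            best_score, best_position = scores[best], list(positions[best])
        # Scout phase: replace employers that have stopped improving.
        for i in range(n_employers):
            if stale[i] > limit:
                positions[i] = random_position()
                scores[i] = fitness(positions[i])
                stale[i] = 0

    return best_score, best_position


if __name__ == '__main__':
    # Toy problem: the sum of squares is smallest at the origin.
    print(abc_minimize(lambda v: sum(x * x for x in v), [(-5.0, 5.0), (-5.0, 5.0)]))
```

The best position is remembered separately so that a scout replacing a stale employer never discards the best solution found so far.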
While it has several applications, ECabc has been used successfully by the Energy and Combustion Research Laboratory (ECRL) at the University of Massachusetts Lowell to tune the hyperparameters of ECNet, a large-scale machine learning project for predicting fuel properties. ECNet provides scientists with an open source tool for predicting key properties of potential next-generation biofuels, reducing the need for costly fuel synthesis and experimentation. By efficiently increasing the accuracy of ECNet and similar models, ECabc helps to provide a higher degree of confidence in discovering new, optimal fuels. A single run of ECabc on ECNet yielded a lower average root mean square error (RMSE) for cetane number (CN) and yield sooting index (YSI) than a year of manual tuning: manual tuning produced an RMSE of 10.13, whereas ECabc reached an RMSE of 8.06 in one run of 500 iterations.
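To put the two RMSE figures in perspective, the sketch below shows how an RMSE is computed and what relative reduction the quoted numbers correspond to. The helper names `rmse` and `relative_reduction` are hypothetical, and the three-point data set is made up purely to exercise the formula; only 10.13 and 8.06 come from the text above.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(before, after):
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


# Made-up data, only to show the formula at work: sqrt(4 / 3).
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))

# Figures quoted above: manual tuning vs. one ECabc run.
print(round(100 * relative_reduction(10.13, 8.06), 1))  # 20.4 (percent)
```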
- Have Python 3.X installed
- Have the ability to install Python packages
If you are working in a Linux/Mac environment:
sudo pip install ecabc
Alternatively, in a Windows environment, make sure you are running cmd as administrator:
pip install ecabc
To update your version of ECabc to the latest release, use:
pip install --upgrade ecabc
Note: if multiple Python releases are installed on your system (e.g. 2.7 and 3.6), you may need to execute the correct version of pip. For Python 3.6, change "pip install ecabc" to "pip3 install ecabc".
- Download the ECabc repository, navigate to the download location on the command line/terminal, and execute:
python setup.py install
Additional package dependencies (NumPy) will be installed during the ECabc installation process.
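After installing, it can be useful to confirm that the packages are visible to the interpreter you intend to use, especially when several Python releases are installed. The sketch below uses only the standard library; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found by this interpreter."""
    return importlib.util.find_spec(package_name) is not None


for name in ("numpy", "ecabc"):
    print(name, "installed" if is_installed(name) else "missing")
```

If a package is reported as missing even though pip installed it, the pip executable most likely belongs to a different Python release than the one running the script.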
To get started, import ECabc:
from ecabc.abc import *
Then define your fitness function. The fitness function is the user-defined function whose output is being optimized. It receives the values and args and returns the output that is being optimized:
def fitness_function(values, args):
    # ... your code, computing output from values and args ...
    return output
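As a concrete illustration of that template, here is a minimal sketch of a fitness function that follows the `(values, args)` signature described above. The name `sum_of_squares` and the choice of objective are assumptions made for the example; any function returning a single number to optimize would fit the same shape.

```python
def sum_of_squares(values, args=None):
    """Toy fitness function: smallest when every value is zero."""
    return sum(v ** 2 for v in values)


print(sum_of_squares([3, 4]))  # 25
```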
After that, in the main function, define your value ranges, i.e. the user-defined ranges for the variables that are being optimized:
values = [('int', (0,10)), ('int', (0,100)), ('float',(0,80)), ('float', (0, 360))]
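Each entry pairs a type name with a `(low, high)` range. The sketch below shows one way such a specification could be turned into a random set of values; it is not ECabc's internal code, the helper name `sample_values` is hypothetical, and treating both endpoints as inclusive is an assumption.

```python
import random


def sample_values(value_ranges, rng=random):
    """Draw one random value per ('int' | 'float', (low, high)) entry."""
    values = []
    for kind, (low, high) in value_ranges:
        if kind == 'int':
            values.append(rng.randint(low, high))   # assumed inclusive bounds
        elif kind == 'float':
            values.append(rng.uniform(low, high))
        else:
            raise ValueError("unsupported value type: {!r}".format(kind))
    return values


ranges = [('int', (0, 10)), ('int', (0, 100)), ('float', (0, 80)), ('float', (0, 360))]
print(sample_values(ranges))
```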
Optionally, you can also supply args: any additional arguments that your fitness function needs beyond the values given in value_ranges. This defaults to None.
arguments = {'test_argument': 10}
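To show how a fitness function might consume such arguments, the sketch below treats `test_argument` as a target that the values should sum to. That interpretation, the dictionary form of the arguments, and the function name `distance_to_target` are assumptions made for illustration only.

```python
def distance_to_target(values, args=None):
    """Cost is how far the sum of the values is from args['test_argument']."""
    target = args['test_argument'] if args else 0
    return abs(sum(values) - target)


arguments = {'test_argument': 10}
print(distance_to_target([3, 4], arguments))  # 3
```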
Then call ECabc as follows:
abc = ABC(fitness_fxn=fitness_function, value_ranges=values, args=arguments)
Certain settings also need to be toggled, such as:
abc._minimize = True
The settings can be imported and saved as follows (the file names are strings):
abc._import_settings = 'example.json'
abc._save_settings = 'output.json'
Then call create_employers on it to generate your population of employer bees. This only needs to be done once:
abc.create_employers()
After this, the code should enter a loop with a break condition. The contents of ECabc that should be in the loop have been encompassed in run_iteration() for simplicity.
while True:
    abc.run_iteration()
    if (abc.best_performer[0] < 2):
        break
The above snippet shows the setup if one wants to run ECabc until a certain output value has been obtained. Alternatively one could just set it up so that it runs for a preset number of cycles as follows:
for i in range(500):
    abc.run_iteration()
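The two stopping rules can also be combined, so that the loop ends at a target score or after a fixed budget, whichever comes first. The sketch below is written against plain callables so that it runs without ECabc; the helper `run_until` and the stand-in class `CountdownOptimizer` are hypothetical.

```python
def run_until(step, current_score, threshold, max_iterations):
    """Call step() until current_score() < threshold or the budget is spent.

    Returns (iterations_run, threshold_reached).
    """
    for iteration in range(1, max_iterations + 1):
        step()
        if current_score() < threshold:
            return iteration, True
    return max_iterations, False


class CountdownOptimizer:
    """Stand-in for an optimizer: its score drops by 1 on every iteration."""

    def __init__(self, score):
        self.score = score

    def run_iteration(self):
        self.score -= 1


demo = CountdownOptimizer(score=10)
print(run_until(demo.run_iteration, lambda: demo.score, threshold=2, max_iterations=500))
# (9, True)
```

With ECabc, the same helper would presumably be called as `run_until(abc.run_iteration, lambda: abc.best_performer[0], 2, 500)`, mirroring the two loops above.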
Other parameters that can be specified in the loop are file logging, which accepts 'debug', 'info', 'warn', 'error', 'crit' or 'disable':
abc._logger.file_level = 'info'
abc._logger.file_level = 'debug'
abc._logger.file_level = 'warn'
abc._logger.file_level = 'error'
abc._logger.file_level = 'crit'
abc._logger.file_level = 'disable'
the print level, which prints log information to the console and accepts the same values:
abc._logger.stream_level = 'info'
abc._logger.stream_level = 'debug'
abc._logger.stream_level = 'warn'
abc._logger.stream_level = 'error'
abc._logger.stream_level = 'crit'
abc._logger.stream_level = 'disable'
and the number of processes:
processes = 1
Finally, to view the output:
print(abc.best_performer[2], abc.best_performer[1])
where best_performer[2] holds the values and best_performer[1] is the fitness score associated with them.
'''
Simple sample script that demonstrates how to use the artificial bee colony.
Suppose an ideal day is 70 degrees with 37.5% humidity. The fitness function takes four values and tests how 'ideal' they are.
The first two values are added to give the day's temperature, and the second two values are multiplied to give the day's
humidity. The results are compared to 70 degrees and 37.5% humidity to determine how ideal the day those values produce is.
The goal is to have the first two values add up to as close to 70 as possible, while the second two values multiply out to as
close to 37.5 as possible.
'''
from ecabc.abc import *
import os
import time
def idealDayTest(values, args=None):  # Fitness function that will be passed to the abc
    temperature = values[0] + values[1]  # Calculate the day's temperature
    humidity = values[2] * values[3]  # Calculate the day's humidity
    cost_temperature = abs(70 - temperature)  # Check how close the daily temperature is to 70
    cost_humidity = abs(37.5 - humidity)  # Check how close the humidity is to 37.5
    return cost_temperature + cost_humidity  # This will be the cost of your fitness function generated by the values
if __name__ == '__main__':
    # Ranges for the first, second, third and fourth values
    values = [('int', (0, 100)), ('int', (0, 100)), ('float', (0, 100)), ('float', (0, 100))]
    start = time.time()
    abc = ABC(fitness_fxn=idealDayTest,
              value_ranges=values
              )
    abc.create_employers()
    while True:
        abc.save_settings('{}/settings.json'.format(os.getcwd()))
        abc.run_iteration()
        if (abc.best_performer[0] < 2):
            break
    print("execution time = {}".format(time.time() - start))
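As a quick sanity check of the sample fitness function, the cost can be worked out by hand for known inputs. The sketch below restates the same arithmetic under the hypothetical name `ideal_day_cost` so that it runs without ECabc installed; the input values are chosen for illustration.

```python
def ideal_day_cost(values):
    """Same arithmetic as idealDayTest: distance from 70 degrees and 37.5% humidity."""
    temperature = values[0] + values[1]
    humidity = values[2] * values[3]
    return abs(70 - temperature) + abs(37.5 - humidity)


# 30 + 40 = 70 and 7.5 * 5.0 = 37.5, so this is a perfect day.
print(ideal_day_cost([30, 40, 7.5, 5.0]))    # 0.0

# 50 + 50 = 100 (off by 30) and 10 * 10 = 100 (off by 62.5).
print(ideal_day_cost([50, 50, 10.0, 10.0]))  # 92.5
```

The sample script stops as soon as the best cost drops below 2, so any combination within that distance of the ideal day ends the loop.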
Tests for ECabc are available in the examples folder.
To contribute to ECabc, make a pull request. Contributions should include tests for new features added, as well as extensive documentation.
To report problems with the software or feature requests, file an issue. When reporting problems, include information such as error messages, your OS/environment and Python version.
For additional support/questions, contact Sanskriti Sharma (sanskriti_sharma@student.uml.edu), Travis Kessler (travis.j.kessler@gmail.com), Hernan Gelaf-Romer (hernan_gelafromer@student.uml.edu) and/or John Hunter Mack (Hunter_Mack@uml.edu).