energy_py applies reinforcement learning to energy systems. It provides a reinforcement learning agent and environment built in Python.
Reinforcement learning in energy systems requires first proving the concepts in a virtual environment. This project demonstrates the ability of reinforcement learning to control a virtual energy environment.
To get started, clone this repo and run core.py.
This project is built and maintained by Adam Green - adam.green@adgefficiency.com.
You can read the introductory blog post for this project here.
energy_py is composed of Python scripts. To run a single naive episode:
env = environments.energy_py(EPISODE_LENGTH)
agent = agents.Q_learner(env, verbose=0)
episode = 0 # the episode number
agent.policy_ = 1 # naive policy
agent.single_episode(episode)
- Creates environment, agent
- episode 0 = naive episode where all assets at 100% load for entire episode
- episodes 1 to n = user defined number of episodes (n) with epsilon-greedy policy
- episode n+1 = greedy episode, epsilon = 0 (no exploration)
- outputs saved every x episodes (user defined)
- saves Keras model weights after episode n+1
- approximating Q(s,a) using a Keras neural network
- single episode runs within agent
- trains after each step using replay memory
- policy can be set to:
  - naive - action is always the maximum of the available action space
  - e-greedy - with probability e select random action, else optimal
  - greedy - always select optimal action (as per current value function)
- episode stops when env.done == True
- a number of functions are used to generate the variable action space
- Q_learner.output() function generates charts and CSVs of data
- energy system environment
- ability to use multiple energy asset models - see assets/library.py
- reward based on net energy cost
- state = energy demands & prices
- actions = load and binary variable for each energy asset
- episode length = the maximum depends on the amount of data in assets/time_series.csv
- models of energy assets
- currently two available - gas engine or gas turbine (user defined size)
- each class has the same set of methods (would be ideal for a base class)
- allows iteration over list of assets to get technical outputs (power generated, gas consumed etc)
- Keras models to approximate the action value function Q(s,a)
- currently only one model - a Sequential Keras model
- Keras model structured with an input of [state, action] and an output of Q(s,a) (single node)
  - the input is a normalized 1-D numpy array of [state, action]
- agent is agnostic to the specific Keras model
Structuring the Keras model to output a single Q(s,a) means we need n forward passes to consider n [state, actions]. I've done this because I deal with a variable set of [state, actions].
This makes action selection and training expensive as both require estimating Q(s,a) for a large number of [state, actions]. This means a lot of forward passes across the network!
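The forward passes described above can be sketched as follows. This is a minimal illustration, not the project's code: `q_model` is a hypothetical stand-in for the Keras network (any callable that maps a batch of [state, action] rows to one value per row), and `build_batch`, `select_action` and their arguments are names assumed for this example. The naive policy is assumed here to mean "largest total load".

```python
import numpy as np

def build_batch(state, actions):
    """Stack one [state, action] row per candidate action."""
    state = np.asarray(state, dtype=float)
    actions = np.asarray(actions, dtype=float)
    # repeat the state once per candidate action, then append each action
    tiled = np.tile(state, (len(actions), 1))
    return np.hstack([tiled, actions])

def select_action(q_model, state, actions, policy="greedy", epsilon=0.1, rng=None):
    """Return the index of the chosen action under the given policy."""
    if rng is None:
        rng = np.random.default_rng()
    actions = np.asarray(actions, dtype=float)
    if policy == "naive":
        # assumption: "maximum of the action space" = largest total load
        return int(np.argmax(actions.sum(axis=1)))
    if policy == "e-greedy" and rng.random() < epsilon:
        # explore: pick a random candidate
        return int(rng.integers(len(actions)))
    # n candidate actions means n rows to evaluate through the network
    q_values = np.asarray(q_model(build_batch(state, actions))).reshape(-1)
    return int(np.argmax(q_values))

def q_model(batch):
    # hypothetical stand-in for the network: prefers load close to demand
    return -np.abs(batch[:, 0] - batch[:, -1])

state = [50.0, 30.0]                 # e.g. [demand, price]
actions = [[0.0], [50.0], [100.0]]   # candidate loads
greedy_index = select_action(q_model, state, actions, policy="greedy")
naive_index = select_action(q_model, state, actions, policy="naive")
```

Stacking the candidates into one batch avoids n separate calls into the model, but the network still has to evaluate n rows, so the cost grows with the size of the action space.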
You can see how the action space is generated by looking at environments.energy_py.create_action_space(last_actions).
- actions a are applied to the next state (s') (unseen by the agent until the next time step)
- this means the agent is forced to do some time series forecasting. This is intended behavior
- the actions available to the agent are dependent on the previous actions
- to account for this I have added a number of different methods to the energy_py() environment
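To illustrate how the available actions can depend on the previous actions, here is a minimal sketch. The ramp rule (an asset's load may move by at most `max_ramp` per time step), the load grid and the function names are all assumptions made for this example; the real logic lives in environments.energy_py.create_action_space(last_actions) and may differ.

```python
from itertools import product

def available_loads(last_load, max_ramp=25, step=25, min_load=0, max_load=100):
    """Loads (in % of capacity) reachable from last_load in one time step."""
    low = max(min_load, last_load - max_ramp)
    high = min(max_load, last_load + max_ramp)
    # keep only the grid points inside the reachable band
    return [load for load in range(min_load, max_load + 1, step)
            if low <= load <= high]

def sketch_action_space(last_loads):
    """All combinations of reachable loads, one entry per asset."""
    per_asset = [available_loads(load) for load in last_loads]
    return [list(combo) for combo in product(*per_asset)]

# two assets: the first was off, the second was at full load
space = sketch_action_space([0, 100])
# space == [[0, 75], [0, 100], [25, 75], [25, 100]]
```

Because the size of this list changes from step to step, the number of [state, action] pairs the agent has to evaluate changes too.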
The OpenAI gym paradigm has inspired the design of energy_py. I would love to one day have an energy_py environment in the OpenAI GitHub repo!
I've integrated with the OpenAI gym project in the following ways:
- inherits the gym.Env base class from the OpenAI gym project
- makes use of step & reset methods as per gym env objects
- makes use of the gym.spaces objects
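As a rough illustration of the step and reset contract, the class below mimics the classic gym interface, where reset returns an observation and step returns (observation, reward, done, info). It deliberately does not import gym or subclass gym.Env so that it runs standalone. `ToyEnergyEnv`, its data and its cost rule are hypothetical and far simpler than the energy_py environment; each time step is assumed to be one hour.

```python
class ToyEnergyEnv:
    """Toy environment: one generator, electricity demand and price only."""

    def __init__(self, demands, prices):
        self.demands = demands  # MW at each time step
        self.prices = prices    # import price per MWh at each time step
        self.reset()

    def reset(self):
        """Go back to the first time step and return the first observation."""
        self.t = 0
        self.done = False
        return (self.demands[self.t], self.prices[self.t])

    def step(self, action):
        """Apply a generator output (MW) and move one time step forward."""
        # the action chosen at time t is applied to the next state (t + 1),
        # which the agent has not observed yet
        self.t += 1
        demand, price = self.demands[self.t], self.prices[self.t]
        imported = max(demand - action, 0.0)
        reward = -imported * price
        self.done = self.t == len(self.demands) - 1
        return (demand, price), reward, self.done, {}

env = ToyEnergyEnv(demands=[10.0, 20.0, 15.0], prices=[50.0, 100.0, 80.0])
observation = env.reset()                              # (10.0, 50.0)
observation, reward, done, info = env.step(12.0)       # 8 MW imported at 100
```

The agent picks 12 MW while looking at a demand of 10 MW, but the reward is computed against the next demand of 20 MW, which is why it has to learn some forecasting.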
It's easy to change the energy engineering parameters of the energy_py environment. The user can add more or different types of assets. The environment is flexible enough to model an electricity only generator all the way through to a Combined Heat, Cooling & Power plant.
Features of the energy engineering modeling
- add or remove assets from the list of asset_models
- add, remove or change state variables by changing energy_py.state_models and assets/time_series.csv. Note that the order of the dictionaries in energy_py.state_models should match the headers of the time_series.csv
- actions are applied to next state (as detailed above)
- the state includes demands for high-grade heat, low-grade heat, electricity and cooling
- energy balances are done for electricity, high-grade heat, low-grade heat and cooling
  - any heat not supplied by the assets is supplied by a gas boiler operating at 80 % HHV
  - any cooling not supplied by the assets is supplied by an electric chiller operating at a COP of 3
- no operation & maintenance costs are modeled
- all gas prices and efficiencies are on a higher heating value (i.e. gross) basis
- heat, electricity and cooling measured in MW
- reward is calculated based on the net energy cost
net energy cost = export electricity revenue - (gas cost + import electricity cost)
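A worked sketch of the energy balance and reward described above, for a single one-hour time step. The function name, argument names and numbers are assumptions for illustration, and high-grade and low-grade heat are merged into a single heat balance to keep it short. Only the 80 % HHV boiler efficiency, the chiller COP of 3 and the net energy cost formula come from the text.

```python
BOILER_EFFICIENCY = 0.80  # gas boiler, HHV basis
CHILLER_COP = 3.0         # electric chiller

def net_energy_cost(demand, supplied, asset_gas, gas_price,
                    import_price, export_price):
    """demand and supplied are dicts of MW for heat, cooling, electricity."""
    # heat not supplied by the assets comes from the gas boiler
    heat_shortfall = max(demand["heat"] - supplied["heat"], 0.0)
    boiler_gas = heat_shortfall / BOILER_EFFICIENCY
    # cooling not supplied by the assets comes from the electric chiller
    cooling_shortfall = max(demand["cooling"] - supplied["cooling"], 0.0)
    chiller_electricity = cooling_shortfall / CHILLER_COP
    # electricity balance: positive means import, negative means export
    net_electricity = (demand["electricity"] + chiller_electricity
                       - supplied["electricity"])
    import_cost = max(net_electricity, 0.0) * import_price
    export_revenue = max(-net_electricity, 0.0) * export_price
    gas_cost = (asset_gas + boiler_gas) * gas_price
    return export_revenue - (gas_cost + import_cost)

demand = {"heat": 10.0, "cooling": 3.0, "electricity": 5.0}
supplied = {"heat": 6.0, "cooling": 0.0, "electricity": 8.0}
cost = net_energy_cost(demand, supplied, asset_gas=20.0, gas_price=20.0,
                       import_price=60.0, export_price=40.0)
```

In this example the boiler burns 4 / 0.8 = 5 MW of gas, the chiller draws 3 / 3 = 1 MW of electricity, and 2 MW is exported, so the result is 2 x 40 - (25 x 20 + 0), roughly -420.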
- create a base class for assets
- add more complex value function approximations (1-D convolutional or recurrent neural networks)
- learner based on policy gradients
- action space to only have [0,0] rather than [67,0]
- ability to select which variables to use as state
- abstraction of steam headers - ability to link with heat demands etc
- heat and mass balances.
- deaerator modeling
- energy storage (thermal or electrical), which would be another asset to operate
- expanded engine library (prime mover types & engine models)
- penalty for start/stops
- O&M costs
Visit adgefficiency.com where I blog about energy and machine learning.
Thanks for reading!