gym-multilayerthinfilm

Overview

The proposed OpenAI gym environment uses a parallelized transfer-matrix method (TMM) to implement the optimization of multi-layer thin films as parameterized Markov decision processes. A very intuitive example is provided in example.py. While the underlying physical methods are well studied and have been known for decades, the contribution of this code lies in their transfer to an OpenAI gym environment. The intention is to enable AI researchers without optical expertise to solve the corresponding parameterized Markov decision processes. Due to their structure, solving such problems is still an active field of research in the AI community.
The publication "Parameterized Reinforcement learning for Optical System Optimization" used this environment.

Installation

1.
pip install git+https://github.com/MLResearchAtOSRAM/gym-multilayerthinfilm.git

2.
Clone the repository and execute setup.py

If any dependency is not fulfilled, you can create an environment using gym_multilayerthinfilm.yml, which is located in the package folder; don't forget to specify your Python environment folder/path there (prefix).
In general, there are no unusual dependencies aside from numpy, matplotlib, seaborn, dask and gym. The tmm package can be installed with the following command if necessary:
pip install git+https://github.com/sbyrnes321/tmm.git

Getting started

To get started, you can work through the tutorial notebook example.ipynb or just check out quickstarter.py!

Multi-layer thin films meet parameterized reinforcement learning

Reinforcement learning is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize a notion of cumulative reward. This code implements such an environment for the optimization of multi-layer thin films. In essence, the environment executes the actions taken by an agent. These actions determine which material of which thickness to stack next, thereby successively forming a multi-layer thin film as illustrated in figure 1. Such a multi-layer thin film exhibits optical characteristics. By comparing the actual characteristics with the desired, user-defined ones, a numeric reward is computed, from which the agent learns to distinguish between good and bad design choices. Due to its physical and mathematical structure, the optimization of multi-layer thin films remains a challenging and thus still active field of research, and it has received attention in many recent publications. This creates a natural need for a standardized environment that makes the corresponding research more trustworthy, comparable and consistent.

Figure 1: Principal idea of an OpenAI gym environment. The agent takes an action that specifies the material and thickness of the layer to stack next. The environment builds the multi-layer thin film by executing these actions consecutively and assigns a reward to the proposed multi-layer thin film based on how closely its actual characteristic (solid orange line) matches the desired one (dashed orange line). The agent uses this experience to adapt its actions in order to increase the reward and thus generate increasingly sophisticated multi-layer thin films.
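
To make the reward idea concrete, here is a minimal sketch of how a reward could be derived from the mismatch between an actual and a desired spectrum. The function name reward_from_spectra and the choice of "one minus the mean absolute deviation" are assumptions made for illustration only; the reward actually computed by the environment may be defined differently.

```python
def reward_from_spectra(actual, target):
    """Hypothetical reward: 1 minus the mean absolute deviation of two spectra.

    Both inputs are equal-length sequences of reflectivities in [0, 1],
    so the reward also lies in [0, 1], with 1 meaning a perfect match.
    """
    if len(actual) != len(target):
        raise ValueError("actual and target must have the same length")
    if not actual:
        raise ValueError("spectra must not be empty")
    deviation = sum(abs(a - t) for a, t in zip(actual, target)) / len(actual)
    return 1.0 - deviation


# Illustrative values only: reflect the first two wavelengths, transmit the rest.
target = [1.0, 1.0, 0.0, 0.0]
good_design = [0.9, 0.8, 0.1, 0.2]   # close to the target
poor_design = [0.5, 0.5, 0.5, 0.5]   # flat response, far from the target

print(reward_from_spectra(good_design, target))
print(reward_from_spectra(poor_design, target))
```

With these made-up numbers, the design that follows the target more closely receives the higher reward, which is the signal an agent would learn from.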

Description of key features

The environment can include
• cladding of the multi-layer thin film (e.g. substrate and ambient materials),
• dispersive and dissipative materials,
• spectral and angular optical behavior of multi-layer thin films (see figure 2),
• … and many more.
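
For readers without an optics background, the following is a minimal sketch of the textbook characteristic-matrix form of the transfer-matrix method. It is deliberately restricted to normal incidence and to lossless, non-dispersive layers (real, wavelength-independent refractive indices), so it covers far less than the list above. It is not the parallelized implementation used by this package, and the function name reflectance and its parameters are assumptions made for this example.

```python
import math


def reflectance(n_layers, d_layers, wavelength, n_ambient=1.0, n_substrate=1.52):
    """Normal-incidence reflectance of a lossless layer stack.

    n_layers: real refractive indices, ordered from the ambient side.
    d_layers: layer thicknesses in the same length unit as `wavelength`.
    """
    # Accumulated 2x2 characteristic matrix, initialised to the identity.
    m11, m12, m21, m22 = 1.0 + 0j, 0j, 0j, 1.0 + 0j
    for n, d in zip(n_layers, d_layers):
        delta = 2.0 * math.pi * n * d / wavelength   # phase thickness of the layer
        c, s = math.cos(delta), math.sin(delta)
        a11, a12, a21, a22 = c, 1j * s / n, 1j * n * s, c
        # Right-multiply the accumulated matrix by this layer's matrix.
        m11, m12, m21, m22 = (
            m11 * a11 + m12 * a21,
            m11 * a12 + m12 * a22,
            m21 * a11 + m22 * a21,
            m21 * a12 + m22 * a22,
        )
    # Apply the stack matrix to the substrate boundary vector (1, n_substrate).
    b = m11 + m12 * n_substrate
    c_field = m21 + m22 * n_substrate
    r = (n_ambient * b - c_field) / (n_ambient * b + c_field)
    return abs(r) ** 2


wavelength = 550.0                       # nm, illustrative
n_coating = 1.38                         # illustrative low-index coating
quarter_wave = wavelength / (4.0 * n_coating)

print(reflectance([], [], wavelength))                        # bare substrate
print(reflectance([n_coating], [quarter_wave], wavelength))   # single coating
```

With these illustrative values, the single quarter-wave layer lowers the reflectance from roughly 4.3 % for the bare substrate to roughly 1.3 %, which matches the well-known closed-form result for a quarter-wave anti-reflection coating.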

The environment class allows you to
• execute so-called parameterized actions (see the publication) that define a multi-layer thin film,
• evaluate the generated thin film given a desired optical response, and
• render the results (see figure 2).

In general, the comprehensive optimization of multi-layer thin films with regard to their optical response encompasses
• the number of layers (integer),
• the thickness of each layer (float),
• the material of each layer (categorical, integer).
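
The three quantities above map naturally onto a parameterized action: a discrete material choice paired with a continuous thickness, repeated for a variable number of layers. The sketch below illustrates that data structure only. The names LayerAction, MATERIALS and build_stack, the material list, the nanometre unit and the layer limit are all hypothetical and are not taken from this package's API.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical material catalogue; the integer action indexes into this list.
MATERIALS = ["SiO2", "TiO2", "MgF2"]


@dataclass(frozen=True)
class LayerAction:
    material: int      # categorical choice, stored as an integer index
    thickness: float   # continuous parameter, assumed here to be in nm


def build_stack(actions: List[LayerAction], max_layers: int = 10) -> List[Tuple[str, float]]:
    """Turn a sequence of parameterized actions into a (material, thickness) stack."""
    if len(actions) > max_layers:
        raise ValueError("too many layers")
    stack = []
    for action in actions:
        if not 0 <= action.material < len(MATERIALS):
            raise ValueError("unknown material index")
        if action.thickness <= 0:
            raise ValueError("thickness must be positive")
        stack.append((MATERIALS[action.material], action.thickness))
    return stack


# The number of layers is simply the number of actions taken.
design = build_stack([LayerAction(1, 60.0), LayerAction(0, 95.0), LayerAction(1, 60.0)])
print(len(design), design)
```

Keeping the discrete and continuous parts in one record makes explicit why such problems are harder than purely discrete or purely continuous ones: an agent has to choose both at every step, and the number of steps is itself part of the design.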