Q-Behave is machine learning library written in C++ compatible with Arduino and other embeded devices. The idea behind machine learning is to allow your device to acquire patterns based on user interactions. This is achieved without the need to program them beforehand.
To illustrate this concept take RGB diode with only one button as an example. After pressing the button the user expects to see diode flash with some color. Next user will have different color in mind when pressing the button. In a conventional way the system cannot be programmed to guess user intention on which color to show.
When we put Q-Behave as a processor between interaction (pressing a button) and state (flashing the diode) the device will choose one of possible states based on prior experience gained from user interaction. So when the user expects to see green light and instead the led flashes a different color, the user presses the button immediately again to finally get the desired color. This way the user informs the processor that he/she was expecting a different state of the diode. After several trials the device responds with a correct color right the first time because it learns what the user expects from it.
Utilizing this technique one program can behave differently and adapt to different users. In fact the same rule applies when user changes his mind thus forcing the device to learn the new behaviour. After some trials with the old color the program will adapt to new rules and satisfy user expectations without the need to reprogram it.
To understand better what is this system take a look at block diagram of Q-Behave system.
Inputs represents anything that can interact with the system like for example buttons or timers. It goes into the State Controller which checks the memory and ask Learning Processor to calculate the best action. As an outcome of this process an action changes the State of the machine. Reward or penalty is calculated either when triggered from external interaction or timer or as a result of repetition of the same action.
In Arduino code interaction is just an invocation of the function with the integer parameter representing the input type. States of the system are a bit more complicated, as each state have to be encalupsated as an object of class extending the abstract State class defined in the system. What is already provided can be represented by the folowing figure:
The only thing that might need additional explanation is the testing environment and a special class MockState. The purpose of those two thing is to provide simple solution for testing the behaviour algorithm without the need of conneting any physical device.
To download this library using git:
git clone git@github.com:smigielski/q-behave.git
or
git clone https://github.com/smigielski/q-behave.git
To install, copy libraries/q-behave to your Arduino/libraries/ directory. No other tasks are required and you have library ready to be used. Check example projects for details.
First example that you can use to learn how this library works and to check how good user experience is when using learning devices. Except Arduino there are only few requiered components so I am quite sure you can build it with what you have.
Part Type | Amount | Properties |
---|---|---|
1 |
Arduino Uno (Rev3) |
type Arduino UNO (Rev3) |
1 |
RGB LED (com. anode, rbg) |
package 5 mm [THT]; pin order rbg; |
1 |
Pushbutton |
package [THT] |
1 |
10k Ω Resistor |
package THT; resistance 10kΩ; |
It is best to first build it on the breadbord to see how it works:
If you need we also provide you detail schematics file for a reference:
When you have hardware in place there is time to see software in more detail. First thing is organization of the states.
//Define states RestState restState = RestState(); LedState green = LedState("green",9); LedState blue = LedState("blue",10); LedState red = LedState("red",11); //Prepare connection map between states. Note that it is allow only to move //from rest state to led states and off again. Action restActions[] = { { &green, 0.0 }, { &blue, 0.0 },{ &red, 0.0 } }; Action greenActions[] = { { &restState, 0.0 }}; Action blueActions[] = { { &restState, 0.0 }}; Action redActions[] = { { &restState, 0.0 }}; StateActions states[] = { {&restState,3, restActions }, {&green,1,greenActions}, {&blue,1,blueActions}, {&red,1,redActions} }; StateMap stateMap = { 4, states };
As mentioned earlier RestState is the initial state with all leds turn off and each LedState represent different color. In action arrays there is definiton of posible action that Q-Behave can take from one state to the other. The last thing is that we have to gather it together in states array and than in state map, structure representing state system.
Next we can see how interaction and reward function is defined. In this case with have only one interaction button and as a reward there is timer configured.
//Interactions SimpleButton button1 = SimpleButton(14); SimpleTimer rewardTimer; int timerId; //Button configuration void onReward(){ Serial.println("onReward: "); brain.stop(10.0); } void onButton1(SimpleButton& b){ Serial.print("onButton1: "); Serial.println(b.pin); brain.start(0); if (rewardTimer.isEnabled(timerId)){ //if timer was already enable than restart it rewardTimer.restartTimer(timerId); } else { timerId = rewardTimer.setTimeout(5000, onReward); } } //Startup code void setup() { button1.pressHandler(onButton1); } // the loop routine runs over and over again forever: void loop() { button1.process(); rewardTimer.run(); }
The last piece of information that is missing is memory and processing system.
//Simple memory definition and configuration. All memory is blocked in heap. //Please note that memory is volatile right now and will not survive restart. double prob1[6]; double* internalMemmory[]={prob1}; SimpleMemory memory = SimpleMemory(stateMap,internalMemmory); //Main learning process. QLearningMachine brain = QLearningMachine(&memory,&restState);
To see full source code of this example go to Q_Behave_Single_Button.ino
Second example, also quite simple, is mainly the same as previous one but this time there are three buttons that can be learned individually and a special reward button. To see how the learning process look like see this video.
The bread board and the schematics look almost the same. The only difference is that now it contains more buttons:
and schematics:
If you want to be certain about what we have used this is the part list:
Part Type | Amount | Properties |
---|---|---|
1 |
Arduino Uno (Rev3) |
type Arduino UNO (Rev3) |
1 |
RGB LED (com. anode, rbg) |
package 5 mm [THT]; pin order rbg; |
4 |
Pushbutton |
package [THT] |
4 |
10k Ω Resistor |
package THT; resistance 10kΩ; |
This time we need to define more buttons and reward function as a callback to the button press instead of the timer:
SimpleButton button1 = SimpleButton(14); SimpleButton button2 = SimpleButton(15); SimpleButton button3 = SimpleButton(16); SimpleButton reward = SimpleButton(17);
and more interaction function:
void onReward(SimpleButton& b){ Serial.print("onReward: "); Serial.println(b.pin); brain.stop(10.0); } void onButton1(SimpleButton& b){ Serial.print("onButton1: "); Serial.println(b.pin); brain.start(0); } void onButton2(SimpleButton& b){ Serial.print("onButton2: "); Serial.println(b.pin); brain.start(1); } void onButton3(SimpleButton& b){ Serial.print("onButton3: "); Serial.println(b.pin); brain.start(2); }
Setup and main loop is also a bit bigger now, but still quite clean:
//Startup code void setup() { button1.pressHandler(onButton1); button2.pressHandler(onButton2); button3.pressHandler(onButton3); reward.pressHandler(onReward); } // the loop routine runs over and over again forever: void loop() { reward.process(); button1.process(); button2.process(); button3.process(); }
The last difference is that memory block have to accomodate more information, one prob array for each button.
//Simple memory definition and configuration. All memory is blocked in heap. //Please note that memory is volatile right now and will not survive restart. double prob1[6],prob2[6],prob3[6]; double* internalMemmory[]={prob1,prob2,prob3}; SimpleMemory memory = SimpleMemory(stateMap,internalMemmory);
State system is exactly the same as it was before:
//Define states RestState restState = RestState(); LedState green = LedState("green",9); LedState blue = LedState("blue",10); LedState red = LedState("red",11); //Prepare connection map between states. Note that it is allow only to move //from rest state to led states and off again. Action restActions[] = { { &green, 0.0 }, { &blue, 0.0 },{ &red, 0.0 } }; Action greenActions[] = { { &restState, 0.0 }}; Action blueActions[] = { { &restState, 0.0 }}; Action redActions[] = { { &restState, 0.0 }}; StateActions states[] = { {&restState,3, restActions }, {&green,1,greenActions}, {&blue,1,blueActions}, {&red,1,redActions} }; StateMap stateMap = { 4, states };
Full source code of this example can be found at Q_Behave_RGB_Led.ino