Power allocation using Q-learning
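
The title names the technique but gives no details, so the following is only a minimal sketch of what tabular Q-learning for power allocation could look like, not the author's actual method. Everything in it is an assumption made for illustration: the three channel gains, the three discrete power levels, the reward (rate `log2(1 + gain * power)` with unit noise power, minus a linear power cost), the toy channel that simply cycles through its gains, and the hyperparameters.

```python
import math
import random

# All values below are illustrative assumptions, not taken from the source.
GAINS = [0.1, 1.0, 10.0]   # hypothetical channel gains (one per state)
POWERS = [0.0, 1.0, 4.0]   # hypothetical discrete transmit power levels
POWER_COST = 0.5           # hypothetical price per unit of transmit power


def reward(gain, power):
    """Rate log2(1 + gain * power), unit noise power, minus a linear power cost."""
    return math.log2(1.0 + gain * power) - POWER_COST * power


def train(steps=30000, alpha=0.5, gamma=0.8, epsilon=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    n_states, n_actions = len(GAINS), len(POWERS)
    q = [[0.0] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(steps):
        # Explore with probability epsilon, otherwise act greedily on Q.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: q[state][a])
        r = reward(GAINS[state], POWERS[action])
        # Toy channel (assumption): the gain cycles deterministically.
        next_state = (state + 1) % n_states
        # Standard Q-learning update toward the bootstrapped target.
        target = r + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Power level with the highest learned Q-value in each channel state."""
    return [POWERS[max(range(len(POWERS)), key=lambda a: row[a])] for row in q]


if __name__ == "__main__":
    q_table = train()
    for gain, power in zip(GAINS, greedy_policy(q_table)):
        print(f"gain={gain}: transmit power={power}")
```

Because the toy channel evolves independently of the chosen power, the learned greedy policy should coincide with the power that maximises the immediate reward in each state, which makes the sketch easy to sanity-check. A realistic setup would more likely use a stochastic or action-dependent state (for example interference or a battery level), where the bootstrapped term actually changes which action is best.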