What is the purpose of encode(state, ts_id) when the state has already been assembled into a list?
I found that the state is as follows:
def __call__(self) -> np.ndarray:
    """Return the default observation."""
    phase_id = [1 if self.ts.green_phase == i else 0 for i in range(self.ts.num_green_phases)]  # one-hot encoding
    min_green = [0 if self.ts.time_since_last_phase_change < self.ts.min_green + self.ts.yellow_time else 1]
    density = self.ts.get_lanes_density()
    queue = self.ts.get_lanes_queue()
    observation = np.array(phase_id + min_green + density + queue, dtype=np.float32)
    return observation
My understanding is that the observation is already a flat numeric array, so why do we need the encode operation below?
def encode(self, state, ts_id):
    """Encode the state of the traffic signal into a hashable object."""
    phase = int(np.where(state[: self.traffic_signals[ts_id].num_green_phases] == 1)[0])
    min_green = state[self.traffic_signals[ts_id].num_green_phases]
    density_queue = [self._discretize_density(d) for d in state[self.traffic_signals[ts_id].num_green_phases + 1 :]]
    # tuples are hashable and can be used as key in python dictionary
    return tuple([phase, min_green] + density_queue)

def _discretize_density(self, density):
    return min(int(density * 10), 9)
I am looking forward to hearing from you.
This is to discretize the state when using tabular RL algorithms, where the state must be encoded as something that can be used as a dictionary key, for instance. A NumPy array is not hashable, and the density and queue entries are continuous, so encode bins each of them into 10 levels (0 to 9) and returns a tuple of discrete values. This is only necessary for tabular RL.
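To make this concrete, here is a minimal standalone sketch of why the encoding matters for a dictionary-based Q-table. It is not the library's agent code: NUM_GREEN_PHASES, NUM_ACTIONS, the sample observations, and the free function encode_state are hypothetical, np.argmax stands in for the np.where lookup quoted above, and min_green is cast to int as an extra assumption.

```python
import numpy as np

NUM_GREEN_PHASES = 2  # hypothetical: a signal with two green phases
NUM_ACTIONS = 2       # hypothetical: one action per green phase


def discretize(value: float) -> int:
    """Bin a value in [0, 1] into one of 10 levels (0 to 9)."""
    return min(int(value * 10), 9)


def encode_state(state: np.ndarray, num_green_phases: int) -> tuple:
    """Turn a float observation into a hashable tuple of ints."""
    # Index of the active phase in the one-hot prefix.
    phase = int(np.argmax(state[:num_green_phases]))
    # The min_green flag sits right after the one-hot prefix.
    min_green = int(state[num_green_phases])
    # Everything after that is density and queue, binned to 0..9.
    bins = [discretize(v) for v in state[num_green_phases + 1:]]
    return tuple([phase, min_green] + bins)


# Layout: [phase one-hot (2), min_green, density (2), queue (2)]
obs = np.array([0, 1, 1, 0.25, 0.875, 0.0, 1.0], dtype=np.float32)

# The raw array cannot be used as a dictionary key.
try:
    hash(obs)
    raw_is_hashable = True
except TypeError:
    raw_is_hashable = False

# The encoded tuple can, so it can index a dict-based Q-table.
q_table = {}
key = encode_state(obs, NUM_GREEN_PHASES)
q_table.setdefault(key, [0.0] * NUM_ACTIONS)

# A slightly different observation falls into the same bins,
# so it shares the same Q-table row.
obs_nearby = np.array([0, 1, 1, 0.28125, 0.875, 0.0, 1.0], dtype=np.float32)
key_nearby = encode_state(obs_nearby, NUM_GREEN_PHASES)
q_table.setdefault(key_nearby, [0.0] * NUM_ACTIONS)

print(key)           # (1, 1, 2, 8, 0, 9)
print(len(q_table))  # 1
```

With function approximation (for example a neural network policy) the float observation can be consumed directly, which is why this step is specific to tabular methods.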