MOMDP representation? #67
Comments
Yes, it is possible. Check out this documentation page. Regarding fully observable state variables, you can achieve that by having an observation model that simply returns the state.
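As a rough illustration of "an observation model that simply returns the state", here is a minimal plain-Python sketch. It only mirrors the `probability`/`sample` method names of a pomdp_py observation model so that it runs standalone; the class name `IdentityObservationModel` and the string states are hypothetical, and in an actual model the class would subclass `pomdp_py.ObservationModel`.

```python
class IdentityObservationModel:
    """Fully observable case: the observation equals the next state with probability 1."""

    def probability(self, observation, next_state, action):
        # All probability mass sits on the observation that matches the state.
        return 1.0 if observation == next_state else 0.0

    def sample(self, next_state, action):
        # Sampling is deterministic: the state itself is returned as the observation.
        return next_state


model = IdentityObservationModel()
print(model.probability("s2", "s2", "move"))  # 1.0
print(model.probability("s1", "s2", "move"))  # 0.0
```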
Sorry, I did not formulate my question well. What I meant to ask is how I should write my code so that, when converted to .pomdpx, it represents some state variables as fully observable. Right now I am working with a model that has access to the time elapsed since the beginning of the operation and has a maximum number of time steps in which to act (finite horizon). My approach is to create states that, on top of their ID (either an int or 'term' for the terminal state), also carry a time-step property:

    class TDState(pomdp_py.State):
        def __init__(self, state_id, time_step):
            self.id = state_id
            self.t = time_step
            self.name = f"s_{state_id}-t_{time_step}"

The methods for the observation model:

    class TDObservationModel(pomdp_py.ObservationModel):
        def __init__(self, conf_matrix):
            # conf_matrix is indexed as [time step][state][observation]
            self.observation_matrix = conf_matrix
            self.n_steps, self.n_states, self.n_obs = self.observation_matrix.shape

        def probability(self, observation, next_state, action):
            obs_idx = observation.id
            state_idx = next_state.id
            state_step = next_state.t
            return self.observation_matrix[state_step][state_idx][obs_idx]

The transition model includes the time parameter as well. I would like the time to be fully observable in the produced .pomdpx file, but given your comment, I think the way I am handling it would not accomplish the MOMDP representation. How should I do it instead?
Follow-up: I tried converting to .pomdpx with my current problem definition, and the file contains only one state variable, whose number of values equals the number of possible state IDs times the number of possible values of t, plus one. With 5 targets and 8 time steps, I get a single state variable with 41 states (5 × 8 = 40, and the extra one is the terminal state). I would like to know how to define my model so that it has one state variable with 5 values (ID), which is not fully observable, and another state variable with 8 values (time), which is fully observable.
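The counting above can be reproduced with a short, self-contained sketch. It only illustrates the arithmetic of the flat versus the desired factored encoding and makes no claim about what the .pomdpx converter emits; the ID and time ranges are assumed to start at 0, and how the terminal state would fit into the factored form is left open.

```python
from itertools import product

targets = range(5)     # assumed target IDs 0..4
time_steps = range(8)  # assumed time steps 0..7

# Flat encoding: one state per (ID, time) pair, plus a single terminal state.
flat_states = [f"s_{i}-t_{t}" for i, t in product(targets, time_steps)] + ["term"]
print(len(flat_states))  # 41

# Desired factored encoding: two separate variables with 5 and 8 values.
factored = {"target": list(targets), "time": list(time_steps)}
print({name: len(values) for name, values in factored.items()})  # {'target': 5, 'time': 8}
```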
I will provide a sketch of the idea.

    class State(pomdp_py.SimpleState):
        def __init__(self, target, time_step):
            super().__init__(data=(target, time_step))

    class ObservationModel(pomdp_py.ObservationModel):
        def sample(self, next_state, action):
            # Report only the time step; the target stays hidden.
            time_step = next_state.data[1]
            return pomdp_py.SimpleObservation(data=time_step)

This makes time_step observable, but not the target.
That makes it clear, thank you! I imagine the ObservationModel needs to know ... What I mean is that the target needs to be part of the observation as well.
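One way to read "the target needs to be part of the observation as well" is an observation made of two parts: a noisy reading of the target and the exact time step. The sketch below is a plain-Python illustration under that assumption, not the library's API: the class name `TimeAwareObservationModel`, the use of tuples `(target, time_step)` for states and `(target_obs, time_step)` for observations, and the small confusion matrices are all hypothetical. In pomdp_py the same logic would live in a subclass of `pomdp_py.ObservationModel`.

```python
import random


class TimeAwareObservationModel:
    """Observation is (target_obs, time_step): the time is exact, the target is noisy."""

    def __init__(self, conf_matrix):
        # conf_matrix[t][target][obs] = P(target_obs = obs | target, time step t)
        self.conf_matrix = conf_matrix

    def probability(self, observation, next_state, action):
        target, t = next_state
        target_obs, t_obs = observation
        if t_obs != t:
            # The time component is never misreported.
            return 0.0
        return self.conf_matrix[t][target][target_obs]

    def sample(self, next_state, action):
        target, t = next_state
        weights = self.conf_matrix[t][target]
        # Draw the noisy target reading; pass the time step through unchanged.
        target_obs = random.choices(range(len(weights)), weights=weights)[0]
        return (target_obs, t)


# Hypothetical confusion matrices for 2 time steps, 2 targets, 2 target readings.
conf = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.7, 0.3], [0.4, 0.6]],
]
model = TimeAwareObservationModel(conf)
print(model.probability((0, 1), (0, 1), None))  # 0.7
print(model.probability((0, 0), (0, 1), None))  # 0.0
```

With this shape, summing `probability` over all target readings at the correct time step gives 1, while any observation carrying the wrong time step has probability 0, which is what makes the time component fully observable.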
Hello,
I would like to know if it is currently possible to create a problem with fully observable state variables and solve it with SARSOP using a .pomdpx file.