SlateQ agent implementation #698

Open
rahul-zomato opened this issue Nov 15, 2022 · 0 comments

Is the use of `next_state` in the `next_q_values` calculation of the SlateQ agent deliberate? See https://github.com/facebookresearch/ReAgent/blob/main/reagent/training/slate_q_trainer.py#L230

The SlateQ agent in recsim, implemented by the authors of the SlateQ paper, uses the state rather than the next state from the replay buffer to compute `next_q_values`; see google-research/recsim#26.
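
To make the difference concrete, here is a minimal sketch of a one-step SlateQ-style target in plain Python. It is not the ReAgent or recsim code: `slate_q_target`, `q_item`, `click_probs`, and the toy dot-product Q-function are all hypothetical names chosen for illustration. It assumes the target has the form `reward + gamma * sum_j P(j) * Q(s, j)` over the items of the next slate, and shows that the result changes depending on whether `s` is the stored state or the stored next state.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def slate_q_target(
    reward: float,
    bootstrap_state: Vector,
    next_slate: Sequence[Vector],
    click_probs: Sequence[float],
    q_item: Callable[[Vector, Vector], float],
    gamma: float = 0.9,
    terminal: bool = False,
) -> float:
    """One-step target: reward + gamma * sum_j P(j) * Q(bootstrap_state, j).

    `bootstrap_state` is whichever state the item-level Q-values are
    evaluated on; the question in this issue is whether that should be
    the transition's state or its next state.
    """
    if terminal:
        # No bootstrapping past the end of an episode.
        return reward
    next_q = sum(
        p * q_item(bootstrap_state, item)
        for p, item in zip(click_probs, next_slate)
    )
    return reward + gamma * next_q


def dot_q(state: Vector, item: Vector) -> float:
    # Toy item-level Q-function (an assumption, for illustration only).
    return sum(s * i for s, i in zip(state, item))


# Toy transition: two-item slate with fixed click probabilities.
state, next_state = [1.0, 0.0], [0.0, 1.0]
slate = [[1.0, 0.0], [0.0, 1.0]]
probs = [0.25, 0.75]

target_with_next_state = slate_q_target(1.0, next_state, slate, probs, dot_q, gamma=0.5)
target_with_state = slate_q_target(1.0, state, slate, probs, dot_q, gamma=0.5)
```

On this toy transition the two choices give different targets, so the two implementations would not train toward the same values unless the stored state and next state coincide.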
