[WIP] AveragingEpisodesController #89
base: master
…ments and return the median of the returns; TODO: implement this nicely in a controller subclass
* does not support recording of trajectories and raw reward histories
* allows accumulating and averaging reward histories with a function passed via feedback_averaging_function
* allows preparing the environment before each repetition (e.g. seeding) to make results repeatable
See base class "Controller" for details on usage.

Additional Parameters
----------
More "-" characters: the underline should be as long as the heading.

Additional Parameters
----------
num_repetitions_to_average : int, optional (default: 10)
We usually try to use n_ as an abbreviation for number.
if the environment is stochastic or specifically prepared via the
argument environment_preparation_function

feedback_averaging_function : function, optional (default: median_of_sums)
It is a callback, not just a function. It also does not have to be a function; it can be any callable.
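To illustrate the point about callables, here is a minimal sketch. It assumes the callback receives one sequence of feedbacks per rollout and returns a single number. That signature, the body of `median_of_sums`, and the names `scaled_median`, `WorstOfSums` and `average` are assumptions for illustration, not code from this PR.

```python
from functools import partial
from statistics import median


def median_of_sums(feedbacks_per_rollout):
    """Sum the feedbacks of each rollout, then take the median of the sums.

    Assumed signature: one sequence of feedbacks per rollout. The
    sequences may differ in length.
    """
    return median([sum(feedbacks) for feedbacks in feedbacks_per_rollout])


def scaled_median(feedbacks_per_rollout, scale):
    """Hypothetical variant with an extra parameter, to be bound via partial."""
    return scale * median_of_sums(feedbacks_per_rollout)


class WorstOfSums:
    """Hypothetical callable object: returns the lowest return plus an offset."""

    def __init__(self, offset=0.0):
        self.offset = offset

    def __call__(self, feedbacks_per_rollout):
        return min(sum(f) for f in feedbacks_per_rollout) + self.offset


def average(feedbacks_per_rollout, callback=median_of_sums):
    """Apply any callable as the averaging callback."""
    if not callable(callback):
        raise TypeError("feedback_averaging_function must be callable")
    return callback(feedbacks_per_rollout)


# Three rollouts with a different number of feedbacks each.
rollouts = [[1.0, 2.0], [4.0], [0.5, 0.5, 0.5]]

result_function = average(rollouts)                                   # plain function
result_lambda = average(rollouts, lambda fb: max(sum(f) for f in fb))  # lambda
result_partial = average(rollouts, partial(scaled_median, scale=2.0))  # partial
result_object = average(rollouts, WorstOfSums(offset=1.0))             # object with __call__
```

All four calls go through the same `callable()` check, which is why documenting the parameter as "callable" rather than "function" describes the accepted values more accurately.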
Note that the number of feedbacks per rollout may vary.
See AveragingEpisodesController.median_of_sums (default) for an example

environment_preparation_function : function, optional (default: None)
same applies here
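As a rough sketch of how a preparation callback could make repeated rollouts repeatable: the callback below seeds the environment before each repetition. The `NoisyEnvironment` class, the callback signature `(env, repetition)`, and the function `averaged_feedback` are hypothetical stand-ins, not the interfaces used in this PR.

```python
import random
from statistics import median


class NoisyEnvironment:
    """Hypothetical stand-in for a stochastic environment."""

    def __init__(self):
        self.rng = random.Random()

    def run_episode(self):
        # The number of feedbacks per rollout may vary.
        n_steps = self.rng.randint(1, 5)
        return [self.rng.gauss(0.0, 1.0) for _ in range(n_steps)]


def seed_environment(env, repetition):
    """Preparation callback (assumed signature): seed with the repetition index."""
    env.rng.seed(repetition)


def averaged_feedback(env, n_repetitions=10, averaging=None, preparation=None):
    """Run several rollouts and reduce their feedbacks to a single value."""
    if averaging is None:
        # Default mirrors the median-of-sums idea described in the docstring.
        def averaging(feedbacks_per_rollout):
            return median([sum(f) for f in feedbacks_per_rollout])

    feedbacks_per_rollout = []
    for repetition in range(n_repetitions):
        if preparation is not None:
            preparation(env, repetition)  # any callable works here
        feedbacks_per_rollout.append(env.run_episode())
    return averaging(feedbacks_per_rollout)


env = NoisyEnvironment()
first = averaged_feedback(env, preparation=seed_environment)
second = averaged_feedback(env, preparation=seed_environment)
```

Because every repetition is seeded with its own index, `first` and `second` are computed from identical rollouts. Without the preparation callback the two values would generally differ.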
self.record_inputs = False
self.record_outputs = False
self.record_feedbacks = False
self.accumulate_feedbacks = False  # see feedback_averaging_function
this comment does not really help
@maotto any progress?
added an AveragingEpisodesController