
# Repeat Copy

| Title | Action Type | Action Shape | Action Values | Observation Shape | Observation Values | Average Total Reward | Import |
|---|---|---|---|---|---|---|---|
| Repeat Copy | Discrete | (3,) | [(0, 1), (0, 1), (0, base-1)] | (1,) | (0, base) | | `from gym.envs.algorithmic import repeat_copy` |

This task involves copying the content of the input tape to the output tape in normal order, reverse order, and normal order again. For example, for the input [x1 x2 … xk], the required output is [x1 x2 … xk xk … x2 x1 x1 x2 … xk]. This task was originally used in the paper *Learning Simple Algorithms from Examples*.
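The target sequence can be sketched in a few lines of Python. The helper name below is illustrative, not part of the environment's API:

```python
def repeat_copy_target(xs):
    """Target output for the Repeat Copy task: the input sequence in
    normal order, then reverse order, then normal order again."""
    return xs + xs[::-1] + xs
```

For example, `repeat_copy_target([1, 2, 3])` yields `[1, 2, 3, 3, 2, 1, 1, 2, 3]`.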

The model has to learn:

  • the correspondence between input and output symbols.
  • how to execute the move-left and move-right actions on the input tape.

The agent takes a 3-element vector as its action. The action space is (x, w, v), where:

  • x selects the movement direction on the input tape. It can take values (0, 1).
  • w selects whether or not to write to the output tape. It can take values (0, 1).
  • v selects the value to be written to the output tape. It can take values (0, base-1).

The observation space size is (1,).
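A 3-element action can be unpacked as sketched below. The helper and the 0=left / 1=right mapping are assumptions for illustration; check the environment source for the exact convention:

```python
def describe_action(action):
    """Decode a Repeat Copy action tuple (x, w, v) into a readable form.
    Assumes x: 0 = move left, 1 = move right (illustrative mapping)."""
    x, w, v = action
    move = ["left", "right"][x]
    return move, bool(w), v
```

For example, `describe_action((1, 1, 3))` reads as "move right, write the symbol 3".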

Rewards:

Rewards are issued similarly to the other Algorithmic Environments. Reward schedule:

  • write a correct character: +1
  • write a wrong character: -0.5
  • run out the clock: -1
  • otherwise: 0
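The schedule above can be expressed as a small per-step reward function. This is a sketch of the rule, not the environment's internal implementation:

```python
def step_reward(wrote, correct=False, time_exceeded=False):
    """Per-step reward following the Repeat Copy reward schedule:
    +1 for a correct write, -0.5 for a wrong write, -1 when the
    episode runs out the clock, 0 otherwise."""
    if time_exceeded:
        return -1.0
    if wrote:
        return 1.0 if correct else -0.5
    return 0.0
```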

## Arguments

`gym.make('RepeatCopy-v0', base=5)`

`base`: Number of distinct characters to read/write.

## Version History

  • v0: Initial version release (1.0.0)