Title | Action Type | Action Shape | Action Values | Observation Shape | Observation Values | Average Total Reward | Import
---|---|---|---|---|---|---|---
Repeat Copy | Discrete | (3,) | [(0, 1), (0, 1), (0, base-1)] | (1,) | (0, base) | | `from gym.envs.algorithmic import repeat_copy`
This task involves copying content from the input tape to the output tape in normal order, then in reverse order, then in normal order again. For example, for input `[x1 x2 ... xk]` the required output is `[x1 x2 ... xk xk ... x2 x1 x1 x2 ... xk]`. This task was originally used in the paper *Learning Simple Algorithms from Examples*.
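As a concrete illustration, here is a minimal Python sketch of the target sequence; the helper name `repeat_copy_target` is hypothetical, not part of Gym:

```python
def repeat_copy_target(inp):
    """Target output for Repeat Copy: the input in normal order,
    then reversed, then in normal order again."""
    return inp + inp[::-1] + inp

# For input [1, 3, 2] the required output is
# [1, 3, 2, 2, 3, 1, 1, 3, 2].
print(repeat_copy_target([1, 3, 2]))
```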
The model has to learn:
- the correspondence between input and output symbols.
- how to execute the move-left and move-right actions on the input tape.
The agent takes a 3-element vector as its action. The action space is `(x, w, v)`, where:

- `x` is the direction to move on the input tape. It can take values (0, 1).
- `w` is whether or not to write to the output tape. It can take values (0, 1).
- `v` is the value to be written to the output tape. It can take values (0, base-1).

The observation space size is (1,): the character currently under the input tape's read head.
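A minimal interaction sketch, assuming the classic Gym API (`reset` returning an observation and `step` returning a 4-tuple), which these algorithmic environments used:

```python
import gym

env = gym.make('RepeatCopy-v0')
obs = env.reset()  # the character under the input tape's read head

done, total_reward = False, 0.0
while not done:
    # An action is a 3-tuple: (move direction, write flag, value to write).
    x, w, v = env.action_space.sample()
    obs, reward, done, info = env.step((x, w, v))
    total_reward += reward

print('episode return:', total_reward)
```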
Rewards:
Rewards are issued similarly to the other algorithmic environments. Reward schedule:
- write a correct character: +1
- write a wrong character: -0.5
- run out the clock: -1
- otherwise: 0
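The schedule above can be summarized as a small function. This is an illustrative sketch of the rules, not the environment's actual internal code:

```python
def reward_for(wrote, correct, out_of_time):
    """Illustrative Repeat Copy reward schedule (not Gym internals).

    wrote: the agent chose to write this step
    correct: the written character matched the target
    out_of_time: the episode's time limit was exceeded
    """
    if out_of_time:
        return -1.0
    if wrote:
        return 1.0 if correct else -0.5
    return 0.0
```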
`gym.make('RepeatCopy-v0', base=5)`

- `base`: Number of distinct characters to read/write.
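A short usage sketch showing how `base` shapes the spaces; the exact space types printed are an assumption based on Gym's Tuple/Discrete spaces:

```python
import gym

env = gym.make('RepeatCopy-v0', base=5)
# With base=5 the write-value component covers 5 symbols (0..4),
# and observations range over the 5 symbols plus a blank.
print(env.action_space)       # e.g. Tuple(Discrete(2), Discrete(2), Discrete(5))
print(env.observation_space)  # e.g. Discrete(6)
```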
Version History:
- v0: Initial version release (1.0.0)