
# Gated Attention

Implementation of the paper *Not All Attention Is Needed: Gated Attention Network for Sequence Data* (GA-Net).

## Flow Diagram for the Network

There are two networks in the model (a rough sketch of how they fit together follows this list):

  1. Backbone Network
  2. Auxiliary Network
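
The repository's exact layer choices are not spelled out in this README, so the following is only a minimal PyTorch-style sketch of how the two networks could fit together: the auxiliary network reads the input sequence and emits a per-token probability of its gate being open, and the backbone network attends only over the tokens whose gates end up open. All names and dimensions (`AuxiliaryNet`, `BackboneNet`, `hidden_dim`, the GRU/LSTM encoders, and so on) are illustrative assumptions, not the repository's actual API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AuxiliaryNet(nn.Module):
    """Predicts, for each token, the probability that its attention gate is open."""

    def __init__(self, embed_dim, hidden_dim=64):
        super().__init__()
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.gate = nn.Linear(2 * hidden_dim, 1)

    def forward(self, x):                                   # x: (batch, seq_len, embed_dim)
        h, _ = self.rnn(x)
        return torch.sigmoid(self.gate(h)).squeeze(-1)      # (batch, seq_len) gate probabilities


class BackboneNet(nn.Module):
    """Encodes the sequence and attends only over tokens whose gates are open."""

    def __init__(self, embed_dim, hidden_dim=128, num_classes=2):
        super().__init__()
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden_dim, 1)
        self.out = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, x, gates):                            # gates: (batch, seq_len), 1 = open
        h, _ = self.rnn(x)                                  # (batch, seq_len, 2*hidden_dim)
        scores = self.score(h).squeeze(-1)                  # unnormalised attention scores
        scores = scores.masked_fill(gates <= 0.5, float("-inf"))   # closed gates -> zero weight
        attn = torch.nan_to_num(F.softmax(scores, dim=-1))  # guard against all-closed sequences
        context = torch.bmm(attn.unsqueeze(1), h).squeeze(1)
        return self.out(context), attn


class GANet(nn.Module):
    def __init__(self, embed_dim=100):
        super().__init__()
        self.aux = AuxiliaryNet(embed_dim)
        self.backbone = BackboneNet(embed_dim)

    def forward(self, x):
        gate_probs = self.aux(x)
        # Hard gates shown here for simplicity; the paper trains the gates end-to-end
        # with a differentiable (Gumbel-Softmax-style) relaxation, which is omitted.
        gates = (gate_probs > 0.5).float()
        logits, attn = self.backbone(x, gates)
        return logits, gate_probs, attn
```

A quick smoke test would look like `logits, gate_probs, attn = GANet()(torch.randn(4, 20, 100))`.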

## Comparison with Soft Attention Network

Soft attention assigns some weight (low or high) to every input token, whereas the gated attention network chooses only the most important tokens to attend to.
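
As a toy illustration (not taken from the paper or this repository), the snippet below contrasts the two behaviours on made-up scores: soft attention spreads non-zero weight over every token, while gating zeroes out the tokens whose (hypothetical) gates are closed and renormalises over the remaining ones.

```python
import torch
import torch.nn.functional as F

scores = torch.tensor([2.0, 0.1, -1.0, 1.5])   # made-up attention scores for 4 tokens
gates = torch.tensor([1.0, 0.0, 0.0, 1.0])     # hypothetical gate decisions (1 = open)

# Soft attention: every token gets some weight, however small.
soft = F.softmax(scores, dim=-1)
# approximately [0.554, 0.083, 0.028, 0.336]

# Gated attention: closed-gate tokens get exactly zero weight.
gated = F.softmax(scores.masked_fill(gates == 0, float("-inf")), dim=-1)
# approximately [0.622, 0.000, 0.000, 0.378]

print(soft)
print(gated)
```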

## Gate Probability and Gated Attention

Visualization of the probability of each input token's gate being open, alongside the resulting gated attention weight.
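
The original figure is not reproduced in this text, but a small plotting helper along the following lines could generate such a visualization; `tokens`, `gate_probs`, and `attn` are assumed to come from a trained model such as the sketch above, and the function name is hypothetical.

```python
import matplotlib.pyplot as plt

def plot_gates_and_attention(tokens, gate_probs, attn):
    """Bar plots of per-token gate-open probability and gated attention weight."""
    fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(8, 4))
    ax1.bar(range(len(tokens)), gate_probs)
    ax1.set_ylabel("P(gate open)")
    ax2.bar(range(len(tokens)), attn)
    ax2.set_ylabel("gated attention")
    ax2.set_xticks(range(len(tokens)))
    ax2.set_xticklabels(tokens, rotation=45, ha="right")
    plt.tight_layout()
    plt.show()
```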