Concept in NLP

1. Concept

One-hot vector

Word Embedding

Perplexity

Temperature

2. Model Card

RNN: Recurrent Neural Networks

Bi-directional RNNs

GRU: Gated Recurrent Units

LSTM: Long Short-Term Memory

Attention

  • The context vector is an expected value: the average of the value vectors, weighted by the attention distribution
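
The "expected value" reading of the context vector can be sketched in a few lines. This is a minimal illustration assuming plain dot-product attention for a single query; the function names `softmax` and `context_vector` and the toy numbers are made up for this example and do not come from any particular library.

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def context_vector(query, keys, values):
    """Dot-product attention for one query (hypothetical helper).

    Returns the context vector and the attention weights.
    """
    # One alignment score per source position.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Attention weights form a probability distribution over positions.
    weights = softmax(scores)
    # Context = sum_i weights[i] * values[i], i.e. the expectation of the
    # value vectors under the attention distribution.
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return context, weights

# Toy data (assumed for illustration only).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
context, weights = context_vector([1.0, 0.0], keys, values)
```

Because the weights sum to 1, the context vector always lies inside the convex hull of the value vectors.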

Transformer

  • Transformer models: T5, GPT-2, BERT

Reformer: The Efficient Transformer (uses reversible layers)

3. Metric

F1 score

BLEU: Bilingual Evaluation Understudy

  • Scores range from 0 to 1; the closer to 1, the better the candidate matches the reference translations
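
To make the 0-to-1 range concrete, here is a simplified sentence-level sketch of the BLEU idea: clipped n-gram precision combined with a brevity penalty. It assumes a single reference, uniform n-gram weights, and no smoothing, so it is not a replacement for a full BLEU implementation; `ngrams` and `simple_bleu` are hypothetical names chosen for this example.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=2):
    """Simplified BLEU: one reference, uniform weights, no smoothing."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, any zero precision gives 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: punish candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)

reference = "the cat sat on the mat".split()
perfect = simple_bleu(reference, reference)            # identical sentences
shorter = simple_bleu("the cat sat".split(), reference)  # correct but too short
```

An exact match scores 1, while the truncated candidate keeps perfect n-gram precision but is pulled below 1 by the brevity penalty.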

ROUGE: Recall-Oriented Understudy for Gisting Evaluation

GLUE: General Language Understanding Evaluation

4. Sampling and Decoding

Random sampling

Greedy Decoding

Temperature in sampling

  • Lower temperature setting: sharper output distribution, so the network is more confident and conservative
  • Higher temperature setting: flatter output distribution, so the network is more varied and random
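
The effect of temperature in the bullets above, and its contrast with greedy decoding, can be sketched as follows. This is a minimal illustration assuming the model emits a list of logits per step; the helper names and the example logits are invented here and are not tied to any framework.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by the temperature (must be > 0)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_step(logits):
    """Greedy decoding: always pick the highest-scoring token index."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_step(logits, temperature, rng):
    """Random sampling: draw a token index from the tempered distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]                       # assumed example scores
cold = softmax_with_temperature(logits, 0.5)   # sharper: mass moves to index 0
hot = softmax_with_temperature(logits, 2.0)    # flatter: closer to uniform
rng = random.Random(0)                         # seeded for repeatability
token = sample_step(logits, 1.0, rng)
```

As the temperature approaches 0, sampling behaves more and more like `greedy_step`; as it grows, the distribution approaches uniform.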

Beam Search

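  • Problems:
    1. Penalizes long sequences, because every added token multiplies the score by a probability below 1, so you should normalize the score by the sentence length
    2. Computationally expensive and memory-hungry, since several candidate sequences are kept and extended at every step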
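
The length problem in the list above shows up even on a toy example. The sketch below is one possible beam search over a hand-written next-token table, with a length-normalization exponent `alpha` (`alpha=0` means no normalization). The table `TOY_MODEL`, the function names, and the `alpha` parameter are assumptions made for illustration, not part of any library.

```python
import math

# Hypothetical toy model: next-token probabilities given the prefix so far.
TOY_MODEL = {
    (): {"a": 0.6, "<eos>": 0.4},
    ("a",): {"<eos>": 0.6, "b": 0.4},
    ("a", "b"): {"<eos>": 1.0},
}

def toy_next_log_probs(tokens):
    """Look up log-probabilities of the next token for a prefix."""
    return {tok: math.log(p) for tok, p in TOY_MODEL[tuple(tokens)].items()}

def beam_search(next_log_probs, beam_width, max_len, eos, alpha=1.0):
    """Return the best (tokens, log_prob) pair under length normalization."""
    beams = [([], 0.0)]   # unfinished hypotheses: (tokens, summed log-prob)
    finished = []
    for _ in range(max_len):
        # Extend every live hypothesis by every possible next token.
        candidates = [
            (tokens + [tok], score + logp)
            for tokens, score in beams
            for tok, logp in next_log_probs(tokens).items()
        ]
        # Keep only the beam_width best extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            (finished if tokens[-1] == eos else beams).append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    # Divide the summed log-prob by length**alpha before comparing.
    return max(finished, key=lambda c: c[1] / (len(c[0]) ** alpha))

raw_best = beam_search(toy_next_log_probs, 2, 3, "<eos>", alpha=0.0)
norm_best = beam_search(toy_next_log_probs, 2, 3, "<eos>", alpha=1.0)
```

On this table the unnormalized search stops immediately with `["<eos>"]`, while the normalized search prefers the longer `["a", "b", "<eos>"]`. The memory cost is also visible: every step builds and sorts `beam_width` times vocabulary-size candidates.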

MBR: Minimum Bayes Risk