Unable to reproduce joint training results #7

Open
xiaoda99 opened this issue Nov 13, 2018 · 0 comments
xiaoda99 commented Nov 13, 2018

I'm trying to reproduce the bAbI joint training results in the Universal Transformer paper (UT w/o ACT). My scripts are:

t2t-datagen \
  --t2t_usr_dir=t2t_usr_dir \
  --tmp_dir=babi_data/tmp \
  --data_dir=babi_data/data \
  --problem=babi_qa_sentence_all_tasks_10k

t2t-trainer \
  --t2t_usr_dir=t2t_usr_dir \
  --tmp_dir=babi_data/tmp \
  --data_dir=babi_data/data \
  --output_dir=babi_data/output \
  --problem=babi_qa_sentence_all_tasks_10k \
  --model=babi_universal_transformer \
  --hparams_set=universal_transformer_tiny \
  --train_steps=100000

However, I can't reproduce the results; I get test accuracy around 60% (I didn't train for the full 100000 steps, but the curve already seems to have plateaued). In particular, I'm not sure about three things:

  1. In transformer_base, the default batch_size is 4096:
    https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py#L1312
    Unlike T2T's BabiQaConcat, which inherits from TextProblem, this repo's BabiQaSentence inherits from Problem, so batch_size_means_tokens is False. That makes 4096 a very large batch (4096 * 70 * 12 = 3.4M tokens), and I got an OOM error on a Titan 1080 Ti card, so I changed batch_size to 512 (see the sketch after this list).

  2. In a Sep 3 commit, you changed the default transformer_ffn_type from sepconv to fc:
    tensorflow/tensor2tensor@e496897
    Should I use sepconv to run the experiments?

  3. The T2T code has undergone many changes since this repo was released. Will that affect the results?
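
For concreteness, here is the trainer command I'm experimenting with now, combining my guesses for points 1 and 2. It is only a sketch: I'm assuming the --hparams flag accepts a comma-separated list of name=value overrides and that batch_size and transformer_ffn_type are the right hparam names to override; please correct me if the setup expects something else.

# Same command as above, with batch_size reduced (point 1) and the ffn type
# switched back to sepconv (point 2). The --hparams override string is my
# assumption about how these two settings should be passed.
t2t-trainer \
  --t2t_usr_dir=t2t_usr_dir \
  --tmp_dir=babi_data/tmp \
  --data_dir=babi_data/data \
  --output_dir=babi_data/output \
  --problem=babi_qa_sentence_all_tasks_10k \
  --model=babi_universal_transformer \
  --hparams_set=universal_transformer_tiny \
  --hparams='batch_size=512,transformer_ffn_type=sepconv' \
  --train_steps=100000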

What actual batch_size did you use? Did you change any other hparams when running t2t-datagen and t2t-trainer?

It would be very helpful if you could share your flags.txt, flags_t2t.txt and hparams.json files. Attached are mine.
