Steps to replicate the experiments:
[Clone this Git repository]
[Start the Docker container with the repository mounted]
- docker run --gpus all -it --rm --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -v /home/ubuntu/GPU-Research:/code nvcr.io/nvidia/pytorch:21.10-py3
- cd /code
- pip install -e .
[Download the data]
- cd ./flexgpt/
- bash getdata.sh
- Feel free to abort the download of the 1B Words dataset early (after several seconds).
[Generate model.pt with random parameters]
- cd ./transformer-xl/
- python3 train_random_model_params.py
[Run inference on model.pt with a customized setup]
- Feel free to change the configs (batch_size, tgt_len, ext_lens, mem_lens, clamp_len) in the if __name__ == "__main__": section
- python3 eval_random_model_params.py
[After the run finishes, check the results]
- cd ./logs/
- View the log file you want.