This code trains a Vision Transformer (ViT) on the MNIST dataset using PyTorch.
The implementation is based on the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale".
The PyTorch Vision Transformer implementation this code builds on is available at "https://github.com/lucidrains/vit-pytorch#efficient-attention"
This code shows how to use a Vision Transformer on your own dataset or on the datasets available with PyTorch.
Feel free to use and modify.
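
As a rough illustration of the idea, here is a minimal sketch of a ViT-style classifier and one training step, written with plain PyTorch only. It is not the vit-pytorch API and not necessarily how this repository's code is organised. The class name TinyViT, all hyperparameters (patch size 7, dim 64, depth 2, and so on), and the random tensors standing in for an MNIST batch are assumptions made for the example.

```python
import torch
from torch import nn


class TinyViT(nn.Module):
    """Minimal ViT-style classifier (hypothetical, for illustration only)."""

    def __init__(self, image_size=28, patch_size=7, channels=1,
                 num_classes=10, dim=64, depth=2, heads=4, mlp_dim=128):
        super().__init__()
        assert image_size % patch_size == 0, "image size must be divisible by patch size"
        num_patches = (image_size // patch_size) ** 2
        # A strided convolution cuts the image into non-overlapping patches
        # and linearly projects each patch to `dim` features.
        self.patch_embed = nn.Conv2d(channels, dim,
                                     kernel_size=patch_size, stride=patch_size)
        # Learnable class token and position embeddings (patches + class token).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.randn(1, num_patches + 1, dim) * 0.02)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=mlp_dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        x = self.patch_embed(x)            # (B, dim, H/p, W/p)
        x = x.flatten(2).transpose(1, 2)   # (B, num_patches, dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        x = self.encoder(x)
        return self.head(x[:, 0])          # classify from the class token


torch.manual_seed(0)
model = TinyViT()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Synthetic stand-in for one MNIST batch (assumption for this sketch):
# 8 grayscale 28x28 images with integer labels in 0..9.
images = torch.randn(8, 1, 28, 28)
labels = torch.randint(0, 10, (8,))

# One training step: forward pass, loss, backward pass, parameter update.
model.train()
logits = model(images)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

To train on real data, the synthetic batch could be replaced by batches from a torch.utils.data.DataLoader over torchvision.datasets.MNIST or over your own Dataset, as long as each batch has shape (batch, channels, height, width) and the image size is divisible by the patch size.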