Transformers

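An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale: Introduces the Vision Transformer (ViT), which splits an image into fixed-size patches and feeds the resulting sequence to a standard Transformer encoder for image recognition.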