Overview

This repository contains my implementations of recent and important transformer papers across different fields, including but not limited to NLP, computer vision, and speech recognition.

Natural Language Processing

  • DocFormer: End-to-End Transformer for Document Understanding [Paper] [Code]
  • LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding [Paper] [Code]

Visual Question Answering

  • LaTr: Layout-Aware Transformer for Scene-Text VQA [Paper] [Code]

Contributing

If you have a suggestion about one of these papers, or have a specific paper in mind that would be useful to implement, you can raise an issue or open a pull request directly.