PyTorch Image Captioning Baseline with VisionEncoderDecoderModel in transformers (Hugging Face)
Primary language: Python. License: MIT.
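For orientation, here is a minimal sketch of how a `VisionEncoderDecoderModel` pairs an image encoder with a text decoder for captioning. It is not taken from this repository. The ViT encoder, the GPT-2 decoder, all the tiny config sizes, and the random tensors are assumptions, chosen so the example runs without downloading weights. A real baseline would more likely start from pretrained checkpoints and use an image processor and tokenizer.

```python
import torch
from transformers import (
    GPT2Config,
    ViTConfig,
    VisionEncoderDecoderConfig,
    VisionEncoderDecoderModel,
)

# Assumed, deliberately tiny configs (illustrative only, not the repo's settings).
encoder_config = ViTConfig(
    image_size=32,
    patch_size=8,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
decoder_config = GPT2Config(
    vocab_size=100,
    n_positions=16,
    n_embd=32,
    n_layer=1,
    n_head=2,
)

# Combine the two configs; this marks the decoder as a decoder with cross-attention.
config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(
    encoder_config, decoder_config
)
model = VisionEncoderDecoderModel(config=config)

# Needed so the model can build decoder inputs by shifting the labels right.
# Token id 0 is an arbitrary placeholder here.
model.config.decoder_start_token_id = 0
model.config.pad_token_id = 0

# Random stand-ins for a batch of 2 images and 2 tokenized captions of length 8.
pixel_values = torch.randn(2, 3, 32, 32)
labels = torch.randint(1, 100, (2, 8))

# Passing labels makes the forward pass return a cross-entropy captioning loss.
outputs = model(pixel_values=pixel_values, labels=labels)
loss = outputs.loss
logits = outputs.logits  # shape: (batch, caption_length, vocab_size)
```

In a training loop, `loss.backward()` followed by an optimizer step would update both the encoder and the decoder. At inference time, `model.generate(pixel_values)` would produce caption token ids.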