A PyTorch implementation of the CVPR 2020 paper "Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text"
Official code for the paper "Spatially Aware Multimodal Transformers for TextVQA", published at ECCV 2020.
Website for the TextVQA dataset.
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps (AAAI 2021)
The imdb files with SBD-Trans OCR for the TextVQA dataset.
mlci model for TextVQA
ST-VQA and TextVQA OCR results from the Amazon Text in Image pipeline
TextVQA code
TextVQA for NLP