#Natural Language Processing# RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine built on deep document understanding
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
#Natural Language Processing# A curated list of resources on Document Understanding (DU)
Parsing-free RAG supported by VLMs
Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
#Natural Language Processing# Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
#Computer Science# Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud
Algorithms, papers, datasets, and performance comparisons for Document AI. Continuously updated.
A curated list of awesome Table Structure Recognition (TSR) research, including models, papers, datasets, and code. Continuously updated.
#Data Warehouse# Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models
#Natural Language Processing# Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
#Natural Language Processing# ReadingBank: A Benchmark Dataset for Reading Order Detection
Object Detection Model for Scanned Documents
#Computer Science# Checkbox Detection Model for Scanned Documents
[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.
TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning