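#Computer Science# MMOCR is an open-source toolbox based on PyTorch and mmdetection, focused on text detection, text recognition, and the corresponding downstream tasks such as key information extraction. It is part of the OpenMMLab project.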
#Natural Language Processing# A curated list of resources on Document Understanding (DU)
Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
#Computer Science# A toolbox of OCR models and algorithms based on MindSpore
Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.
#Natural Language Processing# Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
Key information extraction from invoice documents with a Graph Convolutional Network
An unofficial PyTorch implementation of "Lin et al. ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents. ICDAR, 2021"
The task aims at extracting required fields from receipts captured by mobile devices 😄
[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.
[MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction"
#Large Language Models# Key information extraction from images of cards, certificates, and bills with an LLM (large language model). Basically, it can extract key information from all bills and documents.
#Natural Language Processing# AI & ML research project for automatic product extraction, classification, and analysis of receipt data