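#Computer Science#MMOCR is an open-source toolbox based on PyTorch and mmdetection, focused on text detection, text recognition, and related downstream tasks such as key information extraction. It is part of the OpenMMLab project.
#Natural Language Processing#A curated list of resources on Document Understanding (DU)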
Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
#Computer Science#A toolbox of OCR models and algorithms based on MindSpore
Algorithms, papers, datasets, and performance comparisons for Document AI. Continuously updated.
#Natural Language Processing#Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
An unofficial PyTorch implementation of "Lin et al. ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents. ICDAR, 2021"
Key information extraction from invoice documents with a Graph Convolution Network
The task aims at extracting required fields in receipts captured by mobile devices 😄
[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.
[MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction"
#Large Language Model#Key information extraction from images of cards, certificates, and bills with an LLM (large language model). Basically, it can extract key information from all bills and documents.
#Natural Language Processing#AI & ML research project for automatic product extraction, classification, and analysis of receipt data