#Computer Science# A Unified Toolkit for Deep Learning Based Document Image Analysis
Translation - A Python library for document layout understanding
#Natural Language Processing# A curated list of resources for the Document Understanding (DU) topic
Document layout analysis resources and repos for development with PdfPig.
#Natural Language Processing# 📚 Process PDFs, Word documents and more with spaCy
Page to PAGE Layout Analysis Tool
ICDAR 2019: MaskRCNN on PubLayNet datasets. Paragraph detection, table detection, figure detection,...
Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pipelines (GenAI, LLM, VLLM) into your applications, supporting va...
A Bottom-Up Instance Segmentation Strategy for segmenting document instances using Transformers
#Computer Science# Trained Detectron2 object detection models for document layout analysis based on the PubLayNet dataset
Tools for extracting figures, tables, text, ... from a PDF document.
#Computer Science# Trained Detectron2 object detection models for document layout analysis based on the PubLayNet dataset
#Computer Science# Proof of concept of training a simple Region Classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as either title, text, list, table and ...
#Computer Science# BoundaryNet - A Semi-Automatic Layout Annotation Tool
A step-by-step C# implementation of the Docstrum algorithm
Simple docker deployment of document layout analysis using detectron2
Using a MaskRCNN model trained on the PubLayNet dataset with ML.NET in C# / .NET for document layout analysis and page segmentation tasks.
#Computer Science# GloSAT Historical Measurement Table Dataset