ocr-python · GitHub Topics

hiroi-sora / Umi-OCR

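Free, open-source, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Ships with built-in libraries for multiple languages.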

paddleocr OCR ocr-python umi-ocr qml Qt screenshot

Python 31.89 k

18 days ago

CatchTheTornado / text-extract-api

#Large language models# Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSO...

API extract JSON large-language-models pdf anonymization OCR ocr-python pii

Python 2.53 k

6 days ago

hiroi-sora / Umi-OCR_v2

An end and a new beginning

OCR ocr-python paddleocr qml Qt

QML 936

1 year ago

Psarpei / Multi-Type-TD-TSR

#Natural language processing# Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition:

image-processing deep-learning table-structure-recognition table-detection table-detection-using-deep-learning OCR ocr-recognition ocr-python natural-language-processing machine-learning algorithms computer-vision computer-science computer-vision-algorithms

Jupyter Notebook 273

3 years ago

maxent-ai / ocrpy

#Natural language processing# OCR, Archive, Index and Search: Implementation agnostic OCR framework.

Amazon Web Services Azure cv information-retrieval natural-language-processing OCR ocr-python semantic-search tesseract-ocr transformers Python computer-vision deep-learning image-processing

Jupyter Notebook 223

1 year ago

MrZilinXiao / Hyper-Table-OCR

#Computer science# A carefully designed OCR pipeline for universal bordered table recognition and reconstruction.

table-extraction OCR ocr-python deep-learning

C++ 177

2 years ago

nathanaday / RealTime-OCR

Perform text detection in a variety of languages with your computer webcam using Google Tesseract OCR and OpenCV. This script achieves a real-time OCR effect via multi-threading.

OCR ocr-python cv2 opencv-python multithreading Python

Python 162

2 years ago

ankandrew / fast-plate-ocr

Lightweight & fast OCR models for license plate text recognition.

plate-recognition license-plate-recognition jax onnx PyTorch Tensorflow Keras ocr-python OCR

Python 132

4 months ago

ilic5000 / pabkvizgenerator

Anansi is a computer vision (cv2 and FFmpeg) + OCR (EasyOCR and Tesseract) Python-based crawler for finding and extracting questions and correct answers from video files of popular TV game shows in th...

computer-vision easyocr ocr-python Python tesseract OpenCV

Python 125

3 years ago

blueaxis / Cloe

Manga OCR snipping application for desktop

OCR ocr-python pyqt5

Python 113

2 years ago

prp-e / persian_ocr_project

A FLOSS software for Persian Optical Character Recognition

OCR ocr-python ocr-recognition

Jupyter Notebook 89

10 months ago

nainiayoub / pdf-text-data-extractor

PDF text data extraction web app with OCR for scanned documents

pdf-to-text Streamlit streamlit-webapp text-extraction Python OCR ocr-python pdf

Python 87

10 months ago

kartikgill / Easter2

Easter2.0: IMPROVING CONVOLUTIONAL MODELS FOR HANDWRITTEN TEXT RECOGNITION

handwriting-recognition handwritten-text-recognition OCR ocr-python optical-character-recognition Python

Jupyter Notebook 79

2 years ago

shibing624 / imgocr

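Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model (~14MB). Inference is based on the ppocr-v4-onnx model, enabling accurate OCR prediction on CPU at millisecond-level latency; Chinese/English OCR in general scenarios reaches open-source SOTA.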

chinese-ocr OCR ocr-python

Python 73

3 months ago

gnana70 / tamil_ocr

#Natural language processing# OCR Tamil is a powerful tool that can detect and recognize text in Tamil images with high accuracy on natural scenes.

indic-languages OCR optical-character-recognition Python scene-text-detection scene-text-detection-recognition scene-text-recognition ocr-python ocr-recognition computer-vision natural-language-processing transformer handwriting-recognition handwritten-text-recognition

Python 62

22 days ago

genieincodebottle / parsemypdf

Collection of PDF parsing libraries like AI based docling, claude, openai, llama-vision, unstructured-io, and pdfminer, pymupdf, pdfplumber etc for efficient snapshot, text, table, and metadata extrac...

claude openai OCR ocr-python

Python 56

17 days ago

ksasso1028 / EasyOCR-cpp

Custom C++ implementation of deep learning based OCR

C++-deployment easyocr inference inference-engine libtorch OCR ocr-python ocr-recognition optical-character-recognition text-detection text-recognition

C++ 55

1 year ago

bentoml / BentoOCR

Turn any OCR model into an online inference API endpoint 🚀 🌖

OCR ocr-python ai-applications model-deployment model-serving

Python 54

23 days ago

X-T-E-R / my-little-ocr

MyLittleOCR is a unified OCR wrapper providing a consistent API for seamless integration and switching between multiple OCR engines.

easyocr OCR ocr-python paddleocr rapidocr tesseract wrapper

Python 51

6 months ago

oidlabs-com / Lexoid

Multimodal document parser for high quality data understanding and extraction

llms pdf-document parser-library pdf-parser multimodal genai large-language-models OCR ocr-python

Python 42

1 day ago