pdf-processing · GitHub Topics

Document chatbot — multiple files, topics, chat windows and chat history. Powered by GPT.

openai TypeScript gpt-3 gpt-4 langchain Mongoose Next openai-api chat chatbot document-embedding pdf-processing pinecone React Tailwind CSS vectorization

TypeScript 8481

2 years ago

allenai / papermage

#Natural Language Processing# Library supporting NLP and CV research on scientific papers

computer-vision machine-learning multimodal natural-language-processing pdf-processing scientific-papers Python

Python 757

5 months ago

ahmedkhemiri95 / PDFs-TextExtract

Multiple and Large PDF Documents Text Extraction.

pdf Parser data-science Python pdf-processing extract-text pdf-document pypdf2 pdfs

Python 128

2 months ago

aws-samples / document-processing-pipeline-for-regulated-industries

#Computer Science# A boilerplate solution for processing image and PDF documents for regulated industries, with lineage and pipeline operations metadata services.

machine-learning Amazon Web Services cdk aws-lambda amazon-web-services amazon-textract amazon-dynamodb amazon-s3 amazon-sqs aws-cdk pdf-processing image-processing data-analytics data-lineage data-governance

Python 62

3 years ago

Govind-S-B / pdf-to-text-chroma-search

Python scripts that convert PDF files to text, split them into chunks, and store their vector representations using GPT4All embeddings in a Chroma DB. It also provides a script to query the Chroma ...

chromadb pdf-processing similarity-search text-extraction

Python 23

1 year ago

ManasMadan / pdf-actions

An npm package built on top of pdf-lib that provides functionality such as merging, rotating, and splitting PDFs, downloading PDFs to disk, and many more...

pdf pdf-merger React react-component pdf-processing pdf-lib JavaScript npm

JavaScript 13

1 year ago

ranguy9304 / LangGraphRAG

#Natural Language Processing# LangGraphRAG: A terminal-based Retrieval-Augmented Generation system using LangGraph. Features include message history caching, query transformation, and vector database retrieval. Ideal for NLP resea...

chatbot information-retrieval langgraph natural-language-processing openai-api pdf-processing Python rag vector-database web-scraping

Python 7

9 months ago

ManasMadan / PDFActions

Built with the pdf-actions npm package.

React pdf react-components react-component pdf-merger pdf-lib pdf-processing

JavaScript 7

1 year ago

Inc44 / MaTools

An all-in-one GUI management toolkit built with PyQt6, offering a suite of tools for file synchronization, media organization, PDF merging, code formatting, and more.

application audio-processing GUI image-processing OCR pdf-processing productivity Python Qt Rust speech-recognition video-processing youtube-downloader

Python 6

1 month ago

enesmanan / paper-bold

AI-powered RAG-based tool for summarizing, extracting insights, and answering questions about research papers with high accuracy

gemini-api langchain pdf-processing rag academic-paper

HTML 6

24 days ago

Yardenrsk / PsychometryReceiverCV

A side project for easily getting and annotating questions and answers for the PsychometryBot project DB using computer vision and PDF parsing

opencv-python pandas pdf-processing

Python 3

3 years ago

allanninal / document-summarizer

#Natural Language Processing# The Document Summarizer leverages Hugging Face’s facebook/bart-large-cnn model to transform lengthy documents into concise summaries. Built with ReactJS (Vite) for the frontend and Flask for the backe...

ai-tools Flask huggingface natural-language-processing pdf-processing React Vite

JavaScript 3

4 months ago

DioCrafts / ai-book-summarizer

#Natural Language Processing# 📚 AI-Powered Book PDF Knowledge Extractor & Summarizer. Transform your PDF books into structured knowledge effortlessly! This tool leverages AI to analyze books page by page, extracting key insights, ...

artificial-intelligence automation document-analysis knowledge-extraction machine-learning Markdown natural-language-processing openai pdf pdf-processing Python study-materials text-analysis

Python 3

3 months ago

thinhuos0913 / python_useful_mini_projects

These are some useful mini projects that I worked on while teaching myself Python programming.

OCR OpenCV Python image-processing pdf-processing

Python 3

1 year ago

Aleptonic / PdfSnipper

PdfSnipper is a lightweight and efficient Python package designed to simplify the management of PDF files, pages, and their conversions during various NLP, Computer Vision (CV), or other data processi...

pdf-processing utilities

Python 3

2 months ago

setuc / pdf-annotation-with-azure-doc-intel

Azure Document Intelligence Result Processor: A toolset for annotating PDFs based on Azure Document Intelligence analysis results, featuring a React web application and a standalone Python script for ...

JavaScript pdf-processing Python React Vite

JavaScript 2

1 month ago

rithulkamesh / docproc

#Computer Science# Opinionated and Sophisticated Document Region Analyzer.

pdf-processing document-analysis text-extraction Python OCR machine-learning layout-analysis content-extraction text-classification data-extraction document-parsing

Python 2

13 days ago

Al-shwaib / Book-Preparation-for-Printing

A web application for preparing books and magazines for offset printing. Automatically arranges PDF pages for commercial A3 printing, supporting both Arabic (RTL) and English (LTR) books. تطبيق ويب ل...

flask-application pdf-processing

Python 2

3 months ago

Farhaj499 / RAG_with_Weaviate_DB

This project implements a Retrieval Augmented Generation (RAG) system that answers questions based on a PDF document. It utilizes Weaviate as a vector database for efficient retrieval of relevant in...

agentic-ai embeddings huggingface-transformers langchain pdf-processing Python rag retrieval-augmented-generation semantic-search vector-database weaviate

Jupyter Notebook 2

3 months ago

arsath-eng / RAG1-NVIDIA-GENAI

#Large Language Models# A powerful Retrieval Augmented Generation (RAG) application built with NVIDIA AI endpoints and Streamlit. This solution enables intelligent document analysis and question-answering using state-of-the-...

document-analysis embeddings faiss langchain large-language-models pdf-processing question-answering rag Streamlit vector-store

Python 2

5 months ago