Search results for "pdf-extractor-llm"

#Web crawler# Crawlee - A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...

apify Automation beautifulsoup Crawler crawling

Python 4.72 k

18 minutes ago

web-crawling automation python playwright apify headless-chrome crawling crawler llm ocr

crawlee

@apify

#Web crawler# Crawlee - A web scraping and browser automation library for Node.js development

web-scraping web-crawling npm headless-chrome Puppeteer

TypeScript 15.84 k

2 hours ago

MinerU

@opendatalab

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.

extract-data layout-analysis ocr Parser pdf

Python 18.85 k

4 hours ago

llamachirp

@SR-Sujon

Engage in dynamic conversations with PDFs to extract and comprehend information using locally hosted LLM variants of Ollama by integrating RAG.

Python 7

7 months ago

embeddings-extraction

@lperezmo

Scripts for reading, extracting, and organizing data from either HTML or PDF documents and preparing it for conversion into embeddings for use in context-augmented LLM queries.

Python 12

3 months ago

llm-pdf-extractor

@jaysara

This project demonstrates the use of an LLM for extracting and analyzing data from PDFs.

Python 0

1 year ago

LLM_PDF_Text_Extractor

@DimKouts84

Use a local LLM (Ollama) to parse PDF documents and extract and structure the text in Markdown.

Python 0

19 days ago

metadata-extractor-pdf-llm

@farrukh602

Jupyter Notebook 0

3 months ago

PDF_Invoice_Extractor_LLM_Project

@Prathamesh282001

Jupyter Notebook 0

9 months ago

RAG-App-HackTogether

@edilma

ChatGPT-like application using the RAG pattern that lets me ask questions of my own documents - I used Semantic Kernel to integrate an LLM (OpenAI) using C# to orchestrate AI plugins (Azure Cognitive Se...

C# 10

8 months ago