#Web crawler# Crawlee - A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...
#Web crawler# Crawlee - A web scraping and browser automation library for Node.js
A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
Engage in dynamic conversations with PDFs to extract and comprehend information using locally hosted LLM variants of Ollama by integrating RAG.
Scripts for reading, extracting, and organizing data from either HTML or PDF documents and preparing it for conversion into embeddings for use in context-augmented LLM queries.
This project demonstrates the use of an LLM for extracting and analyzing data from PDFs
Use a local LLM (Ollama) to parse PDF documents and extract and structure the text in Markdown.
ChatGPT-like application using the RAG pattern that lets me ask questions about my own documents - I used Semantic Kernel to integrate an LLM (OpenAI) using C# to orchestrate AI plugins (Azure Cognitive Se...
PDF table extractor
An LLM Based Diagnosis System (https://arxiv.org/pdf/2312.01454.pdf)
RAG for local LLMs: chat with PDF/doc/txt files, ChatPDF. A pure native RAG implementation built on a local LLM, an embedding model, and a reranker model, with no third-party agent library required.
PyInstaller Extractor
Android backup extractor
Maplestory online Extractor
mXtract - Offensive Memory Extractor & Analyzer
A sketch extractor for anime/illustration.
The SOTA extractor pipeline
Python script to extract as much structured information as possible from annual/quarterly reports.
Wwise *.BNK File Extractor
PDF text data extraction web app with OCR for scanned documents
PyInstaller Extractor Next Generation
Unity Live2D Cubism 3 Extractor
Visual Background Extractor