#Natural Language Processing# Transforms PDFs, Documents and Images into Enriched Structured Data
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
Visual Novels resource browser
#Web Crawler# Extract clean markdown from PDFs, URLs, Word docs, slides, videos, and more, ready for any LLM. ⚡
#Large Language Model# 🦜⛏️ Did you say you like data?
Extract files from any kind of container format
#Natural Language Processing# Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Extract indicators of compromise from text, including "escaped" ones.