#Productivity Tools# 🦉 Data Versioning and ML Experiments (data version control: Git for data and models)
#Computer Science# Refine high-quality datasets and visual AI models (an open-source tool for building high-quality datasets and computer vision models)
No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
#Large Language Models# Towhee is a framework dedicated to making neural data processing pipelines simple and fast.
Neo4j graph construction from unstructured data using LLMs
#Large Language Models# 🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications
#Natural Language Processing# Handles all kinds of unstructured data for tasks such as reverse image search, audio search, molecular search, video analysis, question answering systems, NLP, etc.
Interact, analyze and structure massive text, image, embedding, audio and video datasets
#Natural Language Processing# A curated list of resources on Document Understanding (DU)
A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ...
#Computer Science# Interactively explore unstructured datasets from your dataframe.
Visual Data Transformation and Data Preparation. Low-Code Python-based ETL.
#Natural Language Processing# Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
#Large Language Models# Enterprise-grade and API-first LLM workspace for unstructured documents, including data extraction, redaction, rights management, prompt playground, and more!
#Search# NucliaDB, the AI search database for RAG
#Search# Embedding Studio is a framework that lets you transform your vector database into a feature-rich search engine.
Python implementation of jordansissel's grok regular expression library
Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more.
#Large Language Models# Open-source unstructured data (PDFs, images, audio files) processing platform built for knowledge workers