#LLM#The TypeScript AI agent framework. ⚡ Assistants, RAG, observability. Supports any LLM: GPT-4, Claude, Gemini, Llama.
#Data Warehouse#AI Observability & Evaluation
#LLM#Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks, including OpenAI Agents SDK, CrewAI, Langchain, Autogen, AG2, and CamelAI.
#Computer Science#The easiest tool for fine-tuning LLMs, generating synthetic data, and collaborating on datasets.
Laminar - an open-source, all-in-one platform for engineering AI products. Create a data flywheel for your AI app. Traces, evals, datasets, labels. YC S24.
#LLM#🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite
Test your LLM-powered apps with TypeScript. No API key required.
[NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding
Vivaria is METR's tool for running evaluations and conducting agent elicitation research.
A library for evaluating Retrieval-Augmented Generation (RAG) systems using traditional approaches.
#大语言模型#Evalica, your favourite evaluation toolkit
#LLM#Benchmarking Large Language Models for FHIR
#LLM#Go Artificial Intelligence (GAI) helps you work with foundation models, large language models, and other AI models.
An implementation of Anthropic's paper and essay "A Statistical Approach to Model Evaluations"
#LLM#Root Signals Python SDK
Our curated collection of templates. Use these patterns to set up your AI projects for evaluation with Openlayer.
MCP for Root Signals Evaluation Platform