”llm-evaluation“ 的搜索结果

deepeval

@confident-ai

The LLM Evaluation Framework

evaluation-metrics evaluation-framework llm-evaluation llm-evaluation-framework llm-evaluation-metrics

Python3.83 k

1 天前

Google Bing GitHub

python llmops ai chatgpt large-language-models llms mlops llama evaluation llm

ragas

@explodinggradients

#大语言模型#Supercharge Your LLM Application Evaluations 🚀

llm llmops evaluation

Python7.38 k

8 小时前

trulens

@truera

Evaluation and Tracking for LLM Experiments

Python2.2 k

2 天前

evals

OpenAI@openai

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python15.1 k

2 个月前

giskard

@Giskard-AI

#计算机科学#🐢 Open-Source Evaluation & Testing for ML & LLM systems

机器学习人工智能 mlops quality-assurance machine-learning-testing

Python4.1 k

7 天前

gorilla

@ShishirPatil

#大语言模型#Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)

API llm api-documentation ChatGPT gpt-4-api

Python11.52 k

13 小时前

auto-evaluator

@rlancemartin

Evaluation tool for LLM QA chains

Python1.06 k

2 年前

agenta

@Agenta-AI

#大语言模型#The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM Observability all in one place.

gpt4 langchain llmops Python TypeScript

Python1.31 k

1 小时前

RouteLLM

@lm-sys

A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!

Python3.3 k

4 个月前

Awesome-LLMs-Evaluation-Papers

@tjunlp-lab

The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.

718

7 个月前

ChainForge

@ianarawjo

An open-source visual programming environment for battle-testing prompts to LLMs.

人工智能 evaluation large-language-models llmops llms

TypeScript2.38 k

1 个月前

phoenix

@Arize-ai

AI Observability & Evaluation

ml-observability model-observability ai-roi llmops mlops

Jupyter Notebook4.05 k

2 天前

LLMZoo

@FreedomIntelligence

⚡LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models.⚡

Python2.93 k

1 年前

opencompass

@open-compass

#大语言模型#OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2,GPT-4,LLaMa2, Qwen,GLM, Claude, etc) over 100+ datasets.

evaluation benchmark large-language-model ChatGPT llm

Python4.17 k

8 天前

llama.cpp

Georgi Gerganov@ggerganov

Facebook 的 LLaMA 模型在 C/C++ 中的移植

llama ggml

C++68.49 k

16 分钟前

mem0

@mem0ai

#大语言模型#The Memory layer for your AI apps

人工智能 ChatGPT llm Python chatbots

Python23.02 k

3 天前

cset

@cisagov

Cybersecurity Evaluation Tool

翻译 - 网络安全评估工具

cset Security

TSQL1.46 k

1 天前

llm-attacks

@llm-attacks

Universal and Transferable Attacks on Aligned Language Models

Python3.48 k

4 个月前

SillyTavern

@SillyTavern

#大语言模型#LLM Frontend for Power Users.

人工智能 characters chat llm openai

JavaScript8.52 k

16 小时前

govaluate

@Knetic

Arbitrary expression evaluation for golang

翻译 - golang的任意表达式求值

Go evaluation Parsing expression

Go3.77 k

6 个月前

garak

NVIDIA Corporation@NVIDIA

the LLM vulnerability scanner

人工智能 llm-evaluation llm-security security-scanners vulnerability-assessment

Python2.9 k

1 天前

LLM-Finetuning

Ashish Patel@ashishpatel26

#大语言模型#LLM Finetuning with peft

falcon fine-tuning huggingface llama llama2

Jupyter Notebook2.18 k

5 个月前

LongLM

@datamllab

#大语言模型#[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

context-window large-language-models llm longlm self-extend

Python616

6 个月前

evaluation

Neutralinojs@neutralinojs

Neutralinojs vs Electron vs Nw.js

345

3 年前

LLM-As-Chatbot

@deep-diver

LLM as a Chatbot Service

Python3.29 k

1 年前

deepframeworks

@zer0n

Evaluation of Deep Learning Frameworks

2.05 k

8 年前

”llm-evaluation“ 的搜索结果

编程语音