llm-evaluation-metrics · GitHub Topics

The LLM Evaluation Framework

evaluation-metrics evaluation-framework llm-evaluation llm-evaluation-framework llm-evaluation-metrics

Python 5.94k

12 hours ago

A one-stop repository for large language model (LLM) unlearning. Supports the TOFU and MUSE benchmarks, and provides an easily extensible framework for adding new datasets, evaluations, methods, and other benchmarks.

privacy-protection benchmarks llm-evaluation-metrics llms Open Source

Python 206

6 days ago

cvs-health / langfair

#Large Language Model# LangFair is a Python library for conducting use-case-level LLM bias and fairness assessments.

Artificial Intelligence bias bias-detection fairness fairness-ai fairness-ml fairness-testing large-language-models Large Language Model responsible-ai Python ai-safety llm-evaluation llm-evaluation-framework llm-evaluation-metrics

Python 197

1 month ago

zhuohaoyu / KIEval

#Large Language Model# [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

explainable-ai Large Language Model llm-evaluation llm-evaluation-framework llm-evaluation-metrics Machine Learning

Python 36

9 months ago

pyladiesams / eval-llm-based-apps-jan2025

#Large Language Model# Create an evaluation framework for your LLM-based app. Incorporate it into your test suite. Lay the monitoring foundation.

Large Language Model llmops llms workshop llm-eval llm-evaluation-framework llm-evaluation-metrics llm-monitoring

Jupyter Notebook 7

3 months ago

ritwickbhargav80 / quick-llm-model-evaluations

This repo contains a Streamlit application that provides a user-friendly interface for evaluating large language models (LLMs) using the beyondllm package.

llm-evaluation-metrics llms retrieval-augmented-generation Streamlit

Python 0

7 months ago