Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
Langtrace 🔍 is an open-source, OpenTelemetry-based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations, and metrics for popular LLMs, LLM frameworks, vectorD...
An example of applying LLM evaluation metrics using PromptFlow and Azure AI Studio.
This is a new metric for evaluating the faithfulness of text generated by LLMs. The work behind this repository can be found here.
Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claude, GPT-4, Gemini, Llama, etc.) with standardized evaluation met...
Repo for LLM evaluation metrics code
Unlock LLM evaluation power! This comprehensive toolkit offers diverse metrics for analyzing and comparing large language model outputs. Ideal for developers, researchers, and AI enthusiasts aiming to...
Evaluation metrics for NLP tasks and LLM performance
Evaluation tool for LLM QA chains
HOTA (and other) evaluation metrics for Multi-Object Tracking (MOT).
An empirical study on evaluation metrics of generative adversarial networks.
Machine learning evaluation metrics, implemented in Python, R, Haskell, and MATLAB / Octave
An open-source visual programming environment for battle-testing prompts to LLMs.
Evaluation Metrics for the Hewlett Foundation's Automated Essay Scoring competition
A Systematic Evaluation and Benchmark for Person Re-Identification: Features, Metrics, and Datasets
Evaluation metrics for image segmentation inspired by paper Fully Convolutional Networks for Semantic Segmentation
⚡LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models.⚡
(IROS 2020, ECCVW 2020) Official Python implementation of "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"
Simple TensorFlow implementation of metrics for GAN evaluation (Inception Score, Fréchet Inception Distance, Kernel Inception Distance)
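As a rough illustration of what the Fréchet Inception Distance computes, here is a minimal NumPy/SciPy sketch of the Fréchet distance between two Gaussians fitted to feature activations (illustrative only, not this repository's TensorFlow implementation; in real FID the features come from an Inception network):

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: (n_samples, n_features) activation arrays;
    in actual FID these would be Inception pool3 features.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Matrix square root of the covariance product; tiny imaginary
    # parts from numerical error are discarded.
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))
```

Identical feature sets give a distance of (numerically) zero; a pure mean shift contributes the squared norm of the shift.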
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
📈 Implementation of eight evaluation metrics to assess the similarity between two images: RMSE, PSNR, SSIM, ISSM, FSIM, SRE, SAM, and UIQ.
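Two of the listed metrics, RMSE and PSNR, reduce to a few lines of NumPy. A minimal sketch under standard definitions (not the repository's own implementation):

```python
import numpy as np

def rmse(a, b):
    """Root-mean-square error between two images (any matching shapes)."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

def psnr(a, b, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means more similar."""
    err = rmse(a, b)
    if err == 0:
        return float("inf")  # identical images
    return float(20.0 * np.log10(max_val / err))
```

Maximally different 8-bit images give 0 dB, and PSNR grows without bound as the images converge.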
Implementations of various algorithm evaluation metrics (mAP/FLOPs/params/FPS/error-rate/accuracy)
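Of the metrics named above, average precision (the per-class ingredient of mAP) is the least obvious; a minimal single-class sketch using the standard ranked-precision formulation (illustrative, not this repository's code):

```python
import numpy as np

def average_precision(scores, labels):
    """AP for one class: mean of precision values at each true positive,
    given detection confidence scores and binary ground-truth labels.
    mAP is then the mean of this quantity over classes."""
    order = np.argsort(-np.asarray(scores, dtype=np.float64))
    labels = np.asarray(labels, dtype=np.float64)[order]
    tp = np.cumsum(labels)                      # true positives so far
    ranks = np.arange(1, len(labels) + 1)       # predictions considered
    precision = tp / ranks
    # Average precision only over ranks where a positive was found.
    return float((precision * labels).sum() / labels.sum())
```

For scores [0.9, 0.8, 0.7] with labels [1, 0, 1], the precisions at the two hits are 1 and 2/3, so AP = 5/6.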