#NLP# We introduce a new model for code generation. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024) and GPT-4o.
[LREC-COLING'24] HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization
Self-evaluating interview for AI coders
Evaluation results of code generation LLMs
A dataset of coverage annotations for the HumanEval dataset
A collection of practical code generation tasks and tests drawn from open-source projects. Complementary to OpenAI's HumanEval.
Benchmark results from code generation with LLMs
Evaluate LLM-synthesized @JuliaLang code.
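Most of the projects above score models the same way: sample one or more completions per HumanEval problem, then check functional correctness against each task's unit tests. A minimal sketch using OpenAI's human-eval harness, where `generate_completion` is a hypothetical stand-in for whatever model is under test:

```python
# Sketch of a typical HumanEval run, assuming OpenAI's human-eval package
# (pip install human-eval). generate_completion is a hypothetical placeholder
# for the model being benchmarked, not part of the harness.
from human_eval.data import read_problems, write_jsonl

def generate_completion(prompt: str) -> str:
    # Placeholder: call the model under test and return only the code
    # that continues the prompt (i.e., the function body).
    raise NotImplementedError

problems = read_problems()  # task_id -> {"prompt", "test", "entry_point", ...}

num_samples_per_task = 1  # 1 for pass@1; sample more to estimate pass@k, k > 1
samples = [
    dict(task_id=task_id,
         completion=generate_completion(problems[task_id]["prompt"]))
    for task_id in problems
    for _ in range(num_samples_per_task)
]
write_jsonl("samples.jsonl", samples)

# Scoring executes each completion against the task's unit tests:
#   $ evaluate_functional_correctness samples.jsonl
# and reports the unbiased pass@k estimate from the Codex paper.
```

The "test accuracy" figures quoted by entries like the first one are typically pass@1 scores from this kind of run.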