rlhf · GitHub Topics

#Natural Language Processing#Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

fine-tuning llama llm peft transformers rlhf qlora quantization qwen instruction-tuning gpt lora large-language-models agent artificial-intelligence moe llama3 deepseek gemma natural-language-processing

Python 54 k

2 days ago

LAION-AI / Open-Assistant

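#Large Language Models#Conversational AI for everyone. We believe we are about to start a revolution: just as Stable Diffusion changed how modern art is made, we will change the world through conversational AI.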

ChatGPT language-model rlhf artificial-intelligence assistant discord-bot machine-learning Next Python

Python 37.41 k

1 year ago

RUCAIBox / LLMSurvey

#Natural Language Processing#A survey of large language models

chain-of-thought ChatGPT in-context-learning instruction-tuning large-language-models llm natural-language-processing pre-trained-language-models pre-training rlhf

Python 11.66 k

4 months ago

ymcui / Chinese-LLaMA-Alpaca-2

#Natural Language Processing#Phase two of the Chinese LLaMA-2 & Alpaca-2 LLM project, with 64K long-context models

alpaca llama llm llama-2 large-language-models natural-language-processing alpaca-2 flash-attention llama2 alpaca2 Yarn rlhf

Python 7.16 k

10 months ago

InternLM / InternLM

#Large Language Models#Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).

chatbot gpt llm long-context rlhf fine-tuning-llm chinese flash-attention pretrained-models

Python 6.97 k

5 months ago

huggingface / alignment-handbook

#Large Language Models#Robust recipes to align language models with human and AI preferences

llm rlhf transformers

Python 5.25 k

2 months ago

argilla-io / argilla

#Natural Language Processing#Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets

human-in-the-loop natural-language-processing mlops developer-tools text-labeling annotation-tool machine-learning active-learning weak-supervision text-annotation llm artificial-intelligence gpt-4 rlhf langchain

Python 4.56 k

10 days ago

PKU-Alignment / align-anything

Align Anything: Training All-modality Model with Feedback

large-language-models multimodal rlhf chameleon dpo vision-language-model

Jupyter Notebook 4.17 k

1 month ago

opendilab / awesome-RLHF

#Computer Science#A curated list of reinforcement learning with human feedback resources (continually updated)

deep-learning deep-reinforcement-learning human-feedback reinforcement-learning rlhf large-language-models

4.04 k

6 days ago

Kiln-AI / Kiln

#Computer Science#The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.

artificial-intelligence chain-of-thought collaboration dataset-generation fine-tuning machine-learning macOS ollama openai prompt prompt-engineering Python rlhf synthetic-data Windows evals evaluation

Python 3.91 k

12 hours ago

hiyouga / ChatGLM-Efficient-Tuning

#Large Language Models#Efficient fine-tuning of ChatGLM-6B with PEFT

chatglm ChatGPT fine-tuning lora alpaca peft huggingface language-model transformers PyTorch rlhf chatglm2 qlora

Python 3.71 k

2 years ago

transformerlab / transformerlab-app

Open Source Application for Advanced LLM + Diffusion Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.

Electron llama llm lora rlhf transformers MLX diffusion diffusion-models stability-diffusion

TypeScript 3.54 k

2 days ago

Docta-ai / docta

A Doctor for your data

data data-centric-ai data-centric-machine-learning data-curation data-diagnosis language-model rlhf

Python 3.35 k

6 months ago

argilla-io / distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

artificial-intelligence huggingface llm openai Python rlhf synthetic-data synthetic-dataset-generation

Python 2.8 k

3 days ago