#Computer Science#Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
#Computer Science#A curated list of reinforcement learning with human feedback resources (continually updated)
#Computer Science#Open-source pre-training implementation of Google's LaMDA in PyTorch. Adds RLHF similar to ChatGPT.
#Data Warehouse#Let's build better datasets, together!
[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
#Large Language Models#The ParroT framework, which enhances and regulates translation abilities during chat based on open-source LLMs (e.g., LLaMA-7b, Bloomz-7b1-mt) and human-written translation and evaluation data.
#Large Language Models#Implementation of Reinforcement Learning from Human Feedback (RLHF)
#Large Language Models#Product analytics for AI Assistants
#Data Warehouse#BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
#Data Warehouse#[ECCV 2024] Towards Reliable Advertising Image Generation Using Human Feedback
#Large Language Models#Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".
[ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"
[NeurIPS 2023] Official codebase for "Aligning Synthetic Medical Images with Clinical Knowledge using Human Feedback"
Reinforcement Learning from Human Feedback with 🤗 TRL (see the minimal sketch after this list)
#Computer Science#Search Engine Optimization using Human Implicit Feedback
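
To give a sense of the training loop the RLHF repositories above implement, here is a minimal sketch using 🤗 TRL's PPOTrainer. It is not the code of any listed repository: the base model (gpt2), the prompts, and the length-based reward_fn are placeholder assumptions, and the API shown follows older trl releases (around 0.7); newer versions restructure PPOTrainer. A real pipeline would score responses with a reward model trained on human preference data.

```python
# Minimal RLHF loop with trl's PPOTrainer (trl ~0.7 API; illustrative only).
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

model_name = "gpt2"  # small placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Policy with a value head, plus a reference copy used for the KL penalty.
model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)

config = PPOConfig(model_name=model_name, learning_rate=1.41e-5,
                   batch_size=2, mini_batch_size=1)
ppo_trainer = PPOTrainer(config, model, ref_model, tokenizer)

def reward_fn(texts):
    # Placeholder reward that prefers longer responses; a real setup would
    # use a reward model trained on human preference comparisons.
    return [torch.tensor(float(len(t.split()))) for t in texts]

prompts = ["The movie was", "I think this product is"]
query_tensors = [tokenizer(p, return_tensors="pt").input_ids.squeeze(0) for p in prompts]

# Generate with the current policy, score the responses, take one PPO step.
response_tensors = ppo_trainer.generate(
    query_tensors, return_prompt=False, max_new_tokens=16,
    pad_token_id=tokenizer.eos_token_id,
)
responses = [tokenizer.decode(r, skip_special_tokens=True) for r in response_tensors]
rewards = reward_fn(responses)
ppo_trainer.step(query_tensors, response_tensors, rewards)
print(list(zip(responses, [r.item() for r in rewards])))
```

The reference model keeps the policy close to its pre-trained behavior through a KL penalty, which is the standard way these RLHF implementations stabilize training.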