rl · GitHub Topics

LlamaFamily / Llama-Chinese

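#Large Language Models# Llama Chinese community: a real-time roundup of the latest Llama learning materials, building the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use

llama large-language-models pretraining agent llama4 rl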

Python 14.54 k

7 days ago

google / dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

rl machine-learning artificial-intelligence Google Tensorflow

Jupyter Notebook 10.7 k

5 months ago

thu-ml / tianshou

An elegant PyTorch deep reinforcement learning library.

PyTorch policy-gradient dqn double-dqn a2c ddpg ppo td3 sac imitation-learning mujoco atari rl cql

Python 8.39 k

22 days ago

junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

alphazero mcts alphago-zero gobang monte-carlo-tree-search alphago reinforcement-learning rl board-game self-learning PyTorch Tensorflow

Python 3.46 k

1 year ago

pytorch / ELF

ELF: a platform for game research with AlphaGoZero/AlphaZero reimplementation

reinforcement-learning alphago-zero rl rl-environment alpha-zero Go

C++ 3.39 k

6 years ago

pytorch / rl

#Computer Science# A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

artificial-intelligence control decision-making distributed-computing machine-learning marl model-based-reinforcement-learning multi-agent-reinforcement-learning PyTorch reinforcement-learning rl Robotics torch

Python 2.67 k

2 days ago

werner-duvaud / muzero-general

#Computer Science# MuZero

muzero reinforcement-learning alphazero PyTorch Python self-learning monte-carlo-tree-search deep-learning deep-reinforcement-learning neural-networks rl tensorboard gym mcts alphago machine-learning

Python 2.61 k

7 months ago

DLR-RM / rl-baselines3-zoo

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

rl reinforcement-learning stable-baselines openai gym pybullet hyperparameter-optimization hyperparameter-tuning hyperparameter-search optimization sde Robotics lab deep-reinforcement-learning PyTorch

Python 2.36 k

8 days ago

IntelLabs / coach

#Computer Science# Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state-of-the-art Reinforcement Learning algorithms

coach openai-gym reinforcement-learning Tensorflow rl carla imitation-learning mujoco roboschool deep-learning starcraft starcraft2 mxnet onnx

Python 2.34 k

2 years ago

MaximeVandegar / Papers-in-100-Lines-of-Code

#Computer Science# Implementation of papers in 100 lines of code.

Python research deep-learning machine-learning educational PyTorch papers generative-model nerf artificial-intelligence gans aes 3D meta-learning neural-radiance-fields reinforcement-learning rl diffusion-models

Python 1.48 k

1 month ago

pathak22 / noreward-rl

#Computer Science# [ICML 2017] TensorFlow code for Curiosity-driven Exploration for Deep Reinforcement Learning

deep-reinforcement-learning exploration deep-learning rl deep-neural-networks mario doom self-supervised Tensorflow openai-gym

Python 1.43 k

2 years ago

araffin / rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.

rl reinforcement-learning stable-baselines openai-gym openai gym pybullet optimization hyperparameter-optimization hyperparameter-search hyperparameter-tuning

Python 1.17 k

2 years ago

inclusionAI / AReaL

#Large Language Models# Distributed RL System for LLM Reasoning

large-language-models machine-learning-systems mlsys reinforcement-learning rl

Python 1.06 k

6 days ago

zzli2022 / Awesome-System2-Reasoning-LLM

Latest Advances on System-2 Reasoning

benchmark mcts o1 prm reasoning rl

Python 910

12 days ago

MushroomRL / mushroom-rl

#Computer Science# Python library for Reinforcement Learning.

reinforcement-learning deep-reinforcement-learning deep-learning openai-gym atari rl PyTorch mujoco dqn ddpg pybullet sac

Python 864

11 days ago

sail-sg / understand-r1-zero

#Large Language Models# Understanding R1-Zero-Like Training: A Critical Perspective

large-language-models reasoning rl

Python 836

1 day ago

google-research / rliable

#Computer Science# [NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.

reinforcement-learning benchmarking evaluation-metrics machine-learning Google rl

Jupyter Notebook 824

8 months ago

google-research / seed_rl

SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture.

rl impala r2d2 atari deepmind-lab google-research-football tf2 Google cloud

Python 816

2 years ago

Toni-SM / skrl

#Computer Science# Modular reinforcement learning library (on PyTorch and JAX) with support for NVIDIA Isaac Gym, Omniverse Isaac Gym and Isaac Lab

reinforcement-learning Python openai-gym PyTorch deep-learning deepmind gym isaac-sim rl machine-learning Robotics jax

Python 721

7 days ago

FareedKhan-dev / all-rl-algorithms

#Large Language Models# Implementation of all RL algorithms in a simpler way

agent large-language-models openai Python reinforcement-learning rl

Jupyter Notebook 706

4 days ago