#Computer Science#Simple reinforcement learning tutorials, 莫烦Python (Morvan Python) Chinese-language AI tutorials
#Computer Science#High-quality single-file implementations of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
#Computer Science#PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) ...
#Computer Science#Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
#Computer Science#Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
#Algorithm Practice#This repository contains PyTorch implementations of most classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, SAC, A2C, PPO, TRPO. (More algorithms are st...
A PyTorch library for building deep reinforcement learning agents.
PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single-agent and multi-agent settings.
PyTorch C++ Reinforcement Learning
#Computer Science#lagom: A PyTorch infrastructure for rapid prototyping of reinforcement learning algorithms.
#Computer Science#Deep Reinforcement Learning (PPO) in Autonomous Driving (Carla) [from scratch]
Trading Environment (OpenAI Gym) + PPO (TensorForce)
#Computer Science#Deep Reinforcement Learning in C#
Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO
Clean baseline implementation of PPO using an episodic TransformerXL memory
#Computer Science#PyTorch implementation of some reinforcement learning algorithms: A2C, PPO, Behavioral Cloning from Observation (BCO), GAIL.
#Computer Science#Proximal Policy Optimization (PPO) algorithm for Contra
#Computer Science#Proximal Policy Optimization (PPO) with Intrinsic Curiosity Module (ICM)