Search results for "a2c" | GitHub Chinese Community

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) ...

Python · 3.61k

2 years ago

ppo pytorch alphago policy-gradient dqn actor-critic-algorithm a2c deep-reinforcement-learning a3c sarsa

A2C

@MG2033

A Clearer and Simpler Synchronous Advantage Actor Critic (A2C) Implementation in TensorFlow

Python · 182

6 years ago

Deep-reinforcement-learning-with-pytorch

@sweetice

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

policy-gradient PyTorch actor-critic-algorithm alphago deep-reinforcement-learning

Python · 3.99k

2 years ago

pysc2-rl-agents

@simonmeister

StarCraft II / PySC2 Deep Reinforcement Learning Agents (A2C)

Python · 134

6 years ago

DRL-code-pytorch

@Lizhi-sjtu

Concise PyTorch implementations of DRL algorithms, including REINFORCE, A2C, DQN, PPO (discrete and continuous), DDPG, TD3, and SAC.

Python · 1.09k

2 years ago

A2C

@lnpalmer

PyTorch implementation of Advantage Actor-Critic (A2C)

Python · 44

7 years ago

simple-A2C-PPO

@rgilman33

Actor-critic trained with PPO on OpenAI's Procgen Benchmark (PyTorch). Built from scratch.

Jupyter Notebook · 102

5 years ago

PyTorch-RL

@Khrylx

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.

Python · 1.11k

4 years ago