PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) ...
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....
翻译 - DQN、AC、ACER、A2C、A3C、PG、DDPG、TRPO、PPO、SAC、TD3 和...的 PyTorch 实现。
StarCraft II / PySC2 Deep Reinforcement Learning Agents (A2C)
Concise pytorch implements of DRL algorithms, including REINFORCE, A2C, DQN, PPO(discrete and continuous), DDPG, TD3, SAC.
Actor-critic trained w PPO on OpenAI's Procgen Benchmark (PyTorch). Built from scratch.
PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO
Goal driven language generation using knowledge graph A2C agents
A2C for GVG-AI
A well-documented A2C written in PyTorch
TensorFlow A2C to solve Acrobot, with synchronized parallel environments
AI (A2C agent) mastering the game of Snake with TensorFlow 2.0
[DEPRECATED] Advantage Actor Critic model in PyTorch inspired by OpenAI baselines TensorFlow implementation
General implementation of Advantage Actor Critic using Pytorch
VISSIM Reinforcement Learning for Urban Traffic Control. DQN, DDQN, D3QN, A2C, PER.
PyTorch implementation of some reinforcement learning algorithms: A2C, PPO, Behavioral Cloning from Observation (BCO), GAIL.
Pytorch Implementation of Reinforcement Learning Algorithms ( Soft Actor Critic(SAC)/ DDPG / TD3 /DQN / A2C/ PPO / TRPO)
Implementation of Algorithms from the Policy Gradient Family. Currently includes: A2C, A3C, DDPG, TD3, SAC