PyTorch implementation of PPO2
Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
Proximal Policy Optimization (PPO) algorithm for Contra
This is a reinforcement learning algorithm library. The code balances performance and simplicity and has few dependencies.
PPO, DDPG, and SAC implementations for MuJoCo environments
Proximal Policy Optimization with TensorFlow 2.0
Reproductions of DQN, REINFORCE, REINFORCE with baseline, one-step AC, QAC, QAC with shared network, PPO2, DDPG, TD3, SAC, SAC discrete, A2C, and A3C
Reproduction of the self-play described in the paper "Emergent Complexity via Multi-Agent Competition", adapted from the PPO2 implementation in OpenAI Baselines.