mcts · GitHub Topics

#Large Language Models#A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

chain-of-thought Code large-language-models math mcts openai-o1 strawberry reinforcement-learning

6.64 k

2 days ago

#Computer Science#A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Tensorflow PyTorch Keras gobang alpha-zero alphago-zero alphago reinforcement-learning self-play mcts monte-carlo-tree-search deep-learning alphazero neural-network

Jupyter Notebook 4.1 k

3 months ago

junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

alphazero mcts alphago-zero gobang monte-carlo-tree-search alphago reinforcement-learning rl board-game self-learning PyTorch Tensorflow

Python 3.46 k

1 year ago

werner-duvaud / muzero-general

#Computer Science#MuZero

muzero reinforcement-learning alphazero PyTorch Python self-learning monte-carlo-tree-search deep-learning deep-reinforcement-learning neural-network rl tensorboard gym mcts alphago machine-learning

Python 2.61 k

7 months ago

opendilab / LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

alphazero atari continuous-control monte-carlo-tree-search muzero PyTorch reinforcement-learning mcts board-game gym self-play

Python 1.33 k

8 hours ago

zzli2022 / Awesome-System2-Reasoning-LLM

Latest Advances on System-2 Reasoning

benchmark mcts o1 prm reasoning rl

Python 910

12 days ago

chauvinSimon / My_Bibliography_for_Research_on_Autonomous_Driving

Personal notes about scientific and research works on "Decision-Making for Autonomous Driving"

reinforcement-learning inverse-reinforcement-learning planning model-based-reinforcement-learning decision-making game-theory mcts prediction bibliography carla imitation-learning end-to-end interaction risk-assessment

452

4 years ago

s-casci / tinyzero

Easily train AlphaZero-like agents on any environment you want!

alphazero mcts reinforcement-learning

Python 429

1 year ago

yaotingwangofficial / Awesome-MCoT

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

chain-of-thought cot deepseek-r1 instruction-tuning large-vision-language-model multimodal multimodal-chain-of-thought multimodal-large-language-models openai-o1 reasoning survey mcts

425

7 days ago

hrpan / tetris_mcts

#Computer Science#MCTS project for Tetris

reinforcement-learning mcts tetris deep-learning game tetris-bots

Python 346

6 months ago

dylandjian / SuperGo

#Computer Science#A student implementation of Alpha Go Zero

alphago-zero alphago reinforcement-learning PyTorch mcts Python machine-learning

Python 280

7 years ago

DataCanvasIO / Hypernets

A General Automated Machine Learning framework to simplify the development of End-to-end AutoML toolkits in specific domains.

neural-architecture-search hyperparameter-optimization hyperparameter-tuning evolutionary-algorithms monte-carlo-tree-search automl autodl reinforcement-learning mcts nas Keras

Python 267

16 hours ago

QueensGambit / CrazyAra

#Computer Science#A Deep Learning UCI-Chess Variant Engine written in C++ & Python 🦜

Python chess-engine deep-learning artificial-intelligence convolutional-neural-network mcts alphazero mxnet gluon Open Source machine-learning lichess alphago

Jupyter Notebook 263

18 days ago

vgarciasc / mcts-viz

Visualization of MCTS algorithm applied to Tic-tac-toe.

mcts visualization p5js

JavaScript 233

4 years ago

sungyubkim / Deep_RL_with_pytorch

A PyTorch tutorial for DRL (Deep Reinforcement Learning)

deep-reinforcement-learning PyTorch dqn a2c ppo soft-actor-critic mcts

Jupyter Notebook 211

2 years ago

initial-h / AlphaZero_Gomoku_MPI

#Algorithm Practice#An asynchronous/parallel method of AlphaGo Zero algorithm with Gomoku

alphazero parallel Tensorflow alphago mcts tensorlayer tree-search algorithm deep-reinforcement-learning

Python 202

1 month ago

thuxugang / doudizhu

AI for Dou Dizhu (the card game "Fight the Landlord")

artificial-intelligence collectible-card-game dqn reinforcement-learning doudizhu mcts

Python 183

7 years ago

kaesve / muzero

#Computer Science#A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and Pit both algorithms against each other, and investigate reliability of learned MuZero MDP models.

muzero alphazero reinforcement-learning Tensorflow tensorflow2 mcts tf2 deep-learning deep-reinforcement-learning

Jupyter Notebook 158

4 years ago

zjeffer / chess-deep-rl

#Computer Science#Research project: create a chess engine using Deep Reinforcement Learning

chess alphazero reinforcement-learning artificial-intelligence neural-network mcts deep-learning machine-learning deep-reinforcement-learning neural-networks chess-engine

Jupyter Notebook 139

9 months ago

akolishchak / doom-net-pytorch

#Learning and Skill Development#Reinforcement learning models in ViZDoom environment

PyTorch vizdoom reinforcement-learning doom agent learning ppo mcts behavior-tree

Python 133

3 years ago