#Computer Science# DeepSpeed Chat: one-click RLHF training that makes your ChatGPT-like, hundred-billion-parameter models 15x faster and cheaper
#Computer Science# A GPipe implementation in PyTorch
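The entry above appears to refer to the torchgpipe package; assuming that, a minimal usage sketch looks like the following (layer sizes, the balance split, and the chunk count are illustrative, and two CUDA devices are assumed to be available):

```python
import torch
from torch import nn
from torchgpipe import GPipe

# A plain nn.Sequential model; GPipe partitions its child modules across devices.
model = nn.Sequential(
    nn.Linear(1024, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1024),
)

# balance=[3, 2]: first three children on the first GPU, last two on the second;
# chunks=4 splits each mini-batch into 4 micro-batches for pipelined execution.
model = GPipe(model, balance=[3, 2], devices=["cuda:0", "cuda:1"], chunks=4)

x = torch.randn(64, 1024, device=model.devices[0])  # input on the first partition's device
out = model(x)                                      # output stays on the last partition's device
```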
PaddlePaddle (飞桨) large model development suite, providing a full-workflow development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.
#Natural Language Processing# LiBai (李白): A Toolbox for Large-Scale Distributed Parallel Training
#Computer Science# Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
#Computer Science# A curated list of awesome projects and papers for distributed training or inference
Large-scale 4D-parallel pre-training for 🤗 transformers with Mixture of Experts *(still a work in progress)*
NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
#Computer Science# Distributed training (multi-node) of a Transformer model
Distributed training of DNNs • C++/MPI Proxies (GPT-2, GPT-3, CosmoFlow, DLRM)
#Computer Science# SC23 Deep Learning at Scale Tutorial Material
#Computer Science# Deep Learning at Scale Training Event at NERSC
#Computer Science# WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous.
#Large Language Model# Fast and easy distributed model training examples.
PyTorch implementation of a 3D U-Net with model parallelism across 2 GPUs for large models
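Not that repository's own code, but a minimal sketch of the underlying technique: layer-wise model parallelism across two GPUs in plain PyTorch, with a toy 3D convolution stack standing in for the U-Net (module names and shapes are illustrative, and a 2-GPU machine is assumed):

```python
import torch
import torch.nn as nn

class TwoGPUNet(nn.Module):
    """Toy model-parallel network: encoder on cuda:0, decoder on cuda:1."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv3d(1, 8, 3, padding=1), nn.ReLU()).to("cuda:0")
        self.decoder = nn.Sequential(nn.Conv3d(8, 1, 3, padding=1)).to("cuda:1")

    def forward(self, x):
        x = self.encoder(x.to("cuda:0"))
        # Copy the intermediate activations to the second GPU for the rest of the model.
        return self.decoder(x.to("cuda:1"))

model = TwoGPUNet()
volume = torch.randn(1, 1, 32, 64, 64)   # (N, C, D, H, W) toy input volume
out = model(volume)                       # result lives on cuda:1
```

Splitting layers across devices this way fits models that exceed a single GPU's memory, at the cost of GPUs idling while the other stage runs; pipeline schedulers such as GPipe (above) recover some of that utilization by streaming micro-batches.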
Adaptive Tensor Parallelism for Foundation Models
Performance Estimates for Transformer AI Models in Science
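For context on what such estimates involve (a generic rule of thumb, not necessarily the method that project uses): training compute for a dense transformer is commonly approximated as 6 × parameters × tokens FLOPs, which converts to GPU-time once a hardware peak and a utilization figure are assumed. The numbers below are purely illustrative:

```python
# Back-of-the-envelope training-cost estimate for a dense transformer.
# Rule of thumb: ~6 FLOPs per parameter per token (~2 forward, ~4 backward).
params = 13e9          # hypothetical 13B-parameter model
tokens = 300e9         # hypothetical 300B training tokens
peak_flops = 312e12    # assumed per-GPU peak (A100 BF16), FLOP/s
mfu = 0.4              # assumed model FLOPs utilization

total_flops = 6 * params * tokens
gpu_days = total_flops / (peak_flops * mfu) / 86400
print(f"~{total_flops:.2e} FLOPs  ->  ~{gpu_days:.0f} GPU-days at 40% MFU")
```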
#Computer Science# Official implementation of DynPartition: Automatic Optimal Pipeline Parallelism of Dynamic Neural Networks over Heterogeneous GPU Systems for Inference Tasks