Mesh TensorFlow: Model Parallelism Made Easier
Minimalistic large language model 3D-parallelism training
Examples of training models with hybrid parallelism using ColossalAI
Functional local implementations of main model parallelism approaches
GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads, including support for features such as TensorCores and CUDA Dynamic Parallelism.
A CNN example that demonstrates the workflow for using distributed TensorFlow to split the graph between multiple machines
Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient"
FTPipe and related pipeline model parallelism research.
Implementation of autoregressive language model using improved Transformer and DeepSpeed pipeline parallelism.
Pipeline Parallelism for PyTorch
Testing memory-level parallelism
The C++ Standard Library for Parallelism and Concurrency
An API for data parallelism in JavaScript
A baseline repository of Auto-Parallelism in Training Neural Networks
Enhancements to tile cutter for parallelism and image format support
Use two different methods (DeepSpeed and the SageMaker model parallelism library) to fine-tune a Llama model on SageMaker, then deploy the fine-tuned Llama on SageMaker with server-side batching.
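Several of the repositories above implement tensor (model) parallelism, where a single layer's weights are sharded across devices. The core idea can be sketched in a few lines of NumPy; here two arrays stand in for two GPUs, and all names are illustrative rather than taken from any library above.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # batch of activations
W = rng.standard_normal((8, 6))    # full weight matrix of one linear layer

# Column-wise split: each shard holds half of the output features,
# as if it lived on a separate device.
W_shards = np.split(W, 2, axis=1)  # two (8, 3) shards

# Each "device" computes its slice of the output independently.
partials = [x @ shard for shard in W_shards]

# An all-gather along the feature axis reconstructs the full output.
y_parallel = np.concatenate(partials, axis=1)

# The sharded computation matches the unsharded layer exactly.
assert np.allclose(y_parallel, x @ W)
```

Row-wise sharding of `W` works symmetrically: each device then produces a partial sum of the full output, combined with an all-reduce instead of an all-gather.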