Mesh TensorFlow: Model Parallelism Made Easier
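Mesh TensorFlow implements model parallelism by splitting tensor dimensions across a mesh of devices. As a rough illustration of the underlying idea (a NumPy sketch, not Mesh TensorFlow's API), a linear layer's weight matrix can be column-sharded so each device computes only a partial output:

```python
import numpy as np

# Minimal sketch of tensor (model) parallelism: a linear layer whose
# weight matrix is split column-wise across two simulated "devices".
# Illustrative only; this is not Mesh TensorFlow's API.

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # batch of 4, hidden size 8
w = rng.standard_normal((8, 6))   # full weight: 8 -> 6

# Shard the output dimension across 2 devices (columns 0:3 and 3:6).
shards = np.split(w, 2, axis=1)

# Each device computes a partial matmul on its own shard.
partials = [x @ shard for shard in shards]

# Concatenating the partial outputs recovers the full result.
y = np.concatenate(partials, axis=1)
assert np.allclose(y, x @ w)
```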
Minimalistic 3D-parallelism training for large language models
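3D parallelism assigns each worker a coordinate on a data- × pipeline- × tensor-parallel grid. A self-contained sketch of one possible rank-to-coordinate mapping (the axis order and grid sizes here are illustrative assumptions, not any particular framework's convention):

```python
# Sketch: mapping a global rank to (data, pipeline, tensor) coordinates
# on a dp x pp x tp device grid. Axis order is an illustrative choice.
DP, PP, TP = 2, 2, 2          # 8 workers total

def mesh_coords(rank: int) -> tuple[int, int, int]:
    tp = rank % TP
    pp = (rank // TP) % PP
    dp = rank // (TP * PP)
    return dp, pp, tp

for rank in range(DP * PP * TP):
    print(rank, mesh_coords(rank))   # rank 7 -> (1, 1, 1)
```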
Examples of training models with hybrid parallelism using ColossalAI
Functional local implementations of the main model-parallelism approaches
Data-driven model reduction library with an emphasis on large-scale parallelism and linear subspace methods
GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as w...
A CNN example that demonstrates the workflow for using distributed TensorFlow to split the graph across multiple machines
Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient"
FTPipe and related pipeline model parallelism research (natural language processing)
Implementation of an autoregressive language model using an improved Transformer and DeepSpeed pipeline parallelism
Pipeline Parallelism for PyTorch
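Pipeline parallelism streams microbatches through model stages so that different stages work on different microbatches at the same time. A pure-Python sketch of the GPipe-style forward schedule (stage and microbatch counts are made up for illustration):

```python
# Sketch of a GPipe-style forward schedule: with S stages and M
# microbatches, stage s processes microbatch m at tick s + m, so at any
# given tick several stages are busy on different microbatches.
S, M = 3, 4
for tick in range(S + M - 1):
    active = [(s, tick - s) for s in range(S) if 0 <= tick - s < M]
    print(f"tick {tick}: " + ", ".join(f"stage{s}<-mb{m}" for s, m in active))
```

At tick 2, for example, stage 0 consumes microbatch 2 while stages 1 and 2 work on microbatches 1 and 0, which is the concurrency pipeline parallelism exists to expose.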
Testing memory-level parallelism
The C++ Standard Library for Parallelism and Concurrency
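HPX, the library above, builds on C++ futures and parallel algorithms. As a loose Python analogue of that futures style (not HPX itself, and threads here only illustrate the API; CPU-bound work would need processes):

```python
from concurrent.futures import ThreadPoolExecutor

# Rough analogue of future-based task parallelism: launch work
# asynchronously, then compose the results when they are ready.
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

with ThreadPoolExecutor(max_workers=4) as ex:
    futures = [ex.submit(fib, n) for n in (20, 22, 24, 26)]
    print([f.result() for f in futures])
```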
An API for data parallelism in JavaScript
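Data parallelism applies one function to disjoint chunks of the input concurrently. A generic Python sketch of the concept (the library above exposes a JavaScript typed-array API instead):

```python
from multiprocessing import Pool

def square(x: int) -> int:
    return x * x

if __name__ == "__main__":
    data = list(range(16))
    # Each worker applies the same function to its own slice of the data.
    with Pool(processes=4) as pool:
        result = pool.map(square, data, chunksize=4)
    print(result)
```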
Fine-grained parallelism with sub-nanosecond overhead in Zig
Minimalistic 4D-parallelism distributed training framework for educational purposes
Lightweight, graphics-API-agnostic 2D graphics library in C with parallelism support
A baseline repository for auto-parallelism in training neural networks
Enhancements to tile cutter for parallelism and image format support
Extended Memory Semantics - Persistent shared object memory and parallelism for Node.js and Python
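Shared object memory lets several processes read and write the same data without copies. A minimal sketch of the concept using Python's standard multiprocessing.shared_memory (EMS's own API is richer and differs):

```python
from multiprocessing import Process, shared_memory

# Minimal sketch of cross-process shared memory using the standard
# library; EMS itself exposes a persistent shared-object API on top
# of ideas like this one.
def child(name: str) -> None:
    shm = shared_memory.SharedMemory(name=name)
    shm.buf[0] = 42                      # write is visible to the parent
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=8)
    p = Process(target=child, args=(shm.name,))
    p.start(); p.join()
    print(shm.buf[0])                    # -> 42, written by the child
    shm.close(); shm.unlink()
```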