Optimized primitives for collective multi-GPU communication
NCCL Tests
A plugin that lets EC2 developers use libfabric as a network provider while running NCCL applications.
RDMA and SHARP plugins for the NCCL library
NCCL Profiling Kit
NVIDIA NCCL Tests for Distributed Training
Torch bindings for NCCL
NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud.