Example models using DeepSpeed
DeepSpeed Chat: one-click RLHF training that makes your ChatGPT-like hundred-billion-parameter models up to 15x faster and cheaper to train
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Multi-GPU ChatGLM training with DeepSpeed and …
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support (a DeepSpeedPlugin sketch appears after this list)
Memory optimized finetuning scripts for CogVideoX using TorchAO and DeepSpeed
DeepSpeed Tutorial
LLaMA 2 fine-tuning with DeepSpeed and LoRA
A plug-in for Microsoft DeepSpeed that fixes a bug in DeepSpeed's pipeline parallelism
Teacher-student distillation using DeepSpeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) devices. Note that XPU is already supported by stock DeepSpeed.
Testing DeepSpeed integration in 🤗 Accelerate
LLaMA tuning with the Stanford Alpaca dataset using DeepSpeed and Transformers
Alpaca-LoRA implementation for Hugging Face using DeepSpeed and FullyShardedDataParallel
Train 🤗transformers with DeepSpeed: ZeRO-2, ZeRO-3 (a minimal ZeRO-2 Trainer sketch appears after this list)
DeepSpeed, LLM, Medical_Dialogue, medical large language models, pre-training, fine-tuning
Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃
Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat
Implementation of an autoregressive language model using an improved Transformer and DeepSpeed pipeline parallelism.
Simple and efficient RevNet library for PyTorch with XLA and DeepSpeed support and parameter offload
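As referenced in the 🤗 Accelerate entry above, here is a minimal sketch of enabling DeepSpeed ZeRO-2 through Accelerate's DeepSpeedPlugin. The toy model, data, learning rate, and ZeRO stage are illustrative assumptions, not taken from any repository listed here; it assumes DeepSpeed is installed and the script is started with `accelerate launch` on one or more GPUs.

```python
# Hedged sketch: DeepSpeed ZeRO-2 via 🤗 Accelerate's DeepSpeedPlugin.
# Launch with: accelerate launch this_script.py   (requires the deepspeed package)
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator, DeepSpeedPlugin

# ZeRO stage and accumulation steps are assumptions for illustration.
plugin = DeepSpeedPlugin(zero_stage=2, gradient_accumulation_steps=1)
accelerator = Accelerator(deepspeed_plugin=plugin)

model = torch.nn.Linear(16, 1)                          # stand-in for a real network
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
data = DataLoader(TensorDataset(torch.randn(64, 16), torch.randn(64, 1)), batch_size=8)

# prepare() wraps model/optimizer/dataloader in the DeepSpeed engine.
model, optimizer, data = accelerator.prepare(model, optimizer, data)

for x, y in data:
    loss = torch.nn.functional.mse_loss(model(x), y)
    accelerator.backward(loss)   # lets DeepSpeed handle scaling/partitioning
    optimizer.step()
    optimizer.zero_grad()
```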
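Likewise, for the "Train 🤗transformers with DeepSpeed" entry, a minimal sketch of handing a ZeRO-2 config to the Transformers Trainer. The `gpt2` checkpoint, toy dataset, output directory, and optimizer-offload setting are placeholder assumptions; a real run would use the `deepspeed` launcher and an actual dataset.

```python
# Hedged sketch: ZeRO-2 fine-tuning through the 🤗 Transformers Trainer.
# Launch with: deepspeed this_script.py   (requires the deepspeed package and a GPU)
from torch.utils.data import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

# ZeRO-2 config passed as a plain dict; "auto" fields are filled in by the Trainer.
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu"},  # optional: move optimizer states to CPU
    },
}

tokenizer = AutoTokenizer.from_pretrained("gpt2")       # small placeholder checkpoint
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

class ToyDataset(Dataset):
    """A handful of tokenized sentences, just enough to exercise the training loop."""
    def __init__(self):
        enc = tokenizer(["DeepSpeed example sentence."] * 16, padding=True, return_tensors="pt")
        self.ids = enc["input_ids"]
    def __len__(self):
        return self.ids.size(0)
    def __getitem__(self, i):
        return {"input_ids": self.ids[i], "labels": self.ids[i]}

args = TrainingArguments(
    output_dir="out",                 # placeholder output directory
    per_device_train_batch_size=4,
    num_train_epochs=1,
    deepspeed=ds_config,              # hands the ZeRO config to the DeepSpeed engine
)

Trainer(model=model, args=args, train_dataset=ToyDataset()).train()
```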