#Computer Science# DeepSpeed Chat: one-click RLHF training that makes your ChatGPT-like hundred-billion-parameter models 15x faster and cheaper
SNIPER / AutoFocus is an efficient multi-scale object detection training / inference algorithm
#Computer Science# Efficient AI Inference & Serving
Efficient Inference of Transformer models
PyTorch Implementation of [1611.06440] Pruning Convolutional Neural Networks for Resource Efficient Inference
[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
Translation - Deploy optimized transformer-based models on an Nvidia Triton server
Easy and Efficient Transformer : Scalable Inference Solution For Large NLP model
[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference
VideoSys: An easy and efficient system for video generation
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
SCODE : an efficient regulatory network inference algorithm from single-cell RNA-Seq during differentiation
Code to accompany "Weightless Neural Networks for Efficient Edge Inference", PACT 2022
Unofficial PyTorch implementation (inference only) of SimSwap: An Efficient Framework For High Fidelity Face Swapping
Inference code for LLaMA models