Search results for "efficient-inference"

vllm

@vllm-project

#Large Language Models# A high-throughput and memory-efficient inference and serving engine for LLMs

gpt llm PyTorch llmops mlops

Python 31.03k

3 hours ago

inference gpu pytorch llmops llm-serving deep-learning gpt llama2 llm-inference llama

DeepSpeed

Microsoft @microsoft

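#Computer Science# DeepSpeed Chat: one-click RLHF training that makes your ChatGPT-like hundred-billion-parameter models 15x faster and cheaper

deep-learning PyTorch gpu machine-learning billion-parameters

Python 35.67k

3 days ago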

BMInf

@OpenBMB

#Computer Science# Efficient Inference for Big Models

Translation - A low-cost inference package for large pretrained language models (PLMs)

deep-learning gpu pretrained-language-models

Python 572

2 years ago

SNIPER

@mahyarnajibi

SNIPER / AutoFocus is an efficient multi-scale object detection training / inference algorithm

Python 2.69k

3 years ago

SwiftInfer

@hpcaitech

#Computer Science# Efficient AI Inference & Serving

artificial-intelligence deep-learning gpt inference llama

Python 459

1 year ago

useful-transformers

@usefulsensors

Efficient Inference of Transformer models

C++ 394

4 months ago

pytorch-pruning

@jacobgil

PyTorch Implementation of [1611.06440] Pruning Convolutional Neural Networks for Resource Efficient Inference

Python 875

5 years ago

torchsparse

MIT HAN Lab @mit-han-lab

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda 1.23k

17 days ago

transformer-deploy

@ELS-RD

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀

Translation - Deploy optimized transformer-based models on Nvidia Triton server

tensorrt inference PyTorch onnxruntime triton-inference-server

Python 1.66k

1 month ago

EET

@NetEase-FuXi

Easy and Efficient Transformer : Scalable Inference Solution For Large NLP model

Python 261

4 months ago

SMSR

@The-Learning-And-Vision-Atelier-LAVA

[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference

Python 229

3 years ago

VideoSys

@NUS-HPC-AI-Lab

VideoSys: An easy and efficient system for video generation

Python 1.54k

3 months ago

sige

@lmxyy

[NeurIPS 2022] Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models

Python 225

2 years ago

DDA

@BIT-DA

Code release for "Dynamic Domain Adaptation for Efficient Inference" (CVPR 2021)

Python 27

3 years ago

inferflow

@inferflow

Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).

llama2 llamacpp llm-inference model-quantization multi-gpu-inference

C++ 231

8 months ago

SCODE

@hmatsu1226

SCODE : an efficient regulatory network inference algorithm from single-cell RNA-Seq during differentiation

R 41

5 years ago

BTHOWeN

@ZSusskind

Code to accompany "Weightless Neural Networks for Efficient Edge Inference", PACT 2022

SystemVerilog 5

2 years ago

simswap-inference-pytorch

@mike9251

Unofficial Pytorch implementation (inference only) of the SimSwap: An Efficient Framework For High Fidelity Face Swapping

Python 93

2 years ago

llama.cpp

Georgi Gerganov @ggerganov

Port of Facebook's LLaMA model in C/C++

llama ggml

C++ 68.5k

1 hour ago

llama

@meta-llama

Inference code for LLaMA models

Python 56.6k

3 months ago
