Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
[CVPR 2020] Beyond MobileNetV3: "GhostNet: More Features from Cheap Operations"
#NLP#[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
#Computer Science#EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]
Code for the paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
#NLP#[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
[NeurIPS 2024 Spotlight] "LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS", Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang
#Computer Science#Learning Efficient Convolutional Networks through Network Slimming, in ICCV 2017.
#Awesome#A curated list of papers on neural network quantization from recent AI conferences and journals.
#NLP#[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
#Computer Science#Explorations into some recent techniques surrounding speculative decoding
[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference
(CVPR 2021, Oral) Dynamic Slimmable Network
[ECCV 2022] Efficient Long-Range Attention Network for Image Super-resolution
#NLP#On-device LLM Inference Powered by X-Bit Quantization
#Face Recognition#Deep Face Model Compression
[NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
#Computer Science#[ECCV 2022] Official implementation of the paper "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"
📚 Collection of awesome generation acceleration resources.
[NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching