Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
[CVPR 2020] Beyond MobileNetV3: "GhostNet: More Features from Cheap Operations"
#NLP#[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
#Computer Science#EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]
Code for the paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
#NLP#[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
[NeurIPS 2024 Spotlight] "LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS", Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang
#Computer Science#Learning Efficient Convolutional Networks through Network Slimming, in ICCV 2017.
#Awesome#A curated list of papers on neural network quantization from recent AI conferences and journals.
#NLP#[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
#Computer Science#Explorations into some recent techniques surrounding speculative decoding
[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference
(CVPR 2021, Oral) Dynamic Slimmable Network
[ECCV 2022] Efficient Long-Range Attention Network for Image Super-resolution
#NLP#On-device LLM Inference Powered by X-Bit Quantization
#Face Recognition#Deep Face Model Compression
[NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
#Computer Science#[ECCV 2022] Official implementation of the paper "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"
📚 Collection of awesome generation acceleration resources.
[NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching