[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment
[ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
[CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision
#Computer Science# A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.
#Natural Language Processing# [NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
#Natural Language Processing# [ACL'20] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
[CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework
#Natural Language Processing# [ICML'21 Oral] I-BERT: Integer-only BERT Quantization
Efficient 3D Backbone Network for Temporal Modeling
[ICCV 2019] Harmonious Bottleneck on Two Orthogonal Dimensions, surpassing MobileNetV2
#Natural Language Processing# [KDD'22] Learned Token Pruning for Transformers
S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration (CVPR 2021)
#Large Language Models# Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models".
#Natural Language Processing# [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion
[MICCAI 2021] BiX-NAS: Searching Efficient Bi-directional Architecture for Medical Image Segmentation
The semantic segmentation of remote sensing images