[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
micronet, a model compression and deploy lib. Compression: 1) quantization: quantization-aware training (QAT), high-bit (>2b) via DoReFa / "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference", 8/4/2-bit, and ternary/binary values (TWN/BNN/XNOR-Net); 2) pruning: normal, regular, and group-convolution channel pruning; 3) group convolution structure; 4) batch-normalization folding for binarized activations.
[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
EasyQuant (EQ) is an efficient and simple post-training quantization method that optimizes the scales of weights and activations (a toy scale-search sketch appears after this list).
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming
(ICML 2024) BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Post-training static quantization using the ResNet18 architecture (a minimal PyTorch sketch appears after this list).
PyTorch Quantization-Aware Training example (a minimal sketch appears after this list).
An NNIE quantization-aware training tool for PyTorch.
Neural Network Quantization & Low-Bit Fixed Point Training For Hardware-Friendly Algorithm Design
[NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers
Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models".
TensorFlow quantization-aware training (implementing model fake quantization with TensorFlow).
ICCV2021 - training a post-hoc lightweight GAN-discriminator for open-set recognition
color quantization lib
[ICCV2023] Dataset Quantization
alibabacloud-quantization-networks
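The following is a toy illustration of the idea behind per-tensor scale optimization for post-training quantization (as in the EasyQuant entry above): search for a scale that maximizes cosine similarity between the original and fake-quantized tensor. This is a generic sketch, not EasyQuant's actual search procedure; the max-abs baseline and search range are assumptions.

```python
# Toy per-tensor scale search for int8 fake quantization (illustrative only).
import torch

def fake_quant(x: torch.Tensor, scale: float, bits: int = 8) -> torch.Tensor:
    # Symmetric quantize-dequantize with the given scale.
    qmax = 2 ** (bits - 1) - 1
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q * scale

def search_scale(x: torch.Tensor, bits: int = 8, steps: int = 100) -> float:
    # Start from the max-abs scale and scan a range around it,
    # keeping the scale whose fake-quantized tensor is most similar to x.
    base = x.abs().max().item() / (2 ** (bits - 1) - 1)
    best_scale, best_sim = base, -1.0
    for s in torch.linspace(0.5 * base, 1.2 * base, steps):
        sim = torch.nn.functional.cosine_similarity(
            x.flatten(), fake_quant(x, s.item(), bits).flatten(), dim=0
        ).item()
        if sim > best_sim:
            best_sim, best_scale = sim, s.item()
    return best_scale

w = torch.randn(256, 256)
print(search_scale(w))  # scale that best preserves this tensor under int8
```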
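Below is a minimal sketch of post-training static quantization in PyTorch eager mode, loosely matching the ResNet18 entry above; the random calibration data and batch counts are stand-in assumptions, not code from that repository.

```python
# Post-training static quantization of ResNet18 with torch.ao.quantization (eager mode).
import torch
from torch.ao import quantization as tq
from torchvision.models.quantization import resnet18

model = resnet18(weights=None, quantize=False).eval()
model.fuse_model()                                   # fuse Conv+BN+ReLU blocks
model.qconfig = tq.get_default_qconfig("fbgemm")     # x86 backend; use "qnnpack" on ARM
prepared = tq.prepare(model)                         # insert observers

# Calibration: run a few representative batches (random data as a stand-in here).
with torch.no_grad():
    for _ in range(8):
        prepared(torch.randn(4, 3, 224, 224))

quantized = tq.convert(prepared)                     # swap modules for int8 versions
print(quantized(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 1000])
```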
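And a minimal sketch of quantization-aware training in PyTorch eager mode, related to the QAT entries above; the tiny model, random data, and training loop are illustrative assumptions, not code from those repositories.

```python
# Quantization-aware training with torch.ao.quantization (eager mode, toy model).
import torch
import torch.nn as nn
from torch.ao import quantization as tq

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()        # fp32 -> int8 boundary
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.fc = nn.Linear(8 * 30 * 30, 10)
        self.dequant = tq.DeQuantStub()    # int8 -> fp32 boundary

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        x = self.fc(torch.flatten(x, 1))
        return self.dequant(x)

model = TinyNet()
model.qconfig = tq.get_default_qat_qconfig("fbgemm")
tq.prepare_qat(model, inplace=True)        # insert fake-quant modules

# Short dummy training loop so the fake-quant observers see data.
opt = torch.optim.SGD(model.parameters(), lr=0.01)
for _ in range(3):
    x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

model.eval()
int8_model = tq.convert(model)             # fold fake-quant into real int8 modules
print(int8_model(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 10])
```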