SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), High-Bit (>2b) (DoReFa/Quantization and Training of Neural Networks for Efficient Integer-Ari...
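Translation: PyTorch-based model compression (1. quantization: 8/4/2 bits (DoReFa), ternary/binary values (TWN/BNN/XNOR-Net); 2. pruning: normal, regular, and group-convolution channel pruning; 3. group convolution structure; 4. batch normalization folding for binary-valued features (A))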
#Computer Science# TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
#Natural Language Processing# [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
#Large Language Models# [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.
#Computer Science# A model compression and acceleration toolbox based on PyTorch.
[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.
#Large Language Models# [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.
#Computer Science# Notes on quantization in neural networks
[CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
Post-training static quantization using ResNet18 architecture
#Large Language Models# [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"
PyTorch implementation of our ECCV 2022 paper, "Fine-grained Data Distribution Alignment for Post-Training Quantization"
[ASP-DAC 2025] "NeuronQuant: Accurate and Efficient Post-Training Quantization for Spiking Neural Networks" Official Implementation
Improved the performance of 8-bit PTQ4DM, especially on FID.
[CAAI AIR'24] Minimize Quantization Output Error with Bias Compensation
Quantization example for PTQ & QAT