# post-training-quantization

intel

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python 2.37k
13 hours ago
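To make the topic concrete, here is a minimal sketch of the core post-training quantization idea that toolkits like the one above implement at much larger scale: calibrate a scale factor from observed tensor statistics, round to INT8, and dequantize. The function names and the symmetric min-max scheme are illustrative assumptions, not Intel Neural Compressor's API.

```python
# Hedged sketch of basic symmetric INT8 post-training quantization (PTQ).
# Generic illustration only; not the API of any repository listed here.
import torch

def calibrate_scale(x: torch.Tensor, num_bits: int = 8) -> float:
    """Symmetric min-max calibration: map max|x| to the largest signed code."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for INT8
    return x.abs().max().item() / qmax

def quantize(x: torch.Tensor, scale: float, num_bits: int = 8) -> torch.Tensor:
    qmax = 2 ** (num_bits - 1) - 1
    return torch.clamp(torch.round(x / scale), -qmax, qmax).to(torch.int8)

def dequantize(x_q: torch.Tensor, scale: float) -> torch.Tensor:
    return x_q.float() * scale

# Typical PTQ flow: run a small calibration set through the model, record
# ranges, then swap float tensors for (quantize, dequantize) pairs.
w = torch.randn(256, 256)
s = calibrate_scale(w)
w_q = quantize(w, s)
print((w - dequantize(w_q, s)).abs().mean())  # mean quantization error
```

Production toolkits layer per-channel scales, activation calibration over real data, and lower-bit formats (INT4/FP8/NF4) on top of this basic recipe.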
666DZY666

micronet, a PyTorch model compression and deployment library. Compression: 1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and ternary/binary (TWN/BNN/XNOR-Net); 2) pruning: normal, regular, and group-convolution channel pruning; 3) group-convolution structure; 4) batch-normalization folding for binarized activations (A).

Python 2.24k
2 days ago
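The quantization-aware-training part of the entry above refers to schemes like DoReFa, where weights are quantized in the forward pass and gradients flow through a straight-through estimator (STE). Below is a generic k-bit DoReFa-style weight quantizer as a sketch under those assumptions; it is not code from the micronet repository, and the class and function names are made up for illustration.

```python
# Hedged sketch: DoReFa-style k-bit weight quantization with a straight-through
# estimator. Generic illustration, not code from micronet.
import torch

class QuantizeK(torch.autograd.Function):
    """Uniformly quantize x in [0, 1] to k bits; pass gradients straight through."""
    @staticmethod
    def forward(ctx, x, k):
        n = float(2 ** k - 1)
        return torch.round(x * n) / n

    @staticmethod
    def backward(ctx, grad_output):
        # STE: treat round() as identity when backpropagating.
        return grad_output, None

def dorefa_quantize_weights(w: torch.Tensor, k: int = 2) -> torch.Tensor:
    """Map full-precision weights to k-bit values in [-1, 1] (DoReFa weight scheme)."""
    t = torch.tanh(w)
    x = t / (2 * t.abs().max()) + 0.5     # squash into [0, 1]
    return 2 * QuantizeK.apply(x, k) - 1  # rescale back to [-1, 1]

# During QAT the quantized weights feed the forward pass, while gradients
# update the full-precision "shadow" weights via the straight-through path.
w = torch.randn(64, 3, 3, 3, requires_grad=True)
w_q = dorefa_quantize_weights(w, k=2)
w_q.pow(2).sum().backward()   # gradients reach w despite the rounding
```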
ModelTC

#LLM# [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Python 454
6 days ago
megvii-research

[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer

Python 331
2 years ago
sayakpaul
Jupyter Notebook 172
2 years ago
Hsu1023

#LLM# [NeurIPS 2024 Oral 🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.

Python 156
6 months ago
ModelTC

[CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".

Jupyter Notebook 62
8 months ago
Sanjana7395
Jupyter Notebook 37
5 years ago
ModelTC

#LLM# [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models".

Python 35
1 year ago
zysxmu

PyTorch implementation of our ECCV 2022 paper "Fine-grained Data Distribution Alignment for Post-Training Quantization".

Python 14
3 years ago
shieldforever

[ASP-DAC 2025] Official implementation of "NeuronQuant: Accurate and Efficient Post-Training Quantization for Spiking Neural Networks".

Python 10
1 month ago
iszry

Improves the performance of 8-bit PTQ4DM, especially on FID.

Python 9
2 years ago
GongCheng1919

[CAAI AIR'24] Minimize Quantization Output Error with Bias Compensation

Python 7
1 month ago
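Bias compensation, as named in the entry above, generally means absorbing the expected output error introduced by weight quantization into the layer bias. The sketch below shows that general idea for a linear layer with symmetric INT8 fake quantization; it is an assumption-laden illustration (hypothetical helper names, assumed calibration batch), not the specific method of the CAAI AIR'24 paper.

```python
# Hedged sketch of generic bias compensation for a quantized linear layer:
# fold the expected error E[(W - W_q) x] into the bias. Not the paper's method.
import torch

def fake_quantize_int8(w: torch.Tensor) -> torch.Tensor:
    """Symmetric per-tensor INT8 quantize-then-dequantize ('fake quantization')."""
    scale = w.abs().max() / 127.0
    return torch.clamp(torch.round(w / scale), -127, 127) * scale

def compensate_bias(weight: torch.Tensor, bias: torch.Tensor,
                    calib_inputs: torch.Tensor):
    """calib_inputs: (N, in_features) calibration activations (assumed input)."""
    w_q = fake_quantize_int8(weight)
    mean_x = calib_inputs.mean(dim=0)        # E[x] over the calibration set
    bias_corr = (weight - w_q) @ mean_x      # expected per-output quantization error
    return w_q, bias + bias_corr

# Usage with a hypothetical linear layer and calibration batch:
layer = torch.nn.Linear(128, 64)
calib = torch.randn(256, 128)
w_q, b_new = compensate_bias(layer.weight.data, layer.bias.data, calib)
```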