4-bit quantization of LLaMA using GPTQ; a round-to-nearest 4-bit baseline sketch follows this list.
Accessible large language models via k-bit quantization for PyTorch.
Vector (and scalar) quantization in PyTorch; a nearest-neighbour codebook-lookup sketch follows this list.
Cryptocurrency (BTC, ETH) quantitative trading system. Grid-strategy practice for quantitative trading on the Binance exchange; support for other popular exchanges such as Huobi and OKEX is planned. Billed as the simplest and most reliable project for returns, with full tutorials included.
micronet, a PyTorch-based model compression and deployment library. Compression: 1) quantization: quantization-aware training (QAT); high-bit (>2b) schemes (DoReFa; Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference); 8/4/2-bit (DoReFa) and ternary/binary weights (TWN/BNN/XNOR-Net); 2) pruning: regular channel pruning, and channel pruning for regular and group convolutions; 3) group convolution structure; 4) batch-normalization folding for binarized activations.
A list of papers, docs, and code about model quantization. The repo aims to provide resources for model quantization research and is continuously improved; PRs adding missed works (papers, etc.) are welcome.
Color quantization library.
[ICCV 2023] Dataset Quantization
alibabacloud-quantization-networks
YOLOv3 quantization model v10, for off-line quantization only.
#Natural Language Processing# [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Incremental Network Quantization, K-means quantization, Iterative Pruning, Dynamic Network Surgery
Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.
Multi-backbone support, pruning, quantization, and knowledge distillation (KD).
Reproduction of the PACT quantization paper.
PyTorch quantization-aware training (QAT) example; a minimal eager-mode QAT sketch follows this list.
Summary and code for deep neural network quantization.
LLaMA/RWKV ONNX models, quantization, and test cases.
Library for 8-bit optimizers and quantization routines; an 8-bit optimizer usage sketch follows this list.
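For the 4-bit GPTQ entry, here is a minimal sketch of plain round-to-nearest (RTN) 4-bit affine quantization of a weight matrix. This is not GPTQ itself (GPTQ chooses roundings with second-order, Hessian-based error compensation); it only illustrates the uniform 4-bit grid such methods target. The helper name and the group size of 128 are illustrative assumptions.

```python
import torch

def quantize_rtn_4bit(w: torch.Tensor, group_size: int = 128):
    """Round-to-nearest 4-bit affine quantization over column groups.

    Baseline only: GPTQ uses the same 4-bit grid but picks roundings with
    second-order error compensation instead of plain rounding.
    """
    out_features, in_features = w.shape
    assert in_features % group_size == 0, "in_features must be divisible by group_size"
    wg = w.reshape(out_features, in_features // group_size, group_size)
    wmin = wg.amin(dim=-1, keepdim=True)
    wmax = wg.amax(dim=-1, keepdim=True)
    scale = (wmax - wmin).clamp(min=1e-8) / 15.0      # 4 bits -> levels 0..15
    zero = torch.round(-wmin / scale)                 # per-group zero point
    q = torch.clamp(torch.round(wg / scale) + zero, 0, 15)
    deq = (q - zero) * scale                          # dequantized approximation
    return q.reshape_as(w), deq.reshape_as(w)

w = torch.randn(256, 256)
q, w_hat = quantize_rtn_4bit(w)
print("mean |w - w_hat|:", (w - w_hat).abs().mean().item())
```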
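For the vector-quantization entry, a minimal sketch of the nearest-neighbour codebook lookup with a straight-through gradient estimator, written in plain PyTorch rather than against any particular library's API; the function name and tensor shapes are illustrative.

```python
import torch

def vector_quantize(x: torch.Tensor, codebook: torch.Tensor):
    """Nearest-neighbour codebook lookup with a straight-through gradient.

    x:        (batch, dim) continuous vectors
    codebook: (num_codes, dim) code vectors
    """
    dists = torch.cdist(x, codebook)      # (batch, num_codes) Euclidean distances
    indices = dists.argmin(dim=-1)        # index of the closest code per input
    quantized = codebook[indices]         # (batch, dim) selected codes
    # Straight-through estimator: forward uses the codes, backward treats the
    # quantization step as the identity so gradients reach x.
    quantized = x + (quantized - x).detach()
    return quantized, indices

x = torch.randn(8, 64, requires_grad=True)
codebook = torch.randn(512, 64)
q, idx = vector_quantize(x, codebook)
q.sum().backward()                        # gradients flow back to x
```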
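For the quantization-aware-training entry, a minimal eager-mode QAT sketch using PyTorch's torch.ao.quantization API (QuantStub/DeQuantStub, prepare_qat, convert); the tiny model and the training loop are placeholders.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import (
    QuantStub, DeQuantStub, get_default_qat_qconfig, prepare_qat, convert,
)

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()      # float -> quantized boundary
        self.fc1 = nn.Linear(16, 32)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(32, 4)
        self.dequant = DeQuantStub()  # quantized -> float boundary

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return self.dequant(x)

model = TinyNet()
model.qconfig = get_default_qat_qconfig("fbgemm")   # x86 backend; "qnnpack" on ARM
model_prepared = prepare_qat(model.train())         # inserts fake-quant observers

opt = torch.optim.SGD(model_prepared.parameters(), lr=1e-2)
for _ in range(10):                                 # placeholder training loop
    opt.zero_grad()
    out = model_prepared(torch.randn(8, 16))
    out.pow(2).mean().backward()
    opt.step()

model_prepared.eval()
model_int8 = convert(model_prepared)                # swap in real int8 modules
```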
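For the 8-bit optimizer entry (bitsandbytes), a minimal sketch of swapping a standard Adam optimizer for its 8-bit counterpart; the toy model and hyperparameters are placeholders, and a CUDA device is assumed since the 8-bit kernels run on GPU.

```python
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(1024, 1024).cuda()
# Drop-in replacement for torch.optim.Adam; optimizer state is kept in 8 bits
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-4)

x = torch.randn(32, 1024, device="cuda")
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```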