sparsity · GitHub Topics

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

low-precision pruning sparsity auto-tuning knowledge-distillation quantization quantization-aware-training post-training-quantization smoothquant large-language-models gptq int8

Python 2.37 k

1 day ago

neuralmagic / sparseml

#Natural Language Processing# Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

PyTorch Keras sparsification-recipes TensorFlow smaller-models deep-learning-library deep-learning deep-learning-models automl sparsity sparsification pruning computer-vision-algorithms object-detection image-classification natural-language-processing onnx transfer-learning

Python 2.12 k

8 months ago

pytorch / ao

PyTorch native quantization and sparsity for training and inference

brrr dtypes inference mx PyTorch quantization sparsity training float8 transformer offloading optimizer CUDA llama

Python 1.95 k

2 days ago

PaddlePaddle / PaddleSlim

PaddleSlim is an open-source library for deep model compression and architecture search.

pruning quantization nas bert compression detection distillation ernie segmentation sparsity tensorrt transformer yolov6 yolov5 yolov7

Python 1.59 k

4 months ago

tensorflow / model-optimization

#Computer Science# A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.

TensorFlow machine-learning deep-learning optimization Keras model-compression compression pruning sparsity quantization

Python 1.53 k

2 months ago

vllm-project / llm-compressor

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

compression quantization sparsity

Python 1.21 k

2 days ago

openvinotoolkit / nncf

#Natural Language Processing# Neural Network Compression Framework for enhanced OpenVINO™ inference

quantization pruning sparsity quantization-aware-training compression semantic-segmentation object-detection classification natural-language-processing bert transformers PyTorch TensorFlow onnx openvino deep-learning genai large-language-models

Python 997

2 days ago

Eric-mingjie / network-slimming

#Computer Science# Network Slimming (PyTorch) (ICCV 2017)

deep-learning convolutional-neural-networks PyTorch channel-pruning sparsity

Python 917

4 years ago

Bobo-y / flexible-yolov5

More readable and flexible yolov5 with more backbones (gcn, resnet, shufflenet, mobilenet, efficientnet, hrnet, swin-transformer, etc.) and additional modules (cbam, dcn, and so on), plus tensorrt

yolov5 resnet backbone cbam PyTorch shufflenet hrnet tensorrt object-detection swin-transformer gcn yolov3 sparsity

Python 673

8 months ago

FMInference / H2O

[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

gpt-3 high-throughput kv-cache large-language-models sparsity

Python 436

8 months ago

wenwei202 / caffe

Caffe for Sparse and Low-rank Deep Neural Networks

deep-neural-networks sparsity acceleration compression caffe sparse-convolution

C++ 378

5 years ago

intel / neural-speed

An innovative library for efficient LLM inference via low-bit quantization

cpu fp8 gpu int8 llm-inference sparsity llamacpp

C++ 351

7 months ago

mehtadushy / SelecSLS-Pytorch

#Computer Science# Reference ImageNet implementation of SelecSLS CNN architecture proposed in the SIGGRAPH 2020 paper "XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera". The repository also inclu...

pytorch-implementation PyTorch cnn imagenet deep-learning efficient efficient-architectures pruning sparsity cvpr2019 siggraph

Python 338

5 years ago

bwohlberg / sporco

Sparse Optimisation Research Code

sparsity optimization optimization-algorithms Python CUDA

Python 266

3 months ago

dcmocanu / sparse-evolutionary-artificial-neural-networks

#Computer Science# Always sparse. Never dense. But never say never. A Sparse Training repository for the Adaptive Sparse Connectivity concept and its algorithmic instantiation, i.e. Sparse Evolutionary Training, to boos...

artificial-neural-networks restricted-boltzmann-machine deep-learning deep-neural-networks neuroevolution complex-networks evolutionary-algorithms randomization scalability sparsity generative-models classification scalable-deep-learning

Python 247

4 years ago