#LLM#"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍; known as the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
#NLP#Sparsity-aware deep learning inference runtime for CPUs
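This description matches Neural Magic's DeepSparse. Assuming that, here is a hedged sketch of its `Pipeline` API; the task name and model path below are illustrative, not verified:

```python
# Hedged sketch: CPU inference with DeepSparse's Pipeline API.
# Pipeline.create is the documented entry point; the model path is a
# hypothetical sparse ONNX export of a text classifier.
from deepsparse import Pipeline

pipeline = Pipeline.create(
    task="text-classification",  # assumption: task name per DeepSparse docs
    model_path="./model.onnx",   # hypothetical local path
)
print(pipeline(sequences=["Sparse inference keeps CPUs competitive."]))
```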
#LLM#[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.
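A minimal sketch of DepGraph-style structural pruning with the Torch-Pruning package, following the pattern in its README (exact signatures may differ across versions):

```python
import torch
import torchvision
import torch_pruning as tp

model = torchvision.models.resnet18()
example_inputs = torch.randn(1, 3, 224, 224)

# Build the dependency graph so coupled layers (conv -> BN -> next conv)
# are pruned together instead of breaking the network.
DG = tp.DependencyGraph().build_dependency(model, example_inputs=example_inputs)

# Ask for a pruning group that removes output channels 2, 6, 9 of conv1;
# DepGraph propagates the change to all dependent layers.
group = DG.get_pruning_group(model.conv1, tp.prune_conv_out_channels, idxs=[2, 6, 9])
if DG.check_pruning_group(group):  # make sure no layer is pruned to zero channels
    group.prune()

print(model.conv1)  # out_channels reduced from 64 to 61
```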
#Awesome#A curated list of neural network pruning resources.
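For orientation, the simplest baseline many of these resources build on is unstructured magnitude pruning, available out of the box in PyTorch:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(256, 128)
# Zero the 30% of weights with smallest L1 magnitude
# (adds a weight_mask buffer and reparameterizes the layer).
prune.l1_unstructured(layer, name="weight", amount=0.3)
# Fold the mask into the weight tensor and drop the reparameterization.
prune.remove(layer, "weight")
```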
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
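This is the description of Intel Neural Compressor. A hedged sketch of static post-training quantization with its 2.x PyTorch API (names are assumptions if your version differs):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor.quantization import fit

model = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU())
# Tiny dummy calibration set; in practice use a slice of real inputs.
calib_loader = DataLoader(
    TensorDataset(torch.randn(32, 16), torch.zeros(32)), batch_size=8
)

conf = PostTrainingQuantConfig(approach="static")  # static PTQ with calibration
q_model = fit(model=model, conf=conf, calib_dataloader=calib_loader)
q_model.save("./int8_model")
```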
#Computer Science#AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
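A hedged sketch of AIMET's quantization-simulation flow for PyTorch (QuantizationSimModel per the AIMET docs; treat the exact signature as an assumption):

```python
import torch
from aimet_torch.quantsim import QuantizationSimModel

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
dummy_input = torch.randn(1, 3, 32, 32)

# Wrap the model with fake-quantization ops to simulate INT8 hardware.
sim = QuantizationSimModel(model, dummy_input=dummy_input)

def calibrate(sim_model, _):
    # Assumption: a few representative forward passes suffice for calibration.
    with torch.no_grad():
        sim_model(torch.randn(8, 3, 32, 32))

# Compute quantization encodings (scale/offset) from the calibration passes.
sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)
```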
micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Ari...
#NLP#Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
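This describes Neural Magic's SparseML. A hedged sketch of applying a recipe inside a PyTorch training loop (ScheduledModifierManager per the SparseML docs; the recipe path is hypothetical):

```python
import torch
from sparseml.pytorch.optim import ScheduledModifierManager

model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Load a YAML recipe (pruning/quantization schedule) and wrap the optimizer
# so modifiers fire at the right training steps.
manager = ScheduledModifierManager.from_yaml("recipe.yaml")  # hypothetical path
optimizer = manager.modify(model, optimizer, steps_per_epoch=100)

# ... normal training loop using `optimizer` goes here ...

manager.finalize(model)  # remove the hooks once training is done
```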
A practical course on Large Language Models.
OpenMMLab Model Compression Toolbox and Benchmark.
Config-driven, easy backup CLI for restic.
PaddleSlim is an open-source library for deep model compression and architecture search.
#Computer Science#A toolkit for optimizing Keras and TensorFlow models for deployment, including quantization and pruning.
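This is the TensorFlow Model Optimization Toolkit; a minimal pruning sketch with its documented Keras API:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(10),
])

# Wrap the model so low-magnitude weights are zeroed during training,
# ramping sparsity from 0% to 50% over 1000 steps.
pruned = tfmot.sparsity.keras.prune_low_magnitude(
    model,
    pruning_schedule=tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.0, final_sparsity=0.5, begin_step=0, end_step=1000
    ),
)
pruned.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
# fit() requires the pruning callback:
# pruned.fit(x, y, callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
```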
Efficient computing methods developed by Huawei Noah's Ark Lab
#NLP#Neural Network Compression Framework for enhanced OpenVINO™ inference
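A hedged sketch of NNCF post-training quantization ahead of OpenVINO inference (nncf.quantize / nncf.Dataset per recent NNCF docs; treat the details as assumptions):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import nncf

model = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU()).eval()
loader = DataLoader(TensorDataset(torch.randn(64, 16)), batch_size=8)

# nncf.Dataset adapts any iterable; transform_func extracts the model input
# from each batch (here, the single tensor in the tuple).
calibration_dataset = nncf.Dataset(loader, transform_func=lambda batch: batch[0])

# 8-bit post-training quantization; the result can be exported for OpenVINO.
quantized_model = nncf.quantize(model, calibration_dataset)
```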
#LLM#[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
#Computer Science#PyTorch implementation of [1611.06440] Pruning Convolutional Neural Networks for Resource Efficient Inference
Pruning and distillation for mobilev2-yolov5s, with support for ncnn and TensorRT deployment. Ultra-light but with better performance!
#Computer Science#TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.