SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
#Large Language Models# Large Language Models for All, 🦙 Cult and More. Stay in touch!
Production-ready LLM compression/quantization toolkit with accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
#Large Language Models# Run any Large Language Model behind a unified API
#Large Language Models# 🪶 Lightweight OpenAI drop-in replacement for Kubernetes
#Large Language Models# A guide to using GPTQ models with LangChain
#Large Language Models# ChatSakura: open-source multilingual conversational model.
Private self-improvement coaching with open-source LLMs
#Large Language Models# This repository is for profiling, extracting, visualizing and reusing generative AI weights to hopefully build more accurate AI models and audit/scan weights at rest to identify knowledge domains for ...
#Large Language Models# Run GGUF LLM models in the latest version of TextGen-webui
#Large Language Models# A.L.I.C.E (Artificial Labile Intelligence Cybernated Existence): a REST API for an AI companion, intended for building more complex systems
#Large Language Models# This project will develop a NEPSE chatbot using an open-source LLM, incorporating sentence transformers, a vector database, and reranking.
#Large Language Models# LLM quantization techniques: absmax, zero-point, GPTQ and GGUF
#Natural Language Processing# Quantizing LLMs using GPTQ
#Large Language Models# Hands-on with some LLMs
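
Two of the techniques named in the listing above, absmax and zero-point quantization, are simple enough to sketch in a few lines. The following is a minimal pure-Python illustration of the general idea on a flat list of floats. The function names are hypothetical and are not taken from any of the listed repositories; real toolkits operate on tensors, usually per channel or per group.

```python
def absmax_quantize(values, bits=8):
    """Symmetric (absmax) quantization: the largest magnitude maps to qmax."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    absmax = max(abs(v) for v in values)
    scale = qmax / absmax if absmax else 1.0        # guard against all-zero input
    return [round(v * scale) for v in values], scale


def absmax_dequantize(quantized, scale):
    """Recover approximate floats from absmax-quantized integers."""
    return [q / scale for q in quantized]


def zeropoint_quantize(values, bits=8):
    """Asymmetric (zero-point) quantization: [min, max] maps onto [qmin, qmax]."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1   # -128 and 127 for 8 bits
    lo, hi = min(values), max(values)
    span = hi - lo
    scale = (qmax - qmin) / span if span else 1.0   # guard against constant input
    zero_point = round(qmin - lo * scale)           # integer offset that shifts lo to qmin
    quantized = [
        max(qmin, min(qmax, round(v * scale) + zero_point))  # clamp to the integer range
        for v in values
    ]
    return quantized, scale, zero_point


def zeropoint_dequantize(quantized, scale, zero_point):
    """Recover approximate floats from zero-point-quantized integers."""
    return [(q - zero_point) / scale for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.5, 2.0]                 # made-up example values
    q_abs, s_abs = absmax_quantize(weights)
    q_zp, s_zp, zp = zeropoint_quantize(weights)
    print("absmax:    ", q_abs, absmax_dequantize(q_abs, s_abs))
    print("zero-point:", q_zp, zeropoint_dequantize(q_zp, s_zp, zp))
```

Absmax keeps zero exactly representable but wastes part of the integer range when the values are skewed to one side. Zero-point uses the full range at the cost of storing one extra integer offset per quantized block.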