gptq · GitHub Topics

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

low-precision pruning sparsity auto-tuning knowledge-distillation quantization quantization-aware-training post-training-quantization smoothquant large-language-models gptq int8

Python 2.37 k

1 day ago

ModelCloud / GPTQModel

Production ready LLM model compression/quantization toolkit with hw accelerated inference support for both cpu/gpu via HF, vLLM, and SGLang.

gptq peft quantization transformers vllm

Python 448

11 hours ago

shm007g / LLaMA-Cult-and-More

#LLM# Large Language Models for All, 🦙 Cult and More, Stay in touch!

alpaca ChatGPT gpt llama ggml gpt4 gptq vicuna PyTorch Tensorflow transformers deepspeed llm

HTML 446

2 years ago

intel / auto-round

Advanced Quantization Algorithm for LLMs/VLMs.

gptq quantization rounding

Python 426

2 days ago

bobazooba / xllm

#LLM# 🦖 X—LLM: Cutting Edge & Easy LLM Finetuning

alpaca cerebras ChatGPT deep-learning deep-neural-networks gpt gpt-4 gptq large-language-models llama llama2 llm mistral openai vicuna Zephyr RTOS PyTorch torch

Python 400

1 year ago

1b5d / llm-api

#LLM# Run any Large Language Model behind a unified API

ChatGPT gptq huggingface langchain llama llamacpp llm llm-inference machine-learning Python

Python 168

1 year ago

chenhunghan / ialacol

#LLM# 🪶 Lightweight OpenAI drop-in replacement for Kubernetes

artificial-intelligence helm Kubernetes langchain llm Python openai cloudnative ggml gpu llamacpp CUDA gptq llm-inference llm-serving

Python 144

1 year ago

abhinand5 / gptq_for_langchain

#LLM# A guide to using GPTQ models with langchain

artificial-intelligence gpt gptq langchain language-model llm quantization wizardlm

Jupyter Notebook 40

2 years ago

ziwang-com / zero-lora

#LLM# zero: zero-training LLM parameter tuning

gpt gptq llama llm lora

2 years ago

chinoll / chatsakura

#LLM# ChatSakura: Open-source multilingual conversational model.

gradio PyTorch bloom ChatGPT instruct-gpt llm gptq transformers

Python 14

2 years ago

hcd233 / Aris-AI-Model-Server

#LLM# An OpenAI-compatible API that integrates LLM, Embedding and Reranker.

artificial-intelligence embedding FastAPI gptq llm MLX rag reranker sentence-transformers vllm

Python 13

9 months ago

tripathiarpan20 / self-improvement-4all

Private self-improvement coaching with open-source LLMs

faiss langchain Python gptq transformers

Python 12

1 year ago

matlok-ai / bampe-weights

#LLM# This repository is for profiling, extracting, visualizing and reusing generative AI weights to hopefully build more accurate AI models and audit/scan weights at rest to identify knowledge domains for ...

artificial-intelligence blip2 foundational-models generative-ai gptq image-to-image llm safetensors stable-diffusion tiff transformers blender blender-python deep-learning

Python 9

1 year ago

seyf1elislam / LocalLLM_OneClick_Colab

#LLM# Run GGUF LLM models in the latest version of TextGen-webui

colab-notebook gguf gptq llm llms localllama localllm Python

Jupyter Notebook 9

6 months ago

Aqirito / A.L.I.C.E

#LLM# A.L.I.C.E (Artificial Labile Intelligence Cybernated Existence). A REST API for an AI companion, for building more complex systems

langchain langchain-python llm text-generation text-to-speech tts vits Anime artificial-intelligence Genshin Impact llms waifu FastAPI gptq huggingface-transformers pygmalion REST API

Python 8

2 months ago

bobazooba / shurale

#NLP# Conversational AI model for open-domain dialogs

cerebras ChatGPT deep-learning deep-neural-networks gpt gpt-4 gptq large-language-models llama llama2 llm mistral nlp openai PyTorch torch transformers vicuna

Python 4

1 year ago

SujanNeupane42 / NEPSE-Chatbot-Using-Retrieval-augmented-generation-and-reranking

#LLM# This project will develop a NEPSE chatbot using an open-source LLM, incorporating sentence transformers, a vector database and reranking.

faiss Flask gptq langchain llm Python retrieval-augmented-generation sentence-transformers vector-database

Jupyter Notebook 2

1 year ago

amajji / LLM-Quantization-Techniques-Absmax-Zeropoint-GPTQ-GGUF

#LLM# LLM quantization techniques: absmax, zero-point, GPTQ and GGUF

ggml gguf gptq llamacpp llm quantization quantization-aware-training

Jupyter Notebook 1

8 months ago

SujanNeupane42 / LLM_Quantization

#NLP# Quantizing LLMs using GPTQ

gptq huggingface llms machine-learning nlp quantization

Jupyter Notebook 0

1 year ago

ElDokmak / LLMs-variety

#LLM# Hands-on with some LLMs

llm huggingface-transformers langchain gptq llama-index openai groq llama mamba mistral mixtral

Jupyter Notebook 0

1 year ago