#计算机科学#A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are continuously improving the project. Welcome to PR the works (pape...
#大语言模型#A curated list for Efficient Large Language Models
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
#大语言模型#A list of papers, docs, codes about efficient AIGC. This repo is aimed to provide the info for efficient AIGC research, including language and vision, we are continuously improving the project. Welcom...
This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.
[WINNER! 🏆] Psychopathology FER Assistant. Because mental health matters. My project submission for #TFWorld TF 2.0 Challenge at Devpost.
[ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binarization.
[NeurIPS 2023 Spotlight] This project is the official implementation of our accepted NeurIPS 2023 (spotlight) paper QuantSR: Accurate Low-bit Quantization for Efficient Image Super-Resolution.
The official implementation of the ICML 2023 paper OFQ-ViT
#大语言模型#Chat to LLaMa 2 that also provides responses with reference documents over vector database. Locally available model using GPTQ 4bit quantization.
#计算机科学#A tutorial of model quantization using TensorFlow
PyTorch implementation of "BiDense: Binarization for Dense Prediction," A binary neural network for dense prediction tasks.
AI Engineering: Annotated NBs to dive into Self-Attention, In-Context Learning, RAG, Knowledge-Graphs, Fine-Tuning, Model Optimization, and many more.
Enterprise multi-agent framework for secure, borderless data collaboration with zero-trust and federated learning-lightweight edge-ready.
Quantization is a technique to reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types like 8-bit integer (int8) instea...
#大语言模型#Automated Jupyter notebook solution for batch converting Large Language Models to GGUF format with multiple quantization options. Built on llama.cpp with HuggingFace integration.
🧠 A comprehensive toolkit for benchmarking, optimizing, and deploying local Large Language Models. Includes performance testing tools, optimized configurations for CPU/GPU/hybrid setups, and detailed...
This project explores generating high-quality images using depth maps and conditioning techniques like Canny edges, leveraging Stable Diffusion and ControlNet models. It focuses on optimizing image ge...