📖 A curated list of awesome LLM/VLM inference papers with code, such as FlashAttention, PagedAttention, parallelism, etc. 🎉🎉
Official inference framework for 1-bit LLMs
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
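As a quick taste of what "run local LLMs" means in practice, here is a minimal sketch using GPT4All's Python bindings; the GGUF model name is illustrative and is downloaded on first use.

```python
# pip install gpt4all
from gpt4all import GPT4All

# The GGUF filename is illustrative; gpt4all downloads it on first use.
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")
with model.chat_session():
    reply = model.generate("Summarize PagedAttention in one sentence.",
                           max_tokens=128)
print(reply)
```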
#Natural Language Processing# Running Llama 2 and other open-source LLMs locally on CPU for document Q&A
A lightweight LLM inference framework
TinyChatEngine: On-Device LLM Inference Library
Large Language Model (LLM) Inference API and Chatbot
#Large Language Model# 33B Chinese LLM, DPO, QLoRA, 100K context; AirLLM 70B inference on a single 4 GB GPU
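AirLLM fits a 70B model into a few GB of VRAM by streaming one transformer layer to the GPU at a time. A hedged sketch, assuming the airllm package's AutoModel entry point as shown in its README (treat the exact names as assumptions):

```python
# pip install airllm  -- names follow the AirLLM README; treat as assumptions.
from airllm import AutoModel

# Layers are streamed to the GPU one at a time, so peak VRAM stays around
# 4 GB, at the cost of per-token latency.
model = AutoModel.from_pretrained("garage-bAInd/Platypus2-70B-instruct")
input_ids = model.tokenizer(["What is PagedAttention?"],
                            return_tensors="pt", truncation=True).input_ids
output = model.generate(input_ids.cuda(), max_new_tokens=20)
print(model.tokenizer.decode(output[0]))
```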
💬 Chatbot web app + HTTP and WebSocket endpoints for LLM inference with the Petals client
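For context, querying a Petals swarm from Python looks roughly like standard transformers usage, except that forward passes run across volunteer GPUs; the checkpoint name below is an illustrative public-swarm model.

```python
# pip install petals
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

name = "petals-team/StableBeluga2"  # illustrative public-swarm checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoDistributedModelForCausalLM.from_pretrained(name)

# Generation is distributed: each block of layers may run on a different peer.
inputs = tokenizer("A cat sat on", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0]))
```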
#Large Language Model# FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
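The ~4x figure follows from single-token decoding being memory-bound: per-token latency is roughly weight bytes divided by memory bandwidth, and INT4 weights are a quarter the size of FP16. A back-of-envelope sketch with illustrative numbers (not measurements):

```python
# Decode-time inference is dominated by streaming the weights from HBM.
params = 7e9          # 7B-parameter model (illustrative)
bandwidth = 1.0e12    # 1 TB/s memory bandwidth (illustrative)

fp16_bytes = params * 2    # 2 bytes per weight
int4_bytes = params / 2    # 0.5 bytes per weight

t_fp16 = fp16_bytes / bandwidth   # ~14 ms per token
t_int4 = int4_bytes / bandwidth   # ~3.5 ms per token
print(f"speedup ~ {t_fp16 / t_int4:.1f}x")  # -> 4.0x
```

Past a batch of roughly 16-32 tokens the kernel becomes compute-bound rather than bandwidth-bound, which is why the near-ideal speedup is quoted only up to medium batch sizes.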
Open-source, local, self-hosted, and highly optimized inference server supporting ASR/STT, TTS, and LLMs over WebRTC, REST, and WebSocket
Inference code for the LLaMA model
Universal and Transferable Attacks on Aligned Language Models
The LLM vulnerability scanner
#Large Language Model# LLM fine-tuning with PEFT
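As a reference point, parameter-efficient fine-tuning with the peft library boils down to wrapping a base model in an adapter such as LoRA; the base model and target modules below are illustrative choices.

```python
# pip install peft transformers
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # illustrative
config = LoraConfig(r=8, lora_alpha=16,
                    target_modules=["q_proj", "v_proj"],  # illustrative targets
                    lora_dropout=0.05, task_type="CAUSAL_LM")

# Only the small low-rank adapter matrices are trained; the base is frozen.
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```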
#Large Language Model# [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
LLM as a Chatbot Service
#Large Language Model# Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such ...
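A hedged sketch of the drop-in usage ipex-llm advertises, assuming its transformers-compatible loader and the "xpu" device string as shown in its README; the model choice is illustrative.

```python
# pip install ipex-llm[all]  -- API per the ipex-llm README
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

path = "meta-llama/Llama-2-7b-chat-hf"  # illustrative model choice
# Weights are quantized to 4-bit on load, then moved to the Intel GPU.
model = AutoModelForCausalLM.from_pretrained(path, load_in_4bit=True)
model = model.to("xpu")
tokenizer = AutoTokenizer.from_pretrained(path)

input_ids = tokenizer("What is an iGPU?", return_tensors="pt").input_ids.to("xpu")
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```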