#LLM# A high-throughput and memory-efficient inference and serving engine for LLMs
#LLM# The easiest way to serve AI apps and models: build model inference APIs, job queues, LLM apps, multi-model pipelines, and more!
#Computer Science# Useful notes and references on deploying deep-learning-based models in production.
#Computer Science# Standardized Serverless ML Inference Platform on Kubernetes
#Computer Science# FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on a...
#NLP# LightLLM is a Python-based LLM inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys...
#LLM# Multi-LoRA inference server that scales to thousands of fine-tuned LLMs
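Multi-LoRA serving works because a LoRA fine-tune is just the shared base weights plus a small low-rank update, W' = W + B·A, so thousands of adapters can ride on one resident base model. A minimal pure-Python sketch of that idea with toy matrices (not code from any of the servers listed here):

```python
# Toy illustration of the low-rank (LoRA) update W' = W + B @ A.
# Pure Python, no dependencies; matrices are lists of rows.
# Conceptual sketch only, not a serving framework's actual code.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def apply_lora(W, B, A, scale=1.0):
    """Return W + scale * (B @ A) without modifying W."""
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# One shared 2x2 base weight; two rank-1 adapters (B: 2x1, A: 1x2).
# Adapter names are made up for illustration.
W = [[1.0, 0.0],
     [0.0, 1.0]]
adapters = {
    "customer-a": ([[1.0], [0.0]], [[0.5, 0.5]]),
    "customer-b": ([[0.0], [1.0]], [[0.25, 0.0]]),
}

# A multi-LoRA server keeps W resident and swaps in the tiny (B, A)
# pair per request instead of loading a full fine-tuned model.
for name, (B, A) in adapters.items():
    print(name, apply_lora(W, B, A))
```

The memory win is the point: each adapter above stores 4 numbers instead of a second full copy of W, which is why one GPU can hold the base model once and page adapters in and out cheaply.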
🏕️ Reproducible development environment
#LLM# AICI: Prompts as (Wasm) Programs
Olares: An Open-Source Sovereign Cloud OS for Local AI
#Computer Science# MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates t...
#Computer Science# Hopsworks - Data-Intensive AI platform with a Feature Store
#Computer Science# The simplest way to serve AI/ML models in production
#LLM# A highly optimized LLM inference acceleration engine for Llama and its variants.
#LLM# A high-performance ML model serving framework offering dynamic batching and CPU/GPU pipelines to fully utilize your compute resources
#Computer Science# Model Deployment at Scale on Kubernetes 🦄️
#LLM# A throughput-oriented, high-performance serving framework for LLMs
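Dynamic batching, which several of the frameworks above advertise, groups requests that arrive close together so the model runs once per batch instead of once per request. A minimal sketch of the usual policy (flush on max batch size or on a wait deadline; toy code under assumed names, not any framework's API):

```python
import time

# Toy dynamic batcher: collects requests until either max_size is
# reached or max_wait seconds have elapsed since the first queued
# request, then emits the whole batch at once. Conceptual sketch only.

class DynamicBatcher:
    def __init__(self, max_size=4, max_wait=0.01, clock=time.monotonic):
        self.max_size = max_size
        self.max_wait = max_wait
        self.clock = clock          # injectable clock for testing
        self.queue = []
        self.first_arrival = None

    def submit(self, request):
        """Queue a request; return a full batch if one is ready."""
        if not self.queue:
            self.first_arrival = self.clock()
        self.queue.append(request)
        if len(self.queue) >= self.max_size:
            return self._flush()
        return None

    def poll(self):
        """Flush a partial batch once the wait deadline has passed."""
        if self.queue and self.clock() - self.first_arrival >= self.max_wait:
            return self._flush()
        return None

    def _flush(self):
        batch, self.queue = self.queue, []
        self.first_arrival = None
        return batch

# Usage: a size-triggered flush.
b = DynamicBatcher(max_size=3, max_wait=1.0)
assert b.submit("r1") is None
assert b.submit("r2") is None
print(b.submit("r3"))  # -> ['r1', 'r2', 'r3']
```

The `max_wait` knob is the latency/throughput trade-off: a longer wait yields bigger batches (better GPU utilization) at the cost of higher tail latency for the first request in the batch.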
#Data Warehouse# An open-source DevOps tool for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI artifact.
#Computer Science# A scalable inference server for models optimized with OpenVINO™