FlashInfer: Kernel Library for LLM Serving
#LLM# LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Serving multiple LoRA-finetuned LLMs as one
Serverless LLM Serving for Everyone.
#NLP# Large language models (LLMs) made easy; EasyLM is a one-stop solution for pre-training, finetuning, evaluating, and serving LLMs in JAX/Flax.
#LLM# Dynamic RAG for enterprise. Ready to run with Docker, ⚡ in sync with SharePoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
Yahoo! Cloud Serving Benchmark
Kubernetes-based, scale-to-zero, request-driven compute
favicon serving middleware
Serving TensorFlow models with TensorFlow Serving 📙
faiss serving :)
A flexible, high-performance carrier for machine learning models (the serving and deployment framework for PaddlePaddle)
A framework for serving GraphQL from Laravel
A low-latency prediction-serving system