#Computer Science#Turn any computer or edge device into a command center for your computer vision projects.
#Large Language Model#The goal of RamaLama is to make working with AI boring.
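If RamaLama's container-based serving behaves like other local LLM servers, a served model can be queried over an OpenAI-compatible REST route. The sketch below assumes a server started with something like `ramalama serve <model>`, listening on localhost:8080 and exposing `/v1/chat/completions`; the port and route are assumptions, not documented guarantees.

```python
import json
import urllib.request

# Assumption: a local model server (e.g. started with `ramalama serve <model>`)
# listens on localhost:8080 and exposes an OpenAI-compatible chat route; the
# port and path may differ in your setup.
URL = "http://localhost:8080/v1/chat/completions"

payload = {
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "temperature": 0.2,
}
req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

# OpenAI-compatible servers return a list of choices, each with a message.
print(body["choices"][0]["message"]["content"])
```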
#Computer Science#The simplest way to serve AI/ML models in production
#Computer Science#An open-source computer vision framework to build and deploy apps in minutes
#Natural Language Processing#A Python model deployment and inference library. The simplest model inference server ever.
#Computer Science#A REST API for Caffe using Docker and Go
#Computer Science#A no-code object detection inference API using YOLOv3 and YOLOv4 with the Darknet framework.
#Computer Science#A no-code object detection inference API using YOLOv4 and YOLOv3 with OpenCV.
Work with LLMs in a local environment using containers
#Computer Science#An object detection inference API using the TensorFlow framework.
Serving AI/ML models in the open standard formats PMML and ONNX with both HTTP (REST API) and gRPC endpoints
#Computer Science#ONNX Runtime Server provides TCP and HTTP/HTTPS REST APIs for ONNX inference.
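To make the ONNX serving entries above concrete, here is a minimal local sketch of roughly what such a server does for each request, using the onnxruntime Python package directly rather than the server's own REST API. The model path `model.onnx` and the float32 input dtype are placeholders; any exported ONNX model will do.

```python
import numpy as np
import onnxruntime as ort

# Assumption: "model.onnx" is a placeholder for any exported ONNX model file.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Read the graph's declared input so the feed matches it; dynamic dimensions
# (reported as strings) are filled in with 1 here.
inp = session.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
feed = {inp.name: np.zeros(shape, dtype=np.float32)}  # float32 is an assumption

# A REST/gRPC ONNX server performs essentially this call for every request.
outputs = session.run(None, feed)
print([o.shape for o in outputs])
```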
#Computer Science#Orkhon: ML Inference Framework and Server Runtime
K3ai is a lightweight, fully automated AI infrastructure-in-a-box solution that lets anyone experiment quickly with Kubeflow pipelines. K3ai is perfect for anything from edge devices to laptops.
#Computer Science#Deploy DL/ML inference pipelines with minimal extra code.
#Computer Science#A standalone inference server for trained Rubix ML estimators.
#Downloader#Wingman is the fastest and easiest way to run Llama models on your PC or Mac.
#Large Language Model#Friendli: the fastest serving engine for generative AI
Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch), including a converter from PyTorch -> ONNX -> TensorRT and inference pipelines (TensorRT, Triton server -...
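A client for a Triton pipeline like the one above might look like the following sketch, using the official tritonclient package. The model name `craft`, the tensor names `input` and `output`, and the 1x3x768x768 input shape are assumptions; the deployed model's metadata gives the real names and shapes.

```python
import numpy as np
import tritonclient.http as httpclient

# Assumption: Triton runs on localhost:8000 and serves a CRAFT text detection
# model. Model name, tensor names, and input shape below are placeholders;
# query client.get_model_metadata("craft") for the deployed model's actual ones.
client = httpclient.InferenceServerClient(url="localhost:8000")

image = np.zeros((1, 3, 768, 768), dtype=np.float32)  # preprocessed image batch
inp = httpclient.InferInput("input", list(image.shape), "FP32")
inp.set_data_from_numpy(image)

result = client.infer(model_name="craft", inputs=[inp])
scores = result.as_numpy("output")  # CRAFT emits region/affinity score maps
print(scores.shape)
```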
#Computer Science#Full-stack machine learning inference template