inference · GitHub Topics

#LLM# A high-throughput and memory-efficient inference and serving engine for LLMs

gpt llm PyTorch llmops mlops model-serving transformer llm-serving inference llama amd rocm CUDA inferentia trainium tpu xpu hpu deepseek qwen

Python 44.48 k

4 hours ago

hpcaitech / ColossalAI

#Computer science# A training system for large AI models that integrates efficient parallelization techniques.

deep-learning hpc large-scale data-parallelism pipeline-parallelism model-parallelism ai big-model distributed-computing inference heterogeneous-training foundation-models

Python 40.77 k

3 days ago

ggml-org / whisper.cpp

A C++ port of OpenAI's Whisper speech recognition model.

openai speech-to-text transformer Whisper inference speech-recognition

C++ 39.18 k

1 day ago

deepspeedai / DeepSpeed

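#Computer science# DeepSpeed Chat: one-click RLHF training that makes your ChatGPT-like models with hundreds of billions of parameters 15x faster and cheaper

deep-learning PyTorch gpu machine-learning billion-parameters data-parallelism model-parallelism inference pipeline-parallelism compression mixture-of-experts trillion-parameters zero

Python 37.88 k

15 hours ago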

google-ai-edge / mediapipe

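#Android# MediaPipe is a cross-platform machine learning solution for live and streaming media. It provides face recognition, human pose recognition and tracking, object detection, selfie segmentation, instant motion tracking, and more.

mediapipe C++ computer-vision deep-learning Android video-processing audio-processing mobile-development machine-learning inference graph-framework graph-based calculator framework pipeline-framework stream-processing perception

C++ 29.33 k

21 hours ago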

Tencent / ncnn

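#Android# ncnn is a high-performance neural network inference framework heavily optimized for mobile platforms

inference high-preformance simd arm-neon deep-learning ai Android iOS ncnn vulkan neural-network caffe mxnet PyTorch onnx darknet Tensorflow mlir Keras RISC-V

C++ 21.28 k

2 hours ago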

SYSTRAN / faster-whisper

#Computer science# Faster Whisper transcription with CTranslate2

deep-learning inference quantization speech-recognition speech-to-text transformer Whisper openai

Python 15.36 k

23 days ago

stas00 / ml-engineering

#LLM# Machine Learning Engineering Open Book

PyTorch slurm large-language-models llm machine-learning scalability transformers machine-learning-engineering mlops ai inference training

Python 13.38 k

5 days ago

gvergnaud / ts-pattern

🎨 The exhaustive Pattern Matching library for TypeScript, with smart type inference.

pattern-matching TypeScript ts pattern matching inference type-inference exhaustive conditions branching JavaScript

TypeScript 13.28 k

15 days ago

sgl-project / sglang

#LLM# SGLang is a fast serving framework for large language models and vision language models.

CUDA inference llama llava llm llm-serving moe PyTorch transformer vlm llama3 llama3-1 deepseek deepseek-llm deepseek-v3 deepseek-r1 deepseek-r1-zero

Python 13.12 k

10 minutes ago

NVIDIA / TensorRT

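#Computer science# NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

tensorrt Nvidia deep-learning inference gpu-acceleration

C++ 11.45 k

1 month ago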

aws / amazon-sagemaker-examples

#Computer science# Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.

sagemaker Amazon Web Services reinforcement-learning machine-learning deep-learning Example Jupyter Notebook mlops data-science training inference

Jupyter Notebook 10.44 k

23 days ago

huggingface / text-generation-inference

#Natural language processing# Large Language Model Text Generation Inference

bloom nlp PyTorch inference gpt deep-learning transformer falcon starcoder

Python 10 k

1 day ago

triton-inference-server / server

#Computer science# The Triton Inference Server provides an optimized cloud and edge inferencing solution.

inference gpu machine-learning deep-learning cloud datacenter Edge

Python 9.05 k

15 hours ago

dusty-nv / jetson-inference

#Computer science# Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

deep-learning inference computer-vision embedded image-recognition object-detection segmentation jetson jetson-tx1 jetson-tx2 jetson-xavier Nvidia tensorrt caffe video-analytics Robotics machine-learning jetson-nano

C++ 8.22 k

6 months ago

openvinotoolkit / openvino

#Natural language processing# OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

inference deep-learning openvino ai computer-vision diffusion-models generative-ai llm-inference nlp performance-boost speech-recognition stable-diffusion deploy-ai optimize-ai transformers yolo recommendation-system good-first-issue

C++ 8.11 k

1 day ago

xorbitsai / inference

#LLM# Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any...

ggml PyTorch chatglm deployment flan-t5 llm wizardlm ai machine-learning Whisper inference openai-api mistral gemma llama llamacpp vllm qwen llama3 glm4

Python 7.46 k

14 hours ago

Linzaer / Ultra-Light-Fast-Generic-Face-Detector-1MB

#Face recognition# 💎1MB lightweight face detection model

face-detection arm inference mnn ncnn

Python 7.32 k

1 year ago

gcanti / io-ts

Runtime type system for IO decoding/encoding

TypeScript validation inference types runtime

TypeScript 6.76 k

4 months ago

Trusted-AI / adversarial-robustness-toolbox

#Computer science# Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

Python attack adversarial-machine-learning poisoning trusted-ai ai extraction adversarial-attacks adversarial-examples evasion inference privacy red-team blue-team machine-learning

Python 5.18 k

9 days ago