#Computer Science#BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
Translation - Real-time multi-object tracker using YOLO v5 and Deep SORT
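#Large Language Models#Effortless data labeling with AI support from Segment Anything and other awesome models.
#Natural Language Processing#This project is the Chinese version of the CLIP model, trained on large-scale Chinese data (~200 million image-text pairs). It aims to help users quickly implement tasks in the Chinese domain such as image-text feature and similarity computation, cross-modal retrieval, and zero-shot image classification.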
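#Search#Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Open-source push notification service that needs no app; on iOS 14+ it works by scanning a code. Also supports Quick Apps, iOS and Mac clients, Android clients, and custom-built devices.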
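#Computer Science#OpenMMLab Pre-training Toolbox and Benchmark
Translation - OpenMMLab image classification toolbox and benchmark
#Natural Language Processing#Chinese NLP solutions (large models, data, models, training, inference)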
#Computer Science#Collection of AWESOME vision-language models for vision tasks
#Computer Science#Easily compute clip embeddings and build a clip retrieval system with them
#Large Language Models#Open-source evaluation toolkit of large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
#Large Language Models#Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥
🥂 Gracefully face hCaptcha challenge with MoE(ONNX) embedded solution.
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for ...
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
#Vector Search Engine#Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
#Natural Language Processing#This series will take you on a journey from the fundamentals of NLP and Computer Vision to the cutting edge of Vision-Language Models.
#Android#Stable Diffusion in NCNN with C++, supporting txt2img and img2img