MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
#Data warehouse#Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop....
Translated description: The fastest way to access and manage PyTorch and TensorFlow datasets. Easily build scalable data pipelines. Leading Data 2.0 http://activeloop.ai
#Natural language processing#ModelScope: bring the notion of Model-as-a-Service to life.
#Large language model#[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance
#Large language model#Start building LLM-empowered multi-agent applications in an easier way.
A state-of-the-art-level open visual language model | multimodal pretrained model
#Computer science#Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch
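#Natural language processing#This project is a Chinese version of the CLIP model, trained on large-scale Chinese data (~200 million image-text pairs). It aims to help users quickly perform image-text feature and similarity computation, cross-modal retrieval, zero-shot image classification, and similar tasks in the Chinese-language domain.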
#Search#Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Open Source Routing Engine for OpenStreetMap
Chinese and English multimodal conversational language model
#Natural language processing#Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
#Natural language processing#[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
#Large language model#A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
#Computer science#Represent, send, store and search multimodal data
Translated description: The data structure for unstructured data
GPT4V-level open-source multi-modal model based on Llama3-8B
Mixture-of-Experts for Large Vision-Language Models
#Large language model#Project Page for "LISA: Reasoning Segmentation via Large Language Model"