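#Computer Science# Jina is a deep-learning-based search framework that supports many data types, such as images, video, long text, and PDFs.
#Large Language Models# LLaVA is a large language-and-vision assistant with GPT-4V-level capabilities.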
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
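NeMo: a toolkit for conversational AI
#Large Language Models# 24/7 AI screen and microphone recording. Build AI apps with full context. Works with Ollama. An alternative to Rewind.ai. Open. Secure. You own your data. Built in Rust.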
Visualize streams of multimodal data. Free, fast, easy to use, and simple to integrate. Built in Rust.
#Computer Science# A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
#Natural Language Processing# This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI)
Notes for software engineers getting up to speed on new AI developments. Serves as a datastore for https://latent.space writing and product brainstorming, but has cleaned up canonical references under ...
#Large Language Models# Plug-and-play implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that elevates model reasoning by at least 70%
#Large Language Models# Use PEFT or full-parameter training to fine-tune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vis...
Build real-time multimodal AI applications 🤖🎙️📹
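Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Center for Cognitive Computing and Natural Language at the IDEA Research Institute, serving as infrastructure for Chinese AIGC and cognitive intelligence.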
🪩 Create Disco Diffusion artworks in one line
#Computer Science# Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.
#Large Language Models# Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
#Large Language Models# InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, ...
#Android# Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
#Computer Science# Represent, send, store and search multimodal data
The data structure for unstructured data
#Large Language Models# InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output