GitHub Chinese Community

Search results for "gpt-4v"

LLaVA
@haotian-liu

#Large language models# LLaVA is a large language-and-vision assistant with GPT-4V-level capabilities

gpt-4, chatbot, ChatGPT, llama, multimodal
Python 22.97 k
1 year ago

Related topics

gpt-4, ChatGPT, multimodal, Go, llava, openai, ChatGPT API, large-language-models, chatbot, gpt-3


vimGPT
@ishan0102

Browse the web with GPT-4V and Vimium

Python 2.67 k
9 months ago
GPT-4V-Act
@ddupont808

AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI

JavaScript 1.05 k
7 months ago
SoM
Microsoft (@microsoft)

[arXiv 2023] Set-of-Mark Prompting for GPT-4V and LMMs

Python 1.42 k
1 year ago
gpt4v-emotion
@zeroQiaoba

GPT-4V with Emotion

Python 93
2 years ago
feishu-openai
@ConnectAI-E

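#Large language models# 🎒 Feishu × (GPT-4 + GPT-4V + DALL·E-3 + Whisper) = a lightning-fast work experience 🚀 Voice conversations, role play, multi-topic discussions, image creation, spreadsheet analysis, document export 🚀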

ChatGPT, feishu-bot, Go, openai, ChatGPT API
Go 5.58 k
4 months ago
Medical-Help-App-using-GPT-4V
@AIAnytime

Medical Help App using GPT-4V

Python 25
2 years ago
RLAIF-V
@RLHF-V

[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

chatbot, gpt-4v, multimodal
Python 384
2 months ago
GPT-4V_OCR
@SCUT-DLVCLab

Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision)

Python 124
2 years ago
GPT4V-AD-Exploration
@PJLab-ADG

On the Road with GPT-4V(ision): Explorations of Utilizing Visual-Language Model as Autonomous Driving Agent

296
1 year ago
MM-Navigator
@zzxslp

GPT-4V in Wonderland: LMMs as Smartphone Agents

gpt4v, llm-agents
Python 133
1 year ago
Awesome-Multimodal-Prompts
@langgptai

#Awesome# Prompts for GPT-4V & DALL-E3 that make full use of their multimodal abilities. GPT4V Prompts, DALL-E3 Prompts.

ChatGPT, gpt4, multimodal, prompt-engineering, prompts
256
2 years ago
Q-Bench
@Q-Future

①[ICLR2024 Spotlight] (GPT-4V/Gemini-Pro/Qwen-VL-Plus+16 OS MLLMs) A benchmark for multi-modality LLMs (MLLMs) on low-level vision and visual quality assessment.

image-quality-assessment, large-language-models, low-level-vision
Jupyter Notebook 269
1 year ago
OmniParser
Microsoft (@microsoft)

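OmniParser is a screen parsing tool that parses user screenshots into structured, easy-to-understand elements, significantly enhancing GPT-4V's recognition capabilities.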

Jupyter Notebook 22.58 k
3 months ago
SeeAct
@OSU-NLP-Group

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).

agent
Python 759
5 months ago
HallusionBench
@tianyi-lab

#Large language models# [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

benchmark, vlms, gpt-4, gpt-4v, llava
Python 286
8 months ago
go-openai
@sashabaranov

#Large language models# OpenAI ChatGPT, GPT-3, GPT-4, DALL·E, Whisper API wrapper for Go

Go, gpt-3, openai, streaming-api
Go 10.13 k
12 days ago
zotero-gpt
@MuiseDestiny

GPT Meet Zotero.

gpt, zotero, zotero-plugin
TypeScript 6.36 k
11 days ago
Multimodal-GPT
OpenMMLab (@open-mmlab)

Multimodal-GPT

flamingo, gpt, gpt-4, llama, multimodal
Python 1.51 k
2 years ago
gpt-go
@hanyuancheung

#Large language models# OpenAI ChatGPT/GPT-4/GPT-3 SDK Go Client to Interact with the GPT-4/GPT-3 APIs.

ChatGPT, ChatGPT API, Go, gpt-3, gpt-4
Go 364
2 years ago
NExT-GPT
@NExT-GPT

#Large language models# Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model

ChatGPT, foundation-models, gpt-4, instruction-tuning, large-language-models
Python 3.53 k
2 months ago
Auto-GPT-Plugins (archived)
@Significant-Gravitas

Plugins for Auto-GPT

Python 3.88 k
1 year ago