Search results for "gpt4v" | GitHub Chinese Community

gpt4v-browsing

@unconv

Web Scraping with GPT-4 Vision API and Puppeteer

JavaScript 545

10 months ago


language-model generative-ai chatgpt cogvlm multi-modal gpt4v gpt4 agent llm pretrained-models

CogVLM2

@THUDM

GPT4V-level open-source multi-modal model based on Llama3-8B

cogvlm pretrained-models language-model multi-modal

Python 2.15k

3 months ago

GPT4V-Image-Captioner

@jiayev

Python 791

2 months ago

GPT4VN

@telexyz

Anyone can build their own chatbot through instruction tuning, with a 12 GB GPU (RTX 3060) and a few dozen MB of data

Python 111

1 year ago

GPT4Vis

@whwu95

GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?

Python 207

6 months ago

AppAgent

@mnotgod96

#Large Language Model# AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

agent ChatGPT generative-ai gpt4 gpt4v

Python 5.19k

4 months ago

gpt4V-scraper

@vdutts7

AI agent that can SEE 👁️, control, navigate, & do stuff for you on your browser.

JavaScript 258

9 months ago

GPT4V-AD-Exploration

@PJLab-ADG

On the Road with GPT-4V(ision): Explorations of Utilizing Visual-Language Model as Autonomous Driving Agent

288

9 months ago

GPT4Video

@gpt4video

Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation

Python 135

1 month ago

Gemini-vs-GPT4V

@Qi-Zhangyang

180

1 year ago

Awesome-Multimodal-Prompts

@yzfly

Prompts for GPT-4V & DALL-E3 to fully utilize their multi-modal abilities. GPT4V Prompts, DALL-E3 Prompts.

107

1 year ago

Programming languages

JavaScript
Python