#Large Language Models#[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
#Natural Language Processing#awesome grounding: A curated list of research papers in visual grounding
#Natural Language Processing#[CVPR20] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606)
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. (CVPR 2023)
PyTorch Implementation of Consensus-based Sequence Training for Video Captioning
#Natural Language Processing#[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision
A tool for downloading from public image boards (which allow scraping) / preview your images & tags / edit your images & tags. Additional tabs for downloading other desired code repositories as well a...
Transcription and annotation interface for recorded audio or video files
#Natural Language Processing#An image and video description generator using a CNN-RNN based architecture.
M-VAD Names Dataset. Multimedia Tools and Applications (2019)
Caption generator for live camera feed
Sample app to add captions to an uploaded video. From api.video (https://api.video)
Online professional courses that are captioned and/or subtitled
Video Search using Natural Language
#Large Language Models#Generate TikTok- and Instagram-tailored captions and hashtags for your videos using the power of some super creative robots up in the clouds ☁️ 🤖 💬 ☁️
A multilingual automatic speech recognition and video captioning tool using faster-whisper. Supports real-time translation to English. Runs on a consumer-grade CPU.