#Computer Science#OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
#Awesome#A curated list of action recognition and related area resources
#Large Language Models#[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
An open-source toolbox for action understanding based on PyTorch
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
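Built on a modular design, it provides a rich set of video algorithm implementations along with industrial-grade video algorithm optimizations and applications. These include action localization and recognition, behavior analysis, smart cover generation, video annotation, and video tagging for industries such as security, sports, internet, and media. It covers techniques such as action recognition and video classification, action localization, action detection, and multimodal text-video retrieval.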
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
#Computer Science#Temporal Segment Networks (TSN) in PyTorch
#Natural Language Processing#awesome grounding: A curated list of research papers in visual grounding
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Temporal action detection with SSN
Official code for the Goldfish model for long video understanding and MiniGPT4-video for short video understanding
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
#Data Warehouse#A collection of recent video understanding datasets, under construction!
Temporal Segments LSTM and Temporal-Inception for Activity Recognition
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021)