video-understanding · GitHub Topics

#Computer Science#OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark

action-recognition temporal-action-localization PyTorch video-understanding tsn i3d slowfast ava spatial-temporal-action-detection benchmark tsm non-local deep-learning openmmlab video-classification

Python 4.55 k

8 months ago

jinwchoi / awesome-action-recognition

#Awesome#A curated list of action recognition and related area resources

Awesome Lists action-recognition action-detection activity-recognition video-understanding video-recognition video-processing object-recognition pose-estimation

3.88 k

2 years ago

OpenGVLab / Ask-Anything

#Large Language Models#[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

captioning-videos ChatGPT gradio langchain video-question-answering video-understanding stablelm chat Video big-model foundation-models large-language-models

Python 3.21 k

3 months ago

mit-han-lab / temporal-shift-module

[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding

acceleration low-latency temporal-modeling video-understanding efficient-model nvidia-jetson-nano tsm

Python 2.11 k

9 months ago

open-mmlab / mmaction

An open-source toolbox for action understanding based on PyTorch

action-recognition action-detection video-understanding PyTorch temporal-action-detection temporal-action-localization spatial-temporal-action-detection

Python 1.87 k

3 years ago

OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

foundation-models video-understanding vision-transformer action-recognition multimodal temporal-action-localization video-question-answering zero-shot-classification benchmark contrastive-learning self-supervised instruction-tuning video-clip

Python 1.8 k

4 days ago

PaddlePaddle / PaddleVideo

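A modular toolkit offering a rich set of video algorithm implementations, with industrial-grade optimization and applications for industries such as security, sports, internet, and media, including action localization and recognition, behavior analysis, smart cover selection, video annotation, and video tagging. It covers action recognition and video classification, action localization, action detection, multimodal text-video retrieval, and related techniques.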

video-recognition tsm slowfast tsn bmn action-recognition youtube-8m kinetics400 video-understanding activitynet action-detection temporal-action-detection ava

Python 1.61 k

2 months ago

yjxiong / temporal-segment-networks

Code & Models for Temporal Segment Networks (TSN) in ECCV 2016

action-recognition video-understanding

Python 1.55 k

4 years ago

MCG-NJU / VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

self-supervised-learning action-recognition video-understanding transformer vision-transformer PyTorch video-analysis neurips-2022

Python 1.47 k

1 year ago

yjxiong / tsn-pytorch

#Computer Science#Temporal Segment Networks (TSN) in PyTorch

action-recognition deep-learning video-understanding PyTorch

Python 1.07 k

6 years ago

TheShadow29 / awesome-grounding

#Natural Language Processing#awesome grounding: A curated list of research papers in visual grounding

computer-vision natural-language-processing grounding Awesome Lists papers arxiv video-understanding captioning-videos embodied-agent multimodal-deep-learning language-grounding Bukkit

1.07 k

2 years ago

PKU-YuanGroup / Chat-UniVi

[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

large-language-models video-understanding vision-language-model

Python 932

6 months ago

yjxiong / action-detection

temporal action detection with SSN

action-recognition action-detection video-understanding

Python 644

6 years ago

OpenGVLab / VideoMAEv2

[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

cvpr2023 foundation-model self-supervised-learning video-understanding action-detection action-recognition temporal-action-detection

Python 617

6 months ago

Vision-CAIR / MiniGPT4-video

Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding

video-question-answering video-understanding

Python 611

4 months ago

henghuiding / MeViS

[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions

multimodal-learning referring-expression-comprehension referring-expression-segmentation referring-video-object-segmentation video-understanding

Python 522

10 months ago

yoosan / video-understanding-dataset

#Data Warehouse#A collection of recent video understanding datasets, under construction!

video-understanding dataset computer-vision action-recognition

461

7 years ago

chihyaoma / Activity-Recognition-with-CNN-and-RNN

Temporal Segments LSTM and Temporal-Inception for Activity Recognition

activity-recognition video-understanding torch convolutional-neural-networks

Lua 442

5 years ago

MCG-NJU / TDN

[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition

action-recognition video-understanding video-classification cvpr2021 PyTorch temporal-modeling

Python 377

3 years ago

aws-samples / swift-chat

#Android#A lightning-fast, cross-platform AI chat application built with React Native.

TypeScript 365

7 days ago