基于模块化的设计,提供丰富的视频算法实现、产业级的视频算法优化与应用,包括安防、体育、互联网、媒体等行业的动作定位与识别、行为分析、智能封面、视频标注、视频打标签等,涵盖动作识别与视频分类、动作定位、动作检测、多模态文本视频检索等技术。
[AAAI 2023 (Oral)] CrissCross: Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity
[AAAI 2024] XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning.
A Human Action Recognition pipeline using MMAction2 and kinetics400 dataset. MMAction2 is an open-source toolbox for video understanding based on PyTorch.
IE Fitizens Corporate Project