This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Semantic Propositional Image Caption Evaluation
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
High-resolution Networks for the Fully Convolutional One-Stage Object Detection (FCOS) algorithm
A TensorFlow implementation of a CNN-LSTM image caption generator architecture that achieves close to state-of-the-art results on the MSCOCO dataset.
Repository for MSCOCO experiments for "Unsupervised Hard Example Mining from Videos for Improved Object Detection" (https://arxiv.org/abs/1808.04285)
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
SWA Object Detection
VarifocalNet: An IoU-aware Dense Object Detector
Keras Fully Convolutional Neural Network MSCOCO Food Segmentation
Deep Hashing Algorithm for Cross-modal Images and Text Retrieval