PaddleClas is an image recognition toolset built for both industry and academia.
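#Computer Science#PASSL includes self-supervised image algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as foundational vision algorithms such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2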
#Computer Science#HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision
#Face Recognition#Paddle Large Scale Classification Tools, supporting ArcFace, CosFace, PartialFC, and Data Parallel + Model Parallel training. Models include ResNet, ViT, Swin, DeiT, CaiT, FaceViT, MoCo, MAE, ConvMAE, and CAE.
An image model zoo for PaddlePaddle.
(Unofficial) PyTorch implementation of "Training Vision Transformers for Image Retrieval" (El-Nouby, Alaaeldin, et al. 2021).
[CVPR 2024] Code for our Paper "DeiT-LT: Distillation Strikes Back for Vision Transformer training on Long-Tailed Datasets"
2D human pose estimation using transformers, implemented in PyTorch.
PyTorch implementation of some vision transformers, trained on CIFAR-10.
[CVPR'24] Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression
Image Classification Tutorial: ConvNeXt: 98.8% on CIFAR10 and 92.4% on CIFAR100; ResNet18: 95.6% on CIFAR10 and 79.1% on CIFAR100
#Computer Science#A repository for DeiT-pytorch-model, which can be used to train on your own image dataset
#Computer Science#This project focuses on analyzing several vision transformers, examining their distinctive properties and evaluating how well they perform on a common dataset. The st...
#Computer Science#Vision Transformer for TensorFlow 2
#Natural Language Processing#Final assignment in the NLP course at the Technion (IEM097215). In this assignment we propose a novel architecture to handle both Text-to-Image translation and Image-to-Text translation tasks on paire...
Image classification with DeiT model, including data preprocessing, k-fold CV, early stopping and model saving.
This repository holds the downstream task of face mask classification, performed on a self-curated custom dataset with various state-of-the-art deep learning models like ViT, BEiT, DeiT, LeViT, ConvNeXt...
Implementation of a paper related to Vision Transformer
Image captioning with pretrained encoder on MSCOCO.
A repository for Agent-Attention-Models based on the PyTorch framework, which can be used to train on your own image datasets.