#Computer Science#pix2tex: Using a ViT to convert images of equations into LaTeX code.
#Awesome#A comprehensive list of Vision Transformer/Attention work, including papers, code, and related websites
#Large Language Models#Towhee is a framework dedicated to making neural data processing pipelines simple and fast.
#Large Language Models#Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
#Computer Science#[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer-based networks.
#Computer Science#Turn any computer or edge device into a command center for your computer vision projects.
#Computer Science#:robot: PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+
#Computer Science#A paper list of some recent Transformer-based CV works.
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
#Computer Science#Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer"
SimpleAICV: PyTorch training and testing examples.
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
FFCS course registration made hassle-free for VITians. Search courses and visualize the timetable on the go!
#Computer Science#i. A practical application of Transformer (ViT) to 2-D physiological signal (EEG) classification tasks. It could also be tried with EMG, EOG, ECG, etc. ii. Including the attention of spatial dimension (c...
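#Computer Science#PASSL includes self-supervised image algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as foundational vision algorithms such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2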
Official code for "Reversible Column Networks" and "RevColv2"
My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"
MoH: Multi-Head Attention as Mixture-of-Head Attention