#Computer Science#pix2tex: Using a ViT to convert images of equations into LaTeX code.
#Awesome#An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
#Large Language Models#Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
#Large Language Models#Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
#Computer Science#[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer-based networks.
#Computer Science#Turn any computer or edge device into a command center for your computer vision projects.
#Computer Science#A paper list of some recent Transformer-based CV works.
#Computer Science#:robot: PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
#Computer Science#Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer"
SimpleAICV: PyTorch training and testing examples.
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
#Computer Science#i. A practical application of Transformer (ViT) to 2-D physiological signal (EEG) classification tasks. It could also be tried with EMG, EOG, ECG, etc. ii. Including the attention of spatial dimension (c...
FFCS course registration made hassle free for VITians. Search courses and visualize the timetable on the go!
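#Computer Science#PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as foundational vision models such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.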
Official code for the papers "Reversible Column Networks" and "RevColv2"
MoH: Multi-Head Attention as Mixture-of-Head Attention
My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"