#Computer Science#LAVIS - A One-stop Library for Language-Vision Intelligence
#Large Language Models#FinRobot: An Open-Source AI Agent Platform for Financial Analysis using LLMs 🚀 🚀 🚀
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
#Computer Science#[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
#Computer Science#Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
#Large Language Models#A collection of the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos. Recommendations are welcome...
#Computer Science#A flexible package for multimodal deep learning that combines tabular data with text and images using Wide and Deep models in PyTorch
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
#Natural Language Processing#awesome grounding: A curated list of research papers in visual grounding
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.
A collection of resources on applications of multi-modal learning in medical imaging.
#Computer Science#Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]
#Natural Language Processing#This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
#Computer Science#A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
#Computer Science#Towards Generalist Biomedical AI
#Computer Science#Reference mapping for single-cell genomics
#Large Language Models#Paper List of Pre-trained Foundation Recommender Models
Deep-learning-based content moderation for text, audio, video, and image inputs.
Multimodal Sarcasm Detection Dataset