[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
#自然语言处理#本项目为CLIP模型的中文版本,使用大规模中文数据进行训练(~2亿图文对),旨在帮助用户快速实现中文领域的图文特征&相似度计算、跨模态检索、零样本图片分类等任务
#搜索#Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for ...
#大语言模型#日本語LLMまとめ - Overview of Japanese LLMs
A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
#大语言模型#[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering
#计算机科学#Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)
#大语言模型#🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
#计算机科学#[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
#自然语言处理#A Framework of Small-scale Large Multimodal Models
#计算机科学#[ICLR 2024] Controlling Vision-Language Models for Universal Image Restoration. 5th place in the NTIRE 2024 Restore Any Image Model in the Wild Challenge.
Official implementation of SEED-LLaMA (ICLR 2024).
This is the third party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.
#自然语言处理#CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
#自然语言处理#CLIPort: What and Where Pathways for Robotic Manipulation
翻译 - CLIPort:机器人操作的路径和路径
#自然语言处理#多模态中文LLaMA&Alpaca大语言模型(VisualCLA)