#Large Language Models#Adding guardrails to large language models.
#Large Language Models#InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, ...
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
#Awesome#Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
[CVPR 2024 Highlight] GenAD: Generalized Predictive Model for Autonomous Driving & Foundation Models in Autonomous System
Official implementation of SEED-LLaMA (ICLR 2024).
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
#Computer Science#The official repo for [TGRS'22] "Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model"
The Clay Foundation Model - An open-source AI model and interface for Earth
Towards a general-purpose foundation model for computational pathology - Nature Medicine
Visual Med-Alpaca is an open-source, multimodal foundation model designed specifically for the biomedical domain, built on LLaMa-7B.
#Large Language Models#Paper List of Pre-trained Foundation Recommender Models
Code for the paper "LLark: A Multimodal Instruction-Following Language Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, and Rachel Bittner.
3D Occupancy Prediction Benchmark in Autonomous Driving
A vision-language foundation model for computational pathology - Nature Medicine
#Awesome#A curated list of awesome leaderboard-oriented resources for foundation models
A comprehensive survey of forging vision foundation models for autonomous driving, including challenges, methodologies, and opportunities.