Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
#Large Language Model#Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥
The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."
#Computer Science#Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
Real-time and accurate open-vocabulary end-to-end object detection
[ICLR 2023 Oral] Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model
#Natural Language Processing#The online version is temporarily unavailable because we cannot afford the key. You can clone and run it locally. Note: we set a default OpenAI key. If the key exceeds its plan quota and becomes invalid, please tell us. T...
[ICLR 2024 🔥] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Official implementation of "Segment Any Anomaly without Training via Hybrid Prompt Regularization (SAA+)".
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
[SIGGRAPH Asia 2024 (Journal Track)] StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal
#Time Series Database#A unified multi-task time series model.
[WACV 2025] Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
#Natural Language Processing#Zero- and few-shot named entity and relationship recognition
API for the GPT-J language model 🦜, including a FastAPI backend and a Streamlit frontend
PyTorch Implementation of StyleSinger (AAAI 2024): Style Transfer for Out-of-Domain Singing Voice Synthesis
#Awesome#A curated list of awesome instruction tuning datasets, models, papers and repositories.
A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT.
PyTorch Implementation of TCSinger (EMNLP 2024): Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control