MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Paddle Multimodal Integration and eXploration, supporting mainstream multimodal tasks, including end-to-end large-scale multimodal pretrained models and a diffusion model toolbox. Equipped with high per...
[CVPR'25] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
#Large Language Models# Explore LLM deployment based on AXera's AI chips
PicQ: Demo for MiniCPM-o 2.6 to answer questions about images using natural language.
Colaboratory sample for the lightweight VLM MiniCPM-V2.6
VidiQA: Demo for MiniCPM-V 2.6 to answer questions about videos using natural language.