florence-2 · GitHub Topics

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

captioning fine-tuning florence-2 multimodal objectdetection paligemma phi-3-vision transformers vision-and-language vqa qwen2-vl

Python 2.54 k

6 days ago

jhc13 / taggui

Tag manager and captioner for image datasets

image-captioning pyside6 stable-diffusion llava cogvlm florence-2

Python 959

2 months ago

D-Ogi / WatermarkRemover-AI

AI-Powered Watermark Remover using Florence-2 and LaMA Models: A Python application leveraging state-of-the-art deep learning models to effectively remove watermarks from images with a user-friendly P...

florence-2 inpainting

Python 163

3 months ago

autodistill / autodistill-grounded-sam-2

Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models.

florence-2

Python 119

8 months ago

Ravi-Teja-konda / Surveillance_Video_Summarizer

#Large language models# VLM-driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 Vision-Language Model. Includes a Gradio-based interface for que...

artificial-intelligence ChatGPT florence-2 gpt-4 gradio gradio-python-llm huggingface summarization Video vision-and-language vlm

Python 106

7 months ago

Damarcreative / rem-wm

Watermark remover tool that leverages the capabilities of Microsoft Florence and Lama Cleaner models.

florence-2 watermark

Python 72

3 months ago

autodistill / autodistill-florence-2

Use Florence 2 to auto-label data for use in training fine-tuned object detection models.

florence-2 object-detection zero-shot-object-detection

Python 63

8 months ago

retkowsky / florence-2

Florence-2

Azure florence-2

Jupyter Notebook 61

2 months ago

anyantudre / Florence-2-Vision-Language-Model

#Computer science# Florence-2 is a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks.

computer-vision deep-learning florence-2 huggingface vision-language vision-language-model vision-transformer vision-transformer-models

Jupyter Notebook 42

9 months ago

fireicewolf / wd-llm-caption-cli

A Python-based CLI tool for captioning images with WD series, Joy-caption-pre-alpha, Meta Llama 3.2 Vision Instruct, and Qwen2 VL Instruct models.

qwen2-vl florence-2

Python 34

1 month ago

sayedmohamedscu / Vision-language-models-VLM

Vision-language model fine-tuning notebooks and use cases (PaliGemma, Florence, ...)

colab-notebook computer-vision finetuning multimodal paligemma vlm florence-2

Jupyter Notebook 19

7 months ago

jacobmarks / fiftyone_florence2_plugin

Run SOTA Vision-Language Model Florence-2 on your data!

computer-vision florence-2 machine-learning transformer vision-language-model

Jupyter Notebook 10

17 days ago

mithunparab / text2segment_video

Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2). This project provides a video processing tool that utilizes advanced AI models, specifically Florence2 and SAM2, to detect...

florence-2 optical-flow raft segment-anything

Python 10

2 months ago

Iteranya / AktivaAI

Local LLM Discord Bot

artificial-intelligence discord-bot florence-2 llama multimodal roleplay chatbot

Python 10

10 days ago

nguyennpa412 / simple-multimodal-ai

#Large language models# Simple Gradio application integrated with Hugging Face multimodal models to support a visual question answering chatbot and other features

computer-vision Docker gradio text-to-speech visual-question-answering vlm large-language-models mllm florence-2

Python 5

8 months ago

sitamgithub-MSIT / TextSnap

TextSnap: Demo for Florence 2 model used in OCR tasks to extract and visualize text from images.

artificial-intelligence florence-2 gradio gradio-interface huggingface-transformers optical-character-recognition vision-language-model Python

Python 4

5 months ago

regiellis / ecko-cli

ecko-cli is a simple CLI tool that streamlines processing images in a directory, generating captions, and saving them as text files. Additionally, it provides functionalities to create ...

artificial-intelligence command-line-interface florence-2 generative-ai huggingface-transformers image-classification image-processing onnxruntime

Python 4

5 months ago

PranayLendave / text2video_synopsis

Video Synopsis: Intelligent Video Object Summarization using Florence/OWL-ViT and SAM. It uses OWL-ViT or Florence 2 for object detection, SAM for segmentation, and a custom video synopsis algorithm t...

florence-2 sam

Python 2

4 months ago