fine-blip-qa-model

This is a fine-tuned BLIP model for Visual Question Answering (VQA).

Model Description

This model is based on the Salesforce/blip-vqa-base architecture (or a similar BLIP VQA checkpoint) and has been fine-tuned for visual question answering. Given an image and a natural-language question, it generates a short free-text answer.

Intended Use

This model is intended for demonstration purposes on visual question answering tasks. It can be used to answer natural-language questions about the content of images.

How to Use


# Example usage
from transformers import BlipProcessor, BlipForQuestionAnswering
from PIL import Image

# 1. Point at the fine-tuned checkpoint on the Hugging Face Hub
model_id = "suc1dalspinach/fine_blip_gym"

# 2. Load the processor and model
processor = BlipProcessor.from_pretrained(model_id)
model = BlipForQuestionAnswering.from_pretrained(model_id)

# 3. Prepare your input (image and question)
image = Image.open("path/to/your/image.jpg").convert("RGB")  # replace with the path to your image

question = "What is the name of the gym equipment?"

# 4. Process the inputs
inputs = processor(images=image, text=question, return_tensors="pt", truncation=True)

# 5. Generate the answer
out = model.generate(**inputs)

# 6. Decode and print the answer
answer = processor.decode(out[0], skip_special_tokens=True)
print(f"Question: {question}")
print(f"Answer: {answer}")
Model Details

Format: Safetensors
Model size: 385M params
Tensor type: F32
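
The checkpoint stores roughly 385M parameters in F32, i.e. about 1.5 GB of weights. Loading in half precision roughly halves that footprint; a minimal sketch, assuming fp16-capable hardware (typically a CUDA GPU), not something stated in the original card:

import torch
from transformers import BlipForQuestionAnswering

model_id = "suc1dalspinach/fine_blip_gym"

# Load the weights in half precision to roughly halve memory use
# (assumes an fp16-capable device; keep the default F32 on CPU)
model = BlipForQuestionAnswering.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")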
