# fine-blip-qa-model
This is a fine-tuned BLIP model for Visual Question Answering (VQA).
## Model Description

This model is based on the Salesforce/blip-vqa-base architecture and has been fine-tuned for visual question answering. It takes an image and a question as input and generates a text answer.
## Intended Use

This model is intended for demonstration purposes on visual question answering tasks: given an image, it answers questions about the image's content.
## How to Use
```python
# Example usage
from transformers import BlipProcessor, BlipForQuestionAnswering
from PIL import Image

# 1. Specify the fine-tuned model ID
model_id = "suc1dalspinach/fine_blip_gym"

# 2. Load the processor and model
processor = BlipProcessor.from_pretrained(model_id)
model = BlipForQuestionAnswering.from_pretrained(model_id)

# 3. Prepare your input (image and question)
image = Image.open("path/to/your/image.jpg").convert("RGB")
question = "What is the name of the gym equipment?"

# 4. Process the inputs
inputs = processor(images=image, text=question, return_tensors="pt", truncation=True)

# 5. Generate the answer
out = model.generate(**inputs)

# 6. Decode and print the answer
answer = processor.decode(out[0], skip_special_tokens=True)
print(f"Question: {question}")
print(f"Answer: {answer}")
```
## Base Model

Salesforce/blip-vqa-base