Tj
/

SmolVLM_Proxy

Text Generation

vision-language

Model card Files Files and versions

SmolVLM Final Merged

This is a fine-tuned version of SmolVLM-Instruct, optimized for conversational AI and vision-language tasks.

Model Details

Base Model: HuggingFaceTB/SmolVLM-Instruct
Training: Fine-tuned using LLaMA-Factory
Use Cases: Chat, vision understanding, multimodal reasoning
License: Apache 2.0

Usage

from transformers import AutoProcessor, AutoModelForVision2Seq
import torch

model = AutoModelForVision2Seq.from_pretrained("Tj/smolvlm-final-merged")
processor = AutoProcessor.from_pretrained("Tj/smolvlm-final-merged")

# Your inference code here

Downloads last month: 5

Safetensors

Model size

2.25B params

Tensor type

BF16

·

Inference Providers NEW

Text Generation

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Tj/SmolVLM_Proxy

Base model

HuggingFaceTB/SmolLM2-1.7B

Quantized

HuggingFaceTB/SmolLM2-1.7B-Instruct

Quantized

HuggingFaceTB/SmolVLM-Instruct

Finetuned

(109)

this model