google/gemma-3n-E4B-it
What about multimodal benchmarks like VideoBench?
#9, opened by LukeAlan on Jun 27
LukeAlan, Jun 27
Since this is an omni-like model, its performance on those benchmarks is important.