SeeFun

AI4Industry · seefun

AI & ML interests

None yet

Recent Activity

updated a dataset 3 days ago

AI4Industry/RxnBench

updated a collection 4 days ago

upvoted a collection 4 days ago

upvoted 2 collections 4 days ago

MolDet

Molecule Image Detection • 1 item • Updated 4 days ago • 1

MolParser

Molecule Image Recognition • 1 item • Updated 4 days ago • 1

upvoted an article about 1 month ago

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

Article • 5 authors • Aug 11 • 73

upvoted a paper 4 months ago

SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14 • 116

upvoted a collection 5 months ago

SigLIP 2

OpenCLIP and timm SigLIP 2 models • 47 items • Updated Aug 5 • 23

upvoted 2 articles 7 months ago

SmolVLM2: Bringing Video Understanding to Every Device

Article • 7 authors • Feb 20 • 300

Open-source DeepResearch – Freeing our search agents

Article • 5 authors • Feb 4 • 1.3k

upvoted an article 8 months ago

Timm ❤️ Transformers: Use any timm model with transformers

Article • 5 authors • Jan 16 • 51

upvoted a paper 9 months ago

MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild

Paper • 2411.11098 • Published Nov 17, 2024 • 1

upvoted 2 collections 12 months ago

Qwen2-VL

Vision-language model series based on Qwen2 • 16 items • Updated Jul 21 • 225

MobileNetV4 pretrained weights

Weights for MobileNet-V4 pretrained in timm • 17 items • Updated Aug 1 • 20

upvoted a paper about 1 year ago

Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8, 2024 • 170

upvoted an article about 1 year ago

MobileNet Baselines

Article • Jul 26, 2024 • 25

upvoted a collection about 1 year ago

🍃 MINT-1T

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24, 2024 • 62

upvoted a collection over 1 year ago

Searching for Better ViT Baselines

Exploring ViT hparams and model shapes for the GPU poor (between tiny and base). • 28 items • Updated Aug 1 • 18

upvoted an article over 1 year ago

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Article • 3 authors • Apr 15, 2024 • 187

upvoted a paper over 1 year ago

Uni-SMART: Universal Science Multimodal Analysis and Research Transformer

Paper • 2403.10301 • Published Mar 15, 2024 • 54