Choi

Michael-Y

AI & ML interests

None yet

Recent Activity

liked a model 8 days ago

Qwen/Qwen3-235B-A22B-Instruct-2507

liked a model 15 days ago

xai-org/grok-2

liked a model 21 days ago

vec-ai/lychee-rerank

Organizations

None yet

upvoted a collection 2 months ago

T5Gemma

32 items • Updated Jul 10 • 69

upvoted an article 2 months ago

Training and Finetuning Sparse Embedding Models with Sentence Transformers v5

Article • 2 authors • Jul 1 • 116

upvoted a collection 2 months ago

Gemma 3n

4 items • Updated Jul 10 • 216

upvoted a collection 3 months ago

Qwen2-VL

Vision-language model series based on Qwen2 • 16 items • Updated Jul 21 • 225

upvoted a paper 6 months ago

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 96

upvoted a collection 6 months ago

EXAONE-Deep

EXAONE reasoning model series of 2.4B, 7.8B, and 32B, optimized for reasoning tasks including math and coding • 10 items • Updated Jul 7 • 93

upvoted a collection about 1 year ago

Qwen2

Qwen2 language models, both pretrained and instruction-tuned, in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Jul 21 • 368

upvoted 3 papers over 1 year ago

Make Your LLM Fully Utilize the Context

Paper • 2404.16811 • Published Apr 25, 2024 • 55

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22, 2024 • 257

FlowMind: Automatic Workflow Generation with LLMs

Paper • 2404.13050 • Published Mar 17, 2024 • 35

upvoted an article over 1 year ago

The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare

Article • 3 authors • Apr 19, 2024 • 181

upvoted a paper over 1 year ago

Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache

Paper • 2401.02669 • Published Jan 5, 2024 • 16

upvoted a collection over 1 year ago

Llama 3

8 items • Updated Apr 18, 2024 • 15

upvoted 7 papers over 1 year ago

Masked Audio Generation using a Single Non-Autoregressive Transformer

Paper • 2401.04577 • Published Jan 9, 2024 • 44

TransformerFAM: Feedback attention is working memory

Paper • 2404.09173 • Published Apr 14, 2024 • 44

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11, 2024 • 94

ReFT: Representation Finetuning for Language Models

Paper • 2404.03592 • Published Apr 4, 2024 • 101

LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model

Paper • 2404.01331 • Published Mar 29, 2024 • 28

Bigger is not Always Better: Scaling Properties of Latent Diffusion Models

Paper • 2404.01367 • Published Apr 1, 2024 • 23

FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance

Paper • 2305.05176 • Published May 9, 2023 • 6