Light-IF-4B

🤗 Hugging Face | 📑 Paper Link

Evaluation

Model	SuperClue	IFEval	CFBench	IFBench
Qwen3-4B	0.225	0.888	0.787	0.382
Qwen3-32B	0.234	0.877	0.823	0.384
Qwen3-235B-A22B	0.244	0.882	0.834	0.423
Qwen3-235B-A22B-Thinking-2507	0.434	0.916	0.843	0.475
DeepSeek-R1-0528	0.436	0.863	0.827	0.415
Doubao-seed-1-6-thinking-250615	0.362	0.832	0.82	0.477
Doubao-seed-1-6-thinking-250715	0.345	0.856	0.84	0.366
ChatGPT-4o-latest	0.260	0.836	0.807	0.365
Deepseek-v3-250324	0.306	0.859	0.833	0.405
Doubao-1.5-pro-32k-250115	0.285	0.889	0.797	0.375
Kimi-K2	0.227	0.921	0.820	0.395
GLM-4.5	0.395	0.893	0.833	0.466
Light-IF-4B (ours) 🤗	0.445	0.916	0.80	0.443

Introduction

Instruction following is a core ability of large language models (LLMs), but performance remains inconsistent, especially on complex tasks.

We identify lazy reasoning during the thinking stage as a key cause of poor instruction adherence.

To address this, we propose a framework that promotes rigorous reasoning through previewing and self-checking.

Our method begins by generating instruction data with complex constraints, filtering out samples that are too easy or too difficult. We then use rejection sampling to build a small but high-quality dataset for model adaptation.
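The text above does not say how "too easy" or "too difficult" is measured. One plausible reading, assumed here purely for illustration, is to sample several responses per prompt, score each as pass or fail against its constraints, and keep only prompts with an intermediate pass rate. The function name and the `low` and `high` thresholds below are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def filter_by_pass_rate(
    pass_flags: Dict[str, List[bool]],
    low: float = 0.1,   # hypothetical lower bound: below this a prompt counts as too hard
    high: float = 0.9,  # hypothetical upper bound: above this a prompt counts as too easy
) -> List[str]:
    """Keep prompts whose empirical pass rate lies in [low, high]."""
    kept = []
    for prompt, flags in pass_flags.items():
        if not flags:
            continue  # no sampled responses, so no pass rate can be estimated
        rate = sum(flags) / len(flags)
        if low <= rate <= high:
            kept.append(prompt)
    return kept


# Toy data: each list holds pass/fail results for sampled responses to one prompt.
samples = {
    "too_easy": [True, True, True, True],
    "too_hard": [False, False, False, False],
    "useful": [True, False, True, False],
}
selected = filter_by_pass_rate(samples)
print(selected)
```

Under this assumption, the responses that pass on the retained prompts would be the natural candidates for the rejection-sampling step.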

Training involves entropy-preserving supervised fine-tuning (Entropy-SFT) and token-wise entropy-adaptive reinforcement learning (TEA-RL), guided by rule-based multidimensional rewards.
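Both training stages are named after entropy. As background only, the sketch below computes the Shannon entropy of a single next-token distribution from raw logits. It is not the Entropy-SFT or TEA-RL objective, whose details are not given here; `token_entropy` is a hypothetical helper name.

```python
import math
from typing import List


def token_entropy(logits: List[float]) -> float:
    """Shannon entropy (in nats) of softmax(logits) for one token position."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# A uniform distribution has maximal entropy; a peaked one has low entropy.
uniform = token_entropy([0.0, 0.0, 0.0, 0.0])
peaked = token_entropy([10.0, 0.0, 0.0, 0.0])
print(uniform, peaked)
```

High entropy at a position means the model is uncertain about the next token, and low entropy means it is nearly deterministic there.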

This approach encourages models to plan ahead and verify their outputs, fostering more generalizable reasoning abilities.

Experiments show consistent improvements across model sizes. Notably, our 32B model outperforms both larger open-source models like DeepSeek-R1 and closed-source models like ChatGPT-4o on challenging instruction-following benchmarks.

Figure: The overall framework of the proposed method.

Quickstart

The following code snippet illustrates how to use the model to generate content from a given input.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "qihoo360/Light-IF-4B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

prompt = "Please help me write a poem with a total of 15 lines and no more than 300 words. The poem should be divided into 4 stanzas, each beginning with a **highlighted subtitle**."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True 
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() 

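# Split the reasoning from the final answer.
try:
    # Find the position just after the last occurrence of token 151668 (</think>).
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    # No </think> token was generated, so treat the whole output as the answer.
    index = 0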

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)
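The splitting step in the snippet above can be factored into a small helper that works on plain lists of token ids, which makes it easy to test without loading the model. This is a sketch: `split_thinking` is a hypothetical name, and the id 151668 for `</think>` is taken from the snippet above.

```python
from typing import List, Tuple

END_THINK_ID = 151668  # token id of </think>, as used in the snippet above


def split_thinking(
    output_ids: List[int], end_id: int = END_THINK_ID
) -> Tuple[List[int], List[int]]:
    """Split generated ids into (thinking, answer) at the last end-of-thinking token."""
    try:
        # Index just past the last occurrence of end_id.
        index = len(output_ids) - output_ids[::-1].index(end_id)
    except ValueError:
        index = 0  # no end-of-thinking token: everything is the answer
    return output_ids[:index], output_ids[index:]


thinking_ids, answer_ids = split_thinking([11, 12, END_THINK_ID, 21, 22])
print(thinking_ids, answer_ids)
```

Each part can then be passed to `tokenizer.decode(..., skip_special_tokens=True)` as in the snippet above.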

thinking content:

Got it, let's tackle this request. First, I need to remember all the requirements: 15 lines total, 4 stanzas, each stanza starts with a highlighted subtitle, no more than 300 words.

First, I should figure out how to split 15 lines into 4 stanzas. Since 4 stanzas, the most even split would be maybe 4, 4, 4, 3 lines. That adds up to 15. Let's check: 4+4+4+3=15. Perfect.

Next, the subtitles. They need to be highlighted, so maybe italics or bold. The user said "highlighted subtitle," so I'll use italics for consistency. The subtitles should be meaningful and set the theme of each stanza. Let's think of a cohesive theme—maybe something natural, like a journey through the day or seasons, or growth. Let's go with "growth" since it's easy to split into stanzas.

First stanza: Maybe "seed" or "beginning." Let's say Beneath the Soil as the first subtitle. That's a strong start for growth.

Second stanza: Next step in growth—Sprout Through Cracks. That's good, continues the growth theme, adds imagery.

Third stanza: Maybe Bend, But Not Break—growth often involves resilience.

Fourth stanza: Bloom Into Sun—culmination, reaching the light.

Now, I need to make sure each stanza starts with the subtitle, then the lines. Let's draft the first stanza:

Beneath the Soil
a heartbeat of green,
wrapped tight in dark,
waiting for the world to breathe.

Wait, that's 4 lines (subtitle plus 3 lines). Wait, no: the user said each stanza begins with a highlighted subtitle. Does the subtitle count as a line? The original request says "total of 15 lines" and "4 stanzas, each beginning with a highlighted subtitle." So probably, the subtitles are part of the lines, so each stanza's first line is the subtitle, then the rest. Wait, but the example: if the first stanza is 4 lines, then subtitle (1) + 3 lines. Let's check: 4 stanzas, total lines: 15. Let's split: 4 stanzas: maybe stanza 1: 4 lines, stanza 2: 4, stanza 3: 4, stanza 4: 3? Wait, 4+4+4+3=15. That's a possible split.

But need to make sure the flow is natural. Let's adjust the number of lines per stanza: for example, stanza 1: 4 lines (subtitle + 3), stanza 2: 4 (subtitle + 3), stanza 3: 4, stanza 4: 3.

Let's try drafting stanza 1: Beneath the Soil
a heartbeat of green,
wrapped tight in dark,
waiting for the world to breathe.

That's 4 lines. Good.

Stanza 2: Sprout Through Cracks
tip tilts toward light,
pushing through stone and time,
a thin stem holding sky.

Another 4 lines.

Stanza 3: Bend, But Not Break
when wind screams like fear,
I arch, then rise again,
stronger where I was torn.

4 lines.

Stanza 4: Bloom Into Sun
petals open like questions,
each one asking to stay,
here.

3 lines.

Wait, total lines: 4+4+4+3=15. Perfect. Now, check the word count. Let's count:

Stanza 1: 22 words (approximately: "Beneath the Soil" is 3, then 3 lines: 22-3=19? Wait, need to count exactly. Let's do it:

Beneath the Soil: 3 words.
a heartbeat of green: 4
wrapped tight in dark: 4
waiting for the world to breathe: 6
Total stanza 1: 3+4+4+6=17 words.

Stanza 2: 3+4+4+5=16 (wait: Sprout Through Cracks: 3 words.
tip tilts toward light: 4
pushing through stone and time: 5
a thin stem holding sky: 5. Wait, no: 3+4+5+5=17? Wait, no: 3+4+5+5? Wait, no, each line is a separate count. Let's do it again:

Stanza 1: 3 (subtitle) + 4 +4 +6 = 17.

Stanza 2: 3 (subtitle) + 4 +5 +5 = 17? Wait, no: "tip tilts toward light" is 4 words. "pushing through stone and time" is 5 words. "a thin stem holding sky" is 5 words. So 3+4+5+5=17.

Stanza 3: Bend, But Not Break (4) + 5 (when wind screams like fear) + 5 (I arch, then rise again) + 5 (stronger where I was torn) = 4+5+5+5=19.

Stanza 4: Bloom Into Sun (3) + 4 (petals open like questions) + 5 (each one asking to stay) + 1 (here) = 3+4+5+1=13.

Total words: 17+17+19+13=66 words. Way under 300. Good.

Now, check the flow: the theme of growth from seed to sprout to resilience to bloom. Makes sense.

But maybe adjust for more imagery or emotional depth. Let's revise some lines for better rhythm or impact.

Stanza 1: Beneath the Soil
a heartbeat of green,
wrapped tight in dark,
waiting for the world to breathe.

Good.

Stanza 2: Sprout Through Cracks
tip tilts toward the faint,
pushing through stone and time,
a fragile stem holding sky.

Better: "faint" instead of "light" for more subtlety, "fragile" adds depth.

Stanza 3: Bend, But Not Break
when wind screams like rage,
I arch, then rise again,
stronger where I was torn.

"Rage" instead of "fear" to make the wind more destructive.

Stanza 4: Bloom Into Sun
petals open like hearts,
each one asking to be seen,
here.

"Like hearts" adds emotion, "asking to be seen" is more vivid than "stay".

Now, check word count again:

Stanza 1: 17 (same)

Stanza 2: "faint" is 1 word instead of 5 (wait, no: "toward the faint" is 3 words, so total stanza 2: 3 (subtitle) + 5 (tip tilts toward the faint) + 5 (pushing through stone and time) + 5 (a fragile stem holding sky) = 3+5+5+5=18.

Stanza 3: 4 (subtitle) + 5 (when wind screams like rage) + 5 (I arch, then rise again) + 5 (stronger where I was torn) = 4+5+5+5=19.

Stanza 4: 3 (subtitle) + 4 (petals open like hearts) + 6 (each one asking to be seen) + 1 (here) = 3+4+6+1=14.

Total: 17+18+19+14=68 words. Still under.

Now, check the rhythm. Each line is roughly similar in length, maybe iambic tetrameter or free verse. Since the user didn't specify rhythm, free verse is fine, but should flow naturally.

Another revision: maybe make some lines more concise or impactful. For example, stanza 1: "waiting for the world to breathe" is good.

Stanza 2: "toward the faint" could be "toward faint light" to be more specific. So: tip tilts toward faint light, (5 words). Then stanza 2 word count: 3+5+5+5=18.

Stanza 3: "when wind screams like rage" is 5 words. Maybe "when winds roar like rage" for more intensity. So: 5 words. Then stanza 3 word count: 4+5+5+5=19.

Stanza 4: "each one asking to be seen" is 6 words. Maybe change to "each begging to be seen" for more impact. So: 5 words. Then stanza 4 word count: 3+4+6+1=14.

Total word count: 17+18+19+14=68. Still fine.

Now, make sure each stanza's subtitle is highlighted. The user said "highlighted subtitle," so in the response, we'll use italics.

Another revision for flow: Maybe change the order of stanzas to make the growth more natural. Wait, the order is: beneath soil, sprout, bend, bloom. That's logical.

Another check: 15 lines total. Let's count:

Stanza 1: 1 (subtitle) +3=4 lines.

Stanza 2: 5+3=4? Wait, no: each stanza is subtitle (1 line) plus number of lines. Wait, no: the user said "4 stanzas, each beginning with a highlighted subtitle." So the total number of lines includes the subtitles. So if each stanza has 4 lines, then 4 stanzas would be 16 lines. But the user wants 15. So one stanza has 3 lines.

As per our split: 4,4,4,3. Let's confirm:

Stanza 1: 4 lines (1 subtitle + 3 lines).

Stanza 2: 4 (1+3).

Stanza 3: 4 (1+3).

Stanza 4: 3 lines (1 subtitle +2 lines).

Yes, that's 4+4+4+3=15 lines.

Wait, earlier I had stanza 4: 3 lines (1 subtitle +2). So let's adjust stanza 4 to have 3 lines. For example:

Bloom Into Sun
petals open like hearts,
each begging to be seen.

That's 3 lines (1 subtitle +2). Then the previous stanza (stanza 3) is 4 lines: subtitle +3.

But wait, no: let's do this again. The total lines are 15. So if stanza 1 has 4 lines (1+3), stanza 2 has 4 (1+3), stanza 3 has 4 (1+3), stanza 4 has 3 (1+2). That's 4+4+4+3=15.

Perfect. So stanza 4 is 3 lines: subtitle +2.

So let's revise stanza 4: Bloom Into Sun
petals open like hearts,
each begging to be seen.

That's 3 lines. Then total lines: 15.

Now, check again:

Stanza 1: 4 lines.

Beneath the Soil
a heartbeat of green,
wrapped tight in dark,
waiting for the world to breathe.

Stanza 2: 4 lines.

Sprout Through Cracks
tip tilts toward faint light,
pushing through stone and time,
a fragile stem holding sky.

Stanza 3: 4 lines.

Bend, But Not Break
when winds roar like rage,
I arch, then rise again,
stronger where I was torn.

Stanza 4: 3 lines.

Bloom Into Sun
petals open like hearts,
each begging to be seen.

Yes, exactly 15 lines.

Now, check the flow: Each stanza builds on the growth theme.

Another revision for better imagery: Maybe change "faint light" to "faint light through cracks" to tie back to the subtitle. But that adds words. Since we have to keep word count low, maybe keep it as is.

Stanza 2: "tip tilts toward faint light,"—good, shows struggle.

Stanza 3: "when winds roar like rage,"—good, shows adversity.

Stanza 4: "petals open like hearts,"—adds emotion, "each begging to be seen,"—relatable.

I think this works. Now, let's write the final version.

Wait, let's check for any redundancy or weak lines. "Wrapped tight in dark" is good. "Waiting for the world to breathe"—beautiful, personifies the seed.

Stanza 2: "pushing through stone and time"—good, combines physical and temporal struggle.

Stanza 3: "stronger where I was torn"—great, conveys resilience.

Stanza 4: "each begging to be seen"—adds a personal touch, makes the bloom more meaningful.

Is there anything to change? Maybe make the transition between stanzas smoother? Not necessary; the theme is clear.

Another check: Each subtitle is highlighted. Yes, we'll use italics.

Now, check word count again:

Stanza 1: 3 (subtitle) + 4 +4 +6 = 17.

Stanza 2: "tip tilts toward faint light" is 5 words. "pushing through stone and time" is 5 words. "a fragile stem holding sky" is 5 words. So 3+5+5+5=18.

Stanza 3: Bend, But Not Break (4) + 5 (when wind screams like fear) + 5 (I arch, then rise again) + 5 (stronger where I was torn) = 4+5+5+5=19.

Stanza 4: Bloom Into Sun (3) + 4 (petals open like questions) + 5 (each one asking to stay) = 3+4+5=12.

Total words: 17+18+19+12=66 words. Way under 300. Good.

Word count: 66 words. Way under 300.

Lines: 15. Perfect.

Now, let's write the final poem.

content:

**Poem: "Growth in Shattered Light"**

Beneath the Soil
a heartbeat of green,
wrapped tight in dark,
waiting for the world to breathe.

Sprout Through Cracks
tip tilts toward faint light,
pushing through stone and time,
a fragile stem holding sky.

Bend, But Not Break
when winds roar like rage,
I arch, then rise again,
stronger where I was torn.

Bloom Into Sun
petals open like hearts,
each begging to be seen.
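The countable constraints in the prompt (15 lines, 4 stanzas, at most 300 words) can be checked mechanically. The sketch below is a hypothetical checker, not part of the model or its reward code. It assumes stanzas are separated by blank lines and that the title line is excluded from the count. It does not check subtitle highlighting, because the markup is not preserved in the output shown above.

```python
from typing import Dict, List

POEM = """
Beneath the Soil
a heartbeat of green,
wrapped tight in dark,
waiting for the world to breathe.

Sprout Through Cracks
tip tilts toward faint light,
pushing through stone and time,
a fragile stem holding sky.

Bend, But Not Break
when winds roar like rage,
I arch, then rise again,
stronger where I was torn.

Bloom Into Sun
petals open like hearts,
each begging to be seen.
"""


def check_poem(text: str, n_lines: int = 15, n_stanzas: int = 4,
               max_words: int = 300) -> Dict[str, bool]:
    """Check line count, stanza count, and word limit of a blank-line-separated poem."""
    stanzas: List[List[str]] = []
    current: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if line:
            current.append(line)
        elif current:  # a blank line closes the current stanza
            stanzas.append(current)
            current = []
    if current:
        stanzas.append(current)
    lines = [ln for stanza in stanzas for ln in stanza]
    words = sum(len(ln.split()) for ln in lines)
    return {
        "lines": len(lines) == n_lines,
        "stanzas": len(stanzas) == n_stanzas,
        "words": words <= max_words,
    }


result = check_poem(POEM)
print(result)
```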

For deployment, you can use sglang>=0.4.6.post1 or vllm>=0.8.5 to create an OpenAI-compatible API endpoint.
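Once such an endpoint is running, it can be queried with a standard chat-completions request. The sketch below only builds the request using the Python standard library. The base URL `http://localhost:8000` and the `max_tokens` value are assumptions for illustration; the actual host, port, and served model name depend on how the server was launched.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str,
                       max_tokens: int = 4096) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # assumed value; raise it for long reasoning traces
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The base URL is hypothetical; replace it with the address of your own server.
req = build_chat_request(
    "http://localhost:8000",
    "qihoo360/Light-IF-4B",
    "Write a poem with exactly 4 lines.",
)
print(req.full_url)

# To send the request against a running server:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```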

Citation

@misc{lightifproj,
      title={Light-IF: Endowing LLMs with Generalizable Reasoning via Preview and Self-Checking for Complex Instruction Following},
      author={Light-IF Team},
      year={2025},
      eprint={},
      archivePrefix={},
      primaryClass={},
      url={https://huggingface.co/qihoo360/Light-IF-4B}, 
}
