Add pipeline tag, link to paper and project page

#4
opened by nielsr (HF Staff)
Files changed (1): README.md (+33 -8)
@@ -1,9 +1,11 @@
 ---
-license: apache-2.0
 base_model:
 - Qwen/Qwen3-0.6B-Base
 library_name: transformers
+license: apache-2.0
+pipeline_tag: text-ranking
 ---
+
 # Qwen3-Reranker-0.6B
 
 <p align="center">
@@ -14,7 +16,7 @@ library_name: transformers
 
 The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. Building upon the dense foundational models of the Qwen3 series, it provides a comprehensive range of text embeddings and reranking models in various sizes (0.6B, 4B, and 8B). This series inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundational model. The Qwen3 Embedding series represents significant advancements in multiple text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.
 
-**Exceptional Versatility**: The embedding model has achieved state-of-the-art performance across a wide range of downstream application evaluations. The 8B size embedding model ranks No.1 in the MTEB multilingual leaderboard (as of June 5, 2025, score 70.58), while the reranking model excels in various text retrieval scenarios.
+**Exceptional Versatility**: The embedding model has achieved state-of-the-art performance across a wide range of downstream application evaluations. The 8B size embedding model ranks **No.1** in the MTEB multilingual leaderboard (as of June 5, 2025, score **70.58**), while the reranking model excels in various text retrieval scenarios.
 
 **Comprehensive Flexibility**: The Qwen3 Embedding series offers a full spectrum of sizes (from 0.6B to 8B) for both embedding and reranking models, catering to diverse use cases that prioritize efficiency and effectiveness. Developers can seamlessly combine these two modules. Additionally, the embedding model allows for flexible vector definitions across all dimensions, and both embedding and reranking models support user-defined instructions to enhance performance for specific tasks, languages, or scenarios.
 
@@ -22,6 +24,8 @@ The Qwen3 Embedding model series is the latest proprietary model of the Qwen fam
 
 ## Model Overview
 
+This model is described in the paper [Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models](https://huggingface.co/papers/2506.05176).
+
 **Qwen3-Reranker-0.6B** has the following features:
 
 - Model Type: Text Reranking
@@ -65,7 +69,9 @@ from transformers import AutoModel, AutoTokenizer, AutoModelForCausalLM
 def format_instruction(instruction, query, doc):
     if instruction is None:
         instruction = 'Given a web search query, retrieve relevant passages that answer the query'
-    output = "<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {doc}".format(instruction=instruction,query=query, doc=doc)
+    output = "<Instruct>: {instruction}
+<Query>: {query}
+<Document>: {doc}".format(instruction=instruction,query=query, doc=doc)
     return output
 
 def process_inputs(pairs):
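For reference, the function this hunk touches can be exercised on its own. Note that the `+` side splits the string literal across physical lines, which a plain `"..."` Python string cannot span, so the sketch below keeps the `\n`-escaped form from the `-` side; the query and document are hypothetical inputs for illustration only.

```python
# Standalone check of the reranker's <Instruct>/<Query>/<Document> prompt
# formatting, using the single-line, "\n"-escaped string form.
def format_instruction(instruction, query, doc):
    if instruction is None:
        instruction = 'Given a web search query, retrieve relevant passages that answer the query'
    output = "<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {doc}".format(
        instruction=instruction, query=query, doc=doc)
    return output

# Hypothetical query/document pair, for illustration only.
prompt = format_instruction(None, "What is the capital of China?",
                            "The capital of China is Beijing.")
print(prompt)
```

Passing `None` falls back to the default web-search instruction, so the three labeled lines are always present.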
@@ -98,8 +104,17 @@ token_false_id = tokenizer.convert_tokens_to_ids("no")
 token_true_id = tokenizer.convert_tokens_to_ids("yes")
 max_length = 8192
 
-prefix = "<|im_start|>system\nJudge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be \"yes\" or \"no\".<|im_end|>\n<|im_start|>user\n"
-suffix = "<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n"
+prefix = "<|im_start|>system
+Judge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be \"yes\" or \"no\".<|im_end|>
+<|im_start|>user
+"
+suffix = "<|im_end|>
+<|im_start|>assistant
+<think>
+
+</think>
+
+"
 prefix_tokens = tokenizer.encode(prefix, add_special_tokens=False)
 suffix_tokens = tokenizer.encode(suffix, add_special_tokens=False)
 
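Taken together, the prefix and suffix from this hunk wrap each formatted pair into the final chat-template text that gets tokenized. A minimal sketch of that assembly, with the strings written using `\n` escapes so they parse as Python and a hypothetical formatted pair standing in for `format_instruction` output:

```python
# Chat-template framing from the README: a fixed system prompt before the
# pair, and an (empty) think block plus assistant turn after it.
prefix = ("<|im_start|>system\nJudge whether the Document meets the requirements "
          "based on the Query and the Instruct provided. Note that the answer "
          "can only be \"yes\" or \"no\".<|im_end|>\n<|im_start|>user\n")
suffix = "<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n"

# Hypothetical formatted pair (normally produced by format_instruction).
pair = ("<Instruct>: Given a web search query, retrieve relevant passages that answer the query\n"
        "<Query>: what is a reranker?\n"
        "<Document>: A reranker scores query-document pairs.")
full_prompt = prefix + pair + suffix
print(full_prompt)
```

In the real pipeline the prefix and suffix are pre-tokenized once (`prefix_tokens`/`suffix_tokens`) and concatenated around each tokenized pair, which is equivalent to tokenizing this concatenated string.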
@@ -148,7 +163,11 @@ from vllm.inputs.data import TokensPrompt
 def format_instruction(instruction, query, doc):
     text = [
         {"role": "system", "content": "Judge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be \"yes\" or \"no\"."},
-        {"role": "user", "content": f"<Instruct>: {instruction}\n\n<Query>: {query}\n\n<Document>: {doc}"}
+        {"role": "user", "content": f"<Instruct>: {instruction}
+
+<Query>: {query}
+
+<Document>: {doc}"}
     ]
     return text
 
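The vLLM path builds a chat-message list rather than a raw string. The list from this hunk can be constructed and inspected without vLLM installed, again using the `\n`-escaped f-string from the `-` side (the `+` side breaks the f-string across lines); the query and document below are hypothetical:

```python
# Message-list form of the reranker prompt used by the vLLM example.
def format_instruction(instruction, query, doc):
    text = [
        {"role": "system", "content": "Judge whether the Document meets the requirements "
                                      "based on the Query and the Instruct provided. Note that "
                                      "the answer can only be \"yes\" or \"no\"."},
        {"role": "user", "content": f"<Instruct>: {instruction}\n\n<Query>: {query}\n\n<Document>: {doc}"},
    ]
    return text

# Hypothetical inputs, for illustration only.
msgs = format_instruction("Given a web search query, retrieve relevant passages that answer the query",
                          "what is vLLM?", "vLLM is a high-throughput inference engine.")
```

The list is then passed through the tokenizer's chat template, which produces the same `<|im_start|>`-framed text as the transformers example.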
@@ -186,7 +205,13 @@ tokenizer = AutoTokenizer.from_pretrained('Qwen/Qwen3-Reranker-0.6B')
 model = LLM(model='Qwen/Qwen3-Reranker-0.6B', tensor_parallel_size=number_of_gpu, max_model_len=10000, enable_prefix_caching=True, gpu_memory_utilization=0.8)
 tokenizer.padding_side = "left"
 tokenizer.pad_token = tokenizer.eos_token
-suffix = "<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n"
+suffix = "<|im_end|>
+<|im_start|>assistant
+<think>
+
+</think>
+
+"
 max_length=8192
 suffix_tokens = tokenizer.encode(suffix, add_special_tokens=False)
 true_token = tokenizer("yes", add_special_tokens=False).input_ids[0]
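Downstream of the `true_token`/`false_token` lines, the relevance score is derived from the logits of the "yes" and "no" tokens at the final position; a softmax over just those two logits yields P("yes"). A self-contained sketch with mock logit values (the real pipeline reads them from the model output):

```python
import math

def relevance_score(logit_no, logit_yes):
    # Softmax over the two candidate answer tokens; returns P("yes").
    m = max(logit_no, logit_yes)       # subtract the max for numerical stability
    e_no = math.exp(logit_no - m)
    e_yes = math.exp(logit_yes - m)
    return e_yes / (e_no + e_yes)

score = relevance_score(-1.0, 2.0)     # mock logits for one query-document pair
print(score)
```

Scores fall in (0, 1) and are directly comparable across documents for the same query, which is what makes them usable for ranking.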
@@ -239,7 +264,7 @@ If you find our work helpful, feel free to give us a cite.
 ```
 @misc{qwen3-embedding,
     title = {Qwen3-Embedding},
-    url = {https://qwenlm.github.io/blog/qwen3/},
+    url = {https://qwenlm.github.io/blog/qwen3-embedding/},
     author = {Qwen Team},
     month = {May},
     year = {2025}