nv-bschifferer committed on
Commit
16cdad8
·
1 Parent(s): 9daa5d1

modify README

README.md CHANGED
@@ -27,8 +27,10 @@ The **nvidia/llama-nemoretriever-colembed-3b-v1** is a late interaction embeddin
27
  This model is for non-commercial/research use only.
28
 
29
  ### License/Terms of Use
30
- Governing Terms: [NVIDIA License](https://huggingface.co/nvidia/llama-nemoretriever-colembed-3b-v1/blob/main/LICENSE)
31
- Additional Information: [Apache License 2.0](https://choosealicense.com/licenses/apache-2.0/) for [siglip2-giant-opt-patch16-384](https://huggingface.co/google/siglip2-giant-opt-patch16-384); and [LLAMA 3.2 Community License Agreement](https://huggingface.co/meta-llama/Llama-3.2-3B/blob/main/LICENSE.txt) for [Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B). Built with Meta Llama 3. Improved using Qwen.
 
 
32
 
33
  This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
34
 
@@ -37,6 +39,7 @@ This project will download and install additional third-party open source softwa
37
  - Gabriel Moreira
38
  - Radek Osmulski
39
  - Ronay Ak
 
40
  - Even Oldridge
41
  - Benedikt Schifferer
42
 
@@ -119,6 +122,7 @@ model = AutoModel.from_pretrained(
119
  trust_remote_code=True,
120
  torch_dtype=torch.bfloat16,
121
  attn_implementation="flash_attention_2",
 
122
  ).eval()
123
 
124
  # Queries
@@ -164,7 +168,7 @@ The HuggingFace model artifact contains a [script](https://huggingface.co/nvidia
164
  pip install git+https://github.com/illuin-tech/vidore-benchmark@e0eb9032e7e00adc8aa6f9cb35d5a9371f67485a
165
  # Downgrade transformers as vidore will install latest transformers
166
  pip install transformers==4.49.0
167
- CUDA_VISIBLE_DEVICES=0; python3 vidore_eval.py --model_name_or_path nvidia/llama-nemoretriever-colembed-3b-v1 --savedir_datasets ./results/
168
  ```
169
 
170
  The HuggingFace model artifact contains a [script](https://huggingface.co/nvidia/llama-nemoretriever-colembed-3b-v1/blob/main/mteb_eval.py) to evaluate MTEB VisualDocumentRetrieval. We first install the ViDoRe benchmark to capture dependencies.
@@ -205,6 +209,16 @@ We evaluate the model on multiple benchmarks for Visual Document Retrieval, ViDo
205
  - **Labeling Method by dataset:** Hybrid: Automated, Human, Synthetic
206
  - **Properties:** More details on ViDoRe V1 and ViDoRe V2 can be found on their leaderboard. [Visual Document Retrieval Benchmark](https://huggingface.co/vidore), ViDoRe, is composed of various page-level retrieving tasks spanning multiple domains, languages, and settings.
207
 
208
  ## Inference:
209
  **Acceleration Engine:** Not Applicable <br>
210
  **Test Hardware:** A100 40GB, A100 80GB, H100 80GB
 
27
  This model is for non-commercial/research use only.
28
 
29
  ### License/Terms of Use
30
+ Governing Terms for llama-nemoretriever-colembed-3b-v1 model: [NVIDIA Non-Commercial License](https://huggingface.co/nvidia/llama-nemoretriever-colembed-3b-v1/blob/main/LICENSE)
31
+ Additional Information: [Apache License 2.0](https://choosealicense.com/licenses/apache-2.0/) for [siglip2-giant-opt-patch16-384](https://huggingface.co/google/siglip2-giant-opt-patch16-384); and [LLAMA 3.2 Community License Agreement](https://huggingface.co/meta-llama/Llama-3.2-3B/blob/main/LICENSE.txt) for [Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B). Built with Meta Llama 3. Improved using Qwen.
32
+
33
+ This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
34
 
35
  This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
36
 
 
39
  - Gabriel Moreira
40
  - Radek Osmulski
41
  - Ronay Ak
42
+ - Yauhen Babakhin
43
  - Even Oldridge
44
  - Benedikt Schifferer
45
 
 
122
  trust_remote_code=True,
123
  torch_dtype=torch.bfloat16,
124
  attn_implementation="flash_attention_2",
125
+ revision='50c36f4d5271c6851aa08bd26d69f6e7ca8b870c'
126
  ).eval()
127
 
128
  # Queries
 
168
  pip install git+https://github.com/illuin-tech/vidore-benchmark@e0eb9032e7e00adc8aa6f9cb35d5a9371f67485a
169
  # Downgrade transformers as vidore will install latest transformers
170
  pip install transformers==4.49.0
171
+ CUDA_VISIBLE_DEVICES=0 python3 vidore_eval.py --model_name_or_path nvidia/llama-nemoretriever-colembed-3b-v1 --savedir_datasets ./results/ --model_revision 50c36f4d5271c6851aa08bd26d69f6e7ca8b870c
172
  ```
173
 
174
  The HuggingFace model artifact contains a [script](https://huggingface.co/nvidia/llama-nemoretriever-colembed-3b-v1/blob/main/mteb_eval.py) to evaluate MTEB VisualDocumentRetrieval. We first install the ViDoRe benchmark to capture dependencies.
 
209
  - **Labeling Method by dataset:** Hybrid: Automated, Human, Synthetic
210
  - **Properties:** More details on ViDoRe V1 and ViDoRe V2 can be found on their leaderboard. [Visual Document Retrieval Benchmark](https://huggingface.co/vidore), ViDoRe, is composed of various page-level retrieving tasks spanning multiple domains, languages, and settings.
211
 
212
+ | **Benchmark** | **Model 1B** | **Model 3B** |
213
+ |--------------------------------|--------------|--------------|
214
+ | ViDoRe V1 (06/27/2025) | 0.9050 | 0.9100 |
215
+ | ViDoRe V1 (deprecated) | 0.9049 | 0.9098 |
216
+ | ViDoRe V2 (06/27/2025) | 0.6209 | 0.6352 |
217
+ | ViDoRe V2 (deprecated) | 0.6261 | 0.6342 |
218
+ | MTEB Visual Document Retrieval | 0.8238 | 0.8315 |
219
+
220
+ Note: All scores are Avg. NDCG@5. ViDoRe V1 and V2 were updated on June 27, 2025 to use the scores calculated by [MTEB](https://github.com/embeddings-benchmark/mteb), which can result in slightly different scores. ViDoRe V2 (06/27/2025) uses only 4 of the original 7 datasets.
221
+
222
  ## Inference:
223
  **Acceleration Engine:** Not Applicable <br>
224
  **Test Hardware:** A100 40GB, A100 80GB, H100 80GB
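
The README note above says every reported score is an average NDCG@5. The following is a minimal sketch of how such an average could be recomputed from the `results.json` added in this commit. It assumes per-dataset metrics sit under a top-level `metrics` key with an `ndcg_at_5` field, which is what the results.json hunk shows. The helper name `average_ndcg_at_5` is hypothetical, and this diff does not state which datasets make up each benchmark, so the caller passes that list explicitly.

```python
import json
from statistics import mean


def average_ndcg_at_5(results: dict, datasets: list[str]) -> float:
    """Average the per-dataset NDCG@5 values stored under results["metrics"]."""
    metrics = results["metrics"]
    # Fail loudly if a requested dataset was not evaluated.
    missing = [name for name in datasets if name not in metrics]
    if missing:
        raise KeyError(f"datasets missing from results: {missing}")
    return mean(metrics[name]["ndcg_at_5"] for name in datasets)


# Two entries copied from results.json; the real file holds more datasets
# and many more metrics per dataset.
sample = json.loads("""
{
  "metrics": {
    "vidore/arxivqa_test_subsampled": {"ndcg_at_5": 0.88351},
    "vidore/docvqa_test_subsampled": {"ndcg_at_5": 0.65942}
  }
}
""")

score = average_ndcg_at_5(sample, list(sample["metrics"]))
print(f"{score:.4f}")
```

To run this against the full file, replace `sample` with `json.load(open("results.json"))` and pass the dataset names that belong to the benchmark of interest.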
configuration_siglip.py CHANGED
@@ -1,7 +1,25 @@
1
  # --------------------------------------------------------
2
  # Copyright (c) 2025 NVIDIA
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
 
 
 
5
 
6
  """ Siglip model configuration"""
7
 
 
1
+ # coding=utf-8
2
+
3
+ # Copyright 2024 The HuggingFace Inc. team. All rights reserved.
4
+ #
5
+ # Licensed under the Apache License, Version 2.0 (the "License");
6
+ # you may not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing, software
12
+ # distributed under the License is distributed on an "AS IS" BASIS,
13
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14
+ # See the License for the specific language governing permissions and
15
+
16
  # --------------------------------------------------------
17
  # Copyright (c) 2025 NVIDIA
18
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
19
  # --------------------------------------------------------
20
+ # Not a contribution
21
+ # Changes made by NVIDIA CORPORATION & AFFILIATES enabling llama-nemoretriever-colembed models or otherwise documented as
22
+ # NSCLv1 are not a contribution and subject to the terms and conditions in LICENSE.md
23
 
24
  """ Siglip model configuration"""
25
 
flash_attention.py CHANGED
@@ -3,7 +3,9 @@
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
5
 
6
- # https://github.com/Dao-AILab/flash-attention/blob/v0.2.8/flash_attn/flash_attention.py
 
 
7
  import torch
8
  import torch.nn as nn
9
  from einops import rearrange
 
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
5
 
6
+ # Based on https://github.com/Dao-AILab/flash-attention/blob/v0.2.8/flash_attn/flash_attention.py
7
+ # https://github.com/Dao-AILab/flash-attention/blob/main/LICENSE
8
+
9
  import torch
10
  import torch.nn as nn
11
  from einops import rearrange
modeling_llama_nemoretrievercolembed.py CHANGED
@@ -3,7 +3,11 @@
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
5
 
 
 
 
6
  # Importing torch before transformers can cause `segmentation fault`
 
7
  from transformers import AutoTokenizer, AutoConfig
8
  from transformers.modeling_outputs import SequenceClassifierOutputWithPast
9
  import base64
 
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
5
 
6
+ # Based on https://github.com/OpenGVLab/InternVL/blob/main/streamlit_demo/model_worker.py
7
+ # https://github.com/OpenGVLab/InternVL/?tab=MIT-1-ov-file#readme
8
+
9
  # Importing torch before transformers can cause `segmentation fault`
10
+
11
  from transformers import AutoTokenizer, AutoConfig
12
  from transformers.modeling_outputs import SequenceClassifierOutputWithPast
13
  import base64
modeling_siglip.py CHANGED
@@ -1,7 +1,27 @@
1
  # --------------------------------------------------------
2
  # Copyright (c) 2025 NVIDIA
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
 
 
 
 
 
5
  """ PyTorch Siglip model."""
6
 
7
 
 
1
+ # coding=utf-8
2
+ # Copyright 2024 The HuggingFace Inc. team. All rights reserved.
3
+ #
4
+ # Licensed under the Apache License, Version 2.0 (the "License");
5
+ # you may not use this file except in compliance with the License.
6
+ # You may obtain a copy of the License at
7
+ #
8
+ # http://www.apache.org/licenses/LICENSE-2.0
9
+ #
10
+ # Unless required by applicable law or agreed to in writing, software
11
+ # distributed under the License is distributed on an "AS IS" BASIS,
12
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13
+ # See the License for the specific language governing permissions and
14
+
15
+
16
  # --------------------------------------------------------
17
  # Copyright (c) 2025 NVIDIA
18
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
19
  # --------------------------------------------------------
20
+ # Not a contribution
21
+ # Changes made by NVIDIA CORPORATION & AFFILIATES enabling llama-nemoretriever-colembed models or otherwise documented as
22
+ # NSCLv1 are not a contribution and subject to the terms and conditions in LICENSE.md
23
+
24
+
25
  """ PyTorch Siglip model."""
26
 
27
 
results.json ADDED
@@ -0,0 +1,994 @@
1
+ {
2
+ "metadata": {
3
+ "timestamp": "2025-06-26T06:21:27.128658",
4
+ "vidore_benchmark_version": "5.0.1.dev12+ge0eb903"
5
+ },
6
+ "metrics": {
7
+ "vidore/arxivqa_test_subsampled": {
8
+ "ndcg_at_1": 0.834,
9
+ "ndcg_at_3": 0.87602,
10
+ "ndcg_at_5": 0.88351,
11
+ "ndcg_at_10": 0.89382,
12
+ "ndcg_at_20": 0.89856,
13
+ "ndcg_at_50": 0.9021,
14
+ "ndcg_at_100": 0.90271,
15
+ "map_at_1": 0.834,
16
+ "map_at_3": 0.86567,
17
+ "map_at_5": 0.86987,
18
+ "map_at_10": 0.87409,
19
+ "map_at_20": 0.87549,
20
+ "map_at_50": 0.87605,
21
+ "map_at_100": 0.87609,
22
+ "recall_at_1": 0.834,
23
+ "recall_at_3": 0.906,
24
+ "recall_at_5": 0.924,
25
+ "recall_at_10": 0.956,
26
+ "recall_at_20": 0.974,
27
+ "recall_at_50": 0.992,
28
+ "recall_at_100": 0.996,
29
+ "precision_at_1": 0.834,
30
+ "precision_at_3": 0.302,
31
+ "precision_at_5": 0.1848,
32
+ "precision_at_10": 0.0956,
33
+ "precision_at_20": 0.0487,
34
+ "precision_at_50": 0.01984,
35
+ "precision_at_100": 0.00996,
36
+ "mrr_at_1": 0.834,
37
+ "mrr_at_3": 0.8656666666666666,
38
+ "mrr_at_5": 0.8698666666666666,
39
+ "mrr_at_10": 0.8740904761904763,
40
+ "mrr_at_20": 0.8754917376740906,
41
+ "mrr_at_50": 0.8760454757022237,
42
+ "mrr_at_100": 0.8760882870575352,
43
+ "naucs_at_1_max": 0.6703392545655197,
44
+ "naucs_at_1_std": 0.4132759009409903,
45
+ "naucs_at_1_diff1": 0.9520605650682898,
46
+ "naucs_at_3_max": 0.6840495063273536,
47
+ "naucs_at_3_std": 0.45158630828217644,
48
+ "naucs_at_3_diff1": 0.9315116117368949,
49
+ "naucs_at_5_max": 0.7034989434370238,
50
+ "naucs_at_5_std": 0.46818025455796014,
51
+ "naucs_at_5_diff1": 0.9329205366357055,
52
+ "naucs_at_10_max": 0.6890968508615603,
53
+ "naucs_at_10_std": 0.474195738901625,
54
+ "naucs_at_10_diff1": 0.9457813428401673,
55
+ "naucs_at_20_max": 0.7880126409538182,
56
+ "naucs_at_20_std": 0.5496301084536358,
57
+ "naucs_at_20_diff1": 0.9283559577677175,
58
+ "naucs_at_50_max": 0.7480158730158629,
59
+ "naucs_at_50_std": 0.8190943043884249,
60
+ "naucs_at_50_diff1": 0.8978758169934562,
61
+ "naucs_at_100_max": 1.0,
62
+ "naucs_at_100_std": 1.0,
63
+ "naucs_at_100_diff1": 1.0
64
+ },
65
+ "vidore/docvqa_test_subsampled": {
66
+ "ndcg_at_1": 0.58537,
67
+ "ndcg_at_3": 0.64531,
68
+ "ndcg_at_5": 0.65942,
69
+ "ndcg_at_10": 0.67939,
70
+ "ndcg_at_20": 0.69023,
71
+ "ndcg_at_50": 0.70398,
72
+ "ndcg_at_100": 0.71126,
73
+ "map_at_1": 0.58537,
74
+ "map_at_3": 0.63008,
75
+ "map_at_5": 0.63762,
76
+ "map_at_10": 0.64581,
77
+ "map_at_20": 0.64889,
78
+ "map_at_50": 0.65114,
79
+ "map_at_100": 0.65181,
80
+ "recall_at_1": 0.58537,
81
+ "recall_at_3": 0.68958,
82
+ "recall_at_5": 0.72506,
83
+ "recall_at_10": 0.78714,
84
+ "recall_at_20": 0.82927,
85
+ "recall_at_50": 0.898,
86
+ "recall_at_100": 0.94235,
87
+ "precision_at_1": 0.58537,
88
+ "precision_at_3": 0.22986,
89
+ "precision_at_5": 0.14501,
90
+ "precision_at_10": 0.07871,
91
+ "precision_at_20": 0.04146,
92
+ "precision_at_50": 0.01796,
93
+ "precision_at_100": 0.00942,
94
+ "mrr_at_1": 0.5853658536585366,
95
+ "mrr_at_3": 0.6300813008130081,
96
+ "mrr_at_5": 0.6376201034737617,
97
+ "mrr_at_10": 0.6458064970260089,
98
+ "mrr_at_20": 0.6488879496173225,
99
+ "mrr_at_50": 0.6511443628922823,
100
+ "mrr_at_100": 0.6518089180219389,
101
+ "naucs_at_1_max": 0.2585563149452475,
102
+ "naucs_at_1_std": 0.3159595366492725,
103
+ "naucs_at_1_diff1": 0.8678053985855875,
104
+ "naucs_at_3_max": 0.22128557788133152,
105
+ "naucs_at_3_std": 0.258482179935891,
106
+ "naucs_at_3_diff1": 0.8393573283409745,
107
+ "naucs_at_5_max": 0.21064270306895383,
108
+ "naucs_at_5_std": 0.23472397866953829,
109
+ "naucs_at_5_diff1": 0.8274386494106453,
110
+ "naucs_at_10_max": 0.15097771381217617,
111
+ "naucs_at_10_std": 0.2839241224708214,
112
+ "naucs_at_10_diff1": 0.7893641996302178,
113
+ "naucs_at_20_max": 0.15466073118900037,
114
+ "naucs_at_20_std": 0.33553862379172555,
115
+ "naucs_at_20_diff1": 0.7724781458006538,
116
+ "naucs_at_50_max": 0.10906657289614331,
117
+ "naucs_at_50_std": 0.5786081651360842,
118
+ "naucs_at_50_diff1": 0.7426293121947367,
119
+ "naucs_at_100_max": 0.072673798370254,
120
+ "naucs_at_100_std": 0.8933863552254951,
121
+ "naucs_at_100_diff1": 0.7602442332060952
122
+ },
123
+ "vidore/infovqa_test_subsampled": {
124
+ "ndcg_at_1": 0.91498,
125
+ "ndcg_at_3": 0.94325,
126
+ "ndcg_at_5": 0.94908,
127
+ "ndcg_at_10": 0.95095,
128
+ "ndcg_at_20": 0.95363,
129
+ "ndcg_at_50": 0.95442,
130
+ "ndcg_at_100": 0.95476,
131
+ "map_at_1": 0.91498,
132
+ "map_at_3": 0.93623,
133
+ "map_at_5": 0.93947,
134
+ "map_at_10": 0.94019,
135
+ "map_at_20": 0.94099,
136
+ "map_at_50": 0.94111,
137
+ "map_at_100": 0.94114,
138
+ "recall_at_1": 0.91498,
139
+ "recall_at_3": 0.96356,
140
+ "recall_at_5": 0.97773,
141
+ "recall_at_10": 0.98381,
142
+ "recall_at_20": 0.99393,
143
+ "recall_at_50": 0.99798,
144
+ "recall_at_100": 1.0,
145
+ "precision_at_1": 0.91498,
146
+ "precision_at_3": 0.32119,
147
+ "precision_at_5": 0.19555,
148
+ "precision_at_10": 0.09838,
149
+ "precision_at_20": 0.0497,
150
+ "precision_at_50": 0.01996,
151
+ "precision_at_100": 0.01,
152
+ "mrr_at_1": 0.9149797570850202,
153
+ "mrr_at_3": 0.936234817813765,
154
+ "mrr_at_5": 0.9394736842105261,
155
+ "mrr_at_10": 0.9401902191375874,
156
+ "mrr_at_20": 0.9409887775689154,
157
+ "mrr_at_50": 0.9411103076140448,
158
+ "mrr_at_100": 0.9411405209199847,
159
+ "naucs_at_1_max": 0.6703820792124877,
160
+ "naucs_at_1_std": 0.16833937392893533,
161
+ "naucs_at_1_diff1": 0.9498534501270284,
162
+ "naucs_at_3_max": 0.5822492726969997,
163
+ "naucs_at_3_std": 0.06259669593622658,
164
+ "naucs_at_3_diff1": 0.9492105456414766,
165
+ "naucs_at_5_max": 0.739556315880861,
166
+ "naucs_at_5_std": 0.5078895506993929,
167
+ "naucs_at_5_diff1": 0.9643814216187027,
168
+ "naucs_at_10_max": 0.6766067765559199,
169
+ "naucs_at_10_std": 0.3559984957278538,
170
+ "naucs_at_10_diff1": 0.9836748182418953,
171
+ "naucs_at_20_max": 0.9564661819784134,
172
+ "naucs_at_20_std": 0.8638879360590604,
173
+ "naucs_at_20_diff1": 1.0,
174
+ "naucs_at_50_max": 0.8693985459351681,
175
+ "naucs_at_50_std": 0.8693985459351681,
176
+ "naucs_at_50_diff1": 1.0,
177
+ "naucs_at_100_max": null,
178
+ "naucs_at_100_std": null,
179
+ "naucs_at_100_diff1": null
180
+ },
181
+ "vidore/tabfquad_test_subsampled": {
182
+ "ndcg_at_1": 0.91786,
183
+ "ndcg_at_3": 0.95383,
184
+ "ndcg_at_5": 0.95935,
185
+ "ndcg_at_10": 0.95935,
186
+ "ndcg_at_20": 0.96032,
187
+ "ndcg_at_50": 0.96109,
188
+ "ndcg_at_100": 0.96109,
189
+ "map_at_1": 0.91786,
190
+ "map_at_3": 0.94524,
191
+ "map_at_5": 0.9481,
192
+ "map_at_10": 0.9481,
193
+ "map_at_20": 0.94839,
194
+ "map_at_50": 0.94854,
195
+ "map_at_100": 0.94854,
196
+ "recall_at_1": 0.91786,
197
+ "recall_at_3": 0.97857,
198
+ "recall_at_5": 0.99286,
199
+ "recall_at_10": 0.99286,
200
+ "recall_at_20": 0.99643,
201
+ "recall_at_50": 1.0,
202
+ "recall_at_100": 1.0,
203
+ "precision_at_1": 0.91786,
204
+ "precision_at_3": 0.32619,
205
+ "precision_at_5": 0.19857,
206
+ "precision_at_10": 0.09929,
207
+ "precision_at_20": 0.04982,
208
+ "precision_at_50": 0.02,
209
+ "precision_at_100": 0.01,
210
+ "mrr_at_1": 0.9178571428571428,
211
+ "mrr_at_3": 0.9452380952380953,
212
+ "mrr_at_5": 0.948095238095238,
213
+ "mrr_at_10": 0.948095238095238,
214
+ "mrr_at_20": 0.9483928571428571,
215
+ "mrr_at_50": 0.9485416666666666,
216
+ "mrr_at_100": 0.9485416666666666,
217
+ "naucs_at_1_max": 0.04487882109365634,
218
+ "naucs_at_1_std": 0.15499533146591998,
219
+ "naucs_at_1_diff1": 0.928754110339789,
220
+ "naucs_at_3_max": 0.8358232181761669,
221
+ "naucs_at_3_std": 0.9101307189542569,
222
+ "naucs_at_3_diff1": 1.0,
223
+ "naucs_at_5_max": 0.9346405228758269,
224
+ "naucs_at_5_std": 0.9346405228758269,
225
+ "naucs_at_5_diff1": 1.0,
226
+ "naucs_at_10_max": 0.9346405228758269,
227
+ "naucs_at_10_std": 0.9346405228758269,
228
+ "naucs_at_10_diff1": 1.0,
229
+ "naucs_at_20_max": 1.0,
230
+ "naucs_at_20_std": 1.0,
231
+ "naucs_at_20_diff1": 1.0,
232
+ "naucs_at_50_max": 1.0,
233
+ "naucs_at_50_std": 1.0,
234
+ "naucs_at_50_diff1": 1.0,
235
+ "naucs_at_100_max": 1.0,
236
+ "naucs_at_100_std": 1.0,
237
+ "naucs_at_100_diff1": 1.0
238
+ },
239
+ "vidore/tatdqa_test": {
240
+ "ndcg_at_1": 0.70535,
241
+ "ndcg_at_3": 0.7868,
242
+ "ndcg_at_5": 0.80621,
243
+ "ndcg_at_10": 0.82194,
244
+ "ndcg_at_20": 0.82672,
245
+ "ndcg_at_50": 0.83053,
246
+ "ndcg_at_100": 0.83277,
247
+ "map_at_1": 0.70535,
248
+ "map_at_3": 0.76742,
249
+ "map_at_5": 0.77826,
250
+ "map_at_10": 0.78488,
251
+ "map_at_20": 0.78628,
252
+ "map_at_50": 0.78693,
253
+ "map_at_100": 0.78712,
254
+ "recall_at_1": 0.70535,
255
+ "recall_at_3": 0.84265,
256
+ "recall_at_5": 0.88943,
257
+ "recall_at_10": 0.93742,
258
+ "recall_at_20": 0.95565,
259
+ "recall_at_50": 0.97448,
260
+ "recall_at_100": 0.98846,
261
+ "precision_at_1": 0.70535,
262
+ "precision_at_3": 0.28088,
263
+ "precision_at_5": 0.17789,
264
+ "precision_at_10": 0.09374,
265
+ "precision_at_20": 0.04778,
266
+ "precision_at_50": 0.01949,
267
+ "precision_at_100": 0.00988,
268
+ "mrr_at_1": 0.7035236938031592,
269
+ "mrr_at_3": 0.767010935601458,
270
+ "mrr_at_5": 0.7775212636695018,
271
+ "mrr_at_10": 0.7841125190456905,
272
+ "mrr_at_20": 0.7855296898659594,
273
+ "mrr_at_50": 0.7861619428321974,
274
+ "mrr_at_100": 0.7863629532691659,
275
+ "naucs_at_1_max": 0.22208704220171518,
276
+ "naucs_at_1_std": 0.15415493193166072,
277
+ "naucs_at_1_diff1": 0.85100407554343,
278
+ "naucs_at_3_max": 0.21518686518713465,
279
+ "naucs_at_3_std": 0.23043576383644288,
280
+ "naucs_at_3_diff1": 0.8025394364016292,
281
+ "naucs_at_5_max": 0.2641753480647899,
282
+ "naucs_at_5_std": 0.2706978631234192,
283
+ "naucs_at_5_diff1": 0.7806280461098983,
284
+ "naucs_at_10_max": 0.2523834187563826,
285
+ "naucs_at_10_std": 0.32315905277258156,
286
+ "naucs_at_10_diff1": 0.7532415640618384,
287
+ "naucs_at_20_max": 0.26510647248798225,
288
+ "naucs_at_20_std": 0.3525755808522635,
289
+ "naucs_at_20_diff1": 0.7348455395997588,
290
+ "naucs_at_50_max": 0.242804665789723,
291
+ "naucs_at_50_std": 0.44822875988285776,
292
+ "naucs_at_50_diff1": 0.7286029440909012,
293
+ "naucs_at_100_max": 0.05932622574605986,
294
+ "naucs_at_100_std": 0.28624024988935604,
295
+ "naucs_at_100_diff1": 0.7502131245549767
296
+ },
297
+ "vidore/shiftproject_test": {
298
+ "ndcg_at_1": 0.81,
299
+ "ndcg_at_3": 0.88678,
300
+ "ndcg_at_5": 0.907,
301
+ "ndcg_at_10": 0.907,
302
+ "ndcg_at_20": 0.907,
303
+ "ndcg_at_50": 0.90904,
304
+ "ndcg_at_100": 0.90904,
305
+ "map_at_1": 0.81,
306
+ "map_at_3": 0.86833,
307
+ "map_at_5": 0.87933,
308
+ "map_at_10": 0.87933,
309
+ "map_at_20": 0.87933,
310
+ "map_at_50": 0.87968,
311
+ "map_at_100": 0.87968,
312
+ "recall_at_1": 0.81,
313
+ "recall_at_3": 0.94,
314
+ "recall_at_5": 0.99,
315
+ "recall_at_10": 0.99,
316
+ "recall_at_20": 0.99,
317
+ "recall_at_50": 1.0,
318
+ "recall_at_100": 1.0,
319
+ "precision_at_1": 0.81,
320
+ "precision_at_3": 0.31333,
321
+ "precision_at_5": 0.198,
322
+ "precision_at_10": 0.099,
323
+ "precision_at_20": 0.0495,
324
+ "precision_at_50": 0.02,
325
+ "precision_at_100": 0.01,
326
+ "mrr_at_1": 0.81,
327
+ "mrr_at_3": 0.8683333333333334,
328
+ "mrr_at_5": 0.8793333333333334,
329
+ "mrr_at_10": 0.8793333333333334,
330
+ "mrr_at_20": 0.8793333333333334,
331
+ "mrr_at_50": 0.8796781609195403,
332
+ "mrr_at_100": 0.8796781609195403,
333
+ "naucs_at_1_max": -0.18730682592068792,
334
+ "naucs_at_1_std": -0.7260202210697273,
335
+ "naucs_at_1_diff1": 0.8433652889098441,
336
+ "naucs_at_3_max": 0.3544195455960126,
337
+ "naucs_at_3_std": -0.26914098972922335,
338
+ "naucs_at_3_diff1": 0.9319172113289744,
339
+ "naucs_at_5_max": 0.12278244631185926,
340
+ "naucs_at_5_std": 0.35807656395891135,
341
+ "naucs_at_5_diff1": 1.0,
342
+ "naucs_at_10_max": 0.12278244631185926,
343
+ "naucs_at_10_std": 0.35807656395891135,
344
+ "naucs_at_10_diff1": 1.0,
345
+ "naucs_at_20_max": 0.12278244631185926,
346
+ "naucs_at_20_std": 0.35807656395891135,
347
+ "naucs_at_20_diff1": 1.0,
348
+ "naucs_at_50_max": null,
349
+ "naucs_at_50_std": null,
350
+ "naucs_at_50_diff1": null,
351
+ "naucs_at_100_max": null,
352
+ "naucs_at_100_std": null,
353
+ "naucs_at_100_diff1": null
354
+ },
355
+ "vidore/syntheticDocQA_artificial_intelligence_test": {
356
+ "ndcg_at_1": 0.99,
357
+ "ndcg_at_3": 0.99631,
358
+ "ndcg_at_5": 0.99631,
359
+ "ndcg_at_10": 0.99631,
360
+ "ndcg_at_20": 0.99631,
361
+ "ndcg_at_50": 0.99631,
362
+ "ndcg_at_100": 0.99631,
363
+ "map_at_1": 0.99,
364
+ "map_at_3": 0.995,
365
+ "map_at_5": 0.995,
366
+ "map_at_10": 0.995,
367
+ "map_at_20": 0.995,
368
+ "map_at_50": 0.995,
369
+ "map_at_100": 0.995,
370
+ "recall_at_1": 0.99,
371
+ "recall_at_3": 1.0,
372
+ "recall_at_5": 1.0,
373
+ "recall_at_10": 1.0,
374
+ "recall_at_20": 1.0,
375
+ "recall_at_50": 1.0,
376
+ "recall_at_100": 1.0,
377
+ "precision_at_1": 0.99,
378
+ "precision_at_3": 0.33333,
379
+ "precision_at_5": 0.2,
380
+ "precision_at_10": 0.1,
381
+ "precision_at_20": 0.05,
382
+ "precision_at_50": 0.02,
383
+ "precision_at_100": 0.01,
384
+ "mrr_at_1": 0.99,
385
+ "mrr_at_3": 0.995,
386
+ "mrr_at_5": 0.995,
387
+ "mrr_at_10": 0.995,
388
+ "mrr_at_20": 0.995,
389
+ "mrr_at_50": 0.995,
390
+ "mrr_at_100": 0.995,
391
+ "naucs_at_1_max": 0.12278244631185359,
392
+ "naucs_at_1_std": 0.12278244631185359,
393
+ "naucs_at_1_diff1": 1.0,
394
+ "naucs_at_3_max": 1.0,
395
+ "naucs_at_3_std": 1.0,
396
+ "naucs_at_3_diff1": 1.0,
397
+ "naucs_at_5_max": 1.0,
398
+ "naucs_at_5_std": 1.0,
399
+ "naucs_at_5_diff1": 1.0,
400
+ "naucs_at_10_max": 1.0,
401
+ "naucs_at_10_std": 1.0,
402
+ "naucs_at_10_diff1": 1.0,
403
+ "naucs_at_20_max": 1.0,
404
+ "naucs_at_20_std": 1.0,
405
+ "naucs_at_20_diff1": 1.0,
406
+ "naucs_at_50_max": null,
407
+ "naucs_at_50_std": null,
408
+ "naucs_at_50_diff1": null,
409
+ "naucs_at_100_max": null,
410
+ "naucs_at_100_std": null,
411
+ "naucs_at_100_diff1": null
412
+ },
413
+ "vidore/syntheticDocQA_energy_test": {
414
+ "ndcg_at_1": 0.96,
415
+ "ndcg_at_3": 0.96631,
416
+ "ndcg_at_5": 0.96631,
417
+ "ndcg_at_10": 0.96946,
418
+ "ndcg_at_20": 0.97209,
419
+ "ndcg_at_50": 0.97406,
420
+ "ndcg_at_100": 0.97406,
421
+ "map_at_1": 0.96,
422
+ "map_at_3": 0.965,
423
+ "map_at_5": 0.965,
424
+ "map_at_10": 0.96625,
425
+ "map_at_20": 0.96702,
426
+ "map_at_50": 0.96732,
427
+ "map_at_100": 0.96732,
428
+ "recall_at_1": 0.96,
429
+ "recall_at_3": 0.97,
430
+ "recall_at_5": 0.97,
431
+ "recall_at_10": 0.98,
432
+ "recall_at_20": 0.99,
433
+ "recall_at_50": 1.0,
434
+ "recall_at_100": 1.0,
435
+ "precision_at_1": 0.96,
436
+ "precision_at_3": 0.32333,
437
+ "precision_at_5": 0.194,
438
+ "precision_at_10": 0.098,
439
+ "precision_at_20": 0.0495,
440
+ "precision_at_50": 0.02,
441
+ "precision_at_100": 0.01,
442
+ "mrr_at_1": 0.96,
443
+ "mrr_at_3": 0.965,
444
+ "mrr_at_5": 0.965,
445
+ "mrr_at_10": 0.96625,
446
+ "mrr_at_20": 0.9670192307692308,
447
+ "mrr_at_50": 0.9673222610722612,
448
+ "mrr_at_100": 0.9673222610722612,
449
+ "naucs_at_1_max": 0.7169701213818873,
450
+ "naucs_at_1_std": -0.03863211951446941,
451
+ "naucs_at_1_diff1": 1.0,
452
+ "naucs_at_3_max": 0.7152194211017727,
453
+ "naucs_at_3_std": -0.34126984126984133,
454
+ "naucs_at_3_diff1": 1.0,
455
+ "naucs_at_5_max": 0.7152194211017693,
456
+ "naucs_at_5_std": -0.3412698412698435,
457
+ "naucs_at_5_diff1": 1.0,
458
+ "naucs_at_10_max": 0.7957516339869297,
459
+ "naucs_at_10_std": 0.35807656395892185,
460
+ "naucs_at_10_diff1": 1.0,
461
+ "naucs_at_20_max": 0.7222222222222276,
462
+ "naucs_at_20_std": 0.35807656395891135,
463
+ "naucs_at_20_diff1": 1.0,
464
+ "naucs_at_50_max": null,
465
+ "naucs_at_50_std": null,
466
+ "naucs_at_50_diff1": null,
467
+ "naucs_at_100_max": null,
468
+ "naucs_at_100_std": null,
469
+ "naucs_at_100_diff1": null
470
+ },
471
+ "vidore/syntheticDocQA_government_reports_test": {
472
+ "ndcg_at_1": 0.95,
473
+ "ndcg_at_3": 0.97393,
474
+ "ndcg_at_5": 0.97823,
475
+ "ndcg_at_10": 0.97823,
476
+ "ndcg_at_20": 0.97823,
477
+ "ndcg_at_50": 0.97823,
478
+ "ndcg_at_100": 0.97823,
479
+ "map_at_1": 0.95,
480
+ "map_at_3": 0.96833,
481
+ "map_at_5": 0.97083,
482
+ "map_at_10": 0.97083,
483
+ "map_at_20": 0.97083,
484
+ "map_at_50": 0.97083,
485
+ "map_at_100": 0.97083,
486
+ "recall_at_1": 0.95,
487
+ "recall_at_3": 0.99,
488
+ "recall_at_5": 1.0,
489
+ "recall_at_10": 1.0,
490
+ "recall_at_20": 1.0,
491
+ "recall_at_50": 1.0,
492
+ "recall_at_100": 1.0,
493
+ "precision_at_1": 0.95,
494
+ "precision_at_3": 0.33,
495
+ "precision_at_5": 0.2,
496
+ "precision_at_10": 0.1,
497
+ "precision_at_20": 0.05,
498
+ "precision_at_50": 0.02,
499
+ "precision_at_100": 0.01,
500
+ "mrr_at_1": 0.95,
501
+ "mrr_at_3": 0.9683333333333333,
502
+ "mrr_at_5": 0.9708333333333333,
503
+ "mrr_at_10": 0.9708333333333333,
504
+ "mrr_at_20": 0.9708333333333333,
505
+ "mrr_at_50": 0.9708333333333333,
506
+ "mrr_at_100": 0.9708333333333333,
507
+ "naucs_at_1_max": 0.6765639589168986,
508
+ "naucs_at_1_std": 0.5556489262371623,
509
+ "naucs_at_1_diff1": 0.9738562091503253,
510
+ "naucs_at_3_max": 1.0,
511
+ "naucs_at_3_std": 0.8692810457516356,
512
+ "naucs_at_3_diff1": 0.8692810457516356,
513
+ "naucs_at_5_max": 1.0,
514
+ "naucs_at_5_std": 1.0,
515
+ "naucs_at_5_diff1": 1.0,
516
+ "naucs_at_10_max": 1.0,
517
+ "naucs_at_10_std": 1.0,
518
+ "naucs_at_10_diff1": 1.0,
519
+ "naucs_at_20_max": 1.0,
520
+ "naucs_at_20_std": 1.0,
521
+ "naucs_at_20_diff1": 1.0,
522
+ "naucs_at_50_max": null,
523
+ "naucs_at_50_std": null,
524
+ "naucs_at_50_diff1": null,
525
+ "naucs_at_100_max": null,
526
+ "naucs_at_100_std": null,
527
+ "naucs_at_100_diff1": null
528
+ },
+ "vidore/syntheticDocQA_healthcare_industry_test": {
+ "ndcg_at_1": 0.98,
+ "ndcg_at_3": 0.99262,
+ "ndcg_at_5": 0.99262,
+ "ndcg_at_10": 0.99262,
+ "ndcg_at_20": 0.99262,
+ "ndcg_at_50": 0.99262,
+ "ndcg_at_100": 0.99262,
+ "map_at_1": 0.98,
+ "map_at_3": 0.99,
+ "map_at_5": 0.99,
+ "map_at_10": 0.99,
+ "map_at_20": 0.99,
+ "map_at_50": 0.99,
+ "map_at_100": 0.99,
+ "recall_at_1": 0.98,
+ "recall_at_3": 1.0,
+ "recall_at_5": 1.0,
+ "recall_at_10": 1.0,
+ "recall_at_20": 1.0,
+ "recall_at_50": 1.0,
+ "recall_at_100": 1.0,
+ "precision_at_1": 0.98,
+ "precision_at_3": 0.33333,
+ "precision_at_5": 0.2,
+ "precision_at_10": 0.1,
+ "precision_at_20": 0.05,
+ "precision_at_50": 0.02,
+ "precision_at_100": 0.01,
+ "mrr_at_1": 0.98,
+ "mrr_at_3": 0.99,
+ "mrr_at_5": 0.99,
+ "mrr_at_10": 0.99,
+ "mrr_at_20": 0.99,
+ "mrr_at_50": 0.99,
+ "mrr_at_100": 0.99,
+ "naucs_at_1_max": 0.6381886087768457,
+ "naucs_at_1_std": -0.14122315592903503,
+ "naucs_at_1_diff1": 1.0,
+ "naucs_at_3_max": 1.0,
+ "naucs_at_3_std": 1.0,
+ "naucs_at_3_diff1": 1.0,
+ "naucs_at_5_max": 1.0,
+ "naucs_at_5_std": 1.0,
+ "naucs_at_5_diff1": 1.0,
+ "naucs_at_10_max": 1.0,
+ "naucs_at_10_std": 1.0,
+ "naucs_at_10_diff1": 1.0,
+ "naucs_at_20_max": 1.0,
+ "naucs_at_20_std": 1.0,
+ "naucs_at_20_diff1": 1.0,
+ "naucs_at_50_max": null,
+ "naucs_at_50_std": null,
+ "naucs_at_50_diff1": null,
+ "naucs_at_100_max": null,
+ "naucs_at_100_std": null,
+ "naucs_at_100_diff1": null
+ },
+ "vidore/synthetic_rse_restaurant_filtered_v1.0_multilingual": {
+ "ndcg_at_1": 0.49561,
+ "ndcg_at_3": 0.53551,
+ "ndcg_at_5": 0.57573,
+ "ndcg_at_10": 0.62797,
+ "ndcg_at_20": 0.66435,
+ "ndcg_at_50": 0.68753,
+ "ndcg_at_100": 0.69778,
+ "map_at_1": 0.24991,
+ "map_at_3": 0.39804,
+ "map_at_5": 0.45596,
+ "map_at_10": 0.50329,
+ "map_at_20": 0.52856,
+ "map_at_50": 0.54302,
+ "map_at_100": 0.54853,
+ "recall_at_1": 0.24991,
+ "recall_at_3": 0.49812,
+ "recall_at_5": 0.62081,
+ "recall_at_10": 0.78505,
+ "recall_at_20": 0.89032,
+ "recall_at_50": 0.94781,
+ "recall_at_100": 0.97423,
+ "precision_at_1": 0.49561,
+ "precision_at_3": 0.37865,
+ "precision_at_5": 0.30965,
+ "precision_at_10": 0.20921,
+ "precision_at_20": 0.13246,
+ "precision_at_50": 0.06912,
+ "precision_at_100": 0.03846,
+ "mrr_at_1": 0.4956140350877193,
+ "mrr_at_3": 0.6271929824561403,
+ "mrr_at_5": 0.642982456140351,
+ "mrr_at_10": 0.6540744221665276,
+ "mrr_at_20": 0.6558978051818611,
+ "mrr_at_50": 0.6561780443192584,
+ "mrr_at_100": 0.6561780443192584,
+ "naucs_at_1_max": -0.07400167664801681,
+ "naucs_at_1_std": 0.03458415022230023,
+ "naucs_at_1_diff1": 0.36837819306504144,
+ "naucs_at_3_max": -0.12444851869211698,
+ "naucs_at_3_std": -0.03350495496687875,
+ "naucs_at_3_diff1": 0.14678568781036586,
+ "naucs_at_5_max": -0.18380991972520577,
+ "naucs_at_5_std": -0.04485546676356389,
+ "naucs_at_5_diff1": 0.09319332805741351,
+ "naucs_at_10_max": -0.2287733254906937,
+ "naucs_at_10_std": -0.11817355407871401,
+ "naucs_at_10_diff1": 0.039989126851164826,
+ "naucs_at_20_max": -0.26978462701811906,
+ "naucs_at_20_std": -0.17072707397422024,
+ "naucs_at_20_diff1": -0.043988113501541394,
+ "naucs_at_50_max": -0.2775075319567234,
+ "naucs_at_50_std": -0.20957246437121108,
+ "naucs_at_50_diff1": -0.1133061107255248,
+ "naucs_at_100_max": -0.27585910810842096,
+ "naucs_at_100_std": -0.2097908968784823,
+ "naucs_at_100_diff1": -0.14037054801741544
+ },
+ "vidore/synthetic_mit_biomedical_tissue_interactions_unfiltered_multilingual": {
+ "ndcg_at_1": 0.60469,
+ "ndcg_at_3": 0.61057,
+ "ndcg_at_5": 0.63196,
+ "ndcg_at_10": 0.66415,
+ "ndcg_at_20": 0.68919,
+ "ndcg_at_50": 0.71209,
+ "ndcg_at_100": 0.72404,
+ "map_at_1": 0.3749,
+ "map_at_3": 0.50426,
+ "map_at_5": 0.54282,
+ "map_at_10": 0.57771,
+ "map_at_20": 0.59195,
+ "map_at_50": 0.60066,
+ "map_at_100": 0.60317,
+ "recall_at_1": 0.3749,
+ "recall_at_3": 0.56825,
+ "recall_at_5": 0.65837,
+ "recall_at_10": 0.75073,
+ "recall_at_20": 0.81942,
+ "recall_at_50": 0.8876,
+ "recall_at_100": 0.93527,
+ "precision_at_1": 0.60469,
+ "precision_at_3": 0.37448,
+ "precision_at_5": 0.28031,
+ "precision_at_10": 0.17922,
+ "precision_at_20": 0.10523,
+ "precision_at_50": 0.05022,
+ "precision_at_100": 0.02742,
+ "mrr_at_1": 0.6046875,
+ "mrr_at_3": 0.693489583333333,
+ "mrr_at_5": 0.7104427083333327,
+ "mrr_at_10": 0.7158568948412692,
+ "mrr_at_20": 0.7190933507966805,
+ "mrr_at_50": 0.7196874906299467,
+ "mrr_at_100": 0.7198739903840827,
+ "naucs_at_1_max": 0.21052397754348515,
+ "naucs_at_1_std": 0.09375197289505234,
+ "naucs_at_1_diff1": 0.5111101758127156,
+ "naucs_at_3_max": 0.06090465086494804,
+ "naucs_at_3_std": -0.001418024019873419,
+ "naucs_at_3_diff1": -0.03565597745007234,
+ "naucs_at_5_max": 0.004465748594415919,
+ "naucs_at_5_std": -0.0484521722756207,
+ "naucs_at_5_diff1": -0.1245720478106472,
+ "naucs_at_10_max": -0.059499017411910264,
+ "naucs_at_10_std": -0.07086245514678893,
+ "naucs_at_10_diff1": -0.22255807507622197,
+ "naucs_at_20_max": -0.08861071305293747,
+ "naucs_at_20_std": -0.04972647301862899,
+ "naucs_at_20_diff1": -0.28111304038576185,
+ "naucs_at_50_max": -0.0772093711850375,
+ "naucs_at_50_std": -0.03833832084634795,
+ "naucs_at_50_diff1": -0.3229404564565436,
+ "naucs_at_100_max": -0.09326229510606512,
+ "naucs_at_100_std": -0.05851062266000862,
+ "naucs_at_100_diff1": -0.33725273449156307
+ },
+ "vidore/synthetics_economics_macro_economy_2024_filtered_v1.0_multilingual": {
+ "ndcg_at_1": 0.65086,
+ "ndcg_at_3": 0.60789,
+ "ndcg_at_5": 0.57982,
+ "ndcg_at_10": 0.56497,
+ "ndcg_at_20": 0.5906,
+ "ndcg_at_50": 0.66142,
+ "ndcg_at_100": 0.6981,
+ "map_at_1": 0.09835,
+ "map_at_3": 0.19265,
+ "map_at_5": 0.2438,
+ "map_at_10": 0.31201,
+ "map_at_20": 0.37105,
+ "map_at_50": 0.43282,
+ "map_at_100": 0.45993,
+ "recall_at_1": 0.09835,
+ "recall_at_3": 0.2383,
+ "recall_at_5": 0.31559,
+ "recall_at_10": 0.44357,
+ "recall_at_20": 0.59021,
+ "recall_at_50": 0.79179,
+ "recall_at_100": 0.90968,
+ "precision_at_1": 0.65086,
+ "precision_at_3": 0.55316,
+ "precision_at_5": 0.49828,
+ "precision_at_10": 0.40603,
+ "precision_at_20": 0.30948,
+ "precision_at_50": 0.1981,
+ "precision_at_100": 0.12724,
+ "mrr_at_1": 0.6508620689655172,
+ "mrr_at_3": 0.7586206896551727,
+ "mrr_at_5": 0.7706896551724141,
+ "mrr_at_10": 0.7741362205801864,
+ "mrr_at_20": 0.7757055120898937,
+ "mrr_at_50": 0.7760672193481526,
+ "mrr_at_100": 0.7760672193481526,
+ "naucs_at_1_max": -0.1577982313304122,
+ "naucs_at_1_std": 0.07083760025943213,
+ "naucs_at_1_diff1": 0.1180440061698451,
+ "naucs_at_3_max": -0.03551684594794198,
+ "naucs_at_3_std": 0.18649544217765762,
+ "naucs_at_3_diff1": 0.006931180468183028,
+ "naucs_at_5_max": -0.042439023438686566,
+ "naucs_at_5_std": 0.1463028288463992,
+ "naucs_at_5_diff1": 0.0052961279206988725,
+ "naucs_at_10_max": -0.014346231321749392,
+ "naucs_at_10_std": 0.13820096240926596,
+ "naucs_at_10_diff1": 0.060959204965535974,
+ "naucs_at_20_max": -0.04036150486209418,
+ "naucs_at_20_std": 0.10161400684234778,
+ "naucs_at_20_diff1": 0.058238772027959955,
+ "naucs_at_50_max": -0.027805254364547293,
+ "naucs_at_50_std": 0.06305093612338106,
+ "naucs_at_50_diff1": 0.014479645829478357,
+ "naucs_at_100_max": -0.04967371405554246,
+ "naucs_at_100_std": -0.0014108802561097272,
+ "naucs_at_100_diff1": 0.004463197803405348
+ },
+ "vidore/restaurant_esg_reports_beir": {
+ "ndcg_at_1": 0.66026,
+ "ndcg_at_3": 0.71844,
+ "ndcg_at_5": 0.74746,
+ "ndcg_at_10": 0.78463,
+ "ndcg_at_20": 0.79647,
+ "ndcg_at_50": 0.80898,
+ "ndcg_at_100": 0.81195,
+ "map_at_1": 0.46731,
+ "map_at_3": 0.6321,
+ "map_at_5": 0.67856,
+ "map_at_10": 0.71154,
+ "map_at_20": 0.71984,
+ "map_at_50": 0.72621,
+ "map_at_100": 0.72685,
+ "recall_at_1": 0.46731,
+ "recall_at_3": 0.71218,
+ "recall_at_5": 0.7989,
+ "recall_at_10": 0.89575,
+ "recall_at_20": 0.92767,
+ "recall_at_50": 0.97191,
+ "recall_at_100": 0.981,
+ "precision_at_1": 0.67308,
+ "precision_at_3": 0.41026,
+ "precision_at_5": 0.30769,
+ "precision_at_10": 0.18654,
+ "precision_at_20": 0.10096,
+ "precision_at_50": 0.04462,
+ "precision_at_100": 0.02308,
+ "mrr_at_1": 0.6730769230769231,
+ "mrr_at_3": 0.7756410256410257,
+ "mrr_at_5": 0.7852564102564104,
+ "mrr_at_10": 0.7940705128205129,
+ "mrr_at_20": 0.7940705128205129,
+ "mrr_at_50": 0.7949862637362638,
+ "mrr_at_100": 0.7949862637362638,
+ "naucs_at_1_max": 0.2155210634937776,
+ "naucs_at_1_std": 0.20643949634517547,
+ "naucs_at_1_diff1": 0.5572918449312647,
+ "naucs_at_3_max": -0.22870270992303324,
+ "naucs_at_3_std": -0.16008064610097344,
+ "naucs_at_3_diff1": -0.1052498381156227,
+ "naucs_at_5_max": -0.20249887866652966,
+ "naucs_at_5_std": -0.10577079083523987,
+ "naucs_at_5_diff1": -0.21560021268353965,
+ "naucs_at_10_max": -0.13190257359821772,
+ "naucs_at_10_std": -0.02595056097795122,
+ "naucs_at_10_diff1": -0.25059078573348686,
+ "naucs_at_20_max": -0.10804998986280905,
+ "naucs_at_20_std": 0.009844091874970871,
+ "naucs_at_20_diff1": -0.3106578658424983,
+ "naucs_at_50_max": -0.14217724030501422,
+ "naucs_at_50_std": -0.0010783625765841411,
+ "naucs_at_50_diff1": -0.2900029814490325,
+ "naucs_at_100_max": -0.15203872114868228,
+ "naucs_at_100_std": -0.01643757244869887,
+ "naucs_at_100_diff1": -0.2814362025783248
+ },
+ "vidore/synthetic_rse_restaurant_filtered_v1.0": {
+ "ndcg_at_1": 0.52632,
+ "ndcg_at_3": 0.55122,
+ "ndcg_at_5": 0.58777,
+ "ndcg_at_10": 0.63895,
+ "ndcg_at_20": 0.66853,
+ "ndcg_at_50": 0.69944,
+ "ndcg_at_100": 0.7087,
+ "map_at_1": 0.28596,
+ "map_at_3": 0.42149,
+ "map_at_5": 0.48239,
+ "map_at_10": 0.52458,
+ "map_at_20": 0.54701,
+ "map_at_50": 0.56361,
+ "map_at_100": 0.56876,
+ "recall_at_1": 0.28596,
+ "recall_at_3": 0.5136,
+ "recall_at_5": 0.6131,
+ "recall_at_10": 0.77097,
+ "recall_at_20": 0.8575,
+ "recall_at_50": 0.93975,
+ "recall_at_100": 0.96272,
+ "precision_at_1": 0.52632,
+ "precision_at_3": 0.37427,
+ "precision_at_5": 0.31228,
+ "precision_at_10": 0.21053,
+ "precision_at_20": 0.12807,
+ "precision_at_50": 0.06947,
+ "precision_at_100": 0.03842,
+ "mrr_at_1": 0.5263157894736842,
+ "mrr_at_3": 0.6461988304093568,
+ "mrr_at_5": 0.6549707602339182,
+ "mrr_at_10": 0.6659844054580898,
+ "mrr_at_20": 0.6673339331234068,
+ "mrr_at_50": 0.6679388938009629,
+ "mrr_at_100": 0.6679388938009629,
+ "naucs_at_1_max": -0.018448353550608226,
+ "naucs_at_1_std": 0.0821493132969922,
+ "naucs_at_1_diff1": 0.30290509894323814,
+ "naucs_at_3_max": -0.16441638665467292,
+ "naucs_at_3_std": 0.05181451792946125,
+ "naucs_at_3_diff1": 0.17081667434507056,
+ "naucs_at_5_max": -0.2223104695809391,
+ "naucs_at_5_std": -0.0034901998225501346,
+ "naucs_at_5_diff1": 0.11827209266301716,
+ "naucs_at_10_max": -0.3490810408001137,
+ "naucs_at_10_std": -0.10137127012539782,
+ "naucs_at_10_diff1": 0.024290790916341346,
+ "naucs_at_20_max": -0.41179472120133376,
+ "naucs_at_20_std": -0.18184189514711724,
+ "naucs_at_20_diff1": -0.026948094739752244,
+ "naucs_at_50_max": -0.4169831140986695,
+ "naucs_at_50_std": -0.2419827968566681,
+ "naucs_at_50_diff1": -0.14707184113572777,
+ "naucs_at_100_max": -0.41046730276590754,
+ "naucs_at_100_std": -0.2436064377498405,
+ "naucs_at_100_diff1": -0.16240322922206768
+ },
+ "vidore/synthetic_economics_macro_economy_2024_filtered_v1.0": {
+ "ndcg_at_1": 0.7931,
+ "ndcg_at_3": 0.69066,
+ "ndcg_at_5": 0.65993,
+ "ndcg_at_10": 0.62291,
+ "ndcg_at_20": 0.64258,
+ "ndcg_at_50": 0.70498,
+ "ndcg_at_100": 0.73984,
+ "map_at_1": 0.11768,
+ "map_at_3": 0.22479,
+ "map_at_5": 0.28825,
+ "map_at_10": 0.35934,
+ "map_at_20": 0.42423,
+ "map_at_50": 0.48412,
+ "map_at_100": 0.51142,
+ "recall_at_1": 0.11768,
+ "recall_at_3": 0.26124,
+ "recall_at_5": 0.34909,
+ "recall_at_10": 0.46824,
+ "recall_at_20": 0.62417,
+ "recall_at_50": 0.80904,
+ "recall_at_100": 0.92098,
+ "precision_at_1": 0.7931,
+ "precision_at_3": 0.62069,
+ "precision_at_5": 0.56552,
+ "precision_at_10": 0.43966,
+ "precision_at_20": 0.325,
+ "precision_at_50": 0.20172,
+ "precision_at_100": 0.12914,
+ "mrr_at_1": 0.7931034482758621,
+ "mrr_at_3": 0.8534482758620691,
+ "mrr_at_5": 0.8568965517241378,
+ "mrr_at_10": 0.8597701149425288,
+ "mrr_at_20": 0.8620158408190863,
+ "mrr_at_50": 0.8620158408190863,
+ "mrr_at_100": 0.8620158408190863,
+ "naucs_at_1_max": 0.271935938073248,
+ "naucs_at_1_std": 0.43674674991288676,
+ "naucs_at_1_diff1": 0.3617820074090711,
+ "naucs_at_3_max": 0.045282507585294354,
+ "naucs_at_3_std": 0.14844039042832088,
+ "naucs_at_3_diff1": 0.11059599624516278,
+ "naucs_at_5_max": 0.06086984174322351,
+ "naucs_at_5_std": 0.15073825039690825,
+ "naucs_at_5_diff1": 0.07597281303252662,
+ "naucs_at_10_max": 0.030803253605626704,
+ "naucs_at_10_std": 0.12023189876899391,
+ "naucs_at_10_diff1": 0.09409994930103974,
+ "naucs_at_20_max": 0.06111477495291343,
+ "naucs_at_20_std": 0.1377686346379398,
+ "naucs_at_20_diff1": 0.08342034215323584,
+ "naucs_at_50_max": -0.0214422324455313,
+ "naucs_at_50_std": 0.0629645228435432,
+ "naucs_at_50_diff1": 0.03433912496739525,
+ "naucs_at_100_max": -0.07270216260742496,
+ "naucs_at_100_std": -0.009713954266926159,
+ "naucs_at_100_diff1": -0.00289458364944889
+ },
+ "vidore/synthetic_mit_biomedical_tissue_interactions_unfiltered": {
+ "ndcg_at_1": 0.63125,
+ "ndcg_at_3": 0.63272,
+ "ndcg_at_5": 0.65663,
+ "ndcg_at_10": 0.68997,
+ "ndcg_at_20": 0.71467,
+ "ndcg_at_50": 0.73633,
+ "ndcg_at_100": 0.74718,
+ "map_at_1": 0.39991,
+ "map_at_3": 0.5245,
+ "map_at_5": 0.57059,
+ "map_at_10": 0.6074,
+ "map_at_20": 0.62094,
+ "map_at_50": 0.62966,
+ "map_at_100": 0.63202,
+ "recall_at_1": 0.39991,
+ "recall_at_3": 0.57796,
+ "recall_at_5": 0.67437,
+ "recall_at_10": 0.77096,
+ "recall_at_20": 0.84121,
+ "recall_at_50": 0.90687,
+ "recall_at_100": 0.94874,
+ "precision_at_1": 0.63125,
+ "precision_at_3": 0.38958,
+ "precision_at_5": 0.29375,
+ "precision_at_10": 0.18688,
+ "precision_at_20": 0.10844,
+ "precision_at_50": 0.05075,
+ "precision_at_100": 0.02756,
+ "mrr_at_1": 0.63125,
+ "mrr_at_3": 0.715625,
+ "mrr_at_5": 0.7274999999999998,
+ "mrr_at_10": 0.7328447420634919,
+ "mrr_at_20": 0.7370251189782439,
+ "mrr_at_50": 0.7376058481449105,
+ "mrr_at_100": 0.737786021875342,
+ "naucs_at_1_max": 0.40116615786322174,
+ "naucs_at_1_std": 0.12804177750728857,
+ "naucs_at_1_diff1": 0.5318484780069298,
+ "naucs_at_3_max": 0.031070174638639564,
+ "naucs_at_3_std": -0.03258401349687653,
+ "naucs_at_3_diff1": -0.08211207879257611,
+ "naucs_at_5_max": -0.03804589294380602,
+ "naucs_at_5_std": -0.0638529839983047,
+ "naucs_at_5_diff1": -0.1625118765251509,
+ "naucs_at_10_max": -0.08589738686452425,
+ "naucs_at_10_std": -0.0652662376777225,
+ "naucs_at_10_diff1": -0.27588514824978205,
+ "naucs_at_20_max": -0.14280800843691288,
+ "naucs_at_20_std": -0.050772727555263505,
+ "naucs_at_20_diff1": -0.33260125020460346,
+ "naucs_at_50_max": -0.18794085961226306,
+ "naucs_at_50_std": -0.09315309781822757,
+ "naucs_at_50_diff1": -0.3809855756340824,
+ "naucs_at_100_max": -0.20443798069201818,
+ "naucs_at_100_std": -0.09564469982214785,
+ "naucs_at_100_diff1": -0.416619080759233
+ }
+ }
+ }
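The per-dataset metrics in this results file can be summarized with a few lines of Python. This is a minimal sketch, assuming the results have already been loaded into a dict that maps dataset names to metric dicts; the top-level layout of the file is not visible in this hunk, and `mean_metric` is a hypothetical helper, not part of the repository.

```python
from statistics import mean


def mean_metric(results, metric="ndcg_at_5"):
    """Average one metric over all datasets, skipping missing or null (None) entries."""
    values = [
        scores[metric]
        for scores in results.values()
        if scores.get(metric) is not None
    ]
    return mean(values) if values else None


# Two ndcg_at_5 values copied from the results above.
sample = {
    "vidore/restaurant_esg_reports_beir": {"ndcg_at_5": 0.74746},
    "vidore/synthetic_rse_restaurant_filtered_v1.0": {"ndcg_at_5": 0.58777},
}
print(mean_metric(sample))
```

Skipping `None` matters here because several `naucs_at_50_*` and `naucs_at_100_*` entries are `null` in the file.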
vidore_eval.py CHANGED
@@ -36,6 +36,12 @@ def get_args():
  help='Path to model checkpoint if HF',
  default=''
  )
+ parser.add_argument(
+ '--model_revision',
+ type=str,
+ help='Commit Hash of the model as custom code is downloaded and executed',
+ default=None
+ )
  parser.add_argument(
  '--batch_size',
  type=int,
@@ -74,7 +80,8 @@ if __name__ == "__main__":
  device_map='cuda',
  trust_remote_code=True,
  torch_dtype=torch.bfloat16,
- attn_implementation="flash_attention_2"
+ attn_implementation="flash_attention_2",
+ revision=args.model_revision
  ).eval()

  vidore_evaluator_qa = ViDoReEvaluatorQA(vision_retriever) # ViDoRe-v1
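
The new `--model_revision` flag exists because `trust_remote_code=True` downloads and executes Python code from the model repository, and pinning a commit hash fixes which version of that code runs. The sketch below shows how the flag could flow into the `from_pretrained` keyword arguments. It assumes only the argument names visible in the diff and in the README command; `build_load_kwargs` is a hypothetical helper, `<commit-hash>` is a placeholder, and no model is loaded.

```python
import argparse


def get_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('--model_name_or_path', type=str, default='',
                        help='Path to model checkpoint if HF')
    parser.add_argument('--model_revision', type=str, default=None,
                        help='Commit hash of the model, since custom code is downloaded and executed')
    return parser.parse_args(argv)


def build_load_kwargs(args):
    """Collect the keyword arguments the script passes to from_pretrained (hypothetical helper)."""
    return {
        "trust_remote_code": True,
        "attn_implementation": "flash_attention_2",
        # None when the flag is omitted; otherwise the pinned commit hash.
        "revision": args.model_revision,
    }


args = get_args([
    "--model_name_or_path", "nvidia/llama-nemoretriever-colembed-3b-v1",
    "--model_revision", "<commit-hash>",  # placeholder, replace with a real commit hash
])
print(build_load_kwargs(args))
```

Keeping the default at `None` leaves existing invocations of the script unchanged while letting reproducible runs pin an exact revision.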