Update README.md
README.md
CHANGED
@@ -27,6 +27,25 @@ This model is a fine-tuned version of [Qwen/Qwen2.5-VL-7B-Instruct](https://hugg
**A quick demo is shown below:**
<details>
<summary>Generative Scoring (for classification and retrieval):</summary>

We have two ways of using our model for this application. The first is the recommended `t2v_metrics` approach; the second is a backup that directly uses Qwen2.5-VL's inference code.
1. `t2v_metrics` Approach

```python
# Install the package using: pip install git+https://github.com/chancharikmitra/t2v_metrics.git

import t2v_metrics

### For a single (video, text) pair:
qwen_score = t2v_metrics.VQAScore(model='qwen2.5-vl-7b', checkpoint='chancharikm/qwen2.5-vl-7b-cam-motion-preview')
video = "videos/baby.mp4"  # a video path in string format
text = "a baby crying"
# Calculate probability of "Yes" response
score = qwen_score(images=[video], texts=[text])
```

For more details, please refer to the t2v_metrics [fork](https://github.com/chancharikmitra/t2v_metrics.git).
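Since the summary line mentions classification and retrieval, here is a minimal sketch of ranking several candidate captions for one video. It assumes the call above accepts a list of texts and returns one score per (video, text) pair, as in the upstream t2v_metrics API; the candidate captions are illustrative:

```python
# Minimal sketch (assumed t2v_metrics API): rank candidate captions for one video.
candidates = ["a baby crying", "a dog barking", "a static shot of a beach"]
scores = qwen_score(images=[video], texts=candidates)  # one score per (video, text) pair
best = int(scores[0].argmax())  # assumes a (num_videos, num_texts) score tensor
print(f"Best match: {candidates[best]} ({float(scores[0][best]):.4f})")
```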
2. Qwen2.5-VL Inference Code Approach

```python
# Import necessary libraries
```
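(The rest of this script is cut off by the diff hunk.) For orientation, here is a minimal sketch of what this backup scoring path typically looks like, assuming the standard Qwen2.5-VL `transformers` demo with `qwen-vl-utils`; the yes/no prompt wording and the single-token "Yes" lookup are illustrative assumptions, not the README's exact code:

```python
# Minimal sketch (assumptions): transformers>=4.49 with Qwen2.5-VL support,
# qwen-vl-utils installed, and a yes/no prompt of our own wording.
import torch
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "chancharikm/qwen2.5-vl-7b-cam-motion-preview", torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")

messages = [{
    "role": "user",
    "content": [
        {"type": "video", "video": "videos/baby.mp4", "fps": 8.0},
        {"type": "text", "text": 'Does this video show "a baby crying"? Answer Yes or No.'},
    ],
}]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(text=[prompt], images=image_inputs, videos=video_inputs,
                   padding=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # next-token logits after the prompt
probs = torch.softmax(logits, dim=-1)
yes_id = processor.tokenizer.encode("Yes")[0]  # assumes "Yes" is a single token
score = probs[yes_id].item()
print(f"Score: {score:.4f}")
```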
@@ -98,6 +117,26 @@ print(f"Score: {score:.4f}")

<details>
<summary>Natural Language Generation</summary>

We have two ways of using our model for this application. The first is the recommended `t2v_metrics` approach; the second is a backup that directly uses Qwen2.5-VL's inference code.
1. `t2v_metrics` Approach

```python
# Install the package using: pip install git+https://github.com/chancharikmitra/t2v_metrics.git

import t2v_metrics

### For a single (video, text) pair:
qwen_score = t2v_metrics.VQAScore(model='qwen2.5-vl-7b', checkpoint='chancharikm/qwen2.5-vl-7b-cam-motion-preview')
video = "videos/baby.mp4"  # a video path in string format
text = "Please describe this video: "
# Generate a free-form text response (rather than scoring a "Yes" probability)
generated_text = qwen_score.model.generate(images=[video], texts=[text])
```

For more details, please refer to the t2v_metrics [fork](https://github.com/chancharikmitra/t2v_metrics.git).
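Given that this checkpoint is a camera-motion fine-tune, a more targeted prompt may be useful. A small sketch reusing the same `generate` interface as above; the prompt string is illustrative:

```python
# Illustrative prompt aimed at the model's camera-motion specialty
motion_prompt = "Describe the camera motion in this video."
motion_description = qwen_score.model.generate(images=[video], texts=[motion_prompt])
print(motion_description)
```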
2. Qwen2.5-VL Inference Code Approach

```python
# The model is trained at 8.0 FPS, which we recommend for optimal inference
```
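(Again, the diff hunk truncates the script here.) Under the same assumptions as the scoring sketch above, the generation variant would swap the logits step for a `generate` call along these lines:

```python
# Continuing from the model/processor loading and preprocessing in the earlier sketch:
generated_ids = model.generate(**inputs, max_new_tokens=128)
# Trim the prompt tokens, keeping only the newly generated ones
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated_ids)]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])
```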