Add/update the quantized ONNX model files and README.md for Transformers.js v3
## Applied Quantizations
### Based on `decoder_model.onnx` *with* slimming
↳ `fp16`: `decoder_model_fp16.onnx` (added)
↳ `int8`: `decoder_model_int8.onnx` (added)
↳ `uint8`: `decoder_model_uint8.onnx` (added)
↳ `q4`: `decoder_model_q4.onnx` (added)
↳ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ `bnb4`: `decoder_model_bnb4.onnx` (added)
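
The file names listed above follow a `<base>_<dtype>.onnx` pattern. A minimal sketch of that naming rule is shown here; `onnxFileName` and `LISTED_DTYPES` are hypothetical names (not part of Transformers.js), and treating `fp32` as the unsuffixed file is an assumption that this PR does not state.

```javascript
// Hypothetical helper (not a Transformers.js API): derive the ONNX file name
// for a base model and dtype, following the naming seen in this PR.
const LISTED_DTYPES = ['fp16', 'int8', 'uint8', 'q4', 'q4f16', 'bnb4'];

function onnxFileName(baseName, dtype) {
  // Assumption: fp32 is the unquantized file and carries no suffix.
  if (dtype === 'fp32') return `${baseName}.onnx`;
  if (!LISTED_DTYPES.includes(dtype)) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  return `${baseName}_${dtype}.onnx`;
}

console.log(onnxFileName('decoder_model', 'q4f16')); // decoder_model_q4f16.onnx
```

The same rule would give `encoder_model_uint8.onnx` or `decoder_with_past_model_bnb4.onnx` for the other base models in this PR.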
### Based on `encoder_model.onnx` *with* slimming
↳ `fp16`: `encoder_model_fp16.onnx` (added)
↳ `int8`: `encoder_model_int8.onnx` (added but JS-based E2E test failed)
```
dtype not specified for "decoder_model_merged". Using the default dtype (fp32) for this device (cpu).
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Could not find an implementation for ConvInteger(10) node with name '/conv1/Conv_quant'
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
```
↳ `uint8`: `encoder_model_uint8.onnx` (added)
↳ `q4`: `encoder_model_q4.onnx` (added)
↳ `q4f16`: `encoder_model_q4f16.onnx` (added)
↳ `bnb4`: `encoder_model_bnb4.onnx` (added)
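
Since the `int8` encoder failed to load under onnxruntime-node 1.21.0 (no `ConvInteger` implementation), one way to work around it is to choose the dtype per module. The sketch below is illustrative only: `KNOWN_FAILING` and `pickDtypes` are hypothetical names, the `uint8` fallback is an assumption, and the code does not check whether a given variant actually exists in the repo.

```javascript
// Variants reported as failing in this PR: the int8 encoder could not be
// loaded by onnxruntime-node 1.21.0 (no ConvInteger implementation).
const KNOWN_FAILING = { encoder_model: ['int8'] };

// Hypothetical helper: build a per-module dtype map, swapping in a fallback
// wherever the preferred dtype is listed as failing for that module.
function pickDtypes(modules, preferred, fallback = 'uint8') {
  const dtypes = {};
  for (const name of modules) {
    const failing = KNOWN_FAILING[name] ?? [];
    dtypes[name] = failing.includes(preferred) ? fallback : preferred;
  }
  return dtypes;
}

const dtype = pickDtypes(['encoder_model', 'decoder_model_merged'], 'int8');
console.log(dtype); // { encoder_model: 'uint8', decoder_model_merged: 'int8' }
```

An object of this shape can be passed as the `dtype` option of `pipeline(...)`, which in Transformers.js v3 accepts a per-module map as well as a single string.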
### Based on `decoder_with_past_model.onnx` *with* slimming
↳ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)
### Based on `decoder_model_merged.onnx` *without* slimming
- README.md +19 -1
- onnx/decoder_model_bnb4.onnx +3 -0
- onnx/decoder_model_fp16.onnx +3 -0
- onnx/decoder_model_int8.onnx +3 -0
- onnx/decoder_model_q4.onnx +3 -0
- onnx/decoder_model_q4f16.onnx +3 -0
- onnx/decoder_model_uint8.onnx +3 -0
- onnx/decoder_with_past_model_bnb4.onnx +3 -0
- onnx/decoder_with_past_model_fp16.onnx +3 -0
- onnx/decoder_with_past_model_int8.onnx +3 -0
- onnx/decoder_with_past_model_q4.onnx +3 -0
- onnx/decoder_with_past_model_q4f16.onnx +3 -0
- onnx/decoder_with_past_model_uint8.onnx +3 -0
- onnx/encoder_model_bnb4.onnx +3 -0
- onnx/encoder_model_fp16.onnx +3 -0
- onnx/encoder_model_q4.onnx +3 -0
- onnx/encoder_model_q4f16.onnx +3 -0
- onnx/encoder_model_uint8.onnx +3 -0
@@ -5,4 +5,22 @@ library_name: transformers.js
 
 https://huggingface.co/openai/whisper-large with ONNX weights to be compatible with Transformers.js.
 
-
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+```bash
+npm i @huggingface/transformers
+```
+
+Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
+
+```js
+import { pipeline } from '@huggingface/transformers';
+
+// Create the pipeline
+const pipe = await pipeline('automatic-speech-recognition', 'Xenova/whisper-large', {
+    dtype: 'fp32', // Options: "fp32", "fp16", "q8", "q4"
+});
+
+// Use the model
+const result = await pipe('input text or data');
+console.log(result);
+```
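
The README snippet passes a placeholder string to the pipeline, whereas an `automatic-speech-recognition` pipeline expects audio. A minimal sketch of a more specific call follows. It assumes a browser environment (where Transformers.js can fetch and decode an audio URL itself), and `transcribe` is a hypothetical wrapper; the audio URL and the dtype are left to the caller.

```javascript
// Hypothetical wrapper around the Transformers.js ASR pipeline.
// Assumption: runs in a browser, so a URL to an audio file can be passed directly.
async function transcribe(audioUrl, dtype) {
  // Loaded lazily so that defining this function has no side effects.
  const { pipeline } = await import('@huggingface/transformers');

  // Same task and model id as in the README snippet.
  const transcriber = await pipeline(
    'automatic-speech-recognition',
    'Xenova/whisper-large',
    { dtype },
  );

  // For a single input the pipeline resolves to an object with a `text` field.
  const output = await transcriber(audioUrl);
  return output.text;
}
```

In Node.js the audio would typically need to be decoded into a `Float32Array` first, since URL decoding relies on browser audio APIs.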
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0a092fd8fac81949b4fb0d5129a6e5d4e576c0b9e659a92f3b9b347831f1345d
+size 743928480
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8b2e35b18e5dca50013aaf40b693592410dc4ff7b27eaae7c7357605863a74b5
+size 1814428308
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5d57bdd9fbb44864a7859ca5ffb756c7086cb8ffe587dbdb20b580f148d54c8d
+size 1177657684
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1cf4f5b76c96c78f2d9f531914f87ba3d5303efb57188d3da63efa0582aa18d4
+size 796354688
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0bb3035c75d9fb3aeba42a0dc6fb77853d1c5847579b252998a0552a810757ab
+size 608612981
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4dfd6da7eba8f5e1b44c1393b433e20e51ab31be750130065d584683c7cb125d
+size 1177657843
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f2050a958d02ee663611df68039c510ba92ff96837dd4dd01d93196654cd91fe
+size 684822244
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1fc63d003ba88585bbe48fd144c272d7fc3d8741e00af5088f3eb3d5180fc13c
+size 1604702870
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:37bf8be4e29b3b4059e9b1f4bbee4719748758e1ee5145491def3265dfe5b333
+size 1072623047
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:56b5c3fde37df91c2c6194e9d3f4bae58de215427fe0c5a8dadbe897d73976ac
+size 730695364
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:81e46cc2e6cedc437b9cd8df5998c9a8dc4652dd601597df1024bc67a9efaf5b
+size 549610825
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0700c4a931fcb9c452371bf5a564e4030b138b1e27caf58a8906a9506a037f21
+size 1072623177
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6aaf5ca88dd45492ef5b8260f7c5a1f81f2b65d01ae24962985af77eb6e5330a
+size 384910274
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8d1b85d4bb1d70f082d42ec013f98530ba9fabfcd79c839cacfef4a528b84543
+size 1274001163
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3de616cf18abd158e7a9de2f5e4853c1489aa4a71a4d170486138eb2437ded29
+size 424230306
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3b967d1f305845661342c15295122b9fffa02ee3fb600997906729d8872ac3cd
+size 369632638
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1dfcbe78cab0de36ad319acbe763b388bb8b99d5a4c4df5aeaa8d747f0687755
+size 644662684
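
Each added `.onnx` file is committed as a three-line Git LFS pointer (`version`, `oid`, `size`) rather than as the binary itself. As a rough illustration of how such a pointer could be read, here is a small parser; `parseLfsPointer` is a hypothetical helper, and it only handles the three fields that appear in this diff.

```javascript
// Hypothetical helper: parse the text of a Git LFS pointer file into its fields.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.trim().split('\n')) {
    // Each line is "<key> <value>", separated by the first space.
    const space = line.indexOf(' ');
    if (space === -1) throw new Error(`Malformed pointer line: ${line}`);
    fields[line.slice(0, space)] = line.slice(space + 1).trim();
  }

  // The oid has the form "sha256:<64 hex characters>".
  const [algorithm, digest] = (fields.oid ?? '').split(':');
  const sizeBytes = Number(fields.size);
  if (
    !fields.version ||
    algorithm !== 'sha256' ||
    !/^[0-9a-f]{64}$/.test(digest ?? '') ||
    !Number.isInteger(sizeBytes)
  ) {
    throw new Error('Not a valid Git LFS pointer');
  }
  return { version: fields.version, oid: digest, sizeBytes };
}

const pointer = parseLfsPointer(`version https://git-lfs.github.com/spec/v1
oid sha256:0a092fd8fac81949b4fb0d5129a6e5d4e576c0b9e659a92f3b9b347831f1345d
size 743928480
`);
console.log(pointer.sizeBytes); // 743928480
```

The `size` field is the byte length of the real file, so the first pointer in this diff refers to an object of roughly 744 MB.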