whitphx (HF Staff) committed
Commit 1006211 · verified · 1 Parent(s): 407cbac

Add/update the quantized ONNX model files and README.md for Transformers.js v3

## Applied Quantizations

### ✅ Based on `decoder_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_bnb4.onnx` (added)

### ✅ Based on `encoder_model.onnx` *with* slimming

↳ ✅ `fp16`: `encoder_model_fp16.onnx` (added)
↳ ❌ `int8`: `encoder_model_int8.onnx` (added but JS-based E2E test failed)
```
dtype not specified for "decoder_model_merged". Using the default dtype (fp32) for this device (cpu).
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Could not find an implementation for ConvInteger(10) node with name '/conv1/Conv_quant'
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ✅ `uint8`: `encoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `encoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `encoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `encoder_model_bnb4.onnx` (added)
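
Since the `int8` encoder fails to load in the E2E test while the `uint8` encoder passes, one way to keep a quantized setup working is to choose the dtype per module. The sketch below is only an illustration: `KNOWN_FAILING`, `FALLBACKS`, `resolveDtype`, and `buildDtypeMap` are hypothetical names, the fallback order is an assumption, and the module names are taken from the error log and file list above. Whether an `int8` file exists for the merged decoder is not shown in this commit listing, so treat that part of the result as unverified.

```javascript
// Module/dtype pairs reported as failing in this commit's E2E test.
const KNOWN_FAILING = new Set(['encoder_model:int8']);

// Hypothetical fallback order. `uint8` comes first because the commit
// reports the uint8 encoder as passing.
const FALLBACKS = { int8: ['uint8', 'q4', 'fp16', 'fp32'] };

// Return the preferred dtype unless it is known to fail for this module.
function resolveDtype(moduleName, preferred) {
  if (!KNOWN_FAILING.has(`${moduleName}:${preferred}`)) return preferred;
  const candidates = FALLBACKS[preferred] ?? ['fp32'];
  const usable = candidates.find((d) => !KNOWN_FAILING.has(`${moduleName}:${d}`));
  return usable ?? 'fp32';
}

// Build a per-module dtype map for the given module names.
function buildDtypeMap(preferred, modules = ['encoder_model', 'decoder_model_merged']) {
  return Object.fromEntries(modules.map((m) => [m, resolveDtype(m, preferred)]));
}

console.log(buildDtypeMap('int8'));
```

The resulting object is meant to be passed as the `dtype` option when creating a pipeline, assuming the installed Transformers.js v3 release accepts a per-module dtype map.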

### ✅ Based on `decoder_with_past_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)

### ✅ Based on `decoder_model_merged.onnx` *without* slimming

README.md CHANGED
@@ -5,4 +5,24 @@ library_name: transformers.js
  
  https://huggingface.co/openai/whisper-large-v3 with ONNX weights to be compatible with Transformers.js.
  
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ ## Basic Usage
+
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ // Create the pipeline
+ const pipe = await pipeline('automatic-speech-recognition', 'Xenova/whisper-large-v3', {
+     dtype: 'fp32', // Options: "fp32", "fp16", "q8", "q4"
+ });
+
+ // Use the model
+ const result = await pipe('input text or data');
+ console.log(result);
+ ```
+
  Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
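
The README snippet in this diff passes a generic placeholder string to the pipeline, but an `automatic-speech-recognition` pipeline takes audio. The following is a minimal sketch of what a call might look like, not the repository's documented usage: `checkAudioInput` and `transcribe` are hypothetical helpers, the default `q4` dtype and the `chunk_length_s` value are illustrative choices, and it assumes the audio is either a URL (typically in a browser) or a `Float32Array` of 16 kHz mono samples. In Node.js, decoding from a URL may not be available, so passing pre-decoded samples is the safer assumption.

```javascript
// Classify the input, rejecting anything that is not audio-like.
function checkAudioInput(audio) {
  if (typeof audio === 'string' || audio instanceof URL) return 'url';
  if (audio instanceof Float32Array || audio instanceof Float64Array) return 'samples';
  throw new TypeError('Expected an audio URL or an array of decoded audio samples');
}

// Hypothetical wrapper around the pipeline call shown in the README diff.
async function transcribe(audio, dtype = 'q4') {
  checkAudioInput(audio);
  // Dynamic import so this file loads even when the package is not installed yet.
  const { pipeline } = await import('@huggingface/transformers');
  const pipe = await pipeline(
    'automatic-speech-recognition',
    'Xenova/whisper-large-v3',
    { dtype },
  );
  // Chunking lets the model handle clips longer than 30 seconds.
  return pipe(audio, { chunk_length_s: 30 });
}

// One second of silence at 16 kHz is a valid (if uninteresting) input.
console.log(checkAudioInput(new Float32Array(16000)));
```

Calling `transcribe(samples)` downloads the model weights on first use, so it is left out of the snippet's top-level code.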
onnx/decoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d7dc412404e5dcf75f3460e99409c576872ef07f41955398e9511cce73efe12a
+ size 743934101
onnx/decoder_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6b207ac1c12d5b2bc0d04b5d77ff157fcd7557cc415fe40e679841207e2e2640
+ size 1814431369
onnx/decoder_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:99d1fbb5cd1e8280770fadf677341429b9a2797c4f33e9053e8ff387e2fc8460
+ size 1177664585
onnx/decoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1cd41aca47b475130bb4cbe890e789ee03d86a15c5d1a5a0ea6bdae114b21b04
+ size 796360309
onnx/decoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8eeab7499c781192052fa35fa5c70183e4de4ec58e7a6430566d98526e673d4f
+ size 608616042
onnx/decoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:efab6ad5e2faca22ebdd806e37f10e937b5fb7898c0231eb9f836acd474757cf
+ size 1177664753
onnx/decoder_with_past_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ea65268f66a1ae01d6da7f229cc03fa93fe9bb8df99b9fa597c9011ce898caaa
+ size 684827365
onnx/decoder_with_past_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:97dfb35d66906a9b125dd9a4d79c3a9eda491f527c8e6674cde60ea54680e26a
+ size 1604705431
onnx/decoder_with_past_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f9a522f8dcc2dea45744b3e752d981b811911aeafa64816b38439724f3cebf4d
+ size 1072629448
onnx/decoder_with_past_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:47a24bd589e07111e26396101a20ca0f51fe51e6cd15433667b7a93fb5e5210c
+ size 730700485
onnx/decoder_with_past_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0377090ee4a86efb28c79bc8bd5af8871b131c0dd74e5de7c5c3da6d39b6e21a
+ size 549613386
onnx/decoder_with_past_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d437c329be4e4fb83d4208899e4e034bddc06e7283242c6653f85f67ed57d7c2
+ size 1072629582
onnx/encoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8f54e0132f396ddbfdd956d5161903f48f489cdce8adaa9bece026d7c5a68ea0
+ size 385647556
onnx/encoder_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9ad19eab7d13d905b758443d45dbf13abafe56636c2c48bf3fdcd26dd2cff67c
+ size 1274369805
onnx/encoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e5b2677a44b67503a04ea7fb215dd7e885205c9b846279e4ca2d0759def6c8f6
+ size 424967588
onnx/encoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3ebeb92dd2b337c3269cc66628293647ddd4c7761bb29df12e9302ffa3e745ad
+ size 370001280
onnx/encoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:83821fce0df93f8ebf28c9fac0c0b328745418629d4cb51099e4257c82a2cf61
+ size 644847003
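
Each added ONNX file above is stored as a Git LFS pointer: three `key value` lines giving the spec version, the content hash, and the size in bytes. As a rough sketch of how such a pointer could be read programmatically (for example, to compare download sizes across quantizations), the hypothetical `parseLfsPointer` helper below splits those lines into fields. It is not part of this commit or of any Git LFS tooling, and it only checks the three fields shown in this diff.

```javascript
// Parse the text of a Git LFS pointer file into its fields.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    const space = trimmed.indexOf(' ');
    if (space === -1) throw new Error(`Malformed pointer line: ${trimmed}`);
    // Everything before the first space is the key, the rest is the value.
    fields[trimmed.slice(0, space)] = trimmed.slice(space + 1);
  }
  const [algo, hash] = (fields.oid ?? '').split(':');
  if (!fields.version || !hash || !/^\d+$/.test(fields.size ?? '')) {
    throw new Error('Not a Git LFS pointer');
  }
  return { version: fields.version, algo, hash, size: Number(fields.size) };
}

// Pointer contents copied from onnx/decoder_model_bnb4.onnx in this commit.
const pointer = parseLfsPointer(`version https://git-lfs.github.com/spec/v1
oid sha256:d7dc412404e5dcf75f3460e99409c576872ef07f41955398e9511cce73efe12a
size 743934101`);

const sizeMiB = (pointer.size / 2 ** 20).toFixed(1);
console.log(`${pointer.algo} ${pointer.hash.slice(0, 8)} ${sizeMiB} MiB`);
```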