Add/update the quantized ONNX model files and README.md for Transformers.js v3
Browse files## Applied Quantizations
### ✅ Based on `decoder_model_merged.onnx` *with* slimming
↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
↳ ✅ `int8`: `decoder_model_merged_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_merged_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_merged_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_merged_bnb4.onnx` (added)
onnx/decoder_model_merged_bnb4.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:b33de7d628701b2b2b34de32b50e8e767239441317f08c1ebca18416bf8d6eb8
|
3 |
+
size 163138982
|
onnx/decoder_model_merged_fp16.onnx
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:e4fa8b54e64a94302834f26aa40bc750c11f76a624a55355e9ae1d8d0bc5a538
|
3 |
+
size 276510700
|
onnx/decoder_model_merged_int8.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:8cc51d58c6dd25e1350c69a53391b19cdebb2e40a18a5b6cb9727841a9bd3c34
|
3 |
+
size 237594391
|
onnx/decoder_model_merged_q4.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:fca8b9b055bfd90f157682f76bcbe18d0e2ec1468c6c5393adddacaf75be6ee0
|
3 |
+
size 170215142
|
onnx/decoder_model_merged_q4f16.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:e3e40050a46297112c269d423203d8e0ae6e5d4c6e71a15d3af113dd6c1542ff
|
3 |
+
size 113750446
|
onnx/decoder_model_merged_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:4d41b52b26bed9b4a47db966417b5a1a638d427fc4cd102ab672924dcd1f5553
|
3 |
+
size 237594453
|