huzy0 committed on
Commit 59be175 · verified · 1 parent: 983ceec

Update README.md

Files changed (1)
  1. README.md +4 -7
README.md CHANGED
@@ -22,7 +22,7 @@ language:
22
  <h1 align="center">🎧 MERaLiON-SpeechEncoder-2 🎧</h1>
23
 
24
  <p align="center">
25
- <a href="https://meralion.org/demo/">💻 ASR Web Demo (Coming Soon!)</a> |
26
  </p>
27
 
28
 
@@ -31,6 +31,8 @@ We introduce **MERaLiON-SpeechEncoder-2**, an update of [MERaLiON-SpeechEncoder-
31
 
32
  Unlike many existing models optimized for high-resource, Western languages, MERaLiON-SpeechEncoder-2 is designed from the ground up to reflect the linguistic diversity and complexity of Southeast Asia. The model can be finetuned on custom datasets, allowing developers to build speech systems tailored to their specific needs.
33
 
 
 
34
  ## Model Highlights
35
 
36
  ### Small model size
@@ -52,18 +54,13 @@ MERaLiON-SpeechEncoder-2 was trained from scratch with an novel extension of the
52
  - **Language(s):** Primarily English (Global & Singapore), Chinese, Malay, Tamil, Thai, Indonesian, and Vietnamese. See [pre-training data](#Language coverage of pre-training data) for full breakdown of language coverage.
53
  - **License:** [MERaLiON Public License](https://huggingface.co/MERaLiON/MERaLiON-AudioLLM-Whisper-SEA-LION/blob/main/MERaLiON-Public-Licence-v1.pdf)
54
 
55
- The following Hugging Face-compatible models are implemented:
56
-
57
- - **`MeralionBestRqModel`**: The base BEST-RQ Conformer encoder. It outputs the final hidden states and is suitable for feature extraction or as a base for other heads.
58
- - **`MeralionBestRqModelForCTC`**: The Conformer model with a linear CTC head for ASR.
59
- - **`MeralionBestRqModelForLSTMCTC`**: The Conformer model with a more powerful CTC head that includes two LSTM layers before the final projection layer. This version can also be configured to use a weighted sum of all encoder hidden states.
60
-
61
  For details on background, pre-training, tuning experiments and evaluation, please refer to our [technical report](https://arxiv.org/abs/2412.11538).
62
 
63
  ## Language coverage of pre-training data
64
 
65
 
66
 
 
67
  ### Model Sources [optional]
68
 
69
  <!-- Provide the basic links for the model. -->
 
22
  <h1 align="center">🎧 MERaLiON-SpeechEncoder-2 🎧</h1>
23
 
24
  <p align="center">
25
+ <a href="https://meralion.org/demo/">💻 ASR Web Demo (Coming Soon!)</a>
26
  </p>
27
 
28
 
 
31
 
32
  Unlike many existing models optimized for high-resource, Western languages, MERaLiON-SpeechEncoder-2 is designed from the ground up to reflect the linguistic diversity and complexity of Southeast Asia. The model can be finetuned on custom datasets, allowing developers to build speech systems tailored to their specific needs.
33
 
34
+ <img src="data1.svg" width="425"/> <img src="data2.svg" width="425"/>
35
+
36
  ## Model Highlights
37
 
38
  ### Small model size
 
54
  - **Language(s):** Primarily English (Global & Singapore), Chinese, Malay, Tamil, Thai, Indonesian, and Vietnamese. See [pre-training data](#Language coverage of pre-training data) for full breakdown of language coverage.
55
  - **License:** [MERaLiON Public License](https://huggingface.co/MERaLiON/MERaLiON-AudioLLM-Whisper-SEA-LION/blob/main/MERaLiON-Public-Licence-v1.pdf)
56
 
 
 
 
 
 
 
57
  For details on background, pre-training, tuning experiments and evaluation, please refer to our [technical report](https://arxiv.org/abs/2412.11538).
58
 
59
  ## Language coverage of pre-training data
60
 
61
 
62
 
63
+
64
  ### Model Sources [optional]
65
 
66
  <!-- Provide the basic links for the model. -->