huzy0 committed on
Commit
a8e5393
·
verified ·
1 Parent(s): c3b680d

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -48,7 +48,7 @@ Building on [MERaLiON-SpeechEncoder-v1](https://huggingface.co/MERaLiON/MERaLiON
  The model retains near state-of-the-art results on the SUPERB benchmark for English, and showcases strong multilingual capabilities demonstrated through its integration into a [high-performance ASR system](#automatic-speech-recognition-asr).
 
  #### Innovative pre-training techniques
- MERaLiON-SpeechEncoder-2 was trained from scratch with a **novel extension of the BEST-RQ** self-supervised objective, by using more informative latent targets. We also adopted the **Muon optimizer**, which has previously only been shown to outperform the popular AdamW for LLM training. We find its advantages also carry over to speech-based models.
+ MERaLiON-SpeechEncoder-2 was trained from scratch with a **novel extension of the BEST-RQ** self-supervised objective, by using more informative latent targets. We also adopted the **Muon optimizer**, which has previously only been shown to outperform the widely used AdamW optimizer for LLM training. We find its advantages also carry over to speech-based models.
 
  ## Model Summary
 
@@ -69,7 +69,7 @@ For details on background, pre-training, tuning experiments and evaluation, plea
  | MERaLiON-SpeechEncoder-v1 | 82.62 | 3.14 | 4.16 | 97.63 | 0.0590 | 91.09 | 5.18 | 5.06 | 68.02 | 98.60 | 88.99 / 23.89 |
  | MERaLiON-SpeechEncoder-2 | 82.72 | 3.40 | 4.96 | 97.57 | 0.0575 | 88.96 | 3.93 | 3.90 | 68.80 | 98.95 | 89.50 / 23.46 |
 
- [SUPERB](https://superbbenchmark.org/) is an English-based benchmark for speech encoders covering a wide range of downstream speech tasks across domains such as recognition, detection, semantics, speaker, and paralinguistics, where each task is finetuned separately with a frozen encoder.
+ [SUPERB](https://superbbenchmark.github.io/#/) is an English-based benchmark for speech encoders covering a wide range of downstream speech tasks across domains such as recognition, detection, semantics, speaker, and paralinguistics, where each task is finetuned separately with a frozen encoder.
 
  MERaLiON-SpeechEncoder-2 is competitive with the state of the art, improving slightly over our own v1 model on speaker and paralinguistic tasks.
 