---
license: cc-by-4.0
thumbnail: null
tags:
- automatic-speech-recognition
- speech
- audio
- Transducer
- TDT
- FastConformer
- Conformer
- pytorch
- NeMo
- hf-asr-leaderboard
- coreml
- apple
language:
- en
pipeline_tag: automatic-speech-recognition
base_model:
- nvidia/parakeet-tdt-0.6b-v2
---

# Parakeet TDT 0.6B V2 - CoreML

This is a CoreML-optimized version of NVIDIA's Parakeet TDT 0.6B V2 model, designed for high-performance automatic speech recognition on Apple platforms.

## Model Description

This model has been converted to CoreML format for efficient on-device inference on Apple Silicon and iOS devices, enabling real-time speech recognition with a minimal memory footprint. The models will continue to evolve as we optimize performance and accuracy.

## Usage in Swift

See the [FluidAudio repository](https://github.com/FluidInference/FluidAudioSwift) for instructions.

## Performance

- Real-time factor (RTFx): ~110x on M4 Pro, i.e. roughly 110 seconds of audio processed per second of compute
- Memory usage: ~800 MB peak
- Supported platforms: macOS 14+, iOS 17+
- Optimized for: Apple Silicon
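
To make the real-time factor concrete, the short sketch below estimates wall-clock transcription time from an audio duration. It assumes the ~110x figure is an inverse real-time factor (seconds of audio processed per second of compute) and that throughput stays roughly constant across file lengths. The function name is illustrative, and actual speed will vary by device and workload.

```python
def estimated_processing_seconds(audio_seconds: float, rtfx: float = 110.0) -> float:
    """Estimate wall-clock transcription time from an inverse real-time factor.

    rtfx is seconds of audio processed per second of compute. The default is
    the approximate M4 Pro figure quoted above, not a guarantee.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    if rtfx <= 0:
        raise ValueError("rtfx must be positive")
    return audio_seconds / rtfx


# One hour of audio at ~110x works out to roughly half a minute of compute.
one_hour = estimated_processing_seconds(3600)
print(f"{one_hour:.1f} s")  # 32.7 s
```

Passing a measured `rtfx` for another device gives a comparable estimate for that hardware.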

## Model Details

- Architecture: FastConformer-TDT
- Parameters: 0.6B
- Sample rate: 16 kHz
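
Since the model works with 16 kHz audio, it can help to check recordings before transcribing them. The sketch below uses Python's standard `wave` module to report sample-rate and channel mismatches in a WAV file. The helper name and the mono check are assumptions made for illustration (the list above only specifies the sample rate), and whether the Swift library resamples input for you is something to confirm in its documentation.

```python
import wave

EXPECTED_SAMPLE_RATE = 16_000  # Hz, from the model details above


def wav_format_issues(path: str, expect_mono: bool = True) -> list[str]:
    """Return human-readable format mismatches; an empty list means none found.

    expect_mono is an assumption made for this sketch, not a documented requirement.
    """
    issues = []
    # Only the header is needed, so no audio frames are read.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != EXPECTED_SAMPLE_RATE:
        issues.append(f"sample rate is {rate} Hz, expected {EXPECTED_SAMPLE_RATE} Hz")
    if expect_mono and channels != 1:
        issues.append(f"{channels} channels found, expected mono")
    return issues
```

A 44.1 kHz stereo recording would produce two entries, signalling that it needs resampling and downmixing first.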

## License

This model is released under the CC-BY-4.0 license. See the LICENSE file for details.

## Acknowledgments

Based on NVIDIA's Parakeet TDT model. CoreML conversion and Swift integration by the FluidInference team.