YourMT3+ Enhanced Music Transcription
This is an enhanced version of YourMT3+ that adds instrument conditioning, so that a transcription no longer switches instruments mid-track.
Features
- Instrument Conditioning: Choose your target instrument to maintain consistency throughout transcription
- Multi-track Support: Transcribe multiple instruments from polyphonic audio
- Format Options: Output as MIDI, MusicXML, ABC notation, or audio
- Free CPU Inference: Optimized to run on HuggingFace Spaces free tier (CPU-only, 16GB RAM)
How to Use
- Upload Your Audio: Drag and drop or select an audio file
- Select Target Instrument: Choose from the dropdown (vocals, piano, guitar, drums, etc.)
- Choose Output Format: MIDI, MusicXML, ABC, or audio
- Transcribe: Click the transcribe button and wait for results
Instrument Conditioning System
This enhanced version addresses the common issue where YourMT3+ switches instruments mid-track (e.g., vocals → violin → guitar). The system uses:
- Task Tokens: Special conditioning tokens, used when the model provides them
- Post-processing Filtering: Consistent instrument filtering based on MIDI program numbers
- Debug Output: Console logs showing instrument detection and filtering results
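The post-processing filter can be illustrated with a minimal sketch. It is not the code used in this Space: the `Note` class, the `GM_PROGRAMS` table, and the `filter_by_instrument` function are hypothetical names introduced for illustration. The program ranges follow the General MIDI instrument families (0-based numbering). Vocals and drums are left out because their program numbers depend on the model's own vocabulary.

```python
from dataclasses import dataclass, replace

# General MIDI program numbers (0-based) per target instrument.
# Hypothetical table for illustration; the Space's real mapping may differ.
GM_PROGRAMS = {
    "piano": range(0, 8),
    "guitar": range(24, 32),
    "bass": range(32, 40),
    "violin": range(40, 41),
    "trumpet": range(56, 57),
    "saxophone": range(64, 68),
}


@dataclass(frozen=True)
class Note:
    """A transcribed note event (hypothetical structure)."""
    onset: float   # start time in seconds
    offset: float  # end time in seconds
    pitch: int     # MIDI pitch, 0-127
    program: int   # MIDI program number, 0-127


def filter_by_instrument(notes, target, remap=False):
    """Keep only notes of the target instrument.

    With remap=False, notes assigned to other programs are dropped.
    With remap=True, they are kept but reassigned to the first
    program of the target instrument.
    """
    allowed = GM_PROGRAMS[target]
    if remap:
        primary = allowed[0]
        return [
            n if n.program in allowed else replace(n, program=primary)
            for n in notes
        ]
    return [n for n in notes if n.program in allowed]


if __name__ == "__main__":
    notes = [
        Note(0.0, 0.5, 60, 0),   # piano
        Note(0.5, 1.0, 62, 40),  # violin (stray detection)
        Note(1.0, 1.5, 64, 25),  # guitar (stray detection)
    ]
    # Debug output in the spirit of the console logs mentioned above.
    kept = filter_by_instrument(notes, "piano")
    print(f"kept {len(kept)} of {len(notes)} notes for target 'piano'")
```

Dropping stray notes is the conservative choice. Remapping keeps every detected note but assumes the model got the pitch and timing right and only mislabeled the instrument.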
Supported Instruments
- Vocals/Singing
- Piano
- Guitar (Electric/Acoustic)
- Bass
- Drums
- Violin
- Trumpet
- Saxophone
- And many more...
Technical Details
- Model: YourMT3+ (Multi-channel T5 decoder with Perceiver-TF encoder)
- Framework: PyTorch Lightning + Gradio
- Inference: CPU-only for free tier compatibility
- Memory: Optimized for 16GB RAM constraint
Credits
Based on the original YourMT3+ by Sungkyun Chang et al., which builds on MT3, and extended here with instrument conditioning.