# YourMT3+ Enhanced Music Transcription

This is an enhanced version of YourMT3+ with **instrument conditioning**, added to address the problem of the transcribed instrument switching mid-track.

## Features

- **Instrument Conditioning**: Choose your target instrument to maintain consistency throughout transcription
- **Multi-track Support**: Transcribe multiple instruments from polyphonic audio
- **Format Options**: Output as MIDI, MusicXML, ABC notation, or audio
- **Free CPU Inference**: Optimized to run on the HuggingFace Spaces free tier (CPU-only, 16GB RAM)

## How to Use

1. **Upload Your Audio**: Drag and drop or select an audio file
2. **Select Target Instrument**: Choose from the dropdown (vocals, piano, guitar, drums, etc.)
3. **Choose Output Format**: MIDI, MusicXML, ABC, or audio
4. **Transcribe**: Click the transcribe button and wait for results

## Instrument Conditioning System

This enhanced version addresses a common issue where YourMT3+ switches instruments mid-track (e.g., vocals → violin → guitar). The system uses:

- **Task Tokens**: Special conditioning tokens, used when the model provides them
- **Post-processing Filtering**: Consistent instrument filtering based on MIDI program numbers
- **Debug Output**: Console logs showing instrument detection and filtering results

## Supported Instruments

- Vocals/Singing
- Piano
- Guitar (Electric/Acoustic)
- Bass
- Drums
- Violin
- Trumpet
- Saxophone
- And many more...

## Technical Details

- **Model**: YourMT3+ (Multi-channel T5 decoder with Perceiver-TF encoder)
- **Framework**: PyTorch Lightning + Gradio
- **Inference**: CPU-only for free tier compatibility
- **Memory**: Optimized for the 16GB RAM constraint

## Credits

Based on the original YourMT3 by the MT3 team, enhanced with instrument conditioning capabilities.
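
## Example: Post-processing Filter (Sketch)

The post-processing filtering described under Instrument Conditioning System can be illustrated with a minimal sketch. This is not the project's actual code. The `Note` type, the `filter_by_instrument` function, and the `remap` flag are hypothetical names introduced here. The program ranges follow the General MIDI families (0-based program numbers). Whether the real filter drops off-target notes or reassigns them to the target instrument is an assumption, so the sketch supports both.

```python
from collections import Counter
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    """Hypothetical note event; the real model output may look different."""
    onset: float
    offset: float
    pitch: int
    program: int          # 0-based General MIDI program number
    is_drum: bool = False


# Assumed mapping from target instrument to General MIDI program numbers.
PROGRAM_RANGES = {
    "piano": range(0, 8),
    "guitar": range(24, 32),
    "bass": range(32, 40),
    "violin": range(40, 41),
    "trumpet": range(56, 57),
    "saxophone": range(64, 68),
}


def filter_by_instrument(notes, target, remap=True):
    """Keep only notes for `target`.

    With remap=True, pitched notes detected as another instrument are
    reassigned to the target's first program instead of being dropped.
    """
    if target == "drums":
        return [n for n in notes if n.is_drum]

    programs = PROGRAM_RANGES[target]
    canonical = programs[0]
    kept = []
    for n in notes:
        if n.is_drum:
            continue  # drum hits never belong to a pitched target
        if n.program in programs:
            kept.append(n)
        elif remap:
            kept.append(replace(n, program=canonical))
    return kept


def program_histogram(notes):
    """Count notes per program, useful for debug logging."""
    return Counter(n.program for n in notes)


if __name__ == "__main__":
    detected = [
        Note(0.0, 0.5, 60, program=0),
        Note(0.5, 1.0, 62, program=40),   # spurious switch to violin
        Note(1.0, 1.5, 36, program=0, is_drum=True),
    ]
    print("before:", dict(program_histogram(detected)))
    cleaned = filter_by_instrument(detected, "piano")
    print("after: ", dict(program_histogram(cleaned)))
```

Remapping keeps every pitched note, which suits single-instrument recordings where the switch is a labelling error. Passing `remap=False` is the safer choice for genuinely polyphonic audio, where off-target notes likely belong to other instruments.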