MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection

Github | Paper | Dataset

This is a checkpoint for MelodySim, a MERT-based music audio similarity model that can be used for melody similarity detection. The checkpoint builds on the pre-trained weights of m-a-p/MERT-v1-95M.

Usage

  1. Clone the MelodySim GitHub repo
git clone https://github.com/AMAAI-Lab/MelodySim.git
cd MelodySim
pip install -r requirements.txt
  2. Download the model checkpoint
from huggingface_hub import hf_hub_download

repo_id = "amaai-lab/MelodySim"
model_path = hf_hub_download(repo_id=repo_id, filename="siamese_net_20250328.ckpt")

Or, using wget on Linux:

wget https://huggingface.co/amaai-lab/MelodySim/resolve/main/siamese_net_20250328.ckpt

  3. Run inference. Use inference.py to run the model on two audio files, analyzing their similarity and reaching a decision on whether or not they are the same song. We provide a positive pair and a negative pair as examples:
python inference.py -audio-path1 ./data/example_wavs/Track01968_original.mp3 -audio-path2 ./data/example_wavs/Track01976_original.mp3 -ckpt-path path/to/checkpoint.ckpt
python inference.py -audio-path1 ./data/example_wavs/Track01976_original.mp3 -audio-path2 ./data/example_wavs/Track01976_version1.mp3 -ckpt-path path/to/checkpoint.ckpt

Feel free to play around with the hyperparameters:

  • -window-len-sec, -hop-len-sec (how the input audios are segmented into windows);
  • --proportion-thres (the proportion of similar segments required for the two pieces to be considered the same);
  • --decision-thres (between 0 and 1; the smallest similarity value at which two segments are considered the same);
  • --min-hits (for each window in piece 1, the minimum number of similar windows in piece 2 required to flag that window as plagiarized).
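To make the roles of these thresholds concrete, here is a minimal sketch of the windowed decision rule they describe. The function name, the toy similarity matrix, and the default values are assumptions for illustration; the actual logic lives in inference.py in the repo.

```python
import numpy as np

def decide_same_song(sim_matrix, decision_thres=0.85, min_hits=1, proportion_thres=0.5):
    """Hypothetical sketch of the windowed decision rule.

    sim_matrix[i, j] is the similarity (in [0, 1]) between window i of
    piece 1 and window j of piece 2.
    """
    # A window in piece 1 counts as "plagiarized" if at least `min_hits`
    # windows in piece 2 exceed the similarity threshold (--decision-thres).
    hits_per_window = (sim_matrix >= decision_thres).sum(axis=1)
    plagiarized = hits_per_window >= min_hits
    # The pieces are judged the same song when a large enough proportion
    # of piece-1 windows is flagged (--proportion-thres).
    return bool(plagiarized.mean() >= proportion_thres)

# Toy example: 2 of the 3 piece-1 windows have a strong match in piece 2.
sim = np.array([[0.90, 0.20],
                [0.95, 0.10],
                [0.30, 0.20]])
print(decide_same_song(sim))  # 2/3 >= 0.5, so True
```

Raising --proportion-thres (e.g. to 0.8 here) would flip the decision, since only two-thirds of the windows are flagged.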
  4. Training and testing details are summarized in the MelodySim GitHub repo. You may need the MelodySim dataset, which contains 1,710 valid synthesized pieces derived from the Slakh2100 dataset, each with 4 different versions (produced under various augmentation settings), for a total duration of 419 hours.

The testing results for the checkpoint on MelodySim Dataset testing split are as follows:

|           | Precision | Recall | F1   |
|-----------|-----------|--------|------|
| Different | 1.00      | 0.94   | 0.97 |
| Similar   | 0.94      | 1.00   | 0.97 |
| Average   | 0.97      | 0.97   | 0.97 |

Accuracy: 0.97

Citation

If you find this work useful in your research, please cite:

@article{lu2025melodysim,
  title={MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection},
  author={Tongyu Lu and Charlotta-Marlena Geist and Jan Melechovsky and Abhinaba Roy and Dorien Herremans},
  year={2025},
  journal={arXiv:2505.20979}
}