MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection

Github | Paper | Dataset

This is a checkpoint for MelodySim, a MERT-based music audio similarity model that can be used for melody similarity detection. The checkpoint builds on the pre-trained weights of m-a-p/MERT-v1-95M.

Usage

  1. Clone the MelodySim GitHub repo
git clone https://github.com/AMAAI-Lab/MelodySim.git
cd MelodySim
pip install -r requirements.txt
  2. Download the model checkpoint
from huggingface_hub import hf_hub_download

repo_id = "amaai-lab/MelodySim"
model_path = hf_hub_download(repo_id=repo_id, filename="siamese_net_20250328.ckpt")

Or, using wget on Linux:

wget https://huggingface.co/amaai-lab/MelodySim/resolve/main/siamese_net_20250328.ckpt

  3. Run inference. Use inference.py to run the model on two audio files, analyzing their similarity and reaching a decision on whether or not they are the same song. We provide a positive pair and a negative pair as examples:
python inference.py -audio-path1 ./data/example_wavs/Track01968_original.mp3 -audio-path2 ./data/example_wavs/Track01976_original.mp3 -ckpt-path path/to/checkpoint.ckpt
python inference.py -audio-path1 ./data/example_wavs/Track01976_original.mp3 -audio-path2 ./data/example_wavs/Track01976_version1.mp3 -ckpt-path path/to/checkpoint.ckpt

Feel free to play around with the hyperparameters:

  • -window-len-sec, -hop-len-sec (how the input audios are segmented into windows);
  • --proportion-thres (the proportion of similar segments required for the two pieces to be considered the same);
  • --decision-thres (between 0 and 1; the smallest similarity value at which two segments are considered the same);
  • --min-hits (for each window in piece 1, the minimum number of similar windows in piece 2 required to flag that window as plagiarized).
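To make the roles of these thresholds concrete, here is a minimal sketch of the windowed decision rule they describe. The function name, the toy similarity matrix, and the default values are assumptions for illustration; the actual logic lives in inference.py in the repo.

```python
import numpy as np

def decide_same_song(sim_matrix, decision_thres=0.85, min_hits=1, proportion_thres=0.5):
    """Hypothetical sketch of the windowed decision rule.

    sim_matrix[i, j] is the similarity (in [0, 1]) between window i of
    piece 1 and window j of piece 2.
    """
    # A window in piece 1 counts as "plagiarized" if at least `min_hits`
    # windows in piece 2 exceed the similarity threshold (--decision-thres).
    hits_per_window = (sim_matrix >= decision_thres).sum(axis=1)
    plagiarized = hits_per_window >= min_hits
    # The pieces are judged the same song when a large enough proportion
    # of piece-1 windows is flagged (--proportion-thres).
    return bool(plagiarized.mean() >= proportion_thres)

# Toy example: 2 of the 3 piece-1 windows have a strong match in piece 2.
sim = np.array([[0.90, 0.20],
                [0.95, 0.10],
                [0.30, 0.20]])
print(decide_same_song(sim))  # 2/3 >= 0.5, so True
```

Raising --proportion-thres (e.g. to 0.8 here) would flip the decision, since only two-thirds of the windows are flagged.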
  4. Training and testing details are summarized in the MelodySim GitHub repo. You may need the MelodySim dataset, which contains 1,710 valid synthesized pieces derived from the Slakh2100 dataset, each with 4 different versions (produced under various augmentation settings), for a total duration of 419 hours.

The testing results for the checkpoint on MelodySim Dataset testing split are as follows:

|           | Precision | Recall | F1   |
|-----------|-----------|--------|------|
| Different | 1.00      | 0.94   | 0.97 |
| Similar   | 0.94      | 1.00   | 0.97 |
| Average   | 0.97      | 0.97   | 0.97 |

Accuracy: 0.97

Citation

If you find this work useful in your research, please cite:

@article{lu2025melodysim,
  title={MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection},
  author={Tongyu Lu and Charlotta-Marlena Geist and Jan Melechovsky and Abhinaba Roy and Dorien Herremans},
  year={2025},
  journal={arXiv:2505.20979}
}