<!doctype html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width" />
<title>Melodiff MusicLDM v2</title>
<link rel="stylesheet" href="style.css" />
</head>
<body>
<div class="card">
<h1>Melodiff MusicLDM v2</h1>
<p>This is the next version after <a href="https://huggingface.co/spaces/JanBabela/Riffusion-Melodiff-v1" target="_blank">Melodiff Riffusion v1</a>.</p>
<p>Melodiff MusicLDM continues to explore the idea of using the audio-to-audio pipeline of Stable Diffusion audio models to create cover versions of songs.</p>
<p><br>Melodiff MusicLDM uses the <a href="https://huggingface.co/ucsd-reach/musicldm" target="_blank">MusicLDM model</a> as the base model for audio generation.</p>
<p>What was done and what is presented here: the base pipeline was taken apart and reassembled so that it supports audio-to-audio modification.</p>
<p>No new model was trained and no fine-tuning was done; only the base pipeline was modified.</p>
<p><br>MusicLDM generates better-quality audio than the Riffusion model used in v1. It generates 10 s samples, compared with 5 s samples for the previous model.</p>
<p>Generation speed also improved: previously it took about 8 s to generate a 5 s sample of mono audio; now it takes about 8 s to generate a 10 s sample of stereo audio.</p>
<p>Consistency improved as well. Previously only about 30% of modified samples were good (or OK), and some experimenting with prompts and seeds was needed to find good sound quality.</p>
<p>Now about 70% of modified samples are good (or OK).</p>
<p>As before, longer modifications are possible by splitting the audio into samples, modifying each one, and concatenating the results.</p>
<p>The underlying MusicLDM model is two years old. It would be interesting to try newer models, which have notably better quality.</p>
<p><br> Examples of music generated by modifying the underlying song: <br></p>
<p>
Bella Ciao, originally played by saxophone, modified to be played by electric guitar
<audio controls>
<source src="BellaElGuitar.wav" type="audio/wav">
Your browser does not support the audio element.
</audio>
</p>
<p>
Bella Ciao, originally played by violin, modified to be played by piano
<audio controls>
<source src="BellaPiano.wav" type="audio/wav">
Your browser does not support the audio element.
</audio>
</p>
<p>
Iko Iko, originally played by saxophone, modified to be played by violin
<audio controls>
<source src="IkoViolin.wav" type="audio/wav">
Your browser does not support the audio element.
</audio>
</p>
<p>
When the Saints, originally played by saxophone, modified to be played by strings
<audio controls>
<source src="SaintsStrings.wav" type="audio/wav">
Your browser does not support the audio element.
</audio>
</p>
<p><br> Examples of an original sample with its modified versions: <br></p>
<p>
Saxophone solo, original
<audio controls>
<source src="MindscapeResampled.wav" type="audio/wav">
Your browser does not support the audio element.
</audio>
</p>
<p>
Modified to be played by violin
<audio controls>
<source src="MindScapeViolin.wav" type="audio/wav">
Your browser does not support the audio element.
</audio>
</p>
<p>
Modified to be played by electric guitar
<audio controls>
<source src="MindScapeElguitar.wav" type="audio/wav">
Your browser does not support the audio element.
</audio>
</p>
</div>
</body>
</html>