Do It Yourself

Audio Diffusion

Create music from spectrograms generated with diffusion models

Because audio can also be represented as an image by transforming it into a spectrogram, a diffusion model (like Stable Diffusion↗︎ or Midjourney↗︎) can be trained on a set of spectrograms generated from a directory of audio files. The trained model is then used to synthesize similar spectrograms, which are converted back into audio.
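
The sketch below illustrates the spectrogram round-trip this approach relies on: audio is turned into a mel spectrogram "image" (the kind of data the diffusion model trains on) and then converted back to a waveform. The file names and parameters are placeholder assumptions, not the project's exact settings.

```python
# Minimal sketch of the audio <-> spectrogram round-trip.
# File names and parameters are placeholders, not the project's exact settings.
import numpy as np
import librosa
import soundfile as sf

SAMPLE_RATE = 22050
N_FFT = 2048
HOP_LENGTH = 512
N_MELS = 256          # spectrogram height, i.e. one image dimension

# Load a short audio clip (placeholder path)
y, sr = librosa.load("loop.wav", sr=SAMPLE_RATE)

# Forward transform: audio -> mel spectrogram, scaled to a dB "image"
mel = librosa.feature.melspectrogram(
    y=y, sr=sr, n_fft=N_FFT, hop_length=HOP_LENGTH, n_mels=N_MELS
)
mel_db = librosa.power_to_db(mel, ref=np.max)   # 2-D array, roughly in [-80, 0] dB

# This 2-D array is what would be saved as a greyscale image and fed to the
# diffusion model during training.

# Inverse transform: mel spectrogram -> audio (Griffin-Lim phase estimation)
mel_power = librosa.db_to_power(mel_db)
y_rec = librosa.feature.inverse.mel_to_audio(
    mel_power, sr=sr, n_fft=N_FFT, hop_length=HOP_LENGTH
)
sf.write("loop_reconstructed.wav", y_rec, sr)
```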

You can play around with some pre-trained models on Google Colab or in Hugging Face Spaces. Check out some automatically generated loops here.
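
As a rough sketch of what running a pre-trained model looks like, the snippet below assumes the audio-diffusion pipeline available through Hugging Face diffusers and the teticio/audio-diffusion-256 checkpoint; output attribute names and the sample rate are assumptions and may differ between library versions.

```python
# Hedged sketch: sample a new loop from a pre-trained audio-diffusion model.
# Checkpoint name, output attributes and sample rate are assumptions.
import torch
import soundfile as sf
from diffusers import DiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = DiffusionPipeline.from_pretrained("teticio/audio-diffusion-256").to(device)

output = pipe()                        # denoise random noise into a spectrogram
image = output.images[0]               # generated spectrogram as a PIL image
audio = output.audios[0].squeeze()     # spectrogram converted back to a waveform
                                       # (array shape may vary by version)

# 22050 Hz is assumed for this checkpoint
sf.write("generated_loop.wav", audio, 22050)
```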