Generative AI Models for Creatives

Audio Diffusion

Do It Yourself
Diffusion
Song
Music
Sound

Synthesize music with diffusion models

Audio spectrogram
Audio can be represented as an image by transforming it into a spectrogram (pictured above). A diffusion model of the kind used for image generation (like Stable Diffusion or Midjourney) is trained on a set of spectrograms that have been generated from a directory of audio files. The trained model is then used to synthesize similar spectrograms, which are converted back into audio.
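The round trip between audio and a spectrogram image can be sketched with NumPy alone. This is a minimal illustration, not the project's actual code: it assumes a plain (linear-frequency) STFT, an 8-bit grayscale image with an 80 dB dynamic range, and Griffin-Lim phase estimation for the way back. All function names, the constants `N_FFT`, `HOP`, `TOP_DB`, and the 8 kHz test tone are hypothetical choices made for the example. The diffusion model itself is left out; it would sit between the two steps and generate new images of the same shape.

```python
import numpy as np

N_FFT = 512     # assumed FFT size
HOP = 128       # assumed hop length (N_FFT / 4)
TOP_DB = 80.0   # assumed dynamic range stored in the image


def stft(signal, n_fft=N_FFT, hop=HOP):
    """Short-time Fourier transform, returned as (freq_bins, frames)."""
    padded = np.pad(signal, n_fft // 2, mode="reflect")
    window = np.hanning(n_fft)
    n_frames = 1 + (len(padded) - n_fft) // hop
    frames = np.stack(
        [padded[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T


def istft(spec, n_fft=N_FFT, hop=HOP):
    """Inverse STFT by windowed overlap-add, undoing the padding of stft()."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec.T, n=n_fft, axis=1) * window
    length = n_fft + hop * (frames.shape[0] - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[n_fft // 2:-(n_fft // 2)]


def magnitude_to_image(mag, top_db=TOP_DB):
    """Map STFT magnitudes to an 8-bit image on a decibel scale."""
    db = 20.0 * np.log10(np.maximum(mag, 1e-10))
    ref_db = db.max()
    db = np.clip(db - ref_db, -top_db, 0.0)
    image = np.round((db + top_db) / top_db * 255.0).astype(np.uint8)
    return image, ref_db


def image_to_magnitude(image, ref_db, top_db=TOP_DB):
    """Invert magnitude_to_image (up to 8-bit quantization)."""
    db = image.astype(np.float64) / 255.0 * top_db - top_db + ref_db
    return 10.0 ** (db / 20.0)


def griffin_lim(mag, n_iter=32, seed=0):
    """Estimate a phase for the magnitudes, since the image stores none."""
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))
    for _ in range(n_iter):
        rebuilt = stft(istft(mag * phase))
        phase = rebuilt / np.maximum(np.abs(rebuilt), 1e-10)
    return istft(mag * phase)


# Demo: a 440 Hz tone becomes an image and is turned back into audio.
sample_rate = 8000  # assumed sample rate for the demo
tone = np.sin(2 * np.pi * 440.0 * np.arange(8192) / sample_rate)
image, ref_db = magnitude_to_image(np.abs(stft(tone)))
restored = griffin_lim(image_to_magnitude(image, ref_db))
print(image.shape, image.dtype, restored.shape)
```

Because the image keeps only magnitudes, the restored audio matches the original in spectral content rather than sample by sample. That is also why a generated spectrogram, which never had a phase, can still be converted into sound.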

You can play around with some pre-trained models on Google Colab or Hugging Face spaces. Check out some automatically generated loops here.

Model details
Author: Robert Dargavel Smith
Published in: 2022
Architecture: diffusion
License: GPL-3.0