JP van Oosten

Using image generation AI to create music with a text-prompt

Jan 12, 2023

You probably know that AI can generate text with ChatGPT, and images with tools like DALL-E and Stable Diffusion. A more surprising use of Stable Diffusion is generating music. This new take on image generation has the model produce spectrograms, which are then turned into audio.

A spectrogram is a visual representation of the frequencies in a sound and how they change over time. With a bit of tinkering, some fine-tuning of Stable Diffusion on spectrogram images, and a step that converts a spectrogram back into audio, you get Riffusion! It’s now easy to create endless streams of generated music. I found “from typing to jazz” really cool, and style transfer is also very interesting, for example: classical music in the style of Miles Davis.
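To make the idea of a spectrogram a bit more concrete, here is a minimal sketch that computes one with NumPy. This is not Riffusion's code. The `spectrogram` helper, the frame size, the hop length, and the 440 Hz test tone are all assumptions chosen for illustration.

```python
import numpy as np

def spectrogram(signal, frame_size=1024, hop=256):
    """Magnitude spectrogram (hypothetical helper, not part of Riffusion).

    Rows are frequency bins, columns are time frames.
    """
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Illustrative input: one second of a 440 Hz sine tone (assumed values).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = spectrogram(tone)

# Frequency (in Hz) of each row, and the row with the most energy overall.
freqs = np.fft.rfftfreq(1024, d=1 / sample_rate)
peak_hz = freqs[spec.mean(axis=1).argmax()]
print(spec.shape, peak_hz)
```

Plotted as an image, `spec` would show a single bright horizontal line near 440 Hz. Real music produces a much richer picture, and that kind of picture is what the image model learns to generate. Note that this sketch throws away the phase of the signal, so going from a spectrogram back to audio needs an extra reconstruction step that estimates it.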

You can learn more about it here:

