Use GPT-4 to Generate MusicLM Prompts. Chef’s Kiss.

I just got access to MusicLM from Google’s AI Test Kitchen (ATK). I clicked around the ATK and used some of the prebuilt prompts to generate music. Wow. The idea is that you use text prompts to create music. You just have to describe exactly what you want so that the MusicLM model can generate the audio for you.

Google’s AI Test Kitchen

MusicLM works incredibly well, but I quickly ran out of ideas for prompts. The future of AI is Chimeric AI. Let’s use the models together. Here’s a prompt template for GPT-4 that helps you generate text prompts for MusicLM.

GPT-4 Prompt for MusicLM

Let’s take a closer look at this GPT-4 prompt to better understand why it works the way it does.

  • Write an English-only prompt to generate music. – This grounds the model: its output should be an English text prompt that will serve as the input to a generative AI model for music.
  • The music is lo-fi, mostly played in the background while I am writing. – This part of the prompt is used to describe the music that I want MusicLM to generate. Replace with your music. Experiment with how detailed you want to be.
  • Generate a complete track 20 seconds in length. – The MusicLM model currently produces 20 seconds of music. I found that asking for a complete 20-second track makes the result sound like a finished piece instead of being cut off.
  • Limit to 100 tokens for the prompt. – This limits the output of GPT-4 to 100 words (ish). If you do not set a limit, ChatGPT will write a very long prompt.
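
If you want to reuse the template for different kinds of music, the four pieces above can be stitched together with a few lines of Python. This is only a rough sketch: the function name `build_musiclm_meta_prompt` and its parameters are made up for illustration and are not part of any GPT-4 or MusicLM API. It just builds the text you would paste into ChatGPT.

```python
def build_musiclm_meta_prompt(
    music_description: str,
    seconds: int = 20,       # MusicLM currently produces 20 seconds of music
    token_limit: int = 100,  # keeps the generated prompt short
) -> str:
    """Assemble the GPT-4 prompt from the four pieces described above.

    Hypothetical helper for illustration only; it calls no external API.
    """
    # Drop surrounding whitespace and a trailing period so the sentence reads cleanly.
    description = music_description.strip().rstrip(".")
    if not description:
        raise ValueError("Describe the music you want, e.g. 'lo-fi, calm, for writing'.")

    return " ".join([
        "Write an English-only prompt to generate music.",
        f"The music is {description}.",
        f"Generate a complete track {seconds} seconds in length.",
        f"Limit to {token_limit} tokens for the prompt.",
    ])


prompt = build_musiclm_meta_prompt(
    "lo-fi, mostly played in the background while I am writing"
)
print(prompt)
```

Swap the description for your own music, paste the printed text into GPT-4, and then paste GPT-4's answer into MusicLM.
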
MusicLM Lo-Fi

