Google’s new AI system is capable of generating music from text

Google has created an AI system, MusicLM, that generates musical pieces up to several minutes long from text prompts, much as DALL-E creates images from text inputs. It can also take an existing melody and render it with different instruments.

MusicLM is a neural network-based system trained on more than 280,000 hours of music, which allows it to generate original tracks across a range of instruments, genres, and themes from text input. The genres it covers include jazz, pop, rock, and death metal, among others.

MusicLM also supports painting caption conditioning, in which the AI generates music from the description of a painting. Other samples include 5-minute pieces generated from simple phrases like “melodic techno.” A standout demo is the “story mode,” in which the AI is given a script made up of several prompts and transitions from one to the next.

A paper posted on arXiv, the preprint server hosted by Cornell University, states, “We introduce MusicLM, a model generating high-fidelity music from text descriptions such as ‘a calming violin melody backed by a distorted guitar riff’. MusicLM casts the process of conditional music generation as a hierarchical sequence-to-sequence modeling task, and it generates music at 24 kHz that remains consistent over several minutes.”

The technology is not yet available for the public to try, but Google has uploaded samples to demonstrate the kind of music MusicLM can generate from text. The company’s page shares more than 30 samples generated from rich text descriptions.