MusicLM, an experimental AI system developed by Google, converts written descriptions into musical compositions. It is available as a tool within the AI Test Kitchen app (web, Android, iOS): users type in a prompt, and the tool generates multiple versions of a song based on the input. Users can shape their MusicLM-generated creations by specifying instrument types such as “electronic” or “classical” and by indicating the “vibe, mood, or emotion” they’re going for.
MusicLM models conditional music generation as a hierarchical sequence-to-sequence task and produces 24 kHz audio that remains consistent over several minutes. Experiments show that MusicLM outperforms competing systems in both audio quality and adherence to the text description. The Google researchers also show that MusicLM can be conditioned on both text and a melody: it can transform whistled and hummed tunes to match the style described in a text caption. To support further research, the team has publicly released MusicCaps, a dataset of 5.5k music-text pairs with rich text descriptions written by human experts.
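To make the "hierarchical" part concrete, here is a minimal sketch of the stage structure the paper describes (conditioning tokens, then coarse semantic tokens, then fine acoustic tokens that a neural codec would decode to 24 kHz audio). The function bodies, token counts, and vocabulary sizes below are placeholders for illustration, not the real MuLan, w2v-BERT, or SoundStream models.

```python
# Illustrative sketch of a hierarchical sequence-to-sequence pipeline in the
# style of MusicLM. Every function body here is a dummy stand-in; only the
# stage ordering and the semantic->acoustic fan-out mirror the paper.

def mulan_tokens(text: str) -> list[int]:
    """Stand-in for quantized MuLan text-conditioning tokens."""
    return [hash(word) % 1024 for word in text.split()]

def semantic_stage(conditioning: list[int], n_tokens: int = 25) -> list[int]:
    """First stage: model long-term structure as coarse semantic tokens."""
    return [(t * 31 + i) % 1024 for i, t in enumerate(conditioning * n_tokens)][:n_tokens]

def acoustic_stage(semantic: list[int], tokens_per_semantic: int = 4) -> list[int]:
    """Second stage: expand each semantic token into several fine-grained
    acoustic tokens, which a neural audio codec would decode to a waveform."""
    return [(s + j) % 4096 for s in semantic for j in range(tokens_per_semantic)]

prompt = "mellow piano melody with a sustained synth lead"
conditioning = mulan_tokens(prompt)
semantic = semantic_stage(conditioning)
acoustic = acoustic_stage(semantic)
# Each semantic token fans out into multiple acoustic tokens; generating
# structure coarse-to-fine is what lets the output stay coherent over minutes.
```

The key design point is the fan-out: a short semantic sequence fixes the long-range structure cheaply, and only the final stage pays the cost of fine temporal resolution.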
The MusicCaps dataset contains 5,521 musical examples, each accompanied by an English aspect list and a free-text caption written by a musician. An aspect list looks like this: “pop, tinny wide hi-hats, mellow piano melody, high pitched female vocal melody, sustained pulsating synth lead.” The caption consists of several sentences describing the music, for example: “A low-sounding male voice is rapping over fast-paced drums playing a reggaeton beat along with a bass. Something like a guitar is playing the melody along. This is a low-quality recording. Some laughter can be heard in the background. This song may be playing in a bar.” The text describes only the music itself, not metadata such as the artist’s name. The 10-second clips are drawn from AudioSet and split into 2,663 training and 2,858 evaluation examples.
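A MusicCaps-style record is easy to model in code. The sketch below shows one way to represent an example and split its aspect list into individual tags; the field names and class are illustrative assumptions, so consult the released dataset for the actual schema.

```python
# Minimal sketch of a MusicCaps-style record. Field names are assumptions
# for illustration, not the dataset's actual column names.
from dataclasses import dataclass

@dataclass
class MusicCapsExample:
    aspect_list: str   # comma-separated tags written by a musician
    caption: str       # free-text description of the music itself

    def aspects(self) -> list[str]:
        """Split the comma-separated aspect list into individual tags."""
        return [a.strip() for a in self.aspect_list.split(",") if a.strip()]

example = MusicCapsExample(
    aspect_list=("pop, tinny wide hi-hats, mellow piano melody, "
                 "high pitched female vocal melody, sustained pulsating synth lead"),
    caption=("A low-sounding male voice is rapping over fast-paced drums "
             "playing a reggaeton beat along with a bass."),
)
print(example.aspects()[0])  # -> "pop"
```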
Example audio and music generations are available on the project page.
In a January research paper, Google previewed MusicLM but said it had “no immediate plans” to release the model. As the paper’s authors pointed out, the approach raises numerous ethical concerns, such as the risk of incorporating copyrighted material from the training data into generated songs. Google has since been running workshops with musicians to “see how [the] technology can empower the creative process.” One visible result: the version of MusicLM in the AI Test Kitchen will not produce songs that mention particular artists or include vocals. Take that for what it’s worth; the larger problems with generative music don’t have a simple solution.
Amateur tracks that use generative AI to produce recognizable sounds convincing enough to be passed off as authentic have recently become popular. Music labels have been quick to flag such songs to their streaming partners, citing intellectual-property concerns.
Check out the Paper, Project, and Dataset. Don’t forget to join our 21k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at Asif@marktechpost.com
Dhanshree Shenwai is a Computer Science Engineer with solid experience in FinTech companies covering the Financial, Cards & Payments, and Banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier in today's evolving world.