Meta launches AudioCraft, an open-source AI music generator

2023-08-03 16:26

Meta's new AI music generator is the latest in a slew of AI products recently released by the tech company.

On Wednesday, Meta announced the release of AudioCraft, an open-source generative AI tool that creates audio and music from text prompts. AudioCraft comprises three models: MusicGen for composing music, AudioGen for creating sound effects, and EnCodec, an AI-based audio codec whose compression outperforms the MP3 format.

In case you were wondering about copyright issues, MusicGen was trained on Meta-owned and licensed music.

Meta has been aggressively pushing to bring AI-powered tools to the masses in competition with OpenAI, Google, and Microsoft. In July, it released its open-source Llama 2, the newest version of its LLM (large language model).

Unlike OpenAI's GPT-4 and Google's PaLM 2, Llama 2 is open-source, which wins Meta points among developers and ethicists who believe in transparency in AI development. There are also rumors of Meta launching AI "personas," aka chatbots, for Instagram, Facebook, and WhatsApp.

AudioCraft was designed with musicians and sound designers in mind to "provide inspiration, help people quickly brainstorm, and iterate on their compositions in new ways," the announcement said.

Examples in the blog post include audio samples from the prompts "Whistling with wind blowing" and "Pop dance track with catchy melodies, tropical percussions, and upbeat rhythms, perfect for the beach," which... successfully sound like those descriptions.

Much of the recent development in generative AI has focused on text and image generation, which is a simpler process.

Text-to-audio is a more complicated undertaking that Meta seems to have cracked. AudioCraft learns audio tokens from raw signals using its own EnCodec neural audio codec to create a new "vocabulary" for the model.

It then trains language models over this audio vocabulary so that the model understands the associations between audio and text. Since AudioCraft is also open-source, the code is available on GitHub for users to explore and test out for themselves.
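
For readers curious what an audio "vocabulary" might look like, here is a deliberately tiny sketch of the general idea of turning a raw signal into discrete tokens. It is not Meta's code and not how EnCodec works internally: EnCodec learns its codebooks with a neural network, while the `CODEBOOK` below, the two-sample frame size, and the helper names `frame_signal`, `nearest_code`, `encode`, and `decode` are all made up for illustration.

```python
def frame_signal(samples, frame_size):
    """Split a list of samples into consecutive, non-overlapping frames."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples) - frame_size + 1, frame_size)]


def nearest_code(frame, codebook):
    """Return the index of the codebook entry closest to the frame
    (squared Euclidean distance)."""
    def dist(code):
        return sum((a - b) ** 2 for a, b in zip(frame, code))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def encode(samples, codebook):
    """Turn raw samples into a list of integer tokens."""
    frame_size = len(codebook[0])
    return [nearest_code(f, codebook) for f in frame_signal(samples, frame_size)]


def decode(tokens, codebook):
    """Turn tokens back into an approximate waveform by codebook lookup."""
    out = []
    for t in tokens:
        out.extend(codebook[t])
    return out


# Hypothetical hand-written codebook: four "words", two samples each.
# A real neural codec would learn these values from data.
CODEBOOK = [
    [0.0, 0.0],    # near silence
    [0.5, 0.5],    # steady positive
    [-0.5, -0.5],  # steady negative
    [0.5, -0.5],   # falling
]

signal = [0.02, -0.01, 0.45, 0.55, 0.6, -0.4, -0.52, -0.47]
tokens = encode(signal, CODEBOOK)
reconstructed = decode(tokens, CODEBOOK)
print(tokens)
```

Once audio is a sequence of integers like this, a language model can in principle be trained to predict the next token, much as text models predict the next word. The reconstruction is only an approximation of the original signal, which is also why a codec of this kind doubles as a compression scheme.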