We see the AudioCraft family of models as tools for musicians’ and sound designers’ professional toolboxes: they can provide inspiration, help people brainstorm quickly, and support iterating on compositions in new ways.
Meta AI has released the open-source code for AudioCraft, a generative AI framework capable of synthesizing high-quality, realistic audio and music from user text inputs.
The AudioCraft Family of Models
The framework consists of three different models: MusicGen, AudioGen, and EnCodec. MusicGen, trained on 20,000 hours of music owned by Meta or specifically licensed for this purpose, generates music from a given text input. AudioGen generates audio from text-based prompts. This model was trained on publicly available sound effects and can generate environmental sounds such as a barking dog or a car honk.
The third model is an improved version of Meta’s EnCodec codec, which compresses audio files and reconstructs the original signal. This allows users to generate quality music with fewer artifacts.
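Since the code is open source, the models can be tried directly. Below is a minimal sketch of generating a short clip with MusicGen, based on the `audiocraft` package’s published API; the checkpoint name, parameter names, and helper functions follow the project’s documentation and may change between releases:

```python
# Minimal sketch using the released audiocraft package (API per the
# project's documentation; names and defaults may change).
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Load the smallest pretrained checkpoint.
model = MusicGen.get_pretrained('facebook/musicgen-small')
model.set_generation_params(duration=8)  # seconds of audio to generate

# Generate one clip per text prompt; returns a batch of waveform tensors.
wav = model.generate(['upbeat acoustic folk with hand claps'])

# Save the first clip as an audio file with loudness normalization.
audio_write('sample', wav[0].cpu(), model.sample_rate, strategy='loudness')
```

Larger checkpoints trade generation speed for audio quality, so the small model is a reasonable starting point for experimentation.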
AI-generated music is not particularly new. Earlier this year, a wave of clips circulated on social media featuring different artists performing special “covers.” The use of artificial intelligence to clone celebrities’ voices caused quite a sensation across various social media platforms, racking up millions of views.
This wave saw an AI version of Drake “perform” various music covers and artists “release” new music, such as the viral Drake and The Weeknd collaboration, before it was taken down from various streaming services. This type of music did not sit well with most artists and major recording labels, who cited copyright infringement in the data used to train the AI models.
In a blog post, Meta has pointed out the importance of building AI responsibly. Along these lines, the company has acknowledged the lack of diversity in its training datasets. “In particular, the music dataset used contains a larger portion of western-style music and only contains audio-text pairs with text and metadata written in English.” Meta adds, “By sharing the code for AudioCraft, we hope other researchers can more easily test new approaches to limit or eliminate potential bias in and misuse of generative models.”
Championing Open-Source AI Development
Meta has been on a campaign promoting open-source AI development. The company points to the importance of collaboration in research and development. In a blog post, the team states, “Responsible innovation can’t happen in isolation. Open sourcing our research and resulting models helps ensure that everyone has equal access.”
The company has recently been releasing open-source code for its generative models alongside free publications. By doing so, Meta enables the community to build on its models, further advancing generative AI development. The company’s AI research team recently released the open-source code for Llama 2. In addition, Meta has published the Llama 2 paper as it seeks to advance the development of Large Language Models (LLMs).
Meta believes having a solid open-source foundation will foster innovation. The company adds, “With even more controls, we think MusicGen can turn into a new type of instrument — just like synthesizers when they first appeared.”