Computer chip giant Nvidia entered the music AI race by announcing its new model, Fugatto, on Tuesday (November 26). The company calls Fugatto, short for Foundational Generative Audio Transformer Opus 1, a “Swiss army knife for audio.”
Using text or audio prompts, Fugatto can create new music at the touch of a button and edit existing audio in seconds, including adding instruments to a song or removing them, or changing the accent and emotion in a voice.
With Fugatto, Nvidia aims to take on today's top AI music models, including Suno, Udio and many others. Although Nvidia is late to the race to build the best music AI model, Fugatto appears to offer clean sound quality and a host of features that could change the music-making process for producers and composers.
According to the announcement on Nvidia's blog, “One of the most difficult parts of the effort was creating a combined dataset containing millions of audio samples used for training,” which the company says it worked on for more than a year to get right. “The team used a multifaceted strategy to generate data and instructions that greatly expanded the range of tasks the model could perform, while achieving more accurate performance and enabling new tasks without requiring additional data.” The company says its model is trained on open-source datasets licensed under Creative Commons and complies with copyright law.
Nvidia suggests a number of use cases for Fugatto, including scoring for visual media, editing certain parts of a score, and changing a voice to have different accents, emotions and timbres. “Fugatto can make a trumpet bark or a saxophone meow. Whatever the users can describe, the model can create,” says Raphael Valle, director of applied audio research at Nvidia.
“The history of music is also the history of technology,” says Ido Zmishlany, producer/songwriter and co-founder of One Take Audio, a member of Nvidia Inception, the company's startup program. “With artificial intelligence we are writing the next chapter of music. We have a new instrument, a new tool for making music, and that's extremely exciting.”
Nvidia claims this is the first musical AI model to exhibit “emergent properties – capabilities that arise from the interaction of its various trained abilities – and the ability to combine free-form instructions.” Valle adds that Fugatto is “our first step toward a future where unsupervised multitasking learning in audio composition and conversion emerges from data and model scale.”
There's just one catch: so far, the company describes it as an internal research project, not available to the public.