The Times Australia

Text-to-audio generation is here. One of the next big AI disruptions could be in the music industry

  • Written by Oliver Bown, Postdoctoral fellow, UNSW Sydney

The past few years have seen an explosion in applications of artificial intelligence to creative fields. A new generation of image and text generators is delivering impressive[1] results[2]. Now AI has found applications in music, too.

Last week, a group of researchers at Google released MusicLM[3] – an AI-based music generator that can convert text prompts into audio segments. It’s another example of the rapid pace of innovation in an incredible few years for creative AI.

With the music industry still adjusting to disruptions caused by the internet and streaming services, there’s a lot of interest in how AI might change the way we create and experience music.

Read more: Neil Young’s ultimatum to Spotify shows streaming platforms are now a battleground where artists can leverage power[4]

Automating music creation

A number of AI tools now allow users to automatically generate musical sequences or audio segments. Many are free and open source, such as Google’s Magenta[5] toolkit.

Two of the most familiar approaches in AI music generation are:

  1. continuation, where the AI continues a sequence of notes or waveform data, and

  2. harmonisation or accompaniment, where the AI generates something to complement the input, such as chords to go with a melody.
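Both approaches can be illustrated with a toy sketch. The Python below is not how Magenta or MuseNet work (those rely on trained neural networks). It swaps in a simple lookup table of which note tends to follow which, plus a fixed chord chart for C major. The training melody, function names and chord table are all assumptions made up for this illustration.

```python
import random
from collections import defaultdict

def train_transitions(melody):
    """Record which pitch follows which in the training melody."""
    table = defaultdict(list)
    for current, following in zip(melody, melody[1:]):
        table[current].append(following)
    return table

def continue_melody(seed, table, length, rng):
    """Continuation: extend the seed one note at a time."""
    notes = list(seed)
    for _ in range(length):
        options = table.get(notes[-1])
        if not options:  # unseen pitch: fall back to repeating it
            options = [notes[-1]]
        notes.append(rng.choice(options))
    return notes

# Triads of C major as pitch classes (0 = C, 2 = D, 4 = E, ...).
C_MAJOR_TRIADS = {
    "C": {0, 4, 7}, "Dm": {2, 5, 9}, "Em": {4, 7, 11},
    "F": {5, 9, 0}, "G": {7, 11, 2}, "Am": {9, 0, 4},
}

def harmonise(melody):
    """Accompaniment: pick the first listed triad that contains each note."""
    chords = []
    for pitch in melody:
        pitch_class = pitch % 12
        chord = next(
            (name for name, members in C_MAJOR_TRIADS.items()
             if pitch_class in members),
            None,  # note is outside C major
        )
        chords.append(chord)
    return chords

# Hypothetical training data: MIDI pitch numbers (60 = middle C).
training = [60, 62, 64, 65, 67, 65, 64, 62, 60]
table = train_transitions(training)

extended = continue_melody([60, 62], table, length=6, rng=random.Random(0))
chords = harmonise(extended)
print(extended)
print(chords)
```

The continuation only ever produces note pairs it saw during training, which hints at why the choice of training data shapes the style of the output.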

Similar to text- and image-generating AI, music AI systems can be trained on a number of different data sets. You could, for example, extend a melody by Chopin using a system trained in the style of Bon Jovi – as beautifully demonstrated in OpenAI’s MuseNet[6].

Such tools can be great inspiration for artists with “blank page syndrome”, even if the artists themselves provide the final push. Creative stimulation is one of the immediate applications of creative AI tools today.

But where these tools may one day be even more useful is in extending musical expertise. Many people can write a tune, but fewer know how to adeptly manipulate chords to evoke emotions, or how to write music in a range of styles.

Although music AI tools have a way to go to reliably do the work of talented musicians, a handful of companies are developing AI platforms for music generation.

Boomy[7] takes the minimalist path: users with no musical experience can create a song with a few clicks and then rearrange it. Aiva[8] has a similar approach, but allows finer control; artists can edit the generated music note-by-note in a custom editor.

There is a catch, however. Machine learning techniques are famously hard to control, and generating music using AI is a bit of a lucky dip for now; you might occasionally strike gold while using these tools, but you may not know why.

An ongoing challenge for people creating these AI tools is to allow more precise and deliberate control over what the generative algorithms produce.

New ways to manipulate style and sound

Music AI tools also allow users to transform a musical sequence or audio segment. Google Magenta’s Differentiable Digital Signal Processing[9] library, for example, performs timbre transfer.

Timbre[10] is the technical term for the texture of the sound – the difference between a car engine and a whistle. Using timbre transfer, the timbre of a segment of audio can be changed.
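One rough way to picture timbre is as the mix of overtones sitting above a note's basic pitch. The sketch below is an illustration only, not the method used by the Magenta library. It builds two tones with the same pitch but different overtone strengths, then measures those strengths again. The sample rate, pitch and weights are arbitrary values chosen for the example.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)
FUNDAMENTAL = 200   # Hz; both tones share this pitch
N = 400             # 0.05 s of audio, exactly 10 cycles of the fundamental

def synthesise(harmonic_weights):
    """Sum sine waves at integer multiples of the fundamental."""
    return [
        sum(
            weight * math.sin(
                2 * math.pi * FUNDAMENTAL * (k + 1) * n / SAMPLE_RATE
            )
            for k, weight in enumerate(harmonic_weights)
        )
        for n in range(N)
    ]

def harmonic_strength(signal, harmonic):
    """Amplitude of one harmonic, found by correlating with sine and cosine."""
    freq = FUNDAMENTAL * harmonic
    re = sum(
        s * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
        for n, s in enumerate(signal)
    )
    im = sum(
        s * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
        for n, s in enumerate(signal)
    )
    return 2 * math.hypot(re, im) / len(signal)

mellow = synthesise([1.0, 0.1, 0.05])  # energy mostly in the fundamental
bright = synthesise([1.0, 0.8, 0.6])   # strong upper harmonics

mellow_profile = [harmonic_strength(mellow, h) for h in (1, 2, 3)]
bright_profile = [harmonic_strength(bright, h) for h in (1, 2, 3)]
print([round(v, 2) for v in mellow_profile])
print([round(v, 2) for v in bright_profile])
```

In this simplified picture, timbre transfer amounts to keeping the pitch while swapping one overtone profile for another. Real instruments also differ in how their sound changes over time, which this sketch ignores.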

Such tools are a great example of how AI can help musicians compose rich orchestrations and achieve completely new sounds. In the first AI Song Contest[11], held in 2020, Sydney-based music studio Uncanny Valley[12] (with whom I collaborate) used timbre transfer to bring singing koalas into the mix.

Uncanny Valley’s song Beautiful The World won the 2020 AI Song Contest.

Timbre transfer has joined a long history of synthesis techniques that have become instruments in themselves.

Taking music apart

Music generation and transformation are just one part of the equation. A longstanding problem in audio work is that of “source separation”. This means being able to break an audio recording of a track into its separate instruments.
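To see what separation means in the simplest possible case, the sketch below mixes two pure tones and pulls them apart again by splitting the frequency spectrum at a cutoff. This is a deliberately easy stand-in: real instruments overlap in frequency, which is why practical tools rely on trained models rather than a fixed cutoff. The tone frequencies, sample rate and cutoff are assumptions chosen so the toy works cleanly.

```python
import cmath
import math

SAMPLE_RATE = 4000  # samples per second (assumed for this sketch)
N = 200             # 0.05 s of audio; frequency bins are 20 Hz apart

def tone(freq, amp):
    """A pure sine tone standing in for one 'instrument'."""
    return [amp * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
            for n in range(N)]

def dft(x):
    """Naive discrete Fourier transform (slow, but dependency-free)."""
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

def idft(spectrum):
    """Inverse transform, keeping only the real part."""
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / N)
             for k in range(N)) / N).real
        for n in range(N)
    ]

def split_by_frequency(mix, cutoff_hz):
    """Send spectrum bins below the cutoff to one stem, the rest to another."""
    spectrum = dft(mix)
    low, high = [], []
    for k, value in enumerate(spectrum):
        # Bins above N/2 mirror negative frequencies, so fold them back.
        freq = min(k, N - k) * SAMPLE_RATE / N
        low.append(value if freq < cutoff_hz else 0)
        high.append(0 if freq < cutoff_hz else value)
    return idft(low), idft(high)

bass = tone(100, 1.0)  # hypothetical low part
lead = tone(800, 0.5)  # hypothetical high part
mix = [b + l for b, l in zip(bass, lead)]

bass_est, lead_est = split_by_frequency(mix, cutoff_hz=400)
bass_error = max(abs(a - b) for a, b in zip(bass, bass_est))
lead_error = max(abs(a - b) for a, b in zip(lead, lead_est))
print(bass_error, lead_error)
```

Both errors come out vanishingly small here only because the two tones never share a frequency band. A voice and a guitar do, which is what makes the real problem hard.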

Although it’s not perfect, AI-powered source separation has come a long way. Its use is likely to be a big deal for artists, some of whom won’t like that others can “pick the lock” on their compositions.

Meanwhile, DJs and mashup artists will gain unprecedented control over how they mix and remix tracks. Source separation start-up Audioshake[13] claims this will provide new revenue streams for artists who allow their music to be adapted more easily, such as for TV and film.

Artists may have to accept this Pandora’s box has been opened, as was the case when synthesizers and drum machines first arrived and, in some contexts, replaced the need for musicians.

But watch this space, because copyright laws do offer artists protection from the unauthorised manipulation of their work. This is likely to become another grey area in the music industry, and regulation may struggle to keep up[14].

New musical experiences

Playlist popularity has revealed how much we like to listen to music that has some “functional” utility[15], such as to focus, relax, fall asleep, or work out to.

The start-up Endel[16] has made AI-powered functional music its business model, creating infinite streams to help maximise certain cognitive states.

Endel’s music can be hooked up to physiological data such as a listener’s heart rate. Its manifesto[17] draws heavily on practices of mindfulness and makes the bold proposal we can use “new technology to help our bodies and brains adapt to the new world”, with its hectic and anxiety-inducing pace.

Other start-ups are also exploring functional music. Aimi[18] is examining how individual electronic music producers can turn their music into infinite and interactive streams.

Aimi’s listener app invites fans to manipulate the system’s generative parameters such as “intensity” or “texture”, or to decide when a drop happens. The listener engages with the music rather than listening passively.

It’s hard to say how much heavy lifting AI is doing in these applications – potentially little. Even so, such advances are guiding companies’ visions of how musical experience might evolve in the future.

The future of music

The initiatives mentioned above are in conflict with several long-established conventions, laws and cultural values regarding how we create and share music.

Will copyright laws be tightened to ensure companies training AI systems on artists’ works compensate those artists? And what would that compensation be for? Will new rules apply to source separation? Will musicians using AI spend less time making music, or make more music than ever before?

If there’s one thing that’s certain, it’s change. As a new generation of musicians grows up immersed in AI’s creative possibilities, they’ll find new ways of working with these tools.

Such turbulence is nothing new in the history of music technology, and neither powerful technologies nor standing conventions should dictate our creative future.

Read more: No, the Lensa AI app technically isn’t stealing artists' work – but it will majorly shake up the art world[19]

References

  1. ^ impressive (theconversation.com)
  2. ^ results (www.theguardian.com)
  3. ^ released MusicLM (google-research.github.io)
  4. ^ Neil Young’s ultimatum to Spotify shows streaming platforms are now a battleground where artists can leverage power (theconversation.com)
  5. ^ Magenta (magenta.tensorflow.org)
  6. ^ MuseNet (soundcloud.com)
  7. ^ Boomy (boomy.com)
  8. ^ Aiva (www.aiva.ai)
  9. ^ Differentiable Digital Signal Processing (magenta.tensorflow.org)
  10. ^ Timbre (www.youtube.com)
  11. ^ AI Song Contest (www.aisongcontest.com)
  12. ^ Uncanny Valley (uncannyvalley.com.au)
  13. ^ Audioshake (www.digitalmusicnews.com)
  14. ^ struggle to keep up (www.tandfonline.com)
  15. ^ has some “functional” utility (pitchfork.com)
  16. ^ Endel (endel.io)
  17. ^ manifesto (manifesto.endel.io)
  18. ^ Aimi (www.aimi.fm)
  19. ^ No, the Lensa AI app technically isn’t stealing artists' work – but it will majorly shake up the art world (theconversation.com)

Read more https://theconversation.com/text-to-audio-generation-is-here-one-of-the-next-big-ai-disruptions-could-be-in-the-music-industry-193956
