The Times Australia

Robots are creating images and telling jokes. 5 things to know about foundation models and the next generation of AI

  • Written by Aaron J. Snoswell, Post-doctoral Research Fellow, Computational Law & AI Accountability, Queensland University of Technology

If you’ve seen photos of a teapot shaped like an avocado[1] or read a well-written article that veers off on slightly weird tangents[2], you may have been exposed to a new trend in artificial intelligence (AI).

Machine learning systems called DALL-E[3], GPT[4] and PaLM[5] are making a splash with their incredible ability to generate creative work.

These systems are known as “foundation models” and are not all hype and party tricks. So how does this new approach to AI work? And will it be the end of human creativity and the start of a deep-fake nightmare?

1. What are foundation models?

Foundation models[6] work by training a single huge system on large amounts of general data, then adapting the system to new problems. Earlier models tended to start from scratch for each new problem.

DALL-E 2, for example, was trained to match pictures (such as a photo of a pet cat) with the caption (“Mr. Fuzzyboots the tabby cat is relaxing in the sun”) by scanning hundreds of millions of examples. Once trained, this model knows what cats (and other things) look like in pictures.

But the model can also be used for many other interesting AI tasks, such as generating new images from a caption alone (“Show me a koala dunking a basketball”) or editing images based on written instructions (“Make it look like this monkey is paying taxes”).

2. How do they work?

Foundation models run on “deep neural networks[7]”, which are loosely inspired by how the brain works. These networks involve complex mathematics and a huge amount of computing power, but they boil down to a very sophisticated type of pattern matching.

For example, by looking at millions of example images, a deep neural network can associate the word “cat” with patterns of pixels that often appear in images of cats – like soft, fuzzy, hairy blobs of texture. The more examples the model sees (the more data it is shown), and the bigger the model (the more “layers” or “depth” it has), the more complex these patterns and correlations can be.
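The idea of associating a word with typical pixel patterns can be sketched in a few lines of Python. This is only a toy: it uses a simple nearest-centroid matcher rather than a deep neural network, and the four-pixel "images", their labels and the function names are all made up for illustration. A real model learns far richer patterns from millions of examples.

```python
# Toy illustration only: a nearest-centroid matcher, not a deep neural network.
# The 4-pixel "images" and their labels are hypothetical data for this sketch.

def centroid(images):
    """Average a list of equal-length pixel lists into one typical pattern."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]

def distance(a, b):
    """Squared Euclidean distance between two pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(examples):
    """Map each label to the average pattern of its example images."""
    by_label = {}
    for pixels, label in examples:
        by_label.setdefault(label, []).append(pixels)
    return {label: centroid(imgs) for label, imgs in by_label.items()}

def predict(patterns, pixels):
    """Return the label whose learned pattern is closest to the image."""
    return min(patterns, key=lambda label: distance(patterns[label], pixels))

# Hypothetical training data: 4 pixel brightness values per image (0.0 to 1.0).
EXAMPLES = [
    ([0.9, 0.8, 0.9, 0.7], "cat"),
    ([0.8, 0.9, 0.7, 0.8], "cat"),
    ([0.1, 0.2, 0.1, 0.3], "teapot"),
    ([0.2, 0.1, 0.3, 0.2], "teapot"),
]

PATTERNS = train(EXAMPLES)
print(predict(PATTERNS, [0.85, 0.75, 0.8, 0.9]))
```

Adding more examples refines each average pattern, which loosely mirrors the point above: more data lets a model capture more reliable correlations.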

Read more: What is a neural network? A computer scientist explains[8]

Foundation models are, in one sense, just an extension of the “deep learning” paradigm that has dominated AI research for the past decade. However, they exhibit un-programmed or “emergent” behaviours that can be both surprising and novel.

For example, Google’s PaLM language model seems to be able to produce explanations for complicated metaphors and jokes. This goes beyond simply imitating the types of data it was originally trained to process[9].

[Image: a user interacts with the PaLM language model by typing questions, and the AI system responds by typing back answers.]
The PaLM language model can answer complicated questions. Image credit: Google AI[10]

3. Access is limited – for now

The sheer scale of these AI systems is difficult to think about. PaLM has 540 billion parameters, meaning even if everyone on the planet memorised 50 numbers, we still wouldn’t have enough storage to reproduce the model.
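The comparison is easy to check with some back-of-envelope arithmetic. The sketch below assumes a world population of roughly 8 billion, a round figure that does not come from the article. The parameter count and the 50 numbers per person are the figures quoted above.

```python
# Back-of-envelope check of the "50 numbers each" comparison.
PALM_PARAMETERS = 540_000_000_000   # from the article
NUMBERS_PER_PERSON = 50             # from the article
WORLD_POPULATION = 8_000_000_000    # assumption: round figure, not from the article

memorised = WORLD_POPULATION * NUMBERS_PER_PERSON
shortfall = PALM_PARAMETERS - memorised
numbers_needed_each = PALM_PARAMETERS / WORLD_POPULATION

print(f"Numbers memorised in total: {memorised:,}")        # 400,000,000,000
print(f"Shortfall versus PaLM: {shortfall:,}")              # 140,000,000,000
print(f"Numbers each person would need: {numbers_needed_each}")  # 67.5
```

Under that assumption, humanity would collectively hold 400 billion numbers, about 140 billion short of the model's size.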

The models are so enormous that training them requires massive amounts of computational and other resources. One estimate put the cost of training OpenAI’s language model GPT-3 at around US$5 million[11].

Read more: Can robots write? Machine learning produces dazzling results, but some assembly is still required[12]

As a result, only huge tech companies such as OpenAI, Google and Baidu can afford to build foundation models at the moment. These companies limit who can access the systems, which makes economic sense.

Usage restrictions may give us some comfort that these systems won’t be used for nefarious purposes (such as generating fake news or defamatory content) any time soon. But they also mean independent researchers are unable to interrogate these systems and share the results in an open and accountable way. So we don’t yet know the full implications of their use.

4. What will these models mean for ‘creative’ industries?

More foundation models will be produced in coming years. Smaller models are already being published in open-source forms[13], tech companies are starting to experiment with licensing and commercialising these tools[14] and AI researchers are working hard to make the technology more efficient and accessible.

The remarkable creativity shown by models such as PaLM and DALL-E 2 demonstrates that creative professional jobs could be impacted by this technology sooner than initially expected.

Read more: AI could be our radiologists of the future, amid a healthcare staff crisis[15]

Conventional wisdom long held that robots would displace “blue collar” jobs first. “White collar” work was meant to be relatively safe from automation – especially professional work that required creativity and training.

Deep learning AI models already exhibit super-human accuracy in tasks like reviewing x-rays[16] and detecting the eye condition macular degeneration[17]. Foundation models may soon provide cheap, “good enough” creativity in fields such as advertising, copywriting, stock imagery or graphic design.

The future of professional and creative work could look a little different from what we expected.

5. What this means for legal evidence, news and media

Foundation models will inevitably affect the law[18] in areas such as intellectual property and evidence, because we won’t be able to assume creative content is the result of human activity[19].

We will also have to confront the challenge of disinformation and misinformation generated by these systems. We already face enormous problems with disinformation, as we are seeing in the unfolding Russian invasion of Ukraine[20] and in the nascent spread of deep fake[21] images and video. Foundation models are poised to super-charge these challenges.

Read more: 3.2 billion images and 720,000 hours of video are shared online daily. Can you sort real from fake?[22]

Time to prepare

As researchers who study the effects of AI on society[23], we think foundation models will bring about huge transformations. They are tightly controlled (for now), so we probably have a little time to understand their implications before they become a huge issue.

The genie isn’t quite out of the bottle yet, but foundation models are a very big bottle – and inside there is a very clever genie.

References

  1. ^ a teapot shaped like an avocado (www.nytimes.com)
  2. ^ veers off on slightly weird tangents (www.theguardian.com)
  3. ^ DALL-E (openai.com)
  4. ^ GPT (openai.com)
  5. ^ PaLM (ai.googleblog.com)
  6. ^ Foundation models (arxiv.org)
  7. ^ deep neural networks (theconversation.com)
  8. ^ What is a neural network? A computer scientist explains (theconversation.com)
  9. ^ imitating the types of data it was originally trained to process (arxiv.org)
  10. ^ Google AI (ai.googleblog.com)
  11. ^ around US$5 million (lambdalabs.com)
  12. ^ Can robots write? Machine learning produces dazzling results, but some assembly is still required (theconversation.com)
  13. ^ open-source forms (openai.com)
  14. ^ experiment with licensing and commercialising these tools (openai.com)
  15. ^ AI could be our radiologists of the future, amid a healthcare staff crisis (theconversation.com)
  16. ^ reviewing x-rays (theconversation.com)
  17. ^ detecting the eye condition macular degeneration (www.macularsociety.org)
  18. ^ affect the law (www.abc.net.au)
  19. ^ creative content is the result of human activity (www.smithsonianmag.com)
  20. ^ unfolding Russian invasion of Ukraine (theconversation.com)
  21. ^ deep fake (theconversation.com)
  22. ^ 3.2 billion images and 720,000 hours of video are shared online daily. Can you sort real from fake? (theconversation.com)
  23. ^ study the effects of AI on society (www.admscentre.org.au)

Read more https://theconversation.com/robots-are-creating-images-and-telling-jokes-5-things-to-know-about-foundation-models-and-the-next-generation-of-ai-181150
