The Times Australia

OpenAI’s new ‘deep research’ agent is still just a fallible tool – not a human-level expert

  • Written by Raffaele F Ciriello, Senior Lecturer in Business Information Systems, University of Sydney

OpenAI’s “deep research[1]” is the latest artificial intelligence (AI) tool making waves[2] and promising to do in minutes what would take hours for a human expert to complete.

Bundled as a feature in ChatGPT Pro and marketed[3] as a research assistant that can match a trained analyst, it autonomously searches the web, compiles sources and delivers structured reports. It even scored[4] 26.6% on Humanity’s Last Exam (HLE), a tough AI benchmark, outperforming[5] many models.

But deep research doesn’t quite live up to the hype. While it produces polished reports, it also has serious flaws. According to journalists[6] who’ve tried it[7], deep research can miss key details, struggle with recent information and sometimes invent facts.

OpenAI flags this when listing the limitations of its tool. The company also says it[8] “can sometimes hallucinate facts in responses or make incorrect inferences, though at a notably lower rate than existing ChatGPT models, according to internal evaluations”.

It’s no surprise that unreliable data can slip in, since AI models don’t “know” things in the same way humans do.

The idea of an AI “research analyst” also raises a slew of questions. Can a machine – no matter how powerful – truly replace a trained expert? What would be the implications for knowledge work? And is AI really helping us think better, or just making it easier to stop thinking altogether?

What is ‘deep research’ and who is it for?

Marketed towards professionals in finance, science, policy, law and engineering, as well as academics, journalists and business strategists, deep research is the latest “agentic experience[9]” OpenAI has rolled out in ChatGPT. It promises to do the heavy lifting of research in minutes.

Currently, deep research is only available to ChatGPT Pro users in the United States, at a cost of US$200 per month. OpenAI says[10] it will roll out to Plus, Team and Enterprise users in the coming months, with a more cost-effective version planned for the future.

Unlike a standard chatbot that provides quick responses, deep research follows a multi-step process to produce a structured report:

  1. The user submits a request. This could be anything from a market analysis to a legal case summary.
  2. The AI clarifies the task. It may ask follow-up questions to refine the research scope.
  3. The agent searches the web. It autonomously browses hundreds of sources, including news articles, research papers and online databases.
  4. It synthesises its findings. The AI extracts key points, organises them into a structured report and cites its sources.
  5. The final report is delivered. Within five to 30 minutes, the user receives a multi-page document – potentially even a PhD-level thesis[11] – summarising the findings.

At first glance, it sounds like a dream tool for knowledge workers. A closer look reveals significant limitations.

Many[12] early[13] tests[14] have exposed shortcomings:

  • It lacks context. AI can summarise, but it doesn’t fully understand what’s important.
  • It ignores new developments. It has missed major legal rulings and scientific updates.
  • It makes things up. Like other AI models, it can confidently generate false information.
  • It can’t tell fact from fiction. It doesn’t distinguish authoritative sources from unreliable ones.

While OpenAI claims its tool rivals human analysts, AI inevitably lacks the judgement, scrutiny and expertise that make good research valuable.

What AI can’t replace

ChatGPT isn’t the only AI tool that can scour the web and produce reports with just a few prompts. Notably, a mere 24 hours after OpenAI’s release[15], Hugging Face released a free, open-source version that nearly matches its performance.

The biggest risk of deep research and other AI tools marketed for “human-level” research is the illusion that AI can replace human thinking. AI can summarise information, but it can’t question its own assumptions, highlight knowledge gaps, think creatively or understand different perspectives.

And AI-generated summaries don’t match the depth[16] of a skilled[17] human researcher.

Any AI agent, no matter how fast, is still just a tool, not a replacement for human intelligence. For knowledge workers, it’s more important than ever to invest in skills that AI can’t replicate: critical thinking, fact-checking, deep expertise and creativity.

If you do want to use AI research tools, there are ways to do so responsibly. Thoughtful use of AI can enhance research without sacrificing accuracy or depth. You might use AI for efficiency, like summarising documents, but retain human judgement for making decisions.

Always verify sources, as AI-generated citations can be misleading. Don’t trust conclusions blindly, but apply critical thinking and cross-check information with reputable sources. For high-stakes topics – such as health[18], justice[19] and democracy[20] – supplement AI findings with expert input.

Despite prolific marketing that tries to tell us otherwise, generative AI still has plenty of limitations. Humans who can creatively synthesise information, challenge assumptions and think critically will remain in demand – AI can’t replace them just yet.

References

  1. ^ deep research (openai.com)
  2. ^ making waves (www.forbes.com)
  3. ^ marketed (www.theguardian.com)
  4. ^ scored (www.zdnet.com)
  5. ^ outperforming (www.techradar.com)
  6. ^ According to journalists (www.theverge.com)
  7. ^ who’ve tried it (www.platformer.news)
  8. ^ The company also says it (openai.com)
  9. ^ agentic experience (openai.com)
  10. ^ says (openai.com)
  11. ^ potentially even a PhD-level thesis (futureofbeinghuman.com)
  12. ^ Many (www.theverge.com)
  13. ^ early (www.nature.com)
  14. ^ tests (www.datacamp.com)
  15. ^ 24 hours after OpenAI’s release (arstechnica.com)
  16. ^ depth (futureofbeinghuman.com)
  17. ^ skilled (www.tandfonline.com)
  18. ^ health (www.theguardian.com)
  19. ^ justice (www.theguardian.com)
  20. ^ democracy (www.theguardian.com)

Read more https://theconversation.com/openais-new-deep-research-agent-is-still-just-a-fallible-tool-not-a-human-level-expert-249496
