Yesterday, while scrolling through my Twitter feed, I came across a tweet from OpenAI's CEO, Sam Altman, announcing some exciting news: ChatGPT Deep Search was finally available to ChatGPT Plus users. Until then, it had been exclusive to Pro users paying $200 a month and Enterprise users.
The moment I saw it, I jumped straight into my ChatGPT interface and spent the next eight hours exploring it non-stop. It felt like unwrapping a long-awaited Christmas gift; I was genuinely excited. I couldn't wait to see how it performed, especially in tasks related to academic research and, more specifically, writing literature reviews.

But why all this excitement?
Well, here is the story!
As an academic researcher, I saw ChatGPT Deep Search as a catalyst for tackling complex cognitive work, specifically conducting quality research. ChatGPT-4o already handles what I like to call shallow cognitive work quite well: it summarizes texts, extracts key points, writes in a polished, college-level style, and so on. In fact, most of the writing we encounter daily, whether online or in academic settings, falls within this category.
That said, even with these more surface-level tasks, ChatGPT leaves behind a distinctive linguistic footprint. Once you've been exposed to enough AI-generated text, you start developing an intuitive sense for its style (e.g., the repetitive structures, the predictable phrasing, the polished but oddly uniform tone, etc.).
I've written about this before and even compiled a list of linguistic markers that give away ChatGPT-generated content. I've also argued that teachers don't necessarily need AI detectors (which, let's be honest, are unreliable and riddled with controversy) to spot AI-written text. The patterns are so glaringly obvious that even someone with basic language awareness can pick up on them without any specialized training.
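To make the idea of linguistic markers concrete, here is a minimal sketch of what marker-spotting amounts to in code. The phrase list is purely illustrative (it is not the list from my earlier piece), and a real analysis would also weigh frequency and context rather than mere presence:

```python
# Toy marker-spotting: flag stock "AI-sounding" phrases in a text.
# The marker list is illustrative only, not an authoritative inventory.
AI_MARKERS = [
    "delve into",
    "in today's fast-paced world",
    "it is important to note",
    "a testament to",
]

def marker_hits(text: str) -> list[str]:
    """Return the markers that appear in the text (case-insensitive)."""
    lowered = text.lower()
    return [marker for marker in AI_MARKERS if marker in lowered]

sample = ("It is important to note that, in today's fast-paced world, "
          "this essay will delve into the topic.")
hits = marker_hits(sample)  # three of the four markers fire on this one sentence
```

Even this crude presence check lights up on a single sentence of boilerplate; a human reader does the same pattern-matching intuitively, just with a far richer feel for tone and rhythm.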
Three years of exposure to this ‘artificial language’ generated by AI chatbots feels like an eternity. These days, everyone writes in a perfectly polished, error-free style. Spelling mistakes and grammatical errors have become rare, not just because of built-in grammar checkers but because AI-powered writing assistants are now woven into the very fabric of the internet.
If you find yourself longing for the days when human writing (i.e., raw, imperfect, and deeply personal) was the norm, you're not alone.
Amidst this tsunami of AI-generated writing, it feels just wonderful to come across a piece of authentic human-written text. It feels like a breath of fresh air, doesn't it?
The analogy I like to use here is traveling to a distant country, spending years immersed in its language and culture, and then, one day, while wandering through one of its old markets, suddenly hearing someone speak your native tongue. In that moment, those familiar words evoke a flood of memories and emotions, something that can't be replicated or experienced vicariously. If you've ever been in that situation, you know exactly what I mean.
Don’t get me wrong, I love AI and ChatGPT. I think weโre incredibly lucky to be living through this era of once-in-a-lifetime technological transformations. And yes, I use ChatGPT daily, whether itโs for editing my writing (including this piece you are reading now), brainstorming ideas, or refining my thoughts. Ignoring such a powerful tool would be a missed opportunity. Iโd be foolish not to take advantage of it to enhance my creative thinking.
However, there's a big difference between what author and professor Ethan Mollick calls writing with AI and having AI write for you (cited in Khan, 2024). The former is what we should ideally be doing. You do the thinking, whether aided by AI or not, but you remain in control of the process: human in the loop!
You engage intellectually, generate original ideas based on your experiences and accumulated knowledge, and then use AI to refine those ideas, strengthen your arguments, and enhance clarity. That's what writing with AI is all about. It's like having a thinking partner by your side (a co-intelligence, as Mollick (2024) refers to it), helping you sharpen your creative and analytical skills, but not replacing them or doing the work for you!
Unfortunately, human nature gravitates toward shortcuts. And let's be honest: if AI can do the work for you, why bother?
Well, you should bother, because if AI can do it for you, it can do it for everyone else, using the same language, the same linguistic and clichéd patterns, the same predictable phrasing. And guess what? You end up blending into a sea of sameness, indistinguishable from everyone else.
That's exactly what's happening online right now. Everywhere you look, AI-generated language stares back at you, filling spaces with its uniform, polished, yet strangely soulless tone. It's everywhere, so much so that it starts to feel invasive, almost like it's harassing you with its relentless sameness.
It's against this backdrop of developments and frustrations that I welcomed the arrival of ChatGPT Deep Search. I thought this might finally be the breakthrough that makes a real difference. For the first time, it felt like we had a technology capable of tackling the complexities of research head-on.
Iโm talking specifically about research in the social sciences, where interpretation and creativity play a central role, and where researchers navigate nuances that donโt always fit neatly into predefined structures.
I might be wrong, but here is what I think: AI already does a great job in fields that rely on calculations and statistical analysis (i.e., positivist quantitative research). It speaks the same mathematical language as these disciplines, making it a natural fit. But when it comes to research that thrives on critical interpretation, individual human lived experiences (i.e., interpretivist qualitative research), and deep contextual understanding, that's where the real challenge lies.
However, AI's strengths in quantitative versus qualitative research are a discussion for another time. Right now, my focus is on what Deep Search brings to the table for more complex, open-ended research.
So back to my story with ChatGPT Deep Search!
Before Deep Search became available to Plus users, I had read several reviews of the tool, almost all of them glowing, especially regarding its research capabilities. Naturally, I was eager to try it out, particularly on literature reviews, which are notoriously time-consuming and tedious.
To put it to the test, I started with topics in areas that fall within my research interests, especially discourse analysis, research methodology, and AI integration in education. I've read so much in these areas that I can immediately recognize seminal works and distinguish them from less critical contributions; essentially, I know what must be cited in a literature review in any of those areas.
I also tested it on a topic directly relevant to a chapter I'm currently writing for my upcoming book on AI in academic research. Since I've already gathered a wealth of high-quality papers on this subject, I had a solid benchmark for evaluating how well ChatGPT Deep Search would perform. I knew exactly which sources it should reference, and I was eager to see whether it would meet those expectations.
Each task took ChatGPT Deep Search anywhere from 3 to 9 minutes, at least in my case. What I really liked was the ability to follow along with ChatGPT's reasoning, to see how it thinks, what resources it accesses, and how it processes information. That alone felt like an impressive intellectual feat.
In many ways, this aligns with what researchers and AI ethicists have been advocating for: explainable AI, one that offers more transparency about its sources and the data it pulls from. Reasoning models do this too; they don't just provide conclusions but also show their thought process. Seeing Deep Search adopt this approach feels like a step in the right direction.
The results I got for my queries were decent, but nowhere near the stellar reviews I had seen from early adopters of this technology. As an experienced academic researcher with a solid track record of published work, I can confidently say that the literature reviews ChatGPT Deep Search generates are, at best, C-level.
What disappointed me the most was the absence of many seminal works that should have been cited. On the upside, the final reports were fairly extensive, and the argumentation was definitely stronger than what ChatGPT-4o would typically produce. Even the language was more refined. But when it came to depth of reasoning and scholarly rigor, it still fell far short of what a seasoned researcher would bring to a literature review.
Let me share an example of a section from the literature review that Deep Search generated on the difference between qualitative and quantitative methods. Here is the link to the whole literature review it generated.
Notice that it pulled all of its information from a single source, Sheppard's Research Methods for the Social Sciences: An Introduction, which is publicly available on pressbooks.bccampus.ca. As someone well-versed in research methodology, I know that seminal works in this field include those by John Creswell, Egon Guba, Yvonna S. Lincoln, Robert Yin, Norman Denzin, and Alan Bryman, among others.
But if you check the reference list it generated, you'll see an odd inconsistency: it does mention Creswell, yet there's no in-text citation for his work. This suggests that not only does it fail to prioritize authoritative sources, but it also includes references that aren't actually cited within the text, raising questions about the reliability of its sourcing.
So why did ChatGPT Deep Search rely solely on that one reference?
I think there are a few reasons. First, the source was publicly available. Second, much of the work from established authors in the field is behind paywalls, meaning Deep Search likely couldn't access it. And third, the summaries it pulled from directly answered my query, mirroring the phrasing of my question almost word for word.
This suggests that Deep Search operates as a form of advanced semantic search: it retrieves information that closely matches the question's wording, then synthesizes and presents it in a coherent argument. While impressive in its ability to structure responses, this also highlights a major limitation: its reliance on what is accessible rather than what is authoritative.
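Here is a minimal sketch of that retrieval behavior, using a bag-of-words cosine similarity as a stand-in for the learned embeddings a real system would use. The corpus, document titles, and scoring are all illustrative assumptions on my part, not OpenAI's actual pipeline:

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Crude bag-of-words vector; real systems use learned embeddings."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, corpus: dict[str, str]) -> str:
    """Return the title of the accessible document closest to the query."""
    query_vec = vectorize(query)
    return max(corpus, key=lambda title: cosine(query_vec, vectorize(corpus[title])))

# Only crawlable text competes: the paywalled classic is reduced to its abstract.
corpus = {
    "Open textbook chapter": "qualitative and quantitative methods differ in aims and data",
    "Paywalled classic": "abstract only: mixed methods research design",
}
best = retrieve("difference between qualitative and quantitative methods", corpus)
```

The open textbook wins simply because its wording mirrors the query and its full text is visible; the more authoritative source never gets a fair hearing. That is the accessible-versus-authoritative gap in miniature.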
The problem with ChatGPT Deep Search (and all GenAI, for that matter) is that it only pulls from publicly available online sources. Unfortunately, much of the high-quality research that serious academic work depends on is locked behind paywalls, whether through institutional subscriptions to academic journals and databases or through algorithmic restrictions that prevent AI from crawling certain repositories.
This creates a major limitation. While Deep Search can synthesize and present information well, it's only as good as the sources it has access to. And as I mentioned, in academia, the most authoritative and groundbreaking research isn't freely available on the open web; it's gated behind paywalls, making AI-generated literature reviews inherently incomplete and intellectually shallow.
So I don't think this is entirely Deep Search's fault; it's more a reflection of the accessibility barriers we've built around human knowledge. If ChatGPT Deep Search had access to the millions of peer-reviewed papers and copyrighted books out there, the quality of its reports would be significantly better. But given the capitalist world we live in, that's simply not feasible.
The reality is that knowledge gaps and accessibility issues have always been part of our education systems, and they're unlikely to disappear anytime soon. So when you ask Deep Search to generate a literature review on an academic topic like critical discourse analysis, what you get isn't a comprehensive synthesis of the field; it's a well-structured argument based on blog posts, a handful of open-access papers, and book summaries rather than the books themselves. This inevitably results in shallow analysis, at least in my experience within my own research niche.
So here's what I think: no matter how advanced generative AI becomes, if it doesn't have access to paywalled knowledge, how can it truly produce quality knowledge? We all know that much of the most valuable research and analysis is locked behind paywalls, making it inaccessible to AI models.
And this isn't just an issue in academia. Look at online media and journalism: if you want in-depth analysis and responsible reporting, you have to pay. Try accessing The New York Times, The Washington Post, The Wall Street Journal, or Foreign Policy; they all operate on a subscription basis. Almost everything of value is gated now. Mind you, The New York Times has sued OpenAI for copyright infringement over the alleged use of its content in the training of GPT models.
So instead of getting caught up in discussions about AI singularity, AGI, and all these grand futuristic projections, maybe we should step back and focus on a more fundamental issue: how to provide AI with access to quality training data.
No matter how advanced we make these systems, if they aren't trained on high-quality information, their performance will always be limited, like building a state-of-the-art rocket but fueling it with low-grade gasoline. The real conversation should be about how to democratize access to human knowledge and make it available to everyone.
References:
- Khan, S. (2024). Brave new words: How AI will revolutionize education (and why that’s a good thing). Viking.
- Mollick, E. (2024). Co-intelligence: Living and working with AI. Portfolio/Penguin.