Serge Gladkoff, CEO of Logrus Global.
When the GenAI hype was just picking up steam, I wrote about the danger of drowning in LLM-produced blah if we failed to utilize the expertise of human linguists. It gives me no pleasure to say I was right, but here we are. The warning I issued two years ago is even more pertinent now.
Take ChatGPT as an example. Despite marketing claims that GPT-5 is like having a team of Ph.D. experts in everything at your beck and call, its rollout underwhelmed many users. People took to Reddit to complain about sluggish response times and the same hallucinations that plagued previous models. A Change.org petition asked for GPT-4o to remain available, prompting OpenAI to reinstate it.
But let’s ask ourselves: Why would anyone expect newer GenAI models to be something they’re not and can never be? Increasingly, new releases bring few groundbreaking innovations in system architecture or algorithms, only minor tweaks.
The Limits Of Self-Trained AI
To help put this into perspective, Adam Becker writes in his book, More Everything Forever, that “ChatGPT is a text generation engine that speaks in the smeared-out voice of the internet as a whole. All it knows how to do is emulate that voice, and all it cares about is getting the voice right.” Becker goes on to note that hallucinations, in this sense, are less a “mistake” than a byproduct of how the models are trained.
Most critically, GenAI models are running out of human-produced training data. Every scrap of freely available text has already been harvested. Now, AI companies face a dilemma: either pay content creators for new data (driving costs even higher) or keep scraping the internet—relying on platforms like Reddit and Wikipedia as primary sources.
But here’s the catch: the “new” content is already polluted with outputs from older LLMs, complete with their hallucinations, errors and nonsensical answers. Since language models lack true intelligence, they can’t distinguish human writing from AI-generated sludge, so they ingest and regurgitate it all indiscriminately. The result will likely be more LLMs trained on mountains of low-quality, synthetic content—and we are just starting to see the consequences. I believe the worst, however, is yet to come.
A Nature article warns that “indiscriminate use of model-generated content in training causes irreversible defects in the resulting models, in which tails of the original content distribution disappear.” Researchers call this effect “model collapse”: an inevitable degradation in which models trained on AI-generated data lose their grasp of rare but meaningful patterns and produce oversimplified, repetitive or distorted outputs.
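To see why the effect is so hard to undo, consider a toy simulation of my own (an illustration, not the Nature study’s method): treat human content as a long-tailed vocabulary, “train” each new model by counting a finite sample of the previous model’s output, and watch the rare items vanish for good once they fail to be sampled.

```python
import numpy as np

rng = np.random.default_rng(42)

# Generation 0: a Zipf-like "vocabulary" standing in for human-written
# content: a few common items carry most of the probability mass, with
# a long tail of rare but meaningful ones.
vocab_size = 1000
probs = 1.0 / np.arange(1, vocab_size + 1)
probs /= probs.sum()

sample_size = 5000  # finite "training set" per generation

for generation in range(11):
    surviving = np.count_nonzero(probs)
    print(f"gen {generation}: {surviving} of {vocab_size} items still representable")

    # "Train" the next model by maximum likelihood on a finite sample
    # drawn from the current model, then generate from that new model.
    sample = rng.choice(vocab_size, size=sample_size, p=probs)
    counts = np.bincount(sample, minlength=vocab_size)
    probs = counts / counts.sum()
    # Any rare item that drew zero samples now has probability 0 in the
    # next model and can never be produced again: the tail only shrinks.
```

The exact numbers vary from run to run, but the direction never does. Each generation can only lose tail items, never recover them, which is precisely the irreversibility the researchers describe.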
The study echoes my argument from two years ago: human oversight is irreplaceable, whether in creating original content or filtering AI-generated noise from training datasets.
Back then, I argued that human professionals, not LLMs, were essential for high-quality writing and translation. Today, research and real-world failures only underscore that urgency. GenAI can be useful in niche applications and is a great tool in the hands of experts. But to harness it effectively, you need critical thinking and domain expertise. Otherwise, its smooth-talking algorithms can fool you into accepting output that needed correction and never got it, especially when the stakes are high.
Why Human Judgment Still Matters Most
None of this means we should abandon GenAI—far from it. These tools are here to stay, and their potential is real. But we need to rethink how we build and use them. That means valuing human expertise, demanding transparency in training data and designing systems that augment—rather than replace—human judgment. Human experts in all fields are needed more than ever.
As AI models become more capable, data collection and annotation shift from simple labeling (like identifying objects in images) to nuanced tasks in fields such as medicine or law. Evaluating outputs and solving specialist problems requires senior engineers, doctors, lawyers and other experts who can discern subtle differences and spot errors. Non-expert crowdworkers are insufficient for these roles, and relying on them leads to weaker model performance and missed critical issues.
This, of course, is not an easy task. It all comes back to the message we have been reiterating from the start: human experts are crucial to making AI work well. The “model collapse” findings in the aforementioned Nature paper drive the point home with sobering clarity: AI cannot be reliably trained on its own output. This fundamental limitation underscores the indispensable role of human content, annotation and expert judgment.
Instead of pursuing full autonomy, the path forward for AI must be one of synergy with human intelligence. AI should be seen as a powerful augmentation for human experts—a tool designed to elevate our capabilities, ensuring that technology serves humanity, not the other way around.