Why Is ChatGPT Getting Dumber?

[Image: a computer screen featuring the word "ChatGPT" repeated in a creative arrangement — Photo by Andrew Neel on Unsplash]

“Is ChatGPT getting dumber over time?” “What causes such a condition?” These are among the most frequently asked questions in discussions of ChatGPT’s development.

The discussion emerged from the experience of many users who found that the tool’s responses no longer satisfied them; some were even irrelevant or off-topic.

In this article, I examine possible reasons for the model’s apparently declining performance.

Common Myths and Misunderstandings

The public’s perception that the AI tool is declining stems from GPT-4’s worsening accuracy compared with GPT-3.5. Social media users complained that the updated version had become less intelligent than its predecessor and theorized that OpenAI had failed to predict the massive workload ChatGPT was bound to handle, causing the service to degrade as its processing capacity was stretched thin.

Comments by OpenAI CEO Sam Altman in a closed-door meeting with 20 developers in London last year backed this assumption: he admitted that the company was struggling to obtain enough GPUs to power ChatGPT, which explained why the tool’s reliability and speed were compromised. Given how heavily society now depends on AI applications, including GPT-4, it is understandable that people question the technology’s intelligence once it fails to deliver what it promises.

Stanford and UC Berkeley Studies

When researchers from reputable institutions such as Stanford University and UC Berkeley release a report, you know it is a big deal. That is what happened when they documented how the model’s behavior changed over time. The report framed ChatGPT’s problems as a decline in GPT-4’s performance, to the point that it scored worse than GPT-3.5 on several tasks.

The study assessed the tool’s performance in seven tasks, namely solving math problems, answering sensitive questions, answering opinion surveys, answering multi-hop knowledge-intensive questions, generating code, handling US Medical License exams, and visual reasoning. These tasks allowed researchers to determine if the tool responded accurately to the questions. 

Intriguingly, the researchers found that GPT-3.5 improved on math questions, rising from 30.6% accuracy in March 2023 to 48.2% in June 2023. GPT-4, by contrast, declined over the same timeframe, from 83.6% in March to 35.2% in June. A similar pattern held across most of the other tasks, indicating a drop in GPT-4’s performance.

Why, you may ask, is the chatbot less intelligent nowadays?

The researchers attributed the decrease in performance to the tool’s declining instruction-following ability. For example, GPT-4 refused to follow certain instructions in the math task in June even though it had complied with the same instructions in March. It behaved similarly on several opinion surveys.

The study changed how the public understands AI. It made people realize that the technology cannot solve their problems by itself; it requires careful, ongoing human monitoring to function properly. This realization highlights the role of inclusivity in AI development, as explored in this article on education.

[Image: a person holding a cell phone with ChatGPT installed — Source: Unsplash]

Actual Situation: Is ChatGPT Getting Dumber?

While it is tempting to blame ChatGPT’s growing incompetence on OpenAI’s inability to maintain the tool, several technical factors may also contribute to its declining performance. For example, GPT-4’s feedback mechanism may degrade the tool’s responses: through reinforcement learning from human feedback (RLHF), users rate responses as good or bad, and those ratings shape future behavior.

Unfortunately, such feedback can drift the AI tool’s future responses away from objective facts and toward users’ preferences. In this context, two causes of AI performance issues stand out as the most plausible explanations for GPT-4’s declining accuracy.
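To make the preference-feedback concern concrete, here is a deliberately tiny sketch — not how RLHF is actually implemented, and every name and number below is made up. A system chooses between two canned answers and nudges a preference score toward the average user rating; if users up-vote a pleasing-but-less-accurate answer more often, the system learns to favor it:

```python
import random

# Toy sketch (all names and numbers are hypothetical): a system picks between
# two canned answers and updates a running preference score from user ratings,
# illustrating how RLHF-style feedback can steer responses toward what users
# like rather than what is accurate.
scores = {"accurate": 0.0, "pleasing": 0.0}
avg_rating = {"accurate": 0.40, "pleasing": 0.70}  # users up-vote flattery more

random.seed(1)
for _ in range(2000):
    if random.random() < 0.2:                      # explore occasionally
        choice = random.choice(list(scores))
    else:                                          # otherwise exploit the favorite
        choice = max(scores, key=scores.get)
    # nudge the score toward the average rating of the chosen answer
    scores[choice] += 0.05 * (avg_rating[choice] - scores[choice])

# The system ends up promoting the answer users *like*, not the accurate one.
print(max(scores, key=scores.get))
```

The point of the sketch is the objective, not the algorithm: once the reward is "what users rate highly," nothing in the loop anchors the system to objective facts.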

AI Drift

AI drift is a condition in which an AI tool’s responses gradually become less relevant and accurate. It can happen because the datasets the tool trains on change, or because the datasets stay the same while the world changes enough that once-relevant responses become irrelevant. In either case, a drift occurs when the AI fails to respond accurately to users’ prompts.

Three types of AI drift exist. The first is data drift, caused by changes in the retrieval source. The second is model drift, which occurs due to changes in the ranking algorithm. The last is interaction drift, caused by feedback loops. All three suggest that drift becomes more likely as the quality of the input data worsens.
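Data drift of this kind can be spotted with simple distribution-comparison statistics. As a minimal sketch (the sample data and the "March"/"June" labels below are invented for illustration), the Population Stability Index compares binned histograms of a baseline sample and a current sample; a score near zero means no drift, while larger scores flag a shift:

```python
import math
import random
from collections import Counter

def psi(baseline, current, bins=10, lo=0.0, hi=1.0):
    """Population Stability Index: a rough score for how far the `current`
    value distribution has drifted from the `baseline` distribution."""
    def hist(values):
        counts = Counter(min(int((v - lo) / (hi - lo) * bins), bins - 1)
                         for v in values)
        # floor empty bins at a tiny value so log() stays defined
        return [max(counts.get(b, 0) / len(values), 1e-6) for b in range(bins)]
    p, q = hist(baseline), hist(current)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

random.seed(0)
march = [random.random() for _ in range(5000)]      # stand-in baseline inputs
june = [random.random() ** 2 for _ in range(5000)]  # skewed later inputs

print(f"same data:    PSI = {psi(march, march):.3f}")  # 0.000: no drift
print(f"shifted data: PSI = {psi(march, june):.3f}")   # well above zero: drift
```

A common rule of thumb treats PSI above roughly 0.25 as significant drift, though the threshold is a convention rather than a law.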

Model Collapse

Have you ever wondered what happens when responses and data generated by AI are accidentally added to the datasets used to train AI? It can lead to model collapse: a condition in which the AI trains on AI-generated content and its capacity to handle queries and prompts declines because of this data pollution. To some extent, its responses may become homogeneous and less human.

In this regard, it is not far-fetched to attribute GPT-4’s declining performance to model collapse, given how much AI-generated data now circulates on the internet, increasing the probability that such synthetic data ends up in AI training sets.

According to Ilia Shumailov, a researcher at the University of Oxford, what makes model collapse threatening is that it gradually erases low-probability events from the model’s learned distribution, because the AI loses the chance to learn from unpredictable human inputs. Humans, moreover, keep evolving, both in cognition and in expectations. In short, model collapse makes AI more of a repetitive machine and less human.
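The tail-erasing effect can be shown with a deliberately tiny simulation — nothing like real LLM training, just resampling a toy corpus from itself. Each "generation" stands in for a model that can only reproduce what it has already seen; once a rare token fails to be sampled, it is gone for good, so diversity can only fall:

```python
import random

def next_generation(corpus, size):
    """Resample a training corpus from the current one, standing in for a
    model that can only reproduce content it has already seen."""
    return [random.choice(corpus) for _ in range(size)]

random.seed(7)
corpus = list(range(200))  # a toy "human" corpus: 200 distinct tokens, many rare

for gen in range(1, 31):
    corpus = next_generation(corpus, len(corpus))
    if gen % 10 == 0:
        # distinct-token count can only decrease: absent tokens never return
        print(f"generation {gen:2d}: {len(set(corpus))} distinct tokens survive")
```

The rare tokens vanish first, which is exactly Shumailov’s point about losing the unpredictable tails of human data.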

From this investigation, we can conclude that ChatGPT may indeed be getting dumber, in the sense that its capacity to provide accurate responses to user prompts appears to be declining. I see this issue as a double-edged sword. On the one hand, it reassures humans that AI may never replace their cognition, given how heavily it depends on human-generated context and human monitoring to function adequately. On the other hand, it dashes many people’s hopes of having AI as a reliable automated assistant. Either way, it is safe to say that human involvement remains the key to successful AI adoption.

If this topic intrigues you, do not forget to share your thoughts on AI evolution in the comments.

References

Abid, A. (2023, July 28). Is ChatGPT getting dumber? DW. https://www.dw.com/en/is-chatgpt-getting-dumber/a-66352529

Chen, L., Zaharia, M., & Zou, J. (2023). How is ChatGPT’s behavior changing over time? arXiv. https://arxiv.org/abs/2307.09009

Claburn, T. (2024, January 26). What is Model Collapse and how to avoid it. The Register. https://www.theregister.com/2024/01/26/what_is_model_collapse/

Human Driven AI (2023, August 16). What Is AI Drift And Why Is It Happening to ChatGPT? Human Driven AI. https://humandrivenai.com/2023/08/16/what-is-chatgpt-drift-and-why-is-it-happening/

Lutkevich, B. (2023, July 7). Model collapse explained: How synthetic training data breaks AI. TechTarget. https://www.techtarget.com/whatis/feature/Model-collapse-explained-How-synthetic-training-data-breaks-AI

Stokel-Walker, C. (2023, June 6). People think ChatGPT is getting worse and slower. This is why they might be right. iNews. https://inews.co.uk/news/chat-gpt-users-ai-tool-intelligent-processing-power-2389759
