As 2024 comes to a close, the AI landscape finds itself at a crossroads. In recent years, leading AI labs such as OpenAI, Google, and Anthropic have competed relentlessly to build ever-larger and more powerful artificial intelligence systems. Each new leap, from GPT-3.5 to GPT-4 and on to Google’s Gemini, ignited waves of excitement and heightened expectations.
However, the AI race has hit an unexpected slowdown. Leaders in the field are realizing that simply scaling up models (adding more computing power, more data, and more parameters) is no longer yielding the dramatic improvements it once did. The returns from the golden age of “scaling laws” appear to have plateaued.
Understanding the AI Scaling Plateau
For years, “scaling laws” served as a guiding principle in AI development. Much as baking a bigger cake calls for more ingredients and a longer baking time, the scaling hypothesis held that larger models trained on bigger datasets would naturally deliver better performance. This strategy worked exceptionally well during the 2010s, fueling innovations in natural language processing, computer vision, and generative AI.
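For readers who want the idea in more concrete terms, here is a rough sketch of the relationship, following the power-law form popularized by OpenAI’s 2020 scaling-law research (the symbols are illustrative, and the fitted constants vary by study):

\[
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}
\]

Here L is the model’s test loss, N is its parameter count, and N_c and \alpha_N are constants fitted empirically; companion formulas describe similar curves for dataset size and training compute. As long as the power law held, making N bigger bought a predictable drop in loss. The concern in 2024 is that real-world gains no longer seem to track that curve.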
But in 2024, the cracks in this approach have begun to show. OpenAI co-founder Ilya Sutskever, who left the company this year and now leads his own AI venture, Safe Superintelligence (SSI), recently told Reuters, “The 2010s were the age of scaling, but now we’re back in the age of wonder and discovery. Everyone is searching for the next big thing.”
Take OpenAI’s new model, Orion. Reports suggest that Orion matched GPT-4’s performance after only about 20% of its training, but that the remaining gains have been far less dramatic than the leap from GPT-3 to GPT-4. Challenges remain, particularly in domains like coding, where a shortage of specialized training data has held results back. Consequently, OpenAI has reportedly delayed Orion’s public release until early 2025.
Anthropic’s Claude 3.5 Opus, another highly anticipated AI release, has reportedly faced similar hurdles. Despite being larger and more resource-intensive than its predecessors, the model does not significantly outperform them.
The Data Dilemma
At the heart of this slowdown lies a major bottleneck: access to high-quality data. For years, AI companies relied on scraping publicly available content from the internet. However, this approach is reaching its limits. The need for curated, high-quality datasets—especially for niche or complex tasks—has grown significantly.
In response, companies like OpenAI have begun forming partnerships with publishers and content creators to obtain targeted datasets. While promising, this approach is slow and expensive. Adding to the challenge is the rise of synthetic data: AI-generated text and images that, while abundant, often lack the richness and complexity of human-created content.
The Cost of Bigger Models
The financial burden of AI research has also become staggering. Training state-of-the-art models like GPT-4 or Google’s Gemini reportedly costs upwards of $100 million, and these figures are expected to soar into the billions in the coming years.