Can OpenAI’s o1 Model Outperform Humans? Early Tests Say Yes

NEWS

By Ryker Westin

12 September 2024

OpenAI’s new o1 model brings AI closer to human-like reasoning

OpenAI is taking another step toward advanced AI systems with the launch of its latest model, o1, which promises stronger reasoning and the ability to work through intricate questions faster than a human can. Alongside o1, OpenAI is also introducing o1-mini, a smaller and more cost-effective version. For those following the AI rumor mill, this is the Strawberry model we have been hearing about for so long.

The o1 model is part of OpenAI’s vision to create more human-like artificial intelligence, but its current focus is on improving its ability to solve complex problems—particularly in coding and multi-step reasoning. However, this comes at a price: the model is both more expensive and slower than OpenAI’s previous GPT-4o, with the company referring to this release as a “preview” to emphasize its early stage.

Starting today, ChatGPT Plus and Team users will have access to both the o1-preview and o1-mini models, while Enterprise and Edu users will gain access in the coming week. OpenAI has also announced that it intends to roll out o1-mini to its free-tier users, although it has not set a date. Developer access to o1-preview comes with a hefty price tag: $15 per million input tokens (chunks of text the model processes) and $60 per million output tokens it generates. For comparison, GPT-4o costs significantly less at $5 per million input tokens and $15 per million output tokens.
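To make the price gap concrete, here is a minimal Python sketch that applies the per-million-token rates quoted above to a single request. The function name, the price table layout, and the token counts (2,000 in, 1,000 out) are illustrative assumptions, not figures from OpenAI.

```python
# USD per million tokens, as quoted in this article (September 2024).
PRICES = {
    "o1-preview": {"input": 15.00, "output": 60.00},
    "gpt-4o": {"input": 5.00, "output": 15.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request at the quoted rates."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Hypothetical request size, chosen only for illustration.
o1_cost = request_cost("o1-preview", 2_000, 1_000)
gpt4o_cost = request_cost("gpt-4o", 2_000, 1_000)

print(f"o1-preview: ${o1_cost:.3f}")
print(f"gpt-4o:     ${gpt4o_cost:.3f}")
print(f"ratio:      {o1_cost / gpt4o_cost:.1f}x")
```

For this assumed request, o1-preview comes to about $0.09 against roughly $0.025 for GPT-4o, or around 3.6 times the price. The exact ratio depends on the mix of input and output tokens, since output is priced four times higher than GPT-4o's while input is priced three times higher.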

A New Approach to AI Training

The o1 model represents a significant departure from previous AI models. According to Jerry Tworek, OpenAI’s research lead, the training methodology used for o1 is fundamentally different, although specific details remain under wraps. What sets o1 apart is its use of reinforcement learning—a technique where the system is trained through rewards and penalties to develop problem-solving skills autonomously. The model’s thought process is designed to mirror the human approach to problem-solving, utilizing a “chain of thought” methodology that allows it to tackle queries step by step.

Tworek highlights that the new training approach has reduced hallucinations (a common problem where AI models generate incorrect information). However, OpenAI acknowledges that the issue hasn’t been completely resolved, and occasional inaccuracies still occur.

Superior Problem-Solving Abilities

One of the standout features of o1 is its enhanced capability to handle complex tasks like coding and mathematics. In a test conducted by OpenAI, the model solved 83% of the problems in a qualifying exam for the International Mathematical Olympiad, a massive leap from GPT-4o’s 13% success rate. Bob McGrew, OpenAI’s chief research officer, also noted that in Codeforces competitions (online programming contests), o1 performed in the 89th percentile, far surpassing its predecessors.

However, while o1 excels in reasoning and problem-solving, it falls short of GPT-4o in other areas: it has less factual knowledge about the world and cannot browse the web. The model also cannot process images or files, underscoring its current limitations. Still, OpenAI views o1 as a foundational step toward a new era of AI models, with its name symbolizing a fresh start.

The Future of AI Reasoning

OpenAI’s goal with the o1 model isn’t just to create a smarter assistant—it’s to pave the way for autonomous AI agents capable of making decisions and taking actions independently. The ability to reason through complex problems is a critical milestone toward achieving human-level intelligence, with potential breakthroughs in fields like medicine and engineering.

Though impressive, o1’s reasoning abilities are currently slow and costly, making it a tool better suited for developers and specialized applications. Nonetheless, McGrew emphasized that the company’s focus on reasoning is crucial for the long-term evolution of AI, as it moves beyond pattern recognition into a realm where true intelligence becomes a possibility.

Copyright ©2024 TechyMenia. All Rights Reserved.

About Author

Ryker Westin

Ryker Westin is a security and networking expert based in Houston, Texas, covering everything from data breaches and malware to optimizing Wi-Fi coverage for homes and businesses. With years of experience in cybersecurity and B2B tech, he’s always on the lookout for the next major cyberattack or data breach. Having worked remotely since 2018, Ryker has reviewed numerous standing desks and essential remote working accessories. He’s a strong advocate for cable management and staying organized. When he's not writing, you’ll find him tinkering with PCs and game consoles, managing cables, and upgrading his smart home.