I pitted ChatGPT’s new o3-mini reasoning model against DeepSeek-R1, and I was shocked by the results

It’s been a rollercoaster week for artificial intelligence, with DeepSeek completely destabilizing the AI market by releasing its R1 reasoning model and not only giving everybody access to it for free (as a chatbot), but also giving developers incredibly cost-effective access to it as an API.

DeepSeek was then hit by cyber attacks that temporarily took it offline, but it appears to be up and running again. To cap the week off, OpenAI responded by releasing its o3-mini and o3-mini-high reasoning models across all its subscription services, including its Plus and Pro subscriptions and its free tier.

Using o3-mini on the free tier of ChatGPT is simple, whether on mobile or the web. On mobile, update your ChatGPT app first, then tap the new Reason button next to search in the message box. It works exactly the same way in the web browser version of ChatGPT, where you’ll find a new Reason button in the box where you type your text prompt:

ChatGPT has a new Reason button you can select when messaging the chatbot. (Image credit: OpenAI)

Reasoning models are particularly good at tasks like writing complex code and solving difficult math problems. However, most of us use chatbots to get quick answers to the kind of questions that come up in everyday life. So I started wondering how the new o3-mini reasoning model would compare to DeepSeek-R1, since they’re both free to access, and I set about asking both of them some tough questions that would require a little bit of thought to answer.

Life advice

The first thing I asked o3-mini was to give me a bit of life advice and (pretending I was 18 years old again) help me choose between starting my career or going to university. That’s the sort of question that has a lot of factors that need consideration, so I thought it would be a good place to start.

While both LLMs gave me a decent answer, the difference in how they presented it was quite shocking. ChatGPT o3-mini thought about it for a few seconds, giving me a brief insight into its thinking, telling me, “I’m weighing the decision between starting a career now or pursuing further education. Need to gather more details, like goals and specific circumstances, before giving any advice,” and “I’m evaluating fields’ requirements, considering interests, preferences, finances, career goals, and job market. Mentorship and research are pivotal. Personal context is crucial for an informed decision,” before giving me an actual answer that was pretty balanced.

DeepSeek, however, completely lifted the lid on its reasoning process, telling me what it was considering at every point. In fact, there was almost too much information! It felt like looking into the anxious mind of an over-thinker. “Wait,” DeepSeek wonders, “but how do I know what I want? Maybe I should list out the pros and cons.” And later, it ponders, “What about passion? Am I excited about a particular field of study, or am I more eager to get into the workforce? If I’m not sure what to study, maybe working for a while could help me figure that out before committing to a degree.” And so it goes on, and that is just a small sample of the behind-the-scenes reasoning DeepSeek-R1 provides.

Both models gave me a breakdown of the final answer, with bullet points and categories, before hitting a summary. This is how deep reasoning models tend to provide their answers, in contrast to things like ChatGPT 4o, which will just give you a more concise answer.

Obviously, I didn’t stop there, but the results were the same for most queries I threw at the models. ChatGPT o3-mini is more concise in showing its reasoning, while DeepSeek-R1 is more sprawling and verbose. If you really need to see how the LLM arrived at its answer, then DeepSeek-R1’s approach feels like you’re getting the full reasoning service, while ChatGPT o3-mini feels like an overview in comparison.

I’ve read reports on how o3-mini can crush DeepSeek-R1 in terms of physics simulations and complex geometric challenges, but for the simple stuff, I think I prefer DeepSeek-R1.
