Technology

ChatGPT has prompted a ‘code red’ – and other, alternative chatbots could be coming for it


OpenAI’s system is behind both on technical prowess and personality


Andrew Griffin | Wednesday 03 December 2025 18:45 GMT

The ChatGPT logo pictured in Mulhouse, eastern France on 19 October 2023 (AFP/Getty)


Almost exactly three years ago, everything changed: OpenAI launched ChatGPT and upended the entire world in a moment. Since then, artificial intelligence has transformed industries, become one of the most talked-about subjects in the world, and much more besides.

Throughout all that, ChatGPT has remained so popular that it is almost synonymous with generative AI and chatbots. But that dominance might be under threat: this week, it was reported that OpenAI boss Sam Altman has issued a “code red” and urged staff to improve ChatGPT, amid fears it could be overtaken by rivals.

There are many of those rivals, each offering a system that uses a large language model to answer questions typed into a box, just like ChatGPT. They include Google’s Gemini model – the one improving so rapidly that it worried Mr Altman – as well as others such as Elon Musk’s Grok, Perplexity and Anthropic’s Claude.

For the most part, the systems are more similar than they are different. All of them were created by training a large language model on vast amounts of text and then wrapping that model in a chat interface so that people can easily interact with it. All of them aim to answer questions and fulfil requests as helpfully as possible.

Broadly, the different systems are better or worse at different things: Claude tends to be better for coding, for instance, while Gemini can call on its connection with the rest of Google search and produce better answers about real-time events.

There are some leaderboards that aim to offer more objective ways of comparing the different systems. AI company Hugging Face, for instance, operates a leaderboard that evaluates them on a variety of criteria: how much context they can bring into a conversation, how quickly they work, and how well they perform on a series of tests. Broadly, it suggests that Google’s top-end models are the best, followed by Anthropic’s Claude and then OpenAI’s ChatGPT.

Similarly, researchers have created a benchmark called Humanity’s Last Exam, which aims to evaluate how close AI systems are to expert-level humans by asking them questions from a 2,500-strong set of advanced problems. Google is winning there too: Gemini sits at the top of the leaderboard, followed by two different releases from OpenAI and then Claude. (None of them is yet doing all that well: the top score is under 38 per cent, suggesting that humans remain more intelligent for now.)

All of these more objective rankings have their issues, however. One is that new models are trained after the leaderboards’ tests were written, which means they can be tuned specifically to do well on those tests. Another is that excelling on such tests does not necessarily mean a model will be more helpful in practice.

The more obvious and perhaps more substantial difference between the models is one of personality. What really marks the systems apart is the training process they have been through, during which they are built to favour different kinds of answers: one might be more playful, for instance, another more verbose. Users can discover those stylistic differences by spending some time with each of the systems.

OpenAI seems to have been panicked into its “code red” by the technical superiority of Google’s Gemini. But ChatGPT has also received sustained criticism for problems with its style. When it launched GPT-5 in the summer – after intense attempts to build hype around the reveal – it was hit by complaints from users who found the new model cold, unfriendly or less fun. The company rushed to respond, offering the ability to use the older model as well as making quick tweaks to the new one to make it more satisfying to users.

For now, ChatGPT remains by far the most popular of the chatbots. But the panic at OpenAI seemingly reflects a real threat: that, whether because rivals pull ahead on technical prowess or because it becomes simply a bit annoying, people might stop talking to it.
