After months of waiting, OpenAI has finally released a series of new models called "o1" that excel at advanced reasoning, formerly referred to as Strawberry. The new models include OpenAI o1, OpenAI o1-preview and OpenAI o1-mini. The preview and mini models are available to paid ChatGPT Plus users starting today. Later, OpenAI o1-mini will also be available to free ChatGPT users.
OpenAI says that o1 models take some time to think before responding, but they can "reason through complex tasks" and solve difficult problems in math, science, and coding. Furthermore, OpenAI says that the new reasoning models perform on par with PhD students on challenging science topics.
To give you a benchmark, the OpenAI o1 model scored 83% on a qualifying exam for the International Mathematics Olympiad (IMO), while GPT-4o could solve only 13% of the problems. And in Codeforces competitions, the new o1 model reached the 89th percentile while GPT-4o remained at the 11th percentile.
On the MMLU benchmark, OpenAI o1 scored 92.3 and on the MATH benchmark it scored 94.8. OpenAI says that on tasks that require deep reasoning, o1 closely matches the performance of human experts, which is quite significant.
The o1 model is trained through reinforcement learning to reason with a chain of thought. It breaks a hard problem down into simpler steps and tries different strategies at each step until it reaches the correct conclusion. For now, o1 models support only text input. You cannot use the model to browse the web or analyze files and images.