Exclusive scoop: OpenAI is developing groundbreaking reasoning technology under the code name ‘Strawberry’! 🍓🔍

July 19, 2024

On July 12, it was revealed that OpenAI, the creator of ChatGPT, is developing a new approach to its AI models under the project code-named “Strawberry,” according to a source familiar with the matter and internal documents reviewed by Reuters. The project, previously undisclosed, aims to demonstrate that the Microsoft-backed startup’s models can provide advanced reasoning capabilities.

Teams within OpenAI are working on Strawberry, according to a recent internal document that Reuters viewed in May. Reuters was unable to determine the exact date of the document, which outlines how OpenAI plans to use Strawberry for research. The source described the plan as a work in progress, and it is unclear how close Strawberry is to being publicly available.

According to the source, the workings of Strawberry are closely guarded even within OpenAI. The document outlines a project where Strawberry models aim to enable the company’s AI not only to generate answers but also to autonomously and reliably navigate the internet for “deep research.” This capability has eluded AI models so far, according to interviews with over a dozen AI researchers.

When asked about Strawberry and the details in this story, an OpenAI spokesperson stated: “We want our AI models to see and understand the world more like we do. Continuous research into new AI capabilities is a common practice in the industry, with a shared belief that these systems will improve in reasoning over time.” The spokesperson did not directly address questions about Strawberry.

The project was formerly known as Q*, which Reuters reported last year was regarded within the company as a breakthrough. Two sources described seeing demos of Q* earlier this year, in which OpenAI staffers showed it answering complex science and math questions that current commercially available models cannot handle. Another source said that OpenAI has internally tested AI that scored over 90% on a MATH dataset, a benchmark for challenging math problems, though it is unclear whether this is related to Strawberry.

At an internal all-hands meeting on Tuesday, OpenAI showcased a research project claiming new human-like reasoning skills, according to Bloomberg. An OpenAI spokesperson confirmed the meeting but declined to provide details. Reuters could not confirm if the demonstrated project was Strawberry.

OpenAI aims for Strawberry to significantly enhance its AI models’ reasoning capabilities. The project involves a specialized method of processing an AI model after it has been pre-trained on extensive datasets. Researchers interviewed by Reuters emphasized that reasoning is crucial for AI to achieve human-level or superhuman intelligence.

Credits: Reuters

While large language models can efficiently summarize dense texts and compose elegant prose, they often struggle with common-sense problems, such as recognizing logical fallacies or playing tic-tac-toe. When faced with such problems, the models can “hallucinate,” producing incorrect information.

AI researchers interviewed by Reuters agree that reasoning, in the context of AI, involves forming a model that lets the system plan ahead, understand the physical world, and work through complex multi-step problems reliably. Improving reasoning in AI models is viewed as crucial for enabling them to make significant scientific discoveries and develop new software applications.

OpenAI CEO Sam Altman emphasized earlier this year that reasoning ability is one of the most important areas of progress in AI. Other companies such as Google, Meta, and Microsoft, as well as many academic labs, are also experimenting with techniques to improve AI reasoning. However, researchers disagree on whether large language models (LLMs) can incorporate ideas and long-term planning into how they make predictions. For example, Meta’s Yann LeCun has frequently said that LLMs are not capable of human-like reasoning.

AI CHALLENGES

Strawberry is a key part of OpenAI’s strategy to address these challenges, according to a source familiar with the matter. The internal document seen by Reuters described Strawberry’s goals but not its methods. Recently, the company has privately indicated to developers and other stakeholders that it is close to releasing technology with significantly more advanced reasoning capabilities.

Strawberry involves a specialized “post-training” method for OpenAI’s generative AI models, which adapts the models after they have already been trained on large datasets. Post-training includes techniques such as “fine-tuning,” for example with human feedback on the models’ responses, and aims to improve the models’ performance in specific ways.

One source said that Strawberry has similarities to “Self-Taught Reasoner” (STaR), a method developed at Stanford in 2022 that allows AI models to enhance their intelligence by iteratively creating their own training data. Stanford professor Noah Goodman, one of STaR’s creators, who is not affiliated with OpenAI and is not familiar with Strawberry, said the method could potentially enable language models to surpass human-level intelligence.
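
To make the idea of a model “creating its own training data” concrete, here is a toy sketch in Python of a single STaR-style filtering round. It illustrates the general idea only and says nothing about how Strawberry works. Every name in it (Example, toy_generate, star_round) is hypothetical, the “model” is a stand-in function that gets large sums wrong, and the fine-tuning step that would follow each round is left out, as is the published method’s additional step of retrying failed problems with the correct answer as a hint.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Example:
    """One self-generated training example (hypothetical structure)."""
    question: str
    rationale: str
    answer: int


def toy_generate(question: str) -> Tuple[str, int]:
    """Stand-in for a language model: right on small sums, off by one on large ones."""
    a, b = (int(part) for part in question.split("+"))
    total = a + b if max(a, b) < 10 else a + b + 1
    return f"{a} plus {b} is {total}", total


def star_round(
    generate: Callable[[str], Tuple[str, int]],
    problems: List[Tuple[str, int]],
) -> List[Example]:
    """Keep only the rationales whose final answer matches the known answer."""
    kept = []
    for question, gold in problems:
        rationale, predicted = generate(question)
        if predicted == gold:
            kept.append(Example(question, rationale, predicted))
    return kept


problems = [("2+3", 5), ("12+30", 42), ("4+4", 8)]
kept = star_round(toy_generate, problems)
print([example.question for example in kept])  # ['2+3', '4+4']
```

In a real system, the kept examples would be used to fine-tune the model, and the round would then be repeated with the updated model, so that each pass can draw on reasoning the previous pass got right.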

One of the capabilities OpenAI is targeting with Strawberry is performing long-horizon tasks (LHT), which involve planning and executing a series of actions over an extended period. OpenAI is creating and testing the Strawberry models on a “deep-research” dataset, though it is unclear what the dataset contains or how long such tasks would run.

OpenAI wants its models to conduct autonomous research by browsing the web with the help of a “computer-using agent” (CUA) that can act on its findings. The company also plans to test the models’ capabilities in performing the work of software and machine learning engineers.