Balancing innovation with safety, the new model opens up possibilities while keeping ethical considerations front and center.
OpenAI has published the GPT-4o System Card, a research document detailing the safety precautions and risk assessments undertaken prior to the launch of its newest model.
GPT-4o was publicly released in May this year. Before the launch, OpenAI followed standard practice and brought in an external group of red teamers, security experts who probe a system for weaknesses, to assess the model’s potential risks. They evaluated concerns such as the possibility that GPT-4o could generate unauthorized clones of someone’s voice, produce explicit or violent content, or reproduce segments of copyrighted audio. The findings from that evaluation are now being published.
OpenAI’s framework categorized GPT-4o as “medium” risk overall. That overall rating is simply the highest risk rating across four categories: cybersecurity, biological threats, persuasion, and model autonomy. Cybersecurity, biological threats, and model autonomy were all rated low risk; persuasion was the exception. Researchers found that some writing samples produced by GPT-4o could sway readers’ opinions more effectively than comparable human-written text, although the model was not more persuasive than humans overall.
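To make that aggregation rule concrete, here is a minimal, hypothetical sketch in Python: the overall level is just the highest rating among the tracked categories. The category names and ratings come from the system card coverage above; the ordering of risk levels and the code itself are illustrative assumptions, not OpenAI’s actual tooling.

```python
# Illustrative sketch only: the overall risk level is the highest rating
# across the tracked categories. The level ordering below is an assumption
# for illustration, not OpenAI's actual implementation.

RISK_LEVELS = ["low", "medium", "high", "critical"]  # assumed ordering, lowest to highest

def overall_risk(category_ratings):
    """Return the highest rating found among the per-category ratings."""
    return max(category_ratings.values(), key=RISK_LEVELS.index)

# Category ratings as described in the system card coverage above.
gpt4o_ratings = {
    "cybersecurity": "low",
    "biological threats": "low",
    "persuasion": "medium",
    "model autonomy": "low",
}

print(overall_risk(gpt4o_ratings))  # prints "medium", driven by the persuasion category
```

With three categories rated low and persuasion rated medium, the highest rating, and therefore the overall score, is medium.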
Lindsay McCallum Rémy, an OpenAI spokesperson, told The Verge that the system card includes readiness assessments developed by an internal team alongside evaluations from external testers, listed on OpenAI’s website as Model Evaluation and Threat Research (METR) and Apollo Research, organizations that both specialize in evaluating AI systems.
OpenAI has previously released system cards for GPT-4, GPT-4 with vision, and DALL-E 3, all of which underwent similar testing and research. But the GPT-4o system card arrives at a critical moment: the company has faced ongoing criticism of its safety standards from its own employees and from state senators. Just minutes before the system card’s release, The Verge reported on an open letter from Sen. Elizabeth Warren (D-MA) and Rep. Lori Trahan (D-MA) demanding answers about how OpenAI handles whistleblowers and safety reviews. The letter highlights various safety concerns, including CEO Sam Altman’s brief removal from the company in 2023 over the board’s concerns and the departure of a safety executive who criticized the company for prioritizing flashy products over safety.
The company is also launching a highly capable multimodal model just ahead of a US presidential election, which raises a clear risk that the model could inadvertently spread misinformation or be exploited by malicious actors. OpenAI is trying to show that it tests real-world scenarios to prevent misuse, but those concerns have not gone away.
There have been numerous calls for OpenAI to increase its transparency, not only regarding the model’s training data (e.g., whether it uses YouTube data) but also concerning its safety testing. In California, where OpenAI and other major AI labs are located, state Sen. Scott Wiener is working on a bill to regulate large language models. This proposed legislation would impose legal responsibilities on companies if their AI is misused. Should the bill pass, OpenAI’s advanced models would need to undergo state-mandated risk assessments before being released to the public. However, the key takeaway from the GPT-4o System Card is that, despite the involvement of external red teamers and testers, much of the evaluation still depends on OpenAI’s self-assessment.