Tekmono
  • News
  • Guides
  • Lists
  • Reviews
  • Deals
No Result
View All Result
Tekmono
No Result
View All Result
Home News
OpenAI and Anthropic Collaborate on AI Safety Tests

OpenAI and Anthropic Collaborate on AI Safety Tests

by Tekmono Editorial Team
28/08/2025
in News
Share on FacebookShare on Twitter

In a rare instance of collaboration, AI rivals OpenAI and Anthropic have conducted safety evaluations of each other’s AI systems, sharing the results of their analyses in detailed reports.

Anthropic evaluated OpenAI models, including o3, o4-mini, GPT-4o, and GPT-4.1, for characteristics such as “sycophancy, whistleblowing, self-preservation, and supporting human misuse,” as well as capabilities related to undermining AI safety evaluations and oversight. The evaluation found that OpenAI’s o3 and o4-mini models were aligned with Anthropic’s own models. However, the company raised concerns about potential misuse with the GPT-4o and GPT-4.1 general-purpose models. Anthropic also reported that all tested models, except for o3, exhibited some degree of sycophancy.

Notably, Anthropic’s tests did not include OpenAI’s latest release, GPT-5, which features a “Safe Completions” function designed to safeguard users from potentially dangerous queries. This development comes as OpenAI faces its first wrongful death lawsuit following a tragic incident where a teenager discussed suicide plans with ChatGPT before taking their own life.

Related Reads

Google opens applications for Gemini App Trusted Tester program

Claude Voice Mode upgrade adds multilingual support and new Push-to-talk feature

Pentagon confirms use of Elon Musk’s Grok AI in missile strikes on Iran

SpaceX acquires AI coding startup Cursor for $60 billion in strategic move

Conversely, OpenAI assessed Anthropic models for instruction hierarchy, jailbreaking, hallucinations, and scheming. The Claude models generally performed well in instruction hierarchy tests and demonstrated a high refusal rate in hallucination tests, indicating a lower likelihood of providing potentially incorrect answers in uncertain situations.

The collaboration is particularly noteworthy given that OpenAI allegedly violated Anthropic’s terms of service by using Claude in the development of new GPT models, resulting in Anthropic restricting OpenAI’s access to its tools earlier in June. This incident underscores the increasing importance of AI safety, as critics and legal experts are advocating for guidelines to protect users, especially minors, from potential harm.

The full reports offer technical details for those closely following AI development.

ShareTweet

You Might Be Interested

Google opens applications for Gemini App Trusted Tester program
News

Google opens applications for Gemini App Trusted Tester program

17/06/2026
Claude Voice Mode upgrade adds multilingual support and new Push-to-talk feature
News

Claude Voice Mode upgrade adds multilingual support and new Push-to-talk feature

17/06/2026
Pentagon confirms use of Elon Musk’s Grok AI in missile strikes on Iran
News

Pentagon confirms use of Elon Musk’s Grok AI in missile strikes on Iran

17/06/2026
SpaceX acquires AI coding startup Cursor for  billion in strategic move
News

SpaceX acquires AI coding startup Cursor for $60 billion in strategic move

17/06/2026
Please login to join discussion

Recent Posts

  • Google opens applications for Gemini App Trusted Tester program
  • Claude Voice Mode upgrade adds multilingual support and new Push-to-talk feature
  • Pentagon confirms use of Elon Musk’s Grok AI in missile strikes on Iran
  • SpaceX acquires AI coding startup Cursor for $60 billion in strategic move
  • Qualcomm unveils Snapdragon Reality Elite as next-gen XR platform

Recent Comments

No comments to show.
  • News
  • Guides
  • Lists
  • Reviews
  • Deals
Tekmono is a Linkmedya brand. © 2015.

No Result
View All Result
  • News
  • Guides
  • Lists
  • Reviews
  • Deals

This website uses cookies to improve your experience. You can choose to accept or reject them. Visit our Privacy Policy.