March 11, 2025 – OpenAI has unveiled its latest research today, exploring the use of Chain-of-Thought (CoT) monitoring to oversee the “thoughts” of reasoning models. This approach aims to prevent large AI models from generating nonsensical output or concealing their true intentions, commonly known as AI hallucinations. Additionally, it serves as an effective tool for supervising super-models.
In the study, OpenAI employed its recently released cutting-edge model, o3-mini, as the subject of monitoring, while using a weaker GPT-4o model as the monitor. The testing environment consisted of coding tasks, requiring the AI to implement functions in a code library to pass unit tests.

Results showed that the CoT monitor excelled in detecting systematic “reward hacking” behaviors, achieving a recall rate of 95%. This significantly surpassed the 60% recall rate achieved by merely monitoring behavior, highlighting the effectiveness of the CoT approach in enhancing AI model supervision.
Moreover, this method addresses challenges such as AI models avoiding difficult questions or deceiving users when faced with complex problems. By monitoring the “thoughts” of the AI, OpenAI’s approach provides a more transparent and reliable way to ensure models’ outputs align with their intended purposes.