OpenAI details safety controls for Codex coding agent
On May 7, 2026, OpenAI published a technical summary of the safety controls for its Codex coding agent, which autonomously reviews repositories, runs commands, and interacts with development tools. Sandboxing defines technical execution boundaries, while an approval policy governs when Codex must ask permission to act outside the sandbox. An auto-review mode approves low-risk requests to reduce interruptions. A managed network policy permits expected destinations and blocks unfamiliar domains. CLI and MCP OAuth credentials are stored in the OS keyring, with login forced through ChatGPT and pinned to enterprise workspaces. Dangerous shell commands can be blocked outright or gated behind approval. Codex supports OpenTelemetry log export for events such as user prompts, tool approvals and results, and network decisions, and activity logs are available through the OpenAI Compliance Platform. OpenAI also runs an AI-powered security triage agent over Codex logs to analyze intent and flag deviations from expected behavior. The controls are enforced through cloud-managed requirements, macOS managed preferences, and local requirements files that users cannot alter.
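As an illustrative sketch only (the key names below are assumptions, not OpenAI's documented schema), the sandbox, approval, and network controls described above might be expressed in a Codex-style TOML configuration:

```toml
# Hypothetical Codex-style configuration -- field names are illustrative
# assumptions, not OpenAI's published schema.

# Sandboxing: restrict writes to the workspace.
sandbox_mode = "workspace-write"

# Approval policy: ask before acting outside the sandbox;
# an auto-review mode would approve requests classified as low-risk.
approval_policy = "untrusted"

# Managed network policy: allow expected destinations, deny the rest.
[network]
allowed_domains = ["github.com", "pypi.org"]
default = "deny"
```

Enforcing such a file via cloud-managed requirements or macOS managed preferences, rather than user-editable local settings, is what keeps the policy out of the user's hands.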
Key facts
- Codex is a coding agent that autonomously reviews repos, runs commands, and interacts with dev tools.
- Sandboxing defines technical execution boundaries for Codex.
- Approval policy determines when Codex must ask for permission to act outside the sandbox.
- Auto-review mode auto-approves low-risk requests to reduce interruptions.
- Managed network policy allows expected destinations and blocks unfamiliar domains.
- CLI and MCP OAuth credentials are stored in the OS keyring.
- Login is forced through ChatGPT and pinned to enterprise workspaces.
- Dangerous shell commands can be blocked or require approval.
- Codex supports OpenTelemetry log export for events like prompts, approvals, tool results, and network decisions.
- Activity logs are available through the OpenAI Compliance Platform for Enterprise and Edu customers.
- OpenAI uses an AI-powered security triage agent that inspects Codex logs to distinguish expected behavior from anomalies.
- Controls are applied via cloud-managed requirements, macOS managed preferences, and local requirements files.
- The post was published on May 7, 2026.
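The managed network policy in the list above is essentially an allowlist decision. A minimal sketch of that logic in Python, assuming a simple exact- and subdomain-match allowlist (the function name and policy shape are illustrative, not Codex's actual implementation):

```python
# Illustrative sketch of an "allow expected destinations, block unfamiliar
# domains" check -- not OpenAI's actual implementation.

ALLOWED_DOMAINS = {"github.com", "pypi.org", "registry.npmjs.org"}  # assumed policy

def is_allowed(host: str) -> bool:
    """Allow a host if it equals an allowed domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)

# Unfamiliar destinations are denied by default.
print(is_allowed("api.github.com"))    # subdomain of an allowed domain
print(is_allowed("evil.example.com"))  # not on the allowlist
```

Note the explicit `"." + d` suffix check: a plain `endswith(d)` would wrongly admit look-alike hosts such as `notgithub.com`.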
Entities
Institutions
- OpenAI
- ChatGPT
- OpenAI Compliance Platform