Codex

Use Codex as a coding agent interface for your Orq.ai workspace. Query traces, run evals, inspect deployments, and debug production LLM behavior from the Codex CLI or desktop app, with the Orq.ai dashboard available for deeper drill‑down.

MCP

Skills

Natural language

Local-first

Integration type: MCP server (Streamable HTTP)

Setup time: Quick setup once Codex (CLI and/or desktop app) is installed and an Orq API key is configured.

Auth: Orq.ai API key (workspace‑ or project‑level) passed as a bearer token via the ORQ_API_KEY environment variable, referenced from the Codex MCP config.

Skills support: Codex discovers Orq MCP tools at startup and can call them whenever they are relevant to your requests.

Cloud-based: The Codex CLI runs locally in your terminal, and the Codex desktop app runs on your machine. Both send MCP requests to Orq’s cloud APIs for workspace data.

Multi‑workspace: Use different ORQ_API_KEY values or multiple MCP server entries to point Codex at different Orq workspaces or environments.

Vendor: OpenAI

Pricing: Included with supported Orq.ai workspaces. Codex is available through ChatGPT Plus, Pro, Business, and Enterprise plans; confirm availability in both your Codex and Orq plans.

Why Connect Codex to Orq.ai?

Stay in your terminal

Stop switching between Codex, the Orq.ai dashboard, and separate eval scripts. Query traces, run experiments, and inspect deployments from the same agentic workflow your team already uses.

Ask operational questions in natural language

Ask questions like “Show me yesterday’s failed agent runs grouped by error type” and let Codex turn that intent into Orq MCP tool calls. No SDKs to learn and no API URLs to memorize.

Connect evals to your development workflow

Use Orq.ai inside your existing Codex workflows. From the CLI or desktop app, pull trace data, design and run evals, and kick off experiments as part of your normal coding sessions.

Keep production behavior visible

Orq.ai gives teams visibility into MCP‑driven activity, including which tools ran, when they ran, and which key or workspace triggered them. Codex brings that visibility into the same environment where agents are already working on your code.

Setup

1: Install Codex

  • Codex CLI

Install the Codex CLI and sign in with your OpenAI / ChatGPT account.

  • Codex desktop app

Install the Codex desktop app for macOS or Windows and sign in; it shares configuration with the CLI via the same Codex config file.

2: Create an Orq.ai API key

In Orq.ai, create an API key for the workspace or project you want Codex to access.

Export it in your shell so Codex can reference it:

bash

export ORQ_API_KEY="<your-orq-api-key>"
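Because Codex reads the key from its environment, it can help to confirm the variable is actually exported before going further. The check below is a minimal sketch; `require_env` is a hypothetical helper name used only for illustration and is not part of Codex or Orq.ai.

```shell
# require_env NAME: succeed only if the variable NAME is exported and non-empty.
# Hypothetical helper for illustration; not part of Codex or Orq.ai.
require_env() {
  # printenv only sees exported variables, which is what a child process
  # such as Codex would inherit from this shell.
  if [ -z "$(printenv "$1")" ]; then
    echo "$1 is not exported or is empty" >&2
    return 1
  fi
  echo "$1 is set"
}
```

For example, `require_env ORQ_API_KEY` prints a confirmation when the key is exported, and returns a non-zero status with a message on stderr when it is not.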

3: Add the Orq MCP server to Codex

You can configure Orq as a Streamable HTTP MCP server either from the CLI or from the Codex settings UI; both paths write to the same Codex config.

Option A - Codex CLI

Use Codex’s MCP command to register Orq as a remote MCP server:

bash

codex mcp add orq \
  --url https://my.orq.ai/v2/mcp \
  --bearer-token-env-var ORQ_API_KEY

Here, orq is the name Codex will use for this MCP server, https://my.orq.ai/v2/mcp is the Orq MCP endpoint, and --bearer-token-env-var ORQ_API_KEY tells Codex to read the key from the ORQ_API_KEY environment variable and send it as Authorization: Bearer <your-orq-api-key> on each request.

Verify the connection with:

bash

codex mcp list

and confirm the Orq MCP server appears.
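If you want to script that check, one option is to search the listing for the server name. This is only a sketch: the exact output format of `codex mcp list` is not described here, so the helper assumes nothing more than that the registered name appears somewhere in the output as a whole word. `mcp_server_listed` is a hypothetical name.

```shell
# mcp_server_listed NAME: read a server listing on stdin and succeed if NAME
# appears as a whole word. Hypothetical helper; the output format of
# `codex mcp list` is an assumption, so this is a loose text match only.
mcp_server_listed() {
  # -q: no output, just an exit status; -w: match NAME as a whole word.
  # Note that -w treats "." and "-" as word boundaries, so "orq" would also
  # match inside "my.orq.ai" or "orq-dev".
  grep -qw -- "$1"
}
```

A typical use would be `codex mcp list | mcp_server_listed orq && echo "orq is registered"`. Because the match is loose, treat a positive result as a hint and read the listing itself if in doubt.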

Option B - Edit the Codex config file

Codex stores MCP configuration in a TOML config file (for example, ~/.codex/config.toml). You can add Orq manually:

toml

[mcp_servers.orq]

url = "https://my.orq.ai/v2/mcp"

bearer_token_env_var = "ORQ_API_KEY"
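If you prefer to script the manual edit, the sketch below appends the same table to a config file only when it is not already there, so running it twice does not create a duplicate entry. `add_orq_mcp_entry` is a hypothetical helper name, and the file path is whatever you pass in; the example path mentioned above is one possibility.

```shell
# add_orq_mcp_entry FILE: append the [mcp_servers.orq] table to FILE unless a
# table with that exact header is already present.
# Hypothetical helper for illustration; not part of Codex or Orq.ai.
add_orq_mcp_entry() {
  config_file="$1"
  # Make sure the parent directory and the file exist before reading it.
  mkdir -p "$(dirname "$config_file")"
  touch "$config_file"
  # -F: fixed string, -x: whole-line match, so only the exact header counts.
  if grep -qFx '[mcp_servers.orq]' "$config_file"; then
    echo "orq entry already present in $config_file"
    return 0
  fi
  # Quoted 'EOF' keeps the text literal (no variable expansion).
  cat >> "$config_file" <<'EOF'

[mcp_servers.orq]
url = "https://my.orq.ai/v2/mcp"
bearer_token_env_var = "ORQ_API_KEY"
EOF
  echo "orq entry added to $config_file"
}
```

For example, `add_orq_mcp_entry "$HOME/.codex/config.toml"` would target the example path above. The helper only checks for the table header, so it will not notice an existing entry whose URL or variable name differs from what you intended.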

4: Start using Orq.ai tools from Codex

Start Codex (CLI or desktop), and ask it to list available Orq tools, query traces, or run an experiment to confirm everything is wired up.


What Can You Do with Orq.ai + Codex?

Query observability data in natural language

Use Codex to “talk” to your Orq.ai traces. Ask for failed agent runs, slowest requests over the last 24 hours, or errors grouped by model, then reuse those insights directly in your coding workflows.

Design and run evaluations

Describe the behavior you want to test, let Codex scaffold evaluators and datasets, then run evals against your deployments without moving into a separate tool.

Compare prompts, models, and configs

From Codex, create experiments that compare prompts, models, or configurations, run them on real or synthetic datasets, and inspect results in Orq.ai when you need more detail.

Generate reusable synthetic datasets

Ask Codex to create challenging synthetic test cases for a workflow, such as contract analysis or support tickets, and prepare them as reusable Orq.ai datasets. Then reuse them across evals and experiments.

Debug production regressions as a team

When something breaks, stay in Codex. Pull recent traces for a deployment, filter by failure pattern, and use those examples to guide prompt or model changes backed by experiments and evals.

Codex alone vs. Codex with Orq.ai MCP

Query production LLM traces

  • Codex alone: No built‑in view into Orq.ai’s observability data.

  • Codex + Orq.ai MCP: Ask Codex to list, filter, and group Orq.ai traces (failures, slow runs, agent tool calls, etc.).

Run experiments on prompts

  • Codex alone: You can iterate on prompts manually in chat, but there is no native experiment tracking.

  • Codex + Orq.ai MCP: Create and run Orq.ai experiments comparing prompts, models, or configs against datasets, directly from a Codex session.

Generate synthetic eval data

  • Codex alone: You can prompt Codex to generate examples, then copy and paste them elsewhere.

  • Codex + Orq.ai MCP: Generate synthetic test cases and save them as reusable Orq.ai datasets for evals and experiments.

Pull cost and usage analytics

  • Codex alone: No view into Orq router or deployment analytics.

  • Codex + Orq.ai MCP: Query Orq.ai’s cost, usage, and performance metrics for models and deployments via MCP tools, then bring those numbers into your Codex workflows.

Run evaluators on datasets

  • Codex alone: No built‑in concept of Orq evaluators or datasets.

  • Codex + Orq.ai MCP: Work with Orq evaluators and datasets from Codex, depending on the MCP tools enabled.


FAQs

Do I have to use Codex to get value from Orq.ai?

No. Orq.ai works on its own through the UI and API. Codex is an optional agentic front‑end for your workspace. You get the same experiments, evals, and observability in Orq; Codex simply lets teams drive them from an AI coding agent using natural language.

What can Codex see in my Orq.ai workspace, and how is access controlled?

Codex only sees what the Orq API key you configure is allowed to access. If you use a project‑level key scoped to a specific workspace or environment, Codex can only query traces, experiments, datasets, and deployments inside that scope. Rotate or revoke the key in Orq to instantly cut off access.

Can I point Codex at different Orq environments (dev, staging, prod)?

Yes. You can set different environment variables and/or multiple MCP server entries for each Orq project or environment, then choose which one Codex uses before starting a session. That way, you can run evals and inspect traces in dev or staging first, then switch the same Codex setup to the production key.
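One way to set this up is to register one MCP server entry per environment, each reading its own key variable. The sketch below only prints the `codex mcp add` commands so you can review them before running anything. The `orq-<env>` server names and `ORQ_API_KEY_<ENV>` variable names are hypothetical conventions, and it assumes every environment uses the same endpoint URL with the key determining scope; adjust if your setup differs.

```shell
# print_orq_mcp_commands ENV...: print one `codex mcp add` command per
# environment name, without executing anything.
# The orq-<env> and ORQ_API_KEY_<ENV> naming is a hypothetical convention.
print_orq_mcp_commands() {
  for env_name in "$@"; do
    # Upper-case the environment name for the variable suffix (dev -> DEV).
    suffix="$(printf '%s' "$env_name" | tr '[:lower:]' '[:upper:]')"
    printf 'codex mcp add orq-%s --url https://my.orq.ai/v2/mcp --bearer-token-env-var ORQ_API_KEY_%s\n' \
      "$env_name" "$suffix"
  done
}
```

Calling `print_orq_mcp_commands dev staging prod` prints three commands, one per line. Export the matching key variables before starting Codex so each entry can authenticate.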

Do the Codex CLI and Codex desktop app share MCP configuration?

Yes. Both the CLI and desktop app read from the same Codex MCP config file, so adding the Orq MCP server once (via CLI commands or the app’s MCP settings) makes it available in both.

Create an account and start building today.
