Claude Code

Use Claude Code as a terminal interface for your Orq.ai workspace. Query traces, run evals, inspect deployments, and debug production LLM behavior from your CLI, with the Orq.ai dashboard available for deeper drill-down.

MCP

Skills

Natural language

Local-first

Field	Value
Integration type	MCP server
Setup time	Quick setup once Claude Code is installed and an Orq API key is configured.
Auth	Orq.ai API key (workspace‑ or project‑level), configured via the ORQ_API_KEY environment variable.
Skills support	Claude Code can call Orq MCP tools when the server is configured.
Local‑first	Claude Code runs locally in your terminal. MCP requests call Orq’s cloud APIs for workspace data.
Multi‑workspace	Use different Orq API keys to point Claude Code at different Orq workspaces or environments.
Vendor	Anthropic
Pricing	Included with supported Orq.ai workspaces. Confirm availability in your plan.

Why Connect Claude Code to Orq.ai?

Stay in your terminal

Stop switching between Claude Code, the Orq.ai dashboard, and separate eval scripts. Query traces, run experiments, and inspect deployments from the same developer workflow.

Ask operational questions in natural language

Ask questions like “Show me yesterday’s failed agent runs grouped by error type” and let Claude Code turn that intent into Orq MCP tool calls. No SDKs to learn and no API URLs to memorize.

Connect evals to your development workflow

Use Orq.ai inside your existing CLI workflows. From Claude Code, pull trace data, design and run evals, and kick off experiments as part of CI jobs or git hooks.

Keep production behavior visible

Orq.ai gives teams visibility into MCP-driven activity, including which tools ran, when they ran, and which key or workspace triggered them.

Setup

1: Install Claude Code

Follow the Claude Code install instructions for your OS so you can run claude from your terminal.

2: Create an Orq.ai API key

In Orq.ai, create an API key for the workspace or project you want Claude Code to access, then export it in your shell:

export ORQ_API_KEY="<your-orq-api-key>"
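Before registering the server, it can help to confirm that the variable is actually set in the current shell. The following is a minimal sketch; the function name require_orq_key is hypothetical and not part of Orq.ai or Claude Code.

```shell
# Hypothetical helper (not an Orq.ai or Claude Code command):
# fail early if ORQ_API_KEY is missing or empty in the current shell.
require_orq_key() {
  if [ -z "${ORQ_API_KEY:-}" ]; then
    echo "ORQ_API_KEY is not set; export it before adding the Orq MCP server." >&2
    return 1
  fi
  # Report only the length so the key itself never appears in the terminal.
  echo "ORQ_API_KEY is set (${#ORQ_API_KEY} characters)."
}
```

Running require_orq_key before the next step gives a clear error instead of registering a server with an empty Authorization header.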

3: Add the Orq MCP server to Claude Code

Use Claude Code’s MCP command to register Orq as a remote MCP server, for example:

claude mcp add --transport http orq https://my.orq.ai/v2/mcp --header "Authorization: Bearer ${ORQ_API_KEY}"

Here, orq is the name Claude Code will use for this MCP server, https://my.orq.ai/v2/mcp is the Orq MCP endpoint, and the Authorization header carries your Orq API key, which the shell reads from the ORQ_API_KEY environment variable when you run the command.

Verify the connection with claude mcp list and confirm the Orq MCP server appears.
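The same check can be scripted. This is a minimal sketch that assumes claude mcp list prints each registered server on its own line, starting with the server name followed by a colon; that output format is an assumption and may differ between Claude Code versions. The helper name orq_mcp_registered is hypothetical.

```shell
# Hypothetical helper: succeed if a server with the given name
# (default: "orq") is listed by `claude mcp list`.
# Assumption: each server appears on a line of the form "<name>: ...".
orq_mcp_registered() {
  claude mcp list 2>/dev/null | grep -q -- "^${1:-orq}:"
}
```

For example, orq_mcp_registered && echo "Orq MCP server is registered" prints the message only when the server is found. If the output format differs on your machine, adjust the pattern accordingly.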

4: Start using Orq.ai tools from the terminal

In a project directory, start Claude Code and ask it to list available Orq tools, query traces, or run an experiment to confirm everything is wired up.
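For a quick one-off check without opening an interactive session, Claude Code's print mode (claude -p) takes a single prompt, prints the response, and exits. The sketch below wraps it in a hypothetical helper named orq_ask. Depending on your permission settings, Claude Code may need the Orq MCP tools approved before it can call them non-interactively.

```shell
# Hypothetical helper: send one prompt to Claude Code in print mode (-p),
# which prints the response and exits instead of opening a session.
orq_ask() {
  if [ "$#" -ne 1 ] || [ -z "$1" ]; then
    echo "usage: orq_ask \"<question>\"" >&2
    return 2
  fi
  claude -p "$1"
}

# Example prompts (not run here):
#   orq_ask "List the Orq MCP tools you have available"
#   orq_ask "Show me yesterday's failed agent runs grouped by error type"
```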

What Can You Do with Orq.ai + Claude Code?

Query observability data in natural language

Use Claude Code to “talk” to your Orq.ai traces. Ask for failed agent runs, slowest requests over the last 24 hours, or errors grouped by model.

Design and run evaluations

Describe the behavior you want to test, let Claude Code scaffold evaluators and datasets, then run evals against your deployments without moving into a separate tool.

Compare prompts, models, and configs

From Claude Code, create experiments that compare prompts, models, or configurations, run them on real or synthetic datasets, and inspect the results in Orq.ai when you need deeper drill‑down.

Generate reusable synthetic datasets

Ask Claude Code to create challenging synthetic test cases for a workflow, such as contract analysis or support tickets, and prepare them for use as reusable Orq.ai datasets. Then reuse them across evals and experiments.

Debug production regressions

When something breaks, stay in Claude Code. Pull recent traces for a deployment, filter by failure pattern, and use those examples to guide prompt or model changes backed by experiments and evals.

Claude Code alone vs. Claude Code with Orq.ai MCP

Capability	Claude Code alone	Claude Code + Orq.ai MCP
Query production LLM traces	No built‑in view into Orq.ai’s observability data.	Ask Claude Code to list, filter, and group Orq.ai traces (failures, slow runs, agent tool calls, etc.).
Run experiments on prompts	Can iterate on prompts manually, but no native experiment tracking in Orq.ai.	Create and run Orq.ai experiments comparing prompts, models, or configs against datasets, directly from the terminal.
Generate synthetic eval data	You can prompt Claude Code to generate examples, then copy/paste them elsewhere.	Generate synthetic test cases and save them as reusable Orq.ai datasets for evals and experiments.
Pull cost and usage analytics	No view into Orq router or deployment analytics.	Query Orq.ai’s cost, usage, and performance metrics for models and deployments via MCP tools.
Run evaluators on datasets	No built‑in concept of Orq evaluators or datasets.	Work with Orq evaluators and datasets from Claude Code, depending on the MCP tools enabled.

FAQs

Do I have to use Claude Code to get value from Orq.ai?

No. Orq.ai works on its own through the UI and API. Claude Code is an optional “terminal front‑end” for your workspace. You get the same experiments, evals, and observability in Orq; Claude Code simply lets you drive them from the CLI using natural language.

What can Claude Code see in my Orq.ai workspace, and how is access controlled?

Claude Code only sees what the Orq API key you configure is allowed to access. If you use a project‑level key scoped to a specific workspace or environment, Claude Code can only query traces, experiments, datasets, and deployments inside that scope. Rotate or revoke the key in Orq to instantly cut off access.

Can I point Claude Code at different Orq environments (dev, staging, prod)?

Yes. You can generate separate API keys for each Orq project or environment and switch which one Claude Code uses via environment variables or shell profiles. That way, you can run evals and inspect traces in dev or staging first. When ready, switch the same Claude Code setup to the production key.
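One way to switch keys from a shell profile is sketched below. The variable names ORQ_API_KEY_DEV, ORQ_API_KEY_STAGING, and ORQ_API_KEY_PROD and the helper orq_use_env are invented for this example. Note that if the server was registered with a command whose header was written as "Authorization: Bearer ${ORQ_API_KEY}" in double quotes, the shell substituted the key at registration time, so this sketch only selects the key; you would likely need to register the server again (or register one server per environment) for the new key to take effect.

```shell
# Hypothetical helper: select an Orq API key by environment name.
# Assumption: keys are stored in ORQ_API_KEY_DEV, ORQ_API_KEY_STAGING and
# ORQ_API_KEY_PROD (names invented for this example).
orq_use_env() {
  case "${1:-}" in
    dev)     key="${ORQ_API_KEY_DEV:-}" ;;
    staging) key="${ORQ_API_KEY_STAGING:-}" ;;
    prod)    key="${ORQ_API_KEY_PROD:-}" ;;
    *)
      echo "usage: orq_use_env dev|staging|prod" >&2
      return 2
      ;;
  esac
  if [ -z "$key" ]; then
    echo "No key configured for '$1'." >&2
    unset key
    return 1
  fi
  export ORQ_API_KEY="$key"
  unset key
  echo "ORQ_API_KEY now points at the $1 key."
}
```

Call it as orq_use_env staging in the current shell (not in a subshell) so the exported variable is visible to later commands.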

Create an account and start building today.

Book a demo

Explore docs
