OpenAI Agents Integration Guide

A complete reference for gating tool execution and side effects in OpenAI Agents with MachineID.

Companion to Implementation Guide, Operational Guarantees, and External Control Plane.
  • Primary boundary: tool execution
  • Secondary: side effects + resume
  • Control: revoke/restore + org-wide disable
What this is

This page is a complete, implementation-oriented guide for adding MachineID enforcement to OpenAI Agents: systems where a model can request tool calls and your application executes them.

What you get here
  • Exact enforcement boundaries for hosted agent execution
  • A tool-server gating model (the canonical integration point)
  • Device identity patterns for multi-tool and multi-surface setups
  • Fail-closed behavior and stop latency design
What this is not
  • A “soft guardrails” tutorial
  • A usage metering guide
  • A best-effort policy pattern
MachineID is external authority: if validate fails, work does not begin.
Core invariant

Everything reduces to one invariant:

Register. Validate. Work.

If validation fails, work does not begin.

In OpenAI Agents, your strongest “work begins” boundary is when your application is about to execute a tool call.

Fastest path (prove remote stop works)
Goal
  • Add one hard gate to your tool server
  • Revoke that device in the dashboard
  • Confirm the next tool call is denied (fail closed)
The tool boundary is the stop boundary. If tools are gated, execution becomes controllable even when the agent is hosted.
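Smoke test (sketch)

A minimal loop, assuming the Path A Python SDK described below (MachineID.from_env, validate returning a dict with an allowed flag). Run it, revoke the device in the dashboard, and the next iteration should print a denial.

import os
import time

from machineid import MachineID

m = MachineID.from_env()
DEVICE_ID = os.getenv("MACHINEID_DEVICE_ID", "openai-agent:dev:tool-server:01")

m.register(DEVICE_ID)  # idempotent

while True:
    d = m.validate(DEVICE_ID)
    if d["allowed"]:
        print("allowed")
    else:
        print(f"DENIED: {d.get('code')} {d.get('request_id')}")
    time.sleep(2)  # revoke in the dashboard; watch this flip to DENIED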
What “OpenAI Agents” means in practice

The modern OpenAI direction is the Agents SDK paired with the Responses API tool-calling model: a model requests tools, and your application executes them and returns outputs.

Why MachineID fits
  • You cannot rely on in-process flags inside a hosted agent runtime
  • You can always gate your own tool execution server-side
  • Revoke/disable becomes effective at the next tool boundary
Tool calling flow (the canonical enforcement boundary)

Tool calling is a multi-step interaction: you send the tools the model may call, the model requests a tool, your application executes code, then you send the tool results back.

MachineID placement
  • Validate immediately before executing a tool function
  • Validate again immediately before high-risk side effects (payments/email/writes)
  • Fail closed on timeout/network failure
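Gate placement in the tool loop (sketch)

A sketch of where the gate sits in a Responses API tool loop. The weather tool is illustrative, and must_be_allowed is the hard gate defined in Path A below; verify the exact request/response shapes against the current OpenAI docs.

import json

from openai import OpenAI

client = OpenAI()

def run_tool(name: str, arguments: str) -> str:
    must_be_allowed("before_tool_call")  # hard gate: if denied, the tool never runs
    args = json.loads(arguments)
    if name == "get_weather":  # illustrative tool
        return json.dumps({"city": args["city"], "temp_c": 21})
    raise ValueError(f"unknown tool: {name}")

resp = client.responses.create(
    model="gpt-4.1",
    input="What's the weather in Paris?",
    tools=[{
        "type": "function",
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
)

outputs = []
for item in resp.output:
    if item.type == "function_call":
        outputs.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": run_tool(item.name, item.arguments),
        })

# Return the tool results so the model can finish the turn
resp = client.responses.create(model="gpt-4.1", previous_response_id=resp.id, input=outputs)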
Where to validate (recommended placement)
Recommended baseline
  • Tool boundary (mandatory): validate before each tool executes
  • Side-effect boundary: validate immediately before irreversible actions
  • Resume boundary: validate before resuming paused/delayed execution
Streaming is UX. Tool execution is control.
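Resume boundary (sketch)

The resume boundary in miniature; resume_run and the job object are placeholders, and must_be_allowed comes from Path A below.

def resume_run(job):
    # Authority may have changed while the job was paused: re-check before resuming.
    must_be_allowed("before_resume")
    return job.continue_from_checkpoint()  # placeholder for your resume logic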
Stop latency (what stops, and when)

Revoke/disable is realized on the next validate. In Agents systems, that typically means: the next time the model requests a tool call and your server is about to execute it.

Make stop control operational
  • Validate before every tool call
  • Validate at loop re-entry points (if you have server-side loops)
  • Validate before side effects even if the tool is already gated
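Loop re-entry (sketch)

If you run a server-side agent loop, validate at each re-entry so a revoke lands mid-run; handle_turn and the task object are placeholders.

def agent_loop(task):
    while not task.done:
        must_be_allowed("loop_reentry")  # revoke/org-disable takes effect here, mid-run
        task = handle_turn(task)         # one model turn plus any gated tool calls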
Assistants vs Responses (avoid building on deprecated surfaces)

OpenAI has deprecated the Assistants API and published a sunset date of August 26, 2026. For new builds, the Responses API + Agents SDK direction is the safer default.

Path A: MachineID Python SDK (recommended)
Install
pip install machineid-io
Minimal hard gate (tool execution boundary)
import os

from machineid import MachineID

m = MachineID.from_env()
DEVICE_ID = os.getenv("MACHINEID_DEVICE_ID", "openai-agent:prod:tool-server:01")

def must_be_allowed(boundary: str):
    """Hard gate: raises unless MachineID allows this device to proceed.

    A timeout or network error in validate() also raises, so the caller
    never executes the tool on failure: fail closed by default.
    """
    m.register(DEVICE_ID)  # idempotent; safe to call at every boundary
    d = m.validate(DEVICE_ID)
    if not d["allowed"]:
        raise RuntimeError(f"Denied at {boundary}: {d.get('code')} {d.get('request_id')}")
    return d
SDK surface area (what exists)
  • register(device_id), validate(device_id)
  • list_devices(), usage()
  • revoke(device_id), unrevoke(device_id) (alias: restore)
  • remove(device_id)
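The same SDK drives the control side; for example, from an operator script:

m.revoke("openai-agent:prod:payments:01")    # next validate for this device denies
m.unrevoke("openai-agent:prod:payments:01")  # alias: restore; next validate allows again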
Path B: Direct HTTP (canonical endpoints)

If you want minimal dependencies, call the canonical endpoints directly using the x-org-key header.

Register
POST https://machineid.io/api/v1/devices/register
Headers:
  x-org-key: org_...
Body:
  {"deviceId":"openai-agent:prod:tool-server:01"}
Validate (canonical)
POST https://machineid.io/api/v1/devices/validate
Headers:
  x-org-key: org_...
Body:
  {"deviceId":"openai-agent:prod:tool-server:01"}
Wrapper patterns (minimal refactor, maximal control)
Tool-call wrapper (canonical)
import functools

def guarded_tool(tool_fn):
    @functools.wraps(tool_fn)  # preserve the tool's name and docstring
    def _inner(*args, **kwargs):
        must_be_allowed("before_tool_call")  # denied or errored: tool never runs
        return tool_fn(*args, **kwargs)
    return _inner
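Usage, assuming a name-to-function dispatch table (the tool functions are illustrative):

TOOLS = {
    "get_weather": guarded_tool(get_weather),  # illustrative tool functions
    "send_email": guarded_tool(send_email),
}

def dispatch(name, arguments):
    return TOOLS[name](**arguments)  # every path goes through the gate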
Side-effect gate (highest risk)
def commit_side_effect(do_effect):
    must_be_allowed("before_side_effect")
    return do_effect()
If you add only one boundary, make it tool execution.
Device ID strategy (Agents-friendly)
openai-agent:{env}:{surface}:{instance}

Examples:

  • openai-agent:dev:tool-server:01
  • openai-agent:prod:tool-server:03
  • openai-agent:prod:payments:01
  • openai-agent:prod:email:01
Rule
  • Device IDs are identifiers, not secrets
  • Map devices to execution surfaces you want to stop independently
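A small helper that derives the ID from deployment metadata; APP_ENV and INSTANCE_ID are illustrative environment variables, not MachineID requirements.

import os

def device_id(surface: str) -> str:
    env = os.getenv("APP_ENV", "dev")          # dev / staging / prod
    instance = os.getenv("INSTANCE_ID", "01")  # replica or node index
    return f"openai-agent:{env}:{surface}:{instance}"

# device_id("payments") -> "openai-agent:dev:payments:01"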
Timeouts and failures (fail closed)
Recommended policy
  • Short client timeout (for example, 1–3 seconds)
  • Timeout/network failure treated as not allowed
  • Tool does not execute; return an error to the agent
“Proceed anyway” creates a second authority path. That breaks external enforcement.
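Returning a denial to the agent (sketch)

One way to satisfy “tool does not execute; return an error to the agent” without crashing the run; TOOLS and must_be_allowed come from the patterns above.

import json

def run_tool_fail_closed(name: str, arguments: dict) -> str:
    try:
        must_be_allowed("before_tool_call")
    except Exception as e:
        # The tool never executes; the model sees a structured denial instead.
        return json.dumps({"error": "execution_denied", "detail": str(e)})
    return TOOLS[name](**arguments)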
Example: up to 3 devices
Suggested 3-device model
  • openai-agent:dev:tool-server:01 — gates all tools for a single environment
  • openai-agent:dev:payments:01 — isolate high-risk side effects
  • openai-agent:dev:email:01 — isolate outbound messaging
Example: up to 25 devices
Example topology
  • 10 tool servers: openai-agent:prod:tool-server:01 through :10
  • 8 specialized tool surfaces: openai-agent:prod:payments:01 through :04, openai-agent:prod:email:01 through :04
  • 7 event/cron triggers: openai-agent:prod:cron:01 through :07
At this tier, “tool boundary everywhere” becomes operationally meaningful stop control.
Example: up to 250 devices
Patterns that drive device count
  • Autoscaling tool servers / multi-region execution
  • Per-tenant or per-workflow tool surfaces
  • High fan-out tool usage under load
Example: up to 1000 devices
Scale guidance
  • Prefer per-replica identity (avoid one fleet device)
  • Validate before tools + before side effects
  • No fallback authority paths
Dashboard controls

The console at machineid.io/dashboard is external to the agent runtime, so you can intervene from anywhere.

Org-wide disable
Operational semantics
  • Does not change device revoked/restored state
  • Causes validate to deny across the org
  • Takes effect at the next validate boundary (tool execution)
What not to do
  • Proceed anyway on validation timeout or error
  • Cache “allowed” decisions for long windows
  • Validate only once at startup
  • Fallback to internal flags as alternate authority
Troubleshooting
Revocation doesn’t stop immediately
  • Your tool server is not validating at tool boundaries
  • Add validate before every tool and before every side effect
Denied decisions
  • Inspect code and request_id
  • Confirm device is not revoked; confirm org-wide disable is not enabled
  • Confirm device cap is not exceeded (new unique device IDs)
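Logging denials (sketch)

A minimal pattern for capturing the decision code and request_id at the boundary; must_be_allowed is the Path A gate above.

import logging

logger = logging.getLogger("machineid")

try:
    must_be_allowed("before_tool_call")
except RuntimeError as e:
    logger.warning("denied: %s", e)  # the message carries code and request_id
    raise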
LLM implementation prompts
Prompt 1 — Gate OpenAI tool execution using MachineID
I have an OpenAI Agents / Responses API integration where the model calls tools and my server executes them.

Constraints:
- I will gate tool execution using MachineID (register + validate).
- Fail closed with short timeout.
- Device ID pattern: openai-agent:{env}:{surface}:{instance}

Required boundaries:
1) Before every tool call
2) Before irreversible side effects (payments/email/writes)
3) Before resuming delayed/paused work

Please provide:
- Exact code locations to add enforcement
- Copy/paste snippets (SDK and direct HTTP variants)
- A test plan: revoke device, restore, org-wide disable, verify stops at next tool boundary
LLM checklist
A correct implementation includes
  • Tool execution is gated (validate before tool runs)
  • Side effects are gated (validate before irreversible actions)
  • Fail-closed policy (timeouts deny)
  • Denials logged (include request_id)
  • Device model supports surgical revoke