Implementation Guide | MachineID.io

What this is

This page explains how to implement runtime enforcement in real systems: where to place validation boundaries, how frequently to validate, and how to handle failures without turning enforcement into best-effort policy. It is operational guidance that assumes the guarantees are already understood.

Design principle: enforcement over cooperation

Engineering goal: boundaries that are explicit and testable

Core invariant

Everything reduces to a single invariant:

Register. Validate. Work.

If validation fails, work does not begin.

Your only job as an implementer is to choose and enforce where validation boundaries exist. If the boundary is vague, enforcement becomes vague.

Where to validate

Validation belongs at execution boundaries—points where work begins or commits side effects. Most systems already have these boundaries; they are simply not treated as enforcement points.

Startup boundary: before an instance begins processing work
Task boundary: before each unit of work (job, message, task, tool call)
Side-effect boundary: before committing external actions (writes, payments, emails)

If your system can multiply execution surfaces (replicas, workers, consumers), startup validation is non-negotiable. If your system can trigger real-world effects, side-effect boundaries are where you get the most leverage.

Boundary patterns

Pattern A: Startup gate (required)

Validate once at startup. If allowed is false, do not start the loop. Exit immediately.

register(device_id)
val = validate(device_id)
if not val.allowed:
    exit(1)

start_work_loop()

Pattern B: Per-unit-of-work gate (recommended)

Validate before each job/message/task. This makes revocation effective at predictable points.

while True:
    # Validate before dequeuing so a denied worker never holds an unprocessed job.
    val = validate(device_id)
    if not val.allowed:
        exit(1)

    job = dequeue()
    process(job)

Pattern C: Side-effect gate (high leverage)

Validate immediately before irreversible actions: writes, transfers, external API calls, outbound messages.

val = validate(device_id)
if not val.allowed:
    exit(1)

commit_side_effect()

Reality: You can combine patterns. Most production systems do Startup + Per-unit-of-work, and add Side-effect gates for the highest-risk actions.
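
As a rough illustration of combining the three patterns, the sketch below runs one worker with a gate at each boundary. It is not MachineID's client API: `validate` is a hypothetical callable that returns True only when permission is confirmed, and `Denied`, `gate`, and `run_worker` are names invented for this example.

```python
class Denied(Exception):
    """Raised when validation does not confirm permission at a boundary."""


def gate(validate, device_id, boundary):
    # `validate` is a hypothetical callable; anything other than True is a denial.
    if validate(device_id) is not True:
        raise Denied(boundary)


def run_worker(validate, device_id, jobs, process, commit):
    gate(validate, device_id, "startup")          # Pattern A: before any work
    for job in jobs:
        gate(validate, device_id, "task")         # Pattern B: before each unit of work
        result = process(job)
        gate(validate, device_id, "side-effect")  # Pattern C: before committing
        commit(result)
```

The sketch raises an exception rather than calling exit(1) so that it can be tested; a real entry point would catch `Denied` at the top level and exit.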

Long-running loops

MachineID does not introspect internal loops. If a process runs for hours, you must define enforcement boundaries inside the loop.

Good boundaries are:

Before each iteration that triggers a tool call
Before each external request that can incur cost
Before each write or irreversible side effect
Before entering a high-cost sub-loop (batch runs, fan-out, recursion)

Rule of thumb: if an action can cost money or change state, put a validation boundary immediately before it.
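
One way to apply this rule of thumb inside a long-running loop is to wrap each costly or state-changing action so the check always runs immediately before it. The decorator below is a minimal sketch under assumptions: `validate` is a hypothetical callable returning True only when allowed, and `gated` and `ValidationDenied` are illustrative names, not part of any MachineID library.

```python
import functools


class ValidationDenied(RuntimeError):
    """Raised instead of running a gated action."""


def gated(validate, device_id):
    """Decorator: validate immediately before every call to the wrapped action."""
    def decorator(action):
        @functools.wraps(action)
        def wrapper(*args, **kwargs):
            # Hypothetical `validate`: anything other than True is a denial.
            if validate(device_id) is not True:
                raise ValidationDenied(action.__name__)
            return action(*args, **kwargs)
        return wrapper
    return decorator
```

Decorating the tool-call, write, and fan-out functions keeps the boundary attached to the action itself, so a new call site inside the loop cannot bypass it by accident.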

Timeouts

Treat validation like a safety-critical call: it must be fast, and it must fail safely.

Recommended default: short client timeout (for example, 1–3 seconds) and fail closed (do not proceed when validation cannot be confirmed).

If your runtime cannot tolerate fail-closed for certain workloads, you are describing a different system: one that accepts best-effort enforcement. That is explicitly outside the guarantees.
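
The following sketch shows one way to put a hard deadline around a validation call and fail closed when the deadline passes. It assumes a hypothetical `check(device_id)` callable that performs the remote call and returns True when allowed; `validate_with_deadline` is an illustrative name. Where your HTTP client has its own timeout setting, prefer that and keep this only as an outer bound.

```python
import concurrent.futures


def validate_with_deadline(check, device_id, timeout_s=2.0):
    """Return True only if check(device_id) returns True within timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(check, device_id)
    try:
        return future.result(timeout=timeout_s) is True
    except Exception:
        # Deadline exceeded, network error, or any other failure: fail closed.
        return False
    finally:
        # Do not block on a slow call; the worker thread finishes on its own.
        pool.shutdown(wait=False)
```

A slow check is not cancelled, only abandoned, which is acceptable when the caller exits right after a denial.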

Network failures

Decide your failure policy up front. Do not allow “it depends” behavior per team, per service, or per environment. Enforcement must be consistent.

Recommended policy: fail closed

If the validate call itself fails (timeout or network error): treat the result as allowed:false
Exit the process or stop the worker loop
Surface the failure via logs/alerts

This policy makes your system controllable under uncertainty. If you cannot confirm permission, you do not execute.
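
A fail-closed policy can be kept consistent by routing every check through one shared helper. The sketch below is an assumption-laden example rather than a prescribed implementation: `check` is a hypothetical callable that may raise on timeout or network failure, and the logger name and message format are placeholders.

```python
import logging
import sys

log = logging.getLogger("enforcement")


def require_allowed(check, device_id):
    """Return normally only when permission is confirmed; otherwise stop the process."""
    try:
        allowed = check(device_id) is True
        reason = "denied"
    except Exception as exc:
        # A timeout or network failure is treated exactly like allowed:false.
        allowed = False
        reason = f"validation_unavailable:{type(exc).__name__}"
    if not allowed:
        log.error("validation failed device_id=%s reason=%s", device_id, reason)
        sys.exit(1)
```

Because there is a single code path for "denied" and "could not confirm", no team or environment can introduce a softer variant without changing this helper.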

No degraded mode

The guarantee is binary. Avoid patterns like:

“Proceed anyway but log a warning”
“Continue for 10 minutes until checks recover”
“Fall back to internal flags when validation is down”

Those patterns create a second authority path inside the runtime. That is precisely what an external control plane avoids.

Identity model

MachineID enforces permission on an identity (a “device”) representing an execution surface. This identity should map to a specific runtime instance or logical worker identity—not a whole service.

Good: one agent instance = one device
Good: one worker replica = one device
Risky: one entire cluster = one device (loses surgical control)

Device ID strategy

Device IDs should be stable enough to audit, but specific enough to revoke. A common pattern is:

{service}:{env}:{role}:{instance}

Examples:

agent:prod:planner:01
worker:prod:queue-consumer:07
job:prod:nightly-reindex:01

Important: Do not embed secrets in device IDs. Treat IDs as identifiers, not credentials.
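
A small helper can keep device IDs in the {service}:{env}:{role}:{instance} shape shown above. This is a sketch only: `build_device_id` is a hypothetical name, and the allowed character set and two-digit zero padding are assumptions inferred from the examples, not documented rules.

```python
import re

# Assumed character set for each field; adjust to your own naming rules.
_FIELD = re.compile(r"[a-z0-9-]+")


def build_device_id(service, env, role, instance):
    """Compose '{service}:{env}:{role}:{instance}' with a zero-padded instance number."""
    for name, value in (("service", service), ("env", env), ("role", role)):
        if not _FIELD.fullmatch(value):
            raise ValueError(f"invalid {name}: {value!r}")
    if instance < 0:
        raise ValueError("instance must be non-negative")
    return f"{service}:{env}:{role}:{instance:02d}"
```

Rejecting the separator inside a field keeps every ID parseable back into its four parts during an audit.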

Rate and frequency

Choose validation frequency based on risk:

Low risk: validate at startup, then at coarse task boundaries (for example, once per batch)
Medium risk: validate at startup and per unit of work
High risk: validate at startup + per unit of work + before side effects

If you are running an agent that can spend money, write production data, or trigger external actions, validate at every side-effect boundary.

Logging and audit

Enforcement without observability is operationally painful. At minimum, log:

Device ID
Validation result (allowed)
Reason/error (when denied)
Boundary type (startup / task / side-effect)

Keep logs structured. Treat enforcement denials as first-class operational events.
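
The minimum fields listed above fit naturally into one JSON object per validation decision. The sketch below assumes nothing about MachineID's response format; `enforcement_event` and the field names (`event`, `ts`, `reason`, and so on) are hypothetical choices for illustration.

```python
import json
import time


def enforcement_event(device_id, allowed, boundary, reason=None):
    """Build one JSON log line for a validation decision."""
    if boundary not in ("startup", "task", "side-effect"):
        raise ValueError(f"unknown boundary: {boundary!r}")
    event = {
        "event": "validation",
        "ts": time.time(),
        "device_id": device_id,
        "allowed": allowed,
        "boundary": boundary,
    }
    if not allowed:
        # Denials always carry a reason so they can be alerted on and triaged.
        event["reason"] = reason or "unspecified"
    return json.dumps(event, sort_keys=True)
```

Emitting the same shape for allowed and denied results makes it possible to alert on denials with a single filter on the `allowed` field.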

Example: agents

Validate at startup, then before each tool call or task step. If denied, exit immediately.

register(agent_id)

if not validate(agent_id).allowed:
    exit(1)

for step in plan:
    if not validate(agent_id).allowed:
        exit(1)

    run_step(step)

Example: workers

Validate before pulling work. If denied, stop consuming.

register(worker_id)

if not validate(worker_id).allowed:
    exit(1)

while True:
    if not validate(worker_id).allowed:
        exit(1)

    job = dequeue()
    process(job)

Example: scheduled jobs

Validate at job start. If denied, exit before doing any work.

register(job_id)

if not validate(job_id).allowed:
    exit(1)

run_job()

Example: event consumers

Validate before consuming or before handling each message, depending on risk and throughput.

register(consumer_id)

if not validate(consumer_id).allowed:
    exit(1)

while True:
    msg = read_message()

    if not validate(consumer_id).allowed:
        exit(1)

    handle(msg)

Example: webhooks

Validate before triggering any downstream side effects.

register(handler_id)

def handle_webhook(request):
    # Deny the request before any downstream side effect runs.
    if not validate(handler_id).allowed:
        return 403

    commit_side_effect()
    return 200

Checklist

Every execution surface has a stable device identity
Startup validation is enforced and fails closed
Boundaries are explicit (task / side-effect)
Revocation becomes effective at predictable checkpoints
Timeouts are short and consistent
Denials are logged and actionable

If your system can multiply execution, and you cannot reliably stop it, you do not have control.
