Python Runtime Enforcement Guide

A complete, framework-agnostic reference for enforcing hard runtime control in Python systems.

Companion to External Control Plane, Implementation Guide, and Operational Guarantees.
Audience: Python runtimes (workers, services, jobs)
Goal: explicit stop boundaries
Designed for LLM-assisted implementation
What this is

This page is a complete Python-first manual for implementing MachineID runtime enforcement without relying on any specific agent framework.

Use this guide if you have:
  • Workers that pull from queues
  • Long-running loops
  • Scheduled jobs (cron)
  • Webhooks or event consumers
  • Any runtime that can multiply execution surfaces
MachineID provides external authority. Your runtime must validate before it executes.
Core invariant

Everything reduces to a single invariant:

Register. Validate. Work.

If validation fails, work does not begin.

This invariant is intentionally simple. Its strength comes from being external, enforceable, and non-negotiable.

Fast start
1) Get a key

Generate a free org key (supports up to 3 devices): machineid.io

2) Know where you will stop

Decide your validation boundaries before you integrate. If boundaries are vague, enforcement becomes vague.

3) Use one of the two integration paths
  • Path A (recommended): Python SDK: machineid-io/python-sdk
  • Path B: Direct HTTP to canonical endpoints (POST register + validate)
Identity model

MachineID enforces permission on an identity (a “device”) that represents an execution surface. In Python, an execution surface is usually one of:

  • One worker process
  • One queue consumer replica
  • One scheduled job identity
  • One tool runner / side-effect executor
Correct mapping
  • Good: one worker instance = one device
  • Good: one agent runtime = one device
  • Risky: one entire fleet = one device (loses surgical control)
Device ID strategy

Device IDs should be stable enough to audit, but specific enough to revoke. A practical format:

{service}:{env}:{role}:{instance}

Examples:

  • worker:prod:queue-consumer:07
  • job:prod:nightly-reindex:01
  • agent:dev:planner:01
  • tool:prod:payment-runner:02
Important: do not embed secrets in device IDs. Treat IDs as identifiers, not credentials.
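
As a sketch, the ID can be assembled from deployment metadata at startup; the environment variable names here are assumptions, not a required convention:

import os

service = os.environ.get("SERVICE_NAME", "worker")
env = os.environ.get("DEPLOY_ENV", "dev")
role = os.environ.get("WORKER_ROLE", "queue-consumer")
instance = os.environ.get("INSTANCE_ID", "01")  # must be stable per replica

device_id = f"{service}:{env}:{role}:{instance}"
# e.g. "worker:prod:queue-consumer:07"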
Endpoints

MachineID uses canonical POST endpoints. Send your org key via x-org-key.

Register (idempotent)
POST https://machineid.io/api/v1/devices/register
Headers:
  x-org-key: org_...
Body:
  {"deviceId":"worker:prod:queue-consumer:07"}
Validate (hard gate)
POST https://machineid.io/api/v1/devices/validate
Headers:
  x-org-key: org_...
Body:
  {"deviceId":"worker:prod:queue-consumer:07"}
Decision semantics

Validation returns a decision including allowed, a stable code, and a request_id. Your runtime must treat allowed:false as an immediate stop condition.

Required behavior
  • If allowed is false: do not begin work
  • Log code + request_id for auditability
  • Exit the loop / stop consuming / terminate the job
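A minimal sketch of that behavior, reusing the validate helper sketched above:

import logging
import sys

val = validate(device_id)
if not val.allowed:
    # Record the stable code and request_id for the audit trail, then stop.
    logging.error("machineid deny: code=%s request_id=%s", val.code, val.request_id)
    sys.exit(1)
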
Where to validate

Validation belongs at execution boundaries: points where work begins or commits side effects. Most production Python systems already have these boundaries; they are simply not treated as enforcement points.

  • Startup boundary: before an instance begins processing work
  • Task boundary: before each unit of work (job/message/task)
  • Side-effect boundary: before irreversible actions (writes, payments, emails)
Revocation and org-wide disable take effect on the next validate. Boundary placement determines stop latency.
Boundary patterns

Pattern A: Startup gate (required)

Register once, validate once, then start the work loop. If validation fails, exit immediately.

import sys

register(device_id)        # idempotent: safe to repeat on every boot
val = validate(device_id)  # hard gate before any work begins
if not val.allowed:
    sys.exit(1)            # fail closed

start_work_loop()

Pattern B: Per-unit-of-work gate (recommended)

Validate before each job/message/task. This makes revocation effective at predictable points.

import sys

while True:
    job = dequeue()

    val = validate(device_id)  # gate after dequeue, before any processing
    if not val.allowed:
        sys.exit(1)            # revocation takes effect at this boundary

    process(job)

Pattern C: Side-effect gate (high leverage)

Validate immediately before irreversible actions: writes, transfers, external API calls, outbound messages.

import sys

val = validate(device_id)  # gate immediately before the irreversible action
if not val.allowed:
    sys.exit(1)            # fail closed: no permission, no side effect

commit_side_effect()
Reality: Most production systems combine Startup + Per-unit-of-work, and add Side-effect gates for the highest-risk actions.
Long-running loops

MachineID does not introspect internal loops. If a process runs for hours, you must define enforcement boundaries inside the loop.

Good boundaries are:
  • Before each iteration that triggers a tool call
  • Before each external request that can incur cost
  • Before each write or irreversible side effect
  • Before entering a high-cost sub-loop (fan-out, recursion, batch runs)
Rule of thumb: if an action can cost money or change state, put a validation boundary immediately before it.
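A sketch of those boundaries inside an hours-long loop. guard, has_work, next_step, run_tool, and commit are illustrative placeholders wrapped around the validate helper from earlier:

import sys

def guard(val):
    if not val.allowed:
        sys.exit(1)  # fail closed at every boundary

while has_work():
    guard(validate(device_id))       # loop re-entry boundary
    step = next_step()

    if step.calls_tool:
        guard(validate(device_id))   # before a cost-incurring tool call
        step.result = run_tool(step)

    if step.writes_state:
        guard(validate(device_id))   # before the irreversible side effect
        commit(step)
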
Side-effect gates (the most leverage)

Side effects are where runtime mistakes become expensive or irreversible: payments, emails, writes, deletes, credential rotations, outbound API calls, queue publishes, deployments.

Recommended practice
  • Treat side-effect functions as privileged surfaces
  • Validate immediately before the side effect
  • Fail closed: if you cannot confirm permission, do not execute
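One way to keep the gate from being forgotten is to mark privileged functions with a decorator. This sketch assumes the validate helper from earlier; requires_permission and send_payment are illustrative names:

import functools
import sys

def requires_permission(func):
    # Validate immediately before the wrapped side effect; fail closed.
    @functools.wraps(func)
    def gated(*args, **kwargs):
        if not validate(device_id).allowed:
            sys.exit(1)  # no confirmed permission, no execution
        return func(*args, **kwargs)
    return gated

@requires_permission
def send_payment(payment):
    ...  # the irreversible action runs only after a fresh allow
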
Timeouts

Treat validation like a safety-critical call: it must be fast, and it must fail safely.

Recommended default: short client timeout (for example, 1–3 seconds) and fail closed.
Network failures

Decide your failure policy up front. Do not allow “it depends” behavior per team, per service, or per environment. Enforcement must be consistent.

Recommended policy: fail closed
  • If validate fails (timeout/network): treat as allowed:false
  • Exit the process or stop the worker loop
  • Surface the failure via logs/alerts
This policy makes your system controllable under uncertainty. If you cannot confirm permission, you do not execute.
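The validate helper sketched earlier already encodes this policy; reduced to its essence (the exception class is from the requests library, VALIDATE_URL is a placeholder):

import sys
import requests

try:
    resp = requests.post(VALIDATE_URL,
                         headers={"x-org-key": ORG_KEY},
                         json={"deviceId": device_id},
                         timeout=3)
    resp.raise_for_status()
    allowed = bool(resp.json()["allowed"])
except requests.RequestException:
    allowed = False  # timeout, connection error, HTTP error: all collapse to deny

if not allowed:
    sys.exit(1)
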
No degraded mode

The guarantee is binary. Avoid patterns like:

  • Proceed anyway but log a warning
  • Continue for 10 minutes until checks recover
  • Fallback to internal flags when validation is down

Those patterns create a second authority path inside the runtime. That is precisely what an external control plane avoids.

Examples: up to 3 devices

The free tier is enough to implement a complete control loop and prove stop behavior end-to-end. A practical three-device model:

Suggested 3-device topology
  • worker:dev:queue-consumer:01 — consumes and processes jobs
  • tool:dev:external-call-runner:01 — tool-heavy or cost-heavy actions
  • job:dev:nightly-maintenance:01 — scheduled work identity
Prove control
  • Start the worker loop with Startup + Per-job validate
  • Revoke tool:dev:external-call-runner:01 in the dashboard
  • Confirm the next validate boundary stops tool execution
  • Restore the device and confirm work resumes

Dashboard: machineid.io/dashboard

Examples: up to 25 devices

This tier supports multiple concurrent workers and multiple execution surfaces for tool calls and side effects.

Example topology
  • 10 queue consumers: worker:prod:queue-consumer:{01..10}
  • 8 tool runners: tool:prod:external-call-runner:{01..08}
  • 4 schedulers: job:prod:nightly:{01..04}
  • 3 side-effect executors: effect:prod:email:01, effect:prod:payments:01, effect:prod:writes:01
At this tier, tool-call and side-effect gates become the primary stop points in real operations.
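
A sketch of generating this topology's per-replica IDs in code (the pool helper is illustrative):

def pool(prefix: str, count: int) -> list[str]:
    # pool("worker:prod:queue-consumer", 10) -> [...:01, ..., ...:10]
    return [f"{prefix}:{i:02d}" for i in range(1, count + 1)]

devices = (
    pool("worker:prod:queue-consumer", 10)
    + pool("tool:prod:external-call-runner", 8)
    + pool("job:prod:nightly", 4)
    + ["effect:prod:email:01", "effect:prod:payments:01", "effect:prod:writes:01"]
)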
Examples: up to 250 devices

At this scale, execution surfaces multiply under load: autoscaling, retries, event storms, and distributed consumers. The dominant requirement is consistent enforcement across replicas and across time.

Patterns that drive device count
  • Autoscaling worker pools (many replicas)
  • Multiple queues or topic partitions
  • Per-tenant or per-workflow identities
  • Dedicated side-effect runners (payments/email/write)
  • Long-running loops with periodic tool calls
Examples: up to 1000 devices

Near the upper end of standard caps, the dominant failure mode is execution multiplication: autoscale, fan-out, retries, recursion, and delayed resumes.

Scale guidance
  • Prefer per-replica identities (avoid a single “fleet” identity)
  • Validate at boundaries that occur during real work (tool + side-effect)
  • Avoid fallback authority paths (no degraded enforcement)
  • Treat stop boundaries as part of your operational design
Custom device limits

If you need device limits beyond standard tiers, MachineID supports custom device limits. This does not require changes to runtime code — the identity model and enforcement boundaries remain the same.

Dashboard controls

The console lives outside your runtime so control does not depend on the process cooperating: machineid.io/dashboard

Controls available
  • Revoke / restore devices (including bulk)
  • Remove devices
  • Register devices
  • Rotate key
  • Lost key recovery via magic link
  • Org-wide disable (stop validates across the org)
Practical advantage: a human can revoke or disable from a separate device while the runtime continues running elsewhere.
Org-wide disable (stop everything)

In addition to revoking individual devices, MachineID supports an org-wide disable control. This is a deliberate stop mechanism that affects validate outcomes across the org.

Operational semantics
  • Org disable does not change device revoked/restored states
  • It causes validate decisions to deny across the org
  • It takes effect at the next validation boundary you defined
Stop latency (what actually stops, and when)

Revoke/restore and org disable are effective at the next validate. Stop latency is determined by boundary placement.

Make stop control operationally useful
  • Validate before tool calls
  • Validate before side effects
  • Validate at loop re-entry points
  • Validate before resuming long-paused work
Plan behavior (enforcement timing)
Enforcement timing
  • Plan caps are enforced immediately on upgrade and downgrade
  • Cancellation at end of billing cycle is enforced as the cycle ends
  • Moving from paid to free retains the org and re-caps back to 3 devices
Your runtime code does not change when plan state changes. Enforcement is external; validate outcomes reflect current authority.
LLM implementation prompts (step-by-step plans)

The prompts below are designed to produce practical integration plans with minimal guesswork. Replace bracketed placeholders and paste into your LLM of choice.

Prompt 1 — Integrate MachineID into my Python worker runtime
I have a Python worker/service runtime and I want hard enforcement using MachineID.io.

Context:
- Org key: [PASTE ORG KEY]
- Device ID scheme: {service}:{env}:{role}:{instance}
- Fail-closed policy, short timeout (1–3s)
- Validation boundaries required:
  1) Startup (register + validate)
  2) Before each unit of work (job/message/task)
  3) Before any external tool call that can incur cost
  4) Before any irreversible side effect (writes, payments, emails)
  5) Before entering high-cost loops or fan-out cycles

Please provide:
1) Exact files/functions where validation should be added
2) Copy/paste code blocks (SDK and direct-HTTP variants)
3) A recommended boundary map for my runtime
4) A test plan using the MachineID.io dashboard:
   - revoke a device
   - restore it
   - use org-wide disable
   - verify work stops at the next validate boundary
Prompt 2 — Design a device model by tier
Help me model MachineID devices for my Python system.

Inputs:
- Execution surfaces: [describe workers/queues/cron/tools/side-effects]
- Expected scale: [3 / 25 / 250 / 1000]
- I need readable device IDs and surgical revoke control.

Output:
- Device ID naming scheme
- A concrete list of device IDs for the target tier
- Where to validate (startup / per-work / tool / side-effect / loops)
- A minimal runbook for revoke + org-wide disable
Prompt 3 — Add stop points inside a long-running loop
I have a Python process that can run for hours and includes tool calls and side effects.

Goal:
- Add MachineID.io validate boundaries inside the loop so I can stop it remotely.
- Fail closed on validation timeout or error.

Please provide:
- Exactly where to place validation calls
- A wrapper pattern that is hard to forget
- A test plan using dashboard revoke and org-wide disable
LLM checklist (what a correct integration includes)
A correct implementation should have all of the following
  • Stable device identity per execution surface
  • Startup gating (register + validate, fail closed)
  • At least one stop point during real work (tool-call or side-effect boundary)
  • Short timeout and consistent failure policy
  • Denials logged as operational events (include request_id)
  • A runbook to revoke/restore and use org-wide disable
  • Device model that scales with replicas (no “one fleet device” anti-pattern)