Commit Graph

2 Commits

Author SHA1 Message Date
Hongming Wang
0d3058585b feat(runtime): adapter-declared idle_timeout_override end-to-end
Capability primitive #2 (task #117). The first cross-cutting capability
where the adapter actually displaces platform behavior — claude-code's
streaming session can legitimately go silent for 8+ minutes during
synthesis + slow tool calls; the platform's hardcoded 5min idle timer
in a2a_proxy.go cancels it mid-flight (the bug PR #2128 patched at
the env-var layer). This PR fixes it at the right layer: the adapter
declares "I need 600s" and the platform's dispatch path honors it.

Wire shape (Python → Go):

  POST /registry/heartbeat
  {
    "workspace_id": "...",
    ...
    "runtime_metadata": {
      "capabilities": {"heartbeat": false, "scheduler": false, ...},
      "idle_timeout_seconds": 600    // optional, omitted = use default
    }
  }

Default behavior preserved: any adapter that doesn't override
BaseAdapter.idle_timeout_override() (returns None by default) sends
no idle_timeout_seconds field; the Go side falls through to
idleTimeoutDuration (env A2A_IDLE_TIMEOUT_SECONDS, default 5min).
Existing langgraph / crewai / deepagents workspaces are unaffected.

Components:

  Python:
  - adapter_base.py: idle_timeout_override() method on BaseAdapter
    returning None (the platform-default sentinel).
  - heartbeat.py: _runtime_metadata_payload() lazy-imports the active
    adapter and assembles the capability + override block. Try/except
    swallows ANY error so heartbeat never breaks because of capability
    discovery — observability outranks capability accuracy.

  Go:
  - models.HeartbeatPayload.RuntimeMetadata (pointer so absent =
    "old runtime, didn't say"; explicit zero-cap = "new runtime,
    declared no native ownership").
  - handlers.runtimeOverrides: in-memory sync.Map cache keyed by
    workspaceID. Populated by the heartbeat handler, consulted on
    every dispatchA2A. Reset on platform restart (worst-case 30s of
    platform-default behavior — acceptable; nothing about overrides
    is correctness-critical).
  - a2a_proxy.dispatchA2A: looks up the override before applyIdle
    Timeout; falls through to global default when absent.

Tests:
  Python (17, all new):
    - RuntimeCapabilities dataclass shape (frozen, defaults, wire keys)
    - BaseAdapter.capabilities() default + override + sibling isolation
    - idle_timeout_override default, positive override, dropped-override
    - Heartbeat metadata producer: default adapter emits all-False,
      native adapter emits flag + override, missing ADAPTER_MODULE
      returns {} (graceful), zero/negative override is omitted from
      wire, exception inside adapter swallowed
  Go (6, all new):
    - SetIdleTimeout + IdleTimeout round-trip
    - Zero/negative duration clears the override
    - Empty workspace_id ignored
    - Replacement (heartbeat overwrites prior value)
    - Reset clears entire cache
    - Concurrent reads + writes (sync.Map invariant)

Verification:
  - 1308 / 1308 workspace pytest pass (was 1300, +8)
  - All Go handlers tests pass (6 new + existing)
  - go vet clean

See project memory `project_runtime_native_pluggable.md` for the
architecture principle this implements.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 22:38:01 -07:00
Hongming Wang
205a454c09 feat(runtime): RuntimeCapabilities dataclass + BaseAdapter.capabilities()
Foundation primitive for the native+pluggable runtime principle (task
#117, blocks #87). Lets each adapter declare which cross-cutting
capabilities it owns natively (heartbeat, scheduler, durable session,
status mgmt, retry, activity decoration, channel dispatch) versus
delegates to the platform's fallback implementation.

Pure additive: every existing adapter inherits BaseAdapter.capabilities()
which returns RuntimeCapabilities() — every flag False — so today's
"platform owns everything" behavior is preserved exactly. Subsequent
PRs land platform-side consumers (idle-timeout override, scheduler
skip, status-transition hook, etc.) one capability at a time.

Why a frozen dataclass instead of class attributes: capabilities are
declared at class-load time and read by the platform on every heartbeat.
A mutable value would let a runtime change capabilities mid-flight,
creating impossible-to-debug state where the platform's idea of who-
owns-heartbeat drifts from the adapter's actual code.

Why a `to_dict()` with explicit short keys: the Go side will read these
from the heartbeat payload by string key. The dict's wire names are
pinned independently of Python field names so a Python-side rename
doesn't silently break the Go consumer (test pins this).

Tests (9 new):
  - is a frozen dataclass (mutation rejected)
  - all 7 default flags are False (load-bearing — flipping any default
    silently moves ownership for langgraph/crewai/deepagents)
  - to_dict() keys are stable wire names (Go contract)
  - BaseAdapter.capabilities() default returns all-False
  - subclass override mechanism works
  - sibling adapters' defaults aren't affected by an override

Verification:
  - 1300/1300 workspace pytest pass (was 1291, +9)
  - Zero behavior change for any existing code path

See project memory `project_runtime_native_pluggable.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 22:17:49 -07:00