Commit Graph

2 Commits

Author SHA1 Message Date
Hongming Wang
c0a5d842b4 feat(runtime): native_scheduler skip — primitive #3 of 6
When an adapter declares provides_native_scheduler=True (because its
SDK has built-in cron / Temporal-style workflows), the platform's
polling loop must skip firing schedules for that workspace — otherwise
the schedule fires twice (once natively, once via platform). The
native skip preserves observability (next_run_at still advances, the
schedule row stays in the DB, last_run_at would still update) while
moving the FIRE responsibility to the SDK.

Stacked on PR #2139 (idle_timeout_override end-to-end). The
RuntimeMetadata heartbeat block already carries the capability map;
this PR teaches the platform how to read and act on the scheduler bit.

Components:

  - handlers/runtime_overrides.go: extended the cache to store
    capability flags alongside idle timeout. Two heartbeat fields are
    independent — SetIdleTimeout / SetCapabilities each update one
    without stomping the other. Defensive copy on SetCapabilities so
    a caller mutating its map after the call doesn't retroactively
    change cached declarations. Empty entries dropped to avoid stale
    husks.

  - handlers/runtime_overrides.go: new HasCapability(workspaceID, name)
    + ProvidesNativeScheduler(workspaceID) — the latter is the
    package-level adapter the scheduler imports (avoids a
    handlers/scheduler import cycle).

  - handlers/registry.go: heartbeat handler now calls SetCapabilities
    in addition to SetIdleTimeout.

  - scheduler/scheduler.go: NativeSchedulerCheck function-pointer DI
    (mirrors the existing QueueDrainFunc pattern). New() leaves the
    field nil so existing callers preserve today's "always fire"
    behavior. SetNativeSchedulerCheck wires production. tick() drops
    workspaces declaring native ownership before goroutine fan-out;
    advances next_run_at so we don't tight-loop on the same row.

  - cmd/server/main.go: wires handlers.ProvidesNativeScheduler into
    the cron scheduler at server boot.

Tests:
  Go (7 new):
    - SetCapabilitiesAndHas (round-trip)
    - per-workspace isolation (ws-a's declaration doesn't leak to ws-b)
    - nil/empty map clears (adapter dropping the flag restores fallback)
    - SetCapabilities is a defensive copy (caller mutation can't
      retroactively flip cached value)
    - SetIdleTimeout preserves capabilities and vice-versa (two-field
      independence)
    - empty entry deleted (no stale husks)
    - ProvidesNativeScheduler reads the same singleton heartbeat writes
    - SetNativeSchedulerCheck wires the function (scheduler-side)
    - nil-check safety contract for tick

  Python: no change needed — the heartbeat already serializes the
  full capability map via _runtime_metadata_payload (PR #2139). An
  adapter setting RuntimeCapabilities(provides_native_scheduler=True)
  automatically flows through.

Verification:
  - 1308 / 1308 Python pytest pass (unchanged)
  - All Go handlers + scheduler tests pass
  - go build + go vet clean

See project memory `project_runtime_native_pluggable.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 22:47:00 -07:00
Hongming Wang
0d3058585b feat(runtime): adapter-declared idle_timeout_override end-to-end
Capability primitive #2 (task #117). The first cross-cutting capability
where the adapter actually displaces platform behavior — claude-code's
streaming session can legitimately go silent for 8+ minutes during
synthesis + slow tool calls; the platform's hardcoded 5min idle timer
in a2a_proxy.go cancels it mid-flight (the bug PR #2128 patched at
the env-var layer). This PR fixes it at the right layer: the adapter
declares "I need 600s" and the platform's dispatch path honors it.

Wire shape (Python → Go):

  POST /registry/heartbeat
  {
    "workspace_id": "...",
    ...
    "runtime_metadata": {
      "capabilities": {"heartbeat": false, "scheduler": false, ...},
      "idle_timeout_seconds": 600    // optional, omitted = use default
    }
  }

Default behavior preserved: any adapter that doesn't override
BaseAdapter.idle_timeout_override() (returns None by default) sends
no idle_timeout_seconds field; the Go side falls through to
idleTimeoutDuration (env A2A_IDLE_TIMEOUT_SECONDS, default 5min).
Existing langgraph / crewai / deepagents workspaces are unaffected.

Components:

  Python:
  - adapter_base.py: idle_timeout_override() method on BaseAdapter
    returning None (the platform-default sentinel).
  - heartbeat.py: _runtime_metadata_payload() lazy-imports the active
    adapter and assembles the capability + override block. Try/except
    swallows ANY error so heartbeat never breaks because of capability
    discovery — observability outranks capability accuracy.

  Go:
  - models.HeartbeatPayload.RuntimeMetadata (pointer so absent =
    "old runtime, didn't say"; explicit zero-cap = "new runtime,
    declared no native ownership").
  - handlers.runtimeOverrides: in-memory sync.Map cache keyed by
    workspaceID. Populated by the heartbeat handler, consulted on
    every dispatchA2A. Reset on platform restart (worst-case 30s of
    platform-default behavior — acceptable; nothing about overrides
    is correctness-critical).
  - a2a_proxy.dispatchA2A: looks up the override before applyIdle
    Timeout; falls through to global default when absent.

Tests:
  Python (17, all new):
    - RuntimeCapabilities dataclass shape (frozen, defaults, wire keys)
    - BaseAdapter.capabilities() default + override + sibling isolation
    - idle_timeout_override default, positive override, dropped-override
    - Heartbeat metadata producer: default adapter emits all-False,
      native adapter emits flag + override, missing ADAPTER_MODULE
      returns {} (graceful), zero/negative override is omitted from
      wire, exception inside adapter swallowed
  Go (6, all new):
    - SetIdleTimeout + IdleTimeout round-trip
    - Zero/negative duration clears the override
    - Empty workspace_id ignored
    - Replacement (heartbeat overwrites prior value)
    - Reset clears entire cache
    - Concurrent reads + writes (sync.Map invariant)

Verification:
  - 1308 / 1308 workspace pytest pass (was 1300, +8)
  - All Go handlers tests pass (6 new + existing)
  - go vet clean

See project memory `project_runtime_native_pluggable.md` for the
architecture principle this implements.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 22:38:01 -07:00