Temporary internal migration note: remove this document once the extension-host migration is complete.

# OpenClaw Extension Host Implementation Guide

Date: 2026-03-15

## Purpose

This is the main execution guide for implementing the extension-host and kernel transition.

Use it as the top-level implementation document.

## How We Fix It

Fix this as a staged architectural migration, not a broad refactor.

1. Lock the boundary first by writing the cutover inventory and adding anti-corruption interfaces so no new plugin-specific behavior leaks into the kernel.
2. Introduce source-of-truth extension schema types and the `ResolvedExtension` model while preserving current `openclaw/plugin-sdk/*` loading through minimal compatibility support.
3. Move discovery, policy, provenance, static metadata, and registration ownership into the extension host, including hooks, channels, providers, tools, routes, CLI, setup, services, and slot-backed providers.
4. Prove the path with pilot migrations: `thread-ownership` first for non-channel hook behavior, then `telegram` for channel compatibility.
5. After pilot parity is established, move runtime behavior onto canonical event stages and replace the fragmented tool, provider, and slot-selection paths with one catalog and arbitration model.
6. Remove the legacy plugin runtime as the default path only after the host path has parity and the duplicate legacy systems are gone or explicitly downgraded to compatibility-only shims.

The other docs remain the source of truth for their domains:

- `openclaw-extension-contribution-schema-spec.md`
- `openclaw-extension-host-lifecycle-and-security-spec.md`
- `openclaw-kernel-event-pipeline-spec.md`
- `openclaw-capability-catalog-and-arbitration-spec.md`
- `openclaw-kernel-extension-host-transition-plan.md`

## TODOs

- [ ] Confirm the implementation order and owners for each phase.
- [x] Create the initial code skeleton for kernel and extension-host boundaries.
- [x] Write the initial boundary cutover inventory for every current plugin-owned surface.
- [ ] Keep the boundary cutover inventory updated as surfaces move.
- [ ] Track PRs, migrations, and follow-up gaps by phase.
- [ ] Keep the linked spec TODO sections in sync with implementation progress.
- [ ] Define the detailed pilot migration matrix and parity checks before Phase 3 starts.
- [ ] Mark this guide complete only when the legacy plugin path is no longer the primary runtime path.

## Implementation Status

Current status against this guide:

- Phase 0 has started but is not complete.
- Phase 1 has started but is not complete.
- Phase 2 has started in a narrow, compatibility-preserving form but is not complete.
- Phases 3 through 7 have not started in a meaningful way yet.

What has been implemented so far:

- a new `src/extension-host/*` boundary now exists in code
- active runtime registry ownership moved into `src/extension-host/active-registry.ts`
- `src/plugins/runtime.ts` now acts as a compatibility facade over the host-owned active registry
- registry activation now routes through `src/extension-host/activation.ts`
- initial source-of-truth types landed in `src/extension-host/schema.ts`, including `ResolvedExtension`, `ResolvedContribution`, and `ContributionPolicy`
- static manifest and package metadata are now normalized through host-owned helpers rather than being interpreted only inside plugin-era modules
- `src/plugins/manifest-registry.ts` now carries a normalized `resolvedExtension` alongside the legacy flat manifest record
- `src/extension-host/resolved-registry.ts` now exposes a host-owned resolved-extension registry view
- an initial Phase 0 inventory now exists in `src/extension-host/cutover-inventory.md`
- plugin SDK alias resolution now routes through `src/extension-host/loader-compat.ts`
- loader cache key construction and registry cache control now route through `src/extension-host/loader-cache.ts`
- loader provenance, duplicate-order, and warning policy now route through `src/extension-host/loader-policy.ts`
- loader initial candidate planning and record creation now route through `src/extension-host/loader-records.ts`
- loader entry-path opening and module import now route through `src/extension-host/loader-import.ts`
- loader module-export resolution, config validation, and memory-slot load decisions now route through `src/extension-host/loader-runtime.ts`
- loader post-import planning and `register(...)` execution now route through `src/extension-host/loader-register.ts`
- loader per-candidate orchestration now routes through `src/extension-host/loader-flow.ts`
- loader record-state transitions now route through `src/extension-host/loader-state.ts`, including explicit compatibility `lifecycleState` mapping
- loader final cache, warning, and activation finalization now routes through `src/extension-host/loader-finalize.ts`
- runtime registration normalization has started in `src/extension-host/runtime-registrations.ts` for channel, provider, HTTP-route, gateway-method, tool, CLI, service, command, context-engine, and hook registrations
- several static and lookup consumers now read through the host boundary or resolved-extension model:
  - channel registry and dock lookups
  - message-channel normalization
  - plugin HTTP route registry default lookup
  - discovery and install package metadata parsing
  - channel catalog package metadata parsing
  - plugin skill discovery
  - plugin auto-enable
  - config doc baseline generation
  - config validation indexing

How it has been done:

- by extracting narrow host-owned modules first and making existing plugin modules delegate to them
- by preserving current behavior and import surfaces wherever possible instead of attempting a broad rewrite
- by introducing normalized static records before touching heavy runtime activation paths
- by converting one static consumer at a time so each call site can move without forcing a loader rewrite
- by extracting low-risk runtime registration helpers next and letting `src/plugins/registry.ts` delegate to them as a compatibility facade
- by keeping duplicate enforcement in legacy subsystems only where that logic has not moved yet, such as plugin commands
- by starting loader and lifecycle migration with compatibility helpers for activation and SDK alias resolution before changing discovery or policy behavior
- by moving cache-key construction, cache reads, cache writes, and cache clearing behind host-owned helpers before changing activation-state ownership
- by moving loader-owned policy helpers next, while keeping module loading and enablement flow behavior unchanged
- by moving initial candidate planning and record construction behind host-owned helpers before changing import and registration flow
- by moving entry-path opening and module import behind host-owned helpers before changing cache wiring or lifecycle orchestration
- by moving loader runtime decisions behind host-owned helpers while preserving lazy loading, config validation behavior, and memory-slot policy behavior
- by moving post-import planning and `register(...)` execution behind host-owned helpers before changing entry-path and import flow
- by composing those seams into one host-owned per-candidate orchestrator before changing cache and lifecycle finalization behavior
- by moving loader record-state transitions into host-owned helpers before introducing a full lifecycle state machine
- by moving cache writes, provenance warnings, final memory-slot warnings, and activation into a host-owned loader finalizer before introducing an explicit lifecycle state machine
- by adding explicit compatibility `lifecycleState` mapping on loader-owned plugin records before introducing a full lifecycle state machine
- by moving central readers first, so later lifecycle and compatibility work can land on one boundary instead of many ad hoc call sites
- by adding focused tests for each extracted seam before widening the boundary further

Committed implementation slices so far:

- `6abf6750ee` `Plugins: add extension host registry boundary`
- `1aab89e820` `Plugins: extract loader host seams`
- `7bc3135082` `Plugins: extract loader candidate planning`
- `3a122c95fa` `Plugins: extract loader register flow`
- `fc81454038` `Plugins: extract loader import flow`
- `e1b207f4cf` `Plugins: extract loader candidate orchestration`
- `0c44d8049b` `Plugins: extract loader finalization`
- `33ef55a9ee` `Plugins: add loader lifecycle state mapping`
- `6590e19095` `Plugins: extract loader cache control`
- `89414ed857` `Docs: track extension host migration internally`
- `d8af1eceaf` `Docs: refresh extension host migration status`

What is still missing for these phases:

- keeping the cutover inventory current as more surfaces move
- the full lifecycle state machine, remaining explicit activation-state ownership, policy gate, and broad host-owned registries described for Phase 2
- minimal SDK compatibility work beyond preserving current behavior indirectly through existing loading
- any pilot migration, event pipeline, canonical catalog, or arbitration implementation

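
The delegation approach described in this section, where a plugin-era module keeps its import surface but hands ownership to a host-owned module, can be sketched as follows. This is a minimal illustration and not the contents of `src/extension-host/active-registry.ts` or `src/plugins/runtime.ts`; the type `RegistrySketch` and the functions `hostSetActiveRegistry`, `hostGetActiveRegistry`, and `legacyGetActivePluginRegistry` are hypothetical names.

```typescript
// Hypothetical shape; the real registry type is not shown in this guide.
interface RegistrySketch {
  pluginIds: string[];
}

// Host-owned state: the single owner of the active registry. This stands in
// for the role played by src/extension-host/active-registry.ts.
let activeRegistrySketch: RegistrySketch | null = null;

function hostSetActiveRegistry(registry: RegistrySketch): void {
  activeRegistrySketch = registry;
}

function hostGetActiveRegistry(): RegistrySketch | null {
  return activeRegistrySketch;
}

// Legacy-facing facade: it keeps the old call surface but holds no state of
// its own. This stands in for the role played by src/plugins/runtime.ts.
function legacyGetActivePluginRegistry(): RegistrySketch | null {
  return hostGetActiveRegistry();
}
```

Because the facade holds no state, existing callers can stay on the old import while new readers move to the host module one call site at a time.
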
## Implementation Order

Implement phases in this order:

1. Phase 0: boundary inventory and anti-corruption layer
2. Phase 1: contribution schema, package metadata, and minimal SDK compatibility
3. Phase 2: extension host lifecycle and registries
4. Phase 3: broader legacy compatibility bridges
5. Phase 4: canonical event pipeline
6. Phase 5: capability catalog migration
7. Phase 6: arbitration migration
8. Phase 7: broader migration and legacy removal

This order matters because each layer depends on the previous one:

- catalogs depend on normalized contributions
- normalized contributions depend on host discovery and validation
- existing extensions must keep loading while the schema and SDK boundary changes
- migrated hooks depend on the canonical event pipeline
- install, onboarding, and status flows depend on static metadata before runtime activation
- catalogs and arbitration already exist in partial forms, so their phases are migrations, not greenfield work
- safe removal of legacy paths depends on compatibility coverage and parity checks

## Implementation Guardrails

Do not implement every abstraction in the docs in the first cut.

Treat some parts of the design as ceilings rather than immediate scope:

- event taxonomy should start with three execution modes only:
  - parallel observers
  - sequential merge or decision handlers
  - sync transcript hot paths
- permission modes should implement `advisory` and `host-enforced` first
- `sandbox-enforced` should remain a future contract until real isolation exists
- catalog publication should start small:
  - kernel internal catalog
  - kernel agent catalog
  - host operator and static registries
- adapter metadata should stay minimal and parity-driven
- setup flow typing should start with a small result set:
  - config patch
  - credential result
  - status note
  - follow-up action
- canonical action governance should start as one source file plus tests, not a larger process framework
- arbitration should start with:
  - exclusive slot
  - ranked provider
  - parallel provider

The first implementation goal is parity for pilot migrations, not maximum generality.

If a design choice is not required to migrate one channel extension and one non-channel extension safely, defer it.

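
As an illustration of how small the setup-flow result set above can stay, the four result kinds fit in one discriminated union. This is a sketch under assumptions: only the four kinds come from this guide, while `SetupResultSketch`, `describeSetupResult`, and every field name are hypothetical.

```typescript
// Hypothetical field names; only the four result kinds come from this guide.
type SetupResultSketch =
  | { kind: "config-patch"; patch: Record<string, unknown> }
  | { kind: "credential-result"; credentialId: string }
  | { kind: "status-note"; message: string }
  | { kind: "follow-up-action"; action: string };

// One exhaustive switch is enough to consume every result kind, which is the
// point of keeping the set small until the pilots prove it insufficient.
function describeSetupResult(result: SetupResultSketch): string {
  switch (result.kind) {
    case "config-patch":
      return `patch ${Object.keys(result.patch).length} config key(s)`;
    case "credential-result":
      return `credential ${result.credentialId}`;
    case "status-note":
      return `note: ${result.message}`;
    case "follow-up-action":
      return `next: ${result.action}`;
  }
}
```

Adding a fifth kind later would surface as a compile error in every exhaustive consumer, which makes growth of the result set a deliberate step.
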
## Current Runtime Surfaces That Must Be Accounted For

The current plugin system already owns more than runtime activation.

Before implementation starts, write and maintain a cutover inventory for these surfaces:

- manifest loading and static metadata
- package-level install and onboarding metadata
- discovery, provenance, and origin precedence
- config schema and UI hint loading
- typed hooks and legacy hook bridges
- channels and channel lookup
- providers and provider auth/setup flows
- tools and agent-visible tool catalogs
- HTTP routes and gateway methods
- CLI registrars and plugin commands
- services and context-engine registrations
- slot selection and other existing arbitration paths
- status, reload, install, update, and diagnostics surfaces

Do not treat Phase 5 and Phase 6 as new systems built in isolation.

They must absorb and replace the existing partial catalog and arbitration behaviors rather than creating a second source of truth.

## Phase Guide

### Phase 0: Lock the boundary

Goal:

- define the kernel versus extension-host boundary in code and imports
- inventory every current plugin-owned surface that crosses that boundary

Deliverables:

- boundary cutover inventory
- anti-corruption interfaces for host-owned registration surfaces
- initial feature flags for host-path versus legacy-path execution
- directory and import boundaries for kernel and extension-host code

Primary docs:

- `openclaw-kernel-extension-host-transition-plan.md`
- `openclaw-extension-contribution-schema-spec.md`

Exit criteria:

- kernel code does not take new dependencies on legacy plugin shapes
- extension-host directory structure exists
- compatibility-only surfaces are identified
- each current plugin-owned surface is tagged as kernel-owned, host-owned, or compatibility-only
- no new direct writes to global registries are introduced without going through the new boundary

Current implementation status:

- partially implemented
- the code boundary exists in `src/extension-host/*`
- central active-registry ownership now routes through the host boundary
- several central runtime readers now consume the host-owned boundary instead of reading directly from `src/plugins/runtime.ts`
- the initial cutover inventory now exists in `src/extension-host/cutover-inventory.md` and is being updated as surfaces move
- the phase is still incomplete because the remaining loader orchestration, lifecycle ownership, and later compatibility phases have not moved yet

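
The Phase 0 exit criterion that every surface carries an ownership tag can be checked mechanically. The sketch below assumes a hypothetical in-code representation of the inventory (the real inventory is the Markdown file `src/extension-host/cutover-inventory.md`); `InventoryEntrySketch` and `untaggedSurfaces` are illustrative names, and the example tags are placeholders rather than decisions recorded in the inventory.

```typescript
// The three tags come from the Phase 0 exit criteria.
type SurfaceOwnerSketch = "kernel-owned" | "host-owned" | "compatibility-only";

interface InventoryEntrySketch {
  surface: string;
  owner?: SurfaceOwnerSketch; // undefined means the surface is not tagged yet
}

// Returns the surfaces that still block the Phase 0 exit criterion.
function untaggedSurfaces(entries: InventoryEntrySketch[]): string[] {
  return entries
    .filter((entry) => entry.owner === undefined)
    .map((entry) => entry.surface);
}

// Placeholder data: the tags below are examples, not recorded decisions.
const inventorySketch: InventoryEntrySketch[] = [
  { surface: "channels and channel lookup", owner: "host-owned" },
  { surface: "CLI registrars and plugin commands", owner: "compatibility-only" },
  { surface: "slot selection and other existing arbitration paths" },
];
```

A check like this could run in a test so that adding a new surface without a tag fails early.
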
### Phase 1: Define the schema

Goal:

- implement the source-of-truth manifest and contribution types
- preserve existing extension loading while the schema and SDK boundary changes

Primary doc:

- `openclaw-extension-contribution-schema-spec.md`

Deliverables:

- manifest parser
- package metadata parser
- contribution validators
- `ResolvedExtension`
- `ResolvedContribution`
- typed `ContributionPolicy`
- static metadata parser
- new versioned SDK contract surface
- minimal SDK compatibility loading surface
- normalized install and onboarding metadata model

Exit criteria:

- extensions can be normalized into static and runtime sections without activating heavy runtime code
- existing extension SDK imports still resolve through the compatibility loading path

Current implementation status:

- partially implemented
- `ResolvedExtension`, `ResolvedContribution`, and `ContributionPolicy` landed as initial code types
- legacy manifest and package metadata now converge into a normalized `resolvedExtension` record carried by the manifest registry
- discovery, install, and catalog metadata parsing now go through host-owned schema helpers
- partial explicit compatibility now exists through host-owned loader-compat and loader-runtime helpers, but full manifest or contribution validators and a versioned SDK compatibility layer are not implemented yet

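
The first exit criterion, normalizing an extension into static and runtime sections without activating runtime code, can be illustrated with a reduced model. This is only a sketch: the real `ResolvedExtension` type lives in `src/extension-host/schema.ts` and is not reproduced here, so `LegacyManifestSketch`, `ResolvedExtensionSketch`, `resolveExtensionSketch`, and all field names are assumptions.

```typescript
// Hypothetical legacy record; the real flat manifest record is not shown here.
interface LegacyManifestSketch {
  id: string;
  name?: string;
  configSchema?: Record<string, unknown>;
  entry?: string;
}

// Hypothetical reduced shape with separate static and runtime sections.
interface ResolvedExtensionSketch {
  id: string;
  staticMeta: { displayName: string; configSchema: Record<string, unknown> };
  runtimeMeta: { entryPath: string | null };
}

function resolveExtensionSketch(manifest: LegacyManifestSketch): ResolvedExtensionSketch {
  return {
    id: manifest.id,
    staticMeta: {
      displayName: manifest.name ?? manifest.id,
      configSchema: manifest.configSchema ?? {},
    },
    // Only the entry path is recorded; nothing is imported or executed here,
    // so install, onboarding, and status flows can read the record cheaply.
    runtimeMeta: { entryPath: manifest.entry ?? null },
  };
}
```
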
### Phase 2: Build the extension host

Goal:

- implement discovery, validation, policy, registries, and lifecycle ownership

Primary doc:

- `openclaw-extension-host-lifecycle-and-security-spec.md`

Deliverables:

- discovery pipeline
- activation state machine
- policy evaluator
- host-owned registries
- host-owned adapters for hooks, channels, providers, tools, HTTP routes, gateway methods, CLI, services, commands, and context engines
- per-extension state ownership
- provenance and origin handling
- config redaction-aware schema loading
- reload and route ownership handling

Exit criteria:

- the host can load bundled and external extensions into normalized registries
- the host can populate normalized registries without direct kernel writes except through explicit compatibility adapters

Current implementation status:

- partially implemented in a compatibility-preserving form
- the host owns the active registry state
- the host exposes a resolved-extension registry view for static consumers
- plugin skills, plugin auto-enable, and config validation indexing now consume host-owned resolved-extension data
- activation, loader cache control, loader policy, loader candidate planning, loader import flow, loader runtime decisions, loader post-import register flow, loader candidate orchestration, loader record-state helpers, and loader finalization now route through `src/extension-host/*`
- lifecycle state ownership, activation states, policy evaluation, and broad host-owned registries are still not implemented

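
The activation state machine listed among the Phase 2 deliverables is not implemented yet, and its states are defined by the lifecycle spec rather than by this guide. As a rough sketch of the technique only, a transition table keeps illegal state changes out of the host; the state names and allowed transitions below are hypothetical placeholders.

```typescript
// Hypothetical states; the authoritative set belongs to the lifecycle spec.
type ActivationStateSketch = "discovered" | "validated" | "active" | "disabled" | "failed";

// Every legal move is listed once, so an illegal move is simply absent.
const allowedTransitionsSketch: Record<ActivationStateSketch, ActivationStateSketch[]> = {
  discovered: ["validated", "disabled", "failed"],
  validated: ["active", "disabled", "failed"],
  active: ["disabled", "failed"],
  disabled: ["validated"],
  failed: [],
};

// Returns the new state, or throws when the move is not in the table.
function transitionSketch(
  from: ActivationStateSketch,
  to: ActivationStateSketch,
): ActivationStateSketch {
  if (allowedTransitionsSketch[from].indexOf(to) === -1) {
    throw new Error(`illegal activation transition: ${from} -> ${to}`);
  }
  return to;
}
```

Keeping the table as data makes it straightforward to map the existing compatibility `lifecycleState` values onto it later.
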
### Phase 3: Build compatibility bridges

Goal:

- keep current extensions working through the host without leaking legacy contracts into the kernel

Primary docs:

- `openclaw-kernel-extension-host-transition-plan.md`
- `openclaw-extension-contribution-schema-spec.md`

Deliverables:

- `ChannelPlugin` compatibility translators
- plugin SDK compatibility loading
- runtime-channel namespace translation into the new SDK modules
- legacy setup and CLI translation
- legacy config schema and UI hint translation
- pilot migration matrix with explicit parity labels

Exit criteria:

- `thread-ownership` runs through the host path as the first non-channel pilot
- `telegram` runs through the host path as the first channel pilot
- both pilots have explicit parity results for discovery, config, activation, diagnostics, and runtime behavior

### Phase 4: Implement the canonical event pipeline

Goal:

- move runtime hook behavior onto explicit canonical events

Primary doc:

- `openclaw-kernel-event-pipeline-spec.md`

Deliverables:

- event type definitions
- stage runner
- sync transcript-write stages
- bridges from legacy hook buses
- mapping table from existing typed and legacy hooks to canonical stages

Exit criteria:

- migrated extensions can use canonical events without relying directly on old plugin hook execution
- pilot hook behaviors have parity coverage against the pre-host path

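
To make the stage-runner deliverable more concrete, the sketch below shows two of the three execution modes named in the guardrails: observers and sequential merge handlers. The sync transcript hot path is omitted. The payload type and function names are hypothetical, the real event types are defined by the event pipeline spec, and the observers run in a plain loop here even though a real runner might dispatch them concurrently.

```typescript
// Hypothetical payload; real canonical event types belong to the pipeline spec.
interface MessageEventSketch {
  text: string;
  blocked: boolean;
}

type ObserverSketch = (event: MessageEventSketch) => void;
type MergeHandlerSketch = (event: MessageEventSketch) => Partial<MessageEventSketch>;

// Observer stage: handlers cannot change the event, and one failing observer
// does not stop the others. Returns the number of observers that threw.
function runObserversSketch(observers: ObserverSketch[], event: MessageEventSketch): number {
  let failures = 0;
  for (const observer of observers) {
    try {
      observer(event);
    } catch {
      failures += 1; // a real runner would record a diagnostic here
    }
  }
  return failures;
}

// Sequential merge stage: handlers run in order, and each one sees the
// merged result of the handlers before it.
function runSequentialMergeSketch(
  handlers: MergeHandlerSketch[],
  event: MessageEventSketch,
): MessageEventSketch {
  let current = event;
  for (const handler of handlers) {
    current = { ...current, ...handler(current) };
  }
  return current;
}
```

Separating the two handler types keeps observers from silently mutating the event, which matters when comparing behavior against the pre-host hook path.
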
### Phase 5: Implement catalogs

Goal:

- compile runtime-derived agent and internal catalogs, plus host-owned operator catalogs
- replace existing plugin-identity-driven catalog surfaces with canonical family-based catalogs

Primary doc:

- `openclaw-capability-catalog-and-arbitration-spec.md`

Deliverables:

- kernel internal catalog
- kernel agent catalog
- host operator catalog
- static setup and install catalogs
- canonical action registry
- migration plan for existing tool, provider, and setup catalog surfaces

Exit criteria:

- agent-visible tools are compiled from canonical action families instead of plugin identity
- setup and install catalogs no longer depend on duplicated legacy metadata paths

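
The shift from plugin identity to canonical action families can be pictured as a change of grouping key. The sketch below is an assumption-laden illustration, not the catalog design from the spec: `ToolContributionSketch`, `compileAgentCatalogSketch`, and the family and tool names used with it are all hypothetical.

```typescript
// Hypothetical contribution record; field names are illustrative only.
interface ToolContributionSketch {
  extensionId: string;
  family: string;
  toolName: string;
}

// Groups contributions by action family, so the agent-visible catalog is
// keyed by what a tool does rather than by which extension supplied it.
function compileAgentCatalogSketch(
  contributions: ToolContributionSketch[],
): Record<string, string[]> {
  const catalog: Record<string, string[]> = {};
  for (const contribution of contributions) {
    const existing = catalog[contribution.family];
    if (existing === undefined) {
      catalog[contribution.family] = [contribution.toolName];
    } else {
      existing.push(contribution.toolName);
    }
  }
  // Sort each family so the output does not depend on load order.
  for (const family of Object.keys(catalog)) {
    const tools = catalog[family];
    if (tools !== undefined) tools.sort();
  }
  return catalog;
}
```
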
### Phase 6: Implement arbitration

Goal:

- resolve overlap, ranking, selection, and slot conflicts deterministically
- absorb the existing slot and provider selection behavior into canonical arbitration

Primary doc:

- `openclaw-capability-catalog-and-arbitration-spec.md`

Deliverables:

- conflict detection
- provider selection
- slot arbitration
- planner-visible name collision handling
- migration plan for existing slot and name-collision behaviors

Exit criteria:

- at least one multi-provider family works through canonical arbitration
- legacy slot and provider-selection paths no longer act as separate arbitration systems

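
Deterministic arbitration across the three starting modes from the guardrails (exclusive slot, ranked provider, parallel provider) might look like the sketch below. The semantics are assumptions made for illustration: a lower `rank` wins, ties break on extension id, and an explicit selection overrides rank for exclusive slots. None of the names or rules are taken from the arbitration spec.

```typescript
type ArbitrationModeSketch = "exclusive-slot" | "ranked-provider" | "parallel-provider";

interface CandidateSketch {
  extensionId: string;
  rank: number; // assumption: lower rank means higher priority
}

interface ArbitrationResultSketch {
  active: string[];
  fallbacks: string[];
}

function arbitrateSketch(
  mode: ArbitrationModeSketch,
  candidates: CandidateSketch[],
  selectedId?: string,
): ArbitrationResultSketch {
  // Sort by rank, then by id, so the result never depends on load order.
  const ordered = candidates.slice().sort((a, b) => {
    if (a.rank !== b.rank) return a.rank - b.rank;
    return a.extensionId < b.extensionId ? -1 : a.extensionId > b.extensionId ? 1 : 0;
  });
  const ids = ordered.map((candidate) => candidate.extensionId);
  switch (mode) {
    case "exclusive-slot": {
      // Exactly one winner: the explicit selection if present, else the top rank.
      const winner = selectedId !== undefined && ids.indexOf(selectedId) !== -1 ? selectedId : ids[0];
      return { active: winner === undefined ? [] : [winner], fallbacks: [] };
    }
    case "ranked-provider":
      // The top candidate is active; the rest are tried in order on failure.
      return { active: ids.slice(0, 1), fallbacks: ids.slice(1) };
    case "parallel-provider":
      // Every candidate is used together.
      return { active: ids, fallbacks: [] };
  }
}
```
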
### Phase 7: Migrate and remove legacy paths

Goal:

- finish migration and shrink compatibility-only surfaces

Primary docs:

- `openclaw-kernel-extension-host-transition-plan.md`
- all other docs as parity references

Deliverables:

- channel migrations
- non-channel extension migrations
- parity tests
- deprecation markers
- removal plan for obsolete compatibility shims

Exit criteria:

- legacy plugin runtime is no longer the default execution path

## Pilot Matrix

Initial pilot set:

- non-channel pilot: `thread-ownership`
- channel pilot: `telegram`

Why these pilots:

- `thread-ownership` exercises typed hook loading without introducing CLI, HTTP route, or service migration at the same time
- `telegram` exercises the `ChannelPlugin` compatibility path with a minimal top-level plugin registration surface

Second-wave compatibility candidates after the pilots are stable:

- `line` for channel plus command registration
- `device-pair` for command, service, and setup flow coverage

Each pilot must record parity for:

- discovery and precedence
- manifest and static metadata loading
- config schema and UI hints
- enabled and disabled state handling
- activation and reload behavior
- diagnostics and status output
- runtime behavior on the migrated path
- compatibility-only gaps that still remain

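
One way to keep the parity list above honest is to type the per-pilot record so that missing areas are easy to detect. The sketch below is hypothetical: the area names mirror the list, the labels reuse the wording from the tracking rules (full parity, partial parity, compatibility-only), and `PilotParitySketch` and `missingParityAreas` are illustrative names rather than existing code.

```typescript
// The seven behavior areas from the list above; remaining gaps are free text.
const parityAreasSketch = [
  "discovery and precedence",
  "manifest and static metadata loading",
  "config schema and UI hints",
  "enabled and disabled state handling",
  "activation and reload behavior",
  "diagnostics and status output",
  "runtime behavior on the migrated path",
] as const;

type ParityAreaSketch = (typeof parityAreasSketch)[number];
type ParityLabelSketch = "full parity" | "partial parity" | "compatibility-only";

interface PilotParitySketch {
  pilot: string;
  results: Partial<Record<ParityAreaSketch, ParityLabelSketch>>;
  remainingGaps: string[];
}

// Lists the areas a pilot has not recorded a result for yet.
function missingParityAreas(record: PilotParitySketch): ParityAreaSketch[] {
  return parityAreasSketch.filter((area) => record.results[area] === undefined);
}
```
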
## Recommended First Implementation Slice

If you want the lowest-risk start, do this first:

1. write the boundary cutover inventory
2. add source-of-truth types
3. add the static metadata and package metadata parsers
4. add `ResolvedExtension`
5. add minimal SDK compatibility loading
6. add host discovery and validation
7. bring `thread-ownership` through the host path first
8. bring `telegram` through the host path second

Status of this slice:

- step 1 has an initial artifact in `src/extension-host/cutover-inventory.md`, which still has to be kept current as surfaces move
- steps 2 through 6 are underway
- steps 7 and 8 have not started

Concrete landings from this slice:

- the host boundary exists
- source-of-truth schema types exist
- package metadata parsing now routes through the host schema layer
- `ResolvedExtension` exists in code and is attached to manifest-registry records
- host-owned active-registry and resolved-registry views exist
- early static consumers have moved onto the new host-owned data path

Do not start with catalogs or arbitration first.

Also avoid these first-cut traps:

- do not build a broad event scheduling framework before the canonical stages exist
- do not turn permission descriptors into fake sandbox guarantees
- do not build a large operator catalog publication layer before the host registries are real
- do not over-type setup flows before the pilot migrations prove the minimum result model is insufficient

## Tracking Rules

When implementation begins:

- update this guide first with phase status
- update the matching spec TODOs when a domain changes
- record where the implementation intentionally diverged from the spec
- record which behaviors are full parity, partial parity, or compatibility-only
- update the pilot parity matrix whenever a migrated surface changes

## Suggested Status Format

Use one of these status values for each tracked item in each doc when work starts:

- `not started`
- `in progress`
- `implemented`
- `verified`
- `deferred`

For example:

- `ResolvedExtension` registry: `implemented`
- setup fallback removal: `deferred`
- sync transcript-write parity tests: `in progress`