Temporary internal migration note: remove this document once the extension-host migration is complete.
OpenClaw Extension Host Implementation Guide
Date: 2026-03-15
Purpose
This is the main execution guide for implementing the extension-host and kernel transition.
Use it as the top-level implementation document.
How We Fix It
Fix this as a staged architectural migration, not a broad refactor.
- Lock the boundary first by writing the cutover inventory and adding anti-corruption interfaces so no new plugin-specific behavior leaks into the kernel.
- Introduce source-of-truth extension schema types and the ResolvedExtension model while preserving current openclaw/plugin-sdk/* loading through minimal compatibility support.
- Move discovery, policy, provenance, static metadata, and registration ownership into the extension host, including hooks, channels, providers, tools, routes, CLI, setup, services, and slot-backed providers.
- Prove the path with pilot migrations: thread-ownership first for non-channel hook behavior, then telegram for channel compatibility.
- After pilot parity is established, move runtime behavior onto canonical event stages and replace the fragmented tool, provider, and slot-selection paths with one catalog and arbitration model.
- Remove the legacy plugin runtime as the default path only after the host path has parity and the duplicate legacy systems are gone or explicitly downgraded to compatibility-only shims.
The other docs remain the source of truth for their domains:
- openclaw-extension-contribution-schema-spec.md
- openclaw-extension-host-lifecycle-and-security-spec.md
- openclaw-kernel-event-pipeline-spec.md
- openclaw-capability-catalog-and-arbitration-spec.md
- openclaw-kernel-extension-host-transition-plan.md
TODOs
- Confirm the implementation order and owners for each phase.
- Create the initial code skeleton for kernel and extension-host boundaries.
- Write the initial boundary cutover inventory for every current plugin-owned surface.
- Keep the boundary cutover inventory updated as surfaces move.
- Track PRs, migrations, and follow-up gaps by phase.
- Keep the linked spec TODO sections in sync with implementation progress.
- Define the detailed pilot migration matrix and parity checks before Phase 3 starts.
- Mark this guide complete only when the legacy plugin path is no longer the primary runtime path.
Implementation Status
Current status against this guide:
- Phase 0 has started but is not complete.
- Phase 1 has started but is not complete.
- Phase 2 has started in a narrow, compatibility-preserving form but is not complete.
- Phases 3 through 7 have not started in a meaningful way yet.
What has been implemented so far:
- a new src/extension-host/* boundary now exists in code
- active runtime registry ownership moved into src/extension-host/active-registry.ts
- src/plugins/runtime.ts now acts as a compatibility facade over the host-owned active registry
- registry activation now routes through src/extension-host/activation.ts
- initial source-of-truth types landed in src/extension-host/schema.ts, including ResolvedExtension, ResolvedContribution, and ContributionPolicy
- static manifest and package metadata are now normalized through host-owned helpers rather than being interpreted only inside plugin-era modules
- src/plugins/manifest-registry.ts now carries a normalized resolvedExtension alongside the legacy flat manifest record
- src/extension-host/resolved-registry.ts now exposes a host-owned resolved-extension registry view
- an initial Phase 0 inventory now exists in src/extension-host/cutover-inventory.md
- plugin SDK alias resolution now routes through src/extension-host/loader-compat.ts
- loader cache key construction and registry cache control now route through src/extension-host/loader-cache.ts
- loader provenance, duplicate-order, and warning policy now route through src/extension-host/loader-policy.ts
- loader initial candidate planning and record creation now route through src/extension-host/loader-records.ts
- loader entry-path opening and module import now route through src/extension-host/loader-import.ts
- loader module-export resolution, config validation, and memory-slot load decisions now route through src/extension-host/loader-runtime.ts
- loader post-import planning and register(...) execution now route through src/extension-host/loader-register.ts
- loader per-candidate orchestration now routes through src/extension-host/loader-flow.ts
- loader record-state transitions now route through src/extension-host/loader-state.ts, including explicit compatibility lifecycleState mapping
- loader final cache, warning, and activation finalization now routes through src/extension-host/loader-finalize.ts
- runtime registration normalization has started in src/extension-host/runtime-registrations.ts for channel, provider, HTTP-route, gateway-method, tool, CLI, service, command, context-engine, and hook registrations
- several static and lookup consumers now read through the host boundary or resolved-extension model:
  - channel registry and dock lookups
  - message-channel normalization
  - plugin HTTP route registry default lookup
  - discovery and install package metadata parsing
  - channel catalog package metadata parsing
  - plugin skill discovery
  - plugin auto-enable
  - config doc baseline generation
  - config validation indexing
How it has been done:
- by extracting narrow host-owned modules first and making existing plugin modules delegate to them
- by preserving current behavior and import surfaces wherever possible instead of attempting a broad rewrite
- by introducing normalized static records before touching heavy runtime activation paths
- by converting one static consumer at a time so each call site can move without forcing a loader rewrite
- by extracting low-risk runtime registration helpers next and letting src/plugins/registry.ts delegate to them as a compatibility facade
- by keeping duplicate enforcement in legacy subsystems only where that logic has not moved yet, such as plugin commands
- by starting loader and lifecycle migration with compatibility helpers for activation and SDK alias resolution before changing discovery or policy behavior
- by moving cache-key construction, cache reads, cache writes, and cache clearing behind host-owned helpers before changing activation-state ownership
- by moving loader-owned policy helpers next, while keeping module loading and enablement flow behavior unchanged
- by moving initial candidate planning and record construction behind host-owned helpers before changing import and registration flow
- by moving entry-path opening and module import behind host-owned helpers before changing cache wiring or lifecycle orchestration
- by moving loader runtime decisions behind host-owned helpers while preserving lazy loading, config validation behavior, and memory-slot policy behavior
- by moving post-import planning and register(...) execution behind host-owned helpers before changing entry-path and import flow
- by composing those seams into one host-owned per-candidate orchestrator before changing cache and lifecycle finalization behavior
- by moving loader record-state transitions into host-owned helpers before introducing a full lifecycle state machine
- by moving cache writes, provenance warnings, final memory-slot warnings, and activation into a host-owned loader finalizer before introducing an explicit lifecycle state machine
- by adding explicit compatibility lifecycleState mapping on loader-owned plugin records before introducing a full lifecycle state machine
- by moving central readers first, so later lifecycle and compatibility work can land on one boundary instead of many ad hoc call sites
- by adding focused tests for each extracted seam before widening the boundary further
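The delegation pattern described in this list can be pictured with a minimal sketch. It assumes a host-owned module that holds the state and a legacy module that keeps its old entry points but forwards every call. All function and type names here are hypothetical stand-ins, not the actual exports of src/extension-host/active-registry.ts or src/plugins/runtime.ts.

```typescript
// Hypothetical sketch of a compatibility facade; names are illustrative only.

// Host-owned side: the only place that holds registry state.
interface ActiveRegistry {
  readonly extensionIds: readonly string[];
}

let activeRegistry: ActiveRegistry | null = null;

function setActiveRegistry(registry: ActiveRegistry): void {
  activeRegistry = registry;
}

function getActiveRegistry(): ActiveRegistry | null {
  return activeRegistry;
}

// Legacy side: keeps the old call surface but owns no state of its own.
// Every call forwards to the host-owned functions above.
function setActivePluginRegistry(registry: ActiveRegistry): void {
  setActiveRegistry(registry);
}

function requireActivePluginRegistry(): ActiveRegistry {
  const registry = getActiveRegistry();
  if (registry === null) {
    throw new Error("no active registry has been set");
  }
  return registry;
}
```

Because the facade holds no state, existing call sites can keep their imports while new code reads from the host side, and the two views cannot drift apart.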
Committed implementation slices so far:
- 6abf6750ee Plugins: add extension host registry boundary
- 1aab89e820 Plugins: extract loader host seams
- 7bc3135082 Plugins: extract loader candidate planning
- 3a122c95fa Plugins: extract loader register flow
- fc81454038 Plugins: extract loader import flow
- e1b207f4cf Plugins: extract loader candidate orchestration
- 0c44d8049b Plugins: extract loader finalization
- 33ef55a9ee Plugins: add loader lifecycle state mapping
- 6590e19095 Plugins: extract loader cache control
- 89414ed857 Docs: track extension host migration internally
- d8af1eceaf Docs: refresh extension host migration status
What is still missing for these phases:
- keeping the cutover inventory current as more surfaces move
- the full lifecycle state machine, remaining explicit activation-state ownership, policy gate, and broad host-owned registries described for Phase 2
- minimal SDK compatibility work beyond preserving current behavior indirectly through existing loading
- any pilot migration, event pipeline, canonical catalog, or arbitration implementation
Implementation Order
Implement phases in this order:
- Phase 0: boundary inventory and anti-corruption layer
- Phase 1: contribution schema, package metadata, and minimal SDK compatibility
- Phase 2: extension host lifecycle and registries
- Phase 3: broader legacy compatibility bridges
- Phase 4: canonical event pipeline
- Phase 5: capability catalog migration
- Phase 6: arbitration migration
- Phase 7: broader migration and legacy removal
This order matters because each layer depends on the previous one:
- catalogs depend on normalized contributions
- normalized contributions depend on host discovery and validation
- existing extensions must keep loading while the schema and SDK boundary changes
- migrated hooks depend on the canonical event pipeline
- install, onboarding, and status flows depend on static metadata before runtime activation
- catalogs and arbitration already exist in partial forms, so their phases are migrations, not greenfield work
- safe removal of legacy paths depends on compatibility coverage and parity checks
Implementation Guardrails
Do not implement every abstraction in the docs in the first cut.
Treat some parts of the design as ceilings rather than immediate scope:
- event taxonomy should start with three execution modes only:
  - parallel observers
  - sequential merge or decision handlers
  - sync transcript hot paths
- permission modes should implement advisory and host-enforced first
- sandbox-enforced should remain a future contract until real isolation exists
- catalog publication should start small:
  - kernel internal catalog
  - kernel agent catalog
  - host operator and static registries
- adapter metadata should stay minimal and parity-driven
- setup flow typing should start with a small result set:
  - config patch
  - credential result
  - status note
  - follow-up action
- canonical action governance should start as one source file plus tests, not a larger process framework
- arbitration should start with:
  - exclusive slot
  - ranked provider
  - parallel provider
The first implementation goal is parity for pilot migrations, not maximum generality.
If a design choice is not required to migrate one channel extension and one non-channel extension safely, defer it.
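One way to keep the first cut within these ceilings is to encode each starting set as a closed union type, so widening the scope requires a deliberate type change. The sketch below is illustrative only: the type names, string literals, and payload fields are assumptions, not definitions from the linked specs.

```typescript
// Hypothetical first-cut types; literal names and fields are assumptions.

type ExecutionMode = "parallel-observer" | "sequential-merge" | "sync-hot-path";

// sandbox-enforced is deliberately absent until real isolation exists.
type PermissionMode = "advisory" | "host-enforced";

type ArbitrationMode = "exclusive-slot" | "ranked-provider" | "parallel-provider";

// Small setup result set, modeled as a discriminated union.
type SetupResult =
  | { kind: "config-patch"; patch: Record<string, unknown> }
  | { kind: "credential-result"; credentialId: string }
  | { kind: "status-note"; message: string }
  | { kind: "follow-up-action"; action: string };

// The switch is exhaustive: adding a new kind fails to compile until it is handled.
function describeSetupResult(result: SetupResult): string {
  switch (result.kind) {
    case "config-patch":
      return "config patch with " + Object.keys(result.patch).length + " key(s)";
    case "credential-result":
      return "credential: " + result.credentialId;
    case "status-note":
      return "status: " + result.message;
    case "follow-up-action":
      return "follow-up: " + result.action;
  }
}
```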
Current Runtime Surfaces That Must Be Accounted For
The current plugin system already owns more than runtime activation.
Before implementation starts, write and maintain a cutover inventory for these surfaces:
- manifest loading and static metadata
- package-level install and onboarding metadata
- discovery, provenance, and origin precedence
- config schema and UI hint loading
- typed hooks and legacy hook bridges
- channels and channel lookup
- providers and provider auth/setup flows
- tools and agent-visible tool catalogs
- HTTP routes and gateway methods
- CLI registrars and plugin commands
- services and context-engine registrations
- slot selection and other existing arbitration paths
- status, reload, install, update, and diagnostics surfaces
Do not treat Phase 5 and Phase 6 as new systems built in isolation.
They must absorb and replace the existing partial catalog and arbitration behaviors rather than creating a second source of truth.
Phase Guide
Phase 0: Lock the boundary
Goal:
- define the kernel versus extension-host boundary in code and imports
- inventory every current plugin-owned surface that crosses that boundary
Deliverables:
- boundary cutover inventory
- anti-corruption interfaces for host-owned registration surfaces
- initial feature flags for host-path versus legacy-path execution
- directory and import boundaries for kernel and extension-host code
Primary docs:
- openclaw-kernel-extension-host-transition-plan.md
- openclaw-extension-contribution-schema-spec.md
Exit criteria:
- kernel code does not take new dependencies on legacy plugin shapes
- extension-host directory structure exists
- compatibility-only surfaces are identified
- each current plugin-owned surface is tagged as kernel-owned, host-owned, or compatibility-only
- no new direct writes to global registries are introduced without going through the new boundary
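The tagging exit criterion can be checked mechanically if the inventory is also kept as data. The sketch below assumes a hypothetical typed inventory; the ownership tags shown are placeholders for illustration and are not decisions recorded in this guide or in src/extension-host/cutover-inventory.md.

```typescript
// Hypothetical typed cutover inventory; tags below are placeholders only.

type Ownership = "kernel-owned" | "host-owned" | "compatibility-only";

interface SurfaceEntry {
  surface: string;
  ownership: Ownership | null; // null means the surface is not tagged yet
}

const inventory: SurfaceEntry[] = [
  { surface: "manifest loading and static metadata", ownership: "host-owned" },
  { surface: "HTTP routes and gateway methods", ownership: "compatibility-only" },
  { surface: "slot selection", ownership: null },
];

// Surfaces that still block the Phase 0 tagging criterion.
function untaggedSurfaces(entries: readonly SurfaceEntry[]): string[] {
  return entries.filter((e) => e.ownership === null).map((e) => e.surface);
}

// Simple tally, useful for a status line in a PR description.
function countByOwnership(entries: readonly SurfaceEntry[]): Record<Ownership, number> {
  const counts: Record<Ownership, number> = {
    "kernel-owned": 0,
    "host-owned": 0,
    "compatibility-only": 0,
  };
  for (const entry of entries) {
    if (entry.ownership !== null) {
      counts[entry.ownership] += 1;
    }
  }
  return counts;
}
```

A test that fails while untaggedSurfaces returns a non-empty list would turn the criterion into a gate rather than a convention.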
Current implementation status:
- partially implemented
- the code boundary exists in src/extension-host/*
- central active-registry ownership now routes through the host boundary
- several central runtime readers now consume the host-owned boundary instead of reading directly from src/plugins/runtime.ts
- the initial cutover inventory now exists in src/extension-host/cutover-inventory.md and is being updated as surfaces move, but the phase is still incomplete because loader orchestration, lifecycle ownership, and later compatibility phases have not moved yet
Phase 1: Define the schema
Goal:
- implement the source-of-truth manifest and contribution types
- preserve existing extension loading while the schema and SDK boundary changes
Primary doc:
openclaw-extension-contribution-schema-spec.md
Deliverables:
- manifest parser
- package metadata parser
- contribution validators
- ResolvedExtension
- ResolvedContribution
- typed ContributionPolicy
- static metadata parser
- new versioned SDK contract surface
- minimal SDK compatibility loading surface
- normalized install and onboarding metadata model
Exit criteria:
- extensions can be normalized into static and runtime sections without activating heavy runtime code
- existing extension SDK imports still resolve through the compatibility loading path
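The first exit criterion amounts to a pure function from a legacy record to a normalized one that never imports the extension entry point. The sketch below shows that shape under stated assumptions: the field names on both interfaces are hypothetical and do not reproduce the real ResolvedExtension type in src/extension-host/schema.ts.

```typescript
// Hypothetical shapes; the real ResolvedExtension fields may differ.

interface LegacyManifestRecord {
  id: string;
  name?: string;
  entryPath: string;
  channels?: string[];
}

interface ResolvedExtensionSketch {
  id: string;
  // Safe to read during install, onboarding, and status flows.
  staticSection: { displayName: string; channelIds: string[] };
  // Describes how to activate later; nothing is imported here.
  runtimeSection: { entryPath: string };
}

// Pure normalization: reads only the record, never loads runtime code.
function resolveExtension(record: LegacyManifestRecord): ResolvedExtensionSketch {
  return {
    id: record.id,
    staticSection: {
      displayName: record.name ?? record.id,
      channelIds: (record.channels ?? []).slice(),
    },
    runtimeSection: { entryPath: record.entryPath },
  };
}
```

Keeping the entry path as plain data is what lets static consumers use the record without triggering activation.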
Current implementation status:
- partially implemented
- ResolvedExtension, ResolvedContribution, and ContributionPolicy landed as initial code types
- legacy manifest and package metadata now converge into a normalized resolvedExtension record carried by the manifest registry
- discovery, install, and catalog metadata parsing now go through host-owned schema helpers
- partial explicit compatibility now exists through host-owned loader-compat and loader-runtime helpers, but full manifest or contribution validators and a versioned SDK compatibility layer are not implemented yet
Phase 2: Build the extension host
Goal:
- implement discovery, validation, policy, registries, and lifecycle ownership
Primary doc:
openclaw-extension-host-lifecycle-and-security-spec.md
Deliverables:
- discovery pipeline
- activation state machine
- policy evaluator
- host-owned registries
- host-owned adapters for hooks, channels, providers, tools, HTTP routes, gateway methods, CLI, services, commands, and context engines
- per-extension state ownership
- provenance and origin handling
- config redaction-aware schema loading
- reload and route ownership handling
Exit criteria:
- the host can load bundled and external extensions into normalized registries
- the host can populate normalized registries without direct kernel writes except through explicit compatibility adapters
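The activation state machine deliverable could start as a transition table plus one guard function. The states and allowed transitions below are assumptions chosen for illustration; the authoritative set belongs to openclaw-extension-host-lifecycle-and-security-spec.md.

```typescript
// Hypothetical lifecycle states and transitions; the spec owns the real set.

type LifecycleState = "discovered" | "validated" | "active" | "disabled" | "failed";

const allowedTransitions: Record<LifecycleState, readonly LifecycleState[]> = {
  discovered: ["validated", "disabled", "failed"],
  validated: ["active", "disabled", "failed"],
  active: ["disabled", "failed"],
  disabled: ["validated"],
  failed: [],
};

// Returns the next state, or throws if the move is not in the table.
function transition(current: LifecycleState, next: LifecycleState): LifecycleState {
  if (allowedTransitions[current].indexOf(next) === -1) {
    throw new Error("illegal transition: " + current + " -> " + next);
  }
  return next;
}
```

A table like this also gives the compatibility lifecycleState mapping a fixed target to map onto.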
Current implementation status:
- partially implemented in a compatibility-preserving form
- the host owns the active registry state
- the host exposes a resolved-extension registry view for static consumers
- plugin skills, plugin auto-enable, and config validation indexing now consume host-owned resolved-extension data
- activation, loader cache control, loader policy, loader candidate planning, loader import flow, loader runtime decisions, loader post-import register flow, loader candidate orchestration, loader record-state helpers, and loader finalization now route through src/extension-host/*
- lifecycle state ownership, activation states, policy evaluation, and broad host-owned registries are still not implemented
Phase 3: Build compatibility bridges
Goal:
- keep current extensions working through the host without leaking legacy contracts into the kernel
Primary docs:
- openclaw-kernel-extension-host-transition-plan.md
- openclaw-extension-contribution-schema-spec.md
Deliverables:
- ChannelPlugin compatibility translators
- plugin SDK compatibility loading
- runtime-channel namespace translation into the new SDK modules
- legacy setup and CLI translation
- legacy config schema and UI hint translation
- pilot migration matrix with explicit parity labels
Exit criteria:
- thread-ownership runs through the host path as the first non-channel pilot
- telegram runs through the host path as the first channel pilot
- both pilots have explicit parity results for discovery, config, activation, diagnostics, and runtime behavior
Phase 4: Implement the canonical event pipeline
Goal:
- move runtime hook behavior onto explicit canonical events
Primary doc:
openclaw-kernel-event-pipeline-spec.md
Deliverables:
- event type definitions
- stage runner
- sync transcript-write stages
- bridges from legacy hook buses
- mapping table from existing typed and legacy hooks to canonical stages
Exit criteria:
- migrated extensions can use canonical events without relying directly on old plugin hook execution
- pilot hook behaviors have parity coverage against the pre-host path
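A stage runner for two of the execution modes might look like the sketch below. It is a simplified, synchronous illustration: observers are invoked one after another with error isolation instead of truly in parallel, and every name and signature is hypothetical, not taken from openclaw-kernel-event-pipeline-spec.md.

```typescript
// Hypothetical synchronous stage runner; real stages may be async.

type MergeHandler<T> = (payload: T) => Partial<T> | undefined;
type Observer<T> = (payload: Readonly<T>) => void;

// Sequential merge: each handler sees the payload as patched by earlier handlers.
function runSequentialMerge<T extends object>(payload: T, handlers: readonly MergeHandler<T>[]): T {
  let current = payload;
  for (const handler of handlers) {
    const patch = handler(current);
    if (patch !== undefined) {
      current = { ...current, ...patch };
    }
  }
  return current;
}

// Observers: cannot change the payload, and one failure does not stop the rest.
function runObservers<T>(payload: Readonly<T>, observers: readonly Observer<T>[]): Error[] {
  const errors: Error[] = [];
  for (const observer of observers) {
    try {
      observer(payload);
    } catch (err) {
      errors.push(err instanceof Error ? err : new Error(String(err)));
    }
  }
  return errors;
}
```

Separating the two runners keeps the ordering guarantee explicit: only merge handlers can influence the result, which is the property parity tests against the pre-host path would need to check.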
Phase 5: Implement catalogs
Goal:
- compile runtime-derived agent and internal catalogs, plus host-owned operator catalogs
- replace existing plugin-identity-driven catalog surfaces with canonical family-based catalogs
Primary doc:
openclaw-capability-catalog-and-arbitration-spec.md
Deliverables:
- kernel internal catalog
- kernel agent catalog
- host operator catalog
- static setup and install catalogs
- canonical action registry
- migration plan for existing tool, provider, and setup catalog surfaces
Exit criteria:
- agent-visible tools are compiled from canonical action families instead of plugin identity
- setup and install catalogs no longer depend on duplicated legacy metadata paths
Phase 6: Implement arbitration
Goal:
- resolve overlap, ranking, selection, and slot conflicts deterministically
- absorb the existing slot and provider selection behavior into canonical arbitration
Primary doc:
openclaw-capability-catalog-and-arbitration-spec.md
Deliverables:
- conflict detection
- provider selection
- slot arbitration
- planner-visible name collision handling
- migration plan for existing slot and name-collision behaviors
Exit criteria:
- at least one multi-provider family works through canonical arbitration
- legacy slot and provider-selection paths no longer act as separate arbitration systems
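Deterministic resolution mostly comes down to a total ordering and an explicit conflict result. The sketch below illustrates ranked-provider and exclusive-slot selection under assumed rules: higher priority wins, ties break on extension id, and an unconfigured slot with several candidates is reported as a conflict. These rules and all names are assumptions, not the behavior defined in openclaw-capability-catalog-and-arbitration-spec.md.

```typescript
// Hypothetical arbitration helpers; the ranking and conflict rules are assumptions.

interface ProviderCandidate {
  extensionId: string;
  priority: number;
}

// Ranked provider: higher priority first, then extension id as a stable tie-break.
function rankProviders(candidates: readonly ProviderCandidate[]): ProviderCandidate[] {
  return candidates.slice().sort((a, b) => {
    if (a.priority !== b.priority) {
      return b.priority - a.priority;
    }
    if (a.extensionId < b.extensionId) return -1;
    if (a.extensionId > b.extensionId) return 1;
    return 0;
  });
}

interface SlotDecision {
  winner: string | null;
  conflict: boolean;
}

// Exclusive slot: explicit configuration wins; otherwise a single candidate wins;
// several unconfigured candidates are a conflict, not a silent pick.
function selectExclusiveSlot(candidateIds: readonly string[], configured?: string): SlotDecision {
  if (configured !== undefined) {
    const found = candidateIds.indexOf(configured) !== -1;
    return { winner: found ? configured : null, conflict: false };
  }
  if (candidateIds.length === 1) {
    return { winner: candidateIds[0] ?? null, conflict: false };
  }
  return { winner: null, conflict: candidateIds.length > 1 };
}
```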
Phase 7: Migrate and remove legacy paths
Goal:
- finish migration and shrink compatibility-only surfaces
Primary docs:
- openclaw-kernel-extension-host-transition-plan.md
- all other docs as parity references
Deliverables:
- channel migrations
- non-channel extension migrations
- parity tests
- deprecation markers
- removal plan for obsolete compatibility shims
Exit criteria:
- legacy plugin runtime is no longer the default execution path
Pilot Matrix
Initial pilot set:
- non-channel pilot: thread-ownership
- channel pilot: telegram
Why these pilots:
- thread-ownership exercises typed hook loading without introducing CLI, HTTP route, or service migration at the same time
- telegram exercises the ChannelPlugin compatibility path with a minimal top-level plugin registration surface
Second-wave compatibility candidates after the pilots are stable:
- line for channel plus command registration
- device-pair for command, service, and setup flow coverage
Each pilot must record parity for:
- discovery and precedence
- manifest and static metadata loading
- config schema and UI hints
- enabled and disabled state handling
- activation and reload behavior
- diagnostics and status output
- runtime behavior on the migrated path
- compatibility-only gaps that still remain
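A parity record per pilot can be kept as typed data so missing areas are visible at a glance. The sketch below assumes one label per area plus a free-form list for the remaining compatibility-only gaps. The type names and the "not-checked" label are hypothetical additions for illustration.

```typescript
// Hypothetical parity record; "not-checked" is an assumed extra label.

type ParityLabel = "full" | "partial" | "compatibility-only" | "not-checked";

const parityAreas = [
  "discovery and precedence",
  "manifest and static metadata loading",
  "config schema and UI hints",
  "enabled and disabled state handling",
  "activation and reload behavior",
  "diagnostics and status output",
  "runtime behavior on the migrated path",
] as const;

type ParityArea = (typeof parityAreas)[number];
type PilotParity = Record<ParityArea, ParityLabel>;

// Start every area as not-checked so nothing is assumed to pass.
function newPilotParity(): PilotParity {
  const parity = {} as PilotParity;
  for (const area of parityAreas) {
    parity[area] = "not-checked";
  }
  return parity;
}

// Areas that still prevent calling the pilot fully at parity.
function blockingAreas(parity: PilotParity): ParityArea[] {
  return parityAreas.filter((area) => parity[area] !== "full");
}

interface PilotRecord {
  pilot: string;
  parity: PilotParity;
  remainingGaps: string[]; // compatibility-only gaps that still remain
}
```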
Recommended First Implementation Slice
If you want the lowest-risk start, do this first:
1. write the boundary cutover inventory
2. add source-of-truth types
3. add the static metadata and package metadata parsers
4. add ResolvedExtension
5. add minimal SDK compatibility loading
6. add host discovery and validation
7. bring thread-ownership through the host path first
8. bring telegram through the host path second
Status of this slice:
- steps 2 through 6 are underway
- step 1 has an initial artifact in src/extension-host/cutover-inventory.md, which still has to be kept current as surfaces move
- steps 7 and 8 have not started
Concrete landings from this slice:
- the host boundary exists
- source-of-truth schema types exist
- package metadata parsing now routes through the host schema layer
- ResolvedExtension exists in code and is attached to manifest-registry records
- host-owned active-registry and resolved-registry views exist
- early static consumers have moved onto the new host-owned data path
Do not start with catalogs or arbitration first.
Also avoid these first-cut traps:
- do not build a broad event scheduling framework before the canonical stages exist
- do not turn permission descriptors into fake sandbox guarantees
- do not build a large operator catalog publication layer before the host registries are real
- do not over-type setup flows before the pilot migrations prove the minimum result model is insufficient
Tracking Rules
When implementation begins:
- update this guide first with phase status
- update the matching spec TODOs when a domain changes
- record where the implementation intentionally diverged from the spec
- record which behaviors are full parity, partial parity, or compatibility-only
- update the pilot parity matrix whenever a migrated surface changes
Suggested Status Format
Use this format in each doc when work starts:
- not started
- in progress
- implemented
- verified
- deferred
For example:
- ResolvedExtension registry: implemented
- setup fallback removal: deferred
- sync transcript-write parity tests: in progress