mirror of https://github.com/openclaw/openclaw.git
The 10-second connection timeout in OpenAIRealtimeSTTSession.doConnect() was never cleared on success or teardown, leaking a timer on every connection and accumulating stale timers across reconnect cycles. Store the timeout handle and clear it in both the open handler and close(), matching the existing clearTimeout pattern in waitForTranscript(). Co-authored-by: Sharoon Sharif <ssharif@Hosanna.local> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| src | ||
| CHANGELOG.md | ||
| README.md | ||
| api.ts | ||
| index.test.ts | ||
| index.ts | ||
| openclaw.plugin.json | ||
| package.json | ||
| runtime-api.ts | ||
| runtime-entry.ts | ||
README.md
@openclaw/voice-call
Official Voice Call plugin for OpenClaw.
Providers:
- Twilio (Programmable Voice + Media Streams)
- Telnyx (Call Control v2)
- Plivo (Voice API + XML transfer + GetInput speech)
- Mock (dev/no network)
Docs: https://docs.openclaw.ai/plugins/voice-call
Plugin system: https://docs.openclaw.ai/plugin
Install (local dev)
Option A: install via OpenClaw (recommended)
openclaw plugins install @openclaw/voice-call
Restart the Gateway afterwards.
Option B: copy into your global extensions folder (dev)
PLUGIN_HOME=~/.openclaw/extensions
mkdir -p "$PLUGIN_HOME"
cp -R <local-plugin-checkout> "$PLUGIN_HOME/voice-call"
cd "$PLUGIN_HOME/voice-call" && pnpm install
Config
Put under plugins.entries.voice-call.config:
{
provider: "twilio", // or "telnyx" | "plivo" | "mock"
fromNumber: "+15550001234",
toNumber: "+15550005678",
twilio: {
accountSid: "ACxxxxxxxx",
authToken: "your_token",
},
telnyx: {
apiKey: "KEYxxxx",
connectionId: "CONNxxxx",
// Telnyx webhook public key from the Telnyx Mission Control Portal
// (Base64 string; can also be set via TELNYX_PUBLIC_KEY).
publicKey: "...",
},
plivo: {
authId: "MAxxxxxxxxxxxxxxxxxxxx",
authToken: "your_token",
},
// Webhook server
serve: {
port: 3334,
path: "/voice/webhook",
},
// Public exposure (pick one):
// publicUrl: "https://example.ngrok.app/voice/webhook",
// tunnel: { provider: "ngrok" },
// tailscale: { mode: "funnel", path: "/voice/webhook" }
outbound: {
defaultMode: "notify", // or "conversation"
},
streaming: {
enabled: true,
streamPath: "/voice/stream",
preStartTimeoutMs: 5000,
maxPendingConnections: 32,
maxPendingConnectionsPerIp: 4,
maxConnections: 128,
},
}
Notes:
- Twilio/Telnyx/Plivo require a publicly reachable webhook URL.
mockis a local dev provider (no network calls).- Telnyx requires
telnyx.publicKey(orTELNYX_PUBLIC_KEY) unlessskipSignatureVerificationis true. - advanced webhook, streaming, and tunnel notes:
https://docs.openclaw.ai/plugins/voice-call
Stale call reaper
See the plugin docs for recommended ranges and production examples:
https://docs.openclaw.ai/plugins/voice-call#stale-call-reaper
TTS for calls
Voice Call uses the core messages.tts configuration for
streaming speech on calls. Override examples and provider caveats live here:
https://docs.openclaw.ai/plugins/voice-call#tts-for-calls
CLI
openclaw voicecall call --to "+15555550123" --message "Hello from OpenClaw"
openclaw voicecall continue --call-id <id> --message "Any questions?"
openclaw voicecall speak --call-id <id> --message "One moment"
openclaw voicecall end --call-id <id>
openclaw voicecall status --call-id <id>
openclaw voicecall tail
openclaw voicecall expose --mode funnel
Tool
Tool name: voice_call
Actions:
initiate_call(message, to?, mode?)continue_call(callId, message)speak_to_user(callId, message)end_call(callId)get_status(callId)
Gateway RPC
voicecall.initiate(to?, message, mode?)voicecall.continue(callId, message)voicecall.speak(callId, message)voicecall.end(callId)voicecall.status(callId)
Notes
- Uses webhook signature verification for Twilio/Telnyx/Plivo.
- Adds replay protection for Twilio and Plivo webhooks (valid duplicate callbacks are ignored safely).
- Twilio speech turns include a per-turn token so stale/replayed callbacks cannot complete a newer turn.
responseModel/responseSystemPromptcontrol AI auto-responses.- Voice-call auto-responses enforce a spoken JSON contract (
{"spoken":"..."}) and filter reasoning/meta output before playback. - While a Twilio stream is active, playback does not fall back to TwiML
<Say>; stream-TTS failures fail the playback request. - Outbound conversation calls suppress barge-in only while the initial greeting is actively speaking, then re-enable normal interruption.
- Twilio stream disconnect auto-end uses a short grace window so quick reconnects do not end the call.
- Media streaming requires
wsand OpenAI Realtime API key.