twilio-voice-twiml

Name: twilio-voice-twiml
Author: openai/plugins
$npx mdskill add openai/plugins/twilio-voice-twiml
Define voice call behavior using TwiML for inbound and outbound calls
Build interactive voice response (IVR) systems for customer service
Uses Twilio's TwiML verbs like Say, Play, Gather, and Dial
Generates TwiML dynamically with Python or Node.js SDKs
Executes call logic via Twilio webhooks and HTTP responses
---
name: twilio-voice-twiml
description: >
  Build voice call logic using TwiML (Twilio Markup Language). Covers the
  core verbs (Say, Play, Gather, Dial, Record, Conference), generating TwiML
  with Python and Node.js SDKs, and a complete inbound call IVR example. Use
  this skill to define call behavior for inbound or outbound calls.
---

## Overview

TwiML (Twilio Markup Language) is an XML dialect that tells Twilio what to do during a call. When a call arrives or progresses, Twilio sends a webhook POST to your server, and your server responds with a TwiML document that Twilio then executes.

```
Caller → Twilio → POST to your webhook → Your server returns TwiML → Twilio executes it
```
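To make the last two steps of that flow concrete, here is a minimal sketch of the XML a server might return for a one-digit menu. It builds the document with Python's standard `xml.etree.ElementTree` rather than the Twilio SDK, purely to show the raw shape of a TwiML response; the `build_menu_twiml` helper name and the `/menu-choice` and `/voice` paths are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def build_menu_twiml() -> str:
    """Build a small TwiML document by hand (stdlib only, no Twilio SDK)."""
    response = ET.Element("Response")  # every TwiML document has a <Response> root

    # <Gather> collects one keypress and sends it to the (hypothetical) action path
    gather = ET.SubElement(response, "Gather", numDigits="1", action="/menu-choice")
    say = ET.SubElement(gather, "Say")  # nested <Say> is the prompt played while gathering
    say.text = "Press 1 for sales, 2 for support."

    # Verbs run top to bottom, so <Redirect> only runs if <Gather> got no input
    ET.SubElement(response, "Redirect").text = "/voice"

    declaration = '<?xml version="1.0" encoding="UTF-8"?>'
    return declaration + ET.tostring(response, encoding="unicode")


print(build_menu_twiml())
```

In practice the SDK helpers shown in the Quickstart produce the same kind of document and handle XML escaping for you.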

---

## Prerequisites

- Twilio account with a voice-capable phone number
  - New to Twilio? See `twilio-account-setup`
- Webhook endpoint returning TwiML with `Content-Type: text/xml`
- SDK (for programmatic generation): `pip install twilio` / `npm install twilio`

---

## Quickstart

A minimal inbound call handler that greets the caller and presents a menu:

**Python (Flask)**
```python
from flask import Flask, Response, request
from twilio.twiml.voice_response import VoiceResponse

app = Flask(__name__)


def twiml(response: VoiceResponse) -> Response:
    """Serialize TwiML with the text/xml Content-Type listed in Prerequisites."""
    return Response(str(response), mimetype="text/xml")


@app.route("/voice", methods=["POST"])
def handle_call():
    response = VoiceResponse()
    # Collect a single keypress and POST it to /menu-choice
    gather = response.gather(num_digits=1, action="/menu-choice")
    gather.say("Welcome to Acme. Press 1 for sales, 2 for support.")
    response.redirect("/voice")  # Loop if no input
    return twiml(response)


@app.route("/menu-choice", methods=["POST"])
def menu_choice():
    digit = request.form.get("Digits")  # Key pressed by the caller
    response = VoiceResponse()
    if digit == "1":
        response.dial("+15551234567")
    elif digit == "2":
        response.say("Connecting to support.")
        response.dial("+15559876543")
    else:
        response.say("Invalid option.")
        response.redirect("/voice")
    return twiml(response)
```

**Node.js (Express)**
```node
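const express = require("express");
const { VoiceResponse } = require("twilio").twiml;

const app = express();
// Twilio sends webhook parameters as a form-encoded body
app.use(express.urlencoded({ extended: false }));

app.post("/voice", (req, res) => {
    const response = new VoiceResponse();
    const gather = response.gather({ numDigits: 1, action: "/menu-choice" });
    gather.say("Welcome. Press 1 for sales, 2 for support.");
    response.redirect("/voice"); // Loop if no input
    res.type("text/xml").send(response.toString());
});

app.post("/menu-choice", (req, res) => {
    const digit = req.body.Digits;
    const response = new VoiceResponse();
    if (digit === "1") {
        response.dial("+15551234567");
    } else if (digit === "2") {
        response.say("Connecting to support.");
        response.dial("+15559876543");
    } else {
        // say() returns the <Say> element, so redirect() is called on the response
        response.say("Invalid option.");
        response.redirect("/voice");
    }
    res.type("text/xml").send(response.toString());
});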
```

---

## Core Verbs

### Say — Text-to-speech

**Python**
```python
from twilio.twiml.voice_response import VoiceResponse

response = VoiceResponse()
response.say("Your appointment is confirmed.", voice="alice", language="en-US")
```

**Node.js**
```node
const { VoiceResponse } = require("twilio").twiml;
const response = new VoiceResponse();
response.say({ voice: "alice", language: "en-US" }, "Your appointment is confirmed.");
```

Voices: `man` (default), `woman`, `alice`, or Amazon Polly/Google TTS voices (e.g. `Polly.Joanna`).

### Gather — Collect keypad input or speech

**Python**
```python
response = VoiceResponse()
gather = response.gather(num_digits=1, action="/handle-input", method="POST")
gather.say("Press 1 for sales, press 2 for support.")
response.say("We did not receive your input.")  # Fallback if no input
```

**Node.js**
```node
const response = new VoiceResponse();
const gather = response.gather({ numDigits: 1, action: "/handle-input", method: "POST" });
gather.say("Press 1 for sales, press 2 for support.");
response.say("We did not receive your input.");
```

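Twilio POSTs the collected digits to the `action` URL as the `Digits` parameter. If the caller enters nothing, `<Gather>` times out and Twilio continues with the next verb (here, the fallback `<Say>`).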

### Play — Play an audio file

**Python**
```python
response = VoiceResponse()
response.play("https://example.com/audio/greeting.mp3")
```

**Node.js**
```node
const response = new VoiceResponse();
response.play("https://example.com/audio/greeting.mp3");
```

Supported formats: MP3, WAV. URL must be publicly accessible.

### Dial — Connect to another number

**Python**
```python
from twilio.twiml.voice_response import Dial, VoiceResponse

response = VoiceResponse()
dial = Dial(action="/dial-complete")
dial.number("+15558675310")
response.append(dial)
```

**Node.js**
```node
const response = new VoiceResponse();
const dial = response.dial({ action: "/dial-complete" });
dial.number("+15558675310");
```

### Record — Capture caller audio

**Python**
```python
response = VoiceResponse()
response.say("Leave a message after the beep.")
response.record(
    action="/recording-complete",
    max_length=60,
    transcribe=True,
    transcribe_callback="/transcription-ready"
)
```

**Node.js**
```node
const response = new VoiceResponse();
response.say("Leave a message after the beep.");
response.record({
    action: "/recording-complete",
    maxLength: 60,
    transcribe: true,
    transcribeCallback: "/transcription-ready",
});
```

### Voicemail — Record a message when no one answers

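Use `<Dial>` with an `action` URL and return `<Record>` from the action handler. Twilio requests the `action` URL when the dialed call ends for any reason, including a normal answered call, so the handler should check `DialCallStatus` and record only when the call was not connected (for example `no-answer`, `busy`, or `failed`).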

**Python**
```python
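# Assumes the Flask `app` from the Quickstart
from flask import Response, request
from twilio.twiml.voice_response import Dial, VoiceResponse

# Primary TwiML: try to connect the call
response = VoiceResponse()
dial = Dial(action="/voicemail", timeout=20)  # Ring for 20 seconds before giving up
dial.number("+15558675310")
response.append(dial)

# /voicemail handler: Twilio requests it when the dialed leg ends for any reason
@app.route("/voicemail", methods=["POST"])
def voicemail_handler():
    response = VoiceResponse()
    # An answered call also lands here, so only record when it was not connected
    if request.form.get("DialCallStatus") in ("completed", "answered"):
        response.hangup()
        return Response(str(response), mimetype="text/xml")
    response.say("We missed your call. Please leave a message after the beep.")
    response.record(
        action="/recording-complete",
        max_length=120,
        transcribe=True,
        transcribe_callback="/transcription-ready",
        play_beep=True
    )
    # Reached only if <Record> captured no audio
    response.say("We didn't receive a recording. Goodbye.")
    return Response(str(response), mimetype="text/xml")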
```

**Node.js**
```node
// Primary TwiML: try to connect the call
const response = new VoiceResponse();
const dial = response.dial({ action: "/voicemail", timeout: 20 });
dial.number("+15558675310");

// /voicemail handler: Twilio requests it when the dialed leg ends for any reason
app.post("/voicemail", (req, res) => {
    const response = new VoiceResponse();
    // An answered call also lands here, so only record when it was not connected
    if (["completed", "answered"].includes(req.body.DialCallStatus)) {
        response.hangup();
        return res.type("text/xml").send(response.toString());
    }
    response.say("We missed your call. Please leave a message after the beep.");
    response.record({
        action: "/recording-complete",
        maxLength: 120,
        transcribe: true,
        transcribeCallback: "/transcription-ready",
        playBeep: true,
    });
    // Reached only if <Record> captured no audio
    response.say("We didn't receive a recording. Goodbye.");
    res.type("text/xml").send(response.toString());
});
```

**Important:** `<Record>` captures the caller only (voicemail-style). It is NOT for recording two-party calls — see `twilio-call-recordings` for that.

### Conference — Multi-party calls

**Python**
```python
response = VoiceResponse()
dial = response.dial()
dial.conference(
    "Daily Standup",
    start_conference_on_enter=True,
    end_conference_on_exit=True
)
```

**Node.js**
```node
const response = new VoiceResponse();
const dial = response.dial();
dial.conference("Daily Standup", {
    startConferenceOnEnter: true,
    endConferenceOnExit: true,
});
```

### Pay — PCI-compliant payment collection

> **Critical warnings:**
> - Pay Connectors are **Console-only** — there is no REST API to create or manage connectors. Set up in Console > Voice > Pay Connectors before coding.
> - **PCI Mode is IRREVERSIBLE** once enabled on an account. Use a dedicated sub-account for payment calls.

**Python**
```python
from twilio.twiml.voice_response import Pay, VoiceResponse

response = VoiceResponse()
response.say("We'll now collect your payment.")
pay = Pay(
    payment_connector="stripe_connector",  # Name from Console setup
    charge_amount="49.99",
    currency="usd",
    action="/payment-complete",
    status_callback="/payment-status"
)
response.append(pay)
```

**Node.js**
```node
const response = new VoiceResponse();
response.say("We'll now collect your payment.");
response.pay({
    paymentConnector: "stripe_connector",
    chargeAmount: "49.99",
    currency: "usd",
    action: "/payment-complete",
    statusCallback: "/payment-status",
});
```

Supported processors: Stripe, Braintree, CardConnect. Card data routes directly to the processor — never touches your server.

---

## Production Deployment

### Webhook Hosting

For production, do NOT use ngrok. Deploy your TwiML server with HTTPS:

- **Requirement**: Public HTTPS URL, responds within 15 seconds, returns `Content-Type: text/xml`
- **Options**: Cloud Run, AWS Lambda + API Gateway, Railway, Render — any service with TLS and auto-scaling
- **Fallback URL**: Configure in Console (Phone Numbers > Active Numbers > select number) for when your primary server is unreachable
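
As an illustration of how small a fallback endpoint can be, the sketch below is a bare WSGI application that returns the same static TwiML for every request with the required content type. It uses only the Python standard library; the `fallback_app` name and the spoken message are assumptions, and `<Hangup/>` simply ends the call after the message.

```python
# Static TwiML served when the primary webhook is unreachable (wording is illustrative)
FALLBACK_TWIML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<Response>"
    "<Say>We are experiencing technical difficulties. Please try again later.</Say>"
    "<Hangup/>"
    "</Response>"
)


def fallback_app(environ, start_response):
    """WSGI callable: ignore the request details and always return the fallback TwiML."""
    body = FALLBACK_TWIML.encode("utf-8")
    headers = [
        ("Content-Type", "text/xml; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Because it has no dependencies and does no I/O, a handler like this should respond well inside the 15-second limit. It can be hosted separately from the primary server behind any WSGI server with TLS in front of it.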

### State Between TwiML Requests

Each webhook request is stateless. To maintain conversation state across interactions:

- **URL query params**: Pass state in `action` URLs — `/next-step?language=es&dept=sales`
- **Session store**: Use Redis or a database keyed by `CallSid`
- **Do NOT use in-memory state** — your server may scale to multiple instances
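
The query-parameter approach can be sketched with the standard library alone. The helper names `action_url` and `read_state` are hypothetical; the point is that state is URL-encoded when the `action` URL is built and decoded again when Twilio requests that URL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def action_url(path: str, **state: str) -> str:
    """Append call state to an action path as a URL-encoded query string."""
    return f"{path}?{urlencode(state)}" if state else path


def read_state(url: str) -> dict:
    """Recover the state dict from a URL built by action_url."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; keep the first value of each
    return {key: values[0] for key, values in parse_qs(query).items()}


next_url = action_url("/next-step", language="es", dept="sales")
print(next_url)              # /next-step?language=es&dept=sales
print(read_state(next_url))  # {'language': 'es', 'dept': 'sales'}
```

The resulting URL can be passed as the `action` argument of `gather()` or `dial()`. Anything placed in the URL is visible in logs and can be altered by whoever can call the endpoint, so larger or sensitive state is better kept in the `CallSid`-keyed store.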

### Monitoring

- **Status callbacks**: Track call lifecycle events (`statusCallback` on the call or number config)
- **Voice Insights**: Automatic quality metrics per call (Console > Monitor > Insights)
- **Debugger**: Console > Monitor > Errors for TwiML parsing failures and webhook timeouts
- **Fallback URLs**: Always configure a fallback TwiML URL — serves a graceful message if your primary endpoint fails

---

## Webhook Request Parameters

| Parameter | Description |
|-----------|-------------|
| `CallSid` | Unique call identifier |
| `From` | Caller's phone number |
| `To` | Called phone number |
| `CallStatus` | Current call status, e.g. `ringing`, `in-progress`, `completed` |
| `Direction` | `inbound`, `outbound-api`, or `outbound-dial` |
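
Twilio sends these parameters as a form-encoded POST body. Frameworks such as Flask and Express decode it for you, but the following stdlib sketch shows what that decoding amounts to. The `parse_webhook_body` helper and the sample values (including the `CallSid`) are made up for illustration.

```python
from urllib.parse import parse_qs


def parse_webhook_body(raw_body: str) -> dict:
    """Decode a form-encoded webhook body into a flat dict of parameters."""
    # keep_blank_values=True preserves parameters that arrive empty
    parsed = parse_qs(raw_body, keep_blank_values=True)
    return {key: values[0] for key, values in parsed.items()}


# Hypothetical request body; note that "+" in phone numbers arrives as %2B
sample = (
    "CallSid=CA0123456789abcdef0123456789abcdef"
    "&From=%2B15551234567&To=%2B15559876543"
    "&CallStatus=ringing&Direction=inbound"
)
params = parse_webhook_body(sample)
print(params["From"], params["CallStatus"])
```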

---

## CANNOT

- **Cannot return TwiML without correct content type** — Must use `Content-Type: text/xml`
- **Cannot exceed 15-second webhook response time** — Twilio times out and falls back
- **Cannot exceed 4,096 characters in `<Say>` verb** — Split longer text across multiple `<Say>` elements
- **Cannot create Pay Connectors via API** — Pay Connectors are Console-only (Console > Voice > Pay Connectors). No REST API exists for connector management.
- **Cannot reverse PCI Mode** — Once enabled on an account, PCI Mode is permanent and account-wide. Use a dedicated sub-account for payment calls.
- **Cannot use `<Record>` for two-party call recording** — `<Record>` captures the caller only (voicemail-style). For dual-channel recording of both parties, use `record=True` on `calls.create()` or the Recordings API.
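
For the `<Say>` length limit, one possible way to split long text is sketched below with the standard `textwrap` module. The `split_for_say` helper is hypothetical, and the 4,096 figure is taken from the list above rather than verified here.

```python
import textwrap

SAY_CHAR_LIMIT = 4096  # limit stated in the list above


def split_for_say(text: str, limit: int = SAY_CHAR_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters, breaking on whitespace."""
    # textwrap.wrap only splits inside a word if that word alone exceeds the limit;
    # newlines in the input are treated as ordinary spaces
    return textwrap.wrap(text, width=limit, break_long_words=True)


chunks = split_for_say("word " * 2000)  # 10,000 characters of input
print(len(chunks), max(len(chunk) for chunk in chunks))
```

Each chunk would then go into its own `<Say>`, for example by calling `response.say(chunk)` in a loop. Splitting on sentence boundaries instead of arbitrary whitespace would likely sound more natural, at the cost of a slightly longer helper.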

---

## Next Steps

- **Place outbound calls (AMD, conferencing):** `twilio-voice-outbound-calls`
- **AI voice agents with real-time speech/LLM:** `twilio-voice-conversation-relay`