During a Media Stream, Plivo sends messages to your WebSocket server that provide information about the Stream. For bidirectional Media Streams, your server can also send messages back to Plivo. This document covers each type of message that your WebSocket server can send and receive when using Plivo Audio Streams.

Messages from Plivo

Plivo sends the following message types to your WebSocket server during a Stream:
| Message Type | Description |
|---|---|
| start | Stream metadata, sent once when connection opens |
| media | Audio data chunks from the caller |
| dtmf | Touch-tone keypress from the caller |
| playedStream | Confirmation that a checkpoint was reached |
| clearedAudio | Confirmation that the audio queue was cleared |
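
One way to route these messages is a lookup table keyed on the event field. The sketch below assumes each WebSocket frame arrives as a JSON text message; the handlers object and the routeMessage function are illustrative names, not part of Plivo's API, and the handler bodies are placeholders.

```javascript
// Hypothetical handlers: replace the bodies with real application logic.
const handlers = {
  start: (msg) => `stream ${msg.start.streamId} started`,
  media: (msg) => `audio chunk ${msg.media.chunk}`,
  dtmf: (msg) => `digit ${msg.dtmf.digit}`,
  playedStream: (msg) => `checkpoint ${msg.name} reached`,
  clearedAudio: () => "audio queue cleared",
};

// Parse one raw WebSocket text frame and dispatch on its "event" field.
// Returns the handler's result, or null for malformed or unknown messages.
function routeMessage(raw) {
  let msg;
  try {
    msg = JSON.parse(raw);
  } catch {
    return null; // frame was not valid JSON
  }
  if (msg === null || typeof msg !== "object") return null;
  // Own-property check so names like "constructor" are not treated as events.
  const known = Object.prototype.hasOwnProperty.call(handlers, msg.event);
  return known ? handlers[msg.event](msg) : null;
}
```

With a WebSocket library of your choice, routeMessage would typically be called from the message callback with the frame's text.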

start

Plivo sends the start message immediately when the WebSocket connection is established. It contains metadata about the Stream and is only sent once at the beginning.
| Property | Type | Description |
|---|---|---|
| event | string | Always "start" |
| sequenceNumber | number | Message sequence number (starts at 1, increments for each subsequent message) |
| start.callId | string (UUID) | Unique identifier for the call |
| start.streamId | string (UUID) | Unique identifier for the stream |
| start.accountId | string | Your Plivo account ID |
| start.tracks | string[] | Audio tracks being streamed (e.g., ["inbound"], ["inbound", "outbound"]) |
| start.mediaFormat.encoding | string | Audio codec (e.g., "audio/x-mulaw") |
| start.mediaFormat.sampleRate | number | Sample rate in Hz (e.g., 8000) |
| extra_headers | string | Custom headers from the Stream XML extraHeaders attribute |
Example:
{
  "event": "start",
  "sequenceNumber": 1,
  "start": {
    "callId": "12345678-1234-1234-1234-123456789abc",
    "streamId": "87654321-4321-4321-4321-cba987654321",
    "accountId": "MAXXXXXXXXXXXXXXXXXX",
    "tracks": ["inbound"],
    "mediaFormat": {
      "encoding": "audio/x-mulaw",
      "sampleRate": 8000
    }
  },
  "extra_headers": "userId=12345;sessionId=abc-xyz"
}
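
The extra_headers value arrives as a single string, so most servers will want to split it into key/value pairs. The sketch below assumes the format seen in the example above (pairs separated by ";", key and value separated by the first "="); that format is inferred from the example, not a documented guarantee, and parseExtraHeaders is a hypothetical helper name.

```javascript
// Split an extra_headers string such as "userId=12345;sessionId=abc-xyz"
// into a plain object.
// ASSUMPTION: ";" separates pairs and the first "=" separates key from value.
function parseExtraHeaders(extraHeaders) {
  const result = {};
  if (typeof extraHeaders !== "string") return result;
  for (const pair of extraHeaders.split(";")) {
    const idx = pair.indexOf("=");
    if (idx <= 0) continue; // skip empty or malformed segments
    const key = pair.slice(0, idx).trim();
    const value = pair.slice(idx + 1).trim();
    if (key) result[key] = value;
  }
  return result;
}
```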

media

Plivo sends media messages continuously throughout the call. Each message contains a chunk of audio data from the caller.
| Property | Type | Description |
|---|---|---|
| event | string | Always "media" |
| sequenceNumber | number | Message sequence number (increments for each message) |
| streamId | string (UUID) | The unique identifier of the Stream |
| media.track | string | The audio track ("inbound" = audio from the caller) |
| media.timestamp | string | Unix timestamp in milliseconds |
| media.chunk | number | Chunk sequence number for this track |
| media.payload | string | Base64-encoded audio data |
| extra_headers | string | Custom headers from the Stream XML |
Example:
{
  "event": "media",
  "sequenceNumber": 42,
  "streamId": "87654321-4321-4321-4321-cba987654321",
  "media": {
    "track": "inbound",
    "timestamp": "1705312200000",
    "chunk": 41,
    "payload": "//uQxAAAAAANIAAAAAExBTUUzLjEwMFVV..."
  },
  "extra_headers": "userId=12345;sessionId=abc-xyz"
}
Audio chunk details:
  • Each chunk contains approximately 20 ms of audio
  • At 8 kHz with μ-law encoding (1 byte per sample): ~160 bytes per chunk
  • In Node.js, decode the payload with Buffer.from(payload, 'base64')
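
The chunk arithmetic above can be sketched in Node.js as follows. The example assumes the stream uses audio/x-mulaw, so one byte equals one sample, and it expands each byte to 16-bit linear PCM with the standard G.711 μ-law formula. The function names are illustrative.

```javascript
// Decode one G.711 mu-law byte to a signed 16-bit linear PCM sample.
function muLawToPcm16(byte) {
  const u = ~byte & 0xff; // mu-law bytes are stored bit-inverted
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return u & 0x80 ? -magnitude : magnitude;
}

// Decode a media.payload string into PCM samples and a duration.
// ASSUMPTION: the encoding is audio/x-mulaw (one byte per sample).
function decodeMediaPayload(payload, sampleRate = 8000) {
  const bytes = Buffer.from(payload, "base64");
  const pcm = new Int16Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) {
    pcm[i] = muLawToPcm16(bytes[i]);
  }
  return { pcm, durationMs: (bytes.length * 1000) / sampleRate };
}
```

A 160-byte chunk at 8000 Hz therefore yields 160 samples and a duration of 20 ms, matching the figures listed above.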

dtmf

Plivo sends a dtmf message when the caller presses a touch-tone key on their phone, typically in response to a prompt.
| Property | Type | Description |
|---|---|---|
| event | string | Always "dtmf" |
| sequenceNumber | number | Message sequence number |
| streamId | string (UUID) | The unique identifier of the Stream |
| dtmf.track | string | The audio track (always "inbound") |
| dtmf.digit | string | The key pressed (0-9, *, #, or A-D) |
| dtmf.timestamp | string | Unix timestamp in milliseconds |
| extra_headers | string | Custom headers from the Stream XML |
Example:
{
  "event": "dtmf",
  "sequenceNumber": 50,
  "streamId": "87654321-4321-4321-4321-cba987654321",
  "dtmf": {
    "track": "inbound",
    "digit": "5",
    "timestamp": "1705312250000"
  },
  "extra_headers": "userId=12345;sessionId=abc-xyz"
}
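
Because each keypress arrives as its own dtmf message, multi-digit input has to be assembled on your server. A minimal sketch, assuming you want to collect digits until the caller presses "#"; the terminator and the DigitCollector class are illustrative choices, not part of the protocol.

```javascript
// Collects dtmf.digit values until a terminator key is pressed.
class DigitCollector {
  constructor(terminator = "#") {
    this.terminator = terminator;
    this.digits = "";
  }

  // Feed one parsed message. Returns the completed input when the
  // terminator arrives, otherwise null. Non-dtmf messages are ignored.
  push(msg) {
    if (msg.event !== "dtmf") return null;
    const digit = msg.dtmf.digit;
    if (digit === this.terminator) {
      const entered = this.digits;
      this.digits = ""; // reset for the next entry
      return entered;
    }
    this.digits += digit;
    return null;
  }
}
```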

playedStream

Plivo sends the playedStream message when audio playback reaches a checkpoint that you previously set. Use this to track when specific audio segments finish playing.
| Property | Type | Description |
|---|---|---|
| event | string | Always "playedStream" |
| sequenceNumber | number | Message sequence number |
| streamId | string (UUID) | The unique identifier of the Stream |
| name | string | The checkpoint name you specified |
Example:
{
  "event": "playedStream",
  "sequenceNumber": 75,
  "streamId": "87654321-4321-4321-4321-cba987654321",
  "name": "greeting-complete"
}

clearedAudio

Plivo sends the clearedAudio message to confirm that the audio playback queue has been emptied after you send a clearAudio command.
| Property | Type | Description |
|---|---|---|
| event | string | Always "clearedAudio" |
| sequenceNumber | number | Message sequence number |
| streamId | string (UUID) | The unique identifier of the Stream |
Example:
{
  "event": "clearedAudio",
  "sequenceNumber": 80,
  "streamId": "87654321-4321-4321-4321-cba987654321"
}

Messages to Plivo

If you initiated a bidirectional Stream, your WebSocket server can send messages back to Plivo to play audio and control the Stream.
| Message Type | Description |
|---|---|
| playAudio | Send audio to be played to the caller |
| checkpoint | Mark a point in the audio queue for tracking |
| clearAudio | Interrupt and clear all queued audio |

playAudio

Send a playAudio message to play audio to the caller. Audio messages are buffered and played in the order received.
| Property | Type | Description |
|---|---|---|
| event | string | Always "playAudio" |
| media.contentType | string | Audio MIME type (must match stream’s contentType) |
| media.sampleRate | number | Sample rate in Hz (must match stream’s sample rate) |
| media.payload | string | Base64-encoded audio data |
Example:
{
  "event": "playAudio",
  "media": {
    "contentType": "audio/x-mulaw",
    "sampleRate": 8000,
    "payload": "//uQxAAAAAANIAAAAAExBTUUzLjEwMFVV..."
  }
}
⚠️ Important: The contentType and sampleRate must match the values specified in your Stream XML. Mismatched formats will cause audio playback issues.
⚠️ Warning: The media.payload should contain raw audio data only. Do not include audio file headers (e.g., WAV headers); they will cause playback distortion.
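
If your audio source is a WAV file, the header has to be removed before the samples go into media.payload. The sketch below locates the RIFF "data" chunk instead of assuming a fixed header length, then builds a playAudio message. It does not transcode: it assumes the samples inside the file are already in the stream's encoding and sample rate. Both function names are hypothetical.

```javascript
// Return the raw sample bytes of a WAV file by locating its "data" chunk.
// Buffers that do not start with a RIFF/WAVE header are returned unchanged.
function stripWavHeader(buf) {
  const isWav =
    buf.length >= 12 &&
    buf.toString("ascii", 0, 4) === "RIFF" &&
    buf.toString("ascii", 8, 12) === "WAVE";
  if (!isWav) return buf;
  let offset = 12; // the first chunk follows the 12-byte RIFF header
  while (offset + 8 <= buf.length) {
    const id = buf.toString("ascii", offset, offset + 4);
    const size = buf.readUInt32LE(offset + 4);
    const bodyStart = offset + 8;
    if (id === "data") {
      return buf.subarray(bodyStart, Math.min(bodyStart + size, buf.length));
    }
    offset = bodyStart + size + (size % 2); // chunks are padded to even length
  }
  throw new Error("WAV file has no data chunk");
}

// Build the JSON text of a playAudio message from raw or WAV-wrapped audio.
function buildPlayAudio(audio, contentType = "audio/x-mulaw", sampleRate = 8000) {
  return JSON.stringify({
    event: "playAudio",
    media: {
      contentType,
      sampleRate,
      payload: stripWavHeader(audio).toString("base64"),
    },
  });
}
```

The returned string can be passed directly to your WebSocket library's send function.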

checkpoint

Send a checkpoint message after sending audio to receive a playedStream notification when playback reaches that point. This helps you track which audio has been played.
| Property | Type | Description |
|---|---|---|
| event | string | Always "checkpoint" |
| streamId | string (UUID) | The unique identifier of the Stream |
| name | string | A custom name to identify this checkpoint |
Example:
{
  "event": "checkpoint",
  "streamId": "87654321-4321-4321-4321-cba987654321",
  "name": "greeting-complete"
}
Use cases:
  • Know when a specific response finishes playing
  • Coordinate follow-up actions after playback
  • Measure end-to-end latency from audio send to playback completion
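
The latency use case can be sketched by recording when each checkpoint is sent and subtracting that time when the matching playedStream message arrives. In this example, send stands for any function that transmits a string over your WebSocket, and now is an injectable clock; both, along with the CheckpointTracker class, are illustrative assumptions.

```javascript
// Tracks checkpoints from the moment they are sent until the matching
// playedStream message arrives.
class CheckpointTracker {
  constructor(send, now = Date.now) {
    this.send = send; // e.g. (text) => socket.send(text)
    this.now = now; // injectable clock, in milliseconds
    this.pending = new Map(); // checkpoint name -> send time
  }

  // Send a checkpoint message and remember when it was sent.
  mark(streamId, name) {
    this.pending.set(name, this.now());
    this.send(JSON.stringify({ event: "checkpoint", streamId, name }));
  }

  // Feed a parsed playedStream message. Returns the elapsed milliseconds,
  // or null if this name was never marked.
  onPlayedStream(msg) {
    const sentAt = this.pending.get(msg.name);
    if (sentAt === undefined) return null;
    this.pending.delete(msg.name);
    return this.now() - sentAt;
  }
}
```

Call mark right after the last playAudio message of a response, so the measured interval covers the audio queued before the checkpoint.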

clearAudio

Send a clearAudio message to immediately stop playback and empty the audio buffer. Use this to implement user interruption, for example when the caller starts speaking while audio is playing.
| Property | Type | Description |
|---|---|---|
| event | string | Always "clearAudio" |
| streamId | string (UUID) | The unique identifier of the Stream |
Example:
{
  "event": "clearAudio",
  "streamId": "87654321-4321-4321-4321-cba987654321"
}
After you send this message, Plivo responds with a clearedAudio confirmation.
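
A small piece of state helps avoid sending duplicate clearAudio messages while a confirmation is still outstanding. The sketch below is one possible approach; detecting that the caller has started speaking is left to your own logic, and the BargeInController class and its send parameter are hypothetical names.

```javascript
// Tracks whether a clearAudio request is still awaiting its
// clearedAudio confirmation.
class BargeInController {
  constructor(send, streamId) {
    this.send = send; // e.g. (text) => socket.send(text)
    this.streamId = streamId;
    this.awaitingClear = false;
  }

  // Call when your own speech detection decides the caller is talking.
  // Returns true if a clearAudio message was sent.
  interrupt() {
    if (this.awaitingClear) return false; // a request is already in flight
    this.awaitingClear = true;
    this.send(JSON.stringify({ event: "clearAudio", streamId: this.streamId }));
    return true;
  }

  // Feed every parsed message received from Plivo.
  onMessage(msg) {
    if (msg.event === "clearedAudio" && msg.streamId === this.streamId) {
      this.awaitingClear = false;
    }
  }
}
```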

Protocol Schema Reference

JSON Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "definitions": {
    "StartEvent": {
      "type": "object",
      "required": ["event", "sequenceNumber", "start", "extra_headers"],
      "properties": {
        "event": { "const": "start" },
        "sequenceNumber": { "type": "integer", "minimum": 1 },
        "start": {
          "type": "object",
          "required": ["callId", "streamId", "accountId", "tracks", "mediaFormat"],
          "properties": {
            "callId": { "type": "string", "format": "uuid" },
            "streamId": { "type": "string", "format": "uuid" },
            "accountId": { "type": "string" },
            "tracks": { "type": "array", "items": { "type": "string" } },
            "mediaFormat": {
              "type": "object",
              "required": ["encoding", "sampleRate"],
              "properties": {
                "encoding": { "type": "string" },
                "sampleRate": { "type": "integer" }
              }
            }
          }
        },
        "extra_headers": { "type": "string" }
      }
    },
    "MediaEvent": {
      "type": "object",
      "required": ["event", "sequenceNumber", "streamId", "media", "extra_headers"],
      "properties": {
        "event": { "const": "media" },
        "sequenceNumber": { "type": "integer" },
        "streamId": { "type": "string", "format": "uuid" },
        "media": {
          "type": "object",
          "required": ["track", "timestamp", "chunk", "payload"],
          "properties": {
            "track": { "type": "string" },
            "timestamp": { "type": "string" },
            "chunk": { "type": "integer" },
            "payload": { "type": "string", "contentEncoding": "base64" }
          }
        },
        "extra_headers": { "type": "string" }
      }
    },
    "DTMFEvent": {
      "type": "object",
      "required": ["event", "sequenceNumber", "streamId", "dtmf", "extra_headers"],
      "properties": {
        "event": { "const": "dtmf" },
        "sequenceNumber": { "type": "integer" },
        "streamId": { "type": "string", "format": "uuid" },
        "dtmf": {
          "type": "object",
          "required": ["track", "digit", "timestamp"],
          "properties": {
            "track": { "type": "string" },
            "digit": { "type": "string", "pattern": "^[0-9*#A-D]$" },
            "timestamp": { "type": "string" }
          }
        },
        "extra_headers": { "type": "string" }
      }
    },
    "PlayedStreamEvent": {
      "type": "object",
      "required": ["event", "sequenceNumber", "streamId", "name"],
      "properties": {
        "event": { "const": "playedStream" },
        "sequenceNumber": { "type": "integer" },
        "streamId": { "type": "string", "format": "uuid" },
        "name": { "type": "string" }
      }
    },
    "ClearedAudioEvent": {
      "type": "object",
      "required": ["event", "sequenceNumber", "streamId"],
      "properties": {
        "event": { "const": "clearedAudio" },
        "sequenceNumber": { "type": "integer" },
        "streamId": { "type": "string", "format": "uuid" }
      }
    },
    "PlayAudioEvent": {
      "type": "object",
      "required": ["event", "media"],
      "properties": {
        "event": { "const": "playAudio" },
        "media": {
          "type": "object",
          "required": ["contentType", "sampleRate", "payload"],
          "properties": {
            "contentType": { "type": "string" },
            "sampleRate": { "type": "integer" },
            "payload": { "type": "string", "contentEncoding": "base64" }
          }
        }
      }
    },
    "CheckpointEvent": {
      "type": "object",
      "required": ["event", "streamId", "name"],
      "properties": {
        "event": { "const": "checkpoint" },
        "streamId": { "type": "string", "format": "uuid" },
        "name": { "type": "string" }
      }
    },
    "ClearAudioEvent": {
      "type": "object",
      "required": ["event", "streamId"],
      "properties": {
        "event": { "const": "clearAudio" },
        "streamId": { "type": "string", "format": "uuid" }
      }
    }
  }
}
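
As a rough illustration of how these definitions can be applied, the sketch below checks an outgoing message against one definition. It is deliberately partial: it only understands "required", "const", and nested "properties", and ignores "type", "format", and "pattern", which a full JSON Schema validator would enforce. The definitions object is a one-entry excerpt of the schema above, and findProblems is a hypothetical helper.

```javascript
// Excerpt of the schema's "definitions"; load the full schema in practice.
const definitions = {
  ClearAudioEvent: {
    type: "object",
    required: ["event", "streamId"],
    properties: {
      event: { const: "clearAudio" },
      streamId: { type: "string", format: "uuid" },
    },
  },
};

// Returns a list of human-readable problems; an empty list means the
// value passed the (partial) checks.
function findProblems(schema, value, path = "$") {
  const problems = [];
  if ("const" in schema && value !== schema.const) {
    problems.push(`${path}: expected ${JSON.stringify(schema.const)}`);
  }
  const isObject = value !== null && typeof value === "object";
  for (const key of schema.required ?? []) {
    if (!isObject || !(key in value)) problems.push(`${path}.${key}: missing`);
  }
  for (const [key, sub] of Object.entries(schema.properties ?? {})) {
    if (isObject && key in value) {
      problems.push(...findProblems(sub, value[key], `${path}.${key}`));
    }
  }
  return problems;
}
```

Running such a check before sending can catch a misspelled event name or a missing streamId early in development.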