Introducing Real-Time Audio Streaming

Jul 12, 2023

Building truly intelligent voice applications often means grappling with technical limitations such as: 

  • Lack of real-time audio access
  • Difficulty integrating with AI models
  • Limited flexibility for scaling advanced use cases

Plivo’s audio streaming makes this easier. It lets you send real-time call audio to your applications, enabling AI-powered voice agents to respond dynamically. It also comes with features to analyze customer sentiment during calls and to automate tasks such as routing calls or transferring them to human agents whenever necessary.

In this blog, we’ll explore how audio streaming solves voice infrastructure challenges and helps you deliver next-level customer experiences. Let’s dive in!

How Does Audio Streaming Work?

With audio streaming, businesses transmit the raw media of a live call in real time to their applications or third-party systems.

This illustration shows how a call center might use audio streaming.

Plivo lets you stream audio by using XML instructions or APIs. First, you get a number with Plivo (by signing up on the console), then connect it to an XML application.

Here’s how it works:

XML instructions

With the Stream XML element, you can stream raw media from a live phone call over a WebSocket connection. To begin streaming, return XML like this during the call:

<Response>
<Stream streamTimeout="3600" keepCallAlive="true" bidirectional="true" contentType="audio/x-mulaw;rate=8000" statusCallbackUrl="https://yourdomain.com/callbacks">wss://<yourstream>.ngrok.io/audiostream</Stream>
</Response>

See our XML reference documentation for complete details.
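
If your answer URL is served by a Python application, the same document can be generated rather than hand-written. The following is a minimal sketch using only the standard library; the function name `build_stream_response` and the example WebSocket URL are illustrative assumptions, while the attribute names and values are copied from the XML above.

```python
import xml.etree.ElementTree as ET


def build_stream_response(ws_url: str, callback_url: str) -> str:
    """Build a <Response><Stream>...</Stream></Response> document as a string."""
    response = ET.Element("Response")
    stream = ET.SubElement(
        response,
        "Stream",
        {
            # Attribute values mirror the XML example in this post.
            "streamTimeout": "3600",
            "keepCallAlive": "true",
            "bidirectional": "true",
            "contentType": "audio/x-mulaw;rate=8000",
            "statusCallbackUrl": callback_url,
        },
    )
    # The element text is the WebSocket URL that receives the audio.
    stream.text = ws_url
    return ET.tostring(response, encoding="unicode")
```

Calling `build_stream_response("wss://example.ngrok.io/audiostream", "https://yourdomain.com/callbacks")` returns a string you can send back as the body of your answer URL response. Building the XML with a library also escapes any special characters in the URLs for you.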

API integration

Alternatively, you can use Plivo APIs to initiate and manage audio streams.

curl -i --user AUTH_ID:AUTH_TOKEN \
      -H "Content-Type: application/json" \
      -d '{"service_url": "wss://.ngrok.io/audiostream"}' \
      https://api.plivo.com/v1/Account/{auth_id}/Call/{call_uuid}/Stream/

See our audio streaming API reference documentation for complete details.
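
For applications that start streams from Python, the same request could be prepared with the standard library, as sketched below. The endpoint path, JSON field, and basic authentication come from the curl command above; the helper name `build_stream_request` is an assumption for illustration, and the sketch only builds the request without sending it.

```python
import base64
import json
import urllib.request

API_BASE = "https://api.plivo.com/v1/Account"


def build_stream_request(auth_id, auth_token, call_uuid, service_url):
    """Prepare (but do not send) the POST request that starts an audio stream."""
    url = f"{API_BASE}/{auth_id}/Call/{call_uuid}/Stream/"
    body = json.dumps({"service_url": service_url}).encode("utf-8")
    # HTTP basic auth, equivalent to curl's --user AUTH_ID:AUTH_TOKEN.
    credentials = base64.b64encode(
        f"{auth_id}:{auth_token}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
    )
```

To send it, pass the returned object to `urllib.request.urlopen`. Keeping construction separate from sending makes the request easy to inspect and test without touching a live call.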

Audio Streaming Use Cases 

Businesses can use audio streaming in multiple ways, especially in tandem with machine learning-based engines and AI voice agents, to enhance value and improve customer experience.

Call centers can leverage Plivo’s audio streaming to extract insights from raw audio data, perform sentiment analysis, implement speech recognition, and analyze voice-related data, enhancing operational efficiency.

You can integrate Plivo’s audio streaming with AI/ML-based tools to provide real-time transcription services.

Similarly, you can integrate audio streams with third-party tools to facilitate live translation of audio content, enabling effective communication across different languages during conferences and meetings.

Here are some key use cases of Plivo audio streaming: 

Smart IVR with dynamic call routing

Core Use Case: Transforming traditional IVR systems into smart, AI-driven IVRs that enable natural language understanding and intelligent call routing.

Example: A telecommunications company implements an AI-powered IVR where customers can state their concerns naturally (e.g., "I want to upgrade my plan" or "I have a billing issue") instead of navigating menus.

The system interprets their input, evaluates sentiment, and routes the call to the appropriate team or provides immediate automated assistance, such as sending payment links or resolving common queries. If sentiment analysis detects frustration, the system escalates the call to a specialized agent for de-escalation.
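
The routing step in this example can be pictured as a small decision function that runs on the transcript and sentiment score your application derives from the streamed audio. The sketch below is purely illustrative: the keyword lists, queue names, sentiment scale (-1.0 to 1.0), and threshold are all assumptions, and a production system would more likely use an NLU or LLM-based intent model than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical keyword-to-queue map, checked in order.
INTENT_KEYWORDS = {
    "billing": ("bill", "charge", "payment", "invoice"),
    "upgrades": ("upgrade", "plan"),
}
# Assumed sentiment scale: -1.0 (very negative) to 1.0 (very positive).
FRUSTRATION_THRESHOLD = -0.5


@dataclass
class RoutingDecision:
    queue: str
    escalated: bool


def route_call(transcript: str, sentiment: float) -> RoutingDecision:
    """Pick a queue from the caller's words, escalating frustrated callers first."""
    if sentiment <= FRUSTRATION_THRESHOLD:
        return RoutingDecision(queue="de-escalation", escalated=True)
    text = transcript.lower()
    for queue, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return RoutingDecision(queue=queue, escalated=False)
    # No recognized intent: fall back to a general queue.
    return RoutingDecision(queue="general", escalated=False)
```

Checking sentiment before intent means a frustrated caller reaches a specialized agent regardless of what they asked about, which matches the escalation behavior described above.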

24/7 customer support

Core Use Case: Maintaining consistent customer service quality beyond business hours using AI voice agents.

Example: An e-commerce platform deploys an AI voice bot to assist customers during off-hours. If a customer calls at midnight to track a delayed order, the bot retrieves real-time shipping updates from the database, informs the customer, and resolves their issue without human intervention.

Real-time speech recognition, transcription, and analysis

Core Use Case: Transcribing calls in real time by converting spoken language into text for immediate analysis, record-keeping, and automated processing.
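
Most speech-to-text engines expect linear PCM, so a transcription pipeline usually begins by decoding the streamed audio. The sketch below assumes each WebSocket message is JSON with an `event` field and a base64-encoded `media.payload` field; that message shape is an assumption, so check the audio streaming documentation for the actual schema. The decoder follows the standard G.711 mu-law expansion, matching the `audio/x-mulaw;rate=8000` content type used earlier in this post.

```python
import base64
import json
import struct


def mulaw_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte into a signed 16-bit PCM sample."""
    byte = ~byte & 0xFF  # mu-law bytes are stored bit-inverted
    magnitude = (((byte & 0x0F) << 3) + 0x84) << ((byte & 0x70) >> 4)
    return (0x84 - magnitude) if byte & 0x80 else (magnitude - 0x84)


def decode_media_message(raw_message: str) -> bytes:
    """Return little-endian PCM16 audio for a 'media' event, or b'' otherwise.

    The 'event' and 'media.payload' field names are assumptions.
    """
    message = json.loads(raw_message)
    if message.get("event") != "media":
        return b""
    mulaw = base64.b64decode(message["media"]["payload"])
    samples = [mulaw_to_pcm16(b) for b in mulaw]
    return struct.pack(f"<{len(samples)}h", *samples)
```

The returned bytes can be appended to a buffer and forwarded to whichever speech recognition service you use.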

Personalized training and real-time coaching for agents

Core Use Case: Leveraging live call analysis to provide actionable feedback to agents for skill enhancement.

Example: During a customer call, the system evaluates the agent's tone and adherence to protocols, offering real-time coaching tips, such as suggesting a more empathetic response or adjusting the script to align with customer sentiment.

Outbound call automation with AI voice agents

Core Use Case: Automating outbound communication to improve engagement, gather insights, and provide proactive assistance to customers.

Promotional messages

Example: A retail company uses AI voice agents to inform customers about limited-time discounts and exclusive deals, personalizing the message based on purchase history.

Customer feedback surveys

Example: After a service interaction, the system calls customers to collect feedback through a quick, automated survey, ensuring a seamless user experience.

Proactive customer support

Example: An insurance company proactively calls customers to remind them of upcoming policy renewal deadlines, reducing the likelihood of service lapses.

Post-delivery feedback and assistance

Example: After delivering a product, an AI agent contacts customers to confirm satisfaction and offer assistance if issues arise, such as initiating returns or troubleshooting.

Automation of voice channel operations

Core Use Case: Streamlining routine voice-based interactions to improve efficiency, reduce costs, and enhance customer satisfaction.

Appointment booking

Example: A healthcare provider uses voice agents to schedule, confirm, or reschedule patient appointments via calls, reducing the need for manual coordination.

Service-related queries

Example: A telecom provider deploys voice agents to answer common service queries, such as "How can I change my plan?" or "What’s my current data usage?"

Delivery status updates

Example: A logistics company enables customers to call and inquire about shipment statuses, with the bot providing real-time updates from the system.

Accepting orders on calls

Example: A restaurant uses voice agents to take orders over the phone, confirm details, and provide estimated delivery times without requiring human intervention.

Get Started with Audio Streaming with Plivo 

By integrating audio streaming with AI voice agents, businesses can provide advanced customer support, gain valuable insights into conversations, and improve customer interactions. 

Couple it with speech recognition, AI call analysis, and dual-channel recording, and you have a kickass customer support toolkit.

But to make the most of all these features, you need a CPaaS solution with superior voice infrastructure. Plivo offers a superior voice network for launching context-aware voice agents and voice assistants, trained on your proprietary knowledge base, that transform the customer experience.

Here’s why Plivo should be your first choice:

  • Superior call quality: Plivo ensures superior call quality through its partnerships with Tier 1 carriers and optimized routing.
  • Low latency: Global infrastructure enables low-latency communications. Our PoPs are in seven locations (California, Virginia, Frankfurt, Mumbai, Singapore, Sydney, São Paulo) across five continents.
  • High availability and uptime: With redundant infrastructure across multiple geographies and at least three local carrier connections across countries, Zentrunk promises 99.99% uptime.
  • Full redundancy: Redundant links reroute traffic over backup networks in less than two seconds if the backbone fails.

Plivo integrates with the top players in the market, from STT providers to LLMs and TTS providers.

The best part? You get started by paying 40% less than all competitors. 

All you have to do is:

  1. Sign up with Plivo.
  2. Procure a number via the API or console.
  3. Associate the number with an application responsible for initiating a call to the agent and establishing audio streaming over the WebSocket to your application. A sample reference is shown below.
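
As one possible reference for step 3, here is a minimal sketch of an application endpoint that answers a call with streaming instructions, written as a standard-library WSGI app. It covers only the streaming part of the step, not dialing the agent. The WebSocket URL is a hypothetical placeholder, the port is arbitrary, and the `application/xml` content type is an assumption.

```python
from wsgiref.simple_server import make_server

# Hypothetical placeholder: replace with your own WebSocket endpoint.
WS_URL = "wss://example.ngrok.io/audiostream"

STREAM_XML = (
    "<Response>"
    '<Stream bidirectional="true" keepCallAlive="true" '
    'contentType="audio/x-mulaw;rate=8000">'
    f"{WS_URL}"
    "</Stream>"
    "</Response>"
)


def answer_app(environ, start_response):
    """WSGI app: reply to any request with the streaming XML."""
    body = STREAM_XML.encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "application/xml"),  # assumed content type
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]


def serve(port: int = 8000) -> None:
    """Run the app locally; expose it publicly (for example, via a tunnel)."""
    with make_server("", port, answer_app) as server:
        server.serve_forever()
```

Call `serve()` to start the endpoint, then set the application's answer URL to its public address so that incoming calls to your number receive the XML.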

At Plivo, audio streaming is priced at $0.003 per minute per stream, over and above the expected charges for voice minutes associated with a call.
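
To estimate the streaming portion of a bill, you can multiply the per-minute rate by minutes and streams. The sketch below assumes charges scale linearly with no rounding rules or minimums (the actual billing increments are not stated here), and it excludes the voice-minute charges for the call itself.

```python
from decimal import Decimal

# USD per minute per stream, as quoted in this post (subject to change).
STREAM_RATE_PER_MINUTE = Decimal("0.003")


def streaming_cost(minutes: int, streams: int = 1) -> Decimal:
    """Estimate the audio streaming charge in USD, excluding voice minutes."""
    return STREAM_RATE_PER_MINUTE * minutes * streams
```

For example, 1,000 minutes on a single stream works out to 0.003 × 1,000 = $3.00, and 10,000 minutes across two streams to $60.00.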

Note: Pricing is subject to change; check our pricing page for the most up-to-date information.

If you don’t already have a Plivo account, sign up today for free.

You can also contact the Plivo team to find out how we can help with your use case and to get volume pricing.

Supercharge your customers' interactions through audio streaming with Plivo-powered AI voice agents.

