Pipecat’s RTVI (Real-Time Voice Interaction) protocol provides a standardized communication layer between clients and servers for building real-time voice and multimodal applications. It handles the synchronization of user and bot interactions, transcriptions, LLM processing, and text-to-speech delivery. This page gives an overview of RTVI from the server’s perspective and shows how to use it in your bot applications.

RTVI Protocol

A complete specification of the RTVI protocol for client-server communication.

Architecture

RTVI operates with two primary components:
  1. RTVIProcessor - A frame processor residing in the pipeline that serves as the entry point for receiving messages from the client and sending messages to it.
  2. RTVIObserver - An observer that monitors pipeline events and translates them into client-compatible messages, handling:
    • Speaking state changes
    • Transcription updates
    • LLM responses
    • TTS events
    • Performance metrics
RTVI is enabled by default. When you create a PipelineTask, it automatically adds RTVIProcessor to the start of your pipeline and registers an RTVIObserver. The default on_client_ready handler calls set_bot_ready() automatically.
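
To make the division of labor concrete, here is a minimal sketch of the observer idea in plain Python, with no Pipecat imports. It is not Pipecat's implementation: SketchObserver, the internal event names on the left of the mapping, and the send callback are all hypothetical names invented for illustration. Only the message type strings on the right come from the protocol's event list.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical internal event names (left) mapped to RTVI message types
# (right). The left-hand names are invented for this sketch and are not
# Pipecat frame classes.
EVENT_TO_MESSAGE_TYPE = {
    "user_started_speaking": "user-started-speaking",
    "user_stopped_speaking": "user-stopped-speaking",
    "bot_started_speaking": "bot-started-speaking",
    "bot_stopped_speaking": "bot-stopped-speaking",
    "transcription": "user-transcription",
    "llm_text": "bot-llm-text",
    "tts_text": "bot-tts-text",
}


@dataclass
class SketchObserver:
    """Watches pipeline events and forwards client-facing messages."""

    # Stand-in for the processor's outbound path to the client.
    send: Callable[[dict], None]

    def on_event(self, name: str, data: Optional[dict] = None) -> bool:
        message_type = EVENT_TO_MESSAGE_TYPE.get(name)
        if message_type is None:
            return False  # events with no client equivalent are ignored
        self.send({"type": message_type, "data": data or {}})
        return True


outbox = []
observer = SketchObserver(send=outbox.append)
observer.on_event("user_started_speaking")
observer.on_event("transcription", {"text": "hello"})
observer.on_event("internal_heartbeat")  # not mapped, so nothing is sent
```

The point of the split is that pipeline code never builds client messages itself: the observer watches events, and everything bound for the client goes out through one path.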

Basic Example

With automatic RTVI setup, your pipeline code can focus on core functionality:
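from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask

# transport, stt, llm, tts, and context_aggregator are assumed to be
# created earlier in your bot code.
pipeline = Pipeline(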
    [
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ]
)

# Create the task; RTVIProcessor and RTVIObserver are added automatically
task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True,
    ),
)

# Access the RTVI processor via task.rtvi
@task.rtvi.event_handler("on_client_ready")
async def on_client_ready(rtvi):
    # set_bot_ready() is called automatically; add custom logic here
    await task.queue_frames([LLMRunFrame()])

# Handle participant disconnection
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
    await task.cancel()

# Run the pipeline
runner = PipelineRunner()
await runner.run(task)

Customizing RTVI

You can customize RTVI behavior through PipelineTask parameters:
from pipecat.processors.frameworks.rtvi import RTVIProcessor, RTVIObserverParams

task = PipelineTask(
    pipeline,
    rtvi_processor=RTVIProcessor(),  # Provide your own processor
    rtvi_observer_params=RTVIObserverParams(...),  # Customize observer
)
To disable RTVI entirely:
task = PipelineTask(pipeline, enable_rtvi=False)

Protocol Flow

  1. Client connects and sends a client-ready message
  2. Server responds with bot-ready and initial configuration
  3. Client and server exchange real-time events:
    • Speaking state changes (user-started-speaking, user-stopped-speaking, bot-started-speaking, bot-stopped-speaking)
    • Transcriptions (user-transcription, bot-output)
    • LLM processing (bot-llm-started, bot-llm-stopped, bot-llm-text, llm-function-call)
    • TTS events (bot-tts-started, bot-tts-stopped, bot-tts-text, bot-tts-audio)
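
The flow above can be sketched as a small state machine in plain Python. This is an illustration under stated assumptions rather than the protocol's reference behavior: the envelope fields (label, type, id, data), the "rtvi-ai" label value, echoing the request id in the reply, the empty config placeholder, and the rule that events are held back until the handshake completes are all assumptions of this sketch. SketchServer and make_message are hypothetical names. Consult the RTVI protocol specification for the actual message format.

```python
import json
import uuid

RTVI_LABEL = "rtvi-ai"  # assumed envelope label; check the protocol spec


def make_message(msg_type, data=None, msg_id=None):
    """Build a message envelope (field names are assumptions of this sketch)."""
    return {
        "label": RTVI_LABEL,
        "type": msg_type,
        "id": msg_id or str(uuid.uuid4()),
        "data": data or {},
    }


class SketchServer:
    """Tracks handshake state for one client connection."""

    def __init__(self):
        self.client_ready = False

    def handle(self, raw):
        message = json.loads(raw)
        if message.get("type") == "client-ready":
            # Step 1 received; step 2: reply with bot-ready.
            # Echoing the id and the empty config are placeholders.
            self.client_ready = True
            return make_message("bot-ready", {"config": {}}, message.get("id"))
        return None  # other message types are out of scope here

    def emit(self, msg_type, data=None):
        # Step 3: in this sketch, events flow only after the handshake.
        if not self.client_ready:
            return None
        return make_message(msg_type, data)


server = SketchServer()
early = server.emit("bot-started-speaking")  # dropped: no handshake yet
reply = server.handle(json.dumps(make_message("client-ready", msg_id="1")))
event = server.emit("user-transcription", {"text": "hi", "final": True})
```

In a real bot none of this is written by hand: RTVIProcessor handles the client-ready and bot-ready exchange, and RTVIObserver produces the event messages.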

Key Components

RTVIProcessor

Configure and manage RTVI services, actions, and client communication

RTVIObserver

Translate internal pipeline events to standardized client messages

Client Integration

RTVI is implemented in Pipecat client SDKs, providing a high-level API to interact with the protocol. Visit the Pipecat Client SDKs documentation:

Client SDKs

Learn how to implement RTVI on the client side with our JavaScript, React, and mobile SDKs.