
Overview

Inworld provides high-quality, low-latency speech synthesis through two service implementations: InworldTTSService, for real-time, minimal-latency use cases over WebSockets, and InworldHttpTTSService, for streaming and non-streaming use cases over HTTP. Inworld TTS supports 12+ languages, timestamps, custom pronunciation, and instant voice cloning.

Inworld TTS API Reference

Pipecat’s API methods for Inworld TTS integration

Example Implementation (Websockets)

Complete example with Inworld TTS

Inworld Documentation

Official Inworld TTS API documentation

Inworld Portal

Create and manage voice models

Installation

To use Inworld services, no additional dependencies are required beyond the base installation:
pip install "pipecat-ai"

Prerequisites

Inworld Account Setup

Before using Inworld TTS services, you need:
  1. Inworld Account: Sign up at Inworld Studio
  2. API Key: Generate an API key from your account dashboard
  3. Voice Selection: Choose from available voice models

Required Environment Variables

  • INWORLD_API_KEY: Your Inworld API key for authentication
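
A minimal sketch of reading the key at startup, so that a missing variable fails early instead of surfacing later as an authentication error. The helper name load_inworld_api_key is hypothetical and is not part of Pipecat or Inworld.

```python
import os


def load_inworld_api_key(env=None):
    """Return the Inworld API key from the environment, or raise if it is unset.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    # Allow a mapping to be injected for testing; default to the real environment.
    env = os.environ if env is None else env
    key = env.get("INWORLD_API_KEY", "").strip()
    if not key:
        raise RuntimeError("INWORLD_API_KEY is not set")
    return key
```

The returned value can then be passed as api_key to either service.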

Configuration

InworldTTSService

WebSocket-based service for lowest latency streaming.
api_key
str
required
Inworld API key.
voice_id
str
default:"Ashley"
deprecated
ID of the voice to use for synthesis. Deprecated in v0.0.105. Use settings=InworldTTSService.Settings(voice=...) instead.
model
str
default:"inworld-tts-1.5-max"
deprecated
ID of the model to use for synthesis. Deprecated in v0.0.105. Use settings=InworldTTSService.Settings(model=...) instead.
url
str
URL of the Inworld WebSocket API.
sample_rate
int
default:"None"
Audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
encoding
str
default:"LINEAR16"
Audio encoding format.
text_aggregation_mode
TextAggregationMode
default:"TextAggregationMode.SENTENCE"
Controls how incoming text is aggregated before synthesis. SENTENCE (default) buffers text until sentence boundaries, producing more natural speech. TOKEN streams tokens directly for lower latency. Import from pipecat.services.tts_service.
aggregate_sentences
bool
default:"None"
deprecated
Deprecated in v0.0.104. Use text_aggregation_mode instead.
append_trailing_space
bool
default:"True"
Whether to append a trailing space to text before sending to TTS.
params
InputParams
default:"None"
deprecated
Deprecated in v0.0.105. Use settings=InworldTTSService.Settings(...) instead.
settings
InworldTTSService.Settings
default:"None"
Runtime-configurable settings. See InworldTTSService Settings below.
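
To illustrate what sentence-level aggregation (text_aggregation_mode above) means in practice, the following sketch buffers streamed tokens until a sentence boundary appears. It is a simplified illustration, not Pipecat's implementation: the function name aggregate_sentences and the boundary rule (sentence punctuation followed by whitespace) are assumptions. In token mode, by contrast, each token would be forwarded as soon as it arrives.

```python
import re

# Sentence-ending punctuation followed by whitespace. This is a simplification:
# it does not handle abbreviations, decimals, or non-Latin punctuation.
_SENTENCE_END = re.compile(r"[.!?]\s")


def aggregate_sentences(tokens):
    """Yield complete sentences from an iterable of streamed text tokens."""
    buffer = ""
    for token in tokens:
        buffer += token
        # A single token may complete more than one sentence.
        while True:
            match = _SENTENCE_END.search(buffer)
            if match is None:
                break
            end = match.end()
            yield buffer[:end].strip()
            buffer = buffer[end:]
    # Flush whatever remains when the stream ends.
    if buffer.strip():
        yield buffer.strip()
```

The trade-off is visible in the sketch: nothing is emitted until a boundary is seen, which adds latency but gives the synthesizer whole sentences to work with.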

InworldTTSService Settings

Runtime-configurable settings passed via the settings constructor argument using InworldTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | None | Model identifier. (Inherited.) |
| voice | str | None | Voice identifier. (Inherited.) |
| language | Language \| str | None | Language for synthesis. (Inherited.) |
| speaking_rate | float | NOT_GIVEN | Speaking rate for speech synthesis. |
| temperature | float | NOT_GIVEN | Temperature for speech synthesis. |

InworldHttpTTSService

HTTP-based service supporting both streaming and non-streaming modes.
api_key
str
required
Inworld API key.
aiohttp_session
aiohttp.ClientSession
required
aiohttp ClientSession for HTTP requests.
voice_id
str
default:"Ashley"
deprecated
ID of the voice to use for synthesis. Deprecated in v0.0.105. Use settings=InworldHttpTTSService.Settings(voice=...) instead.
model
str
default:"inworld-tts-1.5-max"
deprecated
ID of the model to use for synthesis. Deprecated in v0.0.105. Use settings=InworldHttpTTSService.Settings(model=...) instead.
streaming
bool
default:"True"
Whether to use streaming mode.
sample_rate
int
default:"None"
Audio sample rate in Hz.
encoding
str
default:"LINEAR16"
Audio encoding format.
params
InputParams
default:"None"
deprecated
Deprecated in v0.0.105. Use settings=InworldHttpTTSService.Settings(...) instead.
settings
InworldHttpTTSService.Settings
default:"None"
Runtime-configurable settings. See InworldTTSService Settings above.

Usage

Basic Setup (WebSocket)

import os

from pipecat.services.inworld import InworldTTSService

# Read the API key from the environment and select a voice via Settings.
tts = InworldTTSService(
    api_key=os.getenv("INWORLD_API_KEY"),
    settings=InworldTTSService.Settings(
        voice="Ashley",
    ),
)

With Custom Settings

tts = InworldTTSService(
    api_key=os.getenv("INWORLD_API_KEY"),
    settings=InworldTTSService.Settings(
        voice="Ashley",
        model="inworld-tts-1.5-max",
        temperature=0.8,
        speaking_rate=1.1,
    ),
)

HTTP Service

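import os

import aiohttp
from pipecat.services.inworld import InworldHttpTTSService

# Must run inside a coroutine: the session is shared with the service.
async with aiohttp.ClientSession() as session:
    tts = InworldHttpTTSService(
        api_key=os.getenv("INWORLD_API_KEY"),
        aiohttp_session=session,
        # voice_id is deprecated as of v0.0.105; pass the voice via Settings.
        settings=InworldHttpTTSService.Settings(
            voice="Ashley",
        ),
        streaming=True,
    )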
The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.
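
For codebases that still pass the deprecated voice_id and model arguments, the renaming can be sketched as a small helper that separates legacy arguments from the rest. The helper migrate_legacy_kwargs is hypothetical and not part of Pipecat; the mapping covers only the two renames documented on this page and does not handle params.

```python
# Deprecated constructor arguments and the Settings fields that replace them,
# as listed in the configuration sections of this page.
_LEGACY_TO_SETTINGS = {"voice_id": "voice", "model": "model"}


def migrate_legacy_kwargs(kwargs):
    """Split constructor kwargs into (remaining kwargs, Settings kwargs).

    Hypothetical helper for illustration; the input dict is not modified.
    """
    remaining = dict(kwargs)
    settings = {}
    for old_name, new_name in _LEGACY_TO_SETTINGS.items():
        if old_name in remaining:
            settings[new_name] = remaining.pop(old_name)
    return remaining, settings
```

The two results would then be passed as InworldTTSService(**remaining, settings=InworldTTSService.Settings(**settings)), or the InworldHttpTTSService equivalent.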

Notes

  • WebSocket vs HTTP: The WebSocket service (InworldTTSService) provides the lowest latency with bidirectional streaming and supports multiple independent audio contexts per connection (max 5). The HTTP service supports both streaming and non-streaming modes via the streaming parameter.
  • Word timestamps: Both services provide word-level timestamps for synchronized text display. Timestamps are tracked cumulatively across utterances within a turn.
  • Auto mode: When auto_mode=True (default), the server controls flushing of buffered text for optimal latency and quality. This is recommended when text is sent in full sentences or phrases (i.e., when using text_aggregation_mode=TextAggregationMode.SENTENCE).
  • Keepalive: The WebSocket service sends periodic keepalive messages every 60 seconds to maintain the connection.
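
The cumulative timestamp tracking mentioned in the word timestamps note can be pictured with a short sketch. It assumes each utterance reports word timings relative to its own start, as (word, start_seconds, end_seconds) tuples; that input shape and the function name accumulate_timestamps are assumptions for illustration, not the format Pipecat or Inworld actually uses.

```python
def accumulate_timestamps(utterances):
    """Merge per-utterance word timings into one turn-level timeline.

    `utterances` is a list of utterances; each utterance is a list of
    (word, start_seconds, end_seconds) tuples relative to that utterance's
    start. This input shape is assumed for illustration.
    """
    offset = 0.0
    timeline = []
    for words in utterances:
        utterance_end = 0.0
        for word, start, end in words:
            # Shift each word by the total duration of earlier utterances.
            timeline.append((word, offset + start, offset + end))
            utterance_end = max(utterance_end, end)
        offset += utterance_end
    return timeline
```

With this shape, a word that starts at 0.0 in the second utterance lands at the end time of the first, so text display stays in sync across the whole turn.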

Event Handlers

Inworld TTS supports the standard service connection events:
| Event | Description |
| --- | --- |
| on_connected | Connected to Inworld WebSocket |
| on_disconnected | Disconnected from Inworld WebSocket |
| on_connection_error | WebSocket connection error occurred |
@tts.event_handler("on_connected")
async def on_connected(service):
    print("Connected to Inworld")