
Overview

AWSTranscribeSTTService provides real-time speech recognition through Amazon Transcribe’s WebSocket streaming API. It supports interim results, multiple languages, and configurable audio processing parameters.

AWS Transcribe STT API Reference

Pipecat’s API methods for AWS Transcribe integration

Example Implementation

Complete example with AWS services integration

AWS Transcribe Documentation

Official AWS Transcribe documentation and features

AWS Console

Access AWS Transcribe services and IAM setup

Installation

To use AWS Transcribe services, install the required dependency:
pip install "pipecat-ai[aws]"

Prerequisites

AWS Account Setup

Before using AWS Transcribe STT services, you need:
  1. AWS Account: Sign up at AWS Console
  2. IAM User: Create an IAM user with Amazon Transcribe permissions
  3. Credentials: Set up AWS access keys and region configuration

Required Environment Variables

  • AWS_ACCESS_KEY_ID: Your AWS access key ID
  • AWS_SECRET_ACCESS_KEY: Your AWS secret access key
  • AWS_SESSION_TOKEN: Session token (if using temporary credentials)
  • AWS_REGION: AWS region (defaults to “us-east-1”)
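A quick check at startup can catch missing credentials before the service tries to connect. The following is a minimal sketch, not part of Pipecat: the helper name `missing_aws_env_vars` is hypothetical, and it assumes only the two key variables are strictly required, since the session token is conditional and the region has a default.

```python
import os

# Assumption: only the access key pair is strictly required. The session
# token applies to temporary credentials only, and the region has a default.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env_vars(env=None):
    """Return the names of required AWS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```

Passing a plain dictionary as `env` makes the check easy to exercise in tests without touching the real environment.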

Configuration

  • api_key (str, default: None): AWS secret access key. If None, uses the AWS_SECRET_ACCESS_KEY environment variable.
  • aws_access_key_id (str, default: None): AWS access key ID. If None, uses the AWS_ACCESS_KEY_ID environment variable.
  • aws_session_token (str, default: None): AWS session token for temporary credentials. If None, uses the AWS_SESSION_TOKEN environment variable.
  • region (str, default: None): AWS region for the service. If None, uses the AWS_REGION environment variable (defaults to "us-east-1").
  • sample_rate (int, default: None): Audio sample rate in Hz. When None, uses the pipeline’s configured sample rate. AWS Transcribe only supports 8000 or 16000 Hz; other values fall back to 16000 Hz at connect time.
  • language (Language, default: Language.EN, deprecated): Language for transcription. Supports a wide range of languages including English, Spanish, French, German, and many more. See AWS Transcribe supported languages. Deprecated in v0.0.105. Use settings=AWSTranscribeSTTService.Settings(...) instead.
  • settings (AWSTranscribeSTTService.Settings, default: None): Runtime-configurable settings for the STT service. See Settings below.
  • ttfs_p99_latency (float, default: AWS_TRANSCRIBE_TTFS_P99): P99 latency from speech end to final transcript, in seconds. Override for your deployment.
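The "If None, uses the environment variable" rule for the credential and region arguments can be pictured as follows. This is an illustrative sketch of the documented fallback order only, not the service's actual code; `ResolvedAWSConfig` and `resolve_aws_config` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class ResolvedAWSConfig:
    """Hypothetical container for the values the service would end up using."""
    secret_access_key: Optional[str]
    access_key_id: Optional[str]
    session_token: Optional[str]
    region: str


def _pick(explicit: Optional[str], env: Mapping[str, str], name: str) -> Optional[str]:
    # An explicit argument wins; the environment is consulted only for None.
    return explicit if explicit is not None else env.get(name)


def resolve_aws_config(
    api_key: Optional[str] = None,
    aws_access_key_id: Optional[str] = None,
    aws_session_token: Optional[str] = None,
    region: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> ResolvedAWSConfig:
    env = os.environ if env is None else env
    return ResolvedAWSConfig(
        secret_access_key=_pick(api_key, env, "AWS_SECRET_ACCESS_KEY"),
        access_key_id=_pick(aws_access_key_id, env, "AWS_ACCESS_KEY_ID"),
        session_token=_pick(aws_session_token, env, "AWS_SESSION_TOKEN"),
        # The region is the only value with a documented hard default.
        region=_pick(region, env, "AWS_REGION") or "us-east-1",
    )
```

Note that api_key maps to the secret access key, not the access key ID, which is easy to mix up when passing credentials explicitly.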

Settings

Runtime-configurable settings passed via the settings constructor argument using AWSTranscribeSTTService.Settings(...). These can be updated mid-conversation with STTUpdateSettingsFrame. See Service Settings for details.
Parameter   Type             Default       Description
model       str              None          STT model identifier. (Inherited from base STT settings.)
language    Language | str   Language.EN   Language for transcription. (Inherited from base STT settings.)

Usage

Basic Setup

import os

from pipecat.services.aws.stt import AWSTranscribeSTTService

# Read credentials from the environment; the region falls back to us-east-1.
stt = AWSTranscribeSTTService(
    api_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    region=os.getenv("AWS_REGION", "us-east-1"),
)

With Custom Language and Sample Rate

import os

from pipecat.services.aws.stt import AWSTranscribeSTTService
from pipecat.transcriptions.language import Language

# Spanish transcription of 8 kHz audio in the eu-west-1 region.
stt = AWSTranscribeSTTService(
    api_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    region="eu-west-1",
    sample_rate=8000,
    settings=AWSTranscribeSTTService.Settings(
        language=Language.ES,
    ),
)

Notes

  • Supported sample rates: AWS Transcribe only supports 8000 Hz and 16000 Hz. If a different rate is provided, the service automatically falls back to 16000 Hz with a warning.
  • Pre-signed URL authentication: The service uses pre-signed URLs for WebSocket authentication rather than passing credentials directly, following AWS best practices.
  • Partial results stabilization: Enabled by default with "high" stability, which reduces changes to interim transcripts at the cost of slightly higher latency.
The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.
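The sample rate fallback described in the notes above can be sketched in a few lines. This mirrors the documented behavior only; the function name `effective_sample_rate` and the warning text are assumptions, not the service's internals.

```python
import logging

logger = logging.getLogger(__name__)

# Rates the notes above list as supported by AWS Transcribe.
SUPPORTED_SAMPLE_RATES = (8000, 16000)
FALLBACK_SAMPLE_RATE = 16000


def effective_sample_rate(requested: int) -> int:
    """Return the requested rate if supported, otherwise the 16000 Hz fallback."""
    if requested in SUPPORTED_SAMPLE_RATES:
        return requested
    logger.warning(
        "Sample rate %d Hz is not supported; falling back to %d Hz",
        requested,
        FALLBACK_SAMPLE_RATE,
    )
    return FALLBACK_SAMPLE_RATE
```

Because the fallback is not a nearest-match, a pipeline running at a rate such as 24000 Hz or 44100 Hz ends up at 16000 Hz, so setting sample_rate explicitly to a supported value avoids the warning.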

Event Handlers

AWS Transcribe STT supports the standard service connection events:
Event             Description
on_connected      Connected to AWS Transcribe WebSocket
on_disconnected   Disconnected from AWS Transcribe WebSocket
@stt.event_handler("on_connected")
async def on_connected(service):
    print("Connected to AWS Transcribe")
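Both events can be registered together. The sketch below wraps the registration in a hypothetical helper, `register_connection_logging`, and uses logging instead of print. It assumes on_disconnected handlers receive the same single service argument as the on_connected handler shown above.

```python
import logging

logger = logging.getLogger("aws_transcribe_events")


def register_connection_logging(stt):
    """Attach logging handlers for both connection events to an STT service."""

    @stt.event_handler("on_connected")
    async def on_connected(service):
        logger.info("Connected to AWS Transcribe")

    # Assumption: same handler signature as on_connected.
    @stt.event_handler("on_disconnected")
    async def on_disconnected(service):
        logger.warning("Disconnected from AWS Transcribe")
```

Calling register_connection_logging(stt) once after constructing the service is enough. Logging the disconnect at warning level makes unexpected drops easier to spot in production logs.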