Overview

TogetherTTSService provides real-time text-to-speech using Together AI’s WebSocket API. It supports streaming synthesis with configurable voice and model options, interruption handling, and automatic reconnection.

  • Together AI TTS API Reference: Pipecat’s API methods for Together AI TTS
  • Example Implementation: Complete voice bot example
  • Together AI Documentation: Official Together AI TTS WebSocket API documentation
  • Together AI Platform: Access models and manage API keys

Installation

To use Together AI TTS services, install the required dependencies:
uv add "pipecat-ai[together]"

Prerequisites

Together AI Account Setup

Before using Together AI TTS services, you need:
  1. Together AI Account: Sign up at Together AI
  2. API Key: Generate an API key from your account dashboard
  3. Model Selection: Choose from available TTS models and voices

Required Environment Variables

  • TOGETHER_API_KEY: Your Together AI API key for authentication
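
A missing key is easiest to catch before the service is constructed, rather than at the first WebSocket connection. The sketch below is one way to do that with the standard library only. It is not part of Pipecat, and require_api_key is a hypothetical helper name used purely for illustration.

```python
import os


def require_api_key(name: str = "TOGETHER_API_KEY") -> str:
    """Return the named environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Fail at startup with a clear message instead of at connection time.
        raise RuntimeError(f"{name} is not set; export it before starting the bot")
    return value
```

The returned string can then be passed as api_key when constructing the service, in place of calling os.getenv directly.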

Configuration

  • api_key (str, required): Together AI API key for authentication.
  • url (str, default: "wss://api.together.ai/v1/audio/speech/websocket"): WebSocket URL for Together AI TTS API.
  • sample_rate (int, default: 24000): Output sample rate for emitted PCM frames. Together AI streams at 24 kHz and does not support other rates.
  • settings (TogetherTTSService.Settings, default: None): Runtime-configurable settings. See Settings below.

Settings

Runtime-configurable settings passed via the settings constructor argument using TogetherTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.
Parameter           Type            Default               Description
model               str             "hexgrad/Kokoro-82M"  Model identifier. (Inherited.)
voice               str             "af_heart"            Voice identifier. (Inherited.)
language            Language | str  Language.EN           Language for synthesis. (Inherited.)
max_partial_length  int | None      None                  Maximum partial text length for streaming. None for no cap.

Usage

Basic Setup

import os
from pipecat.services.together import TogetherTTSService

tts = TogetherTTSService(
    api_key=os.getenv("TOGETHER_API_KEY"),
)

With Custom Settings

import os

from pipecat.services.together import TogetherTTSService
from pipecat.transcriptions.language import Language

tts = TogetherTTSService(
    api_key=os.getenv("TOGETHER_API_KEY"),
    settings=TogetherTTSService.Settings(
        model="hexgrad/Kokoro-82M",
        voice="af_heart",
        language=Language.EN,
    ),
)

In a Voice Pipeline

import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.together import TogetherTTSService

tts = TogetherTTSService(
    api_key=os.getenv("TOGETHER_API_KEY"),
    settings=TogetherTTSService.Settings(
        model="hexgrad/Kokoro-82M",
        voice="af_heart",
    ),
)

# `llm` and `transport` are assumed to be created elsewhere in your bot.
pipeline = Pipeline([
    # ... upstream processors
    llm,
    tts,
    transport.output(),
])

Notes

  • Together AI TTS streams audio at 24 kHz. The service outputs 24 kHz signed 16-bit mono PCM; the transport layer resamples to the pipeline’s configured rate if needed.
  • The service supports interruption handling and automatically clears the text buffer when interrupted.
  • Audio is streamed incrementally via WebSocket deltas for low-latency synthesis.
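
The 24 kHz, signed 16-bit, mono format noted above fixes the relationship between byte counts and playback time, which is useful when sizing buffers or estimating how much audio a chunk holds. The following is a small arithmetic sketch only. The function names are hypothetical and nothing here calls into Pipecat or Together AI.

```python
SAMPLE_RATE_HZ = 24_000  # Together AI TTS output rate, per the notes above
BYTES_PER_SAMPLE = 2     # signed 16-bit PCM
CHANNELS = 1             # mono


def pcm_duration_seconds(num_bytes: int) -> float:
    """Playback duration of a raw PCM buffer in this format."""
    bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
    return num_bytes / bytes_per_second


def pcm_bytes_for(seconds: float) -> int:
    """Number of bytes needed to hold `seconds` of audio in this format."""
    samples = round(seconds * SAMPLE_RATE_HZ)
    return samples * BYTES_PER_SAMPLE * CHANNELS
```

Under these assumptions, one second of audio occupies 48,000 bytes and a 20 ms chunk occupies 960 bytes.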