Overview

TogetherSTTService provides real-time speech recognition using Together AI’s WebSocket API with OpenAI-compatible speech-to-text endpoints. It supports streaming transcription with interim results and automatic reconnection.

  • Together AI STT API Reference: Pipecat’s API methods for Together AI STT
  • Example Implementation: Complete transcription example
  • Together AI Documentation: Official Together AI Realtime API documentation
  • Together AI Platform: Access models and manage API keys

Installation

To use Together AI STT services, install the required dependencies:
uv add "pipecat-ai[together]"

Prerequisites

Together AI Account Setup

Before using Together AI STT services, you need:
  1. Together AI Account: Sign up at Together AI
  2. API Key: Generate an API key from your account dashboard
  3. Model Selection: Choose from available transcription models

Required Environment Variables

  • TOGETHER_API_KEY: Your Together AI API key for authentication
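
A minimal sketch of reading the key at startup and failing fast when it is missing, so a misconfigured deployment does not surface later as an authentication error. The helper name `load_together_api_key` is hypothetical and not part of Pipecat; it only wraps the standard `os.environ` mapping.

```python
import os


def load_together_api_key(env=None):
    """Return the Together AI API key, or raise if it is unset or blank.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    # Allow a custom mapping to be passed in (useful for tests).
    env = os.environ if env is None else env
    key = (env.get("TOGETHER_API_KEY") or "").strip()
    if not key:
        raise RuntimeError(
            "TOGETHER_API_KEY is not set; export it before starting the pipeline."
        )
    return key
```

The returned value can then be passed as the `api_key` constructor argument.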

Configuration

  • api_key (str, required): Together AI API key for authentication.
  • sample_rate (int, default: None): Audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
  • base_url (str, default: "wss://api.together.ai/v1"): WebSocket base URL for Together AI API.
  • settings (TogetherSTTService.Settings, default: None): Runtime-configurable settings. See Settings below.
  • ttfs_p99_latency (float, default: 1.00): P99 latency from the end of speech to the final transcript, in seconds. Override for your deployment. See https://github.com/pipecat-ai/stt-benchmark.
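
To make "override for your deployment" concrete, here is a rough sketch of deriving a P99 value from latencies you have measured yourself. It uses the nearest-rank percentile definition, which is an assumption on my part; the linked benchmark repository may compute the figure differently. The function name `p99_latency` and the sample values are hypothetical.

```python
def p99_latency(samples_s):
    """Nearest-rank 99th percentile of latency samples, in seconds."""
    if not samples_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_s)
    # 1-based nearest rank: ceil(0.99 * n), done in integer arithmetic
    # to avoid floating-point rounding surprises.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical measurements: seconds from end of speech to final transcript.
measured = [0.42, 0.55, 0.61, 0.48, 0.93, 0.50, 0.47, 0.58, 0.66, 0.52]
ttfs_p99 = p99_latency(measured)
```

The result could then be supplied as `ttfs_p99_latency=ttfs_p99` when constructing the service. With only a handful of samples the P99 is simply the slowest observation, so a larger sample gives a more meaningful number.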

Settings

Runtime-configurable settings passed via the settings constructor argument using TogetherSTTService.Settings(...). These can be updated mid-conversation with STTUpdateSettingsFrame. See Service Settings for details.
Parameter   Type             Default                      Description
model       str              "openai/whisper-large-v3"    Model identifier. (Inherited.)
language    Language | str   Language.EN                  Language for transcription. (Inherited.)

Usage

Basic Setup

import os
from pipecat.services.together import TogetherSTTService

stt = TogetherSTTService(
    api_key=os.getenv("TOGETHER_API_KEY"),
)

With Custom Settings

import os

from pipecat.services.together import TogetherSTTService
from pipecat.transcriptions.language import Language

stt = TogetherSTTService(
    api_key=os.getenv("TOGETHER_API_KEY"),
    settings=TogetherSTTService.Settings(
        model="openai/whisper-large-v3",
        language=Language.EN,
    ),
)

In a Voice Pipeline

import os

from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.processors.audio.vad_processor import VADProcessor
from pipecat.services.together import TogetherSTTService

stt = TogetherSTTService(api_key=os.getenv("TOGETHER_API_KEY"))
vad_processor = VADProcessor(vad_analyzer=SileroVADAnalyzer())

# `transport` is assumed to be an already-configured Pipecat transport.
pipeline = Pipeline([
    transport.input(),
    vad_processor,  # placed before STT so VAD frames reach the service
    stt,
    # ... rest of pipeline
])

Notes

  • Together AI’s STT service uses an OpenAI-compatible WebSocket protocol for real-time transcription.
  • The service automatically handles reconnection on connection errors.
  • Transcription is committed when VADUserStoppedSpeakingFrame is received.
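
To illustrate what "committing" means on an OpenAI-compatible wire protocol, the sketch below builds the two client messages such a protocol typically uses: one that appends audio to the server-side buffer and one that commits it for transcription. The event names `input_audio_buffer.append` and `input_audio_buffer.commit` come from OpenAI's Realtime API; that Together AI accepts exactly these events is an assumption here, and this is not Pipecat's implementation. The service sends the equivalent messages for you, and both helper function names are hypothetical.

```python
import base64
import json


def append_audio_message(pcm_chunk: bytes) -> str:
    """Wrap a chunk of raw audio in an append event (assumed event name)."""
    return json.dumps({
        "type": "input_audio_buffer.append",
        # Binary audio is base64-encoded so it can travel inside JSON.
        "audio": base64.b64encode(pcm_chunk).decode("ascii"),
    })


def commit_message() -> str:
    """Ask the server to finalize the buffered audio (assumed event name)."""
    return json.dumps({"type": "input_audio_buffer.commit"})
```

In this model, audio is appended continuously while the user speaks, and the commit is sent once the VAD reports that speech has stopped, which is why a VAD processor sits ahead of the STT service in the pipeline.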