API Documentation

Complete guide to using the Isosonic TTS API

Authentication

All API requests require authentication using a Bearer token. Include your API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY
Getting your API Key

Contact your administrator to obtain your API key. Keep it secure and never share it publicly.
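
One way to keep the key out of source code is to read it from an environment variable when building the header. The sketch below assumes a variable named ISOSONIC_API_KEY; that name is hypothetical and not defined by the API.

```python
import os


def build_auth_headers(env_var: str = "ISOSONIC_API_KEY") -> dict:
    """Build request headers with a Bearer token read from the environment.

    The default variable name is an assumption; use whatever your deployment sets.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of requests.post or requests.get.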

REST API Endpoints

The REST API provides synchronous text-to-speech synthesis and suits simple request/response use cases.

POST https://api1.isosonic.co/v1/audio/speech

Create speech from text (OpenAI-compatible)

GET https://api1.isosonic.co/v1/voices

List all 492 available voices

GET https://api1.isosonic.co/v1/models

List available TTS models
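
The two GET endpoints can be queried with a small helper. This sketch uses only the Python standard library and assumes the endpoints return JSON; the response schema is not described on this page, so the decoded value is returned unchanged. The function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "https://api1.isosonic.co/v1"


def build_list_request(resource: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for a resource such as "voices" or "models"."""
    return urllib.request.Request(
        f"{BASE_URL}/{resource.strip('/')}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_list(resource: str, api_key: str, timeout: float = 30.0):
    """Send the request and decode the body as JSON (assumed format).

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    request = build_list_request(resource, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, fetch_list("models", API_KEY) would request the model list and fetch_list("voices", API_KEY) the voice list.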

Text-to-Speech Synthesis

Python Example

import requests

# Configuration
API_KEY = "your-api-key-here"
API_URL = "https://api1.isosonic.co/v1/audio/speech"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

payload = {
    "model": "tts-1",
    "input": "Hello! This is a test of the text to speech API.",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0
}

# Make the request
response = requests.post(API_URL, headers=headers, json=payload)

# Save the audio file
if response.status_code == 200:
    with open("output.mp3", "wb") as f:
        f.write(response.content)
    print("Audio saved to output.mp3")
else:
    print(f"Error: {response.status_code} - {response.text}")

Streaming Example (Python)

import requests

# Configuration
API_KEY = "your-api-key-here"
API_URL = "https://api1.isosonic.co/v1/audio/speech"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

payload = {
    "model": "tts-1",
    "input": "This is a streaming example with real-time audio delivery.",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0,
    "stream": True  # Enable streaming
}

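# Stream the response
response = requests.post(API_URL, headers=headers, json=payload, stream=True)

# Stop on an HTTP error so an error body is not written to the audio file
response.raise_for_status()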

with open("output_stream.mp3", "wb") as f:
    for chunk in response.iter_content(chunk_size=8192):
        if chunk:
            f.write(chunk)
            print(f"Received {len(chunk)} bytes")

print("Audio saved to output_stream.mp3")

Request Parameters

Parameter        Type     Description
model            string   TTS model: tts-1 or tts-1-hd
input            string   Text to synthesize (max 4096 characters)
voice            string   Voice ID (e.g., alloy, fr05, vctk_p225); see the list of 492 available voices
response_format  string   Audio format: mp3, opus, aac, flac, wav, or pcm
speed            number   Playback speed: 0.25 to 4.0 (default: 1.0)
stream           boolean  Enable streaming response (default: false)
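
The documented constraints can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: validate_payload is a hypothetical helper, and it assumes that model, input, and voice are required while the other fields are optional.

```python
ALLOWED_MODELS = {"tts-1", "tts-1-hd"}
ALLOWED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def validate_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []

    # Assumption: model, input, and voice are required fields
    if payload.get("model") not in ALLOWED_MODELS:
        problems.append("model must be 'tts-1' or 'tts-1-hd'")

    text = payload.get("input")
    if not isinstance(text, str) or not text:
        problems.append("input must be a non-empty string")
    elif len(text) > MAX_INPUT_CHARS:
        problems.append(f"input has {len(text)} characters (max {MAX_INPUT_CHARS})")

    voice = payload.get("voice")
    if not isinstance(voice, str) or not voice:
        problems.append("voice must be a non-empty voice ID")

    # Optional fields are only checked when present
    if "response_format" in payload and payload["response_format"] not in ALLOWED_FORMATS:
        problems.append("response_format must be one of: " + ", ".join(sorted(ALLOWED_FORMATS)))

    if "speed" in payload:
        speed = payload["speed"]
        is_number = isinstance(speed, (int, float)) and not isinstance(speed, bool)
        if not is_number or not MIN_SPEED <= speed <= MAX_SPEED:
            problems.append(f"speed must be a number from {MIN_SPEED} to {MAX_SPEED}")

    if "stream" in payload and not isinstance(payload["stream"], bool):
        problems.append("stream must be a boolean")

    return problems
```

This check only covers the value ranges listed in the table; whether a voice ID actually exists still has to be confirmed against the voice list.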

Available Voices

The API supports 492 voices across 8 categories. Here are some examples:

OpenAI Compatible

alloy, echo, fable, nova, onyx, shimmer

French (CML-TTS)

fr01, fr02, fr03, fr04, fr05

EARS Neutral

ears_p225, ears_p226, ears_p227...

VCTK

vctk_p225, vctk_p226, vctk_p227...

Get Full Voice List

Fetch the complete list of 492 voices from: https://api1.isosonic.co/v1/voices

Rate Limits & Best Practices

Authentication Required

Include your API key in the Authorization header on every request; unauthenticated requests are not accepted.

Use Streaming for Long Texts

Enable stream: true for texts longer than 500 characters to reduce latency.
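
The 500-character guideline can be applied automatically when the payload is built. A minimal sketch; build_payload and its default values are illustrative rather than part of the API.

```python
STREAM_THRESHOLD_CHARS = 500  # guideline from this page


def build_payload(text: str, voice: str = "alloy", model: str = "tts-1",
                  response_format: str = "mp3") -> dict:
    """Build a speech request, enabling streaming for texts over the threshold."""
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "stream": len(text) > STREAM_THRESHOLD_CHARS,
    }
```

When the returned payload has stream set to True, the HTTP call should also pass stream=True to requests.post and read the body with iter_content.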

Choose Appropriate Format

Use opus for lowest bandwidth, pcm for lowest latency, or mp3 for compatibility.
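
The recommendation above can be encoded as a small lookup so callers state their priority instead of hard-coding a format. The priority names here are hypothetical labels chosen for this sketch.

```python
# Maps a caller's priority to the format recommended on this page
FORMAT_BY_PRIORITY = {
    "bandwidth": "opus",
    "latency": "pcm",
    "compatibility": "mp3",
}


def pick_format(priority: str = "compatibility") -> str:
    """Return the response_format value recommended for the given priority."""
    try:
        return FORMAT_BY_PRIORITY[priority]
    except KeyError:
        valid = ", ".join(sorted(FORMAT_BY_PRIORITY))
        raise ValueError(f"unknown priority {priority!r}; expected one of: {valid}") from None
```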

Handle Errors Gracefully

Implement retry logic with exponential backoff for failed requests.
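
A retry loop with exponential backoff might look like the sketch below. The set of status codes treated as transient, the delay values, and the jitter are assumptions; this page does not specify which errors are safe to retry. The send argument is any zero-argument callable that returns an object with a status_code attribute, for example a lambda wrapping requests.post.

```python
import random
import time

# Assumption: these status codes indicate a transient failure worth retrying
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def backoff_delays(max_retries: int = 4, base: float = 0.5, cap: float = 8.0) -> list:
    """Return the un-jittered delay before each retry: base * 2**attempt, capped."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_retry(send, max_retries: int = 4, base: float = 0.5,
                    cap: float = 8.0, sleep=time.sleep):
    """Call send() until the status is not retryable or retries run out.

    Returns the last response either way, so the caller still inspects
    status_code after the call.
    """
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        response = send()
        if response.status_code not in RETRYABLE_STATUS or attempt == max_retries:
            return response
        # Add up to 10% random jitter so clients do not retry in lockstep
        sleep(delays[attempt] * (1 + random.uniform(0, 0.1)))
```

Usage with the earlier example would be along the lines of call_with_retry(lambda: requests.post(API_URL, headers=headers, json=payload, timeout=30)). Network exceptions are not caught here; a fuller version would likely retry those as well.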

Need Help?

API Reference

Complete API documentation

View Docs →

Voice Catalog

Browse all 492 voices

View Voices →

Report Issues

Found a bug or have feedback?

Contact Support →