TTS (Text to Speech) (Beta)

Generate speech

const options = {
  method: 'POST',
  headers: {Authorization: 'Bearer <token>', 'Content-Type': 'application/json'},
  body: JSON.stringify({
    model: '<string>',
    input: '<string>',
    voice: '<string>',
    provider: {
      order: ['<string>'],
      only: ['<string>'],
      ignore: ['<string>'],
      include_alpha: true,
      allow_fallbacks: true,
      require_parameters: true,
      required_execution_region: '<string>',
      required_data_region: '<string>',
      require_zero_data_retention: true,
      zdr: true,
      enforce_distillable_text: true,
      quantizations: ['<string>'],
      sort: '<string>',
      max_price: {prompt: 123, completion: 123, image: 123, audio: 123, request: 123},
      preferred_min_throughput: 123,
      preferred_max_latency: 123
    }
  })
};

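// The endpoint returns binary audio (audio/mpeg), so read the body as a Blob, not JSON.
fetch('https://api.phaseo.app/v1/audio/speech', options)
  .then(res => {
    if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
    return res.blob();
  })
  .then(audio => console.log(audio.type, audio.size))
  .catch(err => console.error(err));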

POST /audio/speech

Voice Mapping

The voice value is normalized per provider through an internal alias map:

openai: common voices (for example alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse, cedar, marin).
google (AI Studio / Gemini TTS): canonical prebuilt voices (Zephyr, Puck, Charon, Kore, Fenrir, Leda, Orus, Aoede, Callirrhoe, Autonoe, Enceladus, Iapetus, Umbriel, Algieba, Despina, Erinome, Algenib, Rasalgethi, Laomedeia, Achernar, Alnilam, Schedar, Gacrux, Pulcherrima, Achird, Zubenelgenubi, Vindemiatrix, Sadachbia, Sadaltager, Sulafat).
elevenlabs: common public starter voices (for example rachel, domi, bella, antoni, elli, josh, arnold, adam, sam).

If a provided voice is not valid for the routed provider/model mapping, the request returns 400 invalid_request_error with param: "voice".
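
A client can check for this error and retry with a different voice. The following is a minimal sketch, not the gateway's documented error schema: the source only states the status code, the error type, and the param, so the JSON layout `{ error: { type, param } }` and the helper name `isInvalidVoiceError` are assumptions.

```javascript
// Hypothetical helper: returns true when a response looks like the
// invalid-voice error described above.
// ASSUMPTION: the error JSON is shaped like { error: { type, param } }.
function isInvalidVoiceError(status, body) {
  const err = body && body.error;
  return (
    status === 400 &&
    Boolean(err) &&
    err.type === 'invalid_request_error' &&
    err.param === 'voice'
  );
}

// Mocked payloads for illustration only (not real API output).
const mockedVoiceError = {error: {type: 'invalid_request_error', param: 'voice'}};
const mockedModelError = {error: {type: 'invalid_request_error', param: 'model'}};

console.log(isInvalidVoiceError(400, mockedVoiceError)); // true
console.log(isInvalidVoiceError(400, mockedModelError)); // false
```

If the real error payload uses a different layout, only the two property lookups inside the helper need to change.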

Provider Overrides

You can still pass provider-native voice settings:

ElevenLabs: config.elevenlabs.voice_id / config.elevenlabs.voice / config.elevenlabs.voiceName
Google: config.google.voice_name / config.google.voiceName

Prefer the top-level voice field by default, since it is portable across providers.
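
As an illustration, a request body can carry a portable top-level voice and add a provider-native setting only when one is needed. This sketch assumes that `config` is a top-level field of the JSON body, which the Body reference on this page does not list; the helper name `buildSpeechBody` and the placeholder values are hypothetical.

```javascript
// Hypothetical helper: builds a request body with a portable top-level voice
// plus an optional ElevenLabs-native override.
// ASSUMPTION: `config` sits at the top level of the JSON body.
function buildSpeechBody({model, input, voice, elevenLabsVoiceId}) {
  const body = {model, input, voice};
  if (elevenLabsVoiceId) {
    // Path taken from the list above: config.elevenlabs.voice_id
    body.config = {elevenlabs: {voice_id: elevenLabsVoiceId}};
  }
  return body;
}

const portableBody = buildSpeechBody({
  model: '<model>',
  input: 'Hello from the gateway.',
  voice: 'rachel',
});

const overriddenBody = buildSpeechBody({
  model: '<model>',
  input: 'Hello from the gateway.',
  voice: 'rachel',
  elevenLabsVoiceId: '<voice_id>',
});

console.log(JSON.stringify(overriddenBody, null, 2));
```

Keeping the override optional means the same call site works unchanged when the request is routed to a provider that ignores ElevenLabs settings.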

Voice Samples

OpenAI: openai.fm
Google TTS voice examples: Gemini text-to-speech docs
ElevenLabs voice library: ElevenLabs Voice Library

Authorizations

Authorization: string, header, required. Bearer token authentication.

Body

application/json

model: string, required.
input: string, required.
voice: string.
format: enum<string>. Available options: mp3, wav, ogg, aac.
provider: object. Provider routing preferences for gateway selection.
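
The body fields above can be assembled and checked on the client before the request is sent. This is a minimal sketch: the helper name `buildSpeechRequest` and the local validation are illustrative assumptions, not part of the API, and the gateway remains the authority on which values it accepts.

```javascript
const SPEECH_URL = 'https://api.phaseo.app/v1/audio/speech';
const FORMATS = ['mp3', 'wav', 'ogg', 'aac']; // options listed for `format`

// Hypothetical helper: validates required fields and `format` locally,
// then returns the arguments for fetch().
function buildSpeechRequest(token, {model, input, voice, format}) {
  if (!model || !input) {
    throw new TypeError('model and input are required');
  }
  if (format !== undefined && !FORMATS.includes(format)) {
    throw new RangeError(`format must be one of: ${FORMATS.join(', ')}`);
  }
  return {
    url: SPEECH_URL,
    options: {
      method: 'POST',
      headers: {Authorization: `Bearer ${token}`, 'Content-Type': 'application/json'},
      // JSON.stringify omits undefined properties, so unset optional
      // fields are simply left out of the payload.
      body: JSON.stringify({model, input, voice, format}),
    },
  };
}

const request = buildSpeechRequest('<token>', {
  model: '<model>',
  input: 'Hi',
  voice: 'alloy',
  format: 'wav',
});
console.log(request.options.body);
```

The result can be passed straight to `fetch(request.url, request.options)`.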

Response

200 - audio/mpeg

Audio file

The response is of type file.

Last modified on April 21, 2026
