POST /audio/speech
Generate speech
const options = {
  method: 'POST',
  headers: {Authorization: 'Bearer <token>', 'Content-Type': 'application/json'},
  body: JSON.stringify({
    model: '<string>',
    input: '<string>',
    voice: '<string>',
    provider: {
      order: ['<string>'],
      only: ['<string>'],
      ignore: ['<string>'],
      include_alpha: true,
      allow_fallbacks: true,
      require_parameters: true,
      required_execution_region: '<string>',
      required_data_region: '<string>',
      require_zero_data_retention: true,
      zdr: true,
      enforce_distillable_text: true,
      quantizations: ['<string>'],
      sort: '<string>',
      max_price: {prompt: 123, completion: 123, image: 123, audio: 123, request: 123},
      preferred_min_throughput: 123,
      preferred_max_latency: 123
    }
  })
};

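fetch('https://api.phaseo.app/v1/audio/speech', options)
  .then(res => {
    // A successful response is binary audio (audio/mpeg), not JSON.
    // Error bodies are assumed to be JSON here.
    if (!res.ok) return res.json().then(err => Promise.reject(err));
    return res.arrayBuffer();
  })
  .then(audio => console.log(`Received ${audio.byteLength} bytes of audio`))
  .catch(err => console.error(err));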

Voice Mapping

The gateway normalizes voice for each provider using an internal alias map:
  • openai: common voices (for example alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse, cedar, marin).
  • google (AI Studio / Gemini TTS): canonical prebuilt voices: Zephyr, Puck, Charon, Kore, Fenrir, Leda, Orus, Aoede, Callirrhoe, Autonoe, Enceladus, Iapetus, Umbriel, Algieba, Despina, Erinome, Algenib, Rasalgethi, Laomedeia, Achernar, Alnilam, Schedar, Gacrux, Pulcherrima, Achird, Zubenelgenubi, Vindemiatrix, Sadachbia, Sadaltager, Sulafat.
  • elevenlabs: common public starter voices are mapped (for example rachel, domi, bella, antoni, elli, josh, arnold, adam, sam).
If a provided voice is not valid for the routed provider/model mapping, the request returns 400 invalid_request_error with param: "voice".
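A minimal sketch of how a client might detect the invalid-voice case described above. The error body shape (`{ error: { type, param } }`, with a flat fallback) is an assumption, since this page only names the 400 status, the `invalid_request_error` type, and `param: "voice"`. The helper names `isInvalidVoiceError` and `generateSpeech` are hypothetical.

```javascript
// Assumption: error bodies look like { error: { type, param, message } },
// or carry the same fields at the top level. Only the status, type, and
// param values themselves come from the documentation.
function isInvalidVoiceError(status, body) {
  const err = (body && body.error) || body || {};
  return status === 400
    && err.type === 'invalid_request_error'
    && err.param === 'voice';
}

// Hypothetical helper: POST the payload and return the audio bytes,
// or throw a descriptive error when the voice was rejected.
async function generateSpeech(token, payload) {
  const res = await fetch('https://api.phaseo.app/v1/audio/speech', {
    method: 'POST',
    headers: {Authorization: `Bearer ${token}`, 'Content-Type': 'application/json'},
    body: JSON.stringify(payload)
  });
  if (res.ok) return res.arrayBuffer();

  // The error body may not be JSON; fall back to null if parsing fails.
  const body = await res.json().catch(() => null);
  if (isInvalidVoiceError(res.status, body)) {
    throw new Error(`Voice "${payload.voice}" is not valid for the routed provider/model`);
  }
  throw new Error(`Speech request failed with status ${res.status}`);
}
```

Separating the check from the request keeps the error-shape assumption in one place, so it is easy to adjust once the actual error format is confirmed.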

Provider Overrides

You can still pass provider-native voice settings:
  • ElevenLabs: config.elevenlabs.voice_id / config.elevenlabs.voice / config.elevenlabs.voiceName
  • Google: config.google.voice_name / config.google.voiceName
Prefer the top-level voice field as your default so that requests stay portable across providers.
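The sketch below shows one way to combine a portable top-level voice with a provider-native override. It assumes the `config.*` paths listed above map to a top-level `config` object in the JSON body, which this page does not state explicitly. The helper `buildSpeechBody` and its argument names are hypothetical.

```javascript
// Assumption: provider-native settings live under a top-level `config`
// object in the request body (config.elevenlabs.voice_id, config.google.voice_name).
function buildSpeechBody({model, input, voice, elevenLabsVoiceId, googleVoiceName}) {
  const body = {model, input};

  // Portable default: the top-level voice is resolved through the alias map.
  if (voice) body.voice = voice;

  // Optional provider-native overrides, added only when supplied.
  const config = {};
  if (elevenLabsVoiceId) config.elevenlabs = {voice_id: elevenLabsVoiceId};
  if (googleVoiceName) config.google = {voice_name: googleVoiceName};
  if (Object.keys(config).length > 0) body.config = config;

  return body;
}
```

For example, `buildSpeechBody({model: '<string>', input: 'Hello', voice: 'alloy'})` produces a body with no `config` key, while passing `googleVoiceName: 'Kore'` adds `config.google.voice_name`.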

Authorizations

Authorization (string, header, required)

Bearer token authentication.

Body

application/json
model (string, required)
input (string, required)
voice (string)
format (enum<string>). Available options: mp3, wav, ogg, aac
provider (object). Provider routing preferences for gateway selection.

Response

200 - audio/mpeg

Audio file

The response is of type file.
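Because the response body is a file rather than JSON, a client typically writes the bytes to disk. The following is a sketch for Node 18+ under stated assumptions: that omitting format yields mp3 (suggested by the audio/mpeg response type but not stated here) and that the file extension can simply mirror the format value. The helpers `outputFileName` and `saveSpeech` are hypothetical.

```javascript
// Format values taken from the request body's format enum.
const FORMATS = ['mp3', 'wav', 'ogg', 'aac'];

// Assumption: mp3 is the default when format is omitted.
function outputFileName(baseName, format = 'mp3') {
  if (!FORMATS.includes(format)) {
    throw new RangeError(`Unsupported format: ${format}`);
  }
  return `${baseName}.${format}`;
}

// Hypothetical helper: request speech and write the audio bytes to a file.
async function saveSpeech(token, payload, baseName) {
  // Dynamic import keeps this usable from both CommonJS and ES modules.
  const {writeFile} = await import('node:fs/promises');

  const res = await fetch('https://api.phaseo.app/v1/audio/speech', {
    method: 'POST',
    headers: {Authorization: `Bearer ${token}`, 'Content-Type': 'application/json'},
    body: JSON.stringify(payload)
  });
  if (!res.ok) {
    throw new Error(`Speech request failed with status ${res.status}`);
  }

  // Read the binary body and persist it under a name matching the format.
  const file = outputFileName(baseName, payload.format);
  await writeFile(file, Buffer.from(await res.arrayBuffer()));
  return file;
}
```

A call such as `saveSpeech(token, {model: '<string>', input: 'Hello', voice: 'alloy', format: 'wav'}, 'greeting')` would write `greeting.wav` in the current directory.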

Last modified on April 21, 2026