Text to speech

Convert text to speech using a system voice or a cloned voice, identified by its voiceId. Copy the voiceId from the Voices list or from a clone response.

Sync vs async

Mode	Endpoint	Results
Sync	`POST /openapi/v1/task/tts/create`	`data.results[].uri` (CDN audio) in HTTP response
Async	`POST /openapi/v1/task/tts/create-async`	Webhook when `event` is `openapi-tts` (`data.results`)

Every create and create-async call requires X-Idempotency-Key. Async also needs a configured Webhook.
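The endpoint choice and the idempotency requirement above can be sketched as two small helpers. This is a minimal sketch, not the service's official client. It assumes X-Idempotency-Key is sent as an HTTP header, that a UUID is an acceptable key value, and that the body is JSON. The base URL is a placeholder, and authentication headers are omitted because this page does not describe them.

```python
import uuid
from typing import Dict, Optional

# Paths taken from the table above.
SYNC_PATH = "/openapi/v1/task/tts/create"
ASYNC_PATH = "/openapi/v1/task/tts/create-async"


def tts_endpoint(base_url: str, use_async: bool) -> str:
    """Return the full URL of the sync or async create endpoint."""
    path = ASYNC_PATH if use_async else SYNC_PATH
    return base_url.rstrip("/") + path


def idempotency_headers(key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers carrying X-Idempotency-Key.

    Assumption: the key travels as an HTTP header and any unique string
    (here a UUID4) is accepted. Pass the same key again when retrying
    the same logical request.
    """
    return {
        "X-Idempotency-Key": key or str(uuid.uuid4()),
        "Content-Type": "application/json",  # assumed JSON body
    }
```

Generating the key once per logical request and reusing it on retries is the usual way to get the benefit of idempotency; generating a new key for each retry would defeat it.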

Workflow (sync)

See Quickstart for a complete curl example. Key request fields:

language.target — locale from Supported languages
data[] — up to 50 lines; each needs text and voiceId (or timbreRefAudio)
outputFormat — wav, mp3, or m4a

Do not send provider. Pass the voiceId from the voice list or clone result; the service picks the matching engine automatically. All lines in one request must use voices from the same tier or clone engine (see Supported clone methods and Error codes).
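The field rules above can be checked on the client before a request is sent. The sketch below builds a request body and validates it. It makes several assumptions: that `language.target` denotes a nested object, that `data` must contain at least one line, and that `provider` would appear on a line if it were sent at all. The function name is hypothetical. It does not check the same-tier rule, because tier information is not part of the request fields listed here.

```python
from typing import Any, Dict, List

MAX_LINES = 50                          # "up to 50 lines"
OUTPUT_FORMATS = {"wav", "mp3", "m4a"}  # allowed outputFormat values


def build_tts_payload(
    target_language: str,
    lines: List[Dict[str, Any]],
    output_format: str = "mp3",
) -> Dict[str, Any]:
    """Validate the documented field rules and return a request body."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(OUTPUT_FORMATS)}")
    # Lower bound of 1 is an assumption; the upper bound is documented.
    if not 1 <= len(lines) <= MAX_LINES:
        raise ValueError(f"data[] must contain 1 to {MAX_LINES} lines")
    for i, line in enumerate(lines):
        if not line.get("text"):
            raise ValueError(f"data[{i}] is missing text")
        if not (line.get("voiceId") or line.get("timbreRefAudio")):
            raise ValueError(f"data[{i}] needs voiceId or timbreRefAudio")
        if "provider" in line:
            raise ValueError(f"data[{i}] must not send provider")
    return {
        "language": {"target": target_language},  # assumed nesting
        "data": lines,
        "outputFormat": output_format,
    }
```

For example, `build_tts_payload("en-US", [{"text": "Hello", "voiceId": "<your voiceId>"}])` returns a body with one line and `outputFormat` set to `mp3`; the locale value here is only illustrative.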

Workflow (async)

1. Configure Webhooks.
2. POST to create-async with an X-Idempotency-Key.
3. Read the audio URLs from data.results in the webhook payload.
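The last step, reading audio URLs from the webhook, might look like the sketch below. It assumes the webhook body is JSON with a top-level `event` field and a `data.results` list, and that each result carries a `uri` field as in the sync response. The function name is hypothetical, and signature verification is left out because it is not described on this page.

```python
import json
from typing import List, Union

TTS_EVENT = "openapi-tts"  # event name from the table above


def extract_audio_uris(raw_body: Union[str, bytes]) -> List[str]:
    """Return the audio URIs from a TTS webhook body, or [] for other events."""
    payload = json.loads(raw_body)
    if payload.get("event") != TTS_EVENT:
        return []  # not a TTS event; ignore it
    results = (payload.get("data") or {}).get("results") or []
    # Assumption: each result is an object with a `uri` key, as in sync mode.
    return [r["uri"] for r in results if isinstance(r, dict) and r.get("uri")]
```

Returning an empty list for unrelated events lets one webhook endpoint receive several event types without raising errors for the ones this handler does not process.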

Voices

System voices: POST /openapi/v1/assets/voice/basic/list → copy voiceId
Cloned voices: Voice clone → copy returned voiceId

Full request schema: API reference → TTS operations.

Related

Products overview
Usage limits — batch size limits
