Speech Synthesis - Mint Starter Kit

curl --request POST \
  --url https://api.powertokens.ai/v1/audio/speech \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "qwen3-tts-instruct-flash",
  "input": "请用略快的语速介绍这款产品。",
  "voice": "Cherry",
  "instructions": "语速较快，带有上扬语调。",
  "optimize_instructions": false,
  "language_type": "Chinese"
}
'

Authorization

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
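
As a minimal sketch of how the header above could be attached from Python, the helper below builds (but does not send) a POST request using only the standard library. The function name `build_speech_request` and the constant `API_URL` are illustrative names, not part of the API; the URL is the one used in the curl example.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.powertokens.ai/v1/audio/speech"


def build_speech_request(token: str, body: dict) -> urllib.request.Request:
    """Build a POST request carrying the Bearer token and a JSON body.

    Hypothetical helper for illustration; it only constructs the request
    object and performs no network I/O.
    """
    # ensure_ascii=False keeps non-ASCII input text readable in the body.
    data = json.dumps(body, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=data,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` when you are ready to send it.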

Request body

application/json

model

enum<string>

required

Available options:

qwen3-tts-instruct-flash

input

string

required

Text to synthesize.

voice

string

required

Voice name.

instructions

string

Style control instructions.

optimize_instructions

boolean

Whether to optimize instructions. An explicit false is also preserved and passed through unchanged.

language_type

string

Language type.

stream_format

string

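Streaming output format. Any non-empty value triggers a streaming request through the unified endpoint; the value commonly used with the current Alibaba capability is pcm.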
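
The request body fields above can be assembled as in the following sketch. It drops optional fields that were never set while keeping an explicit `False` for `optimize_instructions`, and it mirrors on the client side the rule that any non-empty `stream_format` means a streaming request. The helper names `build_speech_payload` and `requests_streaming` are hypothetical, and the streaming check only reflects the rule as described here, not the server's actual implementation.

```python
import json
from typing import Any, Dict, Optional


def build_speech_payload(
    model: str,
    text: str,
    voice: str,
    instructions: Optional[str] = None,
    optimize_instructions: Optional[bool] = None,
    language_type: Optional[str] = None,
    stream_format: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the JSON body; hypothetical helper, not part of the API."""
    payload: Dict[str, Any] = {"model": model, "input": text, "voice": voice}
    optional = {
        "instructions": instructions,
        "optimize_instructions": optimize_instructions,
        "language_type": language_type,
        "stream_format": stream_format,
    }
    for key, value in optional.items():
        # Compare against None rather than truthiness so that an explicit
        # False is still sent.
        if value is not None:
            payload[key] = value
    return payload


def requests_streaming(payload: Dict[str, Any]) -> bool:
    """Client-side mirror of the rule: non-empty stream_format means streaming."""
    return bool(payload.get("stream_format"))


if __name__ == "__main__":
    body = build_speech_payload(
        "qwen3-tts-instruct-flash",
        "请用略快的语速介绍这款产品。",
        "Cherry",
        optimize_instructions=False,
        stream_format="pcm",
    )
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

Filtering on `is not None` instead of truthiness is the design choice that matters here: a truthiness check would silently drop `optimize_instructions: false` before it ever reaches the endpoint.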

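Response

The streaming call succeeded; the response carries Alibaba SSE data. Internal billing reads usage.characters as the unified input audio character count.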

The response is of type string.
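
Since the response arrives as SSE text, a client that wants to see the same `usage.characters` figure could scan the `data:` lines as sketched below. This assumes each event's data is a JSON object and that `usage` appears as a top-level key with a `characters` field; the exact event shape is not specified on this page, so treat that layout and the helper names `iter_sse_data` and `last_usage_characters` as assumptions.

```python
import json
from typing import Iterable, Iterator, List, Optional


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the joined data payload of each SSE event from text lines."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
    # Lenient: also emit a trailing event that lacks a closing blank line.
    if buffer:
        yield "\n".join(buffer)


def last_usage_characters(lines: Iterable[str]) -> Optional[int]:
    """Return the last usage.characters value seen, or None if absent.

    Assumes (hypothetically) that events are JSON objects with a top-level
    "usage" object; non-JSON events are skipped.
    """
    characters = None
    for data in iter_sse_data(lines):
        try:
            event = json.loads(data)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        usage = event.get("usage")
        if isinstance(usage, dict) and "characters" in usage:
            characters = usage["characters"]
    return characters
```

The function accepts any iterable of decoded lines, so it works equally on a saved response split with `splitlines()` or on lines read incrementally from an open connection.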
