curl --request POST \
--url https://api.powertokens.ai/v1/chat/completions \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data '
{
"model": "seed-2-0-pro-260328",
"messages": [
{
"role": "system",
"content": "You are a concise and accurate assistant."
},
{
"role": "user",
"content": "Summarize the core RAG flow in three sentences."
}
],
"thinking": {
"type": "enabled"
},
"reasoning_effort": "medium",
"temperature": 0.3,
"max_completion_tokens": 1024,
"stream": false
}
'
{
"id": "chatcmpl_bp_123",
"object": "chat.completion",
"created": 1742342400,
"model": "seed-2-0-pro-260328",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "RAG usually has three steps: retrieve, augment, and generate. The system first embeds the question and retrieves relevant knowledge chunks. It then injects those chunks into the prompt so the model can answer with better factual grounding and fewer hallucinations."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 123,
"completion_tokens": 98,
"total_tokens": 221
}
}
Invoke BytePlus chat completions.
Supported models include seed-2-0-pro-260328, seed-2-0-lite-260228, seed-2-0-mini-260215, seed-1-8-251228, seed-1-6-250915, seed-1-6-flash-250715, deepseek-v3-2-251201, and gpt-oss-120b-250805.
This endpoint accepts the following request fields: model, messages, thinking, stream, stream_options.include_usage, stream_options.chunk_include_usage, max_tokens, max_completion_tokens, stop, reasoning_effort, response_format, frequency_penalty, presence_penalty, temperature, top_p, logprobs, top_logprobs, tools, parallel_tool_calls, and tool_choice. messages[].content supports multimodal text, image_url, and video_url parts, while assistant messages may include reasoning_content and tool_calls.
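As a sketch of how the fields above fit together, the request body from the curl example can be assembled programmatically. The image_url part below follows the common OpenAI-compatible multimodal shape and is an assumption for illustration, as is the example image URL:

```python
import json

# Assemble a chat completions payload using the fields documented above.
# The image_url part follows the common OpenAI-compatible shape; treat it
# as an assumption rather than a verified schema for this API.
payload = {
    "model": "seed-2-0-pro-260328",
    "messages": [
        {"role": "system", "content": "You are a concise and accurate assistant."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
            ],
        },
    ],
    "thinking": {"type": "enabled"},
    "reasoning_effort": "medium",
    "temperature": 0.3,
    "max_completion_tokens": 1024,
    "stream": False,
}

# Serialize for the request body.
body = json.dumps(payload)
```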
Send Authorization: Bearer <token> in the request headers.
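A minimal sketch of attaching the header with Python's standard library; the token value is a placeholder, and the request is built but not sent:

```python
import json
import urllib.request

API_URL = "https://api.powertokens.ai/v1/chat/completions"

def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Build a POST request with the Bearer token attached (not sent here)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To actually send it: urllib.request.urlopen(build_request(token, payload))
req = build_request("sk-example-token", {"model": "seed-2-0-pro-260328", "messages": []})
```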
Request body for BytePlus chat completions.

model
  Model name. Supported BytePlus chat models include seed-2-0-pro-260328, seed-2-0-lite-260228, seed-2-0-mini-260215, seed-1-8-251228, seed-1-6-250915, seed-1-6-flash-250715, deepseek-v3-2-251201, and gpt-oss-120b-250805.
  Allowed values: seed-2-0-pro-260328, seed-2-0-lite-260228, seed-2-0-mini-260215, seed-1-8-251228, seed-1-6-250915, seed-1-6-flash-250715, deepseek-v3-2-251201, gpt-oss-120b-250805. Example: "seed-2-0-pro-260328"

messages
  Conversation message list. This endpoint accepts the system, user, assistant, and tool roles. messages[].content supports plain text or an array of multimodal parts composed of text, image_url, and video_url.

thinking
  Controls whether the model should enter deep reasoning mode.

stream
  Whether to enable streaming output. When true, the response content type is text/event-stream.

stream_options
  Streaming response options. Effective only when stream=true.
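When stream=true, the body arrives as a text/event-stream of `data:` lines. The parser below is a sketch that assumes the OpenAI-compatible chunk format with delta payloads and a `[DONE]` sentinel; the two-chunk sample stream is fabricated for illustration:

```python
import json

def iter_sse_chunks(lines):
    """Yield parsed JSON chunks from 'data: ...' SSE lines, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)

# Fabricated sample stream for illustration only.
sample = [
    'data: {"choices": [{"delta": {"content": "RAG "}}]}',
    'data: {"choices": [{"delta": {"content": "flow."}}]}',
    "data: [DONE]",
]

# Concatenate the delta fragments into the full answer text.
text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in iter_sse_chunks(sample)
)
```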
max_tokens
  Maximum answer length in tokens, excluding reasoning tokens.

max_completion_tokens
  Maximum total output length in tokens, including both answer and reasoning tokens. When set, max_tokens no longer applies.

stop
  Stop sequence. This can be a single string or an array of strings.

reasoning_effort
  Caps the amount of reasoning work. minimal is fastest and high is deepest.
  Allowed values: minimal, low, medium, high

response_format
  Controls the answer format. Beta.
frequency_penalty
  Frequency penalty. Higher values reduce repeated wording.
  Range: -2 <= x <= 2

presence_penalty
  Presence penalty. Higher values encourage introducing new topics.
  Range: -2 <= x <= 2

temperature
  Sampling temperature. Lower is more stable; higher is more diverse.
  Range: 0 <= x <= 2

top_p
  Nucleus sampling parameter. Usually tuned instead of temperature.
  Range: 0 <= x <= 1

logprobs
  Whether to return log probability data for output tokens.

top_logprobs
  When logprobs=true, how many candidate token log probabilities to return per position.
  Range: 0 <= x <= 20

tools
  Function tool definitions available to the model.
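A sketch of a single function tool definition in the common OpenAI-compatible shape; the get_weather function and its JSON Schema parameters are hypothetical, not part of this API:

```python
# One hypothetical function tool in the common OpenAI-compatible shape.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

# Fragment of a request body wiring the tools in and disabling parallel calls.
request_fragment = {"tools": tools, "parallel_tool_calls": False}
```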
parallel_tool_calls
  Whether the model may emit multiple tool calls in parallel.

tool_choice
  Tool selection strategy. This can be a string mode or an object that pins a specific function.
  Allowed values (string mode): none, auto, required

Response: Success. Non-streaming mode returns JSON, and streaming mode returns an SSE event stream.
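The object form of tool_choice and the tool_calls field on assistant messages can be sketched as below. Both shapes are assumed to follow the common OpenAI-compatible format, and the sample assistant message is fabricated for illustration:

```python
import json

# String mode vs. pinning a specific function (OpenAI-compatible shape, assumed).
tool_choice_auto = "auto"
tool_choice_pinned = {"type": "function", "function": {"name": "get_weather"}}

# Fabricated assistant message carrying a tool call back to the caller.
assistant_message = {
    "role": "assistant",
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }
    ],
}

# Extract (name, parsed-arguments) pairs; arguments arrive as a JSON string.
calls = [
    (tc["function"]["name"], json.loads(tc["function"]["arguments"]))
    for tc in assistant_message.get("tool_calls", [])
]
```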