POST /v1/chat/completions
curl --request POST \
  --url https://api.powertokens.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "seed-2-0-pro-260328",
  "messages": [
    {
      "role": "system",
      "content": "You are a concise and accurate assistant."
    },
    {
      "role": "user",
      "content": "Summarize the core RAG flow in three sentences."
    }
  ],
  "thinking": {
    "type": "enabled"
  },
  "reasoning_effort": "medium",
  "temperature": 0.3,
  "max_completion_tokens": 1024,
  "stream": false
}
'
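The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl example above; the token is a placeholder, and the network call is left commented out.

```python
import json
import urllib.request

# Placeholder token; replace with a real API key.
TOKEN = "<token>"

# Request body, identical to the curl example above.
body = {
    "model": "seed-2-0-pro-260328",
    "messages": [
        {"role": "system", "content": "You are a concise and accurate assistant."},
        {"role": "user", "content": "Summarize the core RAG flow in three sentences."},
    ],
    "thinking": {"type": "enabled"},
    "reasoning_effort": "medium",
    "temperature": 0.3,
    "max_completion_tokens": 1024,
    "stream": False,
}

req = urllib.request.Request(
    "https://api.powertokens.ai/v1/chat/completions",
    data=json.dumps(body).encode("utf-8"),
    headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
    method="POST",
)
# resp = urllib.request.urlopen(req)  # network call, not executed here
# print(json.loads(resp.read())["choices"][0]["message"]["content"])
```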
{
  "id": "chatcmpl_bp_123",
  "object": "chat.completion",
  "created": 1742342400,
  "model": "seed-2-0-pro-260328",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "RAG usually has three steps: retrieve, augment, and generate. The system first embeds the question and retrieves relevant knowledge chunks. It then injects those chunks into the prompt so the model can answer with better factual grounding and fewer hallucinations."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 98,
    "total_tokens": 221
  }
}

Authorizations

Authorization
string
header
required

Send Authorization: Bearer <token> in the request headers.

Body

application/json

Request body for BytePlus chat completions.

model
enum<string>
required

Model name. Supported BytePlus chat models include seed-2-0-pro-260328, seed-2-0-lite-260228, seed-2-0-mini-260215, seed-1-8-251228, seed-1-6-250915, seed-1-6-flash-250715, deepseek-v3-2-251201, and gpt-oss-120b-250805.

Available options:
seed-2-0-pro-260328,
seed-2-0-lite-260228,
seed-2-0-mini-260215,
seed-1-8-251228,
seed-1-6-250915,
seed-1-6-flash-250715,
deepseek-v3-2-251201,
gpt-oss-120b-250805
Example:

"seed-2-0-pro-260328"

messages
object[]
required

Conversation message list. Supported roles are system, user, assistant, and tool. messages[].content accepts either plain text or an array of multimodal parts composed of text, image_url, and video_url parts.

Minimum array length: 1
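A multimodal user turn can be sketched as plain JSON. The image URL below is a placeholder, and the part shapes (`type`, `image_url`) follow the common OpenAI-compatible convention; verify them against the BytePlus model's capabilities before relying on them.

```python
import json

# A messages array mixing a plain-text system turn with a multimodal
# user turn. The image URL is a placeholder, not a real asset.
messages = [
    {"role": "system", "content": "You are a concise and accurate assistant."},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is shown in this picture?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    },
]

body = {"model": "seed-2-0-pro-260328", "messages": messages}
print(json.dumps(body, indent=2))
```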
thinking
object

Controls whether the model should enter deep reasoning mode.

stream
boolean
default:false

Whether to enable streaming output. When true, the response content type is text/event-stream.

stream_options
object

Streaming response options. Effective only when stream=true.
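When stream=true, the endpoint returns a text/event-stream body. A minimal parsing sketch is below; the `data:` line framing and the `[DONE]` sentinel follow the common OpenAI-compatible SSE convention, which is an assumption here rather than something this page specifies.

```python
import json

def iter_stream_content(lines):
    """Yield content deltas from SSE `data:` lines (OpenAI-style framing)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

# Simulated event stream for illustration.
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_content(sample)))  # prints "Hello"
```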

max_tokens
integer

Maximum answer length in tokens, excluding reasoning tokens.

max_completion_tokens
integer

Maximum total output length in tokens, including both answer and reasoning tokens. When set, max_tokens no longer applies.

stop

Stop sequences. Generation halts as soon as any sequence is produced. Accepts either a single string or an array of strings.

reasoning_effort
enum<string>

Caps the amount of reasoning work. minimal is fastest and high is deepest.

Available options:
minimal,
low,
medium,
high
response_format
object

Controls the answer format. Beta.

frequency_penalty
number

Frequency penalty. Higher values reduce repeated wording.

Required range: -2 <= x <= 2
presence_penalty
number

Presence penalty. Higher values encourage introducing new topics.

Required range: -2 <= x <= 2
temperature
number

Sampling temperature. Lower is more stable; higher is more diverse.

Required range: 0 <= x <= 2
top_p
number

Nucleus sampling parameter. Tune either top_p or temperature, not both.

Required range: 0 <= x <= 1
logprobs
boolean

Whether to return log probability data for output tokens.

top_logprobs
integer

When logprobs=true, how many candidate token log probabilities to return per position.

Required range: 0 <= x <= 20
tools
object[]

Function tool definitions available to the model.

parallel_tool_calls
boolean

Whether the model may emit multiple tool calls in parallel.

tool_choice

Tool selection strategy. This can be a string mode or an object that pins a specific function.

Available options:
none,
auto,
required
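To illustrate both forms of tool_choice, the sketch below defines one function tool and then either lets the model decide or pins that function. The `{"type": "function", "function": {"name": ...}}` object shape follows the common OpenAI-compatible convention and should be verified against the BytePlus docs; `get_weather` and its parameters are made up for illustration.

```python
# A function tool definition. `get_weather` is hypothetical.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# String mode: let the model decide whether to call a tool.
body_auto = {"tool_choice": "auto", "tools": tools}

# Object mode: force a call to one specific function.
body_pinned = {
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    "tools": tools,
}
```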

Response

Success. Non-streaming mode returns JSON, and streaming mode returns an SSE event stream.

Successful BytePlus chat completions response.

id
string

Response ID.

object
string

Object type such as chat.completion.

created
integer

Unix timestamp (in seconds) of when the response was created.

model
string

Actual model used for the request.

choices
object[]

List of generated completions. Each element carries an index, the assistant message, and a finish_reason.

usage
object

Token usage accounting.
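As a quick sanity check on the usage block, total_tokens should equal prompt_tokens plus completion_tokens, as in the sample response above:

```python
# Usage block copied from the sample response above.
usage = {"prompt_tokens": 123, "completion_tokens": 98, "total_tokens": 221}

assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print("usage totals are consistent")
```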