POST /v1/chat/completions
curl --request POST \
  --url https://api.powertokens.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "deepseek-v3-2-251201",
  "messages": [
    {
      "role": "system",
      "content": "You are a concise and accurate assistant."
    },
    {
      "role": "user",
      "content": "Summarize the core RAG workflow in three sentences."
    }
  ],
  "thinking": {
    "type": "enabled"
  },
  "reasoning_effort": "medium",
  "temperature": 0.3,
  "max_completion_tokens": 1024,
  "stream": false
}
'
{
  "id": "chatcmpl_bp_123",
  "object": "chat.completion",
  "created": 1742342400,
  "model": "deepseek-v3-2-251201",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The core RAG workflow typically consists of three steps: retrieval, augmentation, and generation. The system first converts the user query into a vector and retrieves relevant knowledge snippets, then concatenates the retrieved results with the original query into the prompt. Finally, the model generates an answer based on the augmented context, reducing hallucinations and improving factual accuracy."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 98,
    "total_tokens": 221
  }
}
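
For readers calling the endpoint from Python, the curl example above might translate into the following sketch. It uses only the standard library and builds the request without sending it. The helper name `build_request` is illustrative and not part of the API.

```python
import json
import urllib.request

API_URL = "https://api.powertokens.ai/v1/chat/completions"


def build_request(token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same body as the curl example.
body = {
    "model": "deepseek-v3-2-251201",
    "messages": [
        {"role": "system", "content": "You are a concise and accurate assistant."},
        {"role": "user", "content": "Summarize the core RAG workflow in three sentences."},
    ],
    "thinking": {"type": "enabled"},
    "reasoning_effort": "medium",
    "temperature": 0.3,
    "max_completion_tokens": 1024,
    "stream": False,
}

request = build_request("<token>", body)
# To send it, pass the object to urllib.request.urlopen(request).
```

Setting `stream` to `False` explicitly matters here because the parameter list gives `true` as its default.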

Authorizations

Authorization
string
header
required

Pass Authorization: Bearer <token> in the request header.

Body

application/json

BytePlus chat completions request body.

model
enum<string>
required

Model name. Supported BytePlus chat models include deepseek-v3-2-251201.

Available options: deepseek-v3-2-251201

Example: "deepseek-v3-2-251201"

messages
object[]
required

Message list. This document covers four roles: system, user, assistant, and tool. messages[].content supports plain text or multimodal parts composed of text, image_url, and video_url.

Minimum array length: 1
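
A rough sketch of how the two content forms could be built in Python follows. The plain-text form matches the request example on this page. The layout of a multimodal part (`{"type": ..., "image_url": {"url": ...}}`) is an assumption borrowed from the common OpenAI-style convention and is not confirmed by this page, so check it against the full schema. Both helper names are hypothetical.

```python
ROLES = {"system", "user", "assistant", "tool"}


def text_message(role: str, text: str) -> dict:
    """Plain-text message, as used in the request example."""
    if role not in ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    return {"role": role, "content": text}


def user_message_with_image(text: str, image_url: str) -> dict:
    """User message with one text part and one image part.

    ASSUMPTION: the part layout below follows the OpenAI-style
    convention; this page only names the part types.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


messages = [
    text_message("system", "You are a concise and accurate assistant."),
    user_message_with_image("Describe this picture.", "https://example.com/cat.png"),
]
```
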
thinking
object

Controls whether the model enables deep thinking mode.

stream
boolean
default:true

Whether to enable streaming output. When true, the response content type is text/event-stream.

stream_options
object

Additional options for streaming responses. Only effective when stream=true.
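
To illustrate what consuming a text/event-stream response might look like, here is a minimal parser sketch. This page does not describe the chunk format, so three things are assumptions taken from the common OpenAI-style convention: each event arrives on a `data:` line, the stream ends with `data: [DONE]`, and incremental text sits in `choices[].delta.content`. The sample lines are hand-written for the sketch, not captured output.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of every `data:` line, skipping everything else."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            yield line[len("data:"):].strip()


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the text deltas of a streamed completion."""
    parts = []
    for payload in iter_sse_data(lines):
        if payload == "[DONE]":  # ASSUMED end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}  # ASSUMED chunk layout
            parts.append(delta.get("content") or "")
    return "".join(parts)


# Hand-written sample stream, for illustration only.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
print(collect_text(sample))  # prints "Hello"
```

The sketch treats every `data:` line as a complete event, which is a simplification: the SSE format also allows one event to span several `data:` lines.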

max_tokens
integer

Maximum length of the model response (excluding chain-of-thought), in tokens.

max_completion_tokens
integer

Maximum total output length (including response and chain-of-thought), in tokens. When set, max_tokens is ignored.
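
The precedence between the two limits can be sketched as a small client-side helper. The function name `effective_output_limit` is hypothetical; it only mirrors the rule stated above and does not reproduce server code.

```python
from typing import Optional, Tuple


def effective_output_limit(body: dict) -> Optional[Tuple[str, int]]:
    """Return the (field, value) pair that governs output length.

    max_completion_tokens counts response plus chain-of-thought and,
    when set, causes max_tokens to be ignored.
    """
    if body.get("max_completion_tokens") is not None:
        return ("max_completion_tokens", body["max_completion_tokens"])
    if body.get("max_tokens") is not None:
        return ("max_tokens", body["max_tokens"])
    return None


# Both fields set: only max_completion_tokens applies.
print(effective_output_limit({"max_tokens": 256, "max_completion_tokens": 1024}))
```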

stop

Stop sequence. Can be a single string or an array of strings.

reasoning_effort
enum<string>

Limits how much reasoning the model performs. minimal is the fastest setting; high reasons most deeply.

Available options: minimal, low, medium, high
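
As a small sketch, a client could validate the effort level before sending. The helper `with_reasoning` is hypothetical. It sets `thinking` to `{"type": "enabled"}` only because the request example on this page sends both fields together; whether `reasoning_effort` requires thinking mode is not stated here.

```python
REASONING_EFFORTS = ("minimal", "low", "medium", "high")


def with_reasoning(body: dict, effort: str) -> dict:
    """Return a copy of the request body with reasoning settings applied."""
    if effort not in REASONING_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {REASONING_EFFORTS}")
    updated = dict(body)  # shallow copy, so the caller's dict is untouched
    updated["thinking"] = {"type": "enabled"}  # value taken from the request example
    updated["reasoning_effort"] = effort
    return updated


body = with_reasoning({"model": "deepseek-v3-2-251201"}, "medium")
```
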
response_format
object

Controls the response format (Beta).

frequency_penalty
number
default:0

Frequency penalty. Higher values suppress repetitive expressions.

Required range: -2 <= x <= 2

presence_penalty
number
default:0

Presence penalty. Higher values encourage the model to introduce new topics.

Required range: -2 <= x <= 2

temperature
number
default:0.7

Sampling temperature. Lower values produce more deterministic output; higher values produce more diverse output.

Required range: 0 <= x <= 2

top_p
number
default:0.95

Nucleus sampling parameter. Typically tuned as an alternative to temperature.

Required range: 0 <= x <= 1
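
The documented ranges for the four sampling parameters can be checked on the client before a request is sent. The following is a minimal sketch; the name `check_sampling` and the error message wording are illustrative.

```python
# Inclusive ranges as listed in the parameter descriptions.
RANGES = {
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
}


def check_sampling(body: dict) -> list:
    """Return one message per out-of-range parameter (empty list if none)."""
    errors = []
    for name, (low, high) in RANGES.items():
        value = body.get(name)
        if value is not None and not (low <= value <= high):
            errors.append(f"{name}={value} is outside [{low}, {high}]")
    return errors


print(check_sampling({"temperature": 0.3, "top_p": 0.95}))  # prints []
```
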
logprobs
boolean

Whether to return log probabilities of output tokens.

top_logprobs
integer

When logprobs=true, specifies how many candidate token log probabilities to return at each position.

Required range: 0 <= x <= 20
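
The relationship between the two fields could be guarded as follows. Rejecting `top_logprobs` when `logprobs` is not true is a conservative client-side assumption; this page does not say how the server treats that combination. The helper name `check_logprobs` is hypothetical.

```python
def check_logprobs(body: dict) -> None:
    """Raise ValueError if the logprobs settings look inconsistent."""
    top = body.get("top_logprobs")
    if top is None:
        return
    if body.get("logprobs") is not True:
        # ASSUMPTION: treated as an error here; server behavior is not documented.
        raise ValueError("top_logprobs only applies when logprobs is true")
    if not 0 <= top <= 20:
        raise ValueError("top_logprobs must be between 0 and 20")


check_logprobs({"logprobs": True, "top_logprobs": 5})  # passes silently
```
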
tools
object[]

List of function tool definitions available for the model to call.

parallel_tool_calls
boolean

Whether to allow the model to issue multiple tool calls in parallel.

tool_choice

Tool calling strategy. Can be a string mode or an object specifying a particular function.

Available options: none, auto, required
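
A sketch of how the tool-related fields might fit together is shown below. The three string modes come from this page. The shape of a tool definition and of the object form of `tool_choice` is an assumption based on the common OpenAI-style convention, and `get_weather` is a made-up function used only for illustration.

```python
TOOL_CHOICE_MODES = ("none", "auto", "required")


def function_tool(name: str, description: str, parameters: dict) -> dict:
    """One entry for the tools array (ASSUMED OpenAI-style layout)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,  # JSON Schema describing the arguments
        },
    }


def choose_tool(mode_or_name: str):
    """Return a string mode, or an object selecting one function by name."""
    if mode_or_name in TOOL_CHOICE_MODES:
        return mode_or_name
    # ASSUMED object form for selecting a particular function.
    return {"type": "function", "function": {"name": mode_or_name}}


# Hypothetical tool, for illustration only.
weather_tool = function_tool(
    "get_weather",
    "Look up the current weather for a city.",
    {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]},
)

body_fragment = {
    "tools": [weather_tool],
    "tool_choice": choose_tool("auto"),
    "parallel_tool_calls": False,
}
```
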

Response

Success. Returns JSON in non-streaming mode; returns an SSE event stream in streaming mode.

BytePlus chat completions success response.

id
string

Response ID.

object
string

Object type, e.g. chat.completion.

created
integer

Unix timestamp.

model
string

Actual model name used.

choices
object[]

usage
object

Token usage statistics.
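
Reading a non-streaming response can be sketched as below. The JSON is a trimmed copy of the sample response on this page, with the message content shortened to its first sentence. The helper name `summarize` is illustrative.

```python
import json

# Trimmed copy of the sample response (content shortened).
raw = """
{
  "id": "chatcmpl_bp_123",
  "object": "chat.completion",
  "created": 1742342400,
  "model": "deepseek-v3-2-251201",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The core RAG workflow typically consists of three steps: retrieval, augmentation, and generation."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 123, "completion_tokens": 98, "total_tokens": 221}
}
"""


def summarize(response: dict) -> dict:
    """Pull out the fields most callers need from a completion response."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage.get("total_tokens"),
    }


result = summarize(json.loads(raw))
print(result["finish_reason"], result["total_tokens"])  # prints "stop 221"
```

In the sample, `total_tokens` equals `prompt_tokens` plus `completion_tokens` (123 + 98 = 221).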