Authorizations
Use the following format for authentication: Bearer <your API key>
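For illustration, a minimal Python sketch of a request using this header format; the endpoint URL is a placeholder, and the role/content message shape follows the common OpenAI-compatible convention rather than anything specified in this section.

```python
import requests

API_KEY = "your-api-key"  # replace with your real key
URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint

response = requests.post(
    URL,
    headers={
        "Authorization": f"Bearer {API_KEY}",  # Bearer <your API key>
        "Content-Type": "application/json",
    },
    json={
        "model": "Qwen/QwQ-32B",
        "messages": [{"role": "user", "content": "Hello"}],  # assumed message shape
    },
)
print(response.json())
```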
Body
The model name to use. To maintain service quality, we make periodic changes to the models provided by this service, including but not limited to bringing models online or offline and adjusting model service capabilities. Where feasible, we will notify you of such changes through channels such as announcements or push messages.
Available options:
Tongyi-Zhiwen/QwenLong-L1-32B
Qwen/Qwen3-30B-A3B
Qwen/Qwen3-32B
Qwen/Qwen3-14B
Qwen/Qwen3-8B
Qwen/Qwen3-235B-A22B
THUDM/GLM-Z1-32B-0414
THUDM/GLM-4-32B-0414
THUDM/GLM-Z1-Rumination-32B-0414
THUDM/GLM-4-9B-0414
Qwen/QwQ-32B
Pro/deepseek-ai/DeepSeek-R1
Pro/deepseek-ai/DeepSeek-R1-0120
Pro/deepseek-ai/DeepSeek-V3
deepseek-ai/DeepSeek-R1
deepseek-ai/DeepSeek-V3
deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
Pro/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
Pro/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
deepseek-ai/DeepSeek-V2.5
Qwen/Qwen2.5-72B-Instruct-128K
Qwen/Qwen2.5-72B-Instruct
Qwen/Qwen2.5-32B-Instruct
Qwen/Qwen2.5-14B-Instruct
Qwen/Qwen2.5-7B-Instruct
Qwen/Qwen2.5-Coder-32B-Instruct
Qwen/Qwen2.5-Coder-7B-Instruct
Qwen/Qwen2-7B-Instruct
Qwen/Qwen2-1.5B-Instruct
Qwen/QwQ-32B-Preview
TeleAI/TeleChat2
THUDM/glm-4-9b-chat
Vendor-A/Qwen/Qwen2.5-72B-Instruct
internlm/internlm2_5-7b-chat
internlm/internlm2_5-20b-chat
Pro/Qwen/Qwen2.5-7B-Instruct
Pro/Qwen/Qwen2-7B-Instruct
Pro/Qwen/Qwen2-1.5B-Instruct
Pro/THUDM/chatglm3-6b
Pro/THUDM/glm-4-9b-chat
"Qwen/QwQ-32B"
A list of messages comprising the conversation so far.
1 - 10 elements
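As an illustrative sketch, a conversation of 1 - 10 elements might be built like this; the role/content shape is assumed from the usual chat-completions convention and is not spelled out in this section.

```python
# Hypothetical conversation history (1 - 10 elements); role/content shape is assumed.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the plot of Hamlet in two sentences."},
]
```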
If set, tokens are returned as Server-Sent Events as they become available. The stream terminates with data: [DONE].
Default: false
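A sketch of consuming the stream, assuming the field is named stream and the chunks follow the common OpenAI-compatible delta format; only the data: prefix and the data: [DONE] terminator are taken from the description above, and the endpoint URL is again a placeholder.

```python
import json
import requests

API_KEY = "your-api-key"
URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint

payload = {
    "model": "Qwen/QwQ-32B",
    "messages": [{"role": "user", "content": "Tell me a short story."}],
    "stream": True,  # assumed field name for the streaming switch
}

with requests.post(
    URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    stream=True,
) as resp:
    for raw_line in resp.iter_lines():
        if not raw_line:
            continue
        line = raw_line.decode("utf-8")
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":  # stream terminator described above
            break
        chunk = json.loads(data)
        # Chunk/delta shape assumed from the common OpenAI-compatible streaming format.
        print(chunk["choices"][0].get("delta", {}).get("content", ""), end="", flush=True)
```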
The maximum number of tokens to generate.
1 <= x <= 16384
Default: 512
Switches between thinking and non-thinking modes. Default is True. This field only applies to Qwen3.
Default: false
Maximum number of tokens for chain-of-thought output. This field applies to all Reasoning models.
128 <= x <= 32768
Default: 4096
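These two fields are not named in this extract; the sketch below assumes they map to enable_thinking and thinking_budget (verify the exact names against the request schema) and pairs them with a Qwen3 model from the list above.

```python
# Field names enable_thinking / thinking_budget are assumptions; confirm against the schema.
payload = {
    "model": "Qwen/Qwen3-32B",
    "messages": [{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    "enable_thinking": True,   # thinking mode on (Qwen3 only)
    "thinking_budget": 4096,   # cap on chain-of-thought tokens (128 - 32768)
}
```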
Dynamic filtering threshold that adapts based on token probabilities. This field only applies to Qwen3.
0 <= x <= 1
Default: 0.05
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Default: null
Determines the degree of randomness in the response.
Default: 0.7
The top_p (nucleus) parameter dynamically adjusts the number of candidate tokens considered for each prediction based on their cumulative probabilities.
Default: 0.7
The number of highest-probability candidate tokens to sample from (top_k).
Default: 50
Penalty applied to tokens based on how frequently they have already appeared (frequency_penalty).
Default: 0.5
Number of generations to return.
Default: 1
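Putting the sampling controls together, a request body might look like the sketch below; aside from top_p, which is named above, the field names (max_tokens, temperature, stop, n) are assumed from the usual OpenAI-compatible schema.

```python
# Field names other than top_p are assumed; values stay inside the ranges listed above.
payload = {
    "model": "Qwen/QwQ-32B",
    "messages": [{"role": "user", "content": "Write a haiku about autumn."}],
    "max_tokens": 512,   # 1 - 16384
    "temperature": 0.7,  # degree of randomness
    "top_p": 0.7,        # nucleus sampling threshold
    "stop": ["\n\n"],    # up to 4 stop sequences, excluded from the returned text
    "n": 2,              # number of generations to return
}
```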
An object specifying the format that the model must output.
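A minimal sketch, assuming the field is named response_format and accepts a JSON-object mode as in other OpenAI-compatible APIs; confirm the supported type values against the schema.

```python
# Assumed field name and "json_object" type value; confirm the supported options.
payload = {
    "model": "Qwen/QwQ-32B",
    "messages": [{"role": "user", "content": "List three primary colors as a JSON array."}],
    "response_format": {"type": "json_object"},
}
```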
A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A maximum of 128 functions is supported.
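As an illustration of a function tool, the sketch below uses the widely shared OpenAI-style schema (type "function" plus a JSON Schema parameters block); the weather function is hypothetical, and the exact shape should be checked against this API's schema.

```python
# Hypothetical weather-lookup tool; the tool schema shape is an assumption.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. Berlin"},
                },
                "required": ["city"],
            },
        },
    }
]

payload = {
    "model": "Qwen/QwQ-32B",
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
    "tools": tools,  # at most 128 functions
}
```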