Text and image

The latest service changes have not yet been reflected in this content. We will update the content as soon as possible. Please refer to the Korean version for information on the latest updates.

Available in Classic and VPC

This guide describes Chat Completions v3, which can use the HCX-005 vision model that can interpret and understand images, and the lightweight HCX-DASH-002 model.

Request

This section describes the request format. The method and URI are as follows:

Method	URI	Description
POST	/v3/chat-completions/{modelName}	Generate sentences using the model.
POST	/v3/tasks/{taskId}/chat-completions	Generate sentences using a task trained through tuning. Image input, inference, function calling, and structured outputs are not supported.

Request headers

The following describes the request headers.

Headers	Required	Description
`Authorization`	Required	API key for authentication Example: `Bearer nv-************`
`X-NCP-CLOVASTUDIO-REQUEST-ID`	Optional	Request ID for the request
`Content-Type`	Required	Request data format `application/json`
`Accept`	Conditional	Response data format `text/event-stream`

Note

Response results are returned in JSON by default, but if you specify Accept as text/event-stream, then the response results are returned as a stream.
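As a rough illustration of the header rules above, the following Python sketch assembles the request headers. The helper name `build_headers` and its parameters are hypothetical and not part of the API; only the header names and values come from the table.

```python
from typing import Dict, Optional


def build_headers(api_key: str, request_id: Optional[str] = None,
                  stream: bool = False) -> Dict[str, str]:
    """Assemble request headers (hypothetical helper, not part of the API)."""
    headers = {
        "Authorization": f"Bearer {api_key}",   # required
        "Content-Type": "application/json",     # required
    }
    if request_id is not None:
        # Optional request ID for the request.
        headers["X-NCP-CLOVASTUDIO-REQUEST-ID"] = request_id
    if stream:
        # Only set Accept when a streamed response is wanted;
        # without it the response is returned as JSON.
        headers["Accept"] = "text/event-stream"
    return headers
```

Calling `build_headers(key)` yields headers for a JSON response, while `build_headers(key, stream=True)` asks for a stream.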

Request path parameters

You can use the following path parameters with your request:

Field	Type	Required	Description
`modelName`	Enum	Required	Model name Example: `HCX-005`

Note

HCX-005 and HCX-DASH-002 are only available in the Chat Completions v3 API.
Image input is only available in HCX-005, the HyperCLOVA X vision model, and tuning training is not supported.

Request body

You can include the following data in the body of your request:

Field	Type	Required	Description
`messages`	Array	Required	Conversation messages
`topP`	Double	Optional	Sample generated token candidates based on cumulative probability. 0.00 ＜ `topP` ≤ 1.00 (default: 0.80)
`topK`	Integer	Optional	Sample K high-probability candidates from the pool of generated token candidates 0 ≤ `topK` ≤ 128 (default: 0)
`maxTokens`	Integer	Optional	Maximum number of generated tokens 1 ≤ `maxTokens` ≤ model maximum value Can't be used concurrently with `maxCompletionTokens`.
`maxCompletionTokens`	Integer	Optional	Maximum number of generated tokens (inference model) 1 ≤ `maxCompletionTokens` ≤ model maximum value Can't be used concurrently with `maxTokens`.
`temperature`	Double	Optional	Degree of diversity for the generated tokens (higher values generate more diverse sentences) 0.00 ≤ `temperature` ≤ 1.00 (default: 0.50)
`repetitionPenalty`	Double	Optional	Degree of penalty for generating the same token (the higher the setting, the less likely it is to generate the same result repeatedly) 0 ＜ `repetitionPenalty` ≤ 2.0 (default: 1.1)
`stop`	Array	Optional	Token generation stop character [] (default) Can't be used for inference.
`seed`	Integer	Optional	Controls how reproducible the output is across repeated runs of the model. 0: a random seed is used for each run (default). 1 ≤ `seed` ≤ 4294967295: a user-specified `seed` value, such as the `seed` of a result you want to reproduce consistently
`includeAiFilters`	Boolean	Optional	Whether to display the AI filter results (degree of the generated results in categories such as profanity, degradation/discrimination/hate, sexual harassment/obscenity, etc.) `true` (default) \| `false` `true`: display `false`: do not display
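The ranges in the table can be checked on the client before a request is sent. The sketch below is one possible way to do that; `check_sampling_params` is a hypothetical helper, and it does not check the model-specific upper bound of `maxTokens` or `maxCompletionTokens`.

```python
from typing import Any, Dict, List


def check_sampling_params(body: Dict[str, Any]) -> List[str]:
    """Return a list of violations; an empty list means none were found."""
    problems = []
    if "maxTokens" in body and "maxCompletionTokens" in body:
        problems.append("maxTokens and maxCompletionTokens can't be used together")
    for key in ("maxTokens", "maxCompletionTokens"):
        if key in body and body[key] < 1:
            problems.append(f"{key} must be at least 1")
    if "topP" in body and not (0.0 < body["topP"] <= 1.0):
        problems.append("topP must satisfy 0.00 < topP <= 1.00")
    if "topK" in body and not (0 <= body["topK"] <= 128):
        problems.append("topK must satisfy 0 <= topK <= 128")
    if "temperature" in body and not (0.0 <= body["temperature"] <= 1.0):
        problems.append("temperature must satisfy 0.00 <= temperature <= 1.00")
    if "repetitionPenalty" in body and not (0 < body["repetitionPenalty"] <= 2.0):
        problems.append("repetitionPenalty must satisfy 0 < repetitionPenalty <= 2.0")
    if "seed" in body and not (0 <= body["seed"] <= 4294967295):
        problems.append("seed must satisfy 0 <= seed <= 4294967295")
    return problems
```

Fields that are omitted are skipped, since the API applies its own defaults for them.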

`messages`

The following describes messages.

Field	Type	Required	Description
`role`	Enum	Required	Role of conversation messages `system` \| `user` \| `assistant` `system`: directives that define roles `user`: user utterances/questions `assistant`: answers to user utterances/questions
`content`	String \| Array	Required	Conversation message content Enter text (string). Enter by composing text and image URLs (array)

`content`

The following describes content.

Field	Type	Required	Description
`type`	Enum	Required	Format of conversation message content `text` \| `image_url` `text` : text `image_url` : image URL
`text`	String	Conditional	Conversation message content Enter text. Required if `type` is `text`
`imageUrl`	Object	Conditional	Image list If `type` is `image_url`, `imageUrl` or `dataUri` must be entered. One image per turn can be included. Recommended to request with `text` for best results
`imageUrl.url`	String	Conditional	Public URL of a single image, including file extension Supported image specifications Format: BMP, PNG, JPG, JPEG, WEBP Size: larger than 0 bytes and up to 20 MB Aspect ratio: width to height must be within 1:5 to 5:1 Length: The longer side (horizontal or vertical) must be 2240 px or less. The shorter side must be 4 px or more.
`dataUri`	Object	Conditional	Image list If `type` is `image_url`, `imageUrl` or `dataUri` must be entered. One image per turn can be included. Recommended to request with `text` for best results
`dataUri.data`	String	Conditional	Base64-encoded image string Supported image specifications Format: BMP, PNG, JPG, JPEG, WEBP Size: larger than 0 bytes and up to 20 MB Aspect ratio: width to height must be within 1:5 to 5:1 Length: The longer side (horizontal or vertical) must be 2240 px or less. The shorter side must be 4 px or more.
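To make the `content` array concrete, here is a minimal Python sketch that builds the three kinds of content parts and pre-checks an image against the listed size limits. All function names are hypothetical. The sketch assumes that 20 MB means 20 × 1024 × 1024 bytes and that the image width, height, and byte size are already known to the caller.

```python
import base64
from typing import Any, Dict


def text_part(text: str) -> Dict[str, Any]:
    return {"type": "text", "text": text}


def image_url_part(url: str) -> Dict[str, Any]:
    # Image referenced by a public URL.
    return {"type": "image_url", "imageUrl": {"url": url}}


def image_data_part(raw: bytes) -> Dict[str, Any]:
    # Image sent inline as a Base64-encoded string.
    encoded = base64.b64encode(raw).decode("ascii")
    return {"type": "image_url", "dataUri": {"data": encoded}}


def image_spec_ok(width_px: int, height_px: int, size_bytes: int) -> bool:
    """Check size, side lengths, and aspect ratio (assumed 20 MB = 20 MiB)."""
    long_side = max(width_px, height_px)
    short_side = min(width_px, height_px)
    return (
        0 < size_bytes <= 20 * 1024 * 1024
        and long_side <= 2240
        and short_side >= 4
        and long_side <= 5 * short_side  # ratio within 1:5 to 5:1
    )
```

A user message combining an image and a question could then be written as `{"role": "user", "content": [image_url_part(url), text_part("Please describe this photo.")]}`.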

Note

When entering some fields, check the following:

role: Only one conversation message with the system role can be included per request.
HCX-005
- The sum of the input tokens and the output tokens cannot exceed 128,000 tokens.
- The input tokens can be up to 128,000 tokens.
- The output tokens (maxTokens) to be requested from the model can be up to 4096 tokens.
- messages: One image can be included per turn, and up to five images can be included per request.
  - The total request body size must be 50 MB or less. Therefore, if you want to include multiple images in a request, we recommend using image URLs rather than Base64 format.
HCX-DASH-002
- The sum of the input tokens and the output tokens cannot exceed 32,000 tokens.
- The input tokens can be up to 32,000 tokens.
- The output tokens (maxTokens) to be requested from the model can be up to 4096 tokens.
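
Some of the constraints in this note can be verified before sending a request. The following Python sketch is one way to do so. `check_request` and `count_images` are hypothetical helpers, each message is assumed to count as one turn, and token counts of the input are not checked because they are only known after tokenization.

```python
from typing import Any, Dict, List

# Maximum output tokens (maxTokens) per model, taken from the note above.
MAX_OUTPUT_TOKENS = {"HCX-005": 4096, "HCX-DASH-002": 4096}


def count_images(message: Dict[str, Any]) -> int:
    """Count image parts in one message; string content has no images."""
    content = message.get("content")
    if not isinstance(content, list):
        return 0
    return sum(1 for part in content if part.get("type") == "image_url")


def check_request(model: str, body: Dict[str, Any]) -> List[str]:
    """Return a list of violations; an empty list means none were found."""
    problems = []
    messages = body.get("messages", [])
    if sum(1 for m in messages if m.get("role") == "system") > 1:
        problems.append("only one system message is allowed per request")
    images = [count_images(m) for m in messages]
    if model == "HCX-005":
        if any(n > 1 for n in images):
            problems.append("only one image is allowed per turn")
        if sum(images) > 5:
            problems.append("only five images are allowed per request")
    elif sum(images) > 0:
        problems.append("image input is only available in HCX-005")
    limit = MAX_OUTPUT_TOKENS.get(model)
    if limit is not None and body.get("maxTokens", 0) > limit:
        problems.append(f"maxTokens can be up to {limit} for {model}")
    return problems
```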

Request example

The request example is as follows:

curl --location --request POST 'https://clovastudio.stream.ntruss.com/v3/chat-completions/HCX-005' \
--header 'Authorization: Bearer {CLOVA Studio API Key}' \
--header 'X-NCP-CLOVASTUDIO-REQUEST-ID: {Request ID}' \
--header 'Content-Type: application/json' \
--header 'Accept: text/event-stream' \
--data '{
    "messages": [
      {
        "role": "system",
        "content": [
          {
            "type": "text",
            "text": "- This is a friendly AI assistant."
          }
        ]
      },
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "imageUrl": {
              "url": "https://www.******.com/image_a1b1c1.png"
            }
          },
          {
            "type": "text",
            "text": "Please describe this photo."
          }
        ]
      }
    ],
    "topP": 0.8,
    "topK": 0,
    "maxTokens": 100,
    "temperature": 0.5,
    "repetitionPenalty": 1.1,
    "stop": []
  }'
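
For readers who prefer Python over curl, the same request can be sketched with the standard library. This is a minimal illustration, not an official client: `build_request` is a hypothetical helper, the `Accept` header is left out so that the response comes back as JSON, and the function only constructs the request without sending it.

```python
import json
import urllib.request

API_URL = "https://clovastudio.stream.ntruss.com/v3/chat-completions/HCX-005"


def build_request(api_key: str, image_url: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a POST request with one image and one question."""
    body = {
        "messages": [
            {
                "role": "system",
                "content": [{"type": "text", "text": "- This is a friendly AI assistant."}],
            },
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "imageUrl": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            },
        ],
        "topP": 0.8,
        "topK": 0,
        "maxTokens": 100,
        "temperature": 0.5,
        "repetitionPenalty": 1.1,
        "stop": [],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and read the JSON body from the response.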

Response

This section describes the response format.

Response headers

The following describes the response headers.

Headers	Required	Description
`Content-Type`	-	Response data format `application/json`

Response body

The response body includes the following data:

Field	Type	Required	Description
`status`	Object	-	Response status
`result`	Object	-	Response result
`result.created`	Integer	-	Response date and time, as a Unix timestamp in milliseconds
`result.usage`	Object	-	Token usage
`result.usage.completionTokens`	Integer	-	Generated token count
`result.usage.promptTokens`	Integer	-	Number of input (prompt) tokens
`result.usage.totalTokens`	Integer	-	Total number of tokens Number of generated tokens + number of input tokens
`result.message`	Object	-	Conversation messages
`result.message.role`	Enum	-	Role of conversation messages `system` \| `user` \| `assistant` `system`: directives that define roles `user`: user utterances/questions `assistant`: answers of the model
`result.message.content`	String	-	Content of conversation messages
`result.finishReason`	String	-	Reason for stopping token generation (generally passed to the last event) `length` \| `stop` \| `tool_calls` `length`: length limit `stop`: character specified in `stop` occurred during answer generation. `tool_calls`: Model successfully completed tool call.
`result.seed`	Integer	-	Input seed value (Return a random value when 0 is entered or not entered)
`result.aiFilter`	Array	-	AI Filter result

`aiFilter`

The following describes aiFilter.

Field	Type	Required	Description
`groupName`	String	-	AI Filter category `curse` \| `unsafeContents` `curse`: degradation, discrimination, hate, and profanity `unsafeContents`: sexual harassment, obscenity
`name`	String	-	AI Filter subcategory `discrimination` \| `insult` \| `sexualHarassment` `discrimination`: degradation, discrimination, hate `insult`: profanity `sexualHarassment`: sexual harassment, obscenity
`score`	String	-	AI Filter score `-1` \| `0` \| `1` \| `2` `-1`: AI Filter error occurred. `0`: Conversation messages are highly likely to contain sensitive/hazardous language. `1`: Conversation messages are likely to contain sensitive/hazardous language. `2`: Conversation messages are unlikely to contain sensitive/hazardous language.
`result`	String	-	Whether AI Filter is operating properly `OK` \| `ERROR` `OK`: normal operation `ERROR`: error occurred

Note

AI Filter can analyze up to 500 characters. However, if the text being analyzed contains many unusual formats, emojis, or special characters, it may not be analyzed correctly.
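
Because a lower `score` indicates a higher likelihood of sensitive language and `-1` signals a filter error, client code may want to handle the scores explicitly. The sketch below shows one possible interpretation; `flagged_categories` and its `threshold` parameter are hypothetical and not part of the API.

```python
from typing import Any, Dict, List


def flagged_categories(ai_filter: List[Dict[str, Any]], threshold: int = 1) -> List[str]:
    """Return subcategory names whose score is between 0 and threshold.

    Scores arrive as strings; -1 (filter error) is never flagged here and
    should be handled separately by the caller.
    """
    flagged = []
    for entry in ai_filter:
        score = int(entry["score"])
        if 0 <= score <= threshold:
            flagged.append(entry["name"])
    return flagged
```

With the default threshold, entries scored `0` or `1` are reported; passing `threshold=0` reports only the highest-risk entries.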

Response example

The response example is as follows:

Succeeded

The following is a sample response upon a successful call.

{
    "status": {
        "code": "20000",
        "message": "OK"
    },
    "result": {
        "created": 1791043155000,
        "usage": {
            "completionTokens": 80,
            "promptTokens": 843,
            "totalTokens": 923
        },        
        "message": {
            "role": "assistant",
            "content": "The photo shows a young child feeding a sheep. The child is wearing a blue outfit and a striped hat. The child appears to be concentrating, while the sheep is lowering its head to eat the food the child is offering. Other sheep can be seen in the background, suggesting that the location is a sheep farm."
        },
        "seed": 1561390649,
        "aiFilter": [
            {
                "groupName": "curse",
                "name": "insult",
                "score": "1"
            },
            {
                "groupName": "curse",
                "name": "discrimination",
                "score": "0"
            },
            {
                "groupName": "unsafeContents",
                "name": "sexualHarassment",
                "score": "2"
            }
        ]
    }
}
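
A response like the one above could be unpacked as follows. This is a sketch under assumptions: `summarize_response` is a hypothetical helper, and it relies only on the fields documented in the response body table (`created` in milliseconds, and `totalTokens` as the sum of generated and input tokens).

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict


def summarize_response(raw: str) -> Dict[str, Any]:
    """Extract the answer text, creation time, and a usage sanity check."""
    payload = json.loads(raw)
    result = payload["result"]
    usage = result["usage"]
    return {
        "status_code": payload["status"]["code"],
        "text": result["message"]["content"],
        # created is a Unix timestamp in milliseconds.
        "created": datetime.fromtimestamp(result["created"] / 1000, tz=timezone.utc),
        # totalTokens should equal generated tokens plus input tokens.
        "usage_consistent": (
            usage["completionTokens"] + usage["promptTokens"] == usage["totalTokens"]
        ),
    }
```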

Failure

The following is a sample response upon a failed call.

Response stream

You can use token streaming to output the tokens as they are generated, one by one. The following describes the token streaming format.

Response headers

The following describes the response headers.

Headers	Required	Description
`Content-Type`	-	Response data format `text/event-stream`

Response body

The response body includes the following data:

StreamingChatCompletionsTokenEvent

The following describes StreamingChatCompletionsTokenEvent.

Field	Type	Required	Description
`created`	Integer	-	Response timestamp
`usage`	Object	-	Token usage
`usage.promptTokens`	Integer	-	Number of input (prompt) tokens
`usage.completionTokens`	Integer	-	Generated token count
`message`	Object	-	Conversation messages
`message.role`	Enum	-	Conversation message role `user` \| `assistant` `user`: user's utterance or question `assistant`: model's answer
`message.content`	String	-	Content of conversation messages
`finishReason`	String	-	Reason for stopping token generation (typically passed to the last event) `length` \| `stop` `length`: length limit `stop`: character specified in `stop` occurred during answer generation.

StreamingChatCompletionsResultEvent

The following describes StreamingChatCompletionsResultEvent.

Field	Type	Required	Description
`created`	Integer	-	Response timestamp
`usage`	Object	-	Token usage
`usage.promptTokens`	Integer	-	Number of input (prompt) tokens
`usage.completionTokens`	Integer	-	Generated token count
`usage.totalTokens`	Integer	-	Total number of tokens Number of generated tokens + number of input tokens
`message`	Object	-	Conversation messages
`message.role`	Enum	-	Conversation message role `user` \| `assistant` `user`: user's utterance or question `assistant`: model's answer
`message.content`	String	-	Content of conversation messages
`finishReason`	String	-	Reason for stopping token generation (typically passed to the last event) `length` \| `stop` `length`: length limit `stop`: character specified in `stop` occurred during answer generation.
`aiFilter`	Array	-	AI Filter result

ErrorEvent

The following describes ErrorEvent.

Field	Type	Required	Description
`status`	Object	-	Response status
`status.code`	Object	-	Response status code See CLOVA Studio troubleshooting.
`status.message`	Object	-	Response status message See CLOVA Studio troubleshooting.

SignalEvent

The following describes SignalEvent.

Field	Type	Required	Description
`data`	String	-	Signal data information to pass

Response example

The response example is as follows:

Succeeded

The following is a sample response upon a successful call.

id: aabdfe-dfgwr-edf-hpqwd-f3asd-g
event: token
data: {"message": {"role": "assistant", "content": "He"}, "finishReason": null, "created": 1744710905, "seed": 3284419119, "usage": null}

id: aabdfe-dfgwr-edf-hpqwd-f2asd-g
event: token
data: {"message": {"role": "assistant", "content": "llo"}, "finishReason": null, "created": 1744710905, "seed": 3284419119, "usage": null}

id: aabdfe-dfgwr-edf-hpqwd-f1asd-g
event: result
data: {"message": {"role": "assistant", "content": "Hello"}, "finishReason": "stop", "created": 1744710905, "seed": 3284419119, "usage": {"promptTokens": 20, "completionTokens": 5, "totalTokens": 25}}
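
A stream in this shape can be split into events with a few lines of Python. The parser below is a simplified sketch, not a complete server-sent events implementation: `parse_sse` is a hypothetical helper, it assumes each event carries a single `data:` line, and it keeps `data` as a plain string when it is not valid JSON.

```python
import json
from typing import Any, Dict, List


def parse_sse(stream_text: str) -> List[Dict[str, Any]]:
    """Split a text/event-stream payload into dicts with id, event, and data."""
    events = []
    normalized = stream_text.replace("\r\n", "\n").strip()
    for block in normalized.split("\n\n"):  # events are separated by blank lines
        fields: Dict[str, Any] = {}
        for line in block.splitlines():
            name, sep, value = line.partition(":")  # split at the first colon only
            if sep:
                fields[name.strip()] = value.strip()
        if "data" in fields:
            try:
                fields["data"] = json.loads(fields["data"])
            except ValueError:
                pass  # leave non-JSON data as a string
        if fields:
            events.append(fields)
    return events
```

Concatenating `data["message"]["content"]` from the `token` events reproduces the text that the final `result` event carries in full.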

Failure

The following is a sample response upon a failed call.