Chat Completions


Available in Classic and VPC

This guide describes the Chat Completions API, which generates conversational sentences using the HCX-003 model or the lightweight HCX-DASH-001 model.

Request

This section describes the request format. The method and URI are as follows:

Method URI
POST
  • /v1/chat-completions/{modelName}
    • Generate sentences using a base model.
  • /v1/tasks/{taskId}/chat-completions
  • /v2/tasks/{taskId}/chat-completions
    • Generate sentences using a model trained through tuning.

Request headers

The following describes the request headers.

Headers Required Description
Authorization Required API key for authentication Example: Bearer nv-************
X-NCP-CLOVASTUDIO-REQUEST-ID Optional ID that identifies the request
Content-Type Required Request data format
  • application/json
Accept Conditional Response data format
  • text/event-stream
Note
  • Response results are returned in JSON by default, but if you specify Accept as text/event-stream, then the response results are returned as a stream.

Request path parameters

You can use the following path parameters with your request:

Field Type Required Description
modelName String Conditional Model name
  • Required when generating sentences with a base model
  • Example: HCX-003
taskId String Conditional Training task ID
  • Required when generating sentences with a model trained through tuning
  • Can be checked in the response body of the Create training API

Request body

The request body includes the following data:

Field Type Required Description
messages Array Required Conversation messages
temperature Double Optional Degree of diversity of the generated tokens (higher values generate more diverse sentences)
  • 0.00 < temperature ≤ 1.00 (default: 0.50)
  • Displayed to two decimal places
topK Integer Optional Sample K high-probability candidates from the pool of generated token candidates
  • 0 ≤ topK ≤ 128 (default: 0)
topP Double Optional Sample generated token candidates based on cumulative probability
  • 0.00 < topP ≤ 1.00 (default: 0.80)
  • Displayed to two decimal places
repeatPenalty Double Optional Degree of penalty for generating the same token (higher values make repeated output less likely)
  • 0.0 < repeatPenalty ≤ 10.0 (default: 5.0)
stopBefore Array Optional Strings that stop token generation
  • [] (default)
maxTokens Integer Optional Maximum number of tokens to generate
  • 0 < maxTokens ≤ 4096 (default: 100)
includeAiFilters Boolean Optional Whether to include the AI Filter results (scores for the generated output in categories such as profanity; degradation, discrimination, and hate; and sexual harassment and obscenity)
  • false (default) | true
    • false: do not include
    • true: include
seed Integer Optional Consistency level of the output across repeated model calls
  • 0: randomized consistency (default)
  • 1 ≤ seed ≤ 4294967295: user-specified seed value for results you want to generate consistently
Note

When entering some fields, check the following:

• HCX-003
  • The sum of the input tokens and the output tokens cannot exceed 8192 tokens.
  • The input tokens can be up to 7600 tokens.
  • The output tokens (maxTokens) requested from the model can be up to 4096 tokens.
• HCX-DASH-001
  • The sum of the input tokens and the output tokens cannot exceed 4096 tokens.
  • The input tokens can be up to 3500 tokens.
  • The output tokens (maxTokens) requested from the model can be up to 4096 tokens.
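The limits in the note above can be checked before sending a request. This is an illustrative helper (the dictionary and function names are not part of the API), and it assumes the input token count is already known, e.g. from a tokenizer:

```python
# Per-model token limits from the note above (illustrative structure).
LIMITS = {
    "HCX-003": {"total": 8192, "input": 7600, "output": 4096},
    "HCX-DASH-001": {"total": 4096, "input": 3500, "output": 4096},
}

def validate_token_budget(model: str, input_tokens: int,
                          max_tokens: int) -> bool:
    """Return True if the request fits within the model's token limits."""
    limits = LIMITS[model]
    return (
        input_tokens <= limits["input"]
        and max_tokens <= limits["output"]
        and input_tokens + max_tokens <= limits["total"]
    )
```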

messages

The following describes messages.

Field Type Required Description
role Enum Required Role of conversation messages
• system | user | assistant
  • system: directives that define roles
  • user: user utterances/questions
  • assistant: answers to user utterances/questions
content String Required Content of conversation messages
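Putting the messages format together with the request-body defaults documented above, a payload builder might look like this. The function names are hypothetical and the defaults mirror the documented field defaults:

```python
VALID_ROLES = {"system", "user", "assistant"}

def make_message(role: str, content: str) -> dict:
    # Roles are limited to the three documented values.
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role}")
    return {"role": role, "content": content}

def build_request_body(messages: list, **overrides) -> dict:
    """Build a request body; keyword overrides replace documented defaults."""
    body = {
        "messages": messages,
        "temperature": 0.5,
        "topK": 0,
        "topP": 0.8,
        "repeatPenalty": 5.0,
        "stopBefore": [],
        "maxTokens": 100,
        "includeAiFilters": False,
    }
    body.update(overrides)
    return body
```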

Request example

The request example is as follows:

curl --location --request POST 'https://clovastudio.stream.ntruss.com/v1/chat-completions/HCX-003' \
--header 'Authorization: Bearer {API Key}' \
--header 'X-NCP-CLOVASTUDIO-REQUEST-ID: {Request ID}' \
--header 'Content-Type: application/json' \
--header 'Accept: text/event-stream' \
--data '{
  "topK" : 0,
  "includeAiFilters" : true,
  "maxTokens" : 256,
  "temperature" : 0.5,
  "messages" : [ {
    "role" : "system",
    "content" : "test"
  }, {
    "role" : "user",
    "content" : "Let'\''s test it."
  }, {
    "role" : "assistant",
    "content" : "Understood. What would you like to test?"
  } ],
  "stopBefore" : [ ],
  "repeatPenalty" : 5.0,
  "topP" : 0.8
}'
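The same call can be made from Python's standard library. This sketch only builds the request object (sending it with urllib.request.urlopen is left to the caller), and the placeholder API key must be replaced with a real one:

```python
import json
import urllib.request

API_URL = "https://clovastudio.stream.ntruss.com/v1/chat-completions/HCX-003"

def chat_completion_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send: urllib.request.urlopen(chat_completion_request(api_key, body))
# returns a JSON body in the format described in the Response section.
```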
    

Response

This section describes the response format.

Response headers

The following describes the response headers.

Headers Required Description
Content-Type - Response data format
• application/json

Response body

The response body includes the following data:

Field Type Required Description
status Object - Response status
result Object - Response result
result.message Object - Conversation messages
result.message.role Enum - Role of conversation messages
• system | user | assistant
  • system: directives that define roles
  • user: user utterances/questions
  • assistant: answers of the model
result.message.content String - Content of conversation messages
result.stopReason Enum - Reason for stopping results generation
• length | end_token | stop_before
  • length: length limit
  • end_token: token count limit
  • stop_before:
    • The model terminated its output on its own.
    • A string specified in stopBefore occurred during answer generation.
result.inputLength Integer - Number of input tokens (for billing, special tokens such as END OF TURN are included)
result.outputLength Integer - Number of response tokens
result.seed Integer - Input seed value (a random value is returned when 0 is entered or the field is omitted)
result.aiFilter Array - AI Filter result
aiFilter

The following describes aiFilter.

Field Type Required Description
groupName String - AI Filter category
• curse | unsafeContents
  • curse: degradation, discrimination, hate, and profanity
  • unsafeContents: sexual harassment, obscenity
name String - AI Filter subcategory
• discrimination | insult | sexualHarassment
  • discrimination: degradation, discrimination, hate
  • insult: profanity
  • sexualHarassment: sexual harassment, obscenity
score String - AI Filter score
• -1 | 0 | 1 | 2
  • -1: An AI Filter error occurred.
  • 0: Conversation messages are highly likely to contain sensitive/hazardous language.
  • 1: Conversation messages are likely to contain sensitive/hazardous language.
  • 2: Conversation messages are unlikely to contain sensitive/hazardous language.
result String - Whether AI Filter is operating properly
• OK | ERROR
  • OK: normal operation
  • ERROR: error occurred

Note

AI Filter can analyze up to 500 characters. However, if the text being analyzed contains many unusual formats, emojis, or special characters, it may not be analyzed correctly.
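Since lower scores mean higher risk, a client can flag responses by checking every aiFilter entry. This helper is a sketch (the function and its threshold default are not part of the API), and treating -1 errors as not flagged is a design choice, not documented behavior:

```python
def is_flagged(ai_filter: list, threshold: int = 1) -> bool:
    """Return True if any AI Filter entry scored at or below `threshold`.

    Per the score table above, 0 = highly likely and 1 = likely to
    contain sensitive/hazardous language; -1 (filter error) is
    deliberately ignored here.
    """
    return any(0 <= int(f["score"]) <= threshold for f in ai_filter)
```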

Response example

The response example is as follows:

Succeeded

The following is a sample response upon a successful call.

{
  "status": {
    "code": "20000",
    "message": "OK"
  },
  "result": {
    "message": {
      "role": "assistant",
      "content": "Phrase: Record what happened today, and prepare for tomorrow. A journal will make your life richer.\n"
    },
    "stopReason": "LENGTH",
    "inputLength": 100,
    "outputLength": 10,
    "aiFilter": [
      {
        "groupName": "curse",
        "name": "insult",
        "score": "1"
      },
      {
        "groupName": "curse",
        "name": "discrimination",
        "score": "0"
      },
      {
        "groupName": "unsafeContents",
        "name": "sexualHarassment",
        "score": "2"
      }
    ]
  }
}
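A client typically checks the status code before reading the result. This sketch assumes, based on the sample above, that "20000" indicates success; consult the API's error code list for the full set:

```python
def extract_answer(response: dict) -> str:
    """Pull the assistant message content out of a response body dict."""
    status = response.get("status", {})
    if status.get("code") != "20000":  # assumed success code from the sample
        raise RuntimeError(status.get("message", "request failed"))
    return response["result"]["message"]["content"]
```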
    

Failure

The following is a sample response upon a failed call.

Response stream

You can use token streaming to output tokens one by one, as they are generated. The following describes the token streaming format.

Response headers

The following describes the response headers.

Headers Required Description
Accept - Response data format
• text/event-stream

Response body

The response body includes the following data:

StreamingChatCompletionsResultEvent

The following describes StreamingChatCompletionsResultEvent.

Field Type Required Description
message Object - Conversation messages
message.role Enum - Role of conversation messages
• system | user | assistant
  • system: directives that define roles
  • user: user utterances/questions
  • assistant: answers of the model
message.content String - Content of conversation messages
stopReason Enum - Reason for stopping results generation
• length | end_token | stop_before
  • length: length limit
  • end_token: token count limit
  • stop_before: a string specified in stopBefore occurred during answer generation
inputLength Integer - Number of input tokens (for billing, special tokens such as END OF TURN are included)
outputLength Integer - Number of response tokens (for billing, special tokens such as END OF TURN are included)
aiFilter Array - AI Filter result

StreamingChatCompletionsTokenEvent

The following describes StreamingChatCompletionsTokenEvent.

Field Type Required Description
id String - Event ID that identifies the request
message Object - Conversation messages
message.role Enum - Role of conversation messages
• system | user | assistant
  • system: directives that define roles
  • user: user utterances/questions
  • assistant: answers of the model
message.content String - Content of conversation messages
inputLength Integer - Number of input tokens (for billing, special tokens such as END OF TURN are included)
outputLength Integer - Number of response tokens (for billing, special tokens such as END OF TURN are included)
stopReason Enum - Reason for stopping results generation
• length | end_token | stop_before
  • length: length limit
  • end_token: token count limit
  • stop_before:
    • The model completed its output on its own.
    • A string specified in stopBefore occurred during answer generation.

ErrorEvent

The following describes ErrorEvent.

Field Type Required Description
status Object - Response status

SignalEvent

The following describes SignalEvent.

Field Type Required Description
data String - Signal data information to pass
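A streaming client has to split the text/event-stream payload into events and dispatch on the event type (token, result, error, signal). This is a minimal parsing sketch: real SSE also allows comments, retry fields, and multi-line data, which are omitted here:

```python
import json

def parse_sse_events(raw: str) -> list:
    """Split an event-stream payload into (id, event, data) tuples.

    Events are separated by blank lines; each line is a "field: value"
    pair, and the data field is parsed as JSON.
    """
    events = []
    for block in raw.strip().split("\n\n"):
        fields = dict(
            line.split(": ", 1) for line in block.splitlines() if ": " in line
        )
        if "data" in fields:
            fields["data"] = json.loads(fields["data"])
        events.append((fields.get("id"), fields.get("event"), fields.get("data")))
    return events
```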

Response example

The response example is as follows:

Succeeded

The following is a sample response upon a successful call.

id: aabdfe-dfgwr-edf-hpqwd-f3asd-g
event: token
data: {"message": {"role": "assistant", "content": "H"}}

id: aabdfe-dfgwr-edf-hpqwd-f2asd-g
event: token
data: {"message": {"role": "assistant", "content": "i"}}

id: aabdfe-dfgwr-edf-hpqwd-f1asd-g
event: result
data: {"message": {"role": "assistant", "content": "Hello"}, "inputLength":20, "outputLength":5, "stopReason":"stop_before" }
    

Failure

The following is a sample response upon a failed call.