Chat Completions

Generate AI chat completions using any supported model. This endpoint handles inference requests and returns AI-generated responses.

Endpoint: https://apis.threatwinds.com/api/ai/v1/chat/completions

Method: POST

Parameters

Headers

Header Type Required Description
Authorization string Optional* Bearer token for session authentication
api-key string Optional* API key for key-based authentication
api-secret string Optional* API secret for key-based authentication
Content-Type string Yes Must be application/json

Note: You must provide either the Authorization header or the api-key/api-secret header pair.

Request Body

{
  "model": "claude-sonnet-4",
  "messages": [
    {
      "role": "user",
      "content": "What is XDR in cybersecurity?"
    }
  ],
  "max_completion_tokens": 1024,
  "temperature": 1.0,
  "reasoning_effort": "medium",
  "service_tier": "auto",
  "response_format": {
    "type": "json_object",
    "json_schema": {
      "type": "object",
      "properties": {
        "answer": {
          "type": "string"
        }
      }
    }
  }
}

Request Parameters

Parameter Type Required Description
model string Yes Model ID to use for inference (see Models)
messages array Yes Message history for context (minimum 1)
max_completion_tokens integer No Maximum tokens in response (defaults to 50% of model max, capped at model limit)
temperature float No Sampling temperature (0.0 to 2.0, default: 1.0, forced to 1.0 when reasoning enabled)
reasoning_effort string No Reasoning effort level (default: "auto", which disables reasoning)
service_tier string No Service tier for priority routing (default: "auto")
response_format object No Response format specification (json_schema must be an object, not a string)

Message Object

Field Type Required Description
role string Yes Message role: user, assistant, system, tool, developer
content string or array Yes Message text, or an array of content parts for multimodal input (see Example 5)
tool_call_id string No Tool call identifier (for tool messages)
reasoning string No Reasoning text (for assistant messages)

Valid Message Roles

Role Description Usage
user User message Input from end user
assistant Assistant response Previous AI responses
system System instructions Behavioral instructions for AI
tool Tool/function response Results from tool calls
developer Developer message Developer-level instructions

Reasoning Effort Values

Value Description Token Budget (Claude)
auto Automatic reasoning level (default, disables reasoning) N/A
disabled Explicitly disabled N/A
minimal Very minimal reasoning 20% of max tokens
low Low reasoning effort 25% of max tokens
medium Moderate reasoning 33% of max tokens
high Extended reasoning 50% of max tokens

Special Notes:

  • For Claude models: Temperature is forced to 1.0 when reasoning is enabled
  • For Groq qwen3-32b: Only accepts "none" or "default" for reasoning_effort
  • Reasoning budget has minimum of 1024 tokens (Claude API requirement)

Service Tier Values

Value Description
auto Automatic tier selection (default)
default Standard processing
flex Flexible scheduling
priority Priority processing
standard Standard tier

Request

To generate a chat completion, use a POST request:

curl -X 'POST' \
  'https://apis.threatwinds.com/api/ai/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer <token>' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "claude-sonnet-4",
  "messages": [
    {
      "role": "user",
      "content": "What is XDR in cybersecurity?"
    }
  ],
  "max_completion_tokens": 1024
}'

Or using API key and secret:

curl -X 'POST' \
  'https://apis.threatwinds.com/api/ai/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'api-key: your-api-key' \
  -H 'api-secret: your-api-secret' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "gpt-oss-20b",
  "messages": [
    {
      "role": "user",
      "content": "Explain threat intelligence"
    }
  ]
}'
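The same request can be issued from Python. This sketch uses only the standard library; the endpoint, headers, and body mirror the curl examples above, and the token is a placeholder.

```python
# Sketch of issuing the chat completion request from Python (stdlib only).
# Endpoint, headers, and body mirror the curl examples; the token is a placeholder.
import json
import urllib.request

API_URL = "https://apis.threatwinds.com/api/ai/v1/chat/completions"

def build_request(token: str, model: str, content: str,
                  max_tokens: int = 1024) -> urllib.request.Request:
    body = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "max_completion_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

def chat_completion(req: urllib.request.Request) -> dict:
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())
```

For key-based authentication, replace the Authorization header with the api-key and api-secret headers as in the second curl example.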

Response

A successful response will return the AI-generated completion with usage statistics.

Success Response (200 OK)

{
  "id": "msg_01AbCdEf123",
  "object": "chat.completion",
  "created": 1704067200,
  "model": "claude-sonnet-4",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "XDR (Extended Detection and Response) is a unified security platform that...",
        "reasoning": "The user is asking about a cybersecurity concept. I should provide a clear, comprehensive explanation..."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 150,
    "total_tokens": 162
  }
}

Response Schema

Field Type Description
id string Unique interaction identifier
object string Response type, always "chat.completion"
created integer Unix timestamp of creation
model string Model used for inference
choices array Generated response choices (typically 1)
choices[].index integer Choice index in array
choices[].finish_reason string How generation ended
choices[].message object Generated message
choices[].message.role string Always "assistant"
choices[].message.content string Generated response text
choices[].message.reasoning string Reasoning process (optional, when reasoning enabled)
usage object Token usage statistics
usage.prompt_tokens integer Tokens in input messages
usage.completion_tokens integer Tokens in generated response
usage.total_tokens integer Sum of prompt and completion tokens

Finish Reason Values

Reason Description
stop Model completed response naturally
length Response cut off due to max_completion_tokens limit
content_filter Content filtered by provider safety systems
tool_calls Model requested tool/function calls
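A client reading a response can follow the schema and finish_reason table above. This sketch extracts the generated text and token count from an abbreviated version of the sample response; the function name is illustrative.

```python
# Sketch of reading a completion response using the schema above.
# The sample payload is abbreviated from the documented success response.
def read_completion(resp: dict) -> tuple[str, int]:
    choice = resp["choices"][0]
    if choice["finish_reason"] == "length":
        # Truncated by max_completion_tokens; the text may be incomplete
        pass
    return choice["message"]["content"], resp["usage"]["total_tokens"]

sample = {
    "choices": [{"index": 0, "finish_reason": "stop",
                 "message": {"role": "assistant", "content": "XDR is..."}}],
    "usage": {"prompt_tokens": 12, "completion_tokens": 150,
              "total_tokens": 162},
}
```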

Business Logic

Request Processing

  1. Validation: Validates message array is not empty and all roles are valid
  2. Model Resolution: Looks up provider client based on model ID
  3. Parameter Normalization: Defaults reasoning_effort and service_tier to "auto" if invalid
  4. Inference: Calls the provider client's Message() method
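Step 1 can be mirrored client-side to catch 400 errors before sending. This sketch checks the non-empty-messages and valid-role rules stated above; the helper name is an illustrative assumption.

```python
# Client-side mirror of the server's validation step: non-empty messages
# and valid roles. The helper is illustrative, not part of the API.
VALID_ROLES = {"user", "assistant", "system", "tool", "developer"}

def validate_messages(messages: list[dict]) -> None:
    if not messages:
        raise ValueError("messages must contain at least one entry")
    for i, msg in enumerate(messages):
        if msg.get("role") not in VALID_ROLES:
            raise ValueError(f"message {i}: invalid role {msg.get('role')!r}")
        if "content" not in msg:
            raise ValueError(f"message {i}: missing content")
```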

Error Codes

Status Code Description Possible Cause
200 OK Request successful
400 Bad Request Invalid JSON, zero messages, invalid role, invalid json_schema
401 Unauthorized Missing or invalid authentication
403 Forbidden Insufficient permissions
500 Internal Server Error Provider error, model unavailable

Examples

Example 1: Simple Question

curl -X POST 'https://apis.threatwinds.com/api/ai/v1/chat/completions' \
  -H 'Authorization: Bearer <token>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [
      {"role": "user", "content": "What is 2+2?"}
    ],
    "max_completion_tokens": 100
  }'

Response:

{
  "id": "...",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "2 + 2 = 4"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 8,
    "completion_tokens": 5,
    "total_tokens": 13
  }
}

Example 2: Multi-turn Conversation

curl -X POST 'https://apis.threatwinds.com/api/ai/v1/chat/completions' \
  -H 'Authorization: Bearer <token>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a cybersecurity expert"},
      {"role": "user", "content": "What is a zero-day vulnerability?"},
      {"role": "assistant", "content": "A zero-day vulnerability is..."},
      {"role": "user", "content": "How can organizations protect against them?"}
    ],
    "max_completion_tokens": 500
  }'

Example 3: JSON Response Format

curl -X POST 'https://apis.threatwinds.com/api/ai/v1/chat/completions' \
  -H 'Authorization: Bearer <token>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [
      {
        "role": "user",
        "content": "Extract the name and severity from: Critical vulnerability in Apache Log4j"
      }
    ],
    "response_format": {
      "type": "json_object",
      "json_schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "severity": {"type": "string"}
        }
      }
    }
  }'

Response:

{
  "choices": [{
    "message": {
      "content": "{\"name\": \"Apache Log4j\", \"severity\": \"Critical\"}"
    }
  }]
}
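Note that with response_format, the content field arrives as a JSON-encoded string, as in the response above, so it must be parsed before use:

```python
# The structured result arrives as a JSON-encoded string in message.content;
# parse it before use. The string below matches the response above.
import json

content = "{\"name\": \"Apache Log4j\", \"severity\": \"Critical\"}"
finding = json.loads(content)
```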

Example 4: Extended Reasoning

curl -X POST 'https://apis.threatwinds.com/api/ai/v1/chat/completions' \
  -H 'Authorization: Bearer <token>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "o1-preview",
    "messages": [
      {
        "role": "user",
        "content": "Design a secure architecture for a microservices-based XDR platform"
      }
    ],
    "reasoning_effort": "high",
    "max_completion_tokens": 4000
  }'

Example 5: Vision (Image Understanding)

curl -X POST 'https://apis.threatwinds.com/api/ai/v1/chat/completions' \
  -H 'Authorization: Bearer <token>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What security issues do you see?"},
          {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}}
        ]
      }
    ]
  }'

Best Practices

Message Construction

  1. System Messages: Use system messages to set behavior and context
  2. Message History: Include relevant conversation history for context
  3. Clear Prompts: Be specific and clear in user messages
  4. Role Consistency: Maintain proper role alternation

Token Management

  1. Set Max Tokens: Always set max_completion_tokens to control costs
  2. Count First: Use token counting endpoint for large requests
  3. Monitor Usage: Track token usage via response usage field
  4. Truncate History: Remove old messages to stay within limits
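Point 4 can be sketched as follows. The 4-characters-per-token estimate is a rough assumption; the token counting endpoint gives exact numbers.

```python
# Sketch of history truncation: drop the oldest non-system messages until
# the conversation fits a token budget. The 4-chars-per-token estimate is
# a rough assumption; use the token counting endpoint for exact counts.
def truncate_history(messages: list[dict], max_tokens: int) -> list[dict]:
    def estimate(msgs):
        return sum(len(m.get("content", "")) // 4 + 4 for m in msgs)

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and estimate(system + rest) > max_tokens:
        rest.pop(0)  # drop the oldest turn first
    return system + rest
```

Keeping system messages pinned preserves behavioral instructions even as older turns are dropped.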

Error Handling

  1. Validate Before Send: Check message count and roles before sending
  2. Handle Provider Errors: Be prepared for provider-specific errors
  3. Implement Retries: Add exponential backoff for transient errors
  4. Log Interaction IDs: Save interaction IDs for debugging
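Point 3 can be sketched as a small wrapper. Which status codes count as transient is an assumption; here only 5xx responses are retried.

```python
# Sketch of exponential backoff for transient errors. Treating only 5xx
# responses as transient is an assumption; adjust to your error handling.
import time

def with_retries(send, max_attempts: int = 4, base_delay: float = 0.5):
    """Call send() until it returns a (status, body) with status < 500."""
    for attempt in range(max_attempts):
        status, body = send()
        if status < 500:
            return status, body
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    return status, body
```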

Performance

  1. Choose Appropriate Models: Use faster models for simple tasks
  2. Minimize Context: Send only necessary message history
  3. Parallel Requests: Process multiple items concurrently when possible
  4. Cache Results: Cache frequent queries to reduce API calls

Security

  1. Validate Input: Sanitize user input before sending to AI
  2. Filter Output: Validate and sanitize AI responses
  3. Monitor Usage: Track unusual patterns via logs
  4. Rotate Keys: Regularly rotate API credentials

Cost Optimization

  1. Model Selection: Use smaller/faster models when quality permits
  2. Token Limits: Set appropriate max_completion_tokens
  3. Batch Processing: Group similar requests
  4. Response Caching: Cache responses for repeated queries
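Response caching (point 4) can be sketched by memoizing identical request payloads. The cache key derivation and wrapper are illustrative assumptions.

```python
# Sketch of response caching: memoize identical request payloads.
# The cache key derivation and wrapper are illustrative, not part of the API.
import hashlib
import json

_cache: dict[str, dict] = {}

def cached_completion(payload: dict, send) -> dict:
    key = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode("utf-8")
    ).hexdigest()
    if key not in _cache:
        _cache[key] = send(payload)
    return _cache[key]
```

An in-memory dict suits a single process; share the cache (e.g. via Redis) only if repeated queries cross process boundaries, and avoid caching responses that must stay fresh.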