Model Management

Query available AI models and their capabilities across all supported providers.

List Models

Returns the list of available AI models across all providers (Claude, OpenAI, Gemini, Groq, vLLM).

Endpoint: https://apis.threatwinds.com/api/ai/v1/models

Method: GET

Parameters

Headers

Header Type Required Description
Authorization string Optional* Bearer token for session authentication
api-key string Optional* API key for key-based authentication
api-secret string Optional* API secret for key-based authentication

Note: You must authenticate with either the Authorization header or the api-key/api-secret pair.

Request

To list all available models, use a GET request:

curl -X 'GET' \
  'https://apis.threatwinds.com/api/ai/v1/models' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer <token>'

Or using API key and secret:

curl -X 'GET' \
  'https://apis.threatwinds.com/api/ai/v1/models' \
  -H 'accept: application/json' \
  -H 'api-key: your-api-key' \
  -H 'api-secret: your-api-secret'
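In a client, the two authentication schemes above can be wrapped in a small helper. This is an illustrative sketch; `auth_headers` is not part of the API itself:

```python
# Build request headers for either ThreatWinds auth scheme.
# `auth_headers` is an illustrative helper, not part of the API.

def auth_headers(token=None, api_key=None, api_secret=None):
    """Return headers for Bearer-token OR api-key/api-secret auth."""
    headers = {"accept": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    elif api_key and api_secret:
        headers["api-key"] = api_key
        headers["api-secret"] = api_secret
    else:
        raise ValueError("Provide a token, or both api_key and api_secret")
    return headers

# Example: GET /models with key-based auth (requires the `requests` package):
# requests.get("https://apis.threatwinds.com/api/ai/v1/models",
#              headers=auth_headers(api_key="...", api_secret="..."))
```

Because the two schemes are mutually exclusive, the helper raises if neither (or only half of the key pair) is supplied.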

Response

A successful response will return all available models across all providers.

Success Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "id": "claude-sonnet-4",
      "object": "model",
      "name": "Claude Sonnet 4.6",
      "provider": "claude",
      "owned_by": "Anthropic PBC",
      "created": 2026,
      "capabilities": [
        "chat",
        "tools-use",
        "reasoning",
        "code-generation",
        "image"
      ],
      "limits": {
        "max_input_tokens": 200000,
        "max_completion_tokens": 64000,
        "max_total_tokens": 264000
      }
    },
    {
      "id": "claude-opus-4",
      "object": "model",
      "name": "Claude Opus 4.6",
      "provider": "claude",
      "owned_by": "Anthropic PBC",
      "created": 2026,
      "capabilities": [
        "chat",
        "tools-use",
        "reasoning",
        "code-generation",
        "image"
      ],
      "limits": {
        "max_input_tokens": 200000,
        "max_completion_tokens": 128000,
        "max_total_tokens": 328000
      }
    },
    {
      "id": "gpt-oss-20b",
      "object": "model",
      "name": "GPT OSS 20B",
      "provider": "groq",
      "owned_by": "OpenAI",
      "created": 2025,
      "capabilities": [
        "chat",
        "code-generation",
        "tools-use",
        "reasoning"
      ],
      "limits": {
        "max_input_tokens": 131072,
        "max_completion_tokens": 65536,
        "max_total_tokens": 196608
      }
    }
  ]
}

Response Schema

Field Type Description
object string Response type, always “list”
data array List of model details
data[].id string Model unique identifier
data[].object string Object type, always “model”
data[].name string Human-readable model name
data[].provider string Provider identifier: claude, openai, gemini, groq, vllm, tei
data[].owned_by string Organization that owns the model
data[].created integer Year of model release
data[].capabilities array List of model capabilities
data[].limits object Token limit information
limits.max_input_tokens integer Maximum input tokens
limits.max_completion_tokens integer Maximum completion tokens
limits.max_total_tokens integer Maximum total tokens (input + completion)
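A client can index the parsed response body by model id and sanity-check the schema's invariant that max_total_tokens equals input plus completion. A minimal sketch, with a sample trimmed from the example response above:

```python
# Index a /models response body by model id.
# `sample` is a trimmed copy of the example response above.

sample = {
    "object": "list",
    "data": [
        {"id": "claude-sonnet-4",
         "limits": {"max_input_tokens": 200000,
                    "max_completion_tokens": 64000,
                    "max_total_tokens": 264000}},
        {"id": "gpt-oss-20b",
         "limits": {"max_input_tokens": 131072,
                    "max_completion_tokens": 65536,
                    "max_total_tokens": 196608}},
    ],
}

def index_limits(body):
    """Map model id -> limits dict from a /models response."""
    return {m["id"]: m["limits"] for m in body["data"]}

limits = index_limits(sample)
# Each model's total budget is input + completion:
for l in limits.values():
    assert l["max_total_tokens"] == l["max_input_tokens"] + l["max_completion_tokens"]
```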

Model Capabilities

Capability Description
chat Text-based conversation
text-generation General text generation
code-generation Code generation and completion
tools-use Function/tool calling support
reasoning Extended reasoning capabilities
image Image understanding (vision)
audio Audio processing
video Video processing
vision-embeddings Multimodal embedding (text and images)
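The capabilities array lets a client pick a model at runtime. A sketch of capability-based filtering, with entries abbreviated from the example response above:

```python
# Filter model entries by required capabilities.
# Entries are abbreviated from the example /models response.

models = [
    {"id": "claude-sonnet-4",
     "capabilities": ["chat", "tools-use", "reasoning", "code-generation", "image"]},
    {"id": "gpt-oss-20b",
     "capabilities": ["chat", "code-generation", "tools-use", "reasoning"]},
]

def with_capabilities(models, *required):
    """Return ids of models whose capabilities include every required one."""
    return [m["id"] for m in models if set(required) <= set(m["capabilities"])]

print(with_capabilities(models, "image"))              # vision-capable models
print(with_capabilities(models, "tools-use", "chat"))  # tool-calling chat models
```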

Status Codes

Status Code Description Possible Cause
200 OK Request successful
400 Bad Request Invalid request
401 Unauthorized Missing or invalid authentication
403 Forbidden Insufficient permissions

Get Model Details

Returns detailed information about a specific model.

Endpoint: https://apis.threatwinds.com/api/ai/v1/models/{id}

Method: GET

Parameters

Headers

Header Type Required Description
Authorization string Optional* Bearer token for session authentication
api-key string Optional* API key for key-based authentication
api-secret string Optional* API secret for key-based authentication

Note: You must authenticate with either the Authorization header or the api-key/api-secret pair.

Path Parameters

Parameter Type Required Description Example
id string Yes Model identifier claude-sonnet-4

Request

To get details for a specific model, use a GET request:

curl -X 'GET' \
  'https://apis.threatwinds.com/api/ai/v1/models/claude-sonnet-4' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer <token>'

Or using API key and secret:

curl -X 'GET' \
  'https://apis.threatwinds.com/api/ai/v1/models/silas-1.0' \
  -H 'accept: application/json' \
  -H 'api-key: your-api-key' \
  -H 'api-secret: your-api-secret'

Response

Success Response (200 OK)

{
  "id": "claude-sonnet-4",
  "object": "model",
  "name": "Claude Sonnet 4.6",
  "provider": "claude",
  "owned_by": "Anthropic PBC",
  "created": 2026,
  "capabilities": [
    "chat",
    "tools-use",
    "reasoning",
    "code-generation",
    "image"
  ],
  "limits": {
    "max_input_tokens": 200000,
    "max_completion_tokens": 64000,
    "max_total_tokens": 264000
  }
}

Response Schema

Field Type Description
id string Model unique identifier
object string Object type, always “model”
name string Human-readable model name
provider string Provider identifier
owned_by string Organization that owns the model
created integer Year of model release
capabilities array List of model capabilities
limits object Token limit information
limits.max_input_tokens integer Maximum input tokens accepted
limits.max_completion_tokens integer Maximum tokens model can generate
limits.max_total_tokens integer Maximum combined tokens
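The limits object gives a client everything needed for a pre-flight budget check before sending a request. An illustrative sketch; `fits_in_context` is a hypothetical helper, with limits copied from the claude-sonnet-4 example above:

```python
# Pre-flight check: will a prompt plus requested completion fit the model?
# `fits_in_context` is an illustrative helper; limits match claude-sonnet-4.

limits = {"max_input_tokens": 200000,
          "max_completion_tokens": 64000,
          "max_total_tokens": 264000}

def fits_in_context(limits, prompt_tokens, completion_tokens):
    """True if the individual and combined token budgets are all respected."""
    return (prompt_tokens <= limits["max_input_tokens"]
            and completion_tokens <= limits["max_completion_tokens"]
            and prompt_tokens + completion_tokens <= limits["max_total_tokens"])

print(fits_in_context(limits, 150000, 32000))   # True: within every budget
print(fits_in_context(limits, 150000, 120000))  # False: completion over the 64K cap
```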

Status Codes

Status Code Description Possible Cause
200 OK Model found and returned
404 Not Found Model not found
400 Bad Request Invalid request
401 Unauthorized Missing or invalid authentication
403 Forbidden Insufficient permissions

Available Models

Claude Models (Anthropic)

Model ID Name Max Input Max Output Capabilities
claude-sonnet-4 Claude Sonnet 4.6 200,000 64,000 Chat, tools, reasoning, code-generation, image
claude-opus-4 Claude Opus 4.6 200,000 128,000 Chat, tools, reasoning, code-generation, image
claude-haiku-4 Claude Haiku 4.5 200,000 64,000 Chat, tools, reasoning, code-generation, image

Best For:

  • Sonnet 4.6: Best combination of speed and intelligence (200K context)
  • Opus 4.6: Most intelligent model for agents and coding (200K context, 128K output)
  • Haiku 4.5: Fastest model with near-frontier intelligence, cost-effective

OpenAI Models

Model ID Name Max Input Max Output Capabilities
gpt-5 GPT-5.4 1,050,000 128,000 Chat, tools, reasoning, code-generation, image
gpt-5-mini GPT-5 Mini 400,000 128,000 Chat, tools, reasoning, code-generation

Best For:

  • GPT-5.4: General-purpose reasoning with massive 1M+ context window and vision
  • GPT-5 Mini: Cost-efficient reasoning for well-defined tasks (400K context)

Special Notes:

  • Token counting not supported for OpenAI models
  • GPT-5 supports image understanding (vision)

Gemini Models (Google)

Model ID Name Max Input Max Output Capabilities
gemini-3-pro Gemini 3.1 Pro 1,048,576 65,536 Chat, tools, reasoning, image
gemini-3-flash-lite Gemini 3.1 Flash Lite 1,048,576 65,536 Chat, tools, reasoning, image
gemini-3-flash Gemini 3 Flash 1,048,576 65,536 Chat, tools, reasoning, image

Best For:

  • Gemini 3.1 Pro: Complex reasoning and agentic workflows with 1M context
  • Gemini 3.1 Flash Lite: Most cost-effective option with full 1M context
  • Gemini 3 Flash: Fast frontier-class performance at low cost

Special Notes:

  • All Gemini models support dynamic thinking with configurable reasoning levels
  • Token counting supported for all Gemini models

Groq Models

Model ID Name Max Input Max Output Capabilities
gpt-oss-20b GPT OSS 20B 131,072 65,536 Chat, code, tools, reasoning
gpt-oss-120b GPT OSS 120B 131,072 65,536 Chat, code, tools, reasoning
qwen3-32b Qwen 3 32B 131,072 40,960 Chat, code, tools, reasoning*
llama4-maverick LLaMA 4 Maverick 17B 131,072 8,192 Chat, code, tools
llama4-scout LLaMA 4 Scout 17B 131,072 8,192 Chat, code, tools

Best For:

  • Extremely fast inference (optimized hardware)
  • Real-time applications
  • Cost-effective at scale
  • Large context windows (131K tokens)

Special Notes:

  • *qwen3-32b: reasoning_effort is accepted but always treated as “default” internally
  • Token counting not supported for Groq models

vLLM Models (ThreatWinds)

Model ID Name Max Input Max Output Capabilities
silas-1.0 Silas 1.0 121,072 10,000 Chat, tools, code-generation
silas-1.0-pro Silas 1.0 Pro 121,072 10,000 Chat, tools, code-generation

Best For:

  • Cybersecurity-focused tasks
  • Threat intelligence analysis
  • Penetration testing assistance
  • Security operations

Special Notes:

  • Silas is a specialized cybersecurity AI assistant developed by ThreatWinds
  • Silas 1.0 Pro uses a full-precision model for higher quality responses
  • Optimized for offensive security and threat intelligence tasks
  • Token counting supported

Embedding Models (ThreatWinds)

Model ID Name Max Input Capabilities
qwen3-embedding-8b Qwen3 Embedding 8B 32,768 Text embeddings
qwen3-vl-embedding-8b Qwen3 VL Embedding 8B 32,768 Vision embeddings

Best For:

  • qwen3-embedding-8b: Semantic search, text clustering, RAG pipelines, multilingual embedding
  • qwen3-vl-embedding-8b: Cross-modal search (text-to-image, image-to-text), visual threat detection, multimodal RAG

Special Notes:

  • qwen3-embedding-8b: Hosted on ThreatWinds TEI infrastructure, 4,096-dimensional vectors with MRL support, batch embedding supported
  • qwen3-vl-embedding-8b: Hosted on ThreatWinds vLLM infrastructure, accepts text, base64 images, or mixed content blocks. See Embeddings for input format details.

Choosing the Right Model

By Use Case

Use Case Recommended Models
Simple Q&A Claude Haiku, GPT-5 Mini, Gemini Flash Lite, Groq models
Complex Reasoning Claude Opus, GPT-5.4, Gemini 3.1 Pro
Code Generation Claude Sonnet, GPT-5, Silas Pro
Long Context GPT-5.4 (1M), Gemini (1M), Claude (200K), Groq (131K)
Vision Tasks Claude models, GPT-5, Gemini models
Real-time Chat Groq models, Claude Haiku, Gemini Flash Lite
Tool/Function Calling Claude Sonnet, GPT-5.4, Gemini 3.1 Pro
Cybersecurity Silas Pro, Silas, Claude Sonnet
Semantic Search / RAG Qwen3 Embedding 8B, Qwen3 VL Embedding 8B
Visual Search / Multimodal RAG Qwen3 VL Embedding 8B
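The use-case table above can be condensed into a default-choice map in client code. The mapping below is one possible reading of the table, not an official recommendation; the model ids come from the tables earlier in this page:

```python
# One possible default model per use case, condensed from the table above.
# The ids are real; the pairing itself is illustrative, not prescriptive.

DEFAULT_MODEL = {
    "simple-qa": "claude-haiku-4",
    "complex-reasoning": "claude-opus-4",
    "code-generation": "claude-sonnet-4",
    "long-context": "gpt-5",
    "cybersecurity": "silas-1.0-pro",
    "semantic-search": "qwen3-embedding-8b",
}

def pick_model(use_case):
    """Return a default model id, falling back to a general-purpose choice."""
    return DEFAULT_MODEL.get(use_case, "claude-sonnet-4")

print(pick_model("cybersecurity"))  # silas-1.0-pro
print(pick_model("unknown-task"))   # claude-sonnet-4 (fallback)
```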

By Performance Priority

Priority Models
Speed Groq (LLaMA, GPT OSS), Claude Haiku, Gemini Flash Lite
Quality Claude Opus, GPT-5.4, Gemini 3.1 Pro
Cost Claude Haiku, GPT-5 Mini, Gemini Flash Lite, Groq
Context Length GPT-5.4 (1M), Gemini (1M), Claude (200K)
Cybersecurity Expertise Silas Pro, Silas
Text Embeddings Qwen3 Embedding 8B
Vision Embeddings Qwen3 VL Embedding 8B

By Provider Strengths

Provider Strengths
Claude (Anthropic) Instruction following, safety, reasoning, vision
OpenAI Massive context (1M), reasoning, vision, code
Gemini (Google) 1M context, cost-effective, configurable thinking
Groq Ultra-fast inference, low latency, large context
vLLM (ThreatWinds) Cybersecurity specialization, pentesting support

Example: Filtering Models

List Only Claude Models

curl -X 'GET' 'https://apis.threatwinds.com/api/ai/v1/models' \
  -H 'Authorization: Bearer <token>' | \
  jq '.data[] | select(.provider == "claude")'

Find Models with Vision

curl -X 'GET' 'https://apis.threatwinds.com/api/ai/v1/models' \
  -H 'Authorization: Bearer <token>' | \
  jq '.data[] | select(.capabilities | contains(["image"]))'

Find Models with Large Context

curl -X 'GET' 'https://apis.threatwinds.com/api/ai/v1/models' \
  -H 'Authorization: Bearer <token>' | \
  jq '.data[] | select(.limits.max_input_tokens > 100000)'

Find Models with Tool Support

curl -X 'GET' 'https://apis.threatwinds.com/api/ai/v1/models' \
  -H 'Authorization: Bearer <token>' | \
  jq '.data[] | select(.capabilities | contains(["tools-use"]))'
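For clients that don't have jq available, the same filters can be applied in code once the response body is parsed. A sketch combining a capability filter with a context-length threshold, with entries abbreviated from the example response:

```python
# Client-side equivalent of the jq filters above, for when jq is unavailable.
# Entries are abbreviated from the example /models response.

models = [
    {"id": "claude-sonnet-4",
     "capabilities": ["chat", "tools-use", "reasoning", "code-generation", "image"],
     "limits": {"max_input_tokens": 200000}},
    {"id": "gpt-oss-20b",
     "capabilities": ["chat", "code-generation", "tools-use", "reasoning"],
     "limits": {"max_input_tokens": 131072}},
]

def select_models(models, capability=None, min_input_tokens=0):
    """Filter by an optional capability and a minimum input-token limit."""
    return [m["id"] for m in models
            if (capability is None or capability in m["capabilities"])
            and m["limits"]["max_input_tokens"] >= min_input_tokens]

print(select_models(models, capability="image"))       # vision models
print(select_models(models, min_input_tokens=150000))  # large-context models
```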

Model Selection Best Practices

  1. Check Token Limits: Ensure model supports your required context length
  2. Verify Capabilities: Confirm model has required features (vision, tools, etc.)
  3. Consider Cost: Balance quality needs with token costs
  4. Test Multiple Models: Compare results across models for your use case
  5. Monitor Performance: Track response times and quality metrics
  6. Stay Updated: Model list changes as providers release new versions
  7. Match Expertise: Use specialized models (like Silas) for domain-specific tasks