Model Management
Query available AI models and their capabilities across all supported providers.
List Models
Returns the list of available AI models across all providers (Claude, OpenAI, Gemini, Groq, vLLM, TEI).
Endpoint: https://apis.threatwinds.com/api/ai/v1/models
Method: GET
Parameters
Headers
| Header | Type | Required | Description |
|---|---|---|---|
| Authorization | string | Optional* | Bearer token for session authentication |
| api-key | string | Optional* | API key for key-based authentication |
| api-secret | string | Optional* | API secret for key-based authentication |
Note: Headers marked Optional* are conditionally required. Send either the Authorization header or both api-key and api-secret.
Request
To list all available models, use a GET request:
curl -X 'GET' \
'https://apis.threatwinds.com/api/ai/v1/models' \
-H 'accept: application/json' \
-H 'Authorization: Bearer <token>'
Or using API key and secret:
curl -X 'GET' \
'https://apis.threatwinds.com/api/ai/v1/models' \
-H 'accept: application/json' \
-H 'api-key: your-api-key' \
-H 'api-secret: your-api-secret'
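The two authentication options can also be wrapped in a small client helper. The following is a minimal Python sketch using only the standard library; `build_auth_headers` and `list_models` are illustrative names, not part of any ThreatWinds SDK. Rejecting a request that mixes both methods is this sketch's own conservative choice, since this page does not say how the API treats that case.

```python
import json
import urllib.request

MODELS_URL = "https://apis.threatwinds.com/api/ai/v1/models"


def build_auth_headers(token=None, api_key=None, api_secret=None):
    """Return headers for exactly one of the two documented auth methods."""
    headers = {"accept": "application/json"}
    if token and not (api_key or api_secret):
        # Session authentication with a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    elif api_key and api_secret and not token:
        # Key-based authentication needs both halves of the pair.
        headers["api-key"] = api_key
        headers["api-secret"] = api_secret
    else:
        raise ValueError("provide either token, or both api_key and api_secret")
    return headers


def list_models(headers, url=MODELS_URL, timeout=30):
    """GET the model list and return the entries under "data"."""
    request = urllib.request.Request(url, headers=headers, method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["data"]
```

A call such as `list_models(build_auth_headers(token="<token>"))` would then return the same entries shown in the response below.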
Response
A successful response will return all available models across all providers.
Success Response (200 OK)
{
"object": "list",
"data": [
{
"id": "claude-sonnet-4",
"object": "model",
"name": "Claude Sonnet 4.6",
"provider": "claude",
"owned_by": "Anthropic PBC",
"created": 2026,
"capabilities": [
"chat",
"tools-use",
"reasoning",
"code-generation",
"image"
],
"limits": {
"max_input_tokens": 200000,
"max_completion_tokens": 64000,
"max_total_tokens": 264000
}
},
{
"id": "claude-opus-4",
"object": "model",
"name": "Claude Opus 4.6",
"provider": "claude",
"owned_by": "Anthropic PBC",
"created": 2026,
"capabilities": [
"chat",
"tools-use",
"reasoning",
"code-generation",
"image"
],
"limits": {
"max_input_tokens": 200000,
"max_completion_tokens": 128000,
"max_total_tokens": 328000
}
},
{
"id": "gpt-oss-20b",
"object": "model",
"name": "GPT OSS 20B",
"provider": "groq",
"owned_by": "OpenAI",
"created": 2025,
"capabilities": [
"chat",
"code-generation",
"tools-use",
"reasoning"
],
"limits": {
"max_input_tokens": 131072,
"max_completion_tokens": 65536,
"max_total_tokens": 196608
}
}
]
}
Response Schema
| Field | Type | Description |
|---|---|---|
| object | string | Response type, always “list” |
| data | array | List of model details |
| data[].id | string | Model unique identifier |
| data[].object | string | Object type, always “model” |
| data[].name | string | Human-readable model name |
| data[].provider | string | Provider identifier: claude, openai, gemini, groq, vllm, tei |
| data[].owned_by | string | Organization that owns the model |
| data[].created | integer | Year of model release |
| data[].capabilities | array | List of model capabilities |
| data[].limits | object | Token limit information |
| data[].limits.max_input_tokens | integer | Maximum input tokens |
| data[].limits.max_completion_tokens | integer | Maximum completion tokens |
| data[].limits.max_total_tokens | integer | Maximum total tokens (input + completion) |
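For typed access to these fields, a client might map each entry onto small data classes. This is a sketch only: the class names `Model` and `Limits` and the function names are assumptions, and the sample entry is copied from the success response above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    max_input_tokens: int
    max_completion_tokens: int
    max_total_tokens: int


@dataclass(frozen=True)
class Model:
    id: str
    name: str
    provider: str
    owned_by: str
    created: int          # year of model release
    capabilities: tuple   # capability identifiers such as "chat"
    limits: Limits


def parse_model(raw):
    """Convert one model object from the API into a Model."""
    return Model(
        id=raw["id"],
        name=raw["name"],
        provider=raw["provider"],
        owned_by=raw["owned_by"],
        created=raw["created"],
        capabilities=tuple(raw["capabilities"]),
        limits=Limits(**raw["limits"]),
    )


def parse_model_list(payload):
    """Validate the envelope and parse every entry under "data"."""
    if payload.get("object") != "list":
        raise ValueError('expected a response with "object": "list"')
    return [parse_model(item) for item in payload["data"]]


sample = {
    "object": "list",
    "data": [{
        "id": "gpt-oss-20b", "object": "model", "name": "GPT OSS 20B",
        "provider": "groq", "owned_by": "OpenAI", "created": 2025,
        "capabilities": ["chat", "code-generation", "tools-use", "reasoning"],
        "limits": {"max_input_tokens": 131072,
                   "max_completion_tokens": 65536,
                   "max_total_tokens": 196608},
    }],
}
models = parse_model_list(sample)
```

`Limits(**raw["limits"])` assumes the limits object carries exactly the three documented keys; an extra or missing key raises a `TypeError`, which may or may not be the behavior you want.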
Model Capabilities
| Capability | Description |
|---|---|
| chat | Text-based conversation |
| text-generation | General text generation |
| code-generation | Code generation and completion |
| tools-use | Function/tool calling support |
| reasoning | Extended reasoning capabilities |
| image | Image understanding (vision) |
| audio | Audio processing |
| video | Video processing |
| vision-embeddings | Multimodal embedding (text and images) |
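Because the model list can change over time, a client may want to notice capability strings it does not recognize. The sketch below hard-codes the identifiers from the table above; `KNOWN_CAPABILITIES` and `unknown_capabilities` are hypothetical names chosen for illustration.

```python
# Capability identifiers copied from the Model Capabilities table.
KNOWN_CAPABILITIES = frozenset({
    "chat", "text-generation", "code-generation", "tools-use",
    "reasoning", "image", "audio", "video", "vision-embeddings",
})


def unknown_capabilities(model):
    """Return, sorted, the capabilities of a model entry missing from the table."""
    return sorted(set(model.get("capabilities", [])) - KNOWN_CAPABILITIES)
```

An empty result means every capability of the entry is documented here; anything else is a hint that this table is out of date for that model.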
Error Codes
| Status Code | Description | Possible Cause |
|---|---|---|
| 200 | OK | Request successful |
| 400 | Bad Request | Invalid request |
| 401 | Unauthorized | Missing or invalid authentication |
| 403 | Forbidden | Insufficient permissions |
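One way to act on these status codes in client code is to turn anything other than 200 into an exception. This is a sketch under assumptions: `ModelsApiError` and `check_status` are illustrative names, and the reason strings simply restate the table above.

```python
class ModelsApiError(Exception):
    """Raised for any non-200 status returned by the models endpoints."""

    def __init__(self, status, reason):
        super().__init__(f"{status}: {reason}")
        self.status = status
        self.reason = reason


# Reasons restated from the Error Codes table.
DOCUMENTED_ERRORS = {
    400: "Bad Request: invalid request",
    401: "Unauthorized: missing or invalid authentication",
    403: "Forbidden: insufficient permissions",
}


def check_status(status):
    """Return None on 200; raise ModelsApiError for every other status."""
    if status == 200:
        return None
    reason = DOCUMENTED_ERRORS.get(status, "undocumented status code")
    raise ModelsApiError(status, reason)
```

Statuses not listed in the table are still raised, but labeled as undocumented rather than guessed at.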
Get Model Details
Returns detailed information about a specific model.
Endpoint: https://apis.threatwinds.com/api/ai/v1/models/{id}
Method: GET
Parameters
Headers
| Header | Type | Required | Description |
|---|---|---|---|
| Authorization | string | Optional* | Bearer token for session authentication |
| api-key | string | Optional* | API key for key-based authentication |
| api-secret | string | Optional* | API secret for key-based authentication |
Note: Headers marked Optional* are conditionally required. Send either the Authorization header or both api-key and api-secret.
Path Parameters
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| id | string | Yes | Model identifier | claude-sonnet-4 |
Request
To get details for a specific model, use a GET request:
curl -X 'GET' \
'https://apis.threatwinds.com/api/ai/v1/models/claude-sonnet-4' \
-H 'accept: application/json' \
-H 'Authorization: Bearer <token>'
Or using API key and secret:
curl -X 'GET' \
'https://apis.threatwinds.com/api/ai/v1/models/silas-1.0' \
-H 'accept: application/json' \
-H 'api-key: your-api-key' \
-H 'api-secret: your-api-secret'
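In client code, the `{id}` path parameter is best percent-encoded before it is placed in the URL. The following Python sketch shows one way to do that and to treat the 404 listed under Error Codes for this endpoint as "no such model"; `model_url` and `get_model` are illustrative names, and `headers` is assumed to hold one of the two documented authentication methods.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://apis.threatwinds.com/api/ai/v1/models"


def model_url(model_id):
    """Build the details URL, encoding the id so it stays a single path segment."""
    if not model_id:
        raise ValueError("model_id must be a non-empty string")
    return f"{BASE_URL}/{urllib.parse.quote(model_id, safe='')}"


def get_model(model_id, headers, timeout=30):
    """Return the model object, or None when the API answers 404 Not Found."""
    request = urllib.request.Request(model_url(model_id), headers=headers, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise  # 400, 401, 403 and anything else propagate to the caller
```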
Response
Success Response (200 OK)
{
"id": "claude-sonnet-4",
"object": "model",
"name": "Claude Sonnet 4.6",
"provider": "claude",
"owned_by": "Anthropic PBC",
"created": 2026,
"capabilities": [
"chat",
"tools-use",
"reasoning",
"code-generation",
"image"
],
"limits": {
"max_input_tokens": 200000,
"max_completion_tokens": 64000,
"max_total_tokens": 264000
}
}
Response Schema
| Field | Type | Description |
|---|---|---|
| id | string | Model unique identifier |
| object | string | Object type, always “model” |
| name | string | Human-readable model name |
| provider | string | Provider identifier |
| owned_by | string | Organization that owns the model |
| created | integer | Year of model release |
| capabilities | array | List of model capabilities |
| limits | object | Token limit information |
| limits.max_input_tokens | integer | Maximum input tokens accepted |
| limits.max_completion_tokens | integer | Maximum tokens model can generate |
| limits.max_total_tokens | integer | Maximum combined tokens |
Error Codes
| Status Code | Description | Possible Cause |
|---|---|---|
| 200 | OK | Model found and returned |
| 400 | Bad Request | Invalid request |
| 401 | Unauthorized | Missing or invalid authentication |
| 403 | Forbidden | Insufficient permissions |
| 404 | Not Found | Model not found |
Available Models
Claude Models (Anthropic)
| Model ID | Name | Max Input | Max Output | Capabilities |
|---|---|---|---|---|
| claude-sonnet-4 | Claude Sonnet 4.6 | 200,000 | 64,000 | Chat, tools, reasoning, code-generation, image |
| claude-opus-4 | Claude Opus 4.6 | 200,000 | 128,000 | Chat, tools, reasoning, code-generation, image |
| claude-haiku-4 | Claude Haiku 4.5 | 200,000 | 64,000 | Chat, tools, reasoning, code-generation, image |
Best For:
- Sonnet 4.6: Best combination of speed and intelligence (200K context)
- Opus 4.6: Most intelligent model for agents and coding (200K context, 128K output)
- Haiku 4.5: Fastest model with near-frontier intelligence, cost-effective
OpenAI Models
| Model ID | Name | Max Input | Max Output | Capabilities |
|---|---|---|---|---|
| gpt-5 | GPT-5.4 | 1,050,000 | 128,000 | Chat, tools, reasoning, code-generation, image |
| gpt-5-mini | GPT-5 Mini | 400,000 | 128,000 | Chat, tools, reasoning, code-generation |
Best For:
- GPT-5.4: General-purpose reasoning with massive 1M+ context window and vision
- GPT-5 Mini: Cost-efficient reasoning for well-defined tasks (400K context)
Special Notes:
- Token counting not supported for OpenAI models
- Only GPT-5.4 (gpt-5) supports image understanding (vision); GPT-5 Mini does not list the image capability
Gemini Models (Google)
| Model ID | Name | Max Input | Max Output | Capabilities |
|---|---|---|---|---|
| gemini-3-pro | Gemini 3.1 Pro | 1,048,576 | 65,536 | Chat, tools, reasoning, image |
| gemini-3-flash-lite | Gemini 3.1 Flash Lite | 1,048,576 | 65,536 | Chat, tools, reasoning, image |
| gemini-3-flash | Gemini 3 Flash | 1,048,576 | 65,536 | Chat, tools, reasoning, image |
Best For:
- Gemini 3.1 Pro: Complex reasoning and agentic workflows with 1M context
- Gemini 3.1 Flash Lite: Most cost-effective option with full 1M context
- Gemini 3 Flash: Fast frontier-class performance at low cost
Special Notes:
- All Gemini models support dynamic thinking with configurable reasoning levels
- Token counting supported for all Gemini models
Groq Models
| Model ID | Name | Max Input | Max Output | Capabilities |
|---|---|---|---|---|
| gpt-oss-20b | GPT OSS 20B | 131,072 | 65,536 | Chat, code-generation, tools, reasoning |
| gpt-oss-120b | GPT OSS 120B | 131,072 | 65,536 | Chat, code-generation, tools, reasoning |
| qwen3-32b | Qwen 3 32B | 131,072 | 40,960 | Chat, code-generation, tools, reasoning* |
| llama4-maverick | LLaMA 4 Maverick 17B | 131,072 | 8,192 | Chat, code-generation, tools |
| llama4-scout | LLaMA 4 Scout 17B | 131,072 | 8,192 | Chat, code-generation, tools |
Best For:
- Extremely fast inference (optimized hardware)
- Real-time applications
- Cost-effective at scale
- Large context windows (131K tokens)
Special Notes:
- *qwen3-32b: reasoning_effort is accepted but always treated as “default” internally
- Token counting not supported for Groq models
vLLM Models (ThreatWinds)
| Model ID | Name | Max Input | Max Output | Capabilities |
|---|---|---|---|---|
| silas-1.0 | Silas 1.0 | 121,072 | 10,000 | Chat, tools, code-generation |
| silas-1.0-pro | Silas 1.0 Pro | 121,072 | 10,000 | Chat, tools, code-generation |
Best For:
- Cybersecurity-focused tasks
- Threat intelligence analysis
- Penetration testing assistance
- Security operations
Special Notes:
- Silas is a specialized cybersecurity AI assistant developed by ThreatWinds
- Silas 1.0 Pro uses a full-precision model for higher quality responses
- Optimized for offensive security and threat intelligence tasks
- Token counting supported
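The token counting notes for each provider can be collected into a single lookup. This sketch records only what the Special Notes on this page state: `None` marks Claude, for which this page gives no statement, and the `vllm` entry reflects the note for the Silas models only. The names are illustrative.

```python
# True/False as stated in the Special Notes; None where this page is silent.
TOKEN_COUNTING = {
    "claude": None,    # not stated on this page
    "openai": False,
    "gemini": True,
    "groq": False,
    "vllm": True,      # stated for the Silas models
}


def supports_token_counting(provider):
    """Return True, False, or None (unknown) for a provider identifier."""
    return TOKEN_COUNTING.get(provider)
```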
Embedding Models (ThreatWinds)
| Model ID | Name | Max Input | Capabilities |
|---|---|---|---|
| qwen3-embedding-8b | Qwen3 Embedding 8B | 32,768 | Text embeddings |
| qwen3-vl-embedding-8b | Qwen3 VL Embedding 8B | 32,768 | Vision embeddings |
Best For:
- qwen3-embedding-8b: Semantic search, text clustering, RAG pipelines, multilingual embedding
- qwen3-vl-embedding-8b: Cross-modal search (text-to-image, image-to-text), visual threat detection, multimodal RAG
Special Notes:
- qwen3-embedding-8b: Hosted on ThreatWinds TEI infrastructure, 4,096-dimensional vectors with MRL support, batch embedding supported
- qwen3-vl-embedding-8b: Hosted on ThreatWinds vLLM infrastructure, accepts text, base64 images, or mixed content blocks. See Embeddings for input format details.
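Assuming "MRL" here refers to Matryoshka Representation Learning, a leading slice of the 4,096-dimensional vector can itself serve as a smaller embedding once it is rescaled to unit length. This page does not say whether the API offers a server-side dimension option, so the sketch below truncates on the client; `truncate_embedding` is a hypothetical helper.

```python
import math


def truncate_embedding(vector, dims):
    """Keep the first `dims` components and rescale the result to unit length."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]
```

Smaller vectors reduce storage and search cost at some loss of retrieval quality, so the target size is worth validating against your own data.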
Choosing the Right Model
By Use Case
| Use Case | Recommended Models |
|---|---|
| Simple Q&A | Claude Haiku, GPT-5 Mini, Gemini Flash Lite, Groq models |
| Complex Reasoning | Claude Opus, GPT-5.4, Gemini 3.1 Pro |
| Code Generation | Claude Sonnet, GPT-5.4, Silas Pro |
| Long Context | GPT-5.4 (1M), Gemini (1M), Claude (200K), Groq (131K) |
| Vision Tasks | Claude models, GPT-5.4, Gemini models |
| Real-time Chat | Groq models, Claude Haiku, Gemini Flash Lite |
| Tool/Function Calling | Claude Sonnet, GPT-5.4, Gemini 3.1 Pro |
| Cybersecurity | Silas Pro, Silas, Claude Sonnet |
| Semantic Search / RAG | Qwen3 Embedding 8B, Qwen3 VL Embedding 8B |
| Visual Search / Multimodal RAG | Qwen3 VL Embedding 8B |
By Performance Priority
| Priority | Models |
|---|---|
| Speed | Groq (LLaMA, GPT OSS), Claude Haiku, Gemini Flash Lite |
| Quality | Claude Opus, GPT-5.4, Gemini 3.1 Pro |
| Cost | Claude Haiku, GPT-5 Mini, Gemini Flash Lite, Groq |
| Context Length | GPT-5.4 (1M), Gemini (1M), Claude (200K) |
| Cybersecurity Expertise | Silas Pro, Silas |
| Text Embeddings | Qwen3 Embedding 8B |
| Vision Embeddings | Qwen3 VL Embedding 8B |
By Provider Strengths
| Provider | Strengths |
|---|---|
| Claude (Anthropic) | Instruction following, safety, reasoning, vision |
| OpenAI | Massive context (1M), reasoning, vision, code |
| Gemini (Google) | 1M context, cost-effective, configurable thinking |
| Groq | Ultra-fast inference, low latency, large context |
| vLLM (ThreatWinds) | Cybersecurity specialization, pentesting support |
Example: Filtering Models
List Only Claude Models
curl -s -X 'GET' 'https://apis.threatwinds.com/api/ai/v1/models' \
-H 'Authorization: Bearer <token>' | \
jq '.data[] | select(.provider == "claude")'
Find Models with Vision
curl -s -X 'GET' 'https://apis.threatwinds.com/api/ai/v1/models' \
-H 'Authorization: Bearer <token>' | \
jq '.data[] | select(.capabilities | any(. == "image"))'
Find Models with Large Context
curl -s -X 'GET' 'https://apis.threatwinds.com/api/ai/v1/models' \
-H 'Authorization: Bearer <token>' | \
jq '.data[] | select(.limits.max_input_tokens > 100000)'
Find Models with Tool Support
curl -s -X 'GET' 'https://apis.threatwinds.com/api/ai/v1/models' \
-H 'Authorization: Bearer <token>' | \
jq '.data[] | select(.capabilities | any(. == "tools-use"))'
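The same filters can be applied in application code once the `data` array has been parsed. This is a minimal Python sketch; `filter_models` and its parameter names are assumptions, and the sample entries are trimmed copies of two models from the List Models response.

```python
def filter_models(models, provider=None, capabilities=(), input_tokens_over=0):
    """Return the entries that satisfy every criterion that was supplied."""
    required = set(capabilities)
    return [
        m for m in models
        if (provider is None or m["provider"] == provider)
        and required.issubset(m["capabilities"])            # exact string match
        and m["limits"]["max_input_tokens"] > input_tokens_over
    ]


SAMPLE_MODELS = [
    {"id": "claude-sonnet-4", "provider": "claude",
     "capabilities": ["chat", "tools-use", "reasoning", "code-generation", "image"],
     "limits": {"max_input_tokens": 200000}},
    {"id": "gpt-oss-20b", "provider": "groq",
     "capabilities": ["chat", "code-generation", "tools-use", "reasoning"],
     "limits": {"max_input_tokens": 131072}},
]

vision = filter_models(SAMPLE_MODELS, capabilities=["image"])
large_context_tools = filter_models(
    SAMPLE_MODELS, capabilities=["tools-use"], input_tokens_over=100000
)
```

Unlike separate shell pipelines, the criteria combine in one call, so a request for "tool support and more than 100,000 input tokens" needs only a single pass over the list.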
Model Selection Best Practices
- Check Token Limits: Ensure the model supports the context length you need
- Verify Capabilities: Confirm the model has the features you need (vision, tools, etc.)
- Consider Cost: Balance quality needs against token costs
- Test Multiple Models: Compare results across models for your use case
- Monitor Performance: Track response times and quality metrics
- Stay Updated: The model list changes as providers release new versions
- Match Expertise: Use specialized models (like Silas) for domain-specific tasks
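The first two practices can be automated as a pre-flight check against a model entry returned by the API. The sketch below is one possible shape, not an official client feature: `check_request_fits` is a hypothetical helper, the token counts are assumed to come from your own estimate, and the sample entry copies claude-sonnet-4 from the Get Model Details response.

```python
def check_request_fits(model, input_tokens, completion_tokens, required_capabilities=()):
    """Return a list of problems; an empty list means the request fits the model."""
    problems = []
    limits = model["limits"]
    if input_tokens > limits["max_input_tokens"]:
        problems.append("input exceeds max_input_tokens")
    if completion_tokens > limits["max_completion_tokens"]:
        problems.append("completion exceeds max_completion_tokens")
    if input_tokens + completion_tokens > limits["max_total_tokens"]:
        problems.append("input plus completion exceeds max_total_tokens")
    missing = sorted(set(required_capabilities) - set(model["capabilities"]))
    if missing:
        problems.append("missing capabilities: " + ", ".join(missing))
    return problems


SONNET = {
    "id": "claude-sonnet-4",
    "capabilities": ["chat", "tools-use", "reasoning", "code-generation", "image"],
    "limits": {"max_input_tokens": 200000,
               "max_completion_tokens": 64000,
               "max_total_tokens": 264000},
}

ok = check_request_fits(SONNET, 150000, 4000, ["image"])
too_long = check_request_fits(SONNET, 250000, 1000)
```

Returning every problem at once, rather than stopping at the first, makes it easier to compare several candidate models for the same request.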