Why Use This?
Use Adaptive’s intelligence, run inference wherever you want:
- “I have my own OpenAI/Anthropic accounts” - Get optimal model selection, pay your providers directly
- “I run models on-premise” - Get routing decisions for your local infrastructure
- “I have enterprise contracts” - Use your existing provider relationships with intelligent routing
- “I need data privacy” - Keep inference local while getting smart model selection
Request
Provider-agnostic format - send your available models and prompt, get intelligent selection back (see the sketch after this list).
- Models - array of available model specifications in provider:model_name format. Adaptive automatically queries the Model Registry to fill in pricing, capabilities, and other details for known models.
- Prompt - the prompt text to analyze for optimal model selection.
- Cost optimization preference - 0.0 = cheapest, 1.0 = best performance. Defaults to the server configuration; override it to prioritize cost savings or performance for this specific selection.
- Semantic cache - semantic cache configuration for this request.
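A minimal request sketch in Python, assuming a JSON body with fields named `models`, `prompt`, and `cost_bias`, and a placeholder endpoint path of `/v1/select-model`; these names are illustrative only, so check the API reference for the exact schema:

```python
# Illustrative sketch only: the endpoint path and field names
# (models, prompt, cost_bias) are assumptions, not the confirmed schema.
import requests

payload = {
    "models": [                      # available models in provider:model_name format
        "openai:gpt-4o",
        "openai:gpt-4o-mini",
        "anthropic:claude-3-5-sonnet",
    ],
    "prompt": "Summarize this incident report and list the action items.",
    "cost_bias": 0.3,                # 0.0 = cheapest, 1.0 = best performance (assumed name)
}

resp = requests.post(
    "https://YOUR_ADAPTIVE_HOST/v1/select-model",        # placeholder URL
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```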
Response
- Selected model - complete model information for the chosen model.
- Alternatives (optional) - fallback model options if the primary selection is unavailable. Each alternative is a complete RegistryModel object.
- Cache hit information - indicates whether the selection came from cache (“semantic_exact”, “semantic_similar”, or empty if not cached).
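For illustration, a possible response shape and how a client might consume it; the field names (`selected_model`, `alternatives`, `cache_tier`) are assumptions based on the descriptions above, not the authoritative schema:

```python
# Assumed response shape, for illustration only; field names here are not
# guaranteed to match the real schema.
selection = {
    "selected_model": {"provider": "anthropic", "model_name": "claude-3-5-sonnet"},
    "alternatives": [{"provider": "openai", "model_name": "gpt-4o-mini"}],
    "cache_tier": "semantic_similar",
}

chosen = selection["selected_model"]            # complete model info for the chosen model
fallbacks = selection.get("alternatives", [])   # optional fallback models
cache_tier = selection.get("cache_tier", "")    # "semantic_exact", "semantic_similar", or "" on a miss

print(f"Route to {chosen['provider']}:{chosen['model_name']} (cache: {cache_tier or 'miss'})")
```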
Authentication
Same as chat completions.
No Inference = Fast & Cheap
This endpoint is:
- ✅ Fast - No LLM inference, just routing logic
- ✅ Cheap - Doesn’t count against token usage
- ✅ Accurate - Uses exact same selection logic as real completions
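Putting it together, a hedged end-to-end sketch: ask Adaptive for a routing decision, then run inference on your own provider account so tokens are billed to you directly. The endpoint path and response field names are assumptions, and the inference call uses the openai Python SDK (v1+):

```python
# Hedged end-to-end sketch: get a routing decision from Adaptive, then run
# inference on your own provider account. The endpoint path and field names
# are assumptions; the inference call assumes the openai>=1.0 Python SDK.
import requests
from openai import OpenAI

prompt = "Draft a polite follow-up email to a customer about a delayed order."

selection = requests.post(
    "https://YOUR_ADAPTIVE_HOST/v1/select-model",        # placeholder URL
    headers={"Authorization": "Bearer YOUR_ADAPTIVE_KEY"},
    json={"models": ["openai:gpt-4o", "openai:gpt-4o-mini"], "prompt": prompt},
    timeout=10,
).json()

provider = selection["selected_model"]["provider"]       # assumed field names
model = selection["selected_model"]["model_name"]

if provider == "openai":
    # Inference runs against your own OpenAI account; tokens are billed there,
    # not by Adaptive.
    client = OpenAI(api_key="YOUR_OPENAI_KEY")
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    print(reply.choices[0].message.content)
```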



