Adaptive Proxy exposes granular provider controls so you can steer routing per request without sacrificing the intelligent model router. The provider object is accepted on Chat Completions, Anthropic Messages, and Gemini Generate requests and is enforced consistently across non-streaming and streaming executions (including fallback paths).

Quick Start

Use the provider object to control routing behavior per request:
const completion = await openai.chat.completions.create({
  model: "meta-llama/llama-3.1-70b-instruct",
  messages: [{ role: "user", content: "Summarize this thread." }],
  provider: {
    order: ["anthropic", "openrouter/groq"],
    only: ["anthropic", "openrouter/groq"],
    ignore: ["deepinfra"],
    sort: "price",
    quantizations: ["fp8"],
    require_parameters: true,
    data_collection: "deny",
    zdr: true,
    enforce_distillable_text: false,
    allow_fallbacks: false,
    max_price: {
      prompt: 1.2,
      completion: 2.0,
      request: 0.10
    }
  }
});

console.log(`Used provider: ${completion.provider}`);

Note: provider.allow_fallbacks is a convenience toggle for the new fallback.enabled flag. Use the fallback object when you need finer control over mode, retries, or circuit breakers.
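To make the relationship between the two flags concrete, here is a minimal sketch of how a client-side helper might resolve the effective fallback setting. The `RoutingOptions` type and `resolveFallbackEnabled` function are hypothetical names, and the precedence rule (an explicit `fallback.enabled` wins, fallback defaults to enabled) is an assumption rather than documented behavior.

```typescript
// Hypothetical request shape: only the names provider.allow_fallbacks and
// fallback.enabled come from the text above; everything else is illustrative.
interface RoutingOptions {
  provider?: { allow_fallbacks?: boolean };
  fallback?: { enabled?: boolean };
}

// Assumption: an explicit fallback.enabled takes precedence over the
// convenience flag, and fallback stays enabled when neither is set.
function resolveFallbackEnabled(options: RoutingOptions): boolean {
  const explicit = options.fallback?.enabled;
  if (explicit !== undefined) return explicit;
  const convenience = options.provider?.allow_fallbacks;
  if (convenience !== undefined) return convenience;
  return true;
}

// Under these assumptions, both spellings disable fallback.
const viaProvider = resolveFallbackEnabled({ provider: { allow_fallbacks: false } });
const viaFallback = resolveFallbackEnabled({ fallback: { enabled: false } });
console.log(viaProvider, viaFallback);
```

If the proxy resolves conflicting values differently, only the first branch of the helper would need to change.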

Real Examples

Cost Control

“Set maximum prices for all requests”
Use max_price to enforce budget limits and prevent unexpected costs.
provider: {
  max_price: { prompt: 1.0, completion: 1.5 }
}

Compliance

“Only use zero data retention providers”
Enforce data privacy requirements with zdr and data_collection filters.
provider: {
  zdr: true,
  data_collection: "deny"
}

Performance

“Prioritize throughput for batch jobs”
Use sort: "throughput" or :nitro shortcuts for high-volume processing.
provider: {
  sort: "throughput",
  quantizations: ["fp8"]
}

Reliability

“Restrict to trusted providers”
Use only to whitelist specific providers for critical applications.
provider: {
  only: ["anthropic", "openai"],
  allow_fallbacks: false
}

Configuration Options

Provider Ordering

Control which providers are tried first:
// Try Anthropic first, then OpenRouter
const completion = await openai.chat.completions.create({
  model: "claude-3-5-sonnet-20241022",
  messages: [{ role: "user", content: "Hello!" }],
  provider: {
    order: ["anthropic", "openrouter"]
  }
});

Cost Optimization

Set maximum prices to control spending:
const completion = await openai.chat.completions.create({
  model: "",
  messages: [{ role: "user", content: "Write a summary" }],
  provider: {
    sort: "price",
    max_price: {
      prompt: 0.5,      // $0.50 per million prompt tokens
      completion: 1.0,  // $1.00 per million completion tokens
      request: 0.05     // $0.05 per request
    }
  }
});

Compliance & Security

Enforce data privacy and security requirements:
const completion = await openai.chat.completions.create({
  model: "",
  messages: [{ role: "user", content: "Analyze sensitive data" }],
  provider: {
    data_collection: "deny",  // Only non-retentive providers
    zdr: true,                // Zero Data Retention only
    only: ["anthropic", "openai"]  // Trusted providers only
  }
});

Provider Parameters

order (array): Explicit list of provider tags to try first. When omitted, Adaptive’s heuristics determine the initial ordering.
only (array): Whitelist of providers/endpoint tags. Requests are rejected if no allowed provider remains.
ignore (array): Blacklist of providers/endpoint tags to skip even when the router selects them.
sort (string): Secondary ordering when order is absent. Options: price, throughput, and latency map to cost, capacity, and responsiveness heuristics.
quantizations (array): Require specific quantization levels (e.g., ["int8","fp8"]). Endpoint metadata is used; models lacking the requested format are filtered out.
require_parameters (boolean): When true, the model must advertise support for every parameter implied by the request (tools, response_format, etc.).
data_collection (string): allow (default) or deny. When deny, only providers marked as non-retentive in the registry remain. (Falls back to current metadata; future registry updates will make this stricter.)
zdr (boolean): Restrict routing to Zero Data Retention endpoints.
enforce_distillable_text (boolean): Filter to models whose publishers have opted into distillable outputs.
allow_fallbacks (boolean): Convenience flag that maps to fallback.enabled. Set to false to disable provider retries entirely.
max_price (object): Ceilings for prompt/completion/request/image pricing. Providers lacking explicit pricing are treated as exceeding the cap.
max_price.prompt (number): Maximum price for prompt tokens (USD per million tokens).
max_price.completion (number): Maximum price for completion tokens (USD per million tokens).
max_price.request (number): Maximum price per request (USD).
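
The parameter list above can be mirrored as a TypeScript type, which helps catch misspelled fields at compile time. The sketch below also illustrates the max_price rule that a provider without explicit pricing counts as exceeding the cap. The names `ProviderPreferences`, `EndpointPricing`, and `withinMaxPrice` are hypothetical, and the pricing record is an assumed shape, not the registry's actual schema.

```typescript
// Ceilings in USD: prompt/completion per million tokens, request per call.
interface MaxPrice {
  prompt?: number;
  completion?: number;
  request?: number;
}

// Mirrors the documented fields; the type name itself is hypothetical.
interface ProviderPreferences {
  order?: string[];
  only?: string[];
  ignore?: string[];
  sort?: "price" | "throughput" | "latency";
  quantizations?: string[];
  require_parameters?: boolean;
  data_collection?: "allow" | "deny";
  zdr?: boolean;
  enforce_distillable_text?: boolean;
  allow_fallbacks?: boolean;
  max_price?: MaxPrice;
}

// Assumed shape of a provider's advertised pricing (not from the docs).
interface EndpointPricing {
  prompt?: number;
  completion?: number;
  request?: number;
}

// Returns true only if every ceiling that is set has a known price at or
// below it. A missing price is treated as exceeding the cap.
function withinMaxPrice(pricing: EndpointPricing | undefined, cap: MaxPrice): boolean {
  const dimensions: Array<keyof MaxPrice> = ["prompt", "completion", "request"];
  for (const dimension of dimensions) {
    const limit = cap[dimension];
    if (limit === undefined) continue; // no ceiling for this dimension
    const price = pricing?.[dimension];
    if (price === undefined || price > limit) return false;
  }
  return true;
}

const prefs: ProviderPreferences = {
  sort: "price",
  max_price: { prompt: 0.5, completion: 1.0 },
};
// Sample pricing values are made up for illustration.
console.log(withinMaxPrice({ prompt: 0.4, completion: 0.9 }, prefs.max_price ?? {}));
```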

Intelligent Routing + Provider Constraints

  1. Logical model selection still happens through the Adaptive router (unless you hard-code model).
  2. Provider constraints (order/only/quantization/price/etc.) are applied when building the physical execution plan.
  3. Fallback now respects fallback.enabled. When disabled, the first provider failure surfaces directly.
Because constraints are enforced during provider selection, both primary execution and fallback candidates adhere to the same rules. For example, if you pin quantizations: ["fp8"], every provider in the execution plan satisfies that requirement.
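
As a rough illustration of step 2, the sketch below filters a candidate list with only, ignore, and quantizations, then moves providers named in order to the front. It is a simplified model and not Adaptive's actual planner: the `Candidate` and `Constraints` types, the `buildPlan` function, and the sample provider data are all hypothetical.

```typescript
// Hypothetical candidate record; real endpoint metadata is richer.
interface Candidate {
  provider: string;
  quantization?: string;
}

interface Constraints {
  order?: string[];
  only?: string[];
  ignore?: string[];
  quantizations?: string[];
}

function buildPlan(candidates: Candidate[], c: Constraints): Candidate[] {
  const allowed = candidates.filter((cand) => {
    if (c.only && c.only.indexOf(cand.provider) === -1) return false;
    if (c.ignore && c.ignore.indexOf(cand.provider) !== -1) return false;
    if (c.quantizations) {
      // Candidates without quantization metadata are filtered out.
      if (cand.quantization === undefined) return false;
      if (c.quantizations.indexOf(cand.quantization) === -1) return false;
    }
    return true;
  });
  // The docs state that requests are rejected if no allowed provider remains.
  if (allowed.length === 0) {
    throw new Error("No provider satisfies the requested constraints");
  }
  const order = c.order ?? [];
  const rank = (provider: string): number => {
    const index = order.indexOf(provider);
    return index === -1 ? order.length : index; // unlisted providers go last
  };
  // sort is stable in modern runtimes, so unlisted providers keep their
  // original relative order.
  return allowed.slice().sort((a, b) => rank(a.provider) - rank(b.provider));
}

// Sample data, made up for illustration.
const plan = buildPlan(
  [
    { provider: "deepinfra", quantization: "fp8" },
    { provider: "groq", quantization: "fp8" },
    { provider: "together", quantization: "fp8" },
    { provider: "fireworks", quantization: "int8" },
  ],
  { order: ["together"], ignore: ["deepinfra"], quantizations: ["fp8"] }
);
console.log(plan.map((cand) => cand.provider));
```

In this model the first entry of the plan is the primary provider and the rest are fallback candidates, so every entry has already passed the same filters.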

Nitro / Floor Shortcuts

  • Append :nitro to any model slug to imply provider.sort = "throughput".
  • Append :floor to imply provider.sort = "price".
These hints are recognized even when you specify model directly (e.g., meta-llama/llama-3.1-70b-instruct:nitro).
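
The suffix handling can be sketched as a small parser that splits a model slug into the base model and the implied sort value. This is an illustrative helper with hypothetical names (`parseModelSlug`, `SortHint`); how the proxy resolves a suffix that conflicts with an explicit provider.sort is not covered in this section, so the sketch does not address it.

```typescript
type SortHint = "throughput" | "price";

interface ParsedSlug {
  model: string;
  sort?: SortHint;
}

// Maps the documented shortcuts to the provider.sort value they imply.
const SHORTCUTS: Array<[string, SortHint]> = [
  [":nitro", "throughput"],
  [":floor", "price"],
];

function parseModelSlug(slug: string): ParsedSlug {
  for (const [suffix, sort] of SHORTCUTS) {
    const hasSuffix =
      slug.length > suffix.length && slug.slice(-suffix.length) === suffix;
    if (hasSuffix) {
      return { model: slug.slice(0, -suffix.length), sort };
    }
  }
  return { model: slug }; // no shortcut present
}

const parsed = parseModelSlug("meta-llama/llama-3.1-70b-instruct:nitro");
console.log(parsed.model, parsed.sort);
```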

Endpoint Coverage

Endpoint                Support
/v1/chat/completions    Full provider object + fallback.enabled
/v1/messages            Same provider fields + fallback toggle
/v1/models/:generate    Same provider fields + fallback toggle
The Gemini streaming API now builds the same provider execution plan as the non-streaming route, so ordering and filtering are consistent everywhere.

Migration Tips

  • Existing code: No changes required unless you want to leverage the new controls. Previous behavior (no provider object) is unchanged.
  • Fallback: If you relied on “unset mode = disabled,” switch to fallback.enabled=false (or provider.allow_fallbacks=false).
  • Registry metadata: Some filters (data collection, ZDR, distillable text) depend on registry tags. They currently act as “best effort” switches and will grow stricter as the registry schema expands.
Use these controls to emulate the routing policies your apps expect, enforce compliance requirements, and keep Adaptive’s intelligent planner as the safety net. Whatever combination you choose, the planner applies your constraints to every provider it executes, primary or fallback, within the limits of the best-effort registry metadata noted above.