API Reference

Packages​

vllm.ai/v1alpha1​

Package v1alpha1 contains API Schema definitions for the v1alpha1 API group

Resource Types

- IntelligentPool
- IntelligentPoolList
- IntelligentRoute
- IntelligentRouteList

Decision​

Decision defines a routing decision based on rule combinations

Appears in: IntelligentRouteSpec

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `name` string | Name is the unique identifier for this decision | | MaxLength: 100, MinLength: 1, Required: {} |
| `priority` integer | Priority defines the priority of this decision (higher values = higher priority). Used when strategy is "priority" | 0 | Maximum: 1000, Minimum: 0 |
| `description` string | Description provides a human-readable description of this decision | | MaxLength: 500 |
| `signals` SignalCombination | Signals defines the signal combination logic | | Required: {} |
| `modelRefs` ModelRef array | ModelRefs defines the model references for this decision (currently only one model is supported) | | MaxItems: 1, MinItems: 1, Required: {} |
| `plugins` DecisionPlugin array | Plugins defines the plugins to apply for this decision | | MaxItems: 10 |
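
The priority field only states that higher values win when the strategy is "priority". The following is a minimal Python sketch of that selection, assuming the router has already worked out which decisions matched. The `pick_decision` helper and the dict layout are illustrative and not part of this API.

```python
from typing import Optional


def pick_decision(matched: list[dict]) -> Optional[dict]:
    """Return the matched decision with the highest priority, or None.

    Assumption: a missing priority falls back to the documented default of 0.
    Ties resolve to the first entry in list order here; the real tie-break
    rule is not documented in this reference.
    """
    if not matched:
        return None
    return max(matched, key=lambda d: d.get("priority", 0))


matched = [
    {"name": "general", "priority": 10},
    {"name": "code-review", "priority": 500},
    {"name": "fallback"},  # priority omitted, treated as 0
]
print(pick_decision(matched)["name"])  # code-review
```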

DecisionPlugin​

DecisionPlugin defines a plugin configuration for a decision

Appears in: Decision

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `type` string | Type is the plugin type (semantic-cache, jailbreak, pii, system_prompt, header_mutation) | | Enum: [semantic-cache jailbreak pii system_prompt header_mutation], Required: {} |
| `configuration` RawExtension | Configuration is the plugin-specific configuration as a raw JSON object | | Schemaless: {} |

DomainSignal​

DomainSignal defines a domain category for classification

Appears in: Signals

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `name` string | Name is the unique identifier for this domain | | MaxLength: 100, MinLength: 1, Required: {} |
| `description` string | Description provides a human-readable description of this domain | | MaxLength: 500 |

EmbeddingSignal​

EmbeddingSignal defines an embedding-based signal extraction rule

Appears in: Signals

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `name` string | Name is the unique identifier for this signal | | MaxLength: 100, MinLength: 1, Required: {} |
| `threshold` float | Threshold is the similarity threshold for matching (0.0-1.0) | | Maximum: 1, Minimum: 0, Required: {} |
| `candidates` string array | Candidates is the list of candidate phrases for semantic matching | | MaxItems: 100, MinItems: 1, Required: {} |
| `aggregationMethod` string | AggregationMethod defines how to aggregate multiple candidate similarities | max | Enum: [mean max any] |
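
To make the interaction between `threshold` and `aggregationMethod` concrete, here is a small Python sketch. It assumes the per-candidate similarity scores are already computed, that a match means "aggregate is greater than or equal to the threshold", and that `any` means "at least one candidate reaches the threshold". None of these details are confirmed by this reference, and `embedding_signal_matches` is a hypothetical helper.

```python
def embedding_signal_matches(
    similarities: list[float],
    threshold: float,
    aggregation_method: str = "max",  # documented default
) -> bool:
    """Decide whether an embedding signal fires for one request.

    Assumed semantics (not taken from the implementation):
      mean -> average similarity must reach the threshold
      max  -> best candidate must reach the threshold
      any  -> at least one candidate must reach the threshold
    """
    if not similarities:
        return False
    if aggregation_method == "mean":
        return sum(similarities) / len(similarities) >= threshold
    if aggregation_method == "max":
        return max(similarities) >= threshold
    if aggregation_method == "any":
        return any(s >= threshold for s in similarities)
    raise ValueError(f"unsupported aggregationMethod: {aggregation_method!r}")


scores = [0.42, 0.81, 0.55]  # made-up similarities for three candidates
print(embedding_signal_matches(scores, 0.7, "max"))   # True
print(embedding_signal_matches(scores, 0.7, "mean"))  # False
```

Under this reading, `max` and `any` produce the same boolean result; the real implementation may distinguish them in ways this sketch does not capture, for example in the score it reports.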

IntelligentPool​

IntelligentPool defines a pool of models with their configurations

Appears in: IntelligentPoolList

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `apiVersion` string | vllm.ai/v1alpha1 | | |
| `kind` string | IntelligentPool | | |
| `metadata` ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| `spec` IntelligentPoolSpec | | | |
| `status` IntelligentPoolStatus | | | |

IntelligentPoolList​

IntelligentPoolList contains a list of IntelligentPool

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `apiVersion` string | vllm.ai/v1alpha1 | | |
| `kind` string | IntelligentPoolList | | |
| `metadata` ListMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| `items` IntelligentPool array | | | |

IntelligentPoolSpec​

IntelligentPoolSpec defines the desired state of IntelligentPool

Appears in: IntelligentPool

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `defaultModel` string | DefaultModel specifies the default model to use when no specific model is selected | | MaxLength: 100, MinLength: 1, Required: {} |
| `models` ModelConfig array | Models defines the list of available models in this pool | | MaxItems: 100, MinItems: 1, Required: {} |
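
As an illustration of how these fields fit together, the sketch below builds an IntelligentPool manifest as a Python dict, checks it against the length and item limits listed in this reference, and prints it as JSON (which `kubectl apply -f` accepts alongside YAML). The resource name, model names, adapter name, and prices are placeholders, and `check_pool_spec` is a hypothetical helper that covers only a few of the documented constraints.

```python
import json

pool = {
    "apiVersion": "vllm.ai/v1alpha1",
    "kind": "IntelligentPool",
    "metadata": {"name": "example-pool"},  # placeholder name
    "spec": {
        "defaultModel": "model-a",
        "models": [
            {
                "name": "model-a",
                "reasoningFamily": "qwen3",
                "pricing": {"inputTokenPrice": 0.000002, "outputTokenPrice": 0.000006},
                "loras": [{"name": "sql-adapter", "description": "Placeholder adapter"}],
            },
            {"name": "model-b"},
        ],
    },
}


def check_pool_spec(spec: dict) -> list[str]:
    """Return violations of a subset of the documented limits."""
    errors = []
    if not 1 <= len(spec.get("defaultModel", "")) <= 100:
        errors.append("defaultModel must be 1-100 characters")
    models = spec.get("models", [])
    if not 1 <= len(models) <= 100:
        errors.append("models must contain 1-100 entries")
    for model in models:
        if not 1 <= len(model.get("name", "")) <= 100:
            errors.append("model name must be 1-100 characters")
        if len(model.get("loras", [])) > 50:
            errors.append(f"model {model.get('name')!r} has more than 50 LoRA adapters")
    return errors


print(check_pool_spec(pool["spec"]))  # []
print(json.dumps(pool, indent=2))
```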

IntelligentPoolStatus​

IntelligentPoolStatus defines the observed state of IntelligentPool

Appears in: IntelligentPool

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `conditions` Condition array | Conditions represent the latest available observations of the IntelligentPool's state | | |
| `observedGeneration` integer | ObservedGeneration reflects the generation of the most recently observed IntelligentPool | | |
| `modelCount` integer | ModelCount indicates the number of models in the pool | | |

IntelligentRoute​

IntelligentRoute defines intelligent routing rules and decisions

Appears in: IntelligentRouteList

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `apiVersion` string | vllm.ai/v1alpha1 | | |
| `kind` string | IntelligentRoute | | |
| `metadata` ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| `spec` IntelligentRouteSpec | | | |
| `status` IntelligentRouteStatus | | | |

IntelligentRouteList​

IntelligentRouteList contains a list of IntelligentRoute

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `apiVersion` string | vllm.ai/v1alpha1 | | |
| `kind` string | IntelligentRouteList | | |
| `metadata` ListMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| `items` IntelligentRoute array | | | |

IntelligentRouteSpec​

IntelligentRouteSpec defines the desired state of IntelligentRoute

Appears in: IntelligentRoute

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `signals` Signals | Signals defines signal extraction rules for routing decisions | | |
| `decisions` Decision array | Decisions defines the routing decisions based on signal combinations | | MaxItems: 100, MinItems: 1, Required: {} |

IntelligentRouteStatus​

IntelligentRouteStatus defines the observed state of IntelligentRoute

Appears in: IntelligentRoute

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `conditions` Condition array | Conditions represent the latest available observations of the IntelligentRoute's state | | |
| `observedGeneration` integer | ObservedGeneration reflects the generation of the most recently observed IntelligentRoute | | |
| `statistics` RouteStatistics | Statistics provides statistics about configured decisions and signals | | |

KeywordSignal​

KeywordSignal defines a keyword-based signal extraction rule

Appears in: Signals

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `name` string | Name is the unique identifier for this rule (also used as category name) | | MaxLength: 100, MinLength: 1, Required: {} |
| `operator` string | Operator defines the logical operator for keywords (AND/OR) | | Enum: [AND OR], Required: {} |
| `keywords` string array | Keywords is the list of keywords to match | | MaxItems: 100, MinItems: 1, Required: {} |
| `caseSensitive` boolean | CaseSensitive specifies whether keyword matching is case-sensitive | false | |
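
The effect of `operator` and `caseSensitive` can be sketched in a few lines of Python. This assumes plain substring matching against the request text; the actual matcher may tokenize or use word boundaries, which this reference does not specify. `keyword_signal_matches` is a hypothetical helper.

```python
def keyword_signal_matches(
    text: str,
    keywords: list[str],
    operator: str,
    case_sensitive: bool = False,  # documented default
) -> bool:
    """AND requires every keyword to occur in the text, OR requires at least one.

    Assumption: substring matching, with both sides lowercased when
    case_sensitive is False.
    """
    if operator not in ("AND", "OR"):
        raise ValueError(f"unsupported operator: {operator!r}")
    haystack = text if case_sensitive else text.lower()
    needles = keywords if case_sensitive else [k.lower() for k in keywords]
    hits = [needle in haystack for needle in needles]
    return all(hits) if operator == "AND" else any(hits)


prompt = "Please review my Python code"
print(keyword_signal_matches(prompt, ["python", "code"], "AND"))  # True
print(keyword_signal_matches(prompt, ["python", "rust"], "AND"))  # False
print(keyword_signal_matches(prompt, ["python", "rust"], "OR"))   # True
```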

LoRAConfig​

LoRAConfig defines a LoRA adapter configuration

Appears in: ModelConfig

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `name` string | Name is the unique identifier for this LoRA adapter | | MaxLength: 100, MinLength: 1, Required: {} |
| `description` string | Description provides a human-readable description of this LoRA adapter | | MaxLength: 500 |

ModelConfig​

ModelConfig defines the configuration for a single model

Appears in: IntelligentPoolSpec

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `name` string | Name is the unique identifier for this model | | MaxLength: 100, MinLength: 1, Required: {} |
| `reasoningFamily` string | ReasoningFamily specifies the reasoning syntax family (e.g., "qwen3", "deepseek"). Must be defined in the global static configuration's ReasoningFamilies | | MaxLength: 50 |
| `pricing` ModelPricing | Pricing defines the cost structure for this model | | |
| `loras` LoRAConfig array | LoRAs defines the list of LoRA adapters available for this model | | MaxItems: 50 |

ModelPricing​

ModelPricing defines the pricing structure for a model

Appears in: ModelConfig

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `inputTokenPrice` float | InputTokenPrice is the cost per input token | | Minimum: 0 |
| `outputTokenPrice` float | OutputTokenPrice is the cost per output token | | Minimum: 0 |
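
Because both prices are expressed per token, the cost of a single request is a simple weighted sum. The Python sketch below shows that arithmetic with made-up prices and token counts. It assumes a missing price counts as zero; the currency is not specified in this reference, and `request_cost` is a hypothetical helper.

```python
def request_cost(pricing: dict, input_tokens: int, output_tokens: int) -> float:
    """cost = input_tokens * inputTokenPrice + output_tokens * outputTokenPrice.

    Assumption: an omitted price is treated as 0.
    """
    return (
        input_tokens * pricing.get("inputTokenPrice", 0.0)
        + output_tokens * pricing.get("outputTokenPrice", 0.0)
    )


pricing = {"inputTokenPrice": 0.000002, "outputTokenPrice": 0.000006}  # placeholder prices
cost = request_cost(pricing, input_tokens=1500, output_tokens=500)
print(f"{cost:.6f}")  # 0.006000
```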

ModelRef​

ModelRef defines a model reference without score

Appears in: Decision

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `model` string | Model is the name of the model (must exist in IntelligentPool) | | MaxLength: 100, MinLength: 1, Required: {} |
| `loraName` string | LoRAName is the name of the LoRA adapter to use (must exist in the model's LoRAs) | | MaxLength: 100 |
| `useReasoning` boolean | UseReasoning specifies whether to enable reasoning mode for this model | false | |
| `reasoningDescription` string | ReasoningDescription provides context for when to use reasoning | | MaxLength: 500 |
| `reasoningEffort` string | ReasoningEffort defines the reasoning effort level (low/medium/high) | | Enum: [low medium high] |
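
The two "must exist" requirements above are cross-resource constraints, so a pre-flight check before applying a route can be useful. The following Python sketch checks a ModelRef against an IntelligentPoolSpec held as plain dicts. `check_model_ref` and the sample names are hypothetical, and how the controller itself reports such errors is not described in this reference.

```python
def check_model_ref(model_ref: dict, pool_spec: dict) -> list[str]:
    """Return a list of problems found in one ModelRef (empty means OK)."""
    errors = []
    models = {m["name"]: m for m in pool_spec.get("models", [])}
    model = models.get(model_ref["model"])
    if model is None:
        # Without the model we cannot check its LoRA adapters.
        return [f"model {model_ref['model']!r} not found in pool"]
    lora_name = model_ref.get("loraName")
    if lora_name:
        known = {lora["name"] for lora in model.get("loras", [])}
        if lora_name not in known:
            errors.append(f"LoRA {lora_name!r} not defined for model {model['name']!r}")
    effort = model_ref.get("reasoningEffort")
    if effort is not None and effort not in ("low", "medium", "high"):
        errors.append(f"reasoningEffort {effort!r} is not one of low/medium/high")
    return errors


pool_spec = {
    "defaultModel": "model-a",
    "models": [{"name": "model-a", "loras": [{"name": "sql-adapter"}]}],
}
print(check_model_ref({"model": "model-a", "loraName": "sql-adapter"}, pool_spec))  # []
print(check_model_ref({"model": "model-z"}, pool_spec))
```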

RouteStatistics​

RouteStatistics provides statistics about the IntelligentRoute configuration

Appears in: IntelligentRouteStatus

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `decisions` integer | Decisions indicates the number of decisions | | |
| `keywords` integer | Keywords indicates the number of keyword signals | | |
| `embeddings` integer | Embeddings indicates the number of embedding signals | | |
| `domains` integer | Domains indicates the number of domain signals | | |

SignalCombination​

SignalCombination defines how to combine multiple signals

Appears in: Decision

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `operator` string | Operator defines the logical operator for combining conditions (AND/OR) | | Enum: [AND OR], Required: {} |
| `conditions` SignalCondition array | Conditions defines the list of signal conditions | | MaxItems: 50, MinItems: 1, Required: {} |

SignalCondition​

SignalCondition defines a single signal condition

Appears in: SignalCombination

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `type` string | Type defines the type of signal (keyword/embedding/domain) | | Enum: [keyword embedding domain], Required: {} |
| `name` string | Name is the name of the signal to reference | | MaxLength: 100, MinLength: 1, Required: {} |
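
A SignalCombination reduces a list of SignalCondition entries to a single boolean with AND or OR. The Python sketch below evaluates one combination, assuming signal extraction has already produced the set of signals that fired for the request, represented as (type, name) pairs. That representation and the `combination_matches` helper are assumptions made for illustration.

```python
def combination_matches(combination: dict, fired: set[tuple[str, str]]) -> bool:
    """Evaluate a SignalCombination against the signals that fired.

    fired holds (type, name) pairs, e.g. ("keyword", "urgent").
    """
    results = [
        (cond["type"], cond["name"]) in fired
        for cond in combination["conditions"]
    ]
    if combination["operator"] == "AND":
        return all(results)
    if combination["operator"] == "OR":
        return any(results)
    raise ValueError(f"unsupported operator: {combination['operator']!r}")


combination = {
    "operator": "AND",
    "conditions": [
        {"type": "domain", "name": "computer science"},
        {"type": "keyword", "name": "code-keywords"},
    ],
}
fired = {("domain", "computer science")}
print(combination_matches(combination, fired))                           # False
print(combination_matches({**combination, "operator": "OR"}, fired))    # True
```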

Signals​

Signals defines signal extraction rules

Appears in: IntelligentRouteSpec

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `keywords` KeywordSignal array | Keywords defines keyword-based signal extraction rules | | MaxItems: 100 |
| `embeddings` EmbeddingSignal array | Embeddings defines embedding-based signal extraction rules | | MaxItems: 100 |
| `domains` DomainSignal array | Domains defines MMLU domain categories for classification | | MaxItems: 14 |
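
Since each SignalCondition refers to a signal by type and name, a route is only coherent if every reference resolves to an entry in these three lists. The Python sketch below finds dangling references in an IntelligentRouteSpec held as a dict. It assumes the condition types `keyword`, `embedding`, and `domain` map to the `keywords`, `embeddings`, and `domains` lists respectively; `undefined_signal_refs` and the sample names are hypothetical.

```python
# Assumed mapping from SignalCondition.type to the Signals list it refers to.
SIGNAL_LISTS = {"keyword": "keywords", "embedding": "embeddings", "domain": "domains"}


def undefined_signal_refs(route_spec: dict) -> list[tuple[str, str, str]]:
    """Return (decision, type, name) for every condition with no matching signal."""
    signals = route_spec.get("signals", {})
    defined = {
        (signal_type, signal["name"])
        for signal_type, key in SIGNAL_LISTS.items()
        for signal in signals.get(key, [])
    }
    missing = []
    for decision in route_spec.get("decisions", []):
        for cond in decision["signals"]["conditions"]:
            if (cond["type"], cond["name"]) not in defined:
                missing.append((decision["name"], cond["type"], cond["name"]))
    return missing


route_spec = {
    "signals": {
        "keywords": [{"name": "code-keywords", "operator": "OR", "keywords": ["bug", "refactor"]}],
        "domains": [{"name": "computer science"}],
    },
    "decisions": [
        {
            "name": "coding",
            "signals": {
                "operator": "OR",
                "conditions": [
                    {"type": "keyword", "name": "code-keywords"},
                    {"type": "embedding", "name": "code-intent"},  # never defined
                ],
            },
            "modelRefs": [{"model": "model-a"}],
        }
    ],
}
print(undefined_signal_refs(route_spec))  # [('coding', 'embedding', 'code-intent')]
```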