3 changes: 3 additions & 0 deletions Makefile
@@ -24,6 +24,9 @@ lint-fix: golangci-lint ## Run golangci-lint linter and perform fixes
test: ## Run all tests.
go test ./...

.PHONY: generate
generate: deepcopy crds ## Regenerate CRDs and DeepCopy after API type changes.

.PHONY: crds
crds: controller-gen ## Generate WebhookConfiguration, ClusterRole and CustomResourceDefinition objects.
$(CONTROLLER_GEN) rbac:roleName=manager-role crd:allowDangerousTypes=true webhook paths="./..." output:crd:artifacts:config=helm/library/cortex/files/crds
11 changes: 11 additions & 0 deletions api/external/nova/messages.go
@@ -217,6 +217,17 @@ func (s NovaSpec) GetSchedulerHintStr(key string) (string, error) {
return "", errors.New("unknown scheduler hint type")
}

// IsEvacuation checks if this Nova spec represents an evacuation request.
// Nova sets the scheduler hint "_nova_check_type" to "evacuate" for evacuation requests.
// ref https://github.com/sapcc/nova/pull/594/changes
func (s NovaSpec) IsEvacuation() bool {
checkType, err := s.GetSchedulerHintStr("_nova_check_type")
if err != nil {
return false
}
return checkType == "evacuate"
}

Member:

If you rebase on main, you could use this function and then just check for intent == api.EvacuateIntent:

func (req ExternalSchedulerRequest) GetIntent() (RequestIntent, error) {
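The reviewer's suggestion could look roughly like the sketch below. Only the `GetIntent` signature appears in the comment, so the surrounding types (`RequestIntent`, `EvacuateIntent`, the request struct, and the hint-parsing logic) are hypothetical mirrors of the api package, not its actual implementation:

```go
package main

import "fmt"

// Hypothetical mirror of the api package's intent type and constant.
type RequestIntent string

const EvacuateIntent RequestIntent = "evacuate"

// Minimal stand-ins for the real request types (assumed shapes).
type NovaSpec struct {
	SchedulerHints map[string]any `json:"scheduler_hints,omitempty"`
}

type ExternalSchedulerRequest struct {
	Spec NovaSpec
}

// GetIntent derives the request intent once from the scheduler hints,
// so callers can compare against constants like EvacuateIntent instead
// of re-parsing "_nova_check_type" themselves.
func (req ExternalSchedulerRequest) GetIntent() (RequestIntent, error) {
	hint, ok := req.Spec.SchedulerHints["_nova_check_type"]
	if !ok {
		return "", nil // no hint set: no special intent
	}
	switch v := hint.(type) {
	case string:
		return RequestIntent(v), nil
	case []any:
		if len(v) == 1 {
			if s, ok := v[0].(string); ok {
				return RequestIntent(s), nil
			}
		}
	}
	return "", fmt.Errorf("unknown scheduler hint type")
}

func main() {
	req := ExternalSchedulerRequest{Spec: NovaSpec{
		SchedulerHints: map[string]any{"_nova_check_type": []any{"evacuate"}},
	}}
	if intent, err := req.GetIntent(); err == nil && intent == EvacuateIntent {
		fmt.Println("evacuation request")
	}
}
```

With such a helper, `IsEvacuation` reduces to a single comparison against `EvacuateIntent` after a rebase on main.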

Member:

The evacuate case should also be unit tested already, so no additional test needed

type NovaInstanceGroup struct {
UserID string `json:"user_id"`
ProjectID string `json:"project_id"`
67 changes: 67 additions & 0 deletions api/external/nova/messages_test.go
@@ -8,6 +8,73 @@ import (
"testing"
)

func TestNovaSpec_IsEvacuation(t *testing.T) {
tests := []struct {
name string
schedulerHints map[string]any
expected bool
}{
{
name: "evacuation request with list hint",
schedulerHints: map[string]any{
"_nova_check_type": []any{"evacuate"},
},
expected: true,
},
{
name: "evacuation request with string hint",
schedulerHints: map[string]any{
"_nova_check_type": "evacuate",
},
expected: true,
},
{
name: "non-evacuation request with different check type",
schedulerHints: map[string]any{
"_nova_check_type": []any{"rebuild"},
},
expected: false,
},
{
name: "non-evacuation request without check type hint",
schedulerHints: map[string]any{
"domain_name": []any{"monsoon3"},
},
expected: false,
},
{
name: "nil scheduler hints",
schedulerHints: nil,
expected: false,
},
{
name: "empty scheduler hints",
schedulerHints: map[string]any{},
expected: false,
},
{
name: "evacuation with additional hints",
schedulerHints: map[string]any{
"domain_name": []any{"monsoon3"},
"_nova_check_type": []any{"evacuate"},
},
expected: true,
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
spec := NovaSpec{
SchedulerHints: tt.schedulerHints,
}
result := spec.IsEvacuation()
if result != tt.expected {
t.Errorf("IsEvacuation() = %v, expected %v", result, tt.expected)
}
})
}
}

func TestGetIntent(t *testing.T) {
tests := []struct {
name string
126 changes: 103 additions & 23 deletions api/v1alpha1/reservation_types.go
@@ -8,23 +8,21 @@ import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Additional specifications needed to place the reservation.
type ReservationSchedulerSpec struct {
// If the type of scheduler is cortex-nova, this field will contain additional
// information used by cortex-nova to place the instance.
CortexNova *ReservationSchedulerSpecCortexNova `json:"cortexNova,omitempty"`
}
// ReservationType defines the type of reservation.
type ReservationType string

// Additional specifications needed by cortex-nova to place the instance.
type ReservationSchedulerSpecCortexNova struct {
// The project ID to reserve for.
ProjectID string `json:"projectID,omitempty"`
// The domain ID to reserve for.
DomainID string `json:"domainID,omitempty"`
// The flavor name of the instance to reserve.
FlavorName string `json:"flavorName,omitempty"`
// Extra specifications relevant for initial placement of the instance.
FlavorExtraSpecs map[string]string `json:"flavorExtraSpecs,omitempty"`
const (
// ReservationTypeCommittedResource is a reservation for committed/reserved capacity.
ReservationTypeCommittedResource ReservationType = "CommittedResourceReservation"
// ReservationTypeFailover is a reservation for failover capacity.
ReservationTypeFailover ReservationType = "FailoverReservation"
)

// ReservationSelector defines the selector criteria for a reservation.
type ReservationSelector struct {
// IsNUMAAlignedHost specifies whether the host should be NUMA-aligned.
// +kubebuilder:validation:Optional
IsNUMAAlignedHost bool `json:"isNUMAAlignedHost,omitempty"`
Member:

Is this something specific to the reservation or something we could add to the hv crd?

Member:

Maybe I don't get what this field should do

Collaborator (author):

Discussion outcome: remove for now

}

// ReservationSpec defines the desired state of Reservation.
@@ -33,10 +31,70 @@ type ReservationSpec struct {
// This can be used to clean up reservations synced from external systems
// without touching reservations created manually or by other systems.
Creator string `json:"creator,omitempty"`
// Specification of the scheduler that will handle the reservation.
Scheduler ReservationSchedulerSpec `json:"scheduler,omitempty"`
// Resources requested to reserve for this instance.
Requests map[string]resource.Quantity `json:"requests,omitempty"`

// Resources to reserve for this instance.
Resources map[string]resource.Quantity `json:"resources,omitempty"`

// Selector specifies criteria for selecting appropriate hosts.
// +kubebuilder:validation:Optional
Selector ReservationSelector `json:"selector,omitempty"`

// SchedulingDomain specifies the scheduling domain for this reservation (e.g., "nova", "ironcore").
// +kubebuilder:validation:Optional
SchedulingDomain string `json:"schedulingDomain,omitempty"`

// Type of reservation. Defaults to CommittedResourceReservation if not specified.
// +kubebuilder:validation:Enum=CommittedResourceReservation;FailoverReservation
// +kubebuilder:default=CommittedResourceReservation
Type ReservationType `json:"type,omitempty"`

// --- Time-related fields ---

// StartTime is the time when the reservation becomes active.
// +kubebuilder:validation:Optional
StartTime *metav1.Time `json:"startTime,omitempty"`

// EndTime is the time when the reservation expires.
// +kubebuilder:validation:Optional
EndTime *metav1.Time `json:"endTime,omitempty"`
Comment on lines +53 to +59
Member:

Wondering if this is relevant for all kinds of reservation.
I.e. is a start + end time relevant for a failover reservation?
If not, we should think about substructs like

kind: FailoverReservation
failover:
  ...
# or
kind: CommitmentReservation
commitment:
  startTime:
  endTime:

Collaborator:

Lets discuss live, I think there was an argument for flat instead of nested structs

Collaborator (author):

Discussion outcome: Keep for now but we are open to change it


// NOTE: ActiveTime was removed. Use StartTime and EndTime to define the active period.
// If you need a duration-based activation, calculate EndTime = StartTime + Duration externally.

// --- Scheduling-related fields (e.g., for Nova) ---

// ProjectID is the UUID of the project this reservation belongs to.
// +kubebuilder:validation:Optional
ProjectID string `json:"projectID,omitempty"`

// DomainID is the domain ID to reserve for.
// +kubebuilder:validation:Optional
DomainID string `json:"domainID,omitempty"`

// ResourceName is the name of the resource to reserve (e.g., FlavorName for Nova).
// +kubebuilder:validation:Optional
ResourceName string `json:"resourceName,omitempty"`

// ResourceExtraSpecs contains extra specifications relevant for initial placement
// of the instance (e.g., FlavorExtraSpecs for Nova).
// +kubebuilder:validation:Optional
ResourceExtraSpecs map[string]string `json:"resourceExtraSpecs,omitempty"`
Comment on lines +64 to +81
Member:

Maybe we can also put this under its own substruct, e.g.

constraints:
  projectID: ...
  ...

Member:

Or we make this part of the expectedGuests below that I proposed, meaning, we schedule this reservation so it conforms to the expected guests' scheduling constraints. Which would be semantically correct?

Collaborator (author):

Discussion outcome: we just remove it for now until we need something like ResourceExtraSpecs

Collaborator (author):

AI: for me to prepare a comparison for the next tech sync (flat vs. nested for our 4 use cases)


// --- Placement fields (desired state) ---

// Host is the desired compute host where the reservation should be placed.
// This is a generic name that represents different concepts depending on the scheduling domain:
// - For Nova: the hypervisor hostname
// - For Pods: the node name
// The scheduler will attempt to place the reservation on this host.
// +kubebuilder:validation:Optional
Host string `json:"host,omitempty"`

// ConnectTo maps the instances that loosely use this reservation.
// The key is the instance (e.g., a VM UUID); the value is use-case specific and can be empty.
// E.g., for failover reservations, this maps VM/instance UUIDs to the host the VM is currently running on.
// +kubebuilder:validation:Optional
ConnectTo map[string]string `json:"connectTo,omitempty"`
Comment on lines +93 to +97
Member:

Maybe we can do something like this to type it more strongly

expectedGuests:
  - vmID: my-funny-vm
    expectBy: utc-timestamp
    kind: InitialPlacementVM
  - vmID: my-other-vm
    kind: FailoverVM
  - project: my-unfunny-project
    kind: CommitmentVM

}
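Put together, the flat `ReservationSpec` above could be expressed as a manifest like the following. This is a sketch only: the API group does not appear in this diff and is left as a placeholder, and all concrete values (names, IDs, hosts) are invented for illustration:

```yaml
# Sketch of a Reservation using the flat spec fields above.
# The API group is not visible in this diff; replace <group> accordingly.
apiVersion: <group>/v1alpha1
kind: Reservation
metadata:
  name: example-failover-reservation
spec:
  creator: external-sync
  type: FailoverReservation
  schedulingDomain: nova
  resources:
    cpu: "8"
    memory: 32Gi
  selector:
    isNUMAAlignedHost: true
  projectID: 0123456789abcdef          # hypothetical project UUID
  domainID: fedcba9876543210           # hypothetical domain ID
  resourceName: m1.large               # e.g. a Nova flavor name
  startTime: "2026-01-01T00:00:00Z"
  endTime: "2026-07-01T00:00:00Z"
  host: hypervisor-001                 # desired placement
  connectTo:
    # VM UUID -> host the VM currently runs on (failover use case)
    11111111-2222-3333-4444-555555555555: hypervisor-002
```

Since `type` defaults to `CommittedResourceReservation`, a commitment-style reservation could omit it and the `connectTo` map entirely.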

// The phase in which the reservation is.
Expand All @@ -58,17 +116,39 @@ const (
type ReservationStatus struct {
// The current phase of the reservation.
Phase ReservationStatusPhase `json:"phase,omitempty"`

// The current status conditions of the reservation.
// Conditions include:
// - type: Ready
// status: True|False|Unknown
// reason: ReservationReady
// message: Reservation is successfully scheduled
// lastTransitionTime: <timestamp>
// +kubebuilder:validation:Optional
Conditions []metav1.Condition `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type"`
// The name of the compute host that was allocated.
Host string `json:"host"`

// ObservedHost is the actual host that the reservation is placed on.
// This is set by the scheduler after successful placement and reflects the current state.
// It should match Spec.Host when the reservation is successfully placed.
// This is a generic name that represents different concepts depending on the scheduling domain:
// - For Nova: the hypervisor hostname
// - For Pods: the node name
// +kubebuilder:validation:Optional
ObservedHost string `json:"observedHost,omitempty"`
Member:

Maybe a hot take, but does the reservation need to keep track of where the VM is located? Doesn't it only need to keep track of whether a VM is allocated by it and when it was planned to be allocated, so we can detect whether we need to move the reservation to the right spot (where the VM actually spawned) and how much of the reservation's space is left?


// ObservedConnectTo maps VM/instance UUIDs to the host they are currently allocated on.
// This tracks which VMs are actually using this failover reservation and their current placement.
// Key: VM/instance UUID, Value: Host name where the VM is currently running.
// Only used when Type is FailoverReservation.
// This should reflect the actual state and may differ from Spec.ConnectTo during transitions.
// +kubebuilder:validation:Optional
ObservedConnectTo map[string]string `json:"observedConnectTo,omitempty"`
PhilippMatthes (Member), Feb 13, 2026:

Would this be better?

allocation:
  - vmID: my-funny-vm
    # To subtract it from the reserved resources
    resources:
      cpu: 2
      memory: 4
    # Keep track of our vm, can be Pending if we're
    # locking this space preventively
    status: Active
    # The last time we observed a change in the state.
    # This will allow us to remove the allocation if it's
    # unready for too long?
    lastTransitionTime: utc timestamp
    # Some enum string that helps us handle this
    # better (in case status isn't enough)
    reason: ...
    # Human-readable explanation why this allocation
    # currently resides in this status
    message: ...

}

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:resource:scope=Cluster
// +kubebuilder:printcolumn:name="Host",type="string",JSONPath=".status.host"
// +kubebuilder:printcolumn:name="Host",type="string",JSONPath=".status.observedHost"
// +kubebuilder:printcolumn:name="Phase",type="string",JSONPath=".status.phase"

// Reservation is the Schema for the reservations API
Expand Down
66 changes: 34 additions & 32 deletions api/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default.
