[WIP] Implement exact reserved cpuset for some cases like using Intel priority core turbo #665
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -25,6 +25,7 @@ type ( | |
| LoadClass = cfgapi.LoadClass | ||
| SchedulingClass = cfgapi.SchedulingClass | ||
| CPUTopologyLevel = cfgapi.CPUTopologyLevel | ||
| ReservedCPUMode = cfgapi.ReservedCPUMode | ||
| ) | ||
|
|
||
| var ( | ||
|
|
@@ -35,6 +36,9 @@ var ( | |
| ) | ||
|
|
||
| const ( | ||
| ReservedCPUModePreferred = cfgapi.ReservedCPUModePreferred | ||
| ReservedCPUModeHardExact = cfgapi.ReservedCPUModeHardExact | ||
|
Comment on lines +39 to +40
Collaborator
@louie-tsai Can't we achieve this without introducing explicit dedicated modes for the existing and the new behavior? Maybe with something like: "if the reserved CPUs are defined as an explicit CPU set, and the configuration contains an explicit definition of the reserved balloon, and that definition has `MaxBalloons == 1 && MinCpus == ReservedCpus.Size() && MaxCpus == MinCpus`, then we treat it as an immutable/unresizable reserved balloon"?
Collaborator
I mean something that would make this behavior less of an exception and more a matter of applying existing/expected behavior to the reserved balloon.
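The condition proposed above can be sketched as a single predicate. This is only an illustration: `balloonDef`, its fields, and `isImmutableReserved` are hypothetical names that mirror the wording of the comment, not the policy's actual configuration types.

```go
package main

import "fmt"

// balloonDef is a hypothetical, trimmed-down balloon type definition.
// The field names follow the review comment, not the real config API.
type balloonDef struct {
	MaxBalloons int
	MinCPUs     int
	MaxCPUs     int
}

// isImmutableReserved reports whether the reserved balloon should be
// treated as unresizable. A nil def stands for "no explicit reserved
// balloon definition"; an empty reservedCPUs slice stands for "reserved
// CPUs not given as an explicit CPU set" (both are assumptions).
func isImmutableReserved(def *balloonDef, reservedCPUs []int) bool {
	if def == nil || len(reservedCPUs) == 0 {
		return false
	}
	n := len(reservedCPUs)
	return def.MaxBalloons == 1 && def.MinCPUs == n && def.MaxCPUs == def.MinCPUs
}

func main() {
	def := &balloonDef{MaxBalloons: 1, MinCPUs: 2, MaxCPUs: 2}
	fmt.Println(isImmutableReserved(def, []int{0, 1}))
}
```

With a check like this, no separate mode constant would be needed: the reserved balloon becomes immutable purely as a consequence of its own min/max settings.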
||
|
|
||
| CPUTopologyLevelUndefined = cfgapi.CPUTopologyLevelUndefined | ||
| CPUTopologyLevelSystem = cfgapi.CPUTopologyLevelSystem | ||
| CPUTopologyLevelPackage = cfgapi.CPUTopologyLevelPackage | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -22,6 +22,7 @@ spec: | |
| {{ $name }}: "{{ $value }}" | ||
| {{- end }} | ||
| spec: | ||
| hostPID: true | ||
|
Collaborator
Based on all the other changes, this looks to me like an unjustified security escalation, basically the same concern @klihub raised below.
||
| {{- with .Values.tolerations }} | ||
| tolerations: | ||
| {{- toYaml . | nindent 8 }} | ||
|
|
@@ -99,7 +100,9 @@ spec: | |
| image: {{ .Values.image.name }}:{{ .Values.image.tag | default .Chart.AppVersion }} | ||
| imagePullPolicy: {{ .Values.image.pullPolicy }} | ||
| securityContext: | ||
| allowPrivilegeEscalation: false | ||
| privileged: true | ||
| allowPrivilegeEscalation: true | ||
| runAsUser: 0 | ||
|
Comment on lines +103 to +105
Collaborator
@louie-tsai I don't see anything in this PR which would require such a change. If we need to alter the capability set of the policy (for reasons beyond this PR), we should
Author
Yes, agreed. That change is for OpenShift, so let me revert it.
||
| capabilities: | ||
| drop: | ||
| - ALL | ||
|
|
||
When the reserved balloon's CPU count matches `exactCount` but the cpuset has drifted, execution falls through and hits the generic `if oldCpuCount == newCpuCount { return nil }` before drift detection is ever reached. As a result, the drift error is currently dead code: it can never be triggered. We could merge both blocks into one that always returns, so the generic early return can't short-circuit it. Example: