OCPEDGE-2746: Add MutableTopology feature gated infra spec.controlPlaneTopology #2891

Open
jeff-roche wants to merge 5 commits into openshift:master from jeff-roche:controlPlaneTopologySpec

@jeff-roche jeff-roche commented Jun 15, 2026

Utilizes the MutableTopology feature gate which enables spec.controlPlaneTopology on the Infrastructure resource, allowing cluster topology to be set to HighlyAvailable or SingleReplica.

CRD manifests are now split per cluster profile (Hypershift, SelfManagedHA) so the field is only present in the appropriate profile/feature-set combinations.

Includes integration tests verifying:

  • Accepted values when the gate is enabled (MutableTopology.yaml)

Implements API changes needed for enhancements#2008
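
For illustration only, a minimal Go sketch of how an enum-restricted, feature-gated spec field like the one described above might be declared and checked. The struct is a trimmed, hypothetical stand-in for the real `InfrastructureSpec`; the marker comments and the `validateSpecTopology` helper are assumptions for the sake of the example, not code from this PR. In the real API the enum is enforced by the generated CRD schema, not by hand-written Go.

```go
package main

import "fmt"

// TopologyMode is a string-typed enum, mirroring the values named in the PR.
type TopologyMode string

const (
	HighlyAvailableTopologyMode TopologyMode = "HighlyAvailable"
	SingleReplicaTopologyMode   TopologyMode = "SingleReplica"
)

// InfrastructureSpec is a hypothetical, trimmed stand-in for the real type.
// The markers below are assumptions about how the field could be gated and
// validated; they are plain comments here and have no effect in this sketch.
type InfrastructureSpec struct {
	// +openshift:enable:FeatureGate=MutableTopology
	// +kubebuilder:validation:Enum=HighlyAvailable;SingleReplica
	// +optional
	ControlPlaneTopology TopologyMode `json:"controlPlaneTopology,omitempty"`
}

// validateSpecTopology mimics the enum check in plain Go (hypothetical helper).
// An empty value is accepted because the field is optional.
func validateSpecTopology(mode TopologyMode) error {
	switch mode {
	case "", HighlyAvailableTopologyMode, SingleReplicaTopologyMode:
		return nil
	default:
		return fmt.Errorf("unsupported controlPlaneTopology %q: must be HighlyAvailable or SingleReplica", mode)
	}
}

func main() {
	for _, m := range []TopologyMode{"", "HighlyAvailable", "SingleReplica", "External"} {
		spec := InfrastructureSpec{ControlPlaneTopology: m}
		fmt.Printf("%q -> %v\n", spec.ControlPlaneTopology, validateSpecTopology(spec.ControlPlaneTopology))
	}
}
```

The empty string is treated as "omitted" here, which matches the optional nature of the field; any other value outside the two named modes is rejected.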

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 15, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 15, 2026

@jeff-roche: This pull request references OCPEDGE-2746 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Utilizes the MutableTopology feature gate which enables spec.controlPlaneTopology on the Infrastructure resource, allowing cluster topology to be set to HighlyAvailable or SingleReplica.

CRD manifests are now split per cluster profile (Hypershift, SelfManagedHA) so the field is only present in the appropriate profile/feature-set combinations.

Includes integration tests verifying:

  • Accepted values when the gate is enabled (MutableTopology.yaml)
  • Field pruning when the gate is disabled (AAA_ungated.yaml)

Implements API changes needed for enhancements#2008

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci

openshift-ci Bot commented Jun 15, 2026

Contributor

Hello @jeff-roche! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

The PR adds an optional ControlPlaneTopology field to InfrastructureSpec, gated by the MutableTopology OpenShift feature gate with kubebuilder enum validation restricting values to HighlyAvailable and SingleReplica. A new Infrastructure CRD test suite validates field creation with omitted/allowed values, update transitions between allowed values, and negative cases for unsupported values. A ControllerConfig test fixture confirms downstream consumption with spec set to SingleReplica and status defaulting to HighlyAvailable. CRD payload manifests have annotations removed for self-managed-high-availability and ibm-cloud-managed inclusion, and feature-set values are updated from CustomNoUpgrade to TechPreviewNoUpgrade in applicable variants. The test configuration path is updated to reference the new SelfManagedHA variant. Existing ungated test expectations are reformatted to use single-quoted YAML scalar strings for consistency.

Suggested reviewers

  • jkyros
🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Structure And Quality | ❓ Inconclusive | No result was produced after verification. Marking as INCONCLUSIVE. | Re-run the check or adjust instructions to produce a final result. |

✅ Passed checks (14 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a feature-gated MutableTopology capability with spec.controlPlaneTopology field to the Infrastructure resource. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All Ginkgo test names in the PR are static and deterministic. Test names in YAML files contain no UUIDs, timestamps, generated identifiers, pod/node names, or dynamic formatting. |
| Microshift Test Compatibility | ✅ Passed | No Ginkgo e2e tests are added in this PR. Changes are limited to YAML-based CRD validation test configs, Go type definitions, and CRD manifests. The check is not applicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No new Ginkgo e2e tests (It(), Describe(), Context(), When()) are added in this PR. Changes are limited to API validation test configurations (YAML) and Go unit test updates, which are outside the... |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds infrastructure support (ControlPlaneTopology field) to enable topology-aware scheduling, not restrict it. Contains no deployment manifests, affinity rules, nodeSelectors, or other topology-... |
| Ote Binary Stdout Contract | ✅ Passed | PR adds API field and test fixtures only; no process-level code or stdout writes introduced. Existing test logging properly uses GinkgoWriter. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR adds no Ginkgo e2e tests. Changes include unit tests using Go testing.T and YAML CRD validation fixtures, which are not subject to IPv6/disconnected network compatibility checks. |
| No-Weak-Crypto | ✅ Passed | PR contains no weak cryptography. Changes are limited to topology configuration field additions, test fixtures, and CRD manifests with no cryptographic algorithms or implementations. |
| Container-Privileges | ✅ Passed | PR changes are limited to CRD schemas, test configs, and type definitions with no container specs or privilege escalations flagged by check. |
| No-Sensitive-Data-In-Logs | ✅ Passed | No logging statements or sensitive data exposure found in PR changes. All modified files contain only type definitions, test configurations, and CRD manifests using clearly-marked test fixtures (fa... |
| Description check | ✅ Passed | The PR description clearly relates to the changeset, describing the MutableTopology feature gate implementation, spec.controlPlaneTopology field, CRD manifest restructuring, and integration tests included in the changes. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jun 15, 2026
@openshift-ci openshift-ci Bot requested review from JoelSpeed and jkyros June 15, 2026 18:03
@openshift-ci

openshift-ci Bot commented Jun 15, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign deads2k for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jeff-roche

Author

The linter failure doesn't make much sense to me; an enum should be a better validator than string length.

Comment thread config/v1/types_infrastructure.go Outdated
Comment thread config/v1/tests/infrastructures.config.openshift.io/AAA_ungated.yaml Outdated
Comment thread config/v1/types_infrastructure.go Outdated
Comment thread config/v1/types_infrastructure.go Outdated
Comment on lines +59 to +62
// controlPlaneTopology expresses the desired topology configuration for control nodes.
// The 'HighlyAvailable' mode represents a "normal", 3 control node cluster.
// The 'SingleReplica' mode represents configuration where there is a single control node.
// If left blank, no change is required and no transitions will be triggered.

Contributor

I'd prefer to explain here what happens when you set the different values. At the moment you're kind of defining the meaning but not saying what the observable changes are.

We also need to think about valid transitions. For example, once the spec is HighlyAvailable and the status is also HighlyAvailable, moving spec back to SingleReplica wouldn't be supported, and we should prevent that with a validation and explain it here.

Author

Where would you suggest is the best place to prevent that? The current plan for the new CCO controller was to react only to SNO->HA for now. Do you think we need some sort of CEL validation to prevent changing from HA->SNO if the status is HA?

Contributor

You can add a CEL rule to the Infrastructure object to prevent the spec change if the status is HA, yes. That shouldn't be super complex, I don't think.
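
As a rough illustration of the guard discussed in this thread, here is a minimal Go sketch that rejects a `SingleReplica` spec while the status reports `HighlyAvailable`. The function name, error text, and the CEL expression shown in the comment are all hypothetical and untested; they are not taken from this PR, and the actual rule may be scoped differently (for example, as a transition rule on updates only).

```go
package main

import (
	"errors"
	"fmt"
)

// TopologyMode mirrors the two values discussed in the thread.
type TopologyMode string

const (
	HighlyAvailable TopologyMode = "HighlyAvailable"
	SingleReplica   TopologyMode = "SingleReplica"
)

// errUnsupportedTransition is a hypothetical error for the HA -> SNO case.
var errUnsupportedTransition = errors.New(
	"spec.controlPlaneTopology cannot be SingleReplica while status.controlPlaneTopology is HighlyAvailable")

// A hypothetical CEL rule on the Infrastructure object expressing the same
// guard might look roughly like this (an assumption, not the PR's rule):
//
//	+kubebuilder:validation:XValidation:rule="!has(self.status) || !has(self.status.controlPlaneTopology) || self.status.controlPlaneTopology != 'HighlyAvailable' || !has(self.spec.controlPlaneTopology) || self.spec.controlPlaneTopology != 'SingleReplica'",message="cannot move to SingleReplica once the cluster is HighlyAvailable"

// validateTopology applies the guard in plain Go: an empty spec value means
// "no change requested" and is always allowed.
func validateTopology(spec, status TopologyMode) error {
	if status == HighlyAvailable && spec == SingleReplica {
		return errUnsupportedTransition
	}
	return nil
}

func main() {
	cases := []struct{ spec, status TopologyMode }{
		{"", HighlyAvailable},              // omitted spec: allowed
		{HighlyAvailable, SingleReplica},   // SNO -> HA request: allowed
		{HighlyAvailable, HighlyAvailable}, // already HA: allowed
		{SingleReplica, HighlyAvailable},   // HA -> SNO request: rejected
	}
	for _, c := range cases {
		fmt.Printf("spec=%q status=%q -> %v\n", c.spec, c.status, validateTopology(c.spec, c.status))
	}
}
```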

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
config/v1/types_infrastructure.go (1)

59-66: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Enhance field documentation to explain observable behavior and add transition validation.

The field documentation needs improvement in three areas:

  1. Observable behavior: The comment defines what each mode represents but doesn't explain what observable changes occur when setting these values. For example, the status field (lines 108-115) explains that operators "should not configure the operand for highly-available operation" in SingleReplica mode. The spec field should similarly explain what happens when you request each topology.

  2. Valid transitions: A past review comment indicates that certain transitions (e.g., HighlyAvailable → SingleReplica after status is set) may not be supported. If transition constraints exist, they must be:

    • Documented in the field comment
    • Enforced with +kubebuilder:validation:XValidation CEL rules
  3. Relationship to status: The comment should clarify how spec.controlPlaneTopology relates to status.controlPlaneTopology, especially regarding the behavior when the spec field is omitted.

As per coding guidelines, field relationships or constraints must be enforced with XValidation rules using CEL expressions, and all validation markers must be fully documented in field comments.
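
To make the spec/status relationship in point 3 concrete, a small Go sketch follows. It assumes, per the field comment quoted earlier, that a blank spec value triggers no transition; the further assumption that a transition is requested whenever spec and status differ is illustrative only, and `transitionRequested` is a hypothetical helper rather than anything defined in this PR.

```go
package main

import "fmt"

// TopologyMode is a string-typed stand-in for the API enum.
type TopologyMode string

const (
	HighlyAvailable TopologyMode = "HighlyAvailable"
	SingleReplica   TopologyMode = "SingleReplica"
)

// transitionRequested reports whether the desired topology (spec) asks for a
// change relative to the observed topology (status).
// Assumptions: an empty spec means "no change required"; otherwise a
// difference between spec and status is treated as a requested transition.
func transitionRequested(spec, status TopologyMode) bool {
	if spec == "" {
		return false
	}
	return spec != status
}

func main() {
	fmt.Println(transitionRequested("", SingleReplica))
	fmt.Println(transitionRequested(HighlyAvailable, SingleReplica))
	fmt.Println(transitionRequested(HighlyAvailable, HighlyAvailable))
}
```

Whether a requested transition is actually permitted is a separate question, which is where a CEL validation on the object would come in.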

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config/v1/types_infrastructure.go` around lines 59 - 66, The
controlPlaneTopology field comment needs enhancement in three areas: add
documentation explaining what observable changes occur when each mode is set
(e.g., what happens with HighlyAvailable vs SingleReplica), document any valid
topology transitions and add kubebuilder:validation:XValidation CEL rules to
enforce transition constraints (such as preventing HighlyAvailable to
SingleReplica transitions after status is set), and clarify in the comment how
the spec.controlPlaneTopology field relates to the status.controlPlaneTopology
field including the behavior when the spec field is omitted. Update the comment
block above the controlPlaneTopology field definition and add appropriate
XValidation markers with CEL expressions to enforce any documented transition
rules.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@config/v1/types_infrastructure.go`:
- Around line 59-66: The controlPlaneTopology field comment needs enhancement in
three areas: add documentation explaining what observable changes occur when
each mode is set (e.g., what happens with HighlyAvailable vs SingleReplica),
document any valid topology transitions and add
kubebuilder:validation:XValidation CEL rules to enforce transition constraints
(such as preventing HighlyAvailable to SingleReplica transitions after status is
set), and clarify in the comment how the spec.controlPlaneTopology field relates
to the status.controlPlaneTopology field including the behavior when the spec
field is omitted. Update the comment block above the controlPlaneTopology field
definition and add appropriate XValidation markers with CEL expressions to
enforce any documented transition rules.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: f6a425df-2096-4cea-b08f-b935c6be042c

📥 Commits

Reviewing files that changed from the base of the PR and between 4fa8a9b and 9d6450a.

📒 Files selected for processing (1)
  • config/v1/types_infrastructure.go

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 17, 2026
Introduce the MutableTopology feature gate which enables
spec.controlPlaneTopology on the Infrastructure resource, allowing
cluster topology to be set to HighlyAvailable or SingleReplica.

CRD manifests are now split per cluster profile (Hypershift,
SelfManagedHA) so the field is only present in the appropriate
profile/feature-set combinations.

Includes integration tests verifying:
- Accepted values when the gate is enabled (MutableTopology.yaml)
- Field pruning when the gate is disabled (AAA_ungated.yaml)

Assisted-by: Claude <noreply@anthropic.com>
Signed-off-by: Jeff Roche <jeroche@redhat.com>
@jeff-roche jeff-roche force-pushed the controlPlaneTopologySpec branch from 6bc4bb9 to eaf0036 Compare June 17, 2026 15:46
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 17, 2026
Signed-off-by: Jeff Roche <jeroche@redhat.com>
Signed-off-by: Jeff Roche <jeroche@redhat.com>
@openshift-ci

openshift-ci Bot commented Jun 18, 2026

Contributor

@jeff-roche: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

…opology

Signed-off-by: Jeff Roche <jeroche@redhat.com>

Labels

jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.


3 participants