chore: Skip Code CI for non code changes #22894

Open
comphead wants to merge 3 commits into apache:main from comphead:chore

Conversation

@comphead

Contributor

Which issue does this PR close?

Rationale for this change

Problem. Doc-only PRs (a typo fix in README.md, a release-note edit) trigger the full ~25-job Rust CI matrix in rust.yml and dependencies.yml. That wastes runner time and queue capacity on changes that can't break code.

Solution. Adopt Apache Spark's gating pattern (spark/.github/workflows/build_and_test.yml): a single change-detection job, then per-job if:.

  • rust.yml and dependencies.yml each gain a detect-changes job that diffs the PR against an IGNORE_CODE_CI_FOR_PATHS glob list and emits has_code.
  • Every existing job in those workflows now declares needs: detect-changes and if: needs.detect-changes.outputs.has_code == 'true'.
  • Doc-only PR → those jobs report skipped (which GitHub branch protection treats as satisfying required status checks).
  • Code PR → identical to before, plus a sub-second detector.
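To make the detection step concrete, here is a minimal sketch of the decision it describes: a PR counts as having code if at least one changed file falls outside the ignore globs. The glob values below are placeholders (the actual IGNORE_CODE_CI_FOR_PATHS contents are not shown in this description), and the function name is hypothetical rather than taken from the PR's detector script.

```python
from fnmatch import fnmatchcase

# Placeholder globs for illustration only; the real
# IGNORE_CODE_CI_FOR_PATHS list lives in the workflow files.
IGNORE_CODE_CI_FOR_PATHS = ["*.md", "docs/*"]


def has_code(changed_files, ignore_globs=IGNORE_CODE_CI_FOR_PATHS):
    """Return True if any changed file is NOT covered by an ignore glob.

    fnmatchcase's `*` also matches `/`, so "docs/*" covers nested paths.
    """
    return any(
        not any(fnmatchcase(path, pattern) for pattern in ignore_globs)
        for path in changed_files
    )


if __name__ == "__main__":
    # A doc-only change versus a change that also touches Rust sources.
    print(has_code(["README.md"]))
    print(has_code(["README.md", "datafusion/core/src/lib.rs"]))
```

A single non-ignored file is enough to run the full matrix, so mixed doc-and-code PRs behave exactly as they do today.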

Why per-job if: and not workflow-level paths-ignore. The repo's required_status_checks list in .asf.yaml is incompatible with paths-ignore: ci/scripts/check_asf_yaml_status_checks.py explicitly rejects workflows that have it, because a workflow-level skip never reports a status, leaving required checks in "Expected" and blocking merge. A job-level if: false does report (conclusion: skipped), which is what unblocks doc-only PRs without weakening protection on code PRs.
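For illustration, the kind of rule described above could look roughly like the sketch below. This is not the contents of check_asf_yaml_status_checks.py; the function name is hypothetical, and the workflow is assumed to be already parsed into a plain dict.

```python
def triggers_with_paths_ignore(workflow):
    """Return the names of triggers in a parsed workflow that set paths-ignore.

    `workflow` is assumed to be a dict loaded from a workflow YAML file.
    `on:` may also be a string or a list, in which case no trigger can
    carry a paths-ignore filter.
    """
    triggers = workflow.get("on", {})
    if not isinstance(triggers, dict):
        return []
    return sorted(
        name
        for name, config in triggers.items()
        if isinstance(config, dict) and "paths-ignore" in config
    )


if __name__ == "__main__":
    workflow = {
        "on": {
            "pull_request": {"paths-ignore": ["docs/**"]},
            "push": {"branches": ["main"]},
        }
    }
    # A workflow that provides required checks would be rejected
    # if this list is non-empty.
    print(triggers_with_paths_ignore(workflow))
```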

Other workflows. codeql.yml, breaking_changes_detector.yml, large_files.yml keep workflow-level paths-ignore since none are required checks. extended.yml, audit.yml, docs_pr.yaml were
already path-filtered to the right files.

.asf.yaml. One added context: "Detect changes". The detector script is fail-open (any error → has_code=true → run full CI), so a transient git failure can't silently merge a broken PR; making "Detect changes" a required check means a broken detector itself blocks the merge.
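The fail-open behaviour can be sketched as follows. The function names and the two injected callables are hypothetical, and treating an empty diff as "has code" is an assumption of this sketch, not something stated above. The `has_code=true` line matches the key=value format that GitHub Actions reads from the file named by `GITHUB_OUTPUT`.

```python
def detect_has_code(list_changed_files, is_code_path):
    """Decide whether code CI should run, failing open on any error.

    list_changed_files: callable returning the changed paths (e.g. a
        wrapper around `git diff --name-only`); it may raise.
    is_code_path: callable returning True for paths that need code CI.
    """
    try:
        files = list(list_changed_files())
    except Exception:
        # Fail open: if the diff cannot be computed, run the full CI.
        return True
    if not files:
        # Assumption: an empty diff is suspicious, so run the full CI too.
        return True
    return any(is_code_path(path) for path in files)


def github_output_line(has_code):
    """Format the job output as a key=value line for $GITHUB_OUTPUT."""
    return f"has_code={'true' if has_code else 'false'}"


if __name__ == "__main__":
    def broken_git():
        raise RuntimeError("transient git failure")

    # Even with a failing diff, the result says "run CI".
    print(github_output_line(detect_has_code(broken_git, lambda p: False)))
```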

@github-actions (bot) added the development-process label (Related to development process of DataFusion) on Jun 10, 2026
@comphead

Contributor Author

@blaginin @alamb @Jefffrey this is a first attempt to skip code CI for non-code changes, let me know what you think.

I picked this approach to preserve all currently required checks in .asf.yaml. I assume it was done for merge queue support?

@blaginin

Member

> I assume it was done for merge queue support?

Yes! You cannot use paths-ignore with the merge queue (MQ) because all required checks have to be green.

https://github.com/orgs/community/discussions/45899

@blaginin

Member

A cool thing to test would be: how many non-code changes, e.g. configs, caused test failures in the last N months? I wanted to do that myself, but my internet is horrible for the next couple of days.

Also, the approach looks correct to me, but with those mandatory steps, 50% of the time I end up predicting it wrong and accidentally blocking merging for everyone.

If you like, feel free to push it to https://github.com/apache/datafusion-sandbox/ to make sure everything is passing. I'm also happy to do that testing myself in a few days.

@comphead

Contributor Author

> If you like, feel free to push it to https://github.com/apache/datafusion-sandbox/ to make sure everything is passing. Also happy to do that testing myself in a few days

Thanks for the hint, it's actually nice to have an ASF sandbox. I had to experiment with a fork-subfork setup when I was assessing user fork CI.

  name: cargo test doc (amd64)
- needs: linux-build-lib
+ needs: [linux-build-lib, detect-changes]
+ if: needs.detect-changes.outputs.has_code == 'true'

Contributor

I think we run doc tests against the code in the md files too?

e.g.

#[cfg(doctest)]
doc_comment::doctest!(
"../../../docs/source/user-guide/arrow-introduction.md",
user_guide_arrow_introduction
);
#[cfg(doctest)]

  name: check configs.md and ***_functions.md is up-to-date
- needs: linux-build-lib
+ needs: [linux-build-lib, detect-changes]
+ if: needs.detect-changes.outputs.has_code == 'true'

Contributor

Could this mean that if configs.md is manually changed without a code update, we could miss it? (Or was that an existing issue?)


on:
  pull_request:
    paths-ignore:

Contributor

Maybe we keep this check even for doc files?

Labels

development-process Related to development process of DataFusion

Development

Successfully merging this pull request may close these issues.

chore: do not run code CI for documentation only PRs

3 participants