ci(e2e): balance parallel nodes with measured per-file timings instead of filesize split#10423
ci(e2e): balance parallel nodes with measured per-file timings instead of filesize split#10423davidfirst wants to merge 9 commits into
Conversation
…d of filesize split
Code Review by Qodo
1. Unreachable failure assertion
|
PR Summary by QodoBalance CircleCI e2e parallel nodes using measured per-file timings WalkthroughsDescription• Replace filesize-based e2e splitting with deterministic, timing-weighted bin packing per node. • Add a per-file timing manifest derived from historical CI node wall-clock observations. • Provide a generator script to periodically refresh timings from CircleCI’s public API. Diagramgraph TD
config[".circleci/config.yml"] --> ci["CircleCI e2e_test"] --> split["split-e2e-tests.js"] --> mocha["mocha-circleci"]
split --> timings["e2e-test-timings.json"]
gen["generate-e2e-timings.js"] --> api{{"CircleCI API"}} --> timings
subgraph Legend
direction LR
_cfg["Config/File"] ~~~ _job["CI Job/Step"] ~~~ _ext{{"External API"}}
end
High-Level AssessmentThe following are alternative approaches to this PR: 1. Custom Mocha/JUnit timing reporter (include hook time)
2. Runtime splitting via CircleCI API (latest job timings)
3. Store per-file timings in repo artifacts (e.g., S3/Cache) and update automatically
Recommendation: The current approach (checked-in manifest + deterministic local bin-packing) is a good fit for reliability and reproducibility: no runtime API calls, stable assignments across nodes, and a clear fallback (median/default weights). The main tradeoff is staleness/maintenance of the manifest; if drift becomes frequent, consider a future follow-up to generate hook-inclusive timings via a Mocha reporter so the data refreshes automatically while still leveraging CircleCI’s built-in splitter. File ChangesOther (4)
|
|
Code review by qodo was updated up to the latest commit 6317dc9 |
|
Code review by qodo was updated up to the latest commit 8c96876 |
|
Code review by qodo was updated up to the latest commit 857e455 |
1 similar comment
|
Code review by qodo was updated up to the latest commit 857e455 |
|
Code review by qodo was updated up to the latest commit 994a025 |
| async function main() { | ||
| console.error('finding recent successful e2e_test jobs...'); | ||
| const jobNumbers = await findRecentE2eJobs(); | ||
| console.error(`collecting node data from ${jobNumbers.length} jobs...`); | ||
| const observations = []; | ||
| for (const jobNumber of jobNumbers) { | ||
| const nodes = await fetchNodeObservations(jobNumber); | ||
| observations.push(...nodes); | ||
| console.error(` job ${jobNumber}: ${nodes.length} nodes`); | ||
| } | ||
| if (observations.length < 50) { | ||
| throw new Error(`only ${observations.length} node observations collected - not enough to solve reliably`); | ||
| } | ||
| const files = [...new Set(observations.flatMap((o) => o.files))].sort(); | ||
| console.error(`solving for ${files.length} files from ${observations.length} node observations...`); | ||
| const { estimate, fixed, meanError } = solve(observations, files); | ||
| console.error( | ||
| `fixed per-node overhead: ${Math.round(fixed)}s, mean node prediction error: ${(meanError * 100).toFixed(1)}%` | ||
| ); | ||
| if (meanError > 0.15) { | ||
| console.error('warning: prediction error is high; estimates may be stale or assignments lacked diversity'); | ||
| } | ||
|
|
||
| // floor at 15s: the solver can collapse under-identified files to 0, which makes the | ||
| // bin-packer treat them as free and pile dozens of them onto one node | ||
| const manifest = Object.fromEntries(files.map((f, i) => [f, Math.max(15, Math.round(estimate[i]))])); | ||
| fs.writeFileSync(OUT_FILE, `${JSON.stringify(manifest, null, 2)}\n`); | ||
| console.error(`wrote ${OUT_FILE}`); | ||
| } | ||
|
|
||
| main().catch((err) => { | ||
| console.error(err); | ||
| process.exit(1); |
There was a problem hiding this comment.
1. Raw output in generate-e2e-timings.js 📘 Rule violation ⚙ Maintainability
The new scripts scripts/generate-e2e-timings.js and scripts/split-e2e-tests.js emit progress, warning/note, and stats output using raw console.error and process.stderr.write/process.stdout.write rather than the shared CLI output formatter. This violates the repository’s CLI Output Style Guide and can lead to inconsistent CLI styling across the repo.
Agent Prompt
## Issue description
`scripts/generate-e2e-timings.js` and `scripts/split-e2e-tests.js` emit CLI output (progress, warning/note, and stats lines) using raw `console.error` and `process.stderr.write` / `process.stdout.write`. The repo requires CLI output to follow the CLI Output Style Guide and use the shared output formatting toolkit so messaging is consistent.
## Issue Context
The CLI Output Style Guide (`scopes/harmony/cli/cli-output-style-guide.md`) mandates using the shared toolkit from `@teambit/cli` (`scopes/harmony/cli/output-formatter.ts`) for CLI output. This PR introduces new CLI output in both scripts without using that toolkit.
## Fix Focus Areas
- scripts/generate-e2e-timings.js[126-158]
- scripts/split-e2e-tests.js[41-96]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
… e2e_test parallelism 40
|
Code review by qodo was updated up to the latest commit ff971f3 |
| it('should fail when merging with --build due to test failures', () => { | ||
| const output = helper.command.mergeLane('dev', '--build --no-squash'); | ||
| expect(output).to.have.string('Total Snapped: 0'); | ||
| }); |
There was a problem hiding this comment.
1. Unreachable failure assertion 🐞 Bug ≡ Correctness
The test expects helper.command.mergeLane() to return output when the merge fails, but mergeLane() runs via execSync and throws on non-zero exit so the assertion on the returned output is never reached. This makes the new regression test fail (or fail to validate the intended failure mode).
Agent Prompt
### Issue description
The test `should fail when merging with --build due to test failures` calls `helper.command.mergeLane(...)` and asserts on its return value. However, `mergeLane()` ultimately uses `execSync` via `runCmd()`, so when the merge fails it throws and never returns a string, making the assertion unreachable.
### Issue Context
`CommandHelper.mergeLane()` delegates to `runCmd()`, which uses `childProcess.execSync(...)` and will throw on non-zero exit codes.
### Fix Focus Areas
- e2e/harmony/lanes/merge-lanes-edge-cases-2.e2e.ts[436-439]
### Recommended fix
Update the test to assert on the failure using one of the existing patterns used elsewhere in this repo:
- Use `helper.general.runWithTryCatch('bit lane merge ...')` (or add a `mergeLaneWithTryCatch` helper) and then assert on the returned output string.
- Or wrap the call in `expect(() => helper.command.mergeLane(...)).to.throw()` and validate the error message/output as needed.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
|
Code review by qodo was updated up to the latest commit d1c55ed |
Replaces
circleci tests split --split-by=filesizewith deterministic bin-packing based on measured per-file wall-clock times, and raises e2e_test parallelism to 40.Result: e2e job 34 min → 16 min (-53%) at ~+5% cost (463 → 487 billed machine-minutes).
Why not CircleCI's
--split-by=timings: it relies on per-testcase junit durations, but ~85% of our e2e time is in before/after hooks, which mocha attributes to no testcase — the splitter sees ~13% of the real cost. Node times ranged 9.7–32.2 min for the same job; the slowest node sets the job duration.How it works:
scripts/e2e-test-timings.json— per-file durations derived from actual CI node run times (each node is one equation: wall = overhead + sum of its files; solved with non-negative least squares).scripts/split-e2e-tests.js— greedy LPT packing; each node independently computes the same assignment and picks its share viaCIRCLE_NODE_INDEX. New files not in the manifest get the median weight.scripts/generate-e2e-timings.js— regenerates the manifest from the CircleCI public API (no token needed); run occasionally to refresh.deps-graph,custom-env,merge-lanes-edge-cases, each 11-17 min alone) were split at describe boundaries — pure moves, no test changes — since the longest single file is a hard floor on any packing.Parallelism 40 is roughly cost-neutral because CircleCI bills per container-minute and total work is constant; only per-container spin-up/cache overhead multiplies.