Introduce a structured reporting layer that captures evaluation metadata,
timing, topology, retries, and failures without consuming result payloads,
mirroring the existing evaluator/policy architecture.
- ReportingPolicy shared core with span/run ContextVars for nested,
thread/async-local isolation
- ReportEvent model plus NoOp/InMemory/Logging/Composite/UI reporters and
a bounded UI polling buffer
- Tracing/metrics/alerts policies and OpenTelemetry tracing/metrics
integration (exposed via otel/full/develop/test extras)
- Structural reporting models and a <Vendor><Signal>Reporting{Evaluator,Model}
taxonomy with placeholder vendor classes
- Refactor LoggingEvaluator onto LoggingPolicy to share formatting and
enable LoggingModel while preserving the existing import path and log output
- Retry lifecycle events now carry run_id and child depth via
current_span_depth(); reporter failures are isolated on reporting/retry paths
- DryRunEvaluator with context-local planning guard; synthetic mode is
non-transparent so results are not cached under real-run keys; node_key
strips the dry-run evaluator layer while preserving non-evaluator options
so it matches cache_key() for the logical node
- ReportingStateStore preserves terminal outcomes while allowing retry streams
to progress
- Docs: reporting workflow, reporter options, OpenTelemetry install, reserved
run/graph phases, extra payload keys, and dry-run synthetic-result warnings
- Tests across utils/evaluators/models covering success/error flows, dry-run
override recursion, cache composition, concurrent dry-run reuse, node-key
semantics, retry event nesting, reporter failure isolation, and state folding
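
The span-depth and reporter-isolation bullets above can be illustrated with a minimal sketch. Only `current_span_depth()` is named in this PR; `_span_depth`, `span()`, and `SafeCompositeReporter` are hypothetical names chosen for illustration, and the actual implementation may differ.

```python
import contextvars
import logging
from contextlib import contextmanager
from typing import Callable, Iterator, List

logger = logging.getLogger(__name__)

# Hypothetical context-local depth counter (assumption, not the PR's code).
# A ContextVar gives each asyncio task and each context its own value.
_span_depth = contextvars.ContextVar("span_depth", default=0)


def current_span_depth() -> int:
    """Return the nesting depth of the span active in this context."""
    return _span_depth.get()


@contextmanager
def span() -> Iterator[int]:
    """Enter a nested span and restore the previous depth on exit."""
    token = _span_depth.set(_span_depth.get() + 1)
    try:
        yield _span_depth.get()
    finally:
        # reset() restores the outer value even if the body raised.
        _span_depth.reset(token)


class SafeCompositeReporter:
    """Fan an event out to several reporters, isolating their failures."""

    def __init__(self, reporters: List[Callable[[dict], None]]) -> None:
        self._reporters = list(reporters)

    def report(self, event: dict) -> None:
        for reporter in self._reporters:
            try:
                reporter(event)
            except Exception:
                # A broken reporter must not abort the evaluation or
                # prevent the remaining reporters from seeing the event.
                logger.exception("reporter failed; continuing")
```

In this sketch a retry event emitted inside two nested `span()` blocks would carry depth 2, and a reporter that raises is logged and skipped while the others still receive the event.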
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #231      +/-   ##
==========================================
+ Coverage   94.19%   94.40%   +0.21%
==========================================
  Files         150      156       +6
  Lines       12094    13196    +1102
  Branches      665      706      +41
==========================================
+ Hits        11392    12458    +1066
- Misses        570      596      +26
- Partials      132      142      +10