[venice-common] Add TEST-ONLY store-scoped config to inject A/A DCR divergence for consistency-checker validation #2878

Open
Mohith22 wants to merge 1 commit into linkedin:main from Mohith22:mdamarap/inject-aa-bug
Conversation

@Mohith22 Mohith22 commented Jun 18, 2026

Collaborator

Problem Statement

We need confidence that our A/A consistency checker can catch real divergence between fabrics. Today there is no controlled way to produce a genuine A/A inconsistency on demand, so the checker's detection capability is effectively untested end to end. This PR injects such a bug deliberately so we can verify that the checker flags it.

Solution

Add a TEST-ONLY, store-scoped server config (server.aa.dcr.bug.injection.store.to.fabric.map) that deliberately diverges the two fabrics for a single store. On a server whose local region matches the configured fabric, the A/A DCR write timestamp is reflected (Long.MAX_VALUE - ts), which inverts "newer wins" into "older wins" on that server only. Because only one fabric inverts the ordering, the fabrics deterministically settle on different winners for the same key: a real, reproducible inconsistency that the checker should flag. The DCR merge code is untouched, since the reflection happens at a single ingestion chokepoint. The config is off by default, logs a loud WARN when active, and must never be enabled in production.
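To illustrate why reflecting the timestamp in one fabric is enough to produce divergence, here is a minimal, self-contained sketch. It is not the code in this PR: the class `DcrReflectionSketch`, the helpers `effectiveTimestamp`, `winner`, and `resolve`, the fabric names `dc-0` and `dc-1`, and the representation of the config as a `Map<String, String>` from store name to fabric are all hypothetical, and the comparison stands in for the real DCR merge logic.

```java
import java.util.Map;

public final class DcrReflectionSketch {
  // Hypothetical helper: reflect the write timestamp only when the store is
  // mapped to this server's local fabric; otherwise pass it through unchanged.
  static long effectiveTimestamp(
      Map<String, String> storeToFabric, String store, String localFabric, long ts) {
    String targetFabric = storeToFabric.get(store);
    return localFabric.equals(targetFabric) ? Long.MAX_VALUE - ts : ts;
  }

  // Stand-in for "newer wins": the value with the larger timestamp is kept.
  static String winner(String valueA, long tsA, String valueB, long tsB) {
    return tsA >= tsB ? valueA : valueB;
  }

  // Resolves two writes to the same key, v1 at ts=100 and v2 at ts=200,
  // as seen by a server in the given local fabric.
  static String resolve(Map<String, String> storeToFabric, String store, String localFabric) {
    long ts1 = effectiveTimestamp(storeToFabric, store, localFabric, 100L);
    long ts2 = effectiveTimestamp(storeToFabric, store, localFabric, 200L);
    return winner("v1", ts1, "v2", ts2);
  }

  public static void main(String[] args) {
    Map<String, String> config = Map.of("test_store", "dc-1");
    System.out.println(resolve(config, "test_store", "dc-0"));  // prints "v2"
    System.out.println(resolve(config, "test_store", "dc-1"));  // prints "v1"
    System.out.println(resolve(config, "other_store", "dc-1")); // prints "v2"
  }
}
```

In this sketch the untouched fabric keeps the newer write (`v2`), the injected fabric keeps the older one (`v1`), and a store that is not in the map behaves normally in both fabrics.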

Code changes

  • Added new code behind a config. If so list the config names and their default values in the PR description.
  • Introduced new log lines.
  • Confirmed if logs need to be rate limited to avoid excessive logging.

Concurrency-Specific Checks

Both reviewer and PR author to verify

  • Code has no race conditions or thread safety issues.
  • Proper synchronization mechanisms (e.g., synchronized, RWLock) are used where needed.
  • No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
  • Verified thread-safe collections are used (e.g., ConcurrentHashMap, CopyOnWriteArrayList).
  • Validated proper exception handling in multi-threaded code to avoid silent thread termination.

How was this PR tested?

  • New unit tests added.
  • New integration tests added.
  • Modified or extended existing tests.
  • Verified backward compatibility (if applicable).

Does this PR introduce any user-facing or breaking changes?

  • No. You can skip the rest of this section.
  • Yes. Clearly explain the behavior change and its impact.

@Mohith22 Mohith22 force-pushed the mdamarap/inject-aa-bug branch 3 times, most recently from d97b098 to 73ce32e Compare June 18, 2026 21:04
@Mohith22 Mohith22 force-pushed the mdamarap/inject-aa-bug branch 2 times, most recently from 4e89b28 to 56b63d8 Compare June 18, 2026 23:41
xunyin8 previously approved these changes Jun 18, 2026
@Mohith22 Mohith22 changed the title [venice-common] Add TEST-ONLY config key for A/A DCR bug injection [venice-common] Add TEST-ONLY store-scoped config to inject A/A DCR divergence for consistency-checker validation Jun 19, 2026
@Mohith22 Mohith22 force-pushed the mdamarap/inject-aa-bug branch from 56b63d8 to a4ff8ac Compare June 19, 2026 19:57
@Mohith22 Mohith22 force-pushed the mdamarap/inject-aa-bug branch from a4ff8ac to dd7a68a Compare June 19, 2026 20:00