Hi Janice Chi
Thanks for posting your query.
Reconciling real-time CDC data with a live DB2 source can’t guarantee accuracy because rows can change after the event is captured but before it lands in Hyperscale. In most CDC implementations, real-time reconciliation is done between CDC payload and the target, not with the live source.
A few points based on what you have shared:
- Using `JRN_TIMESTAMP` for snapshot-based reconciliation is a common practice. You can define soft windows (e.g., "up to time T") and compare only the merged data in Hyperscale against the staged CDC events in that window.
- Since DB2 is live, comparing values at runtime against the source will give inconsistent results. Instead, validate:
  - CDC completeness (i.e., no drops or duplicates)
  - Merge correctness (row-count or hash comparison)
- If your CDC stream doesn't include `txn_id` and commit timestamps are all you have, that's fine - just make sure they're used consistently to bracket the window.
- There's no out-of-the-box commit-aware reconciliation in ADF or Azure SQL today. This logic usually lives in your framework - and from earlier threads, it looks like you already have control tables tracking offsets and SHA checks, so you're on the right path.
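To make the windowed checks concrete, here is a minimal Python sketch of one reconciliation pass over a soft window. The column names (`jrn_timestamp`, `id`, `val`) and the in-memory dict/list shapes are assumptions for illustration - in a real framework these would be query results from your staging table and the Hyperscale target, and you would also account for operation types (insert/update/delete), which this sketch omits for brevity:

```python
import hashlib
from datetime import datetime

def row_hash(row, cols):
    # Stable SHA-256 over the chosen columns; column order must match
    # on both the CDC side and the target side.
    return hashlib.sha256("|".join(str(row[c]) for c in cols).encode()).hexdigest()

def reconcile_window(cdc_events, target_rows, key_col, cols, t_start, t_end):
    """Validate one soft window [t_start, t_end):
    - completeness: no duplicate (key, journal timestamp) events
    - merge correctness: latest event per key matches the merged target row
    """
    in_window = [e for e in cdc_events
                 if t_start <= e["jrn_timestamp"] < t_end]

    # Completeness: flag duplicated events inside the window.
    seen, duplicates = set(), []
    for e in in_window:
        k = (e[key_col], e["jrn_timestamp"])
        if k in seen:
            duplicates.append(k)
        seen.add(k)

    # Merge correctness: the last event per key (by journal timestamp)
    # should hash-match the merged row in Hyperscale.
    latest = {}
    for e in sorted(in_window, key=lambda e: e["jrn_timestamp"]):
        latest[e[key_col]] = e
    mismatches = [k for k, e in latest.items()
                  if k not in target_rows
                  or row_hash(target_rows[k], cols) != row_hash(e, cols)]
    return duplicates, mismatches
```

Because the comparison is bracketed by the journal timestamps rather than the live source, it stays stable even while DB2 keeps changing.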
As for real-time vs. catch-up: most customers run strict reconciliation during catch-up (bounded) and limit real-time checks to Hyperscale ingestion consistency only.
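A lightweight ingestion-consistency check of that kind can be as simple as comparing, per window, the event count your control table says was delivered against the row count actually ingested into Hyperscale. The control-table shape below is hypothetical - adapt it to whatever offsets your framework already tracks:

```python
def inconsistent_windows(control_counts, ingested_counts):
    """Return the windows whose ingested row count differs from the
    delivered-event count recorded in the control table.
    Both arguments map a window label to an integer count."""
    return [w for w, expected in control_counts.items()
            if ingested_counts.get(w, 0) != expected]
```

Running this continuously is cheap, and anything it flags can then be escalated to the heavier hash-based reconciliation during catch-up.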
Also - we'd really appreciate it if you could mark any earlier answers that helped as Accepted, so the thread stays clean and useful for others. And if future queries are unrelated, posting them as new threads helps us give more targeted replies.
I hope this information helps. Please do let us know if you have any further queries.
Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.