Hello!
Thank you for posting on Microsoft Learn.
Verify that the source tables in SQL Server have CDC enabled:
-- Enable CDC at the database level
EXEC sys.sp_cdc_enable_db;
-- Enable CDC for a specific table
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'MyTable',
@role_name = NULL,
@supports_net_changes = 1; -- required if you want net changes; needs a PK or unique index
This creates the change tables and query functions that CDC uses to expose deltas (note: this is Change Data Capture, not the separate Change Tracking feature).
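You can confirm CDC is active before wiring up ADF. This is a quick check using the standard catalog views and CDC helper procedure (the table name dbo.MyTable is the example from above):

```sql
-- Is CDC enabled on the database?
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = DB_NAME();

-- Is the table tracked by CDC?
SELECT name, is_tracked_by_cdc
FROM sys.tables
WHERE name = N'MyTable';

-- List capture instances and their metadata (start LSN, index columns, etc.)
EXEC sys.sp_cdc_help_change_data_capture;
```

The capture instance name reported here (by default `dbo_MyTable`) is what you will select in the ADF source.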
Then create a Linked Service in ADF for your on-premises SQL Server using your SHIR, and another for your storage account (ADLS Gen2 or standard Blob Storage, depending on your setup).
In ADF, create a new Mapping Data Flow and add a CDC-enabled source:
- Choose your SQL Server linked service.
- Select "Change Data Capture" as the source type.
- Select the correct capture instance (created by enabling CDC).
- Choose between all changes (every insert/update/delete) or net changes (one deduplicated row per key with the final state).
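You can see the difference between the two modes by querying the CDC table-valued functions directly. This sketch assumes the default capture instance name `dbo_MyTable` and that the table was enabled with `@supports_net_changes = 1`:

```sql
-- LSN range to read: everything captured so far
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_MyTable');
DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

-- All changes: one row per change; __$operation is 1=delete, 2=insert, 4=update (after image)
SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_MyTable(@from_lsn, @to_lsn, N'all');

-- Net changes: one row per key reflecting the final state in the interval
SELECT *
FROM cdc.fn_cdc_get_net_changes_dbo_MyTable(@from_lsn, @to_lsn, N'all');
```

Net changes is usually what you want for loading a queryable copy; all changes is better if you need a full audit trail downstream.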
Then add a sink pointing to your Blob Storage, where you can write Parquet, JSON, or CSV and partition by date or primary key as needed.
You can optionally use a derived column or filter to transform or enrich the data before writing.
If you don’t need a real-time stream, create a pipeline that runs the Data Flow on a Tumbling Window Trigger with a recurrence of 15 or 30 minutes. Enable a dependency on the previous window to avoid overlapping runs, and configure watermarking using a field such as __$start_lsn or __$seqval.
To avoid re-reading data, use a parameterized watermark (__$start_lsn or a timestamp column): store the last successfully processed value in a metadata table or file, and pass it to the pipeline as a parameter on the next run.
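One way to persist that watermark is a small metadata table on the SQL Server side, which ADF can read with a Lookup activity at the start of each run. The table and column names below are illustrative, not a fixed convention:

```sql
-- Illustrative metadata table holding the last processed LSN per source table
CREATE TABLE dbo.cdc_watermark (
    table_name sysname     NOT NULL PRIMARY KEY,
    last_lsn   binary(10)  NOT NULL,
    updated_at datetime2   NOT NULL DEFAULT SYSUTCDATETIME()
);

-- Seed the row once, starting from the capture instance's minimum LSN
INSERT INTO dbo.cdc_watermark (table_name, last_lsn)
VALUES (N'dbo.MyTable', sys.fn_cdc_get_min_lsn('dbo_MyTable'));

-- After a successful pipeline run, advance the watermark to the LSN you read up to
UPDATE dbo.cdc_watermark
SET last_lsn = sys.fn_cdc_get_max_lsn(),
    updated_at = SYSUTCDATETIME()
WHERE table_name = N'dbo.MyTable';
```

On the next run, the Lookup feeds `last_lsn` into the Data Flow as a parameter so only newer changes are read.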