Building an ETL Pipeline to Load Data Incrementally from Office365 to S3 using ADF and Databricks
by Yi Ai, November 19th, 2021

Too Long; Didn't Read
In this post, we will look at creating an Azure Data Factory pipeline that loads Office 365 event data incrementally to an AWS S3 bucket, using the change data capture (CDC) information exposed by the Change Data Feed (CDF) of a Delta Lake table.

What we'll cover:

Create an ADF pipeline that loads Calendar events from Office 365 to a Blob container.

Run a Databricks Notebook activity in the ADF pipeline to transform the extracted Calendar events and merge them into a Delta Lake table.
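Before diving into the ADF and Databricks setup, the core incremental-loading idea can be sketched in plain Python: keep a checkpoint of the last change-feed version processed, and on each run push only the changes newer than that checkpoint to the sink. This is a minimal, framework-free sketch of the CDC pattern; the names `change_log`, `sink`, and `checkpoint` are illustrative stand-ins, not actual ADF, Delta Lake, or S3 APIs.

```python
# Sketch of incremental (CDC-style) loading: only changes recorded after
# the last processed "version" are pushed to the sink. In the real pipeline
# the change log would be a Delta table's Change Data Feed and the sink an
# S3 bucket; here they are a list of dicts and a plain list.

def load_incrementally(change_log, sink, checkpoint):
    """Append all changes newer than checkpoint['version'] to sink;
    return the updated checkpoint."""
    last = checkpoint.get("version", -1)
    new_changes = [c for c in change_log if c["version"] > last]
    for change in new_changes:
        sink.append(change["row"])  # e.g. an S3 upload in the real pipeline
    if new_changes:
        checkpoint["version"] = max(c["version"] for c in new_changes)
    return checkpoint

# Two runs over a growing change log: the second run only picks up event-C.
log = [{"version": 0, "row": "event-A"}, {"version": 1, "row": "event-B"}]
s3_like_sink, ckpt = [], {}
ckpt = load_incrementally(log, s3_like_sink, ckpt)  # loads event-A, event-B
log.append({"version": 2, "row": "event-C"})
ckpt = load_incrementally(log, s3_like_sink, ckpt)  # loads only event-C
```

The checkpoint plays the same role as the version bookmark the pipeline keeps when reading a Delta table's change feed: it makes each run idempotent with respect to already-delivered changes.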