
FMCG company elevates data quality through the implementation of advanced data engineering best practices within its established data pipeline

A multinational FMCG company wanted support for its existing Azure Data Factory (ADF) ETL process, covering both maintenance of the current process and updates and additions to the system. Data quality and data availability were its main focus areas. Data was stored in Azure Data Lake and transformed following the medallion architecture in Azure Synapse Analytics (Azure Databricks).


Data Processing


credits-Microsoft

Current System and Challenges


Data from all countries was consolidated in the Data Lake using Azure Databricks, with storage following the medallion architecture. The data came from logs, flat files and business applications, and the existing pipeline had a separate transformation for each data source. New markets/locations had started sending data, and these needed to be integrated into the current system.

The challenges were:

     1. Performing data quality checks on the new data.

     2. Making sure the new data is available for reporting the next day.

     3. Purging/masking sensitive data.


Solution Provided and its Impact 


We first spent time understanding the existing system. We then implemented a scalable, reusable solution that brought the new source files into the existing data pipeline, and worked through each of the challenges:

     1. Enhanced Data Quality: Created an Azure Databricks notebook to identify aberrations in the data files and added logic to handle them as the data flows in.

     2. Next-Day Reporting: Troubleshot any issues in the daily pipeline run on the same day, so that the data was ready for reporting the next day.

     3. Data Security Measures: Since the data was available in multiple layers, we implemented a process to verify that the purge had happened across those layers. Personal sensitive data was masked.
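
As an illustration of the data quality and masking steps above, here is a minimal sketch in plain Python. It is not the notebook used in the project: a Databricks notebook would more likely work on Spark DataFrames. The field names, the validation rules, the salt and the helper names (find_issues, mask, mask_row, process) are all hypothetical, chosen only to show the idea of flagging aberrant records as they arrive and masking personal fields before the data moves to the next layer.

```python
import hashlib
from datetime import datetime

# Hypothetical schema for one incoming market file; the real column
# names and rules are not given in the case study.
REQUIRED_FIELDS = ("market", "order_date", "amount", "customer_email")
SENSITIVE_FIELDS = ("customer_email",)


def find_issues(row: dict) -> list:
    """Return a list of data quality problems found in one record."""
    issues = []
    for field in REQUIRED_FIELDS:
        value = row.get(field)
        if value is None or not str(value).strip():
            issues.append(f"missing {field}")
    if row.get("order_date") not in (None, ""):
        try:
            datetime.strptime(row["order_date"], "%Y-%m-%d")
        except (TypeError, ValueError):
            issues.append("bad order_date")
    if row.get("amount") not in (None, ""):
        try:
            if float(row["amount"]) < 0:
                issues.append("negative amount")
        except (TypeError, ValueError):
            issues.append("non-numeric amount")
    return issues


def mask(value: str, salt: str = "demo-salt") -> str:
    """Replace a sensitive value with a truncated salted SHA-256 digest."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:16]


def mask_row(row: dict) -> dict:
    """Return a copy of the record with non-empty sensitive fields masked."""
    masked = dict(row)
    for field in SENSITIVE_FIELDS:
        if masked.get(field):
            masked[field] = mask(str(masked[field]))
    return masked


def process(rows: list) -> tuple:
    """Split rows into clean records and quarantined records.

    Both outputs are masked, so raw personal data does not travel
    further down the pipeline even for rejected rows.
    """
    clean, quarantined = [], []
    for row in rows:
        issues = find_issues(row)
        masked = mask_row(row)
        if issues:
            quarantined.append({**masked, "issues": issues})
        else:
            clean.append(masked)
    return clean, quarantined
```

Calling process(rows) on a list of dictionaries returns the clean records and the quarantined ones, each quarantined record carrying the list of issues found. Sending bad rows to a quarantine set rather than dropping them is one possible design. It keeps the daily run moving while leaving a trail for same-day troubleshooting.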



