Building on the proof of concept (POC), we built a data ingestion framework on Databricks and AWS, using the medallion data architecture for credit card and loan data. Data pipelines connected to the source databases and ingested data from multiple sources. This made ingestion scalable and flexible, and it streamlined the process so that data stayed consistent across sources.
We enabled change data capture (CDC) with Debezium to capture and propagate data changes in real time. The Confluent platform ingested the data and saved it to S3 buckets. Databricks extracted the data from the S3 location, and Delta Live Tables delivered it to the Enterprise Data Warehouse. The solution could also generate reports based on various business conditions from data sources such as MariaDB and partner data in S3. As a result, the automated reports generated across multiple business areas, such as credit cards and embedded finance, could be securely transferred over SFTP to the bank’s S3 location far more seamlessly.
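To make the change data capture step more concrete, the following is a minimal sketch in plain Python of how a stream of Debezium-style change events can be folded into the current state of a table. It is an illustration only, not the project's pipeline code, which ran on Databricks. The function name `apply_change_events`, the key column `id`, and the sample `credit_limit` field are hypothetical. The events are simplified to the `op`, `before`, and `after` fields of the Debezium envelope; real events carry additional metadata.

```python
from typing import Any, Dict, Iterable, Optional

Row = Dict[str, Any]


def apply_change_events(
    events: Iterable[dict],
    key: str = "id",  # hypothetical primary-key column
    state: Optional[Dict[Any, Row]] = None,
) -> Dict[Any, Row]:
    """Fold simplified Debezium-style events into a table keyed by `key`.

    Supported `op` codes: "c" (create), "r" (snapshot read),
    "u" (update), and "d" (delete).
    """
    table: Dict[Any, Row] = dict(state or {})
    for event in events:
        op = event["op"]
        if op in ("c", "r", "u"):
            # Inserts, snapshot reads, and updates carry the new row in "after".
            row = event["after"]
            table[row[key]] = row
        elif op == "d":
            # Deletes carry the removed row in "before".
            table.pop(event["before"][key], None)
        else:
            raise ValueError(f"unsupported op code: {op!r}")
    return table


# Sample events with made-up values, in the order they were captured.
sample_events = [
    {"op": "c", "before": None, "after": {"id": 1, "credit_limit": 1000}},
    {"op": "c", "before": None, "after": {"id": 2, "credit_limit": 500}},
    {"op": "u", "before": {"id": 1, "credit_limit": 1000},
     "after": {"id": 1, "credit_limit": 1500}},
    {"op": "d", "before": {"id": 2, "credit_limit": 500}, "after": None},
]

current_state = apply_change_events(sample_events)
print(current_state)
```

Applying events strictly in capture order matters here: the same key can appear many times, and only the last operation for a key determines whether the row exists and what it contains.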