Koteswara Rao Chirumamilla
Metadata-Driven Unified Data Ingestion in the Cloud
Abstract:
This presentation introduces a cloud-agnostic, metadata-driven Unified Data Ingestion (UDI) Framework designed to streamline and automate data flow across heterogeneous systems in modern cloud environments. Traditional ETL approaches often suffer from repetitive coding, vendor lock-in, and limited scalability. The proposed framework adopts a configuration-first ingestion model that supports diverse sources and targets, including files, APIs, message queues, and databases. Built on distributed processing engines such as Apache Beam or Apache Spark and orchestrated through workflow automation, the framework enables standardized batch and streaming ingestion. Pipelines are generated dynamically from metadata configurations, which eliminates manual scripting and reduces deployment time. Auditing, logging, dependency management, and secure access mechanisms are integrated to ensure observability, compliance, and data protection. Performance evaluations across representative scenarios show a 35–40% reduction in ingestion time, up to a 60% reduction in development effort, and approximately 30% cost savings through automation and reusability.
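As an illustration of the configuration-first idea described above, the following is a minimal sketch in plain Python, not the UDI Framework's actual implementation. It assumes a hypothetical metadata record with the fields `name`, `source_type`, `target_type`, and `columns`, and uses hypothetical in-memory reader and writer functions in place of a real engine such as Apache Beam or Apache Spark. The point it shows is that a new feed is onboarded by adding metadata, while the pipeline code stays unchanged.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


@dataclass(frozen=True)
class IngestionConfig:
    """Hypothetical metadata record describing one ingestion feed."""
    name: str
    source_type: str
    target_type: str
    columns: List[str]


# Registries map a source/target type named in metadata to a handler.
# A real framework would register file, API, queue, and database handlers.
READERS: Dict[str, Callable[[IngestionConfig], Iterable[Record]]] = {}
WRITERS: Dict[str, Callable[[IngestionConfig, Iterable[Record]], int]] = {}

# In-memory stand-ins for an external source and target (assumptions).
IN_MEMORY_SOURCE: List[Record] = [
    {"id": 1, "name": "a", "extra": "x"},
    {"id": 2, "name": "b", "extra": "y"},
]
IN_MEMORY_TARGET: List[Record] = []


def read_memory(cfg: IngestionConfig) -> Iterable[Record]:
    return iter(IN_MEMORY_SOURCE)


def write_memory(cfg: IngestionConfig, records: Iterable[Record]) -> int:
    count = 0
    for record in records:
        IN_MEMORY_TARGET.append(record)
        count += 1
    return count


READERS["memory"] = read_memory
WRITERS["memory"] = write_memory


def build_pipeline(meta: dict) -> Callable[[], int]:
    """Turn one metadata entry into a runnable pipeline."""
    cfg = IngestionConfig(**meta)
    if cfg.source_type not in READERS:
        raise ValueError(f"unknown source type: {cfg.source_type}")
    if cfg.target_type not in WRITERS:
        raise ValueError(f"unknown target type: {cfg.target_type}")
    reader = READERS[cfg.source_type]
    writer = WRITERS[cfg.target_type]

    def run() -> int:
        # Keep only the columns listed in the metadata, then write them out.
        projected = ({c: rec.get(c) for c in cfg.columns} for rec in reader(cfg))
        return writer(cfg, projected)

    return run


if __name__ == "__main__":
    pipeline = build_pipeline({
        "name": "customers_feed",
        "source_type": "memory",
        "target_type": "memory",
        "columns": ["id", "name"],
    })
    print(pipeline(), "records ingested")
```

In this sketch, supporting an additional source or target means registering one more handler, and every feed that names that type in its metadata can use it without further code changes.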
Profile:
I am a Lead Data Engineer and AI Systems Architect with over fourteen years of experience designing and delivering large-scale, cloud-native data platforms and intelligent analytics systems. I currently work with Albertsons Companies, where I lead the architecture and development of enterprise data engineering solutions supporting complex retail ecosystems.
My expertise includes metadata-driven data ingestion, distributed data processing, real-time and batch analytics, cloud platforms, and automation frameworks. I have played a key role in designing unified data ingestion and governance solutions that significantly reduced development effort, improved scalability, and delivered substantial cost savings through automation and reusability.
My work bridges industry practice and applied research, with a strong focus on building resilient, scalable, and compliance-ready data platforms. In addition to industry leadership, I actively contribute to scholarly research, peer review for international journals, and mentoring initiatives for students and early-career professionals. I am particularly interested in advancing modern data engineering paradigms that simplify complex pipelines while enabling organizations to respond rapidly to evolving business and technology demands.