
Mr. Faiz Gouri
From Detection to Decision: Operationalizing Machine Learning for Network Anomaly Detection at Enterprise Scale
Abstract:
Modern cloud infrastructures demand anomaly detection capabilities that exceed traditional rule-based monitoring. This keynote presents insights from implementing machine-learning-based anomaly detection across Microsoft's global network infrastructure. Drawing on my experience developing ML models for telemetry analysis, I'll share the journey from proof of concept to production deployment, focusing on practical challenges rarely discussed in academic settings. I'll examine how we reconciled theoretical ML concepts with engineering constraints, addressing data quality management, feature engineering for network metrics, and the trade-off between false positives and detection sensitivity. The presentation will highlight architectural considerations for processing high-volume telemetry streams and for transforming model outputs into actionable insights for network engineers. Through concrete examples and metrics, attendees will gain practical knowledge for implementing ML systems that adapt to evolving network behaviors while delivering measurable improvements in outage prevention and mean time to detection. The presentation concludes with perspectives on emerging techniques, offering a roadmap for organizations seeking to enhance network monitoring through applied machine learning.