Srikant Sudha Panda

Breaking Dependency Chains: Evaluating Microsoft's Maia 100 as an Alternative to NVIDIA GPUs in AI Workloads

Abstract:

The rapid growth of AI has made NVIDIA GPUs indispensable for deep learning workloads. Yet as concerns over cost, supply-chain integrity, and vendor lock-in mount, alternative accelerators are moving into the spotlight. In this paper, we evaluate the Microsoft Maia 100 AI accelerator as a potential alternative to NVIDIA GPUs, specifically the A100 and H100, for large-scale AI training and inference. Three representative benchmark classes were chosen: Transformer-style models (BERT, GPT-3 variants), CNN models (ResNet-50), and recommendation models (DLRM). Experiments were run under identical batch sizes, precisions (FP16, INT8), and distributed training setups, measuring throughput (samples/sec), latency, power draw (W), thermal profile, and cost per training hour. Maia 100 proved competitive in inference, outperforming the A100 by 12% on latency-sensitive workloads while consuming 18% less power. For large language model training, Maia 100 matched the H100's convergence time but delivered 6% lower throughput. Maia 100's deep integration with Azure's AI stack enabled improved pipeline optimization and orchestration, which in turn provided a degree of hardware abstraction. These results indicate that Maia 100 is a strong candidate for organizations seeking to reduce dependence on NVIDIA without compromising performance. The paper also addresses architectural trade-offs, software compatibility (ONNX, PyTorch, TensorFlow), and deployment concerns. The findings support a hybrid AI infrastructure approach that combines Maia and NVIDIA hardware to achieve flexibility, cost efficiency, and scalability in enterprise AI deployments.
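The throughput and latency metrics described above can be illustrated with a minimal measurement harness. This is a hypothetical sketch, not the paper's actual benchmark code: the `run_batch` callable, the warm-up count, and the stand-in sleep workload are all assumptions for illustration; in a real run, `run_batch` would invoke a model forward pass on the accelerator under test.

```python
import time
import statistics

def benchmark(run_batch, batch_size, num_iters=50, warmup=5):
    """Measure throughput (samples/sec) and per-batch latency for a
    callable `run_batch` that processes one batch of `batch_size`
    samples. Hypothetical harness for illustration only."""
    # Warm-up iterations are excluded from timing (caches, JIT, clocks).
    for _ in range(warmup):
        run_batch()
    latencies = []
    for _ in range(num_iters):
        t0 = time.perf_counter()
        run_batch()
        latencies.append(time.perf_counter() - t0)
    total = sum(latencies)
    return {
        "throughput_samples_per_sec": batch_size * num_iters / total,
        "p50_latency_ms": statistics.median(latencies) * 1e3,
        "mean_latency_ms": (total / num_iters) * 1e3,
    }

# Stand-in workload: a fixed 2 ms sleep simulating one inference batch.
result = benchmark(lambda: time.sleep(0.002), batch_size=32)
```

Comparing two accelerators then reduces to running the same harness with the same batch size and precision on each device and comparing the returned dictionaries, which mirrors the matched-configuration methodology the abstract describes.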

Profile:

Srikant Panda is a seasoned cloud and hardware technology professional with over 16 years of experience spanning quality assurance, automation, DevOps, and program management. Currently serving as a Senior Technical Program Manager at Microsoft, he plays a key role in driving quality and innovation within Azure’s hardware systems and infrastructure, particularly in AI accelerator and compute silicon validation.

Throughout his career, Srikant has built a strong reputation for bridging the gap between hardware and software teams, translating complex product requirements into scalable and efficient solutions. He has been instrumental in the successful launch of advanced AI accelerators like Cobalt 100 and Maia 100, demonstrating his ability to lead high-impact, cross-functional initiatives from concept to production.

Before joining Microsoft, Srikant spent over a decade at Mindtree, where he progressed through multiple roles, gaining deep expertise in Azure infrastructure, system-level qualification, and global program delivery. His work has contributed to large-scale cloud deployments across data centers worldwide, consistently improving timelines, optimizing processes, and enhancing product quality.

Known for his collaborative leadership style, Srikant excels at managing diverse teams, working across geographies, and aligning stakeholders toward common goals. His technical foundation in C#, SQL, and PowerShell, combined with his understanding of hardware components and cloud ecosystems, allows him to approach challenges with both depth and versatility.

Driven by a passion for innovation and continuous improvement, Srikant brings a balanced mix of technical expertise, strategic thinking, and people leadership to every project he undertakes.