Nishanth Prakash
Abstract:
As enterprises adopt large language models and AI copilots, the real challenge is no longer model accuracy; it is infrastructure. Internal AI platforms must support secure multi-tenancy, low-latency streaming inference, governance controls, region-aware deployment, and operational resilience at scale.
In this talk, Nishanth Prakash shares practical lessons from building and operating large-scale AI/ML model deployment infrastructure within Oracle Cloud Infrastructure. He reframes AI infrastructure not as backend plumbing, but as a product: one that requires clear boundaries, versioned contracts, observability, feature gating, and failure-mode thinking.
The session covers real-world topics including streaming WebSocket inference, latency metrics such as Time to First Token (TTFT) and inter-token latency, secure networking models, tenancy-level feature flags, region-specific overrides, and operational lifecycle design. Attendees will learn how to move from experimental AI deployments to production-grade internal platforms that scale reliably across teams and regions.
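To make the latency metrics mentioned above concrete, the sketch below shows one way to compute Time to First Token (TTFT) and mean inter-token latency from a streaming response. This is an illustrative example, not code from the talk; the function and stream names are hypothetical.

```python
import time
from typing import Iterable, Tuple


def stream_latency_metrics(token_stream: Iterable[str]) -> Tuple[float, float]:
    """Consume a token stream; return (TTFT, mean inter-token latency) in seconds."""
    start = time.monotonic()
    ttft = 0.0
    prev = 0.0
    gaps = []
    first_seen = False
    for _ in token_stream:
        now = time.monotonic()
        if not first_seen:
            ttft = now - start        # Time to First Token: request start -> first token
            first_seen = True
        else:
            gaps.append(now - prev)   # gap between consecutive tokens
        prev = now
    mean_itl = sum(gaps) / len(gaps) if gaps else 0.0
    return ttft, mean_itl


def fake_stream():
    """Simulated model stream: first token after ~50 ms, then ~10 ms per token."""
    time.sleep(0.05)
    yield "Hello"
    for _ in range(4):
        time.sleep(0.01)
        yield "tok"
```

In a real deployment the timestamps would come from the WebSocket message handler rather than a local generator, and both metrics would typically be exported as histograms per tenant and region.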
Profile:
Nishanth Prakash is a Principal Member of Technical Staff and Engineering Lead at Oracle Cloud Infrastructure (OCI), where he builds large-scale AI/ML model deployment systems powering enterprise workloads. His work focuses on secure, high-performance inference infrastructure, private networking for AI deployments, streaming model serving, and governance-aware AI platform design.
With a background spanning software engineering leadership and venture capital, Nishanth specializes in translating research innovations such as quantized model optimization and adapter-based fine-tuning into production-ready systems. He has authored IEEE conference papers on AI systems and regularly contributes to open-source AI infrastructure projects.
Nishanth is a Senior Member of IEEE and a member of the Forbes Technology Council. He frequently speaks and mentors on enterprise AI systems, operationalizing prompt engineering, and secure AI deployment patterns. His work sits at the intersection of AI research, cloud infrastructure, and real-world enterprise scale.