
Modernising AKS ingress before scaling AI API traffic.

aks, kubernetes, gateway api, networking

Microsoft announced preview support for the Kubernetes Gateway API in the Azure Kubernetes Service application-routing add-on in March 2026. On the surface this is a networking feature. In practice it matters most for teams that are about to route large volumes of AI API traffic through AKS and need ingress that can keep up.

Why Gateway API is more than an upgrade

The Kubernetes Ingress API has served the community well, but its scope is narrow. The core specification covers host- and path-based HTTP routing, TLS termination and simple load balancing; anything beyond that usually depends on controller-specific annotations. Modern applications need richer traffic management: header-based routing, traffic splitting, cross-namespace delegation, retries, timeouts and direct integration with service meshes.

The Gateway API was designed to address these gaps. It separates infrastructure concerns from application concerns, which means platform teams can manage the gateway while application teams define routes for their own services. That separation becomes important when multiple AI services, each owned by a different team, share the same cluster.
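As a rough illustration of that split, the sketch below builds a platform-owned Gateway and a team-owned HTTPRoute as plain Python dictionaries and prints them as JSON, which kubectl accepts. The namespaces, resource names, the namespace label and the gateway class name "example-gateway-class" are all hypothetical placeholders; the class name the AKS add-on actually exposes should be taken from its documentation.

```python
import json

API_VERSION = "gateway.networking.k8s.io/v1"


def shared_gateway(gateway_class):
    """Gateway owned by the platform team, in its own namespace."""
    return {
        "apiVersion": API_VERSION,
        "kind": "Gateway",
        "metadata": {"name": "shared-gateway", "namespace": "platform-ingress"},
        "spec": {
            # Hypothetical class name; substitute the one your cluster provides.
            "gatewayClassName": gateway_class,
            "listeners": [
                {
                    "name": "http",
                    "protocol": "HTTP",
                    "port": 80,
                    # Only namespaces carrying this (assumed) label may attach routes.
                    "allowedRoutes": {
                        "namespaces": {
                            "from": "Selector",
                            "selector": {
                                "matchLabels": {"shared-gateway-access": "true"}
                            },
                        }
                    },
                }
            ],
        },
    }


def team_route(gateway, team_namespace, service, port):
    """HTTPRoute owned by an application team, attached to the shared Gateway."""
    return {
        "apiVersion": API_VERSION,
        "kind": "HTTPRoute",
        "metadata": {"name": f"{service}-route", "namespace": team_namespace},
        "spec": {
            # parentRefs points across namespaces at the platform team's Gateway.
            "parentRefs": [
                {
                    "name": gateway["metadata"]["name"],
                    "namespace": gateway["metadata"]["namespace"],
                }
            ],
            "rules": [{"backendRefs": [{"name": service, "port": port}]}],
        },
    }


gateway = shared_gateway("example-gateway-class")
route = team_route(gateway, "team-inference", "model-server", 8080)
print(json.dumps([gateway, route], indent=2))
```

The platform team controls who may attach through allowedRoutes, while each service team edits only the HTTPRoute in its own namespace.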

The AKS application-routing angle

The AKS application-routing add-on simplifies ingress by managing controllers, DNS integration and certificate handling automatically. Adding Gateway API support to that add-on lowers the operational burden of adopting the newer standard. Teams do not have to install and maintain a separate Gateway controller; they can use the add-on they already have.

For AI workloads, this is timely. Inference services often need sophisticated routing: canary deployments for model versions, A/B tests for prompt strategies, header-based routing for tenant isolation, and weighted traffic shifts during updates. The Gateway API expresses these patterns more cleanly than the older Ingress API.
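Two of those patterns, a weighted canary between model versions and header-based routing for one tenant, can be sketched as a single HTTPRoute. The example below builds the manifest as a Python dictionary; the service names, the x-tenant-id header, the namespaces and the port are illustrative assumptions rather than anything prescribed by AKS.

```python
import json


def weighted_backends(stable, canary, canary_percent, port):
    """Split traffic between two Services using relative weights."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return [
        {"name": stable, "port": port, "weight": 100 - canary_percent},
        {"name": canary, "port": port, "weight": canary_percent},
    ]


def inference_route(canary_percent):
    """HTTPRoute with a tenant-specific rule and a weighted default rule."""
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": "inference", "namespace": "team-inference"},
        "spec": {
            "parentRefs": [
                {"name": "shared-gateway", "namespace": "platform-ingress"}
            ],
            "rules": [
                {
                    # Requests carrying this (assumed) tenant header go to a
                    # dedicated backend.
                    "matches": [
                        {
                            "headers": [
                                {
                                    "type": "Exact",
                                    "name": "x-tenant-id",
                                    "value": "tenant-a",
                                }
                            ]
                        }
                    ],
                    "backendRefs": [{"name": "model-tenant-a", "port": 8080}],
                },
                {
                    # Everything else is split between the two model versions.
                    "backendRefs": weighted_backends(
                        "model-v1", "model-v2", canary_percent, 8080
                    ),
                },
            ],
        },
    }


print(json.dumps(inference_route(10), indent=2))
```

Shifting the canary from 10 to 50 percent is then a one-value change to the route, with no controller-specific annotations involved.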

Why now, before scaling

It is tempting to defer ingress modernisation until traffic grows. That is usually a mistake. Changing routing architecture under load is harder and riskier than doing it while traffic is manageable. AI API traffic also has distinctive patterns: large request and response payloads, unpredictable burstiness, strict latency requirements and high fan-out to downstream services. An ingress layer that works for conventional web traffic may become a bottleneck or a source of timeouts when those patterns arrive.
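Timeouts are one place where those patterns show up in configuration. The Gateway API defines optional per-rule timeouts on HTTPRoute (request and backendRequest); whether a given controller, including the AKS preview, honours them is something to verify during evaluation. The helper below is a minimal sketch that builds such a rule, with hypothetical service names and values.

```python
import json


def rule_with_timeouts(service, port, request_seconds, backend_seconds):
    """Build an HTTPRoute rule with explicit timeouts for slow inference calls."""
    if request_seconds <= 0 or backend_seconds <= 0:
        raise ValueError("timeouts must be positive")
    # The per-attempt backend timeout should not exceed the overall request timeout.
    if backend_seconds > request_seconds:
        raise ValueError("backend timeout cannot exceed request timeout")
    return {
        "backendRefs": [{"name": service, "port": port}],
        "timeouts": {
            "request": f"{request_seconds}s",
            "backendRequest": f"{backend_seconds}s",
        },
    }


# Illustrative values only; real limits depend on model latency.
print(json.dumps(rule_with_timeouts("model-server", 8080, 120, 60), indent=2))
```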

The preview status is also relevant. Preview features are useful for testing and planning, but they should not carry production workloads until they reach general availability. The right move is to evaluate the Gateway API support in a non-production environment, identify which existing routes would benefit, and prepare a migration plan.

Questions to ask

  • Which Ingress resources would be cleaner or more capable as Gateway API routes?
  • Do any of our AI services already need header-based, weight-based or cross-namespace routing?
  • Who owns the gateway configuration: platform engineering, individual service teams, or both?
  • What is our rollback plan if a Gateway API route behaves differently in production?
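One way to start on the first question is to look for Ingress resources that already lean on controller-specific annotations, since those are often the routes that Gateway API can express natively. The sketch below assumes an NGINX-based controller and uses the ingress-nginx annotation prefix as a heuristic; the input is the list of Ingress objects, for example the items array from kubectl get ingress -A -o json. The sample data is invented.

```python
NGINX_PREFIX = "nginx.ingress.kubernetes.io/"


def migration_candidates(ingresses):
    """Map 'namespace/name' to the NGINX-specific annotations each Ingress uses."""
    candidates = {}
    for ingress in ingresses:
        meta = ingress.get("metadata", {})
        annotations = meta.get("annotations") or {}
        hits = sorted(key for key in annotations if key.startswith(NGINX_PREFIX))
        if hits:
            name = f'{meta.get("namespace", "default")}/{meta.get("name", "")}'
            candidates[name] = hits
    return candidates


# Invented sample input standing in for real cluster data.
sample = [
    {
        "metadata": {
            "name": "chat-api",
            "namespace": "team-inference",
            "annotations": {
                "nginx.ingress.kubernetes.io/canary": "true",
                "nginx.ingress.kubernetes.io/canary-weight": "10",
            },
        }
    },
    {"metadata": {"name": "docs", "namespace": "web"}},
]

for name, annotations in migration_candidates(sample).items():
    print(name, annotations)
```

An Ingress with no such annotations may still be worth migrating; the list is a starting point for review, not a verdict.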

The bottom line

Gateway API support in AKS application-routing is a sensible modernisation step, not an urgent disruption. For organisations building AI APIs on Kubernetes, it is worth adopting deliberately before traffic scales. Routing is rarely the most exciting part of an AI platform, but it is one of the first places where growth exposes architectural debt.
