
AI network deployments: Understanding the performance challenges


The widespread use of generative artificial intelligence (GenAI) and large language models (LLMs) means that many enterprises will soon have to reconfigure their network architectures.

Those networks must become less centralized and more reliant upon edge servers while maximizing efficiency and minimizing latency. Architectures using secure access service edge (SASE) models, especially those tailored to the huge demands of GenAI, will be best poised to adapt to this new networking paradigm.

"When users are accessing [graphics processing units as a service] or when you're doing data transactions with LLMs, there's a ton of information that needs to go back and forth," said Renuka Nadkarni, Chief Product Officer at Aryaka, in a recent interview with SC Media. "And you can saturate your wideband just by doing that."

The challenges of large-scale AI networking

The rapid growth of AI is undeniable. In May, graphics processing unit (GPU) leader Nvidia, which has profited greatly from the AI boom, reported in its first-quarter fiscal 2025 results that its data-center revenue had grown a staggering 427% year-over-year.

"Current networks were never designed for the demands of AI," wrote well-known industry analyst Zeus Kerravala in a recent posting.

Kerravala added that processors, networks and storage will all need to evolve to adapt to the AI world. 

"The processor evolution is evident as the graphics processing unit (GPU) is now at the center of AI strategies," he said. "Consequently, Nvidia has left its once formidable rival, Intel, in the dust."

The network demands of GenAI and LLM modeling are not the same as those of cloud computing and other well-understood use cases. Networks that carry AI traffic need to be especially decentralized and reliant upon edge servers and regional points of presence (PoPs).

Those edge servers will primarily be used for processing, Usman Javaid and Bruno Zerbib of the Orange Group wrote in a recent piece in TM Forum.

"There is no 'caching' in GenAI; the content is dynamically generated for every request," Javaid and Zerbib wrote. "AI pushes boundaries but can also clog networks. Scaling GenAI deployment requires network and compute reconfigurations for balancing centralized training and rapid edge AI inference."

Massive demands

Unlike the steady traffic that flows between web and mobile apps and cloud servers, AI has stop-start, intermittent traffic demands that can be massive, two-way -- and costly.

"All of a sudden, you have a lot of traffic going in and out into those instances. The data transfer rates go up," said Klaus Schwegler, Senior Director of Product Marketing with Aryaka. "How do you cope with that, especially if you deal with an AI workload sitting in a public cloud, and you have to pay for egress costs?"

Worse, Schwegler said, the congestion caused by AI traffic hogging the bandwidth will crowd out other applications.

"You'll have those more massive data-transfer-rate requirements," he explained. "Because you want to access an AI application, you might suffer from performance degradation and issues for your other applications that are equally business critical."
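To put the egress-cost question Schwegler raises in rough numbers, here is a back-of-the-envelope sketch. The function name, the traffic volumes and the per-gigabyte price are all illustrative assumptions, not figures from Aryaka or from any cloud provider's price list.

```python
def monthly_egress_cost(gb_per_request: float, requests_per_day: int,
                        price_per_gb: float, days: int = 30) -> float:
    """Estimate monthly spend on data leaving a public cloud."""
    total_gb = gb_per_request * requests_per_day * days
    return total_gb * price_per_gb


# Hypothetical workload: 0.5 GB returned per request, 2,000 requests a day,
# and an assumed price of $0.09 per GB of egress.
cost = monthly_egress_cost(gb_per_request=0.5, requests_per_day=2000,
                           price_per_gb=0.09)
print(f"Estimated monthly egress: ${cost:,.2f}")
```

The point is only that the bill scales linearly with both payload size and request count, so a bursty AI workload can move the total quickly.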

Aryaka's Nadkarni said that many enterprises are already moving to retrieval augmented generation (RAG), which she calls "the next phase of GenAI applications."

Amazon defines RAG as "the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response."
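A minimal sketch of that definition may help show why RAG adds network traffic: each request first fetches material from an external knowledge base, then sends it along with the question to the model. The word-overlap scoring, the function names and the sample documents below are toy assumptions chosen for brevity; production systems typically use vector search, and no model call is made here.

```python
def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many lowercase words they share with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved context to the question before it goes to an LLM."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base that lives outside the model's training data.
KNOWLEDGE_BASE = [
    "The Frankfurt PoP serves European branch offices.",
    "Egress fees apply to data leaving the public cloud.",
]

prompt = build_prompt("Which PoP serves European offices?", KNOWLEDGE_BASE)
print(prompt)
```

In a real deployment the retrieval step is itself a network round trip, which is one reason the traffic pattern differs from a plain prompt-and-response exchange.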

"RAG will absolutely raise the bar on the converged networking and security requirements needed for these applications to deliver value to organizations," said Nadkarni.

This combination of high but unpredictable two-way bandwidth demands, low latency requirements, rapid edge processing and perfect reliability means moving away from the centralized cloud model.

"Real-time AI applications mimicking human decision-making processes require fast model inference which is infeasible with cloud-based architectures," wrote Javaid and Zerbib. "Future networks must expand cloud-centric architectures toward the edge, bringing LLM closer to data sources, enabling low-latency inference, improving data transfer by processing data locally, while maintaining user data privacy."

"How the network is built to support AI-enabled applications needs revisiting urgently," they added. "But are we ready?"

Tailoring SASE for AI

Aryaka might have a solution. Its bundled SASE offering, which it calls Unified SASE as a Service, is designed to deliver high-bandwidth traffic to points of presence worldwide using Aryaka's private backbone while guaranteeing low latency and constant throughput.

An optional add-on called AI>Perform optimizes network performance for AI applications.

"Unified SASE as a Service, first and foremost, addresses really major aspects when it comes to deploying for enterprises around networking, security and observability, all integrated into that solution," said Aryaka's Schwegler. "AI>Perform is a focused approach on the networking piece and the networking performance aspect of that service delivery."

There's no magic in this, Schwegler said. Aryaka's Unified SASE as a Service just intelligently uses standard networking tools to maximize efficiency for GenAI.

"How we do it is by using existing integrated technology, by traffic shaping, by ensuring WAN optimization for our workloads, quality-of-service settings, using capabilities like deduplication and compression," he said. "[We use] our failover redundant links in order to ensure that we can optimize the traffic for such AI applications as well as AI workloads that can be distributed around the world."
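Two of the techniques Schwegler names, deduplication and compression, can be illustrated with a short sketch. This is a generic, simplified version under stated assumptions (fixed-size chunks, SHA-256 fingerprints, zlib compression, and hypothetical function names); it is not a description of how Aryaka implements either feature.

```python
import hashlib
import zlib

CHUNK_SIZE = 1024  # bytes; an arbitrary size picked for the demo


def encode(payload: bytes, cache: dict[bytes, bytes]) -> list[tuple[str, bytes]]:
    """Send a 32-byte fingerprint for chunks already seen, compressed data otherwise."""
    messages = []
    for start in range(0, len(payload), CHUNK_SIZE):
        chunk = payload[start:start + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).digest()
        if fingerprint in cache:
            messages.append(("ref", fingerprint))
        else:
            cache[fingerprint] = chunk
            messages.append(("data", zlib.compress(chunk)))
    return messages


def decode(messages: list[tuple[str, bytes]], cache: dict[bytes, bytes]) -> bytes:
    """Rebuild the payload on the receiving side, filling the cache as data arrives."""
    parts = []
    for kind, body in messages:
        if kind == "ref":
            parts.append(cache[body])
        else:
            chunk = zlib.decompress(body)
            cache[hashlib.sha256(chunk).digest()] = chunk
            parts.append(chunk)
    return b"".join(parts)


# Four identical chunks followed by one different chunk.
payload = b"A" * (CHUNK_SIZE * 4) + b"B" * CHUNK_SIZE
sender_cache: dict[bytes, bytes] = {}
receiver_cache: dict[bytes, bytes] = {}

messages = encode(payload, sender_cache)
wire_bytes = sum(len(body) for _, body in messages)
restored = decode(messages, receiver_cache)
print(f"original: {len(payload)} bytes, on the wire: {wire_bytes} bytes")
```

Repeated chunks cross the link as short references and new chunks cross it compressed, which is the general idea behind WAN optimization for repetitive traffic.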

Schwegler explained that because Aryaka delivers traffic to PoPs via its own backbone, which it calls Private Core Network, it can monitor which applications are using the backbone, and steer traffic and reallocate bandwidth accordingly.
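One common way to reallocate bandwidth among competing applications is weighted max-min fair sharing, sketched below. The article does not say which algorithm Aryaka uses, so treat this as a generic illustration; the application names, demands, weights and link capacity are hypothetical.

```python
def allocate_bandwidth(capacity_mbps: float,
                       demands: dict[str, float],
                       weights: dict[str, float]) -> dict[str, float]:
    """Weighted max-min fair sharing: no application receives more than it
    demands, and unused share is redistributed to the others by weight."""
    allocation = {app: 0.0 for app in demands}
    active = set(demands)
    remaining = capacity_mbps
    while active and remaining > 1e-9:
        total_weight = sum(weights[app] for app in active)
        satisfied = set()
        spent = 0.0
        for app in active:
            share = remaining * weights[app] / total_weight
            grant = min(share, demands[app] - allocation[app])
            allocation[app] += grant
            spent += grant
            if allocation[app] >= demands[app] - 1e-9:
                satisfied.add(app)
        remaining -= spent
        if not satisfied:
            break  # every active app took its full share; the link is full
        active -= satisfied
    return allocation


# Hypothetical 1,000 Mbps link shared by a bursty GenAI job and two
# business-critical applications that carry higher weights.
result = allocate_bandwidth(
    capacity_mbps=1000,
    demands={"genai": 900, "voip": 50, "erp": 200},
    weights={"genai": 1, "voip": 3, "erp": 2},
)
print(result)
```

With these inputs the two smaller applications receive everything they ask for and the GenAI job gets the remainder, so the burst is capped without starving the other traffic.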

The reconfiguration of networks to accommodate the demands of AI has only begun, and enterprises are only in the first phases of adopting GenAI into their core operations. But Aryaka's Nadkarni says that the firm's Unified SASE as a Service, especially with its optional AI>Perform module, lays the groundwork for future growth.

"We are not claiming to solve all the problems that everyone has," she said. "But the fact that we have the foundational technologies and we also have integrations in terms of traffic and bringing other security technologies is what makes [Unified SASE as a Service] very powerful."

Paul Wagenseil

Paul Wagenseil is a custom content strategist for CyberRisk Alliance, leading creation of content developed from CRA research and aligned to the most critical topics of interest for the cybersecurity community. He previously held editor roles focused on the security market at Tom’s Guide, Laptop Magazine, TechNewsDaily.com and SecurityNewsDaily.com.
