UEC to deliver on Ethernet-based open, interoperable, high-performance full-communications stack architecture to meet the growing network demands of AI & HPC at scale.
SAN FRANCISCO – July 19, 2023 – Announced today, Ultra Ethernet Consortium (UEC) is bringing together leading companies for industry-wide cooperation to build a complete Ethernet-based communication stack architecture for high-performance networking. Artificial Intelligence (AI) and High-Performance Computing (HPC) workloads are rapidly evolving and require best-in-class functionality, performance, interoperability and total cost of ownership, without sacrificing developer and end-user friendliness. The Ultra Ethernet solution stack will capitalize on Ethernet’s ubiquity and flexibility for handling a wide variety of workloads while being scalable and cost-effective.
Ultra Ethernet Consortium is founded by companies with long-standing history and experience in high-performance solutions. Each member is contributing significantly to the broader high-performance ecosystem in an egalitarian manner. The founding members include AMD, Arista, Broadcom, Cisco, Eviden (an Atos Business), HPE, Intel, Meta and Microsoft, who collectively have decades of experience with networking, AI, cloud and high-performance computing-at-scale deployments.
“This isn’t about overhauling Ethernet,” said Dr. J Metz, Chair of the Ultra Ethernet Consortium. “It’s about tuning Ethernet to improve efficiency for workloads with specific performance requirements. We’re looking at every layer – from the physical all the way through the software layers – to find the best way to improve efficiency and performance at scale.”
The consortium will work on minimizing communication stack changes while maintaining and promoting Ethernet interoperability.
The technical goals for the consortium are to develop specifications, APIs, and source code to define:
- Protocols, electrical and optical signaling characteristics, application program interfaces and/or data structures for Ethernet communications.
- Link-level and end-to-end network transport protocols to extend or replace existing link and transport protocols.
- Link-level and end-to-end congestion, telemetry and signaling mechanisms; each of the foregoing suitable for artificial intelligence, machine learning and high-performance computing environments.
- Software, storage, management and security constructs to facilitate a variety of workloads and operating environments.
UEC will follow a systematic approach with modular, compatible, interoperable layers with tight integration to provide a holistic improvement for demanding workloads. The founding companies are seeding the consortium with highly valuable contributions in four working groups: Physical Layer, Link Layer, Transport Layer and Software Layer.
UEC is a Joint Development Foundation project hosted by The Linux Foundation. UEC will begin accepting applications for new members in Q4 2023. More information can be found at ultraethernet.org.
Industry Analyst Quotes:
“Many HPC and AI users are finding it difficult to obtain the full performance from their systems due to weaknesses in the system interconnect capabilities. It’s also difficult for users to integrate and learn multiple new or different solutions. It’s exciting to see this impressive group of leading companies work together to create a new common higher-performance interconnect solution. Buyers in the HPC and AI areas have very demanding workloads, for which the Ultra Ethernet Consortium (UEC) approach could greatly help improve interoperability, performance and capabilities. We look forward to seeing a new set of products enter the market in the near future,” said Dr. Earl Joseph, CEO of Hyperion Research.
“The business use cases of AI/ML and HPC are continuing to expand, with more companies looking to leverage scalable computing to their competitive advantage, whether in their own computing facilities or in the cloud. Today there are no standard, vendor-neutral data center networking solutions that focus on performance at scale for parallel applications. Because the majority of data centers are Ethernet-based, having extensible solutions driven by UEC will make scalability more straightforward and accessible. The companies involved in UEC are capable of developing consistent Ethernet solutions that scale from single connections to the largest supercomputers and hyperscale data centers,” said Addison Snell, CEO of Intersect360 Research.
“There has been an ongoing discussion, dare I say battle, over the best networking to use for infrastructure supporting the training and inference of large language models for generative AI. Some companies have been shifting to Ethernet-based networking, preferring its ease of installation and use. The UEC initiative will be a welcome addition to the AI community,” said Karl Freund, Founder and Principal Analyst at Cambrian-AI Research.
Founding Member Quotes:
“Highly compute-intensive workloads – such as AI training, machine learning, and HPC simulation and modeling – require scalable and cost-effective industry-wide solutions with interoperability as a top priority. In an effort to create an open Ethernet-based architecture to address the evolving needs of modern data center workloads, we are joining the Ultra Ethernet Consortium as a founding member. AMD has a long history of supporting open industry standards and we are proud to continue on that course today with the UEC,” said Robert Hormuth, corporate vice president, Architecture and Strategy, Data Center Solutions Group, AMD.
“Arista Networks is pleased to participate in the UEC, supporting the evolution of Ethernet to more use-cases as a ubiquitous transport for HPC and AI/ML workloads,” said Hugh Holbrook, Group Vice President, Software Engineering for Arista Networks.
“With its unmatched ecosystem, extreme flexibility, and high performance, Ethernet has become the fabric of choice for virtually every type of data networking. Broadcom has long been a supporter of Ethernet technology, driving innovations in all aspects of the network stack. We are excited to work alongside many of the cloud and networking industry titans in driving Ethernet to meet the needs of next-generation AI and HPC networks,” said Ram Velaga, senior vice president and general manager, Core Switching Group, Broadcom.
“We are at the start of a massive transformation in nearly every industry. AI/ML will fundamentally change what, when and how we do everything. To enable this transformation, the industry needs to evolve in how the networks of tomorrow are built. Cisco supports the goals of the UEC to identify and standardize optimizations that will benefit our customers deploying AI/ML infrastructure,” said Rakesh Chopra, Cisco Fellow, Common Hardware Group, Cisco.
“The HPC market has been the key driver in developing High-Speed Interconnects. With AI/ML/DL intensive and large-scale workloads, markets are converging towards the creation of a new standard encompassing interoperability, cost-effectiveness, and real high performance. We are proud and enthusiastic to be amongst the founding members of the Ultra Ethernet Consortium (UEC) which aims to tackle these challenges with an ethernet-based communication protocol and software stack. Atos through its Eviden business will bring to the table its field-proven HPC and AI expertise, leveraging its BXI interconnect technology, the group’s third-generation high-speed interconnect. We are confident that UEC will deliver strong results to meet the market needs and requirements,” said Eric Eppe, Group VP, HPC/AI/Quantum Portfolio & Strategy for Eviden at Atos Group.
“Generative AI workloads will require us to architect our networks for supercomputing scale and performance. The importance of the Ultra Ethernet Consortium is to develop an open, scalable, and cost-effective ethernet-based communication stack that can support these high-performance workloads to run efficiently. The ubiquity and interoperability of ethernet will provide customers with choice, and the performance to handle a variety of data intensive workloads, including simulations, and the training and tuning of AI models. As the data and size of AI models continues to grow, highly parallelized computing becomes an essential ingredient to performance, reliability, and sustainability,” said Justin Hotard, executive vice president and general manager, HPC & AI, at Hewlett Packard Enterprise.
“The computational and network performance demands for AI, Machine Learning, and high-performance workloads at-scale are insatiable. The industry needs open solutions to meet these demands to enable choice and freedom from proprietary solutions. Intel is proud to be a founding member of the Ultra Ethernet Consortium (UEC), which will usher in the computing infrastructure of tomorrow through an updated and optimized Ethernet-based, high-performance, scalable, and open network solutions and communication stack,” said Jeff McVeigh, corporate vice president & general manager of the Super Compute Group at Intel.
“Next generation AI Systems require unprecedented scale and performance. Meta is committed to building an open ecosystem of high-performance Ethernet fabric and technologies to enable the next era of computing,” said Alexis Björlin, Vice President of Infrastructure, AI Systems and Accelerated Platforms at Meta.
“The next era of computing will be characterized by breakthrough advancements in AI and AI-optimized infrastructure, and Microsoft is committed to empowering organizations to push the bounds of what is possible with the power of Azure. Joining forces to develop a common set of standards to enhance Ethernet for hyperscale AI and high-performance computing workloads will help enable continued innovation now and in the future,” said Steve Scott, Corporate Vice President of Azure Hardware Architecture at Microsoft.
###
About The Joint Development Foundation
The Joint Development Foundation, part of the Linux Foundation family, provides the corporate and legal infrastructure to enable organizations to develop technical specifications, standards, data sets and source code. JDF projects such as Ultra Ethernet Consortium, Alliance for Open Media, Coalition for Content Provenance and Authenticity and Overture Maps Foundation innovate markets, lead change and champion open participation and licensing policies. For more information, please visit us at jointdevelopment.org.