+65 6591 8608

Friday30 September 2022

Intel OPA switches

EXALIT is Intel Platinum Partner, authorized Intel Omni-Path (OPA) Provider, and a member of Intel Fabric Builders Ecosystem.

Intel® Omni-Path Architecture (Intel® OPA), an element of Intel® Scalable System Framework, delivers the performance for tomorrow’s high performance computing (HPC) workloads and the ability to scale to tens of thousands of nodes—and eventually more—at a price competitive with today’s fabrics. The Intel OPA 100 Series product line is an end-to-end solution of PCIe* adapters, silicon, switches, cables, and management software. As the successor to Intel True Scale Fabric, this optimized HPC fabric is built upon a combination of enhanced IP and Intel® technology.

For software applications, Intel OPA will maintain consistency and compatibility with existing Intel True Scale Fabric and InfiniBand* APIs by working through the open source OpenFabrics Alliance (OFA) software stack on leading Linux* distribution releases. Intel True Scale Fabric customers will be able to migrate to Intel OPA through an upgrade program.

Intel® Omni-Path Key Fabric Features and Innovations

Adaptive Routing

Adaptive Routing monitors the performance of the possible paths between fabric end-points and selects the least congested path to rebalance the packet load. While other technologies also support routing, the implementation is vital. Intel’s implementation is based on cooperation between the Fabric Manager and the switch ASICs. The Fabric Manager—with a global view of the topology—initializes the switch ASICs with several egress options per destination, updating these options as the fundamental fabric changes when links are added or removed. Once the switch egress options are set, the Fabric Manager monitors the fabric state, and the switch ASICs dynamically monitor and react to the congestion sensed on individual links. This approach enables Adaptive Routing to scale as fabrics grow larger and more complex.

Dispersive Routing

One of the critical roles of fabric management is the initialization and configuration of routes through the fabric between pairs of nodes. Intel® Omni-Path Fabric supports a variety of routing methods, including defining alternate routes that disperse traffic flows for redundancy, performance, and load balancing. Instead of sending all packets from a source to a destination via a single path, Dispersive Routing distributes traffic across multiple paths. Once received, packets are reassembled in their proper order for rapid, efficient processing. By leveraging more of the fabric to deliver maximum communications performance for all jobs, Dispersive Routing promotes optimal fabric efficiency.

Traffic Flow Optimization

Traffic Flow Optimization optimizes the quality of service beyond selecting the priority—based on virtual lane or service level—of messages to be sent on an egress port. At the Intel® Omni-Path Architecture link level, variable length packets are broken up into fixed-sized containers that are in turn packaged into fixed-sized Link Transfer Packets (LTPs) for transmitting over the link. Since packets are broken up into smaller containers, a higher priority container can request a pause and be inserted into the ISL data stream before completing the previous data.

The key benefit is that Traffic Flow Optimization reduces the variation in latency seen through the network by high priority traffic in the presence of lower priority traffic. It addresses a traditional weakness of both Ethernet and InfiniBand* in which a packet must be transmitted to completion once the link starts even if higher priority packets become available.

Packet Integrity Protection

Packet Integrity Protection allows for rapid and transparent recovery of transmission errors between a sender and a receiver on an Intel® Omni-Path Architecture link. Given the very high Intel® OPA signaling rate (25.78125G per lane) and the goal of supporting large scale systems of a hundred thousand or more links, transient bit errors must be tolerated while ensuring that the performance impact is insignificant. Packet Integrity Protection enables recovery of transient errors whether it is between a host and switch or between switches. This eliminates the need for transport level timeouts and end-to-end retries. This is done without the heavy latency penalty associated with alternate error recovery approaches.

Dynamic Lane Scaling

Dynamic Lane Scaling allows an operation to continue even if one or more lanes of a 4x link fail, saving the need to restart or go to a previous checkpoint to keep the application running. The job can then run to completion before taking action to resolve the issue. Currently, InfiniBand* typically drops the whole 4x link if any of its lanes drops, costing time and productivity.

Ask How Intel® Omni-Path Architecture Can Meet Your HPC Needs

Intel is clearing the path to Exoscale computing and addressing tomorrow’s HPC issues. Contact EXALIT to discuss how Intel® Omni-Path Architecture can improve the performance of your future HPC workloads.

Please visit Intel website (external link) for more technical details about Omni-Path architecture, or contact us today for more information.

JA Minisite

2857.orig.q75.o0 - Copy IntelTPP amd  tesla preferred partner  quadro partner  qctlogo e   Mellanox APAC Partner   1