The network topology — describing precisely how nodes are connected — plays a central role in
both the performance and cost of the network. In addition, the topology drives aspects of the switch
design (e.g., virtual channel requirements, routing function, etc.), fault tolerance, and sensitivity to
adversarial traffic. There are subtle yet very practical design issues that only arise at scale; we try to
highlight those key points as they appear.
Many scientific problems can be decomposed into a 3-D structure that represents the basic building
blocks of the underlying phenomenon being studied. Such problems often have nearest neighbor
communication patterns, for example, and lend themselves nicely to k-ary n-cube networks. A
high-performance application will often run on a dedicated system to obtain the necessary performance
isolation; however, a large production datacenter cluster will often run multiple applications
simultaneously with varying workloads and often unstructured communication patterns.
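To make the nearest-neighbor fit concrete, here is a small sketch (our own illustration, not from the text) of how node adjacency works in a k-ary n-cube: each of the k^n nodes is addressed by n digits in base k, and each dimension has wraparound (torus) links, so a node's nearest neighbors differ from it by ±1 (mod k) in exactly one digit.

```python
# Illustrative sketch: enumerating the 2n torus neighbors of a node in a
# k-ary n-cube. Each node is an n-tuple of digits in base k; links wrap
# around modulo k in every dimension.

def torus_neighbors(addr, k):
    """Return the 2*len(addr) nearest neighbors of `addr` in a k-ary n-cube."""
    neighbors = []
    for dim in range(len(addr)):
        for step in (-1, 1):
            nbr = list(addr)
            nbr[dim] = (nbr[dim] + step) % k  # wraparound (torus) link
            neighbors.append(tuple(nbr))
    return neighbors

# Example: a 4-ary 3-cube (a 4x4x4 3-D torus) has 4**3 = 64 nodes,
# each with 2*3 = 6 nearest neighbors.
print(torus_neighbors((0, 0, 0), 4))
```

This constant, low degree per node is what makes such topologies attractive for 3-D scientific problems: a simulation's spatial decomposition maps directly onto physical links, so nearest-neighbor exchanges traverse a single hop.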
The choice of topology is largely driven by two factors: technology and packaging constraints.
Here, technology refers to the underlying silicon from which the routers are fabricated (i.e., node size,
pin density, power, etc.) and the signaling technology (e.g., optical versus electrical). The packaging
constraints will determine the compute density, or the amount of computation per unit of area on the
datacenter floor. The packaging constraints will also dictate the data rate (signaling speed) and
distance over which we can reliably communicate.
As a result of evolving technology, the topologies used in large-scale systems have also changed.
Many of the earliest interconnection networks were designed using topologies such as butterflies or
hypercubes, based on the simple observation that these topologies minimized hop count. Analysis
by both Dally [18] and Agarwal [5] showed that under fixed packaging constraints, a low-radix
network offered lower packet latency and thus better performance. Since the mid-1990s, k-ary
n-cube networks have been used by several high-performance multiprocessors such as the SGI Origin
2000 hypercube [43], the 2-D torus of the Cray X1 [16], the 3-D torus of the Cray T3E [55]
and XT3 [12, 17], and the torus of the Alpha 21364 [49] and IBM BlueGene [35]. However,
increasing pin bandwidth has recently motivated the migration toward high-radix topologies such
as the radix-64 folded-Clos topology used in the Cray BlackWidow system [56]. In this chapter, we
will discuss mesh/torus topologies, while in the next chapter we will present high-radix topologies.
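The low- versus high-radix tradeoff can be illustrated with a back-of-the-envelope latency model (our own simplification, not the cited derivations, and all parameter values below are hypothetical): a network of N nodes built from radix-k routers needs roughly log base k of N hops, but a fixed per-router pin bandwidth B split across k ports leaves each channel only B/k, so serialization delay grows with radix while hop count shrinks.

```python
# Toy model of the radix tradeoff: latency = hops * per-hop delay
#                                          + packet serialization time.
# Higher radix k means fewer hops (log_k N) but narrower channels (B/k).
# All numbers here are illustrative assumptions, not measured values.

def hops_needed(k, N):
    """Smallest h with k**h >= N (exact integer arithmetic, no float log)."""
    h, reach = 0, 1
    while reach < N:
        reach *= k
        h += 1
    return h

def latency_ns(k, N=4096, B_gbps=512.0, packet_bits=512, t_hop_ns=20.0):
    channel_gbps = B_gbps / k                     # narrower channels at high radix
    serialization_ns = packet_bits / channel_gbps
    return hops_needed(k, N) * t_hop_ns + serialization_ns

for k in (4, 8, 16, 64):
    print(k, latency_ns(k))
```

Rerunning the sketch with a small pin bandwidth (e.g., B_gbps=64.0) makes serialization dominate and low radix win, mirroring the early analyses; with the large bandwidth shown, higher radix pays off, mirroring the shift toward folded-Clos designs.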