Infrastructure for non-infrastructure people¶
Estimated time to read: 31 minutes
- Originally Written: October, 2024
Table of contents¶
- Overview
- Layout
- Power
- Installing equipment in a rack
- Cables and transceivers
- Connectivity between racks
- Devices
Overview¶
Perhaps you're new to networking or maybe you've spent all your career working with public clouds, only ever having SSH access to a machine. This post will give you an overview of what the underlying infrastructure looks like and how it's physically installed. Although it's not a comprehensive guide for building a DC, I've always found it helpful to understand what devices or components look like and how they're connected.
For example, knowing what different power cables look like or that the slot for a 40G adapter is larger than a 10G SFP+ transceiver can help ensure you order the correct equipment and nothing is overlooked (do I need port-side intake or port-side exhaust for my switches?).
Disclaimer and thanks
Most of the images shown are from a lab used for testing and troubleshooting various networking scenarios, not a production data centre, so they are not necessarily exactly how you would see things in a production environment.
Thanks to my colleague Neil for taking the pictures of the nicer-looking data centre with proper racks, security, power, cooling, and cabling.
Layout¶
You might be familiar with a topology similar to the image on the left. This can look slightly different depending on how it's physically installed.
Racks¶
Data centres install equipment into racks (sometimes referred to as cabinets) and in some cases, for example shared environments, may also have secure cages containing one or more racks. I don't have access to any caged environments so will only show photos of the racks. Multiple racks are placed side by side to form rows, and there are multiple rows in each DC.
If you're familiar with networking you may have heard of the terms Top of Rack (ToR) and End of Row (EoR) switches. As the name suggests these are switches placed at the top of each rack or at the end of each row of racks.
Since there are many equipment vendors worldwide, a standard measurement is used for the size, which allows each vendor to manufacture their devices to fit most available racks. One Rack Unit (RU) is 1.75 inches or 4.4 centimetres. You will normally find 48RU or in some cases 42RU racks. This means you could fit 48 x 1RU devices in each rack, although this depends on other factors such as the power requirements.
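To make that arithmetic concrete, here is a tiny sketch (Python, purely illustrative; the 48RU figure is just the example above):

```python
# A quick sanity check of the rack unit arithmetic above (illustrative only).
RU_INCHES = 1.75            # one rack unit in inches
RU_CM = 4.445               # one rack unit in centimetres

rack_size_ru = 48           # a 48RU rack, as mentioned above
print(f"{rack_size_ru}RU is roughly {rack_size_ru * RU_CM:.0f} cm "
      f"({rack_size_ru * RU_INCHES:.0f} inches) of usable mounting space")

# In theory 48 x 1RU devices fit, but power, cabling, and airflow
# usually mean you install fewer.
print(f"Maximum of {rack_size_ru // 1} x 1RU or {rack_size_ru // 2} x 2RU devices")
```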
Cooling¶
Equipment in data centres typically runs 24 hours a day, 7 days a week, and depending on what is installed the devices can generate quite a large amount of heat. Although some equipment is designed for higher temperatures, to ensure longevity and continuous proper function, the hot air must be extracted and the environment kept at a consistent cool temperature.
This in itself is a huge topic, but here are a couple of examples of how cooling is achieved using the concepts of hot-aisles and cold-aisles.
Hot-aisle/cold-aisle¶
Switches, routers, servers, and other equipment have fans which pull in cold air to cool the internal components. The fans in each device in a row should operate in the same direction. This results in all devices taking in cold air from one side of the rack (the cold-aisle) and exhausting hot air into the other side of the rack (the hot-aisle). Each of the rows are alternated to form hot-aisles and cold-aisles throughout the DC.
We use a raised floor with perforated tiles to run cables between racks and to also send cold air from the air conditioning systems into the cold-aisles.
You can see the grates in this photo.
Hot-aisle containment¶
Another design uses the concept of a hot-aisle containment system which has intakes at the top of each rack for the hot air. This air is circulated through the cooling system and pumped out into the "outside" aisles on each side. The fans then pull the cold air through the devices. The hot air is contained in the middle aisle by the wall at one end and doors at the other.
Airflow chimneys¶
You might also see airflow chimneys similar to those shown in these two images. By closing the doors at the back of the rack we can force the hot air up through the chimney and into the cooling system.
Power¶
The equipment's in the rack and everything's connected, so now it's time to power it up. In some ways this is like plugging in your toaster at home, and in other ways it isn't. Have you ever connected too many devices at home and suddenly had everything trip? How big of an issue would it be if your home lost power for an hour?
Although there are cables similar to what you might see on your home computer, providing power to a data centre (even a small test lab like ours) is much more than just the cable.
How loud is it in the DC?
Before we get to that, here's a great video showing why you should never shout at your servers. You can also hear how loud it is when everything is running.
Here is a very high level overview of some of the components you might come across.
Starting from the left hand side, you'll see the power feeds come from two different suppliers and these eventually connect to two separate power supplies in the device via different PDUs. You would also normally see additional onsite sources of power such as diesel generators to ensure there is always a secondary option if the primary is not available.
In the scenario where the primary power source fails, the transfer switch shifts the load to the backup power source. This can happen automatically or manually. It switches back to the primary source when it becomes available again.
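If it helps to picture that behaviour, here is a toy sketch of the transfer switch logic (the class and method names are made up for illustration; a real transfer switch is an electromechanical device, not software):

```python
class TransferSwitchSketch:
    """Toy model of the failover behaviour described above (illustrative only)."""

    def __init__(self):
        self.active = "primary"

    def update(self, primary_ok: bool, secondary_ok: bool) -> str:
        if primary_ok:
            # Switch back to the primary feed once it is healthy again.
            self.active = "primary"
        elif secondary_ok:
            # Primary has failed: shift the load to the backup source.
            self.active = "secondary"
        else:
            # Both feeds down: the UPS batteries are the only thing left.
            self.active = "ups"
        return self.active


ats = TransferSwitchSketch()
print(ats.update(primary_ok=False, secondary_ok=True))   # secondary
print(ats.update(primary_ok=True, secondary_ok=True))    # back to primary
```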
Although there's not much to see, these are the two power feeds. The second image shows the connection to the UPS. The big box in the middle is the maintenance bypass switch which allows you to maintain the load that the UPS is supporting while you take the UPS out of the circuit.
This is the APC Uninterruptible Power Supply (UPS). A UPS is used if the power fails and you need a few minutes of power to properly shutdown equipment or a few minutes to allow the backup power source to take over. It's not designed to run for extended periods.
In each of the racks you'll find Power Distribution Units (PDUs), which are similar to the power strips you may have at home. These are connected to the separate power feeds, and each device such as a server or switch connects to PDU-A and PDU-B. In these pictures you can see one side with the blue cables and one side with the orange cables.
You'll also find different types of PDUs including ones which can be connected to a network like a switch and be managed remotely. These allow you to monitor power across all the various racks and rows.
There are various power cables and connector types which are defined by the International Electrotechnical Commission as part of IEC 60320: Appliance couplers for household and similar general purposes. PDUs typically have C13 and C19 connectors.
On the left hand side you can see a C19 (female) to C20 (male) cable which is rated for 16A and used to connect to a higher-powered blade server chassis. For a switch or lower-powered server you may see a 10A cable such as the C13 to C14 cable in the right hand picture.
Some considerations¶
So far we've seen some components to get power from the outside world to the device in the rack but haven't yet explained power availability and calculations.
Rack Power¶
Just like in your home, there is a certain amount of power available in each rack, or in each PDU. For our lab we have 32 Amps. Another way you may see this represented is in kilowatts. This is calculated as the current (32A) * voltage (230V) / 1000 = 7.36 kW.
So in each PDU we have a maximum of 32A or 7.36 kW of power available. You generally would not want to overload your racks and should keep some buffer, so for example you may want to use only 80% of the power, i.e. ~5,900 W. This would let you fit roughly 20-30 devices using 200W-300W each.
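Here is that power budget calculation written out as a small sketch (the 80% headroom and the 200W-300W per-device figures are the assumptions from the example above, not fixed rules):

```python
# Rack/PDU power budget sketch using the lab numbers above.
current_amps = 32          # available per PDU
voltage = 230              # volts
headroom = 0.80            # only plan to use 80% of the available power

total_watts = current_amps * voltage           # 7,360 W (7.36 kW)
usable_watts = total_watts * headroom          # ~5,900 W

for device_watts in (200, 300):
    count = int(usable_watts // device_watts)
    print(f"~{count} devices at {device_watts} W fit within {usable_watts:.0f} W")
# Roughly 29 devices at 200 W or 19 at 300 W, i.e. the 20-30 device range above.
```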
In some scenarios you may actually see racks which are only half/quarter full since they contain high powered devices and there is not enough power to accommodate more in the rack.
The Uptime Institute completes a survey every year which includes a section on rack power, to give you an indication of what you may see in other environments. Our available power per rack is quite low compared with some of the larger data centres. Here are the results of the survey from 2022.
Cables and Connectors¶
Cables are rated for 10A or 16A, which is the maximum the cable supports; it doesn't mean the whole power budget will be used.
Special care has to be taken to select the correct power cords. For example, what if you were to use a C19 on the PDU side to connect to a C13 on the device side?
Scenario 1: C19 to a C14¶
- C13/C14 is rated for 10A and C19/C20 for 16A, and so is the device you connect to. If you connect a C19 to a C14, you connect a 10A device to a circuit with a 16A fuse. If the device starts pulling over 10A, something is wrong and the fuse is supposed to break the connection. However, it won't, because it's a 16A fuse
- You've just created a fire hazard!
Scenario 2: C13 to a C20¶
- If you use a C13 to a C20, you have connected a device capable of drawing 16A to a 10A fuse. If the device draws over 10A the fuse will break and power goes out. This may work fine for a long time.
- Now imagine you have redundant power to your device and it draws 6A from both power supplies. You have power maintenance on one circuit and take down that UPS. The device then draws 12A from your other circuit which breaks the fuse (as it should) and the device fails. You thought you had redundant power but because of the mismatch you ended up losing both power supplies and the device.
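To make the failure modes in these two scenarios explicit, here is a small sketch (the ratings come from the IEC connector types above; the check function itself is hypothetical and only illustrative):

```python
# Connector current ratings (amps) for the IEC 60320 types discussed above.
RATING = {"C13": 10, "C14": 10, "C19": 16, "C20": 16}

def check(fuse_amps, cable_ends, device_max_draw_amps):
    """Illustrative check only: flags the two mismatches described above."""
    cable_rating = min(RATING[end] for end in cable_ends)
    if cable_rating < fuse_amps:
        print("Fire hazard: the cable/device can overheat before the fuse trips")
    if device_max_draw_amps > fuse_amps:
        print("Fuse will trip under load, e.g. when the redundant feed is down")

# Scenario 1: C19 to C14 cable, 16A fuse, 10A device.
check(fuse_amps=16, cable_ends=("C19", "C14"), device_max_draw_amps=10)

# Scenario 2: C13 to C20 cable, 10A fuse, device that can draw 12A on one feed.
check(fuse_amps=10, cable_ends=("C13", "C20"), device_max_draw_amps=12)
```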
Grounding/ESD/anti-static bags¶
Just like you may have a grounding rod/system in your home, data centres also have these systems in place. This is a critical safety measure designed to protect both the equipment and people. Grounding provides a low-resistance path for electrical currents to safely dissipate into the ground, thereby preventing electrical shocks, equipment damage, and fire hazards.
Although you probably won't see it, there will be a connection to the earth using conductive materials like copper rods or grounding grids. In our case (from what I've been told), the racks connect to a solid copper earth bar which is connected to the main facility earth.
Here you can see the connection from the rack to the underfloor system.
Finally, wrist straps and anti-static bags are used to protect electronic components from electrostatic discharge (ESD), which can damage or destroy sensitive circuitry. The wrist straps dissipate static charges to a grounded surface while the bags are made from materials that dissipate static electricity. Packaging components in anti-static bags prevents the buildup of static charges that could harm the enclosed items during storage or transportation.
Installing equipment in a rack¶
Most devices will ship with rack mount kits and these will differ depending on the function and size. For example, servers may be on rails which pull out. This allows you to change the memory or other internal components of the server without having to completely remove it from the rack. Other equipment such as a chassis switch or blade server chassis may ship with very large rack kits to support the size and weight.
You may not realise it, but a lot of effort goes into designing rail/rack kits. If you've ever had to install/uninstall multiple generations of the same product you may find that later generations of mounts are easier to install and remove than the first generations, for example by having more easily accessible pull tabs or using coloured plastic to highlight where to press.
Have a look at the following links for different types of rack mount kits.
When the devices are in the rack you still want to secure them using screws. Small devices such as a 1RU switch may only need a few screws to support the weight. Cage nuts are used in the holes of the rack to give the screws something to thread into. They are small metal clips which you squeeze together and press into the hole.
Cables and transceivers¶
Once all the devices are installed ("rack and stack") it's time to connect them. This could be within a rack, between racks, or also between rows. There are many options for cables and transceivers and the choice will depend on a number of factors. For example, what speed is required, what is the port type, and what is the distance between the two points.
Cables¶
You are probably familiar with CAT5 copper cables, which have RJ45 connectors at both ends and are used for 100/1000Mbps connectivity.
Although it's possible to find 10G switches with RJ45 ports which can be used with CAT6 cables, they typically will use more power and produce more heat (see Table 8 for an example). For 10G+ speeds you will most likely use fiber and there are a couple of points to understand.
The first is the type of fiber cable, either multimode or single mode.
Multimode Fiber (MMF)¶
- Core size: Larger core diameter, typically 50 or 62.5 micrometers, which allows multiple modes of light to propagate
- Distance: Generally used for shorter distances, typically up to a maximum of 550m
Single Mode Fiber (SMF)¶
- Core size: Much smaller core diameter, typically around 8-10 micrometers, which allows only a single mode of light to propagate
- Distance: Suitable for long-distance (10s of kilometers)
OM1 vs OM2 vs OM3 vs OM4 vs OM5 fiber
Just like you hear CAT5 or CAT6 copper cables you may also hear about OMx fiber cables. These are all defined by the ISO/IEC 11801 international standard.
The standard defines several classes of optical fiber interconnect:
- OM1*: Multimode, 62.5 μm core; minimum modal bandwidth of 200 MHz·km at 850 nm
- OM2*: Multimode, 50 μm core; minimum modal bandwidth of 500 MHz·km at 850 nm
- OM3: Multimode, 50 μm core; minimum modal bandwidth of 2000 MHz·km at 850 nm
- OM4: Multimode, 50 μm core; minimum modal bandwidth of 4700 MHz·km at 850 nm
- OM5: Multimode, 50 μm core; minimum modal bandwidth of 4700 MHz·km at 850 nm and 2470 MHz·km at 953 nm
- OS1*: Single-mode, maximum attenuation 1 dB/km at 1310 and 1550 nm
- OS1a: Single-mode, maximum attenuation 1 dB/km at 1310, 1383, and 1550 nm
- OS2: Single-mode, maximum attenuation 0.4 dB/km at 1310, 1383, and 1550 nm
*Grandfathered
Reference: https://en.wikipedia.org/wiki/ISO/IEC_11801
This is a great resource for more details on each type of cable.
The next point is the fiber connector type, either SC or LC, and this depends on the transceiver (discussed below).
- SC Connector - Larger form factor
- LC Connector - Smaller form factor
The SC connector is shown in the image on the left hand side below. The LC is shown on the right hand side.
In some cases you may need to connect a transceiver using LC to one with SC connectors. This is also possible with a fiber cable containing LC on one side and SC on the other, as shown below.
Other cables
There are many other cables, connectors, and transceivers available, such as MPO-12, MPO-16, and XFP, which we will not discuss in this post.
Transceivers¶
Devices which support RJ45 cables typically have the adapters built in and would support speeds up to 1Gbps, with some supporting 10GBASE-T. Other high speed devices will just have a "hole" where a transceiver is inserted. The cable is then connected to the transceiver. This allows the port to accept different types of transceiver, for example 10G-SFP-SR for short range MMF connections or 10G-SFP-LR for long range SMF connections.
Like with anything, over the years there have been many evolutions of transceivers and I've taken a few pictures of some of the generations of 10G and 1G transceivers.
10G¶
I didn't have a 10GBASE-T to include but you can see the evolution from the larger form factor with giant heatsink to the smaller SFP+ module.
1G¶
Note that in this image there are both copper (yellow outline) and fiber transceivers for 1G connectivity. There is also a transceiver used to convert a 10G SFP+ port into 2 x 1G SFP ports.
Here's what they look like with cables attached.
Direct Attached Cables (DAC)¶
For shorter distances such as server connectivity you may also come across Direct Attached Cables which are high speed, cost effective cables. They are either copper (sometimes referred to as Twinax) or fiber Active Optical Cables (AOC). Similar to CAT5 cables which combine the RJ45 connectors with the cables, Twinax and AOC cables combine either the copper or fiber with the transceivers. They can support multiple speeds such as 10G, 40G, 100G, 400G in different lengths.
The copper cables (black) are typically lower cost and lower power consumption than the fiber cables (orange) but as you can see from the picture below, they are thicker and have shorter lengths.
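As a very rough rule of thumb for choosing between these options, distance is usually the first question. The sketch below is purely illustrative; the thresholds are approximate assumptions, so always check the optics and switch datasheets:

```python
def suggest_cabling(distance_m: float) -> str:
    """Very rough rule of thumb, illustrative only."""
    if distance_m <= 5:
        return "DAC (Twinax): lowest cost and power, typically within a rack"
    if distance_m <= 100:
        return "AOC, or SR optics over multimode fiber (OM3/OM4)"
    if distance_m <= 550:
        return "SR optics over multimode fiber, check the OM grade for the distance"
    return "LR optics over single mode fiber for longer runs"

for d in (2, 30, 300, 10_000):
    print(f"{d} m -> {suggest_cabling(d)}")
```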
Some switches even support "breaking out" a port from a higher speed into multiple lower speed ports, and breakout cables can be used for this connectivity. For example, you may break out a 40G QSFP+ port to 4 x 10G SFP+ ports. As you can see in the image below, the cable has one 40G end which connects to the switch, while the 4 x 10G ends can connect to other devices such as servers. This allows you to connect a larger number of devices to a switch but also increases the failure domain, i.e. if this switch port, transceiver, or cable breaks you will lose connectivity to four servers instead of just one.
Reference: https://www.fs.com/products/30827.html
Cable management¶
The final part of this section is on cable management which is not necessarily about making the rack look nice and clean. Correct cable management helps with troubleshooting (e.g. tracing a cable) and can impact the airflow (i.e. imagine the cables are blocking the cold air intake on the device).
This layout of having different sorted cable lengths can be very helpful to quickly select the appropriate cable.
As you can see in the following picture cable management is in place to guide the yellow and the black cables down the right hand side of the rack.
Overhead cable trays help to neatly take cables from one part of the data centre to another. Here you can also see the green and yellow grounding cable.
There are many different cable management systems but you can find some examples from the following links. If you want to tie groups of cables together always try to use reusable straps such as velcro (first link below) rather than plastic cable ties. This allows you to easily add or remove a cable without having to redo all the ties.
- https://www.fs.com/c/cable-ties-1070
- https://www.cableorganizer.com/categories/racks/cable-management/horizontal/panduit-horizontal-cable-managers/
Miscellaneous Cables¶
This is a collection of miscellaneous cables you might find.
- This is a rollover cable (console cable) used to connect from a workstation/laptop to the console port of a device such as a router or switch so that you can configure it. This is an older style one, but you can now find them with USB connectors
- You might see a device that looks like a switch above or below the top of rack switch which has many flat green cables. These provide similar functionality to the blue rollover cable above and connect into the console port of the device. They're used to provide an out of band network to the device so that you can remotely manage it
- These (the green cables) are the same as the switch above but are connected to a module contained within a router
- If you need to connect from a laptop to a device you may need a couple of adapters depending on the laptop. Here is a USB-C to serial console cable (blue), and a USB-C to Thunderbolt adapter followed by a Thunderbolt to RJ45 adapter
Connectivity between racks¶
There are some pictures above showing the raised floor which, although primarily used for cold air, is sometimes also used to run cables between racks. What you would normally find in a production DC are patch panels.
Imagine having to connect devices together but those devices are located in different rows of a data center. Rather than running individual cables between the rows, patch panels give you a much more structured, centralized cable management option. The back of the patch panel is connected to permanent cables that run through walls, floors, or ceilings to other patch panels or different parts of the building.
We have patch panels in some of the rows in our lab with both copper (RJ45) and fiber ports. I plug one end of a cable into my first device and the other end into the patch panel port for the row containing my other device. The second device is connected to the corresponding patch panel port in its own row in the same way.
Devices¶
The final section for now covers the devices themselves, and there can be many of them. We've seen many switches so far, so here is an example of a server which runs various types of applications.
This is a rack server which is typically one or two RUs high and contains CPUs, memory, disks, network cards, power supplies, and fans.
Here you can see the internals; note the large heat sinks in the picture on the left hand side. These help prevent the CPUs from overheating.
This image shows a blade chassis which allows you to slide servers in the front while power supplies and fans are contained in the back. The power and fans are shared between the servers.
If you want more information, vendors will typically have a hardware installation or a specification document which shows the internal components. For example here is the spec sheet and hardware installation guide for the Cisco UCS C240 M7 server.
Don't forget to order power supplies and fans with the correct airflow direction. In this case the blue tabs should indicate these fans operate in port-side exhaust mode. Although they are hot-swappable (they can be removed while the switch is running), it's not fun to replace 100 of them because you ordered the wrong ones!
Finally, always have someone to help you put devices in the rack as they can be very heavy. If you need to install equipment by yourself there are various carts/cranes/jacks you can use.