The Challenges of Opening a Data Center — Part 1

Backblaze storage pod in new data center

This is part one of a series. You can find the second part here.

Though most of us have never set foot inside of a data center, as citizens of a data-driven world we nonetheless depend on the services that data centers provide almost as much as we depend on a reliable water supply, the electrical grid, and the highway system. Every time we send a tweet, post to Facebook, check our bank balance or credit score, watch a YouTube video, or back up a computer to the cloud, we are interacting with a data center.

In this series, The Challenges of Opening a Data Center, we’ll talk in general terms about the factors that an organization needs to consider when opening a data center and the challenges that must be met in the process. Many of the factors to consider will be similar for opening a private data center or seeking space in a public data center, but we’ll assume for the sake of this discussion that our needs are more modest than requiring a data center dedicated solely to our own use (i.e. we’re not Google, Facebook, or China Telecom).

Data center technology and management are changing rapidly, with new approaches to design and operation appearing every year. This means we won’t be able to cover everything happening in the world of data centers in our series, however, we hope our brief overview proves useful.

What is a Data Center?

A data center is the structure that houses a large group of networked computer servers typically used by businesses, governments, and organizations for the remote storage, processing, or distribution of large amounts of data.

While many organizations will have computing services in the same location as their offices that support their day-to-day operations, a data center is a structure dedicated to 24/7 large-scale data processing and handling.

Depending on how you define the term, there are anywhere from a half million data centers in the world to many millions. While it’s possible to say that an organization’s on-site servers and data storage can be called a data center, in this discussion we are using the term data center to refer to facilities that are expressly dedicated to housing computer systems and associated components, such as telecommunications and storage systems. The facility might be a private center, which is owned or leased by one tenant only, or a shared data center that offers what are called “colocation services,” and rents space, services, and equipment to multiple tenants in the center.

A large, modern data center operates around the clock, placing a priority on providing secure and uninterrrupted service, and generally includes redundant or backup power systems or supplies, redundant data communication connections, environmental controls, fire suppression systems, and numerous security devices. Such a center is an industrial-scale operation often using as much electricity as a small town.

Types of Data Centers

There are a number of ways to classify data centers according to how they are set up and used. These factors include:

  1. Whether they are owned or used by one or multiple organizations
  2. Whether and how they fit into a topology of other data centers
  3. Which technologies and management approaches they use for computing, storage, cooling, power, and operations
  4. How green they are, which has become more important and visible in recent years

Data centers can be loosely classified into three types according to who owns them and who uses them.

Exclusive Data Centers are facilities wholly built, maintained, operated and managed by the business for the optimal operation of its IT equipment. Some of these centers are well-known companies such as Facebook, Google, or Microsoft, while others are less public-facing big telecoms, insurance companies, or other service providers.

Managed Hosting Providers are data centers managed by a third party on behalf of a business. The business does not own data center or space within it. Rather, the business rents IT equipment and infrastructure it needs instead of investing in the outright purchase of what it needs.

Colocation Data Centers are usually large facilities built to accommodate multiple businesses within the center. The business rents its own space within the data center and subsequently fills the space with its IT equipment, or possibly uses equipment provided by the data center operator.

Backblaze, for example, doesn’t own its own data centers but colocates in data centers owned by others. As Backblaze’s storage needs grow, Backblaze increases the space it uses within a given data center and/or expands to other data centers in the same or different geographic areas.

Availability is Key

When designing or selecting a data center, an organization needs to decide what level of availability is required for its services. The type of business or service it provides likely will dictate this. Any organization that provides real-time and/or critical data services will need the highest level of availability and redundancy, as well as the ability to rapidly failover (transfer operation to another center) when and if required. Some organizations require multiple data centers not just to handle the computer or storage capacity they use, but to provide alternate locations for operation if something should happen temporarily or permanently to one or more of their centers.

Organizations operating data centers that can’t afford any downtime at all will typically operate data centers that have a mirrored site that can take over if something happens to the first site, or they operate a second site in parallel to the first one. These data center topologies are called Active/Passive, and Active/Active, respectively. Should disaster or an outage occur, disaster mode would dictate immediately moving all of the primary data center’s processing to the second data center.

While some data center topologies are spread throughout a single country or continent, others extend around the world. Practically, data transmission speeds put a cap on centers that can be operated in parallel with the appearance of simultaneous operation. Linking two data centers located apart from each other — say no more than 60 miles to limit data latency issues — together with dark fiber (leased fiber optic cable) could enable both data centers to be operated as if they were in the same location, reducing staffing requirements yet providing immediate failover to the secondary data center if needed.

This redundancy of facilities and ensured availability is of paramount importance to those needing uninterrupted data center services.

Active/Passive Data Centers

Active/Active Data Centers

LEED Certification

Leadership in Energy and Environmental Design (LEED) is a rating system devised by the United States Green Building Council (USGBC) for the design, construction, and operation of green buildings. Facilities can achieve ratings of certified, silver, gold, or platinum based on criteria within six categories: sustainable sites, water efficiency, energy and atmosphere, materials and resources, indoor environmental quality, and innovation and design.

Green certification has become increasingly important in data center design and operation as data centers require great amounts of electricity and often cooling water to operate. Green technologies can reduce costs for data center operation, as well as make the arrival of data centers more amenable to environmentally-conscious communities.

The ACT, Inc. data center in Iowa City, Iowa was the first data center in the U.S. to receive LEED-Platinum certification, the highest level available.

ACT Data Center exterior

ACT Data Center exterior

ACT Data Center interior

ACT Data Center interior

Factors to Consider When Selecting a Data Center

There are numerous factors to consider when deciding to build or to occupy space in a data center. Aspects such as proximity to available power grids, telecommunications infrastructure, networking services, transportation lines, and emergency services can affect costs, risk, security and other factors that need to be taken into consideration.

The size of the data center will be dictated by the business requirements of the owner or tenant. A data center can occupy one room of a building, one or more floors, or an entire building. Most of the equipment is often in the form of servers mounted in 19 inch rack cabinets, which are usually placed in single rows forming corridors (so-called aisles) between them. This allows staff access to the front and rear of each cabinet. Servers differ greatly in size from 1U servers (i.e. one “U” or “RU” rack unit measuring 44.50 millimeters or 1.75 inches), to Backblaze’s Storage Pod design that fits a 4U chassis, to large freestanding storage silos that occupy many square feet of floor space.

Location

Location will be one of the biggest factors to consider when selecting a data center and encompasses many other factors that should be taken into account, such as geological risks, neighboring uses, and even local flight paths. Access to suitable available power at a suitable price point is often the most critical factor and the longest lead time item, followed by broadband service availability.

With more and more data centers available providing varied levels of service and cost, the choices increase each year. Data center brokers can be employed to find a data center, just as one might use a broker for home or other commercial real estate.

Websites listing available colocation space, such as UpStack.com, or entire data centers for sale or lease, are widely used. A common practice is for a customer to publish its data center requirements, and the vendors compete to provide the most attractive bid in a reverse auction.

Business and Customer Proximity

The center’s closeness to a business or organization may or may not be a factor in the site selection. The organization might wish to be close enough to manage the center or supervise the on-site staff from a nearby business location. The location of customers might be a factor, especially if data transmission speeds and latency are important, or the business or customers have regulatory, political, tax, or other considerations that dictate areas suitable or not suitable for the storage and processing of data.

Climate

Local climate is a major factor in data center design because the climatic conditions dictate what cooling technologies should be deployed. In turn this impacts uptime and the costs associated with cooling, which can total as much as 50% or more of a center’s power costs. The topology and the cost of managing a data center in a warm, humid climate will vary greatly from managing one in a cool, dry climate. Nevertheless, data centers are located in both extremely cold regions and extremely hot ones, with innovative approaches used in both extremes to maintain desired temperatures within the center.

Geographic Stability and Extreme Weather Events

A major obvious factor in locating a data center is the stability of the actual site as regards weather, seismic activity, and the likelihood of weather events such as hurricanes, as well as fire or flooding.

Backblaze’s Sacramento data center describes its location as one of the most stable geographic locations in California, outside fault zones and floodplains.

Sacramento Data Center

Sometimes the location of the center comes first and the facility is hardened to withstand anticipated threats, such as Equinix’s NAP of the Americas data center in Miami, one of the largest single-building data centers on the planet (six stories and 750,000 square feet), which is built 32 feet above sea level and designed to withstand category 5 hurricane winds.

Equinix Data Center in Miami

Equinix “NAP of the Americas” Data Center in Miami

Most data centers don’t have the extreme protection or history of the Bahnhof data center, which is located inside the ultra-secure former nuclear bunker Pionen, in Stockholm, Sweden. It is buried 100 feet below ground inside the White Mountains and secured behind 15.7 in. thick metal doors. It prides itself on its self-described “Bond villain” ambiance.

Bahnhof Data Center under White Mountain in Stockholm

Usually, the data center owner or tenant will want to take into account the balance between cost and risk in the selection of a location. The Ideal quadrant below is obviously favored when making this compromise.

Cost vs Risk in selecting a data center

Cost = Construction/lease, power, bandwidth, cooling, labor, taxes
Risk = Environmental (seismic, weather, water, fire), political, economic

Risk mitigation also plays a strong role in pricing. The extent to which providers must implement special building techniques and operating technologies to protect the facility will affect price. When selecting a data center, organizations must make note of the data center’s certification level on the basis of regulatory requirements in the industry. These certifications can ensure that an organization is meeting necessary compliance requirements.

Power

Electrical power usually represents the largest cost in a data center. The cost a service provider pays for power will be affected by the source of the power, the regulatory environment, the facility size and the rate concessions, if any, offered by the utility. At higher level tiers, battery, generator, and redundant power grids are a required part of the picture.

Fault tolerance and power redundancy are absolutely necessary to maintain uninterrupted data center operation. Parallel redundancy is a safeguard to ensure that an uninterruptible power supply (UPS) system is in place to provide electrical power if necessary. The UPS system can be based on batteries, saved kinetic energy, or some type of generator using diesel or another fuel. The center will operate on the UPS system with another UPS system acting as a backup power generator. If a power outage occurs, the additional UPS system power generator is available.

Many data centers require the use of independent power grids, with service provided by different utility companies or services, to prevent against loss of electrical service no matter what the cause. Some data centers have intentionally located themselves near national borders so that they can obtain redundant power from not just separate grids, but from separate geopolitical sources.

Higher redundancy levels required by a company will of invariably lead to higher prices. If one requires high availability backed by a service-level agreement (SLA), one can expect to pay more than another company with less demanding redundancy requirements.

Continue with Part 2 of The Challenges of Opening a Data Center

That’s it for part 1. Read the second part here where we’ll take a look at some other factors to consider when moving into a data center such as network bandwidth, cooling, and security. In future posts, we’ll take a look at what is involved in moving into a new data center. We’ll also investigate what it takes to keep a data center running, and some of the new technologies and trends affecting data center design and use.

•      •      •

Here's a tip!Here’s a tip on finding all the posts tagged with data center on our blog. Just follow https://www.backblaze.com/blog/tag/data-center/.

About Roderick Bauer

Roderick has held marketing, engineering, and product management positions with Adobe, Microsoft, Autodesk, and several startups. He's consulted to Apple, Microsoft, Hewlett-Packard, Stanford University, Dell, the Pentagon, and the White House. He was a Ford-Mozilla Fellow in Media and Democracy with Common Cause in Washington, D.C., where he advocated for a free, open, and accessible internet for all, reducing media consolidation, and transparency in politics and the media.