Announcing Support for IPv6


If your systems are IPv6-enabled or enabling IPv6 is on your roadmap, good news—starting yesterday and continuing over the next few weeks, Backblaze will be “flipping the switch” and turning on IPv6 for our S3 Compatible API. While our IPv6 deployment isn’t complete yet (we’re phasing the rollout through our environment), we thought we’d share some of the decisions we made that affect performance and functionality.

Today, I’ll talk a little bit more about our choices along the way, and answer some questions that might come up about how we’re supporting the protocol (jump to the FAQ for that).

Hi, I’m Anthony

Since this is the first time you’re hearing from me, I thought I should introduce myself. I’m a senior network engineer here at Backblaze. The Network Engineering group is responsible for ensuring the reliability, capacity, and security of network traffic, and that includes our IPv6 deployment.

What is IPv6 and why did we enable it?

Internet protocol version 6 (IPv6) is replacing internet protocol version 4 (IPv4) as the standard for IP addresses. Most of the internet uses IPv4, and this protocol has been reliable and resilient for over 20 years. However, IPv4 has limitations that might cause problems as the internet expands—namely, there aren’t enough IPv4 addresses to go around.

Demand for IPv6 continues to increase exponentially. A major factor is the combination of a continually growing population and the number of connected devices a given person carries. One study from 2020 suggests the average number of connected devices per person globally was 2.4 in 2018, forecast to reach 3.6 in 2023. For North America specifically, the study suggests 8.2 connected devices per person in 2018 and a whopping 13.4 in 2023! Every device connected to the internet needs an IP address, and the finite address space of IPv4 is simply no longer sufficient. The key IPv6 enhancement is the expansion of the IP address space from 32 bits to 128 bits, enabling a virtually unlimited supply of unique IP addresses.
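As a quick back-of-the-envelope comparison, the jump from 32-bit to 128-bit addressing is easy to see in a few lines of Python:

```python
# IPv4 uses 32-bit addresses; IPv6 uses 128-bit addresses.
ipv4_space = 2 ** 32
ipv6_space = 2 ** 128

print(f"IPv4 addresses: {ipv4_space:,}")   # 4,294,967,296 (~4.3 billion)
print(f"IPv6 addresses: {ipv6_space:.3e}") # ~3.4 x 10^38
# The ratio is 2^96: every IPv4 address could be replaced by an
# entire IPv4-internet's worth of IPv4-internets, many times over.
print(f"IPv6 space is {ipv6_space // ipv4_space:,} times larger")
```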

Support for IPv6 means our customers can reach our services in the most efficient and secure way possible.

Why should you care about us deploying IPv6?

We’ve learned some things over the years, so we approached our IPv6 deployment a little differently than our IPv4 deployment. If you’re a customer or potential customer, here’s what that means for you: 

  1. No action needed on your part: Unlike some of the traditional cloud providers, we chose to use the same endpoint URL and let the client choose whether or not to use IPv6. This allows for any systems already IPv6 enabled to benefit immediately. In fact, if your systems are IPv6 enabled and you are a B2 customer using the S3 compatible API, you might already be connecting to us over IPv6 now.
  2. Our deployment is better set up to scale: Because of the way we decided to assign virtual IPs (VIPs) to our API endpoints, we have more flexibility to distribute ingress traffic and the ability to add VIPs as we need to in the future.
  3. Improved network performance and simpler network management: With IPv6, we simplified IP assignments and reduced the need for customers to use Network Address Translation (NAT). NAT adds processing overhead to network traffic as it translates IP addresses, which can lead to latency issues, especially with high-volume data transfer. The less traffic you have to NAT, the better. On our end, there is no NAT on customer data flows, regardless of IPv4 vs. IPv6. We also made the decision to route traffic rather than switch it wherever possible, which reduces IPv6 multicast “noise” and generally helps keep the “wire” cleaner.
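If you’re curious whether your own resolver already returns IPv6 (AAAA) records for an endpoint, here’s a short sketch using Python’s standard library `socket.getaddrinfo`. The commented-out example uses one of the S3 Compatible API endpoint names mentioned later in this post; any hostname works:

```python
import socket

def resolve_families(hostname, port=443):
    """Return the set of address families ('IPv4', 'IPv6') the
    resolver offers for a hostname."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr).
    return {names.get(family, str(family)) for family, *_ in infos}

# Example (requires DNS access); if "IPv6" appears in the result,
# your client can reach the endpoint over IPv6:
# print(resolve_families("s3.us-west-004.backblazeb2.com"))
```

If both families come back, your operating system’s address-selection logic (typically “happy eyeballs”) decides which one the connection actually uses.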

And here’s how we got it all done.

If a VIP could only talk

First, a little background: Backblaze offers two APIs—the Backblaze S3 Compatible API and the Backblaze B2 Native API. You can learn more about our APIs here in our documentation, but a couple differences are important to note when it comes to our IPv6 deployment:

  • Backblaze B2 Native API: Uploads are sent directly to a Backblaze Vault. As part of the process of uploading a file, the client is provided an “upload URL”, which is a direct URL to an assigned member of the storage Vault. The data transfer is direct from the client to the storage Vault. Only downloads are served by the API server pool. Load balancers mainly handle distributing API calls.
  • S3 Compatible API: Uploads flow through load balancers and the API server pool. Our API server pool then distributes the data to the assigned Vault. Downloads are served by our API server pool just like Backblaze B2 Native API.

These functionality differences play a role in how we are able to perform traffic engineering.  We assign VIPs to our API endpoints, for example, s3.us-west-004.backblazeb2.com, or api004.backblazeb2.com. These VIPs are owned by our load balancers and API servers (for Direct Server Return). With the Backblaze B2 Native API, we really only need two VIPs per cluster: one for uploads and one for downloads. The upload URL that B2 Native provides to the client naturally distributes the flow across our IP space. With the S3 Compatible API, since uploads and downloads are handled by the same flow, we only needed one VIP…or so we thought.

Assigning a single VIP to the S3 Compatible API was fine for a long time. However, as we’ve grown, and usage of the S3 Compatible API has grown, we discovered that a single S3 Compatible API VIP makes traffic engineering ingress flows challenging. When a large percentage of our S3 Compatible API ingress traffic happens to come from providers that prefer reaching us via a single path, having all of that traffic destined to a single IP means we have no ability to steer (i.e., traffic engineer) portions of it.

Starting at the beginning of this year, we’ve grown the number of API VIPs in the datacenters with the highest S3 Compatible API traffic from a single IP to four IPs in four different network prefixes (also known as subnets). This allows us to steer portions of S3 Compatible API traffic, and it helps distribute flows so that providers with equal-cost paths to us can be better utilized.
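To make the idea concrete, here’s an illustrative sketch (not our actual load balancing code) of how a client flow might be deterministically mapped onto one of four VIPs in different prefixes. The addresses come from the IPv6 documentation prefix (2001:db8::/32, RFC 3849), not our real ranges:

```python
import hashlib

# Four hypothetical VIPs, each in a different prefix (illustrative only).
VIPS = [
    "2001:db8:a::1",
    "2001:db8:b::1",
    "2001:db8:c::1",
    "2001:db8:d::1",
]

def vip_for_flow(client_ip, client_port):
    """Deterministically map a client flow to one of the VIPs by
    hashing the flow's identifying tuple."""
    key = f"{client_ip}:{client_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(VIPS)
    return VIPS[index]
```

Because each VIP lives in its own prefix, each prefix can be announced (or traffic engineered) independently, so a slice of ingress traffic can be steered onto a different path without disturbing the rest.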

Lesson learned: With IPv6, we standardized on four IPv6 VIPs in four different prefixes with plans to grow if/when needed.

Route when you can, switch only when you need to

Backblaze datacenter networks are architected using a typical “three tier” approach. We have an edge layer, an aggregation layer (also known as a spine), and an access layer (also known as a leaf).

A diagram of a three-tier network design.

With IPv4, we have two IP “classes”. We have a private network (RFC 1918) and a public network. Every machine is assigned an IPv4 address on the private network, and only machines that need to directly interface with the outside world are assigned public IPv4 addresses. These two networks each reside within their own VLAN, and host networking is configured to tag traffic as necessary.

Given the tiered design of our network, different layers handle these VLANs. The aggregation layer acts as the router for the private network, and the edge layer acts as the router for the public network. From there, IPv4 traffic is switched, and thus we simply have two large (i.e. flat) VLANs for IPv4.

A diagram showing an example of how private IPv4 traffic travels through a network.
A diagram showing an example of how public IPv4 traffic travels through a network.

This has worked well (and still works just fine). A pair of VLANs that we can switch to anywhere in the datacenter keeps things simple. Hosts can reside anywhere within the datacenter, and IPs can be assigned from the same pools. However, with IPv4 traffic being switched datacenter-wide, the flat broadcast domain (and thus the level of background broadcast noise) grows as the environment grows. In our largest (IP-space-wise) datacenter, we’ve needed to increase hosts’ ARP cache size. With IPv6, we wanted to improve this.

The first decision we made was to eliminate the concept of public vs private address space with IPv6. Every host gets an address and all addresses are public (if the role requires). Existing firewalls and switch ACLs already permit/deny traffic as appropriate (which is also the same for our IPv4 networks).

Not only does this simplify IP assignments, it also reduces the need for Network Address Translation (NAT). We have many hosts that are not public facing, but do need to communicate with the outside world for various reasons. As we are able to move more and more communication with external services to IPv6, this reduces the load on resources we’ve deployed simply to handle NAT.
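For reference, the RFC 1918 private ranges behind this IPv4 public/private split are easy to check with Python’s `ipaddress` module (a quick sketch, not anything we run in production):

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(addr):
    """True if addr is an IPv4 address in an RFC 1918 private range."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and any(ip in net for net in RFC1918)

print(is_rfc1918("10.1.2.3"))   # True
print(is_rfc1918("8.8.8.8"))    # False
```

Hosts in these ranges can’t reach the outside world without NAT, which is exactly the translation overhead that moving internal traffic to globally-addressed IPv6 avoids.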

The second decision that we made was to route all the way down to the access switch layer.  Each access switch is assigned a /64 and hosts connected to a given switch are assigned an IPv6 address from a portion of this block.
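Here’s a rough sketch of this scheme using Python’s `ipaddress` module. The /48 allocation below is drawn from the documentation prefix and is purely illustrative of the carving, not our actual addressing plan:

```python
import ipaddress
from itertools import islice

# Hypothetical per-datacenter allocation (documentation prefix, RFC 3849).
dc_block = ipaddress.ip_network("2001:db8:100::/48")

# Carve one /64 per access switch; a /48 holds 65,536 of them.
switch_subnets = list(islice(dc_block.subnets(new_prefix=64), 4))

for i, subnet in enumerate(switch_subnets):
    # A host plugged into this switch gets an address from its /64,
    # e.g. the first usable address after the subnet base.
    first_host = subnet.network_address + 1
    print(f"switch-{i}: {subnet} first host {first_host}")
```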

A diagram showing an example of how IPv6 traffic travels through the Backblaze network.

This helps with reducing IPv6 multicast “noise” and generally helps keep the “wire” cleaner. It does make host deployments a little more complicated: to assign a given host an IPv6 address from the correct network, one needs to know which switch the host is connected to. Also, if datacenter staff need to move hosts around for power balancing or consolidation, the IPv6 address will need to change if the new location connects the host to a different switch.

Lesson learned: Even with the added complexity, the “route when you can, switch only when you need to” mantra works well for our environment.

What’s next?

We still have more work ahead. We are currently investigating ways to support the Backblaze B2 Native API with IPv6 as well as Backblaze Computer Backup. Stay tuned for more on that front.

FAQs

What’s the difference between IPv4 and IPv6?

The key difference between the versions of the protocol is that IPv6 has significantly more address space. The IPv6 address notation is eight groups of four hexadecimal digits with the groups separated by colons, for example 2001:db8:1f70:999:de8:7648:3a49:6e8, although there are methods to abbreviate this notation. For comparison, the IPv4 notation is four groups of decimal digits with the groups separated by dots, for example 198.51.100.1.
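Python’s `ipaddress` module can show both the fully expanded and abbreviated notations, using the example address above:

```python
import ipaddress

addr = ipaddress.ip_address("2001:db8:1f70:999:de8:7648:3a49:6e8")

# Fully expanded form: every group padded to four hex digits.
print(addr.exploded)  # 2001:0db8:1f70:0999:0de8:7648:3a49:06e8

# Compressed form: leading zeros dropped, and the longest run of
# all-zero groups collapsed to "::".
print(ipaddress.ip_address("2001:db8:0:0:0:0:0:1").compressed)  # 2001:db8::1
```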

The expanded addressing capacity of IPv6 will enable the trillions of new internet addresses needed to support connectivity for a huge range of new devices such as phones, household appliances, and vehicles.

How can I use IPv6 with B2 Cloud Storage?

Currently, only the Backblaze S3-compatible API supports IPv6. To use IPv6 addresses with B2 Cloud Storage and the S3-compatible API, you do not need to make any changes.

Will IPv4 addresses still work?

Yes, IPv4 addresses will continue to be supported for both the B2 Native API and the S3-compatible API for the time being. We do not have any explicit plans for sunsetting IPv4 at this time.

What will happen if I continue to use IPv4?

Nothing. IPv4 will continue to be supported at this time.

Is IPv6 better/more secure than IPv4?

It is not more secure. Customers who reach us via IPv4 or IPv6 will have connections that are equally secure. Our APIs use the same strong TLS encryption regardless of whether IPv4 or IPv6 is used. Some customers may see a performance improvement if IPv6 allows them to avoid network address translation (NAT).

Is there an additional cost to use IPv6?

No.

I’m using Backblaze Computer Backup. Do I need to make any changes?

No. IPv6 support currently applies only to Backblaze B2 Cloud Storage. You don’t need to make any changes to your Backblaze Computer Backup account.

About Anthony Hoppe

Anthony Hoppe is a Senior Network Engineer at Backblaze. Before joining Backblaze, he held positions in the public sector, including in state government and K–12 education. Networking of all kinds is his passion. Even his personal network(s) are complex enough to involve BGP, VPNs, and planned maintenance windows. In his spare time, you may find him punching down cross-connects on 66 blocks and neck deep in Nortel phone switch programming.