A few months back we did a blog post titled, LTO versus Cloud Storage: Choosing the Model That Fits Your Business. In that post we presented our version of an LTO vs. B2 Cloud Storage calculator, a useful tool to determine whether or not it makes economic sense to consider using cloud storage over your LTO storage.
Rather than just saying, “trust us, it’s cheaper,” we thought it would be a good idea to show you what’s inside the model: the assumptions we used, the variables we defined, and the actual math we used to compute our answers. In fact, we’re making the underlying model available for download.
Our Model: LTO vs Cloud Storage
The LTO vs. B2 calculator that is on our website was based on a Microsoft Excel spreadsheet we built. The Excel file we’ve provided for download below is completely self-contained; there are no macros and no external data sources.
Download Excel file: Backblaze-LTO-Calculator-Public-Nov2018.xlsx
The spreadsheet is divided into multiple sections. In the first section, you enter the four values the model needs to calculate the LTO and B2 cloud storage costs. The website implementation is obviously much prettier, but the variables and math are the same as the spreadsheet. Let’s look at the remaining sections.
Entered Values Section
The second section is for organization and documentation of the data that is entered. You also can see the limits we imposed on the data elements.
One question you may have is why we limited the Daily Incremental Backup value to 10 TB. As the comment notes, that’s about as much traffic you can cram through a 1Gbps upload connection in a 24-hour period. If you have bigger (or smaller) pipes, adjust accordingly.
Don’t use the model for one-time archives. You may be tempted to enter zeros in both the Yearly Added Data and Daily Incremental Backup fields to compare the cost of a one-time archive. The model is not designed to compare the cost of a one-time archive. It will give you an answer, but the LTO costs will be overstated by anywhere from 10%-50%. The model was designed for the typical LTO use case where data is written to tape, typically daily, based on the data backup plan.
Variables Section
The third section stores all the variable values you can play with in the model. There is a short description for each variable, but let’s review some general concepts:
Tapes — We use LTO-8 tapes that will decrease in cost about 20% per year down to $60. Non-compressed, these tapes store 12 TB each and take about 9.5 hours to fully load. We use 24 TB for each tape assuming 2:1 compression. If some or all of your data is comprised of video or photos, then compression cannot be used, which makes actual tape capacity number much lower and increases the cost of the LTO solution.
Tapes Used — Based on the grandfather-father-son (GFS) model and assumes you replace tapes once a year.
Maintenance — Assumes you have no spare units, so you cannot miss more than one business day for backups. You could add a spare unit and remove the maintenance or just decide it is OK to miss a day or two while the unit is being repaired.
Off-site Storage — The cost of getting your tapes off-site (and back) assuming a once a week pick-up/drop-off.
Personnel — The cost of the person doing the LTO work, and how much time per week they spend doing the LTO related work, including data restoration. The cost of a person doing the cloud storage work is calculated from this value as described in the Time Savings paragraph below.
Data Restoration — How much of your data on average you will restore each month. The model is a bit limited here in that we use an average for all time periods when downloads are typically uneven across time. You are, of course, welcome to adjust the model. One thing to remember is that you’ll want to test your restore process from time to time, so make sure you allocate resources for that task.
Time Savings — We make the assumption that you will only spend 25% of the time working with cloud storage versus managing and maintaining an LTO system, i.e. no more buying, mounting, unmounting, labeling, cataloging, packaging, reading, or writing tapes.
Model Section
The last section is where the math gets done. Don’t change specific values in this section as they all originate in previous sections. If you decide to change a formula, remember to do so across all 10 years. It is quite possible that many of these steps can be combined into more complex formulas. We break them out to try to make an already complicated calculation somewhat easier to follow. Let’s look at the major subsections.
Data Storage — This section is principally used to organize the different data types and amounts. The model does not apply any corporate data retention policies such as deleting financial records after seven years. Data that is deleted is done so solely based on the GFS backup model, for example, deleting incremental data sets after 30 days.
LTO Costs — This starts with defining the amount of data to store, then calculates the quantity of tapes needed and their costs, along with the number of drive units and their annual unit cost and annual maintenance cost. The purchase price of a tape drive unit is divided evenly over a 10-year period.
Why 10 years? The LTO foundation, states is will support LTO tapes two versions back and expects to release a new version every two years. If you buy an LTO-8 system is 2018, in 2024 LTO-11 will not be able to read your LTO-8 tapes. You are now using obsolete hardware. We assume your LTO-8 hardware will continue to be supported through third party vendors for at least four years (to 2028) after it goes obsolete.
We finish up with calculating the cost of the off-site storage service and finally the personnel cost of managing the system and maintaining the tape library. Other models seem to forget this cost or just assume it is the same as your cloud storage personnel costs.
Cloud Storage Costs — We start with calculating the cost to store the data. This uses the amount of data at the end of the year, versus trying to compute monthly numbers throughout the year. This overstates the total amount a bit, but simplifies the math without materially changing the results. We then calculate the cost to download the data, again using the number at the end of the period. We calculate the incremental cost of enhancing the network to send and restore cloud data. This is an incremental cost, not the total cost. Finally, we add in the personnel cost to access and check on the cloud storage system as needed.
Result Tables — These are the totals from the LTO and cloud storage section in one place.
B2 Fireball Section
There is a small section and some variables associated with the B2 Fireball data transfer service. This service is useful to transfer large amounts of data from your organization to Backblaze. There is a cost for this service of $550 per month to rent the Fireball, plus $75 for shipping. Organizations with existing LTO libraries often don’t want to use their network bandwidth to transfer their entire library, so they end up keeping some LTO systems just to read their archived tapes. The B2 Fireball can move the data in the library quickly and let you move completely away from LTO if desired.
Summary
While we think the model is pretty good there is always room for improvement. If you have any thoughts you’d like to share, let us know in the comments. One more thing: the model is free to update and use within your organization, but if you publicize it anywhere please cite Backblaze as the original source.