For the third year in a row, Om Malik and the team put on a fantastic event about cloud computing last week. The GigaOm Structure 2010 event was bursting at the seams as people filled the main room to capacity (with a recurring plea by MC Joe Weinman asking people to try to find seats before the fire marshal drags them away for blocking the doors and aisles).
The schedule included a main room with two days of panels, plus three breakout sessions during each of the breaks and lunches. Far too much information was shared to convey in a blog post (you can watch the full videos online), but I will summarize some of the takeaways I gleaned.
Cloud Is Here
For those of us in the space, this seems like a comment from yesteryear. But this year the cloud leapt from the believers to the masses. Discussions about “What is the cloud?” or “Isn’t the cloud just a return to the mainframe?” were nearly non-existent. Instead, the discussions were “How do we move our applications into the cloud?”, “How do we manage a blend of cloud and internal systems?”, and “How do we scale our cloud?”
Even the federal government has become a huge proponent of cloud computing, with Federal CIO Vivek Kundra making it one of his primary initiatives. Kundra started Apps.gov (a cloud computing resource for federal government agencies), and even Treasury.gov and Recovery.gov are built on cloud services.
Werner Vogels, CTO of Amazon, said, “Talk becomes action because people have understood that it will be a competitive disadvantage not to take advantage of the cloud.”
Selling the Cloud
Salesforce succeeded early on by going rogue: rather than trying to convince the CIO to do a site-wide deployment, they got into organizations through business heads and line managers. Because the hosted service did not require IT involvement, individuals could simply sign up for the service, and usage would grow from there.
While that still works today (and many consider it the best option to break into organizations), there was an acknowledgment that getting site-wide deployments still requires CIO buy-in. While Salesforce used to sell licenses in sets of 10, 50, or 100, they are now able to sell in sets of 5,000 or 10,000 by winning over the CIO. (Of course, they are able to do this in part because the “Cloud Is Here.”)
Cloud Scale
Every discussion touched on how to scale up: servers, databases, storage, bandwidth, etc. In a world of dedicated systems for particular problems, each system only needs to scale to the size of its particular problem. Moving to a cloud infrastructure means managing systems on a massive scale: instead of a database for one company’s corporate data, you need one to house the data of thousands of corporations; instead of tens or even hundreds of servers, you need thousands or tens of thousands. Previous models break down quickly.
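To make that shift concrete, here is a minimal sketch (my own illustration, not something presented at the conference) of the kind of tenant-level partitioning a multi-tenant cloud database ends up needing; the shard count, hashing scheme, and host names are all hypothetical.

```python
import hashlib

NUM_SHARDS = 64  # hypothetical: each shard lives on its own database server

def shard_for_tenant(tenant_id: str) -> int:
    """Deterministically map one corporation's data to a shard."""
    digest = hashlib.md5(tenant_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def database_host(tenant_id: str) -> str:
    """Route queries for a tenant to the host that owns its shard."""
    return f"db-shard-{shard_for_tenant(tenant_id):03d}.internal"  # hypothetical naming

# Thousands of corporations spread across a fixed pool of servers.
for tenant in ["acme-corp", "globex", "initech"]:
    print(tenant, "->", database_host(tenant))
```

The point is less the code than the operational consequence: schema changes, backups, and failures now have to be handled across all 64 (or 64,000) shards at once, which is exactly where the previous models break down.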
Big Data
Big data. Big data. Big data. It felt like nearly an entire day was devoted to big data. While the shared definition of “big data” was any data you are struggling to keep up with, there were comments about some specific examples of big data:
- The Mars rover alone generates a couple petabytes per year.
- Apple has already shipped 100 petabytes of distributed data via iPad sales.
- NASA satellites will soon generate an exabyte (1,000 petabytes) of data per day.
Not at the conference, but separately I noticed:
- NetApp blogged that they had shipped an exabyte of data in 2009.
- Hotmail has an astounding 1.3 billion inboxes and 155 petabytes of data.
- Few enterprises or cloud providers actually state how much data they have (Backblaze now has 6.5 petabytes), but many are struggling both with how to store it and how to manage it.
Latency Wormholes
When our data and applications were local to our desktops, performance was largely a function of RAM and CPU. When our data and applications are in the cloud, a number of new factors determine how quickly you can interact with a system. Panelists said the speed of light through fiber optic cables is the fundamental limit on how fast systems can respond (a back-of-the-envelope calculation after the list below shows how real that floor is). However, before we try to invent wormholes to break the speed of light, there are a number of other options to explore:
- Get more bandwidth. This won’t help with latency, but it is a critical component of moving large quantities of data to your desktop.
- Optimize the network itself (“WAN optimization”) by putting software or appliances at both ends of the network to streamline communications in between.
- Optimize the applications to better account for the limitations of the IP network.
- Cache data closer to the user. (Ironically, the ultimate cache would be to put all data back on laptops and desktops.)
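Here is the back-of-the-envelope calculation referenced above; the distances and the two-thirds-of-vacuum fiber speed are rough figures of my own, not numbers from the conference.

```python
# Latency floor imposed by the speed of light in fiber, ignoring routing,
# queuing, and protocol overhead. Distances are rough great-circle figures.
SPEED_OF_LIGHT_VACUUM_KM_S = 299_792
FIBER_FACTOR = 0.67  # light in fiber travels at roughly 2/3 of its vacuum speed

def min_round_trip_ms(distance_km: float) -> float:
    """Best-case round-trip time over a straight fiber run, in milliseconds."""
    speed_km_s = SPEED_OF_LIGHT_VACUUM_KM_S * FIBER_FACTOR
    return 2 * distance_km / speed_km_s * 1000

for route, km in [("San Francisco - New York", 4_100),
                  ("New York - London", 5_600),
                  ("San Francisco - Sydney", 12_000)]:
    print(f"{route}: at least {min_round_trip_ms(km):.0f} ms round trip")
```

Even in this best case, a transcontinental request costs tens of milliseconds before a single byte of application work gets done, which is why caching and WAN optimization get so much attention.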
Users expect an application to respond instantly, regardless of whether it is local or in the cloud. Ensuring this happens will require continual innovation around bandwidth and latency. Of course, as one person pointed out, this discussion mostly applies to the approximately 15% of the world’s population that has access to the internet at all, and even many of those are on extremely slow connections.
Hardware Company Transition
While Dell, IBM, Sun/Oracle, and other hardware companies are happily selling multi-million-dollar pieces of equipment to large enterprises, most cloud vendors are standardizing on inexpensive, disposable, commodity hardware and writing software that handles the performance, scalability, and redundancy requirements. If computing moves to the cloud, what is the role for these enterprise hardware manufacturers? Cloud providers expect to purchase high-volume, low-margin equipment. Will the hardware vendors be able to adapt to the demands of this market? Who will be the winners in this space?
Dell shared that in 2006 it realized the top 20 customers in the world not only represented a very large share of global server purchases but were growing at an 83% CAGR. Since Dell had a tiny share amongst these large customers, it started the Dell Data Center Solutions group. The division began building systems that integrate power and cooling into the racks themselves, blurring the line between where the “data center” ends and the “servers” begin.
Moving to the Cloud
Most of the discussion was not about “whether” to move to the cloud but “how” or “which parts.” A variety of companies have sprung up to help organizations move existing systems over or kickstart new ones while staying integrated with their security and management models. CloudSwitch won the people’s choice award at the conference and is one of the companies trying to make it point-and-click simple to move existing systems transparently to the cloud.
One common theme was the expectation that organizations will need to live with a hybrid model for a long time, with many systems continuing to reside internally as currently deployed while others move to the cloud. Companies rarely want to “fix what’s not broken” but also want to start taking advantage of the capabilities of the cloud. Managing these hybrid systems will be more complex and will require management architectures that allow control through a “single pane of glass.”
The New Stack
A fertile topic was the appropriate development stack for cloud computing. LAMP (Linux, Apache HTTP Server, MySQL, and PHP) has been a very popular solution stack over the last few years for bringing web properties to life. However, the shift from individual web company infrastructures to shared cloud environments, and the scale demands that result, is prompting a rethink. Replacing the MySQL database with some type of “NoSQL” system such as Cassandra or Hadoop is a key area, but PHP is being reconsidered as well, and even Linux as the core OS has some people thinking.
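To illustrate one reason the “NoSQL” systems keep coming up: they trade relational joins for data that is denormalized and keyed so each lookup touches a single partition, which spreads naturally across thousands of machines. A toy sketch (plain Python dictionaries standing in for the real stores, with made-up data) shows the difference in access pattern:

```python
# Relational habit: normalize, then join tables at query time.
users  = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
orders = [{"user_id": 1, "item": "disk"}, {"user_id": 1, "item": "server"},
          {"user_id": 2, "item": "switch"}]

def orders_for_user_join(user_id):
    # A join/scan over the orders table; awkward once tables span many machines.
    return [o["item"] for o in orders if o["user_id"] == user_id]

# Key-value habit: denormalize up front, keyed by what you will query on.
orders_by_user = {
    1: ["disk", "server"],  # everything for user 1 lives in one partition
    2: ["switch"],
}

def orders_for_user_keyed(user_id):
    # A single key lookup; the partition can live on any one of thousands of nodes.
    return orders_by_user.get(user_id, [])

print(orders_for_user_join(1), orders_for_user_keyed(1))
```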
The infrastructure as a service (IaaS) vs. platform as a service (PaaS) debate plays in here as well. Using a PaaS (such as Google App Engine or Microsoft Azure) may help you get an offering to market faster. However, it also locks you into a specific platform and stack. PaaS offerings do not seem to be taking off as quickly as IaaS ones, in large part for this reason.
The new stack has yet to be determined. There is a desire for some standardization so that a significant pool of engineers can build expertise on it.
The Anti-Cloud Clouds
Who are the anti-cloud clouds? Salesforce. Facebook. Yahoo. Google.
And probably most others.
These are the companies that are both strong proponents of the concepts of the cloud (commodity hardware, shared resources, virtualization, etc.) and offer cloud platforms or services, yet would not even consider moving their own systems to someone else’s cloud. Om Malik asked Marc Benioff whether Salesforce would use Amazon Web Services if he were starting the company today; Benioff said, “No. We need control over our infrastructure.” In other panel discussions, Facebook and Yahoo echoed the same sentiment. In fact, Facebook is taking it a step further and is now breaking ground on its own physical data center.
Don’t get me wrong, I don’t fault them for this. In fact, I think they are making the right decision. As we dug into it at Backblaze, it can be significantly less expensive to operate the systems yourself than to outsource them to a cloud provider. You need to be at some scale for this, but that scale is actually not that significant. And you need to be in a position to predict your demand fairly well. Jonathan Heiliger, VP of Technical Operations at Facebook, said that no matter how hard they try, they can never predict demand from new features. Thus, they have to overbuild… but at some level of scale and predictability, this inefficiency is still a cost savings over handing operations to someone else. The key is to think through what is important for your own business: the tradeoff between focusing on your core business and controlling your infrastructure, the tradeoff between flexibility and cost savings, and what level of the stack you want to own.
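At Backblaze this came down to arithmetic. As a rough illustration (every number below is a hypothetical placeholder, not our actual costs or any provider’s published pricing), the comparison looks something like this:

```python
# Hypothetical self-hosted vs. cloud storage cost comparison. All numbers
# are illustrative placeholders, not real pricing.
CLOUD_COST_PER_GB_MONTH = 0.15   # hypothetical provider price, $/GB-month

POD_CAPEX = 8_000.0              # hypothetical cost to build one storage pod, $
POD_CAPACITY_GB = 67_000         # usable capacity of that pod, GB
POD_LIFETIME_MONTHS = 36         # depreciation period
POD_OPEX_PER_MONTH = 150.0       # hypothetical power, space, bandwidth, staff

self_hosted_per_gb_month = (
    POD_CAPEX / POD_LIFETIME_MONTHS + POD_OPEX_PER_MONTH
) / POD_CAPACITY_GB

stored_gb = 500_000              # hypothetical amount of data to store

print(f"cloud:       ${CLOUD_COST_PER_GB_MONTH * stored_gb:,.0f} / month")
print(f"self-hosted: ${self_hosted_per_gb_month * stored_gb:,.0f} / month")
```

The gap only matters if you can keep those pods reasonably full, which is exactly the scale-and-predictability caveat above.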
The Challenge and the Opportunity
Tectonic shifts create huge challenges, but also huge opportunities. Moving from mainframes to PCs and from standalone to connected systems shut down many companies and created new multi-billion-dollar ones. The issues discussed above will do both again.