What’s Next for the Cloud? Distributed Architectures

Executive Summary

Every so often, computing architecture swings from centralized to distributed and then back again. It started with centralized mainframes, which gave way to distributed client/server systems, which in turn were displaced by centralized SaaS models and cloud computing. History has shown us that not much can stop this slow tectonic oscillation between central and distributed, not even the cloud, which is slowly dissipating across the Internet and its connected devices.

We’ve all learned that cloud computing lives in data centers, and we use networks to get to it. That’s a useful model, but the truth is that we’ve been using three kinds of cloud computing for years now, and data center-based clouds are just the first type to reach mass adoption. The next best-known type, the overlay cloud, is spread across many data centers but functions independently of any one of them. The least-known type is highly distributed across clients and devices, and emerges from the ability of a single administrator to manage hundreds of thousands of devices from a single console.

In this article, we’ll review what’s happening for each type of cloud.

Centralized clouds
Centralized clouds are all the rage. On the service side, you’ll hear Google, Amazon EC2, Microsoft Azure, GoGrid, Rackspace and Joyent talking about their centralized cloud architectures. On the virtualization front, Citrix, Parallels and VMware are happy to sell you a centralized cloud computing software package. This type of architecture assumes that most application logic sits in a big data center, hopefully sucking power from a big dam, and that browsers or other lightweight clients access the cloud over the wire.

This kind of cloud makes sense when all storage needs to be near compute resources, or where it is particularly valuable to scale compute power up and down quickly. However, it doesn’t perform as well when large amounts of data have to move on or off the cloud, and there is always a network delay in reaching it. Enterprise apps, logistics, e-mail, compute-intensive data analysis, and image processing are common use cases. Companies are willing to put up with slower performance for end users in exchange for the pay-per-drink model of paying for the cloud.

Overlay clouds
Overlay clouds have nodes spread across hundreds or thousands of networks in many data centers, but they are managed and priced in the same way as central clouds. Overlay clouds are older than you might think. The first ones to emerge in the mid-90s were Content Delivery Networking (CDN) companies like Sandpiper, Digital Island and Akamai. Level3 owns Sandpiper and Digital Island now, and Akamai is going strong, competing with CDNetworks, Limelight, BitGravity and a new player named Cotendo.

Content owners paying for CDN services have no specific idea of where in the cloud their content will be served from. What they do know is that they will be billed for what they use, they don’t have to worry about running out of capacity, and that content gets to users more quickly.

Companies that pay for overlay clouds are usually solving performance and scaling problems with a single service. Users tend to be large content providers or companies that need to enable rapid software downloads or high-scale streaming video. Enterprises, however, are increasingly using CDNs to speed up portions of their applications, and CDN operators have a long history of working to put more enterprise-grade application logic on their distributed cloud overlay services.

When a company uses an overlay cloud, it invariably uses less central cloud computing capacity. That is one of the reasons Amazon launched its low-cost CloudFront service, which attempts to use more centralized resources to compete with overlay clouds.

Overlay clouds are growing in both traffic and capabilities, but their size is dwarfed by centralized cloud operations.

Distributed clouds
Distributed clouds can be the most efficient of all. Distributed clouds are loose collections of processing nodes that may or may not be available at any one time, but which can all be controlled by a single console or set of rules. Some distributed clouds are grid computing–related, but not all of them.

Perhaps the most famous early distributed cloud was the SETI@home project, which linked hundreds of thousands of PCs together to analyze radio signals from space. Today, the open source BOINC project out of Berkeley allows any researcher to create volunteer networks of home PCs to solve computational problems.

One of the more disruptive distributed cloud vendors is Plura Processing. Plura works with web content providers and software developers to distribute Plura’s JavaScript by injecting it into the web pages those providers serve. Every time a web surfer visits a Plura-enabled site, their browser performs small Plura computations in the background for as long as the page stays open.

Those computations, spread across many users’ spare browser processing capacity, can help Plura customers solve problems like analyzing the stock market, searching the web, working with bioinformatics, or working on high-end cryptography. For these distributed computing types of problems, the distributed cloud approach works very well, and it uses a trivial amount of data center resources to distribute the tasks and assemble the results. Compared to building a server farm to solve the problem, the actual cost to Plura is very low, which is why Plura’s pricing can be less than 10 percent of what a centralized cloud provider charges. The added benefit is that Plura can offer more than 50,000 nodes in parallel, something that no cloud provider can do on demand today.
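To make the "distribute the tasks and assemble the results" pattern concrete, here is a minimal Python sketch of a coordinator that cuts a job into small work units, hands them to nodes that may or may not be available, and re-queues any unit that is not completed. It is not Plura's actual system or API: the function names, the `node_available` callback and the every-third-attempt failure rule are all assumptions made up for illustration.

```python
from collections import deque


def split_into_units(numbers, unit_size):
    """Cut a large job into small work units a node could finish quickly."""
    return [numbers[i:i + unit_size] for i in range(0, len(numbers), unit_size)]


def run_job(numbers, unit_size, node_available, compute):
    """Hand out work units, re-queue any that are not completed, merge results.

    node_available(attempt) is a hypothetical stand-in for "did a node pick
    this unit up and report back?"; compute(unit) is the per-unit calculation.
    """
    pending = deque(enumerate(split_into_units(numbers, unit_size)))
    results = {}
    attempt = 0
    while pending:
        unit_id, unit = pending.popleft()
        attempt += 1
        if node_available(attempt):
            results[unit_id] = compute(unit)
        else:
            pending.append((unit_id, unit))  # node vanished: try again later
    # Assemble the partial results in their original order.
    return [results[i] for i in sorted(results)], attempt


def every_third_attempt_fails(attempt):
    """Invented availability rule: one in three hand-offs never comes back."""
    return attempt % 3 != 0


partial_sums, attempts = run_job(
    list(range(1, 101)), 10, every_third_attempt_fails, sum
)
total = sum(partial_sums)
print(total, attempts)
```

The coordinator itself does very little work, which is the point: the data center side only tracks which units are outstanding and merges the answers, while the heavy computation happens on whatever nodes happen to be online.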

If a distributed cloud like Plura could solve your type of problem, why would you ever pay to have a centralized cloud like EC2 do it? That’s not to say the Pluras of the world will take over EC2 — after all, Plura isn’t going to run a large relational database or the latest version of Apache, but it can solve many of the same heavy data-crunching problems that Amazon is targeting with its Spot Instances offering for scientific computing and financial modeling, among other uses.

However, the most omnipresent distributed cloud may have a node in your living room – your set-top box or wireless router. For quite some time, cable companies have been remotely controlling cable boxes, operating hundreds of thousands of them from a few management stations, but newer systems may pre-position video content without users knowing, or even use P2P to allow one customer’s set-top box to share content with another customer, removing the need for the cable operator to move the content out of a data center twice.

Along the same lines, UK startup SharedBand is working with a large telecom to transform the countless wireless routers in customers’ living rooms into a unified cloud of routers that share bandwidth with one another based on rules set by the telecom. The end result would be that the telecom could sell faster, more reliable Internet access without having to deploy new infrastructure. This type of offering is good for customers, good for the telecom, and even good for the environment because it makes more efficient use of existing capacity, which is exactly the strength of the distributed cloud.

Distributed clouds today see less traffic than overlay clouds or centralized clouds, but because of their efficiency, companies will continue to use them as a part of an integrated cloud strategy. As more tools become available for central management, distributed cloud architectures will grow quickly.

Conclusion
All of this leads to a world in which “the cloud” is both nowhere and everywhere, in which every piece of data and every transaction is served from the lowest-cost location and cloud architecture that can meet the service level required. The best way to get ahead of the curve is to start looking at the cloud as anything you can control and scale from a web browser, not just a data center or virtual machine, and to begin mapping application requirements against these three types of cloud architectures. And it’s time for centralized cloud service providers to plan to partner with or even acquire overlay cloud and distributed cloud providers.
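One way to start mapping application requirements against the three architectures is to write the requirements down as data and pick the cheapest option that still meets the service level. The Python sketch below does that under loudly stated assumptions: the cost and latency figures are invented placeholders, not measurements or real pricing, and the `CloudOption`, `Requirement` and `cheapest_fit` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudOption:
    name: str
    cost_per_unit: float      # illustrative relative cost, not real pricing
    latency_ms: int           # illustrative typical latency to end users
    runs_stateful_apps: bool  # e.g. can host a large relational database


@dataclass(frozen=True)
class Requirement:
    max_latency_ms: int
    needs_stateful_app: bool


# Placeholder figures chosen only to make the example work.
OPTIONS = [
    CloudOption("centralized", 1.00, 120, True),
    CloudOption("overlay", 0.60, 30, False),
    CloudOption("distributed", 0.10, 200, False),
]


def cheapest_fit(req, options=OPTIONS):
    """Return the lowest-cost option that meets the requirement, or None."""
    candidates = [
        o for o in options
        if o.latency_ms <= req.max_latency_ms
        and (o.runs_stateful_apps or not req.needs_stateful_app)
    ]
    return min(candidates, key=lambda o: o.cost_per_unit, default=None)


batch_crunching = Requirement(max_latency_ms=1000, needs_stateful_app=False)
video_delivery = Requirement(max_latency_ms=50, needs_stateful_app=False)
database_app = Requirement(max_latency_ms=500, needs_stateful_app=True)

for req in (batch_crunching, video_delivery, database_app):
    print(req, "->", cheapest_fit(req).name)
```

With these made-up numbers, latency-tolerant number crunching lands on the distributed cloud, video delivery on the overlay cloud, and the database-backed application on the centralized cloud. Real inputs would need measured costs and service levels for each workload.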
