Cloud Computing Reaches the Final Frontier

Executive Summary

The much-hyped benefits of cloud computing are of interest to a growing number of industry sectors, each of which brings with it a particular set of requirements. Possibly more esoteric than most, exploration of the cosmos presents diverse challenges for those working with the constant streams of data beamed down from distant sensors. Inside NASA and the European Space Agency (ESA), separate teams are working to understand how — if at all — cloud-based infrastructures might aid in furthering understanding of the worlds around us.

Mapping the Milky Way from Europe

Outside Madrid, Spain, Science Operations Development Manager William O’Mullane is preparing for the 2012 launch of ESA’s Gaia mission. Boosted out past the Moon from the jungles of French Guiana aboard a Russian Soyuz-Fregat rocket, Gaia will spend the next five years transmitting some 40 gigabytes of data per night back down to ground stations in Spain and Australia. That data will enable ESA to build what the Agency’s web site describes as “the largest and most precise three-dimensional chart of our galaxy.”
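
A quick back-of-the-envelope calculation (my own arithmetic from the per-night figure, not an ESA number) gives a sense of the volumes involved:

    # Rough total raw downlink over the nominal five-year mission.
    # Derived from the ~40 GB/night figure quoted above; not an ESA number.
    gb_per_night = 40
    nights = 5 * 365
    total_tb = gb_per_night * nights / 1024.0
    print("~%.0f TB over five years" % total_tb)   # roughly 71 TB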

As in many international collaborative efforts, the team responsible for working with the data is geographically dispersed. Members of the Data Processing and Analysis Consortium (DPAC) draw upon the capabilities of six Data Processing Centers spread across five European countries. At one, ESA’s European Space Astronomy Center (ESAC), O’Mullane does not intend to pursue the traditional approach to meeting the mission’s processing needs. Instead of spending a budgeted €1.1 million ($1.5 million) on local computing resources (and a comparable amount on power, cooling, staff and the rest), O’Mullane is investigating tapping into the cloud in order to save money and get the job done faster.

Speaking at the PoweredByCloud conference in London last month, O’Mullane and The Server Labs’ Paul Parsons presented results from pilot testing with Amazon EC2 suggesting that costs could fall to €400,000 ($547,000) for ESAC’s share of the Astrometric Global Iterative Solution (AGIS) data processing. Utilizing spot pricing, this figure could conceivably fall further still, though possibly at the cost of slower completion for larger jobs. The bursty profile of the processing required makes Gaia mission data well suited to the on-demand pricing of most public cloud infrastructure: O’Mullane and Parsons are quick to stress that the significant cost savings they anticipate rest upon short bursts of intensive processing against a far lower level of ongoing background activity. If large quantities of data had to be reprocessed, the anticipated savings would soon begin to erode.
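
A minimal sketch of the kind of cost comparison involved; every figure below is an illustrative placeholder, not a number from the ESA pilot:

    # Illustrative cost model for a bursty workload on pay-per-use capacity.
    # All figures are hypothetical placeholders, not ESA/The Server Labs numbers.
    core_hours_per_run = 250000      # one AGIS-style reprocessing burst (assumed)
    runs = 10                        # one burst every six months for five years
    on_demand_rate = 0.10            # dollars per core-hour (assumed)
    spot_discount = 0.4              # spot capacity at ~60% of on-demand (assumed)

    on_demand_cost = core_hours_per_run * runs * on_demand_rate
    spot_cost = on_demand_cost * (1 - spot_discount)
    print("on-demand: $%.0f, spot: $%.0f" % (on_demand_cost, spot_cost))

    # The in-house alternative must be sized (and powered, cooled and staffed)
    # for the bursts, then sits under-used between them.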

Alongside the expected cost savings, O’Mullane and Parsons comment favorably on the speed with which they can complete tasks. Recomputing the mission’s AGIS in-house takes four weeks every six months; test results suggest the same task could complete on EC2 in just a week, because the team can spread the job over far more virtual machines than they could afford to buy and maintain themselves. Parsons also noted that Amazon’s virtual machines appear optimized to run the mission’s Oracle-based database faster than they can manage inside ESA.

O’Mullane is far from alone within ESA in his enthusiasm for the cloud’s potential, and he points to interest within the Agency’s Earth Observation activities, as well as a recently commissioned study into the possible benefits of cloud computing.

Meeting NASA’s Evolving Requirements

Almost 6,000 miles away, in Silicon Valley (and within sight of the Googleplex), NASA’s Ames Research Center CIO Chris C. Kemp is also extracting value from the cloud. But rather than depending on Amazon, Rackspace or their peers, Kemp is building NASA’s own private cloud, known as Nebula, in containers on-site at Ames, leveraging some of the best connectivity on the planet in the process.

Kemp’s interest in the cloud began around two years ago, initially aimed at solving a very specific set of housekeeping problems: since starting out as one of the earliest presences on the web, NASA has accumulated thousands of web sites. Many were created by small teams of external contractors and run on a wide range of hardware and software, making them increasingly difficult to patch, maintain and migrate in accordance with NASA’s obligations with respect to data security and accessibility.

The initial premise behind Nebula was to create a platform on which those web sites could run, freed from the need to individually address authentication, compliance, scalability or sustainability. Specifically, Nebula’s architecture was intended to maintain security while enabling project scientists to upload and run the code and simulations that make so many NASA sites engaging.

From this initial ambition to offer a platform (PaaS) layer within the traditional cloud computing stack, Kemp noted during our conversation that it rapidly became apparent that “virtual machines, offered as Infrastructure as a Service (IaaS), are the optimal way to allow scientists to build out within a secure environment,” and it is as IaaS that Nebula’s first successes are now being realized. Through a series of beta projects involving subsets of the Agency’s 75,000 employees and contractors, Nebula is positioning itself as the compute and storage backbone of choice, offering affordable high-performance resources inside NASA and acting as a stepping stone to external services as required.

Nebula is based on open-source components such as Eucalyptus, and Kemp stresses NASA’s intention to contribute the Agency’s experience in optimizing code for its large-scale, high-performance environment back to the wider community. While continuing to make decisions that optimize performance locally, the Nebula team is keen to implement standardized approaches as these begin to emerge, and to ensure interoperability with external clouds wherever feasible.
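
That interoperability is concrete: because Eucalyptus implements Amazon’s EC2 API, code written against one can in principle target the other. A minimal sketch using the Python boto library, where the endpoint, credentials and image ID are hypothetical placeholders:

    # Connect to an EC2-compatible private cloud (e.g. Eucalyptus) with boto 2.x.
    # Endpoint, credentials and image ID below are hypothetical placeholders.
    import boto
    from boto.ec2.regioninfo import RegionInfo

    region = RegionInfo(name="private", endpoint="cloud.example.org")
    conn = boto.connect_ec2(
        aws_access_key_id="YOUR-KEY",
        aws_secret_access_key="YOUR-SECRET",
        is_secure=False,
        region=region,
        port=8773,                     # Eucalyptus' default API port
        path="/services/Eucalyptus",   # Eucalyptus' default API path
    )

    # The same call would work unchanged against Amazon EC2 itself.
    reservation = conn.run_instances("emi-12345678", instance_type="m1.small")
    print(reservation.instances[0].id)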

One of the areas in which Nebula excels is high-performance computation across large data sets: the sorts of data that, Kemp suggests, “would saturate our Internet connection for several months” if NASA scientists attempted to transmit them to a third-party cloud provider. By working closely with Intel, and by optimizing configurations in favor of performance, Kemp claims “up to ten times the performance of Amazon” for the class of tasks his scientists typically wish to perform.
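
The arithmetic behind that claim is easy to sketch; the data volume and link speed below are hypothetical, chosen only to show the shape of the problem:

    # How long would it take to push a large data set to an external cloud?
    # Volume and link speed are hypothetical placeholders.
    data_tb = 500                    # data set size in terabytes (assumed)
    link_gbps = 1.0                  # sustained outbound bandwidth (assumed)

    seconds = data_tb * 8 * 1e12 / (link_gbps * 1e9)
    print("~%.0f days at full saturation" % (seconds / 86400.0))   # ~46 days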

Over the past six months, Nebula has been supporting five very different pilot projects drawn from across the Agency. In one, teams from NASA and Microsoft Research have been leveraging Nebula to bring full-fidelity images from the HiRISE and LROC missions to the public WorldWide Telescope within an hour of the raw data being released for processing. The elastic nature of Nebula makes it straightforward for the team to create sufficient virtual machines to complete the task, then release them once finished.
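
The burst pattern itself is simple. A sketch in the same vein as the boto example above, reusing a connection like conn from that snippet; the image ID, instance type and worker count are hypothetical:

    # Elastic burst: launch a batch of workers, wait for them to come up,
    # do the work, then release everything. All identifiers are placeholders.
    import time

    n_workers = 50
    reservation = conn.run_instances("emi-87654321",
                                     min_count=n_workers, max_count=n_workers,
                                     instance_type="c1.xlarge")

    instances = reservation.instances
    while any(i.update() != "running" for i in instances):
        time.sleep(15)               # poll until every worker is up

    # ... dispatch image-processing jobs to the workers here ...

    conn.terminate_instances(instance_ids=[i.id for i in instances])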

Kemp reports that the other pilots are at different stages, but expects Nebula’s IaaS offering to be available to all scientists later this year. The PaaS offering originally envisaged for Nebula will follow, and he believes that a “thin SaaS layer” to complete the stack may also be appropriate to simplify the processes by which NASA data are made available for external consumption.

Culture or Technology?

In both cases, novel technological approaches are being adopted in order to solve weighty scientific problems. Yet through my conversations with both organizations, it became apparent that the cultural shift may prove both harder and more significant than the technological one. Scientists are really no different from the rest of us: they want ‘their’ powerful workstation under the desk, and they want to secure (and spend) the budget that fills a corner of the machine room with ‘their’ flashing lights and spinning disks.

Inter-agency and international collaboration further complicates matters, with the horse trading over local jobs, kudos and expenditure frequently trumping the cost-effectiveness and efficiency arguments in favor of some distant data center. As Chris Kemp noted, “We need to move away from cap-ex thinking toward on-demand thinking.” The cloud offers an elastic set of dynamic resources with a real role to play in space and here on Earth, if only we can persuade its potential beneficiaries to give up their fascination with flashing lights and spinning disks.

Paul Miller, Founder, The Cloud of Data