Open source StarCluster shines on Amazon cloud

Dynamic computing allocation possible

A new open source project dubbed StarCluster has been released with the aim of simplifying the management of virtual clusters hosted on Amazon’s Elastic Compute Cloud (EC2) service.

According to developer Justin Riley, StarCluster minimises the administrative overhead associated with obtaining, configuring, and managing a traditional computing cluster used in research labs or for general distributed computing applications.

The StarCluster project started at MIT’s Software Tools for Academics and Researchers (STAR) Program.

A first beta release, StarCluster 0.90, was posted on the Python Package Index (PyPI) last week and on Freshmeat.net yesterday.

StarCluster consists of a library and a set of scripts that interface with EC2 to automate the creation (and deletion) of clusters of virtual machines, so users pay only for the time used.

For end users, the scripts are the main interface. They provide options for getting started with distributed computing on EC2, such as starting and stopping clusters and managing software configurations.

StarCluster also has an API which provides an interface to EC2 for manipulating nodes, executing commands on nodes and copying files among nodes.

A configuration file provided by the user (including EC2 account details) specifies the cloud resources to request from Amazon (number of machines, instance type). StarCluster then automatically configures the Linux machines with a queuing system, an NFS-shared /home directory, password-less SSH access, OpenMPI, and about 140GB of disk space.
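To give a feel for what such a configuration-driven request might look like, here is a minimal Python sketch. The section names, key names and the load_cluster_request helper are hypothetical and chosen for illustration only; they are not StarCluster's actual file format or API.

```python
import configparser

# Hypothetical config text: the section and key names below are illustrative
# assumptions, not StarCluster's real configuration format.
EXAMPLE_CONFIG = """
[aws]
access_key_id = YOUR_ACCESS_KEY
secret_access_key = YOUR_SECRET_KEY

[cluster]
size = 4
instance_type = m1.small
"""


def load_cluster_request(text):
    """Parse config text and return the cluster resources it asks for."""
    parser = configparser.ConfigParser()
    parser.read_string(text)

    size = parser.getint("cluster", "size")
    if size < 1:
        raise ValueError("cluster size must be at least 1")

    return {
        "size": size,
        "instance_type": parser.get("cluster", "instance_type"),
        # Only check that both credentials are present; never print them.
        "has_credentials": parser.has_option("aws", "access_key_id")
        and parser.has_option("aws", "secret_access_key"),
    }


request = load_cluster_request(EXAMPLE_CONFIG)
print(request["size"], request["instance_type"], request["has_credentials"])
```

The point of the sketch is the division of labour the article describes: the user states how many machines and which instance type are wanted, and the tool takes care of everything that follows.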

StarCluster comes with a public Amazon Machine Image (AMI) on EC2 that includes the software stack for distributed computing.

The AMI is based on Ubuntu 9.04 (i386 and x86_64) and also includes the Sun Grid Engine software and Python libraries for scientific computing.

StarCluster is targeted at computational research labs and at classrooms with computational requirements.

“StarCluster is a way for graduate students and faculty to have an on-demand cluster,” according to the project. “This means students can access their research with the same hardware and software configurations wherever they go; even if they move to another institution.”

“It also removes the majority of system administration concerns since the initial setup procedures have been captured in StarCluster and in the user's software configurations. With this model there is also the benefit that if hardware problems occur it's easy to request a new set of machines in the cloud.”

Planned features include support for multiple clusters and the dynamic resizing of EC2 clusters where nodes would be launched, added to the cluster, used for computation, and removed when they're idle.
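The dynamic resizing idea can be illustrated with a small Python sketch. This is not StarCluster code: the plan_resize function, its parameters and the default limits are all assumptions made for illustration, showing one simple policy for deciding when to launch nodes and when to remove idle ones.

```python
import math


def plan_resize(queued_jobs, busy_nodes, idle_nodes,
                jobs_per_node=1, min_nodes=1, max_nodes=10):
    """Return how many nodes to add (positive) or remove (negative).

    Hypothetical policy: add nodes when queued work exceeds idle capacity,
    remove surplus idle nodes otherwise, staying within the size limits.
    """
    current = busy_nodes + idle_nodes
    # Number of nodes the queued jobs would occupy.
    needed = math.ceil(queued_jobs / jobs_per_node)

    if needed > idle_nodes:
        # Not enough idle capacity: grow, but never beyond max_nodes.
        to_add = needed - idle_nodes
        return min(to_add, max(0, max_nodes - current))

    # More idle nodes than queued work: shrink, but keep at least min_nodes.
    surplus = idle_nodes - needed
    return -min(surplus, max(0, current - min_nodes))


print(plan_resize(queued_jobs=5, busy_nodes=2, idle_nodes=1))
print(plan_resize(queued_jobs=0, busy_nodes=1, idle_nodes=3))
```

A real implementation would also need to account for node start-up time and for EC2's billing granularity before removing an idle machine.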
