Addressing the growing market for tools that handle very large data sets, Microsoft has released a beta set of technologies, called Dryad, to manage and analyze large amounts of information across a cluster of Windows Servers.
The company has released Community Technology Preview editions of three packages -- called Dryad, DSC, and DryadLINQ -- that will install Dryad on Windows HPC Server 2008 R2 Service Pack 1.
"These technologies allow you to process large volumes of data in many types of applications, including data-mining applications, image and stream processing, and some scientific computations," the Microsoft Windows HPC Team Blog post stated.
First developed by Microsoft Research, Dryad is a platform for running programs across multiple servers. A Dryad-based program can be broken into pieces that run on separate nodes, with the pieces connected by a mechanism similar to Unix pipes.
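The pipe analogy can be illustrated with a minimal single-machine sketch in Python. This is not Dryad's API; the stage names (`read_lines`, `keep_errors`, `extract_host`) and the sample log format are hypothetical, chosen only to show how independent pieces can pass records to one another the way `a | b | c` does in a Unix shell.

```python
from typing import Iterable, Iterator


def read_lines(raw: Iterable[str]) -> Iterator[str]:
    """First stage: emit each record with its trailing newline removed."""
    for line in raw:
        yield line.rstrip("\n")


def keep_errors(lines: Iterable[str]) -> Iterator[str]:
    """Middle stage: pass through only records that mention ERROR."""
    for line in lines:
        if "ERROR" in line:
            yield line


def extract_host(lines: Iterable[str]) -> Iterator[str]:
    """Last stage: emit the first whitespace-separated field (assumed to be a host name)."""
    for line in lines:
        yield line.split()[0]


def run_pipeline(raw: Iterable[str]) -> list:
    """Chain the stages; each consumes the previous stage's output lazily."""
    return list(extract_host(keep_errors(read_lines(raw))))


# Hypothetical sample input, invented for illustration.
SAMPLE = [
    "web01 ERROR disk full\n",
    "web02 INFO started\n",
    "web03 ERROR timeout\n",
]

if __name__ == "__main__":
    print(run_pipeline(SAMPLE))
```

In a distributed system each stage would run on a different node and the connections would carry data between machines, but the shape of the computation is the same: small pieces, each reading one stream and writing another.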
Dryad can be used to analyze log data and other massive collections of non-relational data, much as Google's MapReduce technology does, explained Bob Muglia, Microsoft's president of server and tools, in an interview with IDG reporters.
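For readers unfamiliar with the MapReduce style of log analysis, the following Python sketch shows the general idea on one machine. It is not Dryad or DryadLINQ code; the function names, the round-robin partitioning scheme, and the log format (status code as the last field) are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List


def partition(records: List[str], n: int) -> List[List[str]]:
    """Split records round-robin into n partitions (a stand-in for data spread across nodes)."""
    parts: List[List[str]] = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        parts[i % n].append(rec)
    return parts


def map_partition(lines: Iterable[str]) -> Counter:
    """Map step: count status codes within a single partition."""
    return Counter(line.split()[-1] for line in lines)


def reduce_counts(partials: Iterable[Counter]) -> Counter:
    """Reduce step: merge the per-partition counts into one total."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total


def count_statuses(log: List[str], n_partitions: int = 2) -> Counter:
    """Partition the log, count each partition independently, then merge."""
    return reduce_counts(map_partition(p) for p in partition(log, n_partitions))


# Hypothetical log records, invented for illustration.
LOG = [
    "GET /a 200",
    "GET /b 404",
    "GET /c 200",
    "POST /d 500",
    "GET /e 200",
]

if __name__ == "__main__":
    print(count_statuses(LOG))
```

Because each partition is counted independently, the map step could run on separate servers in parallel, and the result is the same regardless of how many partitions the data is split into.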
Because Microsoft is still testing the software, the beta release works with a maximum of 2,048 partitions and does not support all queries. Microsoft itself uses parts of Dryad for its own online advertising network.