MySpace to open source in-house data analysis technology
- 16 September, 2009 05:36
MySpace on Tuesday will release as open source a technology called Qizmt, which it developed in-house to mine and crunch massive amounts of data and to generate friend recommendations on its social-networking site.
Qizmt is a distributed computation framework based on the MapReduce programming model for processing large data sets in processor clusters.
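To illustrate the MapReduce model in the context of friend recommendations, here is a minimal single-process sketch in Python. It is not Qizmt's API (Qizmt is written in C# and runs across a cluster), and the toy graph and the names `FRIENDS`, `map_phase`, `shuffle`, `reduce_phase` and `recommend` are all hypothetical. The map step emits every pair of people who share a mutual friend, the shuffle step groups identical pairs, and the reduce step counts them.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy friendship graph (illustrative only, not MySpace data).
FRIENDS = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"bob", "cat"},
}


def map_phase(user, friends):
    """Emit ((a, b), 1) for each pair of `user`'s friends.

    Every such pair has `user` as a mutual friend.
    """
    for a, b in combinations(sorted(friends), 2):
        yield (a, b), 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts for one pair to get its number of mutual friends."""
    return key, sum(values)


def recommend(graph):
    """Return {(a, b): mutual_friend_count} for pairs who are not yet friends."""
    emitted = (
        kv for user, friends in graph.items() for kv in map_phase(user, friends)
    )
    counts = dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())
    return {
        pair: n for pair, n in counts.items() if pair[1] not in graph[pair[0]]
    }


if __name__ == "__main__":
    print(recommend(FRIENDS))
```

In a real cluster, the map and reduce functions would run in parallel on different machines and the framework would handle the shuffle; the sketch keeps everything in one process to show only the data flow.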
The company hopes that the developer community will be able to benefit from Qizmt, as well as enhance and extend it, said Hala Al-Adwan, MySpace's vice president of data.
Qizmt has been in use at MySpace for about a year, powering the site's "People You May Know" feature, which makes friend recommendations to users.
MySpace hopes to extend Qizmt's use to other types of recommendations, such as suggestions for movies, books and products to buy, Al-Adwan said. She will demo Qizmt on Tuesday at Computerworld's Business Intelligence Perspectives conference in Chicago with MySpace COO Mike Jones.
"We want to look at our entire data set and start exploring the realm of social analytics. We're really trying to understand the nature of our users' behavior and the relationship with each other and with content as well," Al-Adwan said.
Before creating Qizmt in-house, MySpace looked at available options for large-scale data processing in near real time, but didn't find anything it considered suited to its needs and to its .Net-based development platform, she said. Qizmt was developed using C# .Net for Windows.
In June, MySpace released as open source another internally developed tool called MSFast for tracking the performance of Web sites.