We were last in Cisco's new data center in Allen, Texas, in the fall of 2010, when the company was just putting the finishing touches on the 160,000-square-foot building with 35,000 square feet of "raised floor" (they still use that lingo even though this facility doesn't use raised floors).
This data center, the crown jewel in the company's far-reaching Global Data Center Strategy to consolidate and modernize core facilities, was brought online July 7, 2011, and we recently stopped back for an update (see our in-depth tour of the site under construction, or in pictures).
The Allen data center plays a critical role in the company's Cisco IT Elastic Infrastructure Services (CITEIS) private cloud, and is paired with a data center in Richardson, Texas, using Cisco's Metro Virtual Data Center (MVDC) architecture. MVDC enables one center to cover for the other, a fail-safe approach Cisco is using to safeguard critical applications.
Allen is 25% built out today, says James Cribari, Manager, Information Technology Global Infrastructure Services. "When we first opened, we had seven clusters of UCS (which is about 280 blades, 500 VMs), our installed raw storage was 1.5 petabytes, and our UPS critical load was 9%. Now it's 30% and we've grown to 45 clusters (1,192 blades, 4,000 VMs), and we have more than five petabytes of raw storage."
As built, the facility can house up to 250 UCS clusters, but that number is likely to climb as new technologies emerge and Cisco ramps up server density.
Many of the original data center design expectations are delivering as promised. The air-side economizer, for example, was supposed to reduce the need for air conditioning by using outside air to cool the center 51% of the time, saving some $600,000 in electricity costs. So far, the outside temperature has allowed them to use ambient air 56% of the time, Cribari says. (They keep the data center at 78 degrees Fahrenheit.)
The Power Usage Effectiveness (PUE) goal for the data center was 1.35, which is better than what Cisco has achieved to date, but the gap is to be expected. "We're probably sitting somewhere around 1.55 PUE today, but that's because we're not at the load that would be the optimum from an energy standpoint," Cribari says.
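For readers unfamiliar with the metric, PUE is the ratio of total facility power to the power delivered to IT equipment, so lower is better and 1.0 is the theoretical floor. The sketch below works out what the gap between the 1.35 goal and the roughly 1.55 Cribari cites means in overhead terms. The 1,000 kW IT load is a made-up round number chosen for readability, not a figure from Cisco.

```python
def facility_power_kw(it_load_kw: float, pue: float) -> float:
    """Total facility draw implied by an IT load and a PUE.

    PUE = total facility power / IT equipment power.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_load_kw * pue


def overhead_kw(it_load_kw: float, pue: float) -> float:
    """Power spent on cooling, power distribution and other non-IT loads."""
    return facility_power_kw(it_load_kw, pue) - it_load_kw


# Hypothetical IT load, used only to make the arithmetic easy to follow.
IT_LOAD_KW = 1000.0

if __name__ == "__main__":
    for label, pue in [("design goal", 1.35), ("approximate current", 1.55)]:
        print(f"{label}: PUE {pue:.2f} -> "
              f"{overhead_kw(IT_LOAD_KW, pue):.0f} kW overhead "
              f"per {IT_LOAD_KW:.0f} kW of IT load")
```

Under that assumed load, the difference between the two PUE values amounts to about 200 kW of non-IT overhead.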
And the use of MVDC to pair Allen and Richardson to safeguard core applications is also going well, he says.
Not all applications, of course, need MVDC support, so the first step in rolling out that technology was to classify the criticality of the company's 1,500 applications, says Kirti Thakkar, a Cisco Information Technology Engineer.
They came up with five classifications, C1 to C5, with C1 requiring 99.999% availability (no acceptable downtime), C2 requiring 99.995% availability (with acceptable recovery time up to an hour), C3 requiring 99.99% availability (4 hours acceptable recovery time), C4 requiring 99.9% availability (24 hours of recovery time), and C5 with 99.9% availability and best effort recovery.
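To put those percentages in perspective, an availability target can be converted into the downtime it permits over a year. The sketch below does that conversion for the five classes as listed above. The function and constant names are illustrative, and the calculation assumes a 365-day year. The recovery-time targets are a separate measure and are not modeled here.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored

# Availability targets for classes C1 to C5, as given in the article.
AVAILABILITY_PCT = {
    "C1": 99.999,
    "C2": 99.995,
    "C3": 99.99,
    "C4": 99.9,
    "C5": 99.9,
}


def downtime_minutes_per_year(availability_pct: float) -> float:
    """Maximum yearly downtime permitted by an availability percentage."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (100.0 - availability_pct) / 100.0


if __name__ == "__main__":
    for tier, pct in AVAILABILITY_PCT.items():
        minutes = downtime_minutes_per_year(pct)
        print(f"{tier}: {pct}% -> {minutes:.1f} minutes per year")
```

By this arithmetic, 99.999% leaves a little over five minutes of downtime a year, while 99.9% leaves just under nine hours.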
Cisco.com, for example, is a C-1 application, Thakkar says, as is authentication. "Of all of the apps, probably 10% to 15% are C-1s. There are a few C-5s, but most others fall in C-2 through C-4."
For Cisco.com, "external requests go to either Richardson or Allen," Thakkar says. "The user doesn't know the difference. Once someone hits a given DC, we try to keep them in that DC."
With MVDC, the two data centers are "running in an active/active scenario on the Web and app layer, and for the database we have an active standby, so if we see an unplanned outage, the Oracle Observer sees one is not available [and] will shift everything to the other and achieve zero data loss," Thakkar says. (The sites are linked via a 400Gbps fiber ring.)
Besides the fail-safe benefits, MVDC has business benefits, Thakkar says: "We see a lot of improvement on the operation side where we plan outages. So we have www1.cisco.com sitting in Richardson, and www2.cisco.com sitting here in Allen. If you need to do any maintenance on www1, we can go to our global site-select, which is our load balancer, and take www1 offline or suspend it for minutes or hours in order to do the maintenance, and then bring it back online."
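The behavior Thakkar describes (requests landing at either site, users staying in the data center they first hit, and a site being suspended for maintenance) can be illustrated with a toy model. This is a minimal sketch, not Cisco's load balancer or its API: the `SiteSelector` class, its method names and the round-robin choice are all assumptions made for illustration.

```python
import itertools


class SiteSelector:
    """Toy model of active/active site selection with session stickiness.

    Hypothetical: this does not represent any Cisco product or API.
    """

    def __init__(self, sites):
        self.sites = list(sites)
        self.suspended = set()
        self.sticky = {}  # client id -> site the client was last sent to
        self._counter = itertools.count()

    def active_sites(self):
        """Sites currently in rotation, in their configured order."""
        return [s for s in self.sites if s not in self.suspended]

    def suspend(self, site):
        """Take a site out of rotation, e.g. for planned maintenance."""
        self.suspended.add(site)

    def resume(self, site):
        """Put a site back into rotation."""
        self.suspended.discard(site)

    def route(self, client_id):
        """Return the site that should serve this client."""
        active = self.active_sites()
        if not active:
            raise RuntimeError("no site available")
        pinned = self.sticky.get(client_id)
        if pinned in active:
            return pinned  # keep the client where it first landed
        # New client, or its pinned site is suspended: pick round-robin
        # among the remaining sites (an assumed policy) and re-pin.
        site = active[next(self._counter) % len(active)]
        self.sticky[client_id] = site
        return site


if __name__ == "__main__":
    selector = SiteSelector(["richardson", "allen"])
    print(selector.route("client-a"))  # first request picks a site
    print(selector.route("client-a"))  # repeat request stays put
    selector.suspend("richardson")     # planned maintenance window
    print(selector.route("client-a"))  # served from the remaining site
    selector.resume("richardson")
```

In this model a client moved during maintenance stays at its new site after the original one resumes. Whether Cisco's setup behaves the same way is not stated in the interview.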
Given that the two data centers are in the same region, Cisco also has a bigger-picture disaster recovery plan that involves a remote data center, this one in Raleigh, N.C.
The Raleigh facility serves a dual purpose. It is an AppDev environment where Cisco developers can use UCS hardware and Nexus technology, but in the event of a disaster "we can actually change the service profile in all those UCS clusters from application development to production" and reroute traffic so that the "data center would go from AppDev to DR within 24 hours."
It actually takes Raleigh less than a minute to kick in, Thakkar says, so the 24-hour count is the outside number for all applications to be up and running in RTP (Research Triangle Park).
How often do unplanned outages occur? "We had one or two incidents when we had a fiber go down," Thakkar says, "but we had the signal relayed to another ring so there was no outage."
From a capacity management standpoint, things have also gone as expected, Cribari says. "We knew that first generation of UCS could only support up to 96GB per server. Now we can go the whole way up to 384GB, 512GB, 768GB, and those higher memory blades allow us to be more dense, and we took that into consideration. We also planned the structured cable plant to go from 1-Gig to 10-Gig and eventually to 40-Gigs, and we're going to be able to do that without requiring any additional construction."
The maximum kilowatts per rack for the center is 21kW, and "we're averaging anywhere between 8kW and 13kW," Cribari says. "UCS server racks are probably seven to eight, Nexus network equipment and some of the higher-density network components are in the 13kW range."
"In the art of data center capacity management, the goal is to fill it up, but you don't want to fill it up to the point where it's inefficient," he explains. "The trick is to make it so your power, your space, your network ports and everything else maximizes at the same point."
How long before Allen fills up? "Three to five years, based upon the trajectory we're on," Cribari says. "When we hit 80% in this data center, we'll probably start talking about expanding into Phase Two. We have acreage next door, we can add another 5.25 megawatts, so we think we can double this footprint. In fact we already have some of the infrastructure in place -- some of the pads are already there, some of the conduits are already there."
Read more about data centers in Network World's Data Center section.