IT pro rethinks infrastructure from the ground up, ends up in clouds
- 05 April, 2012 05:39
Mark Adams, vice president of IT at HireRight, is living the dream -- the chance to completely rethink the infrastructure for a $300 million software-as-a-service employment screening company. While the nucleus of the 1,600-employee company has been around for 30+ years, a three-year acquisition spree resulted in data center sprawl, leaving the company with 10 facilities, including company-owned, colocation and disaster-recovery sites, some of them overseas. Now HireRight is three-quarters of the way through a consolidation effort with a heavy emphasis on cloud. Adams gave an update on the company's modernization progress to Network World Editor in Chief John Dix.
Where did you start with this consolidation effort?
We are a software-as-a-service company and we looked at our 10-data-center footprint and asked ourselves, "Do we really need 10?" So we started thinking: if we collapse to this, if we refresh that, if we could go with one Tier 1 storage vendor, what would all of that look like? And as we kept going we finally realized that for HireRight four data centers is the best. In this case, more is not more; less is more.
When did you start the process?
Just about two years ago. And it's still ongoing. We have two different kinds of clouds, a customer facing cloud which supports the software-as-a-service portals, and an internal back-office cloud for operations. So we looked at what's the next step for that? If we could consolidate everything down and go 100% virtual and introduce some of the flashier elements of virtualization -- dynamic processing, being able to move things around, if we could load up our blade servers with lots of memory -- what would that footprint look like? And in our case, it got pretty lean, pretty dense.
On the back-office cloud, were you able to accommodate all the assets you came by in the acquisitions?
Yes. That was helped by the fact that a few years ago we went pretty heavily into virtual desktops. One thing that's unique about HireRight is we are a security-first company, and then we look at performance and up time. We are considered a credit reporting agency, so when we built this thing out, VDI got a lot of our attention. You can secure it well and scale it extremely well. So that's what we built our cloud around, keeping the data in one spot, not allowing it to go anywhere.
Cloud being a fashionable term these days, can I ask you to describe it to ensure we're thinking/talking about the same thing?
That's a great point, because everybody's got a different definition. When we talk about our back-office cloud we're talking about a certain set of tools in the application tier that have to have maximum up time and have to be available from anywhere in the world. So we built a cloud stack around that.
In the traditional model you have people with laptops running around, security teams trying to lock stuff down, VPN connections. We didn't want to go down that path. We wanted to be able to expose one interface to our employees and then control the backend horsepower we throw at it and be able to shift that horsepower around depending upon where the work is at any given time.
So being an international company, we didn't want to be limited to having to VPN into one location. We built it around the idea that, anywhere you are, at any time, with your certain set of security protocols, we'll give you access. The data's going to pretty much stay where it is and not leave. So that's our definition of our internal facing cloud. We have accomplished that through a variety of technologies, but it's been scaling quite well. We've put thousands of workers on it.
How hard was it to get right?
HireRight has used Citrix for a lot of years and because of that we knew going in what does and doesn't work. One of the things we don't really care about is the network speed coming in. We've architected the solution to handle up to 300 milliseconds of latency.
But it took a lot of mapping of business functions and groups, putting people into compartmental boxes -- these guys should see this slice, these guys should see that slice of applications. So I'd say 50% of it is knowing what people need and getting them correctly allocated, while the other 50% is understanding the limitations of the technology and planning accordingly.
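The mapping exercise Adams describes, deciding which groups see which slice of applications, can be pictured as a lookup from group to published applications. The following is only an illustrative sketch, not HireRight's implementation: the group names, application names and the deny-by-default rule are all assumptions made for the example.

```python
# Illustrative only: group and application names are hypothetical,
# not taken from HireRight's environment.
APP_SLICES: dict[str, frozenset[str]] = {
    "screening_ops": frozenset({"workflow", "crm"}),
    "finance": frozenset({"billing", "reporting"}),
}


def published_apps(user_groups: list[str]) -> set[str]:
    """Return the union of application slices for a user's groups."""
    apps: set[str] = set()
    for group in user_groups:
        # Assumption: a group with no defined slice sees nothing (deny by default).
        apps |= APP_SLICES.get(group, frozenset())
    return apps


def can_launch(user_groups: list[str], app: str) -> bool:
    """True if any of the user's groups has the application in its slice."""
    return app in published_apps(user_groups)
```

In a sketch like this, getting people "correctly allocated" amounts to keeping the group table accurate; the technology limits Adams mentions sit outside it.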
One piece of advice for companies getting into it is the devil is in the details. You'll see everything from disk storms on storage arrays to problems getting applications published correctly and questions about how you lock it down correctly. All those were exercises we spent years working on. So we're actually proud of this particular endeavor because it's kind of the culmination of a lot of years of effort.
And then we combined all that with the fact that now we can move things into centralized data centers that are very high density. It's not something for the faint of heart.
Was there much commonality in the platforms used by the different companies HireRight acquired?
You never get "straight out-of-the-box, everything matches." But most everything is x86. Whether it's .Net or Java, it all pretty much runs in a similar technology stack. We are a software-as-a-service company, so we tend to look at things that we write as kind of a Web application tier, so we can bring those things together. We've been fortunate that we've been able to focus on that over the years as our technology stack, so it does make things work pretty well. We also have an Oracle database on the backend, and Oracle has done a lot of great things in the scaling space, so we didn't have to recreate the wheel.
Is part of the goal to reduce the number of components and vendors you're dealing with?
Absolutely. One of the goals was to reduce the cost of administration, and there's not a lot of magic in how you do that. It's simplify, simplify, simplify. If something is complicated, it's going to take a lot of excess work to maintain.
So we looked to take redundancies out of the stack, to narrow it down to a certain set of vendors we felt comfortable with, and we did a lot of bake-offs. Since we are a security-first organization, we pretty much encrypt everything. And we have to move large encrypted transactional databases across the wire. So where a lot of the architecture engineering time went was making sure we had that down, because it's one thing to say you encrypt a few pieces of data, it's another thing to say you have terabytes of data being encrypted across the wire.
That's one area we spent a lot of time working with partners on, looking for the absolute simplest way to do that. I don't know how much time you spend with security guys, but the words "simple" and "encrypt" generally don't go in the same sentence.
What did you end up with after the simplification effort?
Well, on the server side we started out with the four top-tier vendors and baked it down to one. On the storage side, we had three vendors and baked it down to one. In encryption technology, we really only had three or four vendors we could look at and got that down to one. Load balancers, we had two companies we've worked with over the years and narrowed that to one. Virtualization, we looked at three top-tier ones and narrowed that to one. I have to say virtualization was kind of a horse race, actually. Surprisingly enough, the last time I did this four years ago, it wasn't much of a race, but we had several compelling reasons to go either way with the virtualization vendors.
How about application count? Did you whittle that down too?
Some. We do a lot of our own internal development, so for the core applications, the CRM tools, the workflow management tools, some of the stuff we use for internal processes, we've written them ourselves and did the consolidation in the process so we really didn't have to streamline those much.
So where do you stand in your goal to get down to four data centers?
We're 75% done, so we're moving right along on that. It's going quite well. We have primary data centers in the U.S. and the U.K. that are both built and complete, and two DR sites behind that. So we actually are quite literally just at that point where we fire up a couple more blades, provision the storage, move the apps and decommission the old. So a lot of lessons learned, but it's gone pretty smoothly.
How do you use the DR sites?
We use the secondary sites as replication sites for real-time data replication.
Everything per country is duplicated, because we have certain in-country data residence requirements.
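The in-country residency rule can be expressed as a simple constraint: a primary site replicates only to a DR site in the same country. The sketch below is a hedged illustration of that constraint, not a description of HireRight's tooling. The site names are hypothetical, and pairing exactly one DR site with each country is an assumption drawn from the two primary and two DR sites mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str      # hypothetical label, not a real facility name
    country: str
    role: str      # "primary" or "dr"


# Assumed layout: one primary and one DR site per country.
SITES = [
    Site("us-primary", "US", "primary"),
    Site("us-dr", "US", "dr"),
    Site("uk-primary", "UK", "primary"),
    Site("uk-dr", "UK", "dr"),
]


def replication_target(primary: Site, sites: list[Site] = SITES) -> Site:
    """Pick the DR site in the same country as the given primary site."""
    if primary.role != "primary":
        raise ValueError(f"{primary.name} is not a primary site")
    for site in sites:
        # Residency rule: replicated data must stay inside the country.
        if site.role == "dr" and site.country == primary.country:
            return site
    raise LookupError(f"no in-country DR site for {primary.name}")
```

Raising an error when no in-country DR site exists, rather than falling back to a site abroad, is the design choice that keeps the residency rule from being bypassed silently.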
Do you own these four centers?
We will own two of the four and the other two will be colocation sites.
How about some specifics on suppliers?
For server virtualization we use VMware ESXi. For virtual desktop, which is used on our internal facing cloud, we're using VMware VDI 5.0. And then for the server stack we have a combination of Linux Red Hat and Windows, depending upon the function. We do mostly Oracle, as I mentioned, but we do have some Windows SQL databases for reporting.
Are all your desktops virtual or just a subset?
Right now it's a subset. I'd say we're about 55% virtual. We do have whole offices on VDI, though. Our roadmap is to have pretty much everybody on a virtual solution in the next year and a half to two years. Our work-at-home teams are all on virtual.
You said you use VMware for VDI but mentioned Citrix before. Do you use both?
We previously used Citrix and transitioned over to VMware.
Are you doing it with thin clients or just streaming applications down to devices?
We do one of two flavors. We use either a thin client application or, in a lot of cases, we've deployed a hardware thin client. They've made some advances in recent years to support dual monitors. A few years back I tried deploying those and a lot of people didn't like them because they were limited, but the newer versions are pretty slick.
Who is the supplier?
We use Wyse. It works pretty well.
Did you try to repurpose any existing laptops as thin clients?
We have, actually. We repurposed a few. We used the Wyse virtual desktop accelerator modules to help us in high-latency situations and it works pretty well. If the PC has still got good life left in it, we'll use it, and then once we're done with that, one of our information security initiatives is to go more with the Wyse models because they don't have a hard disk.
How about on the server side. What did you end up with there?
We are a Cisco UCS shop, but I'll tell you that was a really interesting bake-off. We were an HP shop. We had strong HP loyalty.
What swayed you the other way?
I'd say two things. There's the technology stack. UCS is very dense in terms of memory footprint which helps for virtualization, so we saw a lift from that. But frankly it was the effort Cisco put in. They put a lot of engineering time against it to ask us what our applications did, what kind of CPU power we needed, what kind of memory footprint we had, and took the time to engineer it so we didn't have to overbuy.
Our primary U.S. data center is down in Nashville, and we put a lot of density in there because we realized it's not about the amount of square footage you have, it's how you use the footage. And we asked, how high can we stack? UCS can stack pretty high if you do it correctly, so we can get quite a few blades in each cabinet. Cisco spent I don't know how many hours going through the architecture on that stuff and helping us figure out the right solution.
How many servers did you end up with?
We generally don't publish that number, but a fair amount.
And do I understand you ended up with EMC on the storage side?
Yeah, EMC pairs pretty well with UCS. EMC put in a lot of effort helping us architect the right solution for VDI. With VDI it's not necessarily about how many spindles you have going, it's about how the data is cached in memory. So we did work with EMC, VMware, Cisco, Brocade, all these guys came together because we run a decent size VMware VDI desktop implementation, so they were very helpful in getting it right. And on top of that, we have encryption engines running across everything, so I'd say from a technology stack infrastructure perspective, it really was that kind of perfect blend of speed, virtualization, high up-time and high density.
So the data centers are finished, but how far along are you in the UCS build out?
We have a lot of UCS chassis installed. As far as the actual migration, we're about 75% done. So that's why we feel confident at this point that we're seeing the returns that we've spec'd out. It's working quite well, and I'm very pleasantly surprised at just how far we can go. We started this project expecting to go from 10 to six data centers. But as we looked closely at it and at some of the redundancies we could eliminate, it became apparent we could go from 10 to four. We can't go much farther below four because we have to keep an international presence and everybody has to have a DR site, so it was kind of the bare minimum.
Any reservations about going with Cisco given they are relatively new to the whole server game?
There were. It's Cisco, so you have a big name backing it, you feel comfortable from a financial perspective. But to your point, they were a recent entry into the server market and companies like HP and Dell have a pedigree there. From a HireRight perspective we sat back and said, prove it to us. There had been a lot of wins for Cisco in the bigger hosting spaces, but I wanted to actually see it. So we went through several months' worth of design and then site visits, having them show us what the capabilities are.
It took a little bit of getting our head around the benefits of some of the virtual profiles. There are a lot of things you can do that, I'll be frank, out of the gate, were a little bit puzzling to some engineers that had worked in the traditional space. But to their credit, Cisco had an endless amount of patience. And then before we go live with any data center we do pre-flight testing. I joke with the engineers that it's a lot like when you build a plane. Eventually at some point Boeing has to put a team of engineers and a pilot on that plane and say -- good luck, we'll see you when it gets back.
But to answer your question, it took some convincing. But I am very happy with the decision we made.
Are you mostly a Cisco network shop as well?
Yes, we are. With the exception of our load balancers and some of our firewall gear, we run mostly Cisco.
So you're all in.
Yeah, pretty far in.
I imagine that gives you a little pause as well.
It's an interesting question. We do keep an eye on that. But we have Brocade doing a lot of our 10-gig encryption. And F5 does our load balancing. I won't get into the firewall vendors, but let's just say we maintain multiple vendors just for this reason. It keeps them honest.
So on the other side of your business, the cloud service you provide to customers, is that supported by this all as well?
Yes, we have a similar type of infrastructure but split out into different security zones, etc. A combination of Java and .NET stacks, load balanced and secured.
How many customers do you have living on that?
Almost 50,000. And that's companies, so then each one of those has X number of users. Our requirements are seven days a week. Every two weeks we have our software release process, but other than that it's got to be up.
It must have been pretty hairy to do everything you've done while meeting those stringent requirements.
It's a little bit of a change-the-engine-on-the-plane-while-it's-flying routine. So, yeah, it's been a fun project. It's kind of an IT dream though, I think. We all like to build stuff.
Will you share any cost numbers around the effort?
For contractual reasons, I can't get into the prices we paid to the various vendors, but it was a several-million-dollar project, though under $10 million.