A peek inside Amazon’s cloud – from global scale to custom hardware
- 01 December, 2016 02:13
Amazon Web Services brings on enough new server capacity every day to support all the operations of Amazon, the online retail giant, back when it was an $8.5 billion enterprise in 2005. Every day.
That was just one of the insights that Amazon Web Services' Vice President and Distinguished Engineer James Hamilton shared during an opening night keynote at re:Invent, Amazon’s user conference for its IaaS cloud platform. Hamilton provided an internal glimpse of the operations that run the company’s cloud business, from its global network of 14 regions down to the custom-made silicon that runs its servers, in many cases revealing information that was not previously public.
Some points from Hamilton’s talk:
- The first Amazon re:Invent had 6,000 attendees; this year there are more than 32,000.
- During Amazon’s e-commerce “Prime Day,” the online retail giant – which uses the AWS cloud – spun up tens of thousands of servers to handle the load and spun them down after the event.
- Amazon has 14 regions for its cloud with announced plans for four more next year.
- Each region is composed of at least two availability zones (AZs), but all new Amazon regions are built with at least three AZs, and some have as many as five.
- Each AZ has at least one data center, although some have as many as eight.
- There are 68 points of presence for AWS around the world, all connected through an Amazon-owned, dedicated private network.
- One of Amazon’s newest projects is the Hawaiki trans-Pacific cable, which will span 14,000 kilometers between Australia, New Zealand, Hawaii and Oregon. The company broke ground on the project in New Zealand last week.
- Within an AZ, there are intra-AZ fiber connections inside the data centers, inter-AZ fiber connections between the data centers that make up the AZ, and fiber connections from each data center to one of the AZ’s two transit stations, which connect to the outside world. A typical AZ has 126 unique metro fiber spans.
- Most data centers are 25- to 32-megawatt facilities with between 50,000 and 80,000 servers; some AZs have more than 300,000 servers.
- Hamilton explained that AWS could build larger data centers, but 25 to 32 MW is “a good place to be.” Beyond that size, the efficiency gains from scale are only modest, while the downside grows: when a larger facility fails, it takes more capacity down with it than a smaller one would. Keeping data centers relatively modest in size costs a little more, he added, but Hamilton believes it works well at AWS’s scale.
- Inside the data center, AWS runs its own custom-built hardware, including network routers that have been developed by an internal protocol team, optimized for the company’s use and built to spec.
- The networking gear runs custom-built silicon (Hamilton showed a chip from Broadcom) that has 7 billion transistors and can support up to 128 ports of 25Gb Ethernet.
- AWS runs mostly 25GbE inside its data centers, even though much of the industry has standardized on 40 Gigabit Ethernet. Hamilton says dual 25G Ethernet, which creates a 50G Ethernet connection, is less expensive than a single 40G Ethernet link and offers higher bandwidth.
- AWS custom builds its storage server rack infrastructure too. AWS has evolved from supporting 880 disks per rack to 1,110 disks per standard rack. That’s 11 PB of storage, and 2,278 pounds of disk, per rack. Hamilton noted that those are not even the densest racks in AWS data centers.
- Compute servers are “simple, no-frills” 1RU boxes that are about half-empty to optimize for thermal and power efficiency, which are favored over density. These minor adjustments to standard boxes yield efficiency gains of up to 1% per box, but at AWS’s scale those gains are worthwhile.
- Compute silicon is developed internally by Annapurna Labs, a company Amazon bought in January 2015.
- AWS is committed to being a 100% renewable energy-powered cloud. In April 2015 the company reached 25% of the goal. Its Oregon region is powered completely by renewable energy. One challenge, Hamilton noted, is that AWS’s cloud is constantly growing – by roughly the equivalent of a Fortune 500 company’s IT operation every day – so the amount of renewable energy required is constantly increasing too. Eventually the goal is to bring on about 2.6 million megawatt-hours of renewable energy to power AWS’s cloud.
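
Several of the figures above can be cross-checked with simple arithmetic. The Python sketch below is a back-of-the-envelope illustration, not anything Hamilton presented: the function names are hypothetical, decimal units are assumed (1 PB = 1,000 TB), and the per-server power figure simply divides total facility power by server count, so it also includes cooling, networking and other overhead.

```python
# Back-of-the-envelope checks on figures quoted in the talk.
# Assumptions (not from the talk): decimal units, and facility power
# divided evenly across servers, overhead included.

def switch_capacity_gbps(ports: int, gbps_per_port: int) -> int:
    """Aggregate bandwidth of a switch chip with every port in use."""
    return ports * gbps_per_port

def tb_per_disk(rack_pb: float, disks: int) -> float:
    """Average disk size implied by a rack's total capacity."""
    return rack_pb * 1000 / disks

def watts_per_server(facility_mw: float, servers: int) -> float:
    """Facility power per server, overhead included."""
    return facility_mw * 1_000_000 / servers

if __name__ == "__main__":
    # 128 ports of 25GbE on the Broadcom chip
    print("Switch capacity (Gb/s):", switch_capacity_gbps(128, 25))
    # Dual 25GbE versus a single 40GbE link
    print("Dual 25G vs. 40G bandwidth ratio:", (2 * 25) / 40)
    # 11 PB spread over 1,110 disks
    print("Implied TB per disk:", round(tb_per_disk(11, 1110), 2))
    # Loose bracket: smallest facility with most servers,
    # largest facility with fewest servers
    print("Watts per server, low end:", watts_per_server(25, 80_000))
    print("Watts per server, high end:", watts_per_server(32, 50_000))
```

The storage figure implies disks of roughly 10 TB each, and the dual-25G configuration provides 25% more bandwidth than a single 40G link. The per-server power range is only a loose bracket, since the article does not say which server counts go with which facility sizes.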