Amazon and Annapurna Labs' experience with software development models such as Agile led them to transfer those learnings into silicon innovation.
That idea, together with the experience of continuously optimizing open-source software, prompted Amazon to embark on several initiatives to enhance its infrastructure and platform offerings with purpose-built chips. These initiatives, covering system-level chips, processors, ML accelerators, and storage, were presented at AWS Silicon Innovation Day 2022, held on August 3rd, 2022. Cloud providers have also delivered efficiencies through software changes; a note on software optimization appears at the end of this post. This blog covers these silicon-level capabilities and the value they add to cloud adoption, followed by the implications for customers, investors, and competitors, and a summary.
In the early stages of cloud computing, vendors used commercial off-the-shelf products for virtualized computing, storage, and network infrastructure, delivered with standard API access. However, commercial products designed for traditional data centers proved inefficient for large cloud data centers. Cloud vendors started using redesigned servers that consolidated redundant components to optimize the cloud-based delivery of infrastructure resources. Amazon began redesigning its servers and hardware, adopting KVM instead of Xen as the hypervisor. This redesign has allowed AWS to give customers near bare-metal performance while offering the best price-performance.
Optimization required massive architectural changes for better scalability and performance. Increasing customer demand for cloud resources required better CPU utilization and processor performance. Amazon turned to Annapurna Labs (see the note on the acquisition below) to accelerate silicon innovation for general-purpose cloud efficiencies and for artificial intelligence, using systems on a chip (SoCs). The joint effort led to the introduction of Amazon's Nitro System, which offloaded some processes to hardware and allowed for more efficient operation. Amazon's EC2 platform gave customers efficient processes, agile computation, and premium performance at the best price. Amazon's silicon team introduced workload-optimized chips to beat the production cost and time of outsourcing chip development to companies like NVIDIA and Intel. While other vendors have partnered to offer custom chips designed for cloud solutions, Amazon, through the Annapurna acquisition, is unique in owning the end-to-end process of custom chip development.
The following offerings from Amazon bring efficiencies to applications at the silicon level (more details can be found in the links embedded in each paragraph header):
AWS Graviton: First announced in 2018 for general-purpose workloads, the AWS Graviton processor has since evolved, with new versions optimized for compute, memory, and storage applications. Graviton2 was announced in 2019 and launched in 2020, and Graviton3 was announced in 2021 and launched in 2022, each improving on its predecessor in performance and energy consumption. Graviton customer testimonials come from a range of industries, from Cadence in electronic design to DIRECTV in entertainment. Many AWS partners use Graviton processors to gain efficiencies that improve their competitiveness or profitability in serving customers.
AWS Inferentia: In contrast to Graviton, AWS Inferentia is an application-specific integrated circuit (ASIC) built for machine learning inference workloads. The volume of machine learning workloads justifies the investment in an ASIC, much as ASICs are used in network cards. An example of a large-scale need for GPU-based machine learning inference is Amazon's Alexa, which connects more than 100 million devices. For large workloads like Alexa that process billions of inference requests every week, the cost savings and improved customer experience make a custom chip worthwhile.
AWS Nitro System: Annapurna Labs helped offload the virtualization functions to dedicated hardware and software so that all server resources can be used to run customer instances. The Nitro System handles storage access, encryption, security, networking, and monitoring to deliver better performance. By moving these functions onto dedicated Nitro Cards, the system frees up computational power, memory, CPU performance, and networking throughput for customer workloads. The Nitro System also provides a lightweight hypervisor, the Nitro Hypervisor, and AWS Nitro Enclaves for isolated compute environments. The Nitro Security Chip and NitroTPM offload security functions and Trusted Platform Module functionality.
Nitro SSD: Customers want storage drives that deliver higher throughput and optimized performance at lower cost to meet their TCO objectives; these requirements for latency, performance, and cost motivated the Nitro SSD. Solid-state drives (SSDs) store high-capacity data with low latency, providing high performance for users. The Nitro SSD is a storage device that persistently maps data to flash memory while running on Amazon-designed silicon, allowing it to provide low latency and high throughput. Nitro SSDs are an opportunity to improve security, reliability, and throughput at cloud scale. By building its own Nitro SSDs, Amazon can fix customer issues faster than it could with SSDs outsourced to other vendors.
Implications for customers: Enterprise digital transformation efforts give rise to many types of workloads, ranging from those that need a high level of security to applications requiring artificial intelligence/machine learning (AI/ML) accelerators. By offering an entire suite of silicon, from base processors to accelerators, AWS delivers innovation at every compute level, giving customers more options. A choice of processors with better security and lower cost lets customers experiment with different instances and match the available silicon options to their workloads to optimize the return on cloud investments. Retail customers that tend to shun AWS over competitive concerns should reconsider it for specific workloads.
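To make that evaluation concrete, here is a minimal sketch of the kind of price-performance comparison a customer might run when matching workloads to silicon options. The instance labels, prices, and throughput figures are hypothetical placeholders, not actual AWS pricing or benchmark data.

```python
# Illustrative price-performance comparison across instance options.
# All prices and throughput figures below are hypothetical placeholders,
# not real AWS pricing or benchmark results.

def price_performance(price_per_hour: float, throughput: float) -> float:
    """Return throughput per dollar-hour; higher is better."""
    return throughput / price_per_hour

# Hypothetical candidates: an x86 option vs. a Graviton option.
candidates = {
    "x86_instance":      {"price": 0.170, "throughput": 1000.0},
    "graviton_instance": {"price": 0.145, "throughput": 1050.0},
}

best = max(
    candidates,
    key=lambda name: price_performance(
        candidates[name]["price"], candidates[name]["throughput"]
    ),
)
print(best)
```

In practice, a customer would substitute measured benchmark throughput for their own workload and the current on-demand or reserved price of each candidate instance family.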
Implications for competitors: The availability of silicon accelerators will drive customer decisions in choosing the most appropriate cloud provider. Competitors, especially hyperscalers like Google Cloud Platform and Microsoft, need to offer differentiators at the silicon level to support new application types. Other hyperscale vendors also offer AMD, Intel, and Nvidia alternatives. Microsoft has built custom chips to strengthen laptop security and is reportedly working on server-side silicon. Google has partnered with SkyWater Technology and Efabless to provide an open-source offering that lets developers create silicon designs. Oracle has partnered with Ampere Computing to offer an 80-core Arm server.
With silicon innovations moving into on-premises implementations, traditional on-premises vendors should watch emerging customer requirements and offer relevant solutions.
Implications for investors: As customers accelerate digital transformation and choose cloud providers that offer silicon-level optimization, revenue growth will tilt in favor of silicon innovators. Cloud vendors that combine support from traditional chip vendors like Intel and Nvidia with their own end-to-end silicon offerings will have a larger addressable market and a competitive advantage. Investors should also pay attention to cloud vendors building chip design capabilities in-house.
Investors should also watch chip manufacturers like AMD, Intel, and Nvidia: their revenue growth could be threatened if hyperscale cloud vendors increasingly build custom chips and customers adopt them.
Summary:
Optimizing server workloads with hardware delivers significant improvements across the board. According to James Hamilton, speaking at the re:Invent conference in 2016, offloading to hardware yields roughly one-tenth the latency, one-tenth the power consumption, and one-tenth the cost. When Amazon Retail saw rapid growth in demand, Amazon acquired Kiva, a robotics company, to automate its warehouses. Similarly, with increased demand for infrastructure resources, Amazon acquired Annapurna Labs to meet its data center needs. Amazon Web Services has capitalized on in-house chip design to build unique capabilities for customers. In conclusion, Amazon has taken data center capabilities to a level that competitors will find hard to match.
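Taken at face value, Hamilton's figures are order-of-magnitude multipliers. A back-of-the-envelope sketch makes the scale concrete; the baseline numbers below are arbitrary placeholders, not measurements.

```python
# Back-of-the-envelope illustration of the "roughly one-tenth" claim
# for hardware offload. Baseline values are arbitrary placeholders.
OFFLOAD_FACTOR = 0.1  # latency, power, and cost each drop to ~1/10th

baseline = {"latency_ms": 50.0, "power_w": 200.0, "cost_usd": 1.00}
offloaded = {k: v * OFFLOAD_FACTOR for k, v in baseline.items()}

print(offloaded)  # each metric is one tenth of its baseline
```

At fleet scale, multipliers like these compound: a tenth of the cost and a tenth of the power per unit of work is what makes custom silicon worth the up-front design investment.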
Notes on Software Optimization of Infrastructure
Optimizing IT resources is a perpetual challenge for applications with varying compute, storage, and network requirements. Centralizing IT resources at cloud providers gave vendors deep insight into infrastructure utilization and visibility into opportunities for improvement. Standardizing resource consumption made it possible to squeeze waste out of resource consumption, an effort that began with software optimization.
When Amazon announced Lambda in 2014, the service provided security by isolating customer workloads in dedicated EC2 instances. This approach was less efficient and, given Lambda's very low pricing, perhaps less profitable for Amazon. Requiring each customer to run in dedicated instances also limited very rapid scaling. Firecracker, an open-source KVM virtualization technology released in 2018, brought resource efficiency through microVMs and security through workload isolation. Because a microVM spins up quickly, it also improves the ability to scale rapidly.
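As a sketch of how small a Firecracker microVM definition is, the JSON below shows a minimal configuration of the kind that can be passed to the firecracker binary via its --config-file option. The kernel and rootfs paths are placeholders, and the exact field names should be checked against the Firecracker documentation for the version in use.

```json
{
  "boot-source": {
    "kernel_image_path": "/path/to/vmlinux",
    "boot_args": "console=ttyS0 reboot=k panic=1"
  },
  "drives": [
    {
      "drive_id": "rootfs",
      "path_on_host": "/path/to/rootfs.ext4",
      "is_root_device": true,
      "is_read_only": false
    }
  ],
  "machine-config": {
    "vcpu_count": 1,
    "mem_size_mib": 128
  }
}
```

The small memory footprint and fast boot path defined here are what let a service like Lambda pack many isolated workloads onto a host and launch them quickly.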
As customer requirements evolved, vendors offered further efficiencies by accelerating execution at the chip level for specific software needs.
Notes on Annapurna Labs Acquisition
Amazon acquired Annapurna Labs, an Israeli start-up, to create systems on a chip (SoCs) using a fabless model. An SoC is an integrated circuit that combines different computing components on a single platform and can be used in multiple applications; SoCs are typically tailored to specific performance requirements. Acquiring Annapurna Labs gave Amazon an edge against its prime competitors in the cloud space and an opportunity to remove the performance and capacity restrictions of EC2, which was running on Xen, an open-source hypervisor. With custom chips, Amazon targets two main tech markets, cloud computing and AI/ML computation, with the motivation of addressing customer needs in silicon. The five main chips from Annapurna Labs are Amazon Graviton, Amazon Graviton2, Amazon Graviton3, the Nitro Chip, and Amazon Inferentia.
Final note: Dhanyashree Prem Sankar, a student at Purdue, contributed to this piece with observations on Nitro and Annapurna Labs.