Table of Contents
- In the Beginning
- To the Cloud… and Beyond
- Why the Edge?
- Power Consumption
- Feedback Data
- Exceptions to the Edge Model
- Where Does the Edge Begin and End?
- How Far Can it All Go?
- Compatibility and Middleware
- About Byron Reese
Artificial intelligence (AI), primarily in the form of machine learning (ML), is making increasing inroads into our lives. There are several primary reasons for this:
- The rapidly increasing capability of computers used to build and train ML models.
- Greater data-capturing ability across the compute environment, often in the form of inexpensive sensors embedded in everyday consumer, business, and industrial products.
- The development of new algorithms and approaches that improve the accuracy of ML applications.
- The creation of software toolkits that make building and training ML applications substantially easier, and therefore less expensive.
In addition to these four ‘truths’, two other factors are often overlooked, yet they are equally important in bringing AI into our lives. These factors are not about where AI models are built and trained, but where they are deployed and used:
- A reduction in the cost, and an increase in the performance, of chips that perform AI inference “at the edge.”
- The development of middleware allowing a broader range of applications to run seamlessly on a wider variety of chips.
It is these final two developments that will allow AI to enhance our lives in countless new ways and enable AI in our pockets, cars, houses, and a host of other places. This report explores these two factors, setting aside how AI is built and trained and focusing instead on how AI is deployed and how it impacts our lives. It explores the natural architectural migration of AI from central, powerful computers, where an AI algorithm or application has historically been built, trained, and used, to an edge model. In the edge model, the AI compute happens either on a user device or somewhere in the network stack beneath the traditional cloud, perhaps on an edge server.
This leads to a new AI model that is match-fit for what is to come, one that separates two activities. Building and training will mainly continue on ever more powerful (and power-hungry) cloud-based computers. Inference will be performed at the device edge, or close to it, where the AI will run on ever more powerful (but less power-hungry) chips. This foundational change in the AI architecture will be the single biggest driver in the advance of AI at scale.
This new architecture has several advantages over a highly centralized or cloud model, specifically:
- More scalable
- Lower cost
- More secure
- Lower power
There are tradeoffs in this approach, including the fundamental constraints of the chipset and limits on future upgradability. Further, there are still several outstanding questions about this shift that only time will answer:
- How far toward the edge will AI compute ultimately be pushed?
- Which chip design price/performance combinations will prove to be the most popular?
- How disposable will the chips of the future be?
There are use cases where this model of centralized training and edge inference will not be appropriate: cases where decision latency and power consumption are not factors. One can imagine, for instance, that a large and expensive medical device might ship data back to a central location to be processed and analyzed on a time scale (perhaps measured in seconds or minutes) that would be unacceptable in another application, such as a self-driving car. We discuss these exceptions as well.
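The rule of thumb in the paragraph above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Workload` type, the `choose_inference_location` function, the 200 ms round-trip figure, and the latency budgets are all hypothetical values chosen for the example, not measurements or recommendations from this report.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an inference workload."""
    name: str
    latency_budget_ms: float  # longest acceptable wait for a decision
    power_constrained: bool   # True if power consumption is a limiting factor


# Assumed round-trip time to a central data center; purely illustrative.
ASSUMED_CENTRAL_ROUND_TRIP_MS = 200.0


def choose_inference_location(workload: Workload) -> str:
    """Return "central" only when neither latency nor power is a factor."""
    latency_matters = workload.latency_budget_ms < ASSUMED_CENTRAL_ROUND_TRIP_MS
    if latency_matters or workload.power_constrained:
        return "edge"
    return "central"


# Illustrative budgets: milliseconds for a car, a full minute for a medical device.
car = Workload("self-driving car", latency_budget_ms=10, power_constrained=False)
medical = Workload("medical device", latency_budget_ms=60_000, power_constrained=False)

for w in (car, medical):
    print(f"{w.name}: run inference at the {choose_inference_location(w)} location")
```

In practice the decision would weigh more than two factors (cost, security, and connectivity among them), so the sketch should be read as a way to make the exception concrete rather than as a sizing tool.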
The final part of this report briefly explores the societal impact of this change in architecture. Winston Churchill once said, “We shape our buildings, and afterwards our buildings shape us.” The same is true of our tools: we shape them, and then they shape us. We are the generation that is shaping the digital tools of tomorrow, and it is worth reflecting on how they might shape us in return.