AI Is Transforming at the Edge
There was a time, not long ago, when IoT was no more than an acronym, and it remained that way for some time. But then, like an electronic Big Bang, it exploded, revealing itself in thousands of applications, all sharing some basic characteristics but each with its own unique requirements. Its evolution has been so rapid that the “billions of connected devices” that pundits predicted would arrive “eventually” have already been surpassed. Most remarkable, perhaps, is that all of this has occurred within the last five years. Today, the focus is on performing machine learning analytics at the edge of the network, and this has resulted in the development of applications processors that implement every feature needed, no matter what the application.
To gain some perspective on this meteoric rise, and what is necessary to accommodate it, we need to step back in time. In the beginning, the general idea was that connected devices, from alarm clocks to entire cities, would generate data that would be sent to the cloud for analysis and then back to where it originated for the results to be implemented. In retrospect, this now seems rather simplistic because the most demanding applications cannot wait for the data to return; they need it immediately because they operate in real-time. It quickly became apparent that some data should remain where it originated—at the edge of the network—and to some extent be processed there.
Not only would this reduce the amount of data sent to the cloud and unburden already struggling communications networks, it would also reduce round-trip latency to virtually nothing while also increasing security by keeping critical IP and data local. Thus, edge computing was born, and today it is the primary focus of IoT, from home automation to smart buildings and cities, industrial facilities, and vehicles.
Of course, as all applications are different, this scenario presents significant challenges. For example, some edge devices generate very little data while others produce mountains of it. A temperature sensor produces minimal data at random intervals while a high-resolution camera produces gigabytes of data in a single day when used in applications such as visual inspection or vehicles.
Agriculture is an excellent example of one of the most advanced uses of AI and machine learning (Figure 1). Farms rely on an extraordinary array of sensors, from cameras to satellite imagery, to evaluate their operations analytically and intelligently and even to perform predictive maintenance, ensuring that all equipment is operating as it should and detecting when failures are likely to occur.
Without the ability to analyze this data at the edge, sometimes directly on the equipment itself, most of these capabilities would not be possible. In this environment, devices range from those operating from a vehicle battery to a stand-alone device implanted in the ground that gets its power from either the sun or a rechargeable battery. To be effective, edge processors must be able to support each one of the scenarios without sacrificing performance.
At the other end of the spectrum are smart cities (Figure 2) whose operational scenarios run the gamut from battery-powered sensors to those mounted on roadside infrastructure as well as those that have much in common with agriculture, such as maintaining parks and other public areas and monitoring water levels and quality.
And as the world moves toward vehicle autonomy, cities will need to dramatically increase their number of sensors that can communicate with AI-enabled vehicle ecosystems. The vehicles themselves are already somewhat reliant on edge processing and its use is increasing as they rapidly transition from mostly mechanical to almost all-electric. ML will be a defining factor that allows them to fully achieve this transition, not just in one large processing unit but on every engine control unit.
So, the question becomes how to satisfy the requirements of these applications and everything in between with the greatest amount of security and the least amount of hardware. Needless to say, this is not trivial because it requires edge computing capabilities ranging from very simple to enormously complex, ideally all within a small footprint that consumes very little power but is still able to deliver remarkable computational performance for AI and machine learning.
Until recently, this would have required multiple discrete computational elements, as each type of processor, from microcontrollers to real-time processors, has its own advantages and disadvantages for a specific application. However, NXP has addressed this problem in its i.MX 93 applications processors (Figure 3) by using all the tools it has developed over the last decade that collectively address the unique ML challenges posed by diverse edge compute environments, from the simplest home automation system to the most complex production and automotive applications, smart cities, smart buildings, and hundreds more.
Imagining the Solution
NXP has achieved this in the form of highly integrated devices that combine all of the required computational abilities for very fast inferencing at the edge along with large amounts of I/O, high-speed memory, end-to-end trust-based security, compatibility with cloud platforms, and the ability to be updated instantly and scale to meet new requirements, supported by a comprehensive software development environment that is simple to use. The i.MX 9 series can thus serve applications ranging from home automation to industrial production, consumer audio, smart city and public safety systems, fleet management, farming and agriculture, healthcare, and any other application, whether the edge devices are powered by batteries, the sun, or the grid.
The i.MX 9 applications processors build on the capabilities of NXP’s i.MX 6 and i.MX 8 series and add other features that have never been integrated before within a single edge device. To accomplish this, NXP collaborated with Arm to evolve the Ethos-U55 “microNPU,” which was dedicated exclusively to microcontrollers, so that it works efficiently as part of a more sophisticated Cortex-A applications processor-based system.
While MCUs typically mix SRAM and flash memory, Cortex-A-based applications processors generally use DRAM, which offers much higher data rates and capacity but also higher latency. The new microNPU is designed to accommodate that latency.
The result, called the Ethos-U65, increases the maximum raw MAC (multiply-accumulate) performance option supported by the IP core to 1 TOPS (512 parallel multiply-accumulate operations at 1 GHz; see note 1) compared to the Ethos-U55, while also sizing the core appropriately for the system buses that feed data to and from the microNPU. The Cortex-M and Ethos-U65 combination has great power efficiency, which makes it a very cost-effective, high-performance solution for ML devices operating at the edge.
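The 1 TOPS figure quoted above can be checked with a few lines of arithmetic. The sketch below assumes, as is common when quoting TOPS but is not stated in this paper, that each multiply-accumulate counts as two operations (one multiply and one add). The function name peak_tops is purely illustrative.

```python
# Worked arithmetic for the peak-throughput figure quoted in the text.
# Assumption (not from the source): one MAC is counted as two operations,
# a common convention when vendors quote TOPS.

MACS_PER_CYCLE = 512          # parallel MAC units, from the text
CLOCK_HZ = 1_000_000_000      # 1 GHz, from the text
OPS_PER_MAC = 2               # assumed counting convention


def peak_tops(macs_per_cycle: int, clock_hz: int, ops_per_mac: int = OPS_PER_MAC) -> float:
    """Return theoretical peak throughput in tera-operations per second."""
    return macs_per_cycle * clock_hz * ops_per_mac / 1e12


if __name__ == "__main__":
    print(f"{peak_tops(MACS_PER_CYCLE, CLOCK_HZ):.3f} TOPS")
```

This is a theoretical peak. Sustained throughput on a real network depends on how well the memory system keeps the MAC array fed, which is the point of the bus sizing mentioned above.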
The Ethos-U65 software model relies on offline compilation and optimization for the underlying hardware. Offline compilation can be optimized for specific Ethos-U65 configurations and for the amount of on-chip SRAM that the user decides to allocate to it. This increases the amount of critical data that can be stored in on-chip SRAM, which means data spills over to system DRAM less frequently. As a result, the microNPU can be used in a greater variety of applications designed to run complex ML workloads while increasing throughput and efficiency on a consistent software stack and with familiar tools.
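To make the SRAM trade-off concrete, here is a deliberately simplified sketch of the kind of decision an offline compiler could make: given a user-chosen SRAM budget, keep the most frequently accessed data on-chip and let the rest spill to DRAM. It does not reproduce the actual Ethos-U65 toolchain. The Tensor type, the greedy heuristic, and every size and access count are assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tensor:
    name: str         # hypothetical tensor name
    size_bytes: int   # storage the tensor needs (assumed > 0)
    accesses: int     # how often inference touches it (made-up numbers)


def place_tensors(tensors, sram_budget_bytes):
    """Greedily keep the tensors with the most accesses per byte in SRAM.

    Returns (in_sram, in_dram) as lists of tensor names.
    """
    in_sram, in_dram = [], []
    remaining = sram_budget_bytes
    ranked = sorted(tensors, key=lambda t: t.accesses / t.size_bytes, reverse=True)
    for t in ranked:
        if t.size_bytes <= remaining:
            in_sram.append(t.name)
            remaining -= t.size_bytes
        else:
            in_dram.append(t.name)   # does not fit: spills to system DRAM
    return in_sram, in_dram


def dram_traffic(tensors, in_dram):
    """Total bytes moved to or from DRAM for the spilled tensors."""
    spilled = set(in_dram)
    return sum(t.size_bytes * t.accesses for t in tensors if t.name in spilled)


if __name__ == "__main__":
    KIB = 1024
    model = [
        Tensor("conv1_weights", 64 * KIB, 100),
        Tensor("activations", 128 * KIB, 400),
        Tensor("fc_weights", 512 * KIB, 10),
    ]
    for budget in (128 * KIB, 256 * KIB):
        sram, dram = place_tensors(model, budget)
        print(budget // KIB, "KiB budget ->", sram, "| DRAM bytes:", dram_traffic(model, dram))
```

In this toy model, doubling the SRAM budget moves one more tensor on-chip and reduces DRAM traffic, which is the effect the paragraph above describes.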
As noted earlier, IoT edge devices and their applications vary greatly, and to accommodate all of them, the i.MX 93 applications processors employ NXP’s Energy Flex architecture. Energy Flex makes this possible by creating operating domains based on the needs of specific edge devices with a programmable power management subsystem. So, for example, low-power audio capabilities can be powered by the real-time domain in which the audio functions, but other functions are switched off and only turned on by the device’s wake-word detect engine.
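The wake-word scenario can be pictured as a small state machine in which only the real-time domain stays powered until a wake word is heard. The sketch below is a toy model, not NXP's power management API. The class PowerManager, its method names, and the "application" domain label are all hypothetical names introduced for illustration.

```python
from enum import Enum


class Domain(Enum):
    REAL_TIME = "real-time"       # domain named in the text
    APPLICATION = "application"   # assumed name for the rest of the system


class PowerManager:
    """Toy model of per-domain power gating (illustrative only)."""

    def __init__(self):
        # While listening, only the real-time domain is powered.
        self.powered = {Domain.REAL_TIME}

    def on_audio_frame(self, wake_word_detected: bool) -> None:
        """Called for each audio frame handled in the real-time domain."""
        if wake_word_detected:
            self.powered.add(Domain.APPLICATION)

    def on_idle_timeout(self) -> None:
        """Return to the low-power listening state."""
        self.powered.discard(Domain.APPLICATION)

    def is_on(self, domain: Domain) -> bool:
        return domain in self.powered


if __name__ == "__main__":
    pm = PowerManager()
    pm.on_audio_frame(wake_word_detected=False)
    print("application on:", pm.is_on(Domain.APPLICATION))
    pm.on_audio_frame(wake_word_detected=True)
    print("application on:", pm.is_on(Domain.APPLICATION))
```

The design point is that the always-on path is kept as small as possible, so the larger domain draws power only after the wake-word engine requests it.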
Keeping the Hackers at Bay
As IoT expands its reach to enormous numbers of applications, and as it inherently connects to the Internet, it now provides the world’s most extensive attack surface, rivaling the Internet itself. The only truly effective way of protecting critical IP and other data is to ensure that security begins at the edge device where the data is generated, and the most effective way of executing this is using a secure enclave.
Simply stated, this is a security solution located within the device die that completely isolates processors, memory, and data, and that encrypts data so that it is decrypted only within the secure enclave. In short, it creates a walled-off security environment so that application code and data are inaccessible to any other entity, anywhere.
At NXP this secure enclave is called EdgeLock® secure enclave. It has its own dedicated security core, internal ROM, and secure RAM and provides protection against side-channel attacks using symmetric and asymmetric crypto accelerators and hashing functions on the device. It functions autonomously to manage all security functions from root of trust to runtime attestation, trust provisioning, secure boot enforcement, and fine-grain key management. It even tracks device operation when running end-user applications to help prevent new attack surfaces from emerging. For designers, this dramatically simplifies the process of managing huge numbers of edge devices with a high level of confidence.
That said, keeping an edge device secure long after initial deployment is a challenge that requires continuous trusted management services. NXP partnered with Microsoft to bring this capability to fruition with Azure Sphere chip-to-cloud security, beginning with the i.MX 8ULP applications processor family and extending to devices in the i.MX 9 series as well.
Azure Sphere is an end-to-end solution for securely connecting existing equipment and creating new IoT devices with built-in security from the chip to the cloud and is enabled in EdgeLock secure enclave as the root of trust in the silicon. It detects emerging security threats through automated processing of on-device errors and responds to threats with fully automated on-device updates to the operating system.
In addition to the secured hardware, Azure Sphere includes the secured Azure Sphere OS, the cloud-based Azure Sphere Security Service, and ongoing operating system updates and security improvements that have been refined over more than a decade.
In a short time, much of the processing and analytics initially performed only in the cloud has transitioned to the edge for the reasons discussed in this white paper, but also because it just makes sense. Even if the operational scenario is not “real-time,” the most efficient use of compute and ML resources can only be achieved when these functions are performed locally. And to make this possible, those resources must be cost-effective, infinitely scalable, small enough to, in many cases, be implemented directly on the sensor, and most important, provide end-to-end security that begins with the edge processing and ends in the cloud.
NXP’s i.MX 93 applications processors are designed to meet all these mandatory requirements and many others, including connectivity via wired or wireless means through support for a broad array of legacy and current interfaces. They also provide designers with all the tools necessary to simplify the challenging task of implementing ML-based IoT at the edge and to streamline integration with cloud-based data centers such as Microsoft Azure.
1. Information about the Ethos-U65 can be found here: