Beyond Static Partitioning: Scheduling ASIL-D Safety Functions and Multimedia QoS on the Same Silicon

The Cannot-Fail Core: Safety Island Architecture for Heterogeneous SoCs

 

The established answer to mixed-criticality consolidation in embedded systems is the partitioned VM: run ASIL-D software in one virtual machine, QM multimedia in another, and let the hypervisor enforce the partition boundary. This architecture works, it is certifiable, and it is what most automotive and industrial programs shipping today rely on. It is also architecturally conservative in a way that leaves substantial compute on the table, forces high-latency communication paths across partition boundaries, and creates a system structure that becomes increasingly awkward as the software-defined vehicle centralizes formerly separate ECUs onto shared high-performance SoCs.

The partitioned VM approach was correct when each VM's workloads were homogeneous and the interference between VMs could be managed by time-slicing cores or dedicating cores per partition. The problem it does not solve well is the class of system that needs to run an ASIL-D safety function and a QoS-constrained multimedia pipeline not just on the same SoC but potentially on the same core cluster, with both workloads needing responsive access to shared resources — cache, DRAM bandwidth, GPU, NPU — in ways that the static partitioning model handles poorly. The automotive industry recognized this concretely by 2018 when the first integrated cockpit controllers shipped sharing a GPU between an ASIL-B instrument cluster and a QM Android infotainment system, reducing system cost by 20 percent compared to the two-ECU alternative.

The ADAS-cockpit fusion problem that Aptiv and other Tier-1 suppliers are pursuing for the next generation of vehicles is harder: combining ASIL-D ADAS functions with QM infotainment and ASIL-B cluster on a single compute platform, with all workloads needing access to neural network accelerators, high-bandwidth memory, and display pipelines. The partitioned VM model needs to evolve to handle this.

What Freedom from Interference Actually Requires

Freedom from Interference (FFI) is the ISO 26262 concept that underlies all mixed-criticality architecture decisions. ISO 26262 defines FFI as the "absence of cascading failures between two or more elements that could lead to the violation of a safety requirement." The practical engineering question FFI asks is: can a fault in the QM or lower-ASIL element reach the higher-ASIL element and compromise its safety property?

ISO 26262 Part 6 specifies three categories of interference that must be addressed for software components of different ASILs running on shared hardware: timing and execution interference, memory interference, and information exchange interference.

Timing and execution interference occurs when execution of a safety-relevant software element is blocked or delayed by a fault in another element — a QM task consuming CPU time that should have been available for an ASIL-D task, or a runaway interrupt handler in the QM partition exhausting available processor cycles. This is the most commonly addressed interference type, typically handled by the RTOS scheduler with time budgets enforced by hardware watchdogs.

Memory interference covers both spatial interference (a QM task writing to memory belonging to an ASIL-D task) and temporal interference (cache eviction caused by QM memory access increasing the execution time of ASIL-D code beyond its certified WCET). Spatial interference is addressed by MPU/MMU configuration; temporal memory interference — specifically cache and DRAM bus contention — is the harder problem that pure VM partitioning does not automatically solve, because cache coherency hardware and DRAM controllers do not respect VM partition boundaries in the way that software memory mapping does.

Information exchange interference covers the paths by which QM software could corrupt data that ASIL-D software uses — through shared memory communication channels, through side effects on shared global state, or through hardware peripheral access.

For ASIL-D, ISO 26262 requires that software partitioning be supported by dedicated hardware features. The MPU is the minimum hardware mechanism for spatial memory protection; for cache and DRAM temporal interference, the hardware mechanisms are cache partitioning (coloring), bandwidth throttling through memory controller QoS registers, and temporal isolation scheduling. The critical insight is that VM partitioning provides spatial isolation but not temporal isolation — two VMs on different cores sharing an L3 cache or a DRAM controller can still create WCET invalidation through resource contention even when no data is directly shared.

The Three Interference Channels That Partitioned VMs Do Not Solve

Understanding exactly where the VM partitioning model breaks down clarifies what additional mechanisms are needed to achieve FFI on a shared SoC with co-located ASIL-D and multimedia workloads.

Cache interference is the first channel. A modern application processor SoC has a shared last-level cache (LLC) — typically L3 — accessible from all cores. When a QM multimedia pipeline processes a high-resolution video frame, its working set occupies a significant fraction of the LLC. If an ASIL-D control function runs immediately after on a different core, its working set has been evicted and it experiences cache misses that were not present when the WCET was measured in an isolated test environment. The resulting WCET violation is not detectable at the VM partition boundary — it manifests as extended execution time of the ASIL-D task that appears internally correct but violates its timing budget. This is temporal interference through shared cache state, and partitioned VMs do not prevent it because the cache is hardware-managed below the VM abstraction layer.

The Lazy Load scheduling research from LAAS-CNRS (2023) explicitly addresses this problem by combining a hypervisor with cache coloring to achieve interference isolation between criticality domains. Cache coloring assigns different LLC sets to different VMs through careful alignment of physical page addresses to cache set indices, ensuring that QM and ASIL-D workloads never evict each other from the LLC. This requires a page allocator (or a hypervisor-level memory allocator) that guarantees the allocated physical pages map to the appropriate cache sets, and it requires characterizing which cache set regions each workload's working set needs.
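The set-index arithmetic behind coloring can be sketched in a few lines of C. The cache geometry below (2 MiB LLC, 64-byte lines, 4 KiB pages, giving 32 colors) is purely illustrative — real parameters must come from the target SoC's cache documentation:

```c
#include <stdint.h>

/* Illustrative cache geometry -- real values must come from the
 * target SoC's LLC documentation. */
#define LINE_SIZE 64u     /* bytes per cache line               */
#define NUM_SETS  2048u   /* e.g. 2 MiB LLC, 16-way, 64 B lines */
#define PAGE_SIZE 4096u   /* 4 KiB pages                        */

/* With this geometry the set index is phys[6..16] and the page
 * offset is phys[0..11]. The "color" is the slice of the set index
 * above the page offset (bits 12..16 here), giving
 * NUM_SETS * LINE_SIZE / PAGE_SIZE = 32 colors. */
static uint32_t page_color(uint64_t phys_addr)
{
    uint32_t set_index = (uint32_t)((phys_addr / LINE_SIZE) % NUM_SETS);
    uint32_t sets_per_page = PAGE_SIZE / LINE_SIZE;  /* 64 */
    return set_index / sets_per_page;                /* 0..31 */
}

/* A coloring allocator grants a partition only pages whose color
 * falls inside its reserved range, e.g. colors 0..7 for ASIL-D. */
static int page_allowed(uint64_t phys_addr,
                        uint32_t color_lo, uint32_t color_hi)
{
    uint32_t c = page_color(phys_addr);
    return c >= color_lo && c <= color_hi;
}
```

Pages of different colors can never contend for the same LLC sets regardless of access pattern, which is exactly the property the FFI argument needs.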

DRAM bandwidth interference is the second channel. When a GPU rendering a multimedia frame and a Cortex-A cluster executing ADAS perception code simultaneously generate DRAM requests, they compete for the shared DRAM controller's bandwidth budget. In the worst case, the GPU's burst traffic delays the ASIL-D code's memory requests by the queuing time in the memory controller, extending its execution time beyond the WCET bound established in isolation testing. This interference is below the visibility of the hypervisor, below the RTOS scheduler, and below any software layer — it occurs in the memory controller hardware. The mitigation is bandwidth throttling: hardware QoS registers in the memory controller (available on NXP i.MX8M, TI TDA4VM, and similar SoCs) can cap the maximum DRAM bandwidth allocated to the GPU's request channel, ensuring that the ASIL-D code's requests are serviced within their latency budget.
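Register layouts for these QoS blocks are vendor-specific, so the sketch below models the accounting the hardware performs rather than any real register map: each requester gets a byte budget per regulation period, in the style of MemGuard-type regulators. The struct and numbers are illustrative:

```c
#include <stdint.h>

/* Software model of per-master DRAM bandwidth regulation. On real
 * silicon this accounting runs in memory controller / interconnect
 * QoS hardware; this struct is an illustrative abstraction. */
typedef struct {
    uint64_t budget_bytes;  /* allowance per regulation period */
    uint64_t used_bytes;    /* traffic consumed this period    */
} bw_regulator_t;

/* Account one burst: returns 1 if it fits in the remaining budget,
 * 0 if the requester must stall until the next period. */
static int bw_charge(bw_regulator_t *r, uint64_t bytes)
{
    if (r->used_bytes + bytes > r->budget_bytes)
        return 0;                  /* throttled */
    r->used_bytes += bytes;
    return 1;
}

/* Called on the periodic regulation tick. */
static void bw_new_period(bw_regulator_t *r)
{
    r->used_bytes = 0;
}
```

Capping the GPU's budget this way bounds the queuing delay its bursts can impose on ASIL-D requests, which is what keeps a WCET bound measured in isolation valid under load.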

Interrupt latency interference is the third channel. A high-priority interrupt in the QM partition — a DMA completion for a multimedia pipeline, a GPU command completion — can delay interrupt handling in the ASIL-D context if interrupt routing is not carefully partitioned. On processors without full interrupt virtualization support, an interrupt that fires during a QM VM's timeslice may be handled with QM-partition interrupt latency rather than ASIL-D interrupt latency, violating the determinism guarantees of the safety-critical function.
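One concrete mitigation is verifying, at integration time, that no QM-owned interrupt is routed to a core running the ASIL-D partition. The routing table below is a hypothetical abstraction — on an Arm GIC the actual routing would be read back from the distributor configuration:

```c
#include <stddef.h>
#include <stdint.h>

#define ASILD_CORE_MASK 0x3u  /* assume cores 0-1 are reserved for ASIL-D */

typedef struct {
    uint32_t irq;          /* interrupt number                     */
    uint32_t target_mask;  /* bitmask of cores this IRQ may target */
    int      qm_owned;     /* 1 if the handler lives in QM code    */
} irq_route_t;

/* Boot-time FFI check: no QM interrupt may preempt an ASIL-D core. */
static int routing_isolated(const irq_route_t *tbl, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].qm_owned && (tbl[i].target_mask & ASILD_CORE_MASK))
            return 0;  /* violation: QM ISR can run on a safety core */
    return 1;
}
```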

What Mixed-Criticality Scheduling Beyond VMs Provides

Moving beyond pure VM partitioning toward mixed-criticality scheduling on shared cores does not mean abandoning isolation — it means providing finer-grained isolation that covers the temporal interference channels that spatial partitioning misses, while enabling more efficient use of shared compute and memory resources.

The key mechanisms that enable this are:

Cache partitioning (cache coloring): implemented in the hypervisor memory allocator, assigns physically non-overlapping LLC regions to different criticality partitions. ASIL-D tasks' working sets occupy a reserved cache region that QM workloads cannot evict. This requires knowledge of the LLC's set-index to physical address mapping, which is SoC-specific, and a memory allocator that honors color constraints when granting pages to each partition. RISC-V and Arm Cortex-A platforms both support cache coloring through page coloring implementations in hypervisors like Jailhouse, ACRN, and Bao.

Memory bandwidth regulation: configured through the memory controller's QoS registers, assigns maximum DRAM bandwidth budgets per requester (by core, by master ID, or by partition). The ASIL-D partition's DRAM requests are given priority or guaranteed minimum latency; QM multimedia traffic is throttled to the remaining bandwidth. This converts non-deterministic DRAM access latency into bounded access latency for the ASIL-D partition, enabling WCET analysis that remains valid under QM load.

Hierarchical scheduling servers: provide CPU bandwidth reservation at the scheduler level. A reservation server for the ASIL-D partition guarantees that ASIL-D tasks receive their required CPU fraction within defined scheduling periods, regardless of QM task behavior. The server abstraction allows QoS-constrained multimedia tasks to use the remaining CPU budget after ASIL-D tasks are satisfied, with guaranteed minimum bandwidth for the multimedia pipeline to maintain frame rate and decode latency within QoS bounds.
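As a first-order screen on the server budget choice — not a full response-time analysis, which would also bound server-induced release jitter and context-switch overheads — a utilization test can confirm the ASIL-D task set fits its reservation while the QM pipeline keeps its QoS floor. The task parameters below are invented for illustration:

```c
#include <stddef.h>

/* A periodic task with worst-case execution time and period. */
typedef struct { double wcet_ms; double period_ms; } task_t;

static double taskset_util(const task_t *ts, size_t n)
{
    double u = 0.0;
    for (size_t i = 0; i < n; i++)
        u += ts[i].wcet_ms / ts[i].period_ms;
    return u;
}

/* Necessary condition: ASIL-D utilization fits inside a reservation
 * server with budget q_ms per period p_ms, and the leftover CPU
 * share still covers the multimedia pipeline's QoS minimum. */
static int fits(const task_t *asil, size_t n,
                double q_ms, double p_ms, double qm_min_share)
{
    double server_share = q_ms / p_ms;
    return taskset_util(asil, n) <= server_share
        && (1.0 - server_share) >= qm_min_share;
}
```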

The following table summarizes the interference channels, their mechanisms, and the architectural layer that addresses each:

| Interference channel | Mechanism | Standard VM partition handles? | Required additional mechanism |
| --- | --- | --- | --- |
| CPU time (QM starves ASIL-D) | Temporal scheduling | Yes — core dedication or TDMA | Priority + budget scheduling server |
| Spatial memory (wrong write) | MPU/MMU address mapping | Yes | MMU + MPU in hypervisor |
| Cache temporal (eviction) | Shared LLC set contention | No | Cache coloring in hypervisor allocator |
| DRAM bandwidth (bus contention) | Shared memory controller | No | QoS registers / bandwidth throttling |
| Interrupt latency (delayed ISR) | Shared GIC routing | Partially | ASIL-rated interrupt controller config |
| Information exchange (corrupt data) | API / shared memory interface | By design | Validated interface with coverage |

AUTOSAR, QNX, and the Evolving Hypervisor Landscape

The automotive industry's existing hypervisor solutions are the starting point for this evolution. Elektrobit's EB tresos Embedded Hypervisor, qualified for ASIL in accordance with ISO 26262, runs on the ARM Cortex-R52+ architecture and provides VM isolation with Freedom from Interference guarantees. Its 2024 mass-production release on the STMicroelectronics Stellar SR6G7 demonstrates that the real-time embedded hypervisor is moving into production automotive systems. BlackBerry QNX and QNX Hypervisor offer similar ASIL-rated partitioning for safety-critical applications alongside general-purpose OS guests.

The limitation of current automotive hypervisors is that they were designed for the Cortex-R52 real-time domain — relatively homogeneous cores, no GPU sharing, no NPU sharing, and small or absent shared last-level caches. The automotive domain fusion challenge moves the problem to high-performance application processors (Cortex-A55/A78 clusters, Cortex-A72) with substantial GPU and NPU complexes, where the interference channels are richer and the QoS requirements of the multimedia pipeline are more demanding.

ACRN hypervisor — an open-source Type 1 hypervisor from Intel targeted at automotive and industrial IVI applications — addresses the GPU sharing problem by providing passthrough and mediated passthrough mechanisms that allow a QM Android VM to access the GPU while a safety VM runs ADAS processing, with priority assigned to the safety VM's GPU submission queue. This does not yet achieve full ASIL-D certification on the GPU sharing path, but it represents the engineering direction: GPU virtualization with priority-based submission scheduling rather than static GPU partitioning.

The Red Hat In-Vehicle OS approach (RHIVOS) uses a Linux-based architecture with a QM partition implemented as a containerized subsystem within the Linux OS, rather than a separate VM. This moves the isolation boundary from the hypervisor layer to the Linux kernel's cgroup-based resource partitioning — CPU bandwidth via cpu.max, memory bandwidth via bandwidth throttling in the kernel, and process isolation via namespaces. RHIVOS has achieved ISO 26262 ASIL-B SEooC certification, not ASIL-D. The ASIL-D use case still requires a dedicated RTOS VM or partition isolated from the Linux environment by a certified hypervisor.

The path toward ASIL-D + QoS multimedia on shared high-performance cores is not a single product or standard — it is a combination of: a certified hypervisor for ASIL-D isolation and scheduling guarantee, cache coloring for LLC temporal isolation, bandwidth throttling for DRAM isolation, GPU and NPU mediated virtualization with priority scheduling for accelerator sharing, and a validated HSI between the ASIL-D partition and the shared hardware resources.

 


 

The Certification Challenge for Shared Resources

Certifying an ASIL-D function on a shared SoC requires demonstrating that every shared resource the ASIL-D function uses has a provably bounded interference contribution from co-located workloads. This is the certification argument that must survive scrutiny by the assessment body, and it is where the engineering sophistication of the mixed-criticality architecture translates directly into the certification evidence required.

For cache interference, the evidence is the cache coloring configuration, the analysis demonstrating that the coloring is correct for the target SoC's LLC indexing scheme, and the WCET measurements confirming that ASIL-D tasks' cache miss rates are consistent between isolation testing and integrated system testing.

For DRAM bandwidth interference, the evidence is the QoS register configuration, the analysis demonstrating that the ASIL-D partition's memory request latency is bounded under worst-case QM memory traffic within the allocated bandwidth budget, and timing measurements confirming this bound holds on target hardware.

For scheduling, the evidence is the scheduling server configuration, the schedulability analysis confirming that ASIL-D tasks' deadlines are met given the server budget and the period, and the analysis confirming that QoS-constrained multimedia tasks receive sufficient bandwidth to meet their frame rate and latency requirements in the remaining CPU budget.

The requirement that ISO 26262 imposes for ASIL-D — that software partitioning "shall be supported by dedicated hardware features" — means that each of these interference mitigations must be implemented in hardware-enforced mechanisms, not just software policy. Cache coloring enforced through hypervisor page allocation is hardware-enforced because the hardware MMU's address translation ensures that LLC interference cannot occur between differently-colored regions. Bandwidth throttling enforced through memory controller QoS registers is hardware-enforced because the memory controller hardware rejects excess requests from throttled masters. Software-only bandwidth policies (a QM application voluntarily limiting its memory traffic) are not acceptable for ASIL-D FFI.

ASIL-D Decomposition as an Alternative Path

Where full ASIL-D on a shared heterogeneous SoC is not achievable or certifiable within program constraints, ASIL decomposition provides an alternative that reduces the burden on any single partition. An ASIL-D safety requirement decomposed into two redundant ASIL-B channels — each developed independently to ASIL-B rigor, with independent hardware paths — provides ASIL-D coverage at the system level through the redundancy of two ASIL-B elements rather than through a single ASIL-D partition on a complex shared platform.

For the ADAS-cockpit fusion problem, this means the ASIL-D braking decision function could be decomposed: the perception processing on the high-performance SoC with multimedia sharing is ASIL-B, and an independent safety monitor on a dedicated Cortex-R52 lockstep core validates the outputs and provides the second ASIL-B channel. This architecture is more conservative in its use of the high-performance SoC — the SoC partition only needs ASIL-B certification — while maintaining the required system-level ASIL-D coverage through the lockstep monitor.
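The monitor channel typically reduces to range, freshness, and integrity checks on the main channel's outputs. The sketch below is illustrative — the command layout, deceleration threshold, and the `crc_ok` stand-in for a real end-to-end protection check are assumptions, not any product's interface:

```c
#include <stdint.h>

/* Hypothetical command message from the perception channel. */
typedef struct {
    float    decel_mps2;   /* commanded deceleration          */
    uint32_t seq;          /* sequence counter for freshness  */
    uint32_t crc;          /* end-to-end protection field     */
} brake_cmd_t;

#define MAX_DECEL 9.5f     /* physical plausibility bound (assumed) */

/* Independent ASIL-B monitor check: integrity, freshness, range.
 * crc_ok stands in for a real E2E profile verification of c->crc. */
static int monitor_accept(const brake_cmd_t *c, uint32_t last_seq,
                          int crc_ok)
{
    if (!crc_ok)                   return 0; /* corrupted message   */
    if (c->seq == last_seq)        return 0; /* stale or frozen     */
    if (c->decel_mps2 < 0.0f)      return 0; /* implausible value   */
    if (c->decel_mps2 > MAX_DECEL) return 0; /* implausible value   */
    return 1;
}
```

Rejection by the monitor triggers the degraded-mode reaction (for example, a conservative fallback braking profile), so a fault anywhere in the shared-SoC channel cannot propagate an unsafe command.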

The tradeoff of this approach is that it requires additional silicon (the safety monitor core or MCU), additional communication latency between the main compute and the safety monitor, and additional complexity in the safety architecture documentation. Against this cost, the benefit is that the certification argument for the high-performance SoC partition is substantially simpler: demonstrating ASIL-B FFI on a shared SoC is better understood, more extensively validated, and served by more mature tooling than demonstrating ASIL-D FFI on a shared GPU-enabled SoC.

Quick Overview

Partitioned VM hypervisors handle spatial memory isolation between ASIL-D and QM workloads but do not automatically address temporal interference through shared LLC cache and DRAM bandwidth — the two most significant interference channels on modern high-performance SoCs. Cache coloring in the hypervisor allocator and memory controller QoS bandwidth throttling are the hardware-enforced mechanisms that provide temporal isolation required by ISO 26262 ASIL-D. The automotive domain fusion of ADAS and cockpit on a single SoC — reducing ECU count and system cost — requires combining certified hypervisor spatial partitioning with cache coloring, bandwidth throttling, and GPU/NPU mediated virtualization with priority scheduling. Where full ASIL-D temporal FFI on a shared GPU SoC is not achievable within program constraints, ASIL decomposition into two independent ASIL-B channels provides system-level ASIL-D coverage with a simpler certification argument.

Key Applications

Automotive software-defined vehicle (SDV) domain controllers consolidating ADAS, instrument cluster, and infotainment onto a single SoC, industrial robotics controllers integrating safety-rated motion control with operator interface workloads on shared compute, medical device platforms combining a safety-certified monitoring function with a general-purpose visualization pipeline, and any embedded SoC deployment where ASIL-C/D functions and performance-demanding non-safety workloads must share GPU, NPU, or DRAM resources without full spatial core dedication.

Benefits

Eliminating dedicated safety ECUs by consolidating ASIL-D and QoS multimedia on a single SoC reduces vehicle weight, wiring harness cost, power consumption, and unit BOM cost — quantified by Aptiv at approximately 20 percent system cost reduction for the instrument cluster and infotainment consolidation case. Fine-grained interference mitigation through cache coloring and bandwidth throttling supports more efficient use of the shared SoC's compute and memory bandwidth than static core dedication. ASIL decomposition provides a practical path to system-level ASIL-D coverage on complex shared SoCs that current hypervisor technology cannot certify at full ASIL-D for all shared resource interactions.

Challenges

Demonstrating temporal FFI at ASIL-D for shared LLC and DRAM on a complex application processor SoC requires per-SoC interference analysis that is not yet automated in standard tooling — each platform's LLC indexing scheme and memory controller QoS implementation must be independently characterized. ASIL-D certified hypervisors with cache coloring and bandwidth throttling support for high-performance Cortex-A SoCs are less mature than the Cortex-R52 hypervisors that serve current production programs. GPU and NPU virtualization with ASIL-rated priority scheduling is not yet available in certified form from any major GPU vendor.

Outlook

The software-defined vehicle initiative across automotive OEMs — centralized zonal architectures with fewer, more powerful ECUs — is the primary driver forcing the evolution beyond static VM partitioning. The EB tresos Embedded Hypervisor's 2024 production release on the STMicro Stellar demonstrates the maturation trajectory; the 2025–2026 qualification on the Cortex-R52-based NXP S32Z27x extends this to higher-performance real-time platforms. ISO 26262's partitioning and dependent-failure guidance, together with CAST-32A, its avionics counterpart for multicore interference, are the normative frameworks that will need to be extended to address temporal interference in shared GPU and NPU contexts — a standards gap that active working groups are addressing but that will take several years to close.

Related Terms

mixed-criticality system, ASIL-D, ASIL decomposition, Freedom from Interference, FFI, ISO 26262, QM, partitioned VM, hypervisor, cache coloring, cache partitioning, LLC, DRAM bandwidth throttling, memory controller QoS, temporal interference, spatial interference, WCET, Cortex-R52, lockstep, AUTOSAR, OSEK/VDX, TDMA scheduling, bandwidth reservation server, hierarchical scheduling, EB tresos, QNX hypervisor, ACRN hypervisor, Jailhouse, Bao, RHIVOS, software-defined vehicle, SDV, domain controller, ADAS, infotainment consolidation, instrument cluster, GPU passthrough, mediated passthrough, NPU virtualization, safety OS, SEooC, Cortex-A78, Cortex-A55, MPU, MMU, memory protection, ASIL-B, ASIL inheritance, dependent failure analysis

 


 

 


 

FAQ

Why does VM partitioning not solve all mixed-criticality interference problems on a shared SoC?

 

VM partitioning enforces spatial isolation: different VMs cannot directly read or write each other's memory. It does not enforce temporal isolation of shared hardware resources. Two VMs running on separate cores of the same SoC share the last-level cache and the DRAM controller. When a QM multimedia VM's working set evicts the ASIL-D VM's cache lines, the ASIL-D task experiences cache misses that increase its execution time beyond the WCET bound certified in isolation. When the GPU in the QM VM generates burst DRAM traffic, the ASIL-D VM's memory requests are delayed. Both effects occur at the hardware level below the hypervisor's visibility and violate the WCET bounds that the ASIL-D partition's safety case depends on.
 

What is cache coloring and how does it provide temporal isolation between criticality domains?

 

Cache coloring assigns non-overlapping sets of LLC cache lines to different software partitions through careful control of the physical page addresses allocated to each partition. Because a modern LLC is indexed by a subset of the physical address bits, pages whose addresses map to different cache set index ranges can never evict each other from the LLC regardless of access patterns. A hypervisor that allocates colored pages to each partition, giving ASIL-D exclusive access to a reserved portion of the LLC, ensures that QM workloads cannot cause ASIL-D cache misses through eviction, making the ASIL-D partition's cache behavior deterministic under QM load. This converts temporal interference into spatial partitioning at the cache level.
 

What does ISO 26262 require specifically for memory partitioning at ASIL-D?

 

ISO 26262 Part 6 clause 7.4.9 requires that freedom from memory interference between software components of different ASILs be ensured, and specifies that for ASIL-D this partitioning shall be supported by dedicated hardware features. This means that memory protection through the MMU or MPU is required; software-only partitioning (a QM task voluntarily staying within its memory region) is not sufficient. The same principle extends to cache and DRAM temporal interference mitigation: hardware-enforced cache coloring and hardware-enforced memory controller QoS are required for ASIL-D coverage, not software policy alone.
 

What is the ASIL-D decomposition approach for mixed-criticality SoCs and when is it preferable?

 

ASIL decomposition splits an ASIL-D requirement into two redundant ASIL-B elements developed independently, achieving ASIL-D at the system level through redundancy. Applied to a high-performance SoC with shared multimedia workloads, this means the SoC partition needs only ASIL-B certification, which is better supported by existing hypervisor and interference analysis tooling, while a dedicated safety monitor on a separate lockstep core provides the second ASIL-B channel. The system safety argument is simpler and the certification evidence requirements are lighter than achieving ASIL-D on a shared GPU-enabled SoC. The tradeoff is additional silicon for the safety monitor and added latency through the communication path between the main compute and the monitor.