AV2 Packaging for HLS and DASH: CMAF Contracts, Low Latency, DRM, and CDN Efficiency


AV2 is being positioned as the next generation of open video coding, with AOMedia announcing a year-end release and highlighting goals such as better compression, improved screen content handling, multi-view delivery, and support for AR/VR-oriented experiences. How fast AV2 becomes deployable in real OTT stacks will depend not only on encoder performance but also on packaging discipline across CMAF/ISOBMFF, HLS/DASH signaling, DRM alignment, and CDN behavior.

In practical terms, “AV2 readiness” is a delivery-chain property. A platform can have a functioning AV2 encoder and still fail production rollout because players cannot reliably select the right variant, segments cannot be switched seamlessly, key rotation boundaries drift, or low-latency modes drive cache fragmentation and origin load.

This article is a packaging and delivery deep dive built around concrete entities: CMAF switching sets, segment and fragment alignment, HLS and DASH manifest authoring patterns, low-latency mechanics (LL-HLS and LL-DASH), DRM encryption schemes and rotation policies, and CDN cache-efficiency levers. The objective is to help teams introduce AV2 with a controlled blast radius while keeping existing HLS/DASH plus DRM paths stable.

1) Treat AV2 packaging as a “CMAF contract,” not a codec toggle

Most mature streaming stacks converge on fragmented MP4 with CMAF constraints because it allows dual-protocol delivery: the same media segments can be referenced by both an HLS playlist and a DASH MPD, reducing duplication in packaging and storage. The trade-off is that CMAF is strict: the “contract” is defined by switching constraints, alignment rules, initialization segment stability, and consistent metadata signaling.

For AV2, the CMAF contract is where early adoption succeeds or fails because codec ecosystems are initially uneven: decoder availability differs by platform, player builds may lag, and tooling around codec signaling and conformance can be immature in the first wave.

A practical way to think about this is:

  • Encoder output is a media asset.
  • CMAF packaging is an interoperability contract.
  • Manifests are the control plane.
  • DRM and CDN behavior are the operational reality.

AV2 rollout becomes manageable when you explicitly formalize the CMAF contract per codec family and keep the remainder of the delivery chain deterministic.

2) Codec signaling: do not assume “the player will figure it out”

AV2 must be identifiable in three places before you can expect stable playback:

  1. Manifest signaling (HLS master playlist / DASH MPD attributes).
  2. Initialization segment metadata (ISOBMFF sample entry and codec configuration).
  3. Player capability model (device/OS/player/DRM constraints).

AOMedia’s announcement provides the “what” of AV2, but delivery teams must implement the “how”: variant selection and safe fallback in mixed ecosystems.

The operational rule: separate codec families

In early deployments, avoid mixing AV2 and legacy codecs inside a single switching group where clients might attempt seamless switching across codec boundaries. Instead:

  • Build a clean AV2 ladder (representations that can switch among themselves).
  • Keep your legacy ladders (AV1/HEVC/AVC) intact.
  • Use a gating layer (server-side or client-side capability logic) to decide which ladder a session receives.

This reduces failure modes: a client that mis-parses AV2 signaling should never enter a broken state; it should choose a legacy ladder deterministically.

The “fail-closed” requirement

When AV2 is not supported (decoder, DRM, or player), behavior should be:

  • Exclude AV2 variants from selection, or
  • Select a legacy codec variant and continue playback without retries that create startup delay loops.

Your monitoring must explicitly measure:

  • codec chosen per session,
  • fallback rates,
  • decode errors,
  • startup failures correlated with codec.

If you cannot measure these, you cannot safely expand rollout.
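The measurements listed above can be derived from per-session playback records. The following is a minimal sketch, assuming a hypothetical `Session` record emitted by the player; the field names and the "av2" family label are illustrative and not taken from any specific analytics product.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical per-session telemetry record (field names are assumptions)."""
    offered_av2: bool      # the AV2 ladder was offered to this session
    codec_played: str      # codec family actually played, e.g. "av2" or "av1"
    decode_error: bool     # a fatal decode error was reported
    startup_failed: bool   # playback never started


def codec_health(sessions):
    """Summarize codec selection, fallback, and failure rates per codec family."""
    total = len(sessions)
    chosen = Counter(s.codec_played for s in sessions)
    offered = [s for s in sessions if s.offered_av2]
    fell_back = sum(1 for s in offered if s.codec_played != "av2")
    decode_errors = Counter(s.codec_played for s in sessions if s.decode_error)
    startup_failures = Counter(s.codec_played for s in sessions if s.startup_failed)
    return {
        "codec_share": {c: n / total for c, n in chosen.items()},
        "av2_fallback_rate": fell_back / len(offered) if offered else 0.0,
        "decode_error_rate": {c: decode_errors[c] / n for c, n in chosen.items()},
        "startup_failure_rate": {c: startup_failures[c] / n for c, n in chosen.items()},
    }
```

Tracking the fallback rate only over sessions that were actually offered AV2 keeps the metric meaningful while the cohort is still narrow.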

AV2 should be viewed not only as a packaging and delivery challenge, but as a response to how streaming expectations have evolved. Stable 4K playback, sharper graphics in sports and news, multi-angle viewing, and future immersive formats all raise the baseline for what delivery infrastructures must support. The efficiency gains of AV2 matter precisely because they enable these experiences at scale, but only when networks, CDNs, players, and monitoring systems are ready to absorb the change.

In that sense, AV2 does not simply optimize existing workflows; it forces platforms to align video quality ambitions with infrastructure readiness across the entire streaming stack.

3) CMAF fundamentals that actually matter in production

CMAF is often discussed as “fMP4 for HLS and DASH,” but for delivery engineering, the important entities are:

  • Initialization segment (defines track metadata and codec configuration).
  • Fragmented media segments (moof/mdat structure).
  • Switching set (a group of tracks that can be switched seamlessly).
  • Segment boundaries (the ABR grid).
  • Keyframe/GOP alignment (the visual switch points).

3.1 Segment grid: pick it once, enforce it everywhere

A robust packaging stack treats segment boundaries as a global grid for a presentation:

  • Every video representation in a switching set uses the same segment duration (for example, 2s or 4s).
  • Keyframes align with segment boundaries (or, at minimum, with a predictable and documented subset of those boundaries).
  • Audio segment boundaries match video segment boundaries wherever possible, especially for low-latency configurations.

Why this is non-negotiable:

  • ABR switching logic assumes aligned boundaries to minimize rebuffering.
  • Ad insertion and program transitions rely on predictable discontinuities.
  • DRM key rotation and license windowing become far more fragile if boundaries drift.
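A grid check of this kind can be automated before anything reaches a player. The sketch below assumes you have already extracted, per representation, the segment start times and keyframe times as integer ticks in one shared timescale; the dictionary layout and the function name are hypothetical.

```python
def grid_violations(ladder, segment_ticks):
    """Return (representation_id, problem) pairs for anything off the shared grid.

    ladder maps a representation id to a dict with two lists of integer
    timestamps in a common timescale: "segment_starts" and "keyframes".
    """
    problems = []
    reference = None
    for rep_id, rep in ladder.items():
        starts = rep["segment_starts"]
        keyframes = set(rep["keyframes"])
        if reference is None:
            reference = starts  # the first representation defines the grid
        elif starts != reference:
            problems.append((rep_id, "segment boundaries differ from first representation"))
        for t in starts:
            if t % segment_ticks != 0:
                problems.append((rep_id, f"segment start {t} is off the grid"))
            if t not in keyframes:
                problems.append((rep_id, f"no keyframe at segment start {t}"))
    return problems
```

Integer ticks are used instead of floating-point seconds so that two representations with the same boundaries always compare as equal.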

3.2 Initialization segment stability: do not mutate headers casually

In production, header churn causes support issues:

  • Some clients cache initialization segments aggressively.
  • Some players behave unpredictably if codec configuration changes mid-stream without clear signaling.

For AV2, where the ecosystem is younger, treat initialization segments as immutable per representation (or per rendition profile) unless you have a tested reason to rotate them. If you must change them (for example, profile change), make it an explicit, well-signaled event aligned to a program boundary.
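One way to catch silent header changes is to fingerprint each initialization segment at publish time and compare on every packaging run. This is a minimal sketch under that assumption; where the known fingerprints are stored, and the function names, are left hypothetical.

```python
import hashlib


def init_fingerprint(init_bytes: bytes) -> str:
    """Return a SHA-256 hex digest of an initialization segment's bytes."""
    return hashlib.sha256(init_bytes).hexdigest()


def changed_inits(known: dict, current: dict) -> list:
    """List representation ids whose init segment no longer matches its fingerprint.

    known maps representation id -> previously recorded fingerprint.
    current maps representation id -> init segment bytes from the latest run.
    Representations without a recorded fingerprint are ignored here.
    """
    return sorted(
        rep_id
        for rep_id, data in current.items()
        if rep_id in known and init_fingerprint(data) != known[rep_id]
    )
```

A non-empty result is a reason to block the publish step until someone confirms that the change is intentional and aligned to a program boundary.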

3.3 Switching constraints: keep ladders “switchable,” not just “encoded”

Switching is safe when the ladder is designed for switching, not simply created from different bitrates. In practice, that means:

  • Consistent frame rate policy across the ladder.
  • Consistent HDR/SDR grouping (do not surprise-switch between color spaces).
  • Consistent slice/tile and codec tool usage that does not break decoder expectations mid-switch.
  • Segment and fragment timing that is monotonic and well-formed across the ladder.
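Some of these constraints can be checked from ladder metadata alone. The sketch below flags any attribute that takes more than one value inside a ladder; the attribute names (`frame_rate`, `dynamic_range`, `codec_family`) are assumptions about how your packaging metadata is modeled.

```python
def inconsistent_fields(ladder, fields=("frame_rate", "dynamic_range", "codec_family")):
    """Return {field: sorted distinct values} for fields that vary across a ladder.

    ladder is a list of dicts, one per representation. An empty result means
    the checked attributes are uniform across the ladder.
    """
    report = {}
    for field in fields:
        values = {str(rep[field]) for rep in ladder}
        if len(values) > 1:
            report[field] = sorted(values)
    return report
```

This covers only declared metadata. Decoder behavior during an actual switch still has to be validated on real devices.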

This is where “AV2 as a new codec” amplifies risk: any edge-case decoder behavior will surface first during switching, not during steady-state playback.

4) Manifest authoring for DASH: represent codec families explicitly

DASH MPDs give you a structured way to model switching groups, but only if you keep codec families clean.

4.1 AdaptationSet strategy: one codec family per AdaptationSet

A conservative, production-safe MPD design is:

  • One video AdaptationSet per codec family (AV2, AV1, HEVC, AVC).
  • Within each AdaptationSet: a ladder of Representations (bitrate tiers) that share switching constraints.
  • Separate AdaptationSets for HDR vs SDR if your device matrix is heterogeneous.

This reduces ambiguity and makes debugging easier: when a client selects an AdaptationSet, you can map it directly to a codec family and device capability class.

4.2 Initialization and SegmentTemplate consistency

Keep representation patterns consistent. A typical structure (illustrative) is:

<AdaptationSet id="v-av2" contentType="video" mimeType="video/mp4" codecs="AV2_CODEC_ID">
  <Representation id="v-av2-3000" bandwidth="3000000" width="1920" height="1080">
    <SegmentTemplate initialization="init-$RepresentationID$.mp4"
                     media="seg-$RepresentationID$-$Number$.m4s"
                     timescale="90000"
                     duration="180000"/>
  </Representation>
</AdaptationSet>

Important points:

  • “AV2_CODEC_ID” is a placeholder for the standardized codec string once it is finalized for your toolchain; do not hardcode speculative identifiers.
  • SegmentTemplate durations must align across representations inside the switching set.
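In the example above, a duration of 180000 at a timescale of 90000 is a 2-second segment. The following sketch checks that every SegmentTemplate inside an AdaptationSet resolves to the same duration in seconds. It assumes number-based templates with a `duration` attribute (SegmentTimeline templates are skipped) and the standard MPD namespace; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

# Standard DASH MPD namespace
NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}


def segment_seconds_by_set(mpd_xml: str) -> dict:
    """Map each AdaptationSet id to the set of segment durations it declares."""
    root = ET.fromstring(mpd_xml)
    result = {}
    for aset in root.iterfind(".//mpd:AdaptationSet", NS):
        durations = set()
        for tmpl in aset.iterfind(".//mpd:SegmentTemplate", NS):
            duration = tmpl.get("duration")
            if duration is None:
                continue  # SegmentTimeline-based template; not covered by this sketch
            timescale = int(tmpl.get("timescale", "1"))
            durations.add(Fraction(int(duration), timescale))
        result[aset.get("id")] = durations
    return result
```

An AdaptationSet whose set contains more than one value has representations on different segment grids. Exact fractions are used so that 180000/90000 and 2000/1000 compare as equal.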

4.3 Low-latency DASH signaling: treat it as a separate operating mode

LL-DASH is typically built around the idea that segments become playable before they are fully produced, often using HTTP chunked transfer and signaling that segments are available “early.” DASH-IF low-latency guidance emphasizes consumption near the live edge while segments are still being produced.

Operationally:

  • LL-DASH adds sensitivity to origin performance and CDN connection handling.
  • Your MPD update strategy and availability windows become critical.
  • Your “normal latency” DASH config is not automatically safe for LL-DASH.

If you plan AV2 + LL-DASH, isolate variables: validate LL-DASH stability with a known codec first, then enable AV2 for a constrained cohort.

5) Manifest authoring for HLS: codec clarity and deterministic fallback

In HLS, the master playlist is the decision point. New codecs often fail not at decoding, but at variant selection because of signaling mismatches.

5.1 Master playlist: keep AV2 variants separable

A practical master playlist approach is:

  • Group variants by codec family.
  • Ensure CODECS attributes are correct for each group.
  • Avoid ambiguous codec strings that cause players to reject the entire master.

Illustrative example:

#EXTM3U

# Legacy ladder
#EXT-X-STREAM-INF:BANDWIDTH=4500000,CODECS="av01.0.08M.08,mp4a.40.2",RESOLUTION=1920x1080
legacy/av1/1080p.m3u8

# AV2 ladder (placeholder codec id until standardized for your stack)
#EXT-X-STREAM-INF:BANDWIDTH=3500000,CODECS="AV2_CODEC_ID,mp4a.40.2",RESOLUTION=1920x1080
av2/1080p.m3u8

This is intentionally explicit: if the HLS client does not understand the AV2 codec string, it should still be able to select the legacy ladder.
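If you prefer not to rely on client-side rejection, the same separation allows a server-side filter to strip unsupported variants before the playlist is served. This is a minimal sketch, assuming the master playlist is available as text and that an `is_supported` callback (hypothetical) encodes your capability decision per codec string.

```python
import re

_CODECS = re.compile(r'CODECS="([^"]*)"')


def filter_master(master: str, is_supported) -> str:
    """Drop every variant whose CODECS list contains an unsupported entry.

    A variant is an #EXT-X-STREAM-INF line plus the URI line that follows it.
    Variants without a CODECS attribute are kept unchanged.
    """
    out = []
    skip_uri = False
    for line in master.splitlines():
        if line.startswith("#EXT-X-STREAM-INF"):
            match = _CODECS.search(line)
            codecs = match.group(1).split(",") if match else []
            skip_uri = not all(is_supported(c.strip()) for c in codecs)
            if skip_uri:
                continue  # drop the tag line of the unsupported variant
        elif skip_uri and line and not line.startswith("#"):
            skip_uri = False
            continue  # drop the URI that belonged to the dropped tag
        out.append(line)
    return "\n".join(out)
```

For example, calling `filter_master(text, lambda c: c.startswith(("av01.", "mp4a.")))` would keep the legacy ladder and remove the AV2 variant. The prefix test is only a stand-in for real capability logic.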

5.2 LL-HLS: know what your server must support

Apple’s Low-Latency HLS guidance introduces tags and server behaviors such as partial segments and preload hints (for example, EXT-X-PRELOAD-HINT) that enable clients to fetch media earlier.

LL-HLS changes your operational profile:

  • Higher request frequency (parts + playlist reload).
  • More sensitivity to clocking and live-edge drift.
  • CDN configuration becomes more than “cache segments”; it becomes “handle many small objects and frequent updates efficiently.”

When pairing LL-HLS with AV2, it is easy to misattribute problems. A disciplined rollout does this in order:

  1. Prove LL-HLS stability with legacy codec.
  2. Add AV2 within LL-HLS for a narrow cohort.
  3. Expand once the telemetry supports it.

 


6) Low latency changes your segment economics and cache behavior

Codec efficiency savings are real only if delivery remains cache-efficient. Low latency can reduce cache efficiency by design:

  • playlists update more often,
  • parts are smaller objects,
  • requests per second increase,
  • cache keys can fragment if manifests are not structured carefully.

This is why “AV2 saves bandwidth” is not the full story. You can win bitrate efficiency and still lose on total CDN cost if low-latency configuration increases miss rates and origin egress.

Practical controls that matter:

  • Segment duration and part duration policy (balanced for latency and cacheability).
  • Consistent URL patterns that avoid cache fragmentation.
  • Explicit cache-control policy for playlists versus media segments.
  • Origin capacity for long-lived connections and chunked transfer.

Your AV2 business case should therefore include:

  • bitrate savings,
  • request-rate changes,
  • cache hit ratio deltas,
  • origin egress impact,
  • and player error/retry behavior.
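A small model shows how these terms interact. The sketch below uses two simplifying assumptions: origin egress is the delivered bitrate multiplied by the cache miss ratio, and a client issues roughly one media request per object duration. Both function names are hypothetical.

```python
def origin_egress_bps(bitrate_bps: float, cache_hit_ratio: float) -> float:
    """Approximate per-viewer origin egress: only cache misses reach the origin."""
    return bitrate_bps * (1.0 - cache_hit_ratio)


def media_requests_per_second(object_duration_s: float) -> float:
    """Approximate per-viewer media request rate for objects of a given duration."""
    return 1.0 / object_duration_s
```

With illustrative numbers, a 4.5 Mbps ladder at a 97% hit ratio sends roughly 135 kbps per viewer to origin, while a 3.5 Mbps ladder at a 95% hit ratio sends roughly 175 kbps. The lower bitrate loses at the origin despite winning at the edge. Likewise, moving from 2-second segments to 0.5-second parts raises the media request rate from about 0.5 to about 2 requests per second per viewer, before counting playlist reloads.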

7) DRM and AV2: align encryption schemes, rotation boundaries, and signaling

Common encryption is the foundation for multi-DRM in ISOBMFF-based workflows. ISO/IEC 23001-7 defines common encryption formats that allow multiple DRM systems to access the same encrypted file or stream.

In practice, DRM introduces the most rollout risk because it couples:

  • encryption scheme compatibility,
  • platform DRM implementations,
  • player behavior,
  • and packaging correctness.

7.1 Know your encryption schemes

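Common encryption defines four protection schemes: cenc and cens, which use AES-CTR (full-sample and pattern encryption respectively), and cbc1 and cbcs, which use AES-CBC (full-sample and pattern encryption respectively). In deployed multi-DRM workflows, cenc and cbcs are the schemes you will most often have to choose between.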

You do not need to expose these details to business stakeholders, but delivery engineering must choose deliberately:

  • which schemes apply to which device families,
  • and how you keep audio/video aligned under rotation.

7.2 Key rotation: treat it as an alignment event

Key rotation is a frequent cause of stalls and “works on one device, fails on another” bugs. A reliable policy is:

  • rotate keys only on segment boundaries,
  • rotate audio and video together,
  • keep initialization segments stable unless the DRM system requires a change,
  • and validate rotations across your real player matrix (not only with packager output inspection).
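The first two rules can be enforced arithmetically in the packager configuration. This sketch assumes number-based segments on a shared audio/video grid, so that a segment number alone determines the key period; the function names and parameters are hypothetical.

```python
def rotation_is_aligned(rotation_ticks: int, segment_ticks: int) -> bool:
    """True when the rotation period is a whole number of segments."""
    return rotation_ticks > 0 and rotation_ticks % segment_ticks == 0


def key_period(segment_number: int, segments_per_key: int, first_segment: int = 1) -> int:
    """Index of the key period a segment belongs to.

    Audio and video call this with the same segment number, so they
    switch keys on the same boundary by construction.
    """
    return (segment_number - first_segment) // segments_per_key
```

Deriving the key period from the segment number, rather than from wall-clock time in each track's packager, removes one common source of boundary drift between audio and video.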

7.3 Separate “codec support” from “DRM support”

A device may be able to decode AV2 but fail the DRM path you select (or vice versa). Your gating logic must therefore consider:

  • codec decode capability,
  • DRM capability (and specific scheme support),
  • and player version constraints.
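The three checks can be combined into a single deterministic selection function. The following is a minimal sketch: the `Client` model, the family labels, the minimum player version, and the use of scheme names such as "cenc" and "cbcs" (two of the protection schemes in ISO/IEC 23001-7) as capability keys are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    """Hypothetical capability model reported by or inferred for a client."""
    decodable: frozenset       # codec families the device can decode
    drm_schemes: frozenset     # encryption schemes its DRM path supports
    player_version: tuple      # e.g. (5, 1, 0)


def choose_ladder(client, ladders, av2_min_player=(5, 0, 0)):
    """Return the first codec family the client fully supports, or None.

    ladders is an ordered list of (codec_family, encryption_scheme) pairs,
    most preferred first. AV2 additionally requires a minimum player version.
    """
    for family, scheme in ladders:
        if family not in client.decodable:
            continue
        if scheme not in client.drm_schemes:
            continue
        if family == "av2" and client.player_version < av2_min_player:
            continue
        return family
    return None
```

Because the function walks an ordered list and never retries, a client that fails any AV2 check lands on a legacy ladder in one step.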

In early AV2 deployments, reduce variables:

  • one protocol first (DASH or HLS),
  • one DRM family first,
  • and one player implementation first.

8) A staged rollout plan for AV2 that minimizes operational risk

AOMedia’s member survey suggests strong intent to adopt AV2 quickly once finalized, with reported plans for adoption within 12 months and broader implementation within two years. That intent is not a rollout plan. Streaming platforms still need gating, telemetry, and rollback.

A production-safe rollout typically has four gates.

Gate 1: Packaging correctness and CMAF contract verification

Before field tests:

  • validate segment alignment across the ladder,
  • validate init segments per representation,
  • validate manifest correctness (HLS + DASH),
  • validate switching behavior under bandwidth fluctuation.

Gate 2: Capability gating and deterministic fallback

Define your initial AV2 cohort by explicit rules:

  • device family and OS version,
  • player version,
  • DRM capability for the chosen scheme,
  • CPU/thermal headroom if software decode is involved.

Instrument:

  • codec chosen,
  • startup time,
  • rebuffering,
  • fatal decode errors,
  • DRM errors,
  • and fallback success rates.

Gate 3: VOD first, then limited live

Start with VOD:

  • stable assets,
  • predictable packaging,
  • better reproducibility in debugging.

Only then:

  • test live with normal latency,
  • then test low latency modes (LL-HLS/LL-DASH),
  • with narrow cohorts.

Gate 4: Rollback discipline

Rollback should be configuration-driven:

  • disable AV2 selection,
  • revert to legacy ladders,
  • avoid content reprocessing just to “turn off” AV2.

If rollback requires repackaging, you will hesitate to use it, which increases incident duration.
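Configuration-driven rollback can be sketched as a kill switch combined with stable cohort bucketing. In the example below, the config keys `av2_enabled` and `av2_percent` and the choice of a device identifier as the bucketing key are hypothetical.

```python
import hashlib


def in_av2_cohort(device_id: str, config: dict) -> bool:
    """Decide whether a device is in the AV2 cohort, using configuration only.

    The kill switch wins over everything else. Otherwise the device id is
    hashed into one of 10,000 stable buckets and compared to the rollout
    percentage, so the same device gets the same answer on every session.
    """
    if not config.get("av2_enabled", False):
        return False
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 10_000
    return bucket < config.get("av2_percent", 0.0) * 100
```

Setting `av2_enabled` to false turns AV2 selection off for every session without touching packaged content, which is the property the rollback gate depends on.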

9) Where the “deep dive” becomes practical: the packaging checklist

If you want a concise engineering checklist for AV2 packaging readiness, it typically looks like this:

CMAF and segmentation

  • One segment grid per presentation, enforced across representations.
  • Keyframe alignment to segment boundaries for switch safety.
  • Stable init segments per representation (no silent header changes).
  • Clear separation of codec families into separate ladders.

Manifests

  • DASH: codec families separated by AdaptationSet; consistent SegmentTemplate policy.
  • HLS: codec families cleanly separable; CODECS attributes correct; deterministic fallback preserved.

Low latency

  • Validate LL mode baseline with legacy codec first.
  • Confirm server behaviors for partial delivery and playlist update patterns.
  • Quantify request-rate and cache-hit impacts before scaling.

DRM

  • Explicit encryption scheme selection per platform.
  • Key rotation aligned across audio/video and bounded by segment boundaries.
  • Validate with real device/player matrix, not only lab players.

CDN

  • Validate cache key strategy for parts and playlists.
  • Confirm origin capacity for low-latency connection patterns.
  • Measure cache hit ratio and origin egress changes during pilots.

10) Where Promwad can help (neutral scope)

Deploying a new codec typically requires coordinated engineering across encoder integration, packaging, player validation, and production monitoring. Promwad’s experience in embedded systems, FPGA acceleration, and video architectures can be applied to prototyping AV2-ready pipelines, validating codec and packaging toolchains, and planning staged rollouts that preserve existing HLS/DASH and DRM behavior while introducing AV2 where it is operationally safe.

AI Overview: AV2 Packaging for HLS and DASH

AV2 rollout readiness is driven by CMAF/ISOBMFF signaling, manifest authoring, DRM alignment, low-latency server behavior, and CDN cache efficiency, not only by encoder performance.

  • Key Applications: CMAF-first packaging for AV2 ladders delivered via both HLS and DASH, codec-gated rollouts with deterministic legacy fallback, low-latency delivery using LL-HLS or LL-DASH mechanics, multi-DRM encrypted ISOBMFF workflows, CDN-optimized segmentation and cache-key strategies.
  • Benefits: Faster and safer AV2 adoption by isolating codec families, predictable ABR switching via aligned CMAF segment grids, reduced operational risk through staged rollout gates and configuration-driven rollback, clearer observability of codec selection and failure modes, improved ability to realize codec savings without degrading CDN efficiency.
  • Challenges: Early ecosystem unevenness in AV2 signaling/toolchains and player compatibility, strict CMAF switching constraints and alignment requirements, increased sensitivity and request load in low-latency modes, DRM scheme and key rotation boundary compatibility across platforms, CDN cache fragmentation risks from parts and frequent manifest updates.
  • Outlook: Initial deployments prioritize VOD and narrow cohorts with CMAF-first packaging and strong telemetry, low-latency AV2 follows after LL delivery behavior is validated on baseline codecs and CDN/origin capacity is proven, broader adoption accelerates as AV2 toolchains and device support mature and packaging conventions stabilize.
  • Related Terms: AV2, CMAF, ISOBMFF, fMP4, HLS, DASH, LL-HLS, LL-DASH, segment alignment, switching sets, CENC, DRM, key rotation, CDN caching.
 

 


FAQ

How do you package AV2 once and deliver it through both HLS and DASH?

Use a CMAF-first workflow so the same fragmented MP4 segments can be referenced from both an HLS playlist and a DASH MPD, but keep codec families separated into distinct ladders and enforce alignment and switching constraints across the ladder.

What is the safest way to introduce AV2 into an existing multi-codec streaming stack?

Deploy AV2 as a gated ladder for a narrow, explicitly defined device/player/DRM cohort, preserve legacy ladders as the default, and expand only when telemetry shows stable codec selection and low failure rates.

Can low-latency streaming be combined with AV2 in the first rollout wave?

It can, but it increases risk because LL-HLS and LL-DASH amplify sensitivity to manifest correctness, origin performance, and CDN behavior. A safer sequence is to validate low-latency delivery with a known codec baseline first, then introduce AV2 for a narrow cohort.

What breaks most often when moving to CMAF and low latency?

Misaligned segment boundaries, inconsistent init segments, manifest errors near the live edge, DRM key rotation boundary drift, and CDN cache fragmentation from parts and frequent playlist updates.

How should DRM be handled when adding AV2 to a CMAF packaging workflow?

Choose encryption schemes deliberately per platform, keep key rotation aligned across audio/video on segment boundaries, and gate AV2 by both codec decode capability and DRM scheme compatibility to avoid “codec works, DRM fails” sessions.

How do you ensure AV2 savings translate into lower CDN cost?

Measure cache hit ratio and request-rate changes alongside bitrate efficiency. Low-latency modes can increase requests and reduce cacheability; tune segment and part sizing and control variant counts to avoid offsetting codec gains.