Apple’s HLS and MPEG-DASH are two of the most popular HTTP Adaptive Streaming (HAS) protocols and have become the de facto industry standards. In this blog post we will talk about MPEG-DASH, and more specifically its low latency variant (LL-DASH): how it works and what to consider when implementing it. You can also check out our previous blog posts to learn more about HLS and Low Latency HLS (LL-HLS).
Dynamic Adaptive Streaming over HTTP
Dynamic Adaptive Streaming over HTTP (DASH) was developed by MPEG (Moving Picture Experts Group) starting in 2010 and published as an international standard (ISO/IEC 23009-1) in 2012. This streaming protocol is codec agnostic, meaning it is compatible with almost all video codecs such as H.264, H.265/HEVC, VP9 and AV1, as well as most audio codecs. Low Latency DASH (LL-DASH) was first introduced in 2017 by means of interoperability profiles and extensions of the specification. As with regular MPEG-DASH, support comes from a “bring your own” media player approach on most common browsers and platforms, including notable platforms such as Android devices and smart TVs. While most platforms don’t support LL-DASH natively, players for these platforms are available.
With MPEG-DASH being compatible with the Common Media Application Format (CMAF), many people wrongly assume this also allows for low latency. This is however not the case: CMAF itself does not reduce latency, but it does provide some tools that make low latency possible, such as splitting up media segments into smaller chunks. The main driver behind CMAF was efficiency, not low latency.
MPEG-DASH works by splitting video into smaller files, called segments, which can be sent efficiently over the network and over Content Delivery Networks (CDNs). Latency is introduced by the segments themselves (as they travel through the video ecosystem), the encoder, the CDN or network, and the player buffer. While reducing the size of the segments might slightly reduce the glass-to-glass latency, smaller segments could cause the quality or the bandwidth of the stream to suffer, and they also result in more requests to the CDN (slowing down the stream).
How Does LL-DASH Work?
Although LL-DASH is sometimes referred to in the industry as CMAF-CTE (Chunked Transfer Encoding), this is not entirely accurate, as CMAF-CTE is only one method used in LL-DASH. This method essentially cuts the segments into smaller, non-overlapping chunks (often containing a handful of frames), which can be delivered independently of one another. This means the origin doesn't have to wait until the segment is completely finished before the first chunks can be sent to the player.
The manifest can indicate when a segment will become available on the server. By knowing when segments become available, a client can request a segment as soon as it becomes available, reducing delays. This does require accurate synchronization between the clocks of the client and server. For this, MPEG-DASH allows you to configure a time server in the manifest (the UTCTiming element).
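To make the availability calculation concrete, here is a minimal sketch in Python. It assumes a single period starting at time zero and fixed-duration segments; the parameter names mirror the manifest's availabilityStartTime, startNumber and availabilityTimeOffset values, while the function names and the example numbers are our own illustration, not part of any player API.

```python
from datetime import datetime, timedelta, timezone


def segment_available_at(
    availability_start: datetime,
    segment_number: int,
    start_number: int,
    segment_duration_s: float,
    availability_time_offset_s: float = 0.0,
) -> datetime:
    """Wall-clock time at which a segment can first be requested.

    A regular live segment becomes available once it is complete, i.e. at its
    end time. With an availability time offset, the request may be sent that
    many seconds earlier, while the segment is still being produced.
    """
    index = segment_number - start_number
    segment_end_s = (index + 1) * segment_duration_s
    return availability_start + timedelta(
        seconds=segment_end_s - availability_time_offset_s
    )


def server_now(client_now: datetime, clock_offset: timedelta) -> datetime:
    """Correct the local clock with an offset measured against a time server."""
    return client_now + clock_offset


# Hypothetical stream: 4 s segments, numbering starts at 1.
ast = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
regular = segment_available_at(ast, 10, 1, 4.0)
low_latency = segment_available_at(ast, 10, 1, 4.0, availability_time_offset_s=3.5)
print(regular.isoformat())
print(low_latency.isoformat())
```

In this example the low latency request can go out 3.5 seconds before the regular one. If the client's clock is off, it either requests too early (and gets an error) or too late (and adds latency), which is why the corrected clock matters.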
DASH vs. LL-DASH
Usually when live streaming, the media is sent in segments that are each a few seconds long. Segments shorter than roughly 2 to 4 seconds risk poor performance or playback quality. With LL-DASH, if the player requests a segment that is not yet complete, its chunks are delivered as soon as they are available using Chunked Transfer Encoding. Because media can be delivered to the client as soon as it is generated on the server side, overhead is reduced, which allows the client's buffer, and as a result the latency, to be reduced. A client's buffer should contain around 3-4 chunks to guarantee smooth playback, potentially bringing the latency down to the same order of magnitude. In practice, it is also important to take manifest handling, time synchronization and buffering for network issues into account. With LL-DASH, we can reduce the latency to a couple of seconds instead of the tens of seconds that we had before.
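As a rough back-of-the-envelope illustration of the buffer's contribution, consider the sketch below. The durations and buffer depths are hypothetical values chosen for the example, and the result covers the player buffer only, not the encoder, CDN, manifest handling or clock synchronization.

```python
def estimated_buffer_latency_s(unit_duration_s: float, buffered_units: int) -> float:
    """Latency contributed by the player buffer alone, in seconds."""
    return unit_duration_s * buffered_units


# Segment-based delivery: hypothetical 4 s segments, 3 segments buffered.
segment_based = estimated_buffer_latency_s(4.0, 3)
# Chunked delivery: hypothetical 0.5 s chunks, 4 chunks buffered.
chunk_based = estimated_buffer_latency_s(0.5, 4)
print(segment_based, chunk_based)
```

With these assumed numbers, the buffer accounts for 12 seconds in the segment-based case and 2 seconds in the chunked case.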
Protecting your low latency content with DRM for LL-DASH
DRM can have an important impact on latency: the player must first inspect the content, and once it notices the content is encrypted, it requests the license that is needed to start the stream. In low latency situations, buffers are smaller, and making these requests could cause a delay of a few hundred milliseconds or even seconds. This delay may not seem like much, but it’s important to keep in mind, especially when it comes to start-up time and key rotation (e.g. with SSAI, when a period ends and a new one starts). The player must have the license before playback can start, so the request must happen quickly.
In DASH you can put DRM initialisation information (called the PSSH) in the manifest or in the initialisation segment. By placing this information in the manifest, a client can start negotiating for a license before any media segments have to be loaded. In a low latency case, it is recommended to provide the PSSH information in the manifest in order to avoid any unneeded delays for the license request.
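As an illustration of what "PSSH in the manifest" looks like from a client's point of view, the sketch below reads the PSSH data straight out of a manifest, before any media has been loaded. The manifest fragment is a simplified example written for this post, and its pssh payload is a placeholder rather than real DRM data; the function name is our own.

```python
import xml.etree.ElementTree as ET
from typing import Dict

NS = {
    "mpd": "urn:mpeg:dash:schema:mpd:2011",
    "cenc": "urn:mpeg:cenc:2013",
}

# Illustrative manifest fragment; the pssh payload is a placeholder, not real data.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:cenc="urn:mpeg:cenc:2013">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <ContentProtection schemeIdUri="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed">
        <cenc:pssh>UExBQ0VIT0xERVI=</cenc:pssh>
      </ContentProtection>
    </AdaptationSet>
  </Period>
</MPD>"""


def pssh_by_system(mpd_xml: str) -> Dict[str, str]:
    """Map each DRM system's schemeIdUri to the base64 PSSH found in the manifest."""
    root = ET.fromstring(mpd_xml)
    found: Dict[str, str] = {}
    for protection in root.iterfind(".//mpd:ContentProtection", NS):
        pssh = protection.find("cenc:pssh", NS)
        if pssh is not None and pssh.text:
            found[protection.get("schemeIdUri", "").lower()] = pssh.text.strip()
    return found


print(pssh_by_system(SAMPLE_MPD))
```

If the returned mapping is empty, a client would have to fall back to the initialisation segment, which is exactly the extra round trip a low latency setup wants to avoid.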
Widevine and PlayReady both support MPEG-DASH, so with that in mind, it is possible to encrypt and package your content once and decrypt using either DRM system. It is possible to do the same with HLS, and add all DRMs (including FairPlay) in one stream.
Monetizing your low latency content with SSAI for LL-DASH
With MPEG-DASH, SSAI can be implemented using Periods. Periods are some of the basic building blocks, indicating a coherent piece of content being played, such as a section of main content, or an advertisement. A manifest contains one or more periods, which in turn contain Adaptation Sets (about the same as a Track) containing Representations (which are the different qualities). When inserting an advertisement, a period can be added in the manifest to break up the main content. In LL-DASH, this same mechanism can be used.
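To illustrate the Period / Adaptation Set / Representation hierarchy, here is a small sketch that walks a multi-period manifest. The manifest is a stripped-down example made up for this post (real manifests carry segment information and many more attributes), and the ids and start times are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

MPD_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# Simplified, hypothetical manifest: main content, an ad break, main content again.
MULTI_PERIOD_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic">
  <Period id="content-1" start="PT0S">
    <AdaptationSet contentType="video">
      <Representation id="v-720p" bandwidth="3000000"/>
      <Representation id="v-1080p" bandwidth="6000000"/>
    </AdaptationSet>
  </Period>
  <Period id="ad-1" start="PT60S">
    <AdaptationSet contentType="video">
      <Representation id="ad-720p" bandwidth="3000000"/>
    </AdaptationSet>
  </Period>
  <Period id="content-2" start="PT90S">
    <AdaptationSet contentType="video">
      <Representation id="v-720p" bandwidth="3000000"/>
      <Representation id="v-1080p" bandwidth="6000000"/>
    </AdaptationSet>
  </Period>
</MPD>"""

PeriodSummary = Tuple[Optional[str], Optional[str], List[Optional[str]]]


def describe_periods(mpd_xml: str) -> List[PeriodSummary]:
    """Return (period id, period start, representation ids) in manifest order."""
    root = ET.fromstring(mpd_xml)
    summary: List[PeriodSummary] = []
    for period in root.findall("mpd:Period", MPD_NS):
        representation_ids = [
            rep.get("id")
            for rep in period.iterfind("mpd:AdaptationSet/mpd:Representation", MPD_NS)
        ]
        summary.append((period.get("id"), period.get("start"), representation_ids))
    return summary


for period_id, start, representations in describe_periods(MULTI_PERIOD_MPD):
    print(period_id, start, representations)
```

Note how the ad period in this example offers a different set of Representations than the main content. That kind of change at a period boundary is what forces a player to reconfigure.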
There are some important limitations to multi period support:
- Not all players support this.
- Decoders (especially on older devices or platforms such as smart TV) don't like the reconfiguration required when changing periods. This often results in black frames or delays between playback of different periods.
- Enabling or disabling DRM on a decoder can be especially tricky.
To identify a future period, a client has to update the manifest. With low latency, this means the player needs to refresh the manifest at a short interval. Depending on when the ad has to be inserted, a segment may end early, in which case the client cannot anticipate the next segment becoming available at a certain location, but instead needs to discover it from an up-to-date manifest. This results in frequent refreshes of the manifest, causing overhead. All of this makes SSAI with Low Latency DASH far from trivial, and it should be evaluated carefully.
Improving accessibility with subtitles and captions at low latency with LL-DASH
DASH is most often used with WebVTT or TTML. These require start and end times to be specified for every cue. Cues often appear on the screen for a few seconds, allowing a viewer to read the text. When the latency is supposed to be only a few seconds, knowing the end time of a cue can be a challenge. When a player is close to the live edge, the end time can be unknown, but this is not supported in the WebVTT or TTML formats. For low latency, a creative solution is needed to ensure proper rendering of subtitles. Two approaches are possible:
- Give subtitles a fixed duration: This is simple, but doesn't always give the desired result, for example when a commentator in a sports match is speaking rapidly during on-screen action.
- Repeat subtitles: For example, give each cue an end time 1s after its start and repeat it to extend it if needed. This is also quite simple, but a client has to be aware that subtitles are being repeated. If the repeated cues are badly aligned in start and end time, cues could appear on screen twice (however briefly), or could disappear for a short amount of time, creating a blinking effect.
Based on our experience, the second approach is the most flexible. To avoid blinking, a client can extend the cue further when no information about the cue is known after the end of the current segment. Also, it can detect duplicate text and render it on screen only once.
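The duplicate detection part of the second approach could look roughly like the sketch below. It is an illustration only: the Cue type, the function name and the 0.1 s gap tolerance are assumptions made for this example, and it does not cover extending a cue past the end of the current segment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def merge_repeated_cues(cues: List[Cue], max_gap_s: float = 0.1) -> List[Cue]:
    """Collapse repeated cues with identical text into one continuous cue.

    Repeats that overlap the previous cue, or follow it within max_gap_s,
    extend that cue instead of being rendered again. This avoids both
    double rendering and short gaps that would look like blinking.
    """
    merged: List[Cue] = []
    for cue in sorted(cues, key=lambda c: c.start):
        last = merged[-1] if merged else None
        if (
            last is not None
            and cue.text == last.text
            and cue.start <= last.end + max_gap_s
        ):
            last.end = max(last.end, cue.end)
        else:
            # Copy the cue so the caller's list is left untouched.
            merged.append(Cue(cue.start, cue.end, cue.text))
    return merged


repeated = [
    Cue(0.0, 1.0, "Goal!"),
    Cue(1.05, 2.0, "Goal!"),   # small gap after the first repeat
    Cue(1.5, 2.5, "Goal!"),    # overlaps the previous repeat
    Cue(3.0, 4.0, "What a save"),
]
for cue in merge_repeated_cues(repeated):
    print(cue)
```

Here the three "Goal!" repeats collapse into a single cue running from 0.0 to 2.5 seconds, and the second line of commentary is left as is.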
Test Your LL-DASH Stream
LL-DASH is supported widely across the video ecosystem in terms of compatibility and flexibility with different video codecs. DRM, SSAI and subtitles are features that are crucial for a wide range of services and impact content protection, content monetisation and accessibility. You can see LL-DASH in action on our test page, where you can use a stream provided by our friends at Akamai, or insert your own LL-DASH stream to test.
These solutions can be optimised together with our THEO Universal Player Solution to reduce latency and improve the viewer experience. THEO has also implemented an Adaptive Bitrate algorithm optimised for Low Latency DASH. If you have questions or would like personalised advice for your specific use case, contact our experts.