LL-HLS Series: Implementing LL-HLS with ABR, Subtitles, DRM and SSAI
by THEOplayer on September 1, 2020
In our previous blog posts we discussed a number of topics regarding Apple’s Low Latency HLS (LL-HLS), such as its main use cases and how to implement and configure it in an end-to-end solution. We also discussed how LL-HLS has evolved, and how those changes have impacted the industry. Because the specification is still young, the tooling around it has not fully matured. In this blog we go more in depth on the challenges this immaturity creates when implementing LL-HLS in combination with (standard) features such as ABR, subtitles, content protection and server side ad insertion (SSAI). These capabilities are crucial for a wide range of services: they bring accessibility and have a critical impact on content and monetization options. It is therefore important to be aware of these challenges.
Network Independence with Adaptive BitRate (ABR) Switching
ABR is important because it allows a viewer to have a good experience regardless of the capabilities or conditions of the network they are using. When a viewer’s network encounters problems, this will quickly impact the player’s ability to load data fast enough and continue smooth playback.
ABR helps to avoid playback stalls by tracking the current network state: average bandwidth, peak bandwidth, round trip time, etc. When the network quality deteriorates, the player switches to a lower bitrate, and when it improves, to a higher bitrate. To do this correctly and avoid playback stalls, the player’s buffer must hold enough media to cover the time it takes to switch to a lower quality. In low latency, buffers are often small, so it is important to have an optimized algorithm which can react very quickly to changes in bandwidth. In LL-HLS, this also means enough independent frames have to be available. This depends on the GOP size, which we discuss in detail in our previous blog post.
The ABR algorithm needs to accurately estimate two things: how long it will take to buffer from the last independent frame, and how long it will take until the next independent frame appears. Based on the current buffer size, it then has to decide which action to take.
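As a rough illustration of this decision, the Python sketch below compares the two estimates against the buffer. It is not THEOplayer’s algorithm: the function name, the action labels and the 0.25 second safety margin are assumptions made purely for the example.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    action: str           # "switch_now" or "wait_for_next_independent_frame"
    stall_expected: bool  # True when neither option fits inside the buffer


def choose_switch_action(buffer_s: float,
                         load_from_last_independent_s: float,
                         wait_for_next_independent_s: float,
                         margin_s: float = 0.25) -> Decision:
    """Pick a down-switch strategy from the two time estimates (hypothetical helper).

    buffer_s: media currently buffered ahead of the playhead, in seconds.
    load_from_last_independent_s: estimated time to buffer the new rendition
        starting from the last independent frame.
    wait_for_next_independent_s: estimated time until the next independent
        frame appears.
    margin_s: assumed safety margin kept in reserve.
    """
    budget_s = buffer_s - margin_s  # time we can spend before playback stalls
    if load_from_last_independent_s <= budget_s:
        return Decision("switch_now", False)
    if wait_for_next_independent_s <= budget_s:
        return Decision("wait_for_next_independent_frame", False)
    # Neither option fits in the buffer: take the faster one and expect a stall.
    if load_from_last_independent_s <= wait_for_next_independent_s:
        return Decision("switch_now", True)
    return Decision("wait_for_next_independent_frame", True)
```

A real player would re-evaluate this on every part, since both estimates change as parts arrive.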
An accurate network quality metric is required. As LL-DASH has shown, getting an accurate measurement with chunked transfer encoding is difficult; ACTE was proposed as a measurement approach there. LL-HLS, however, does not “really” rely on chunked transfer encoding, because parts have to be published at line speed. When parts are addressed individually in a playlist, throughput can be measured in the same way as with normal HLS, although the amount of data it is measured on is smaller. When parts are delivered as byte ranges out of a larger asset, chunked transfer will be used. In that case it is crucial to use an adapted algorithm which measures the time between the first and the last bytes of the part. For LL-HLS this is still easier than for LL-DASH, where chunks are not required to be delivered at line speed and the size of chunks is unknown.
In LL-HLS there is another disadvantage: before loading data for a new rendition, the player has to load a new, up-to-date playlist for that rendition. You can use EXT-X-RENDITION-REPORT to work around this, which allows the player to get an up-to-date playlist quickly. On the other hand, LL-HLS has a huge advantage: the positions of independent frames and the size of parts are clearly visible, and parts are transferred at line speed. An ABR algorithm should be tuned for LL-HLS to make use of this. It still seems to be early days for the development of LL-HLS ABR algorithms. Even Apple’s algorithm seems to make rash decisions and drop to the lowest bitrate at the slightest sign of deteriorating network quality.
It is crucial to get this right, as overreaction has a big impact on viewer experience. At THEO we don’t believe in just reusing a standard algorithm, such as a legacy HLS algorithm or an LL-DASH algorithm. ABR for LL-HLS is too different and requires careful optimization in order to guarantee a smooth user experience. We tend to use deadlines to determine how long it takes before we can make a switch, and we also lean heavily on the buffer size, which dynamically increases or decreases when needed. We do everything we can to avoid a stall, starting from a viewer centric approach rather than a “pushing data from a server to a client” approach.
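To make the “first byte to last byte” measurement concrete, here is a minimal Python sketch. It assumes the player can timestamp the first and last bytes of each part; the class name and the smoothing factor are hypothetical, and the exponential smoothing is a generic technique chosen for the example, not ACTE and not any specific player’s estimator.

```python
from typing import Optional


class PartThroughputEstimator:
    """Smoothed throughput estimate built from per-part measurements (illustrative)."""

    def __init__(self, alpha: float = 0.5) -> None:
        self.alpha = alpha  # assumed weight given to the newest sample
        self.estimate_bps: Optional[float] = None

    def add_part(self, size_bytes: int, first_byte_s: float, last_byte_s: float) -> float:
        # Measure only the time the part was actually on the wire:
        # from its first byte to its last byte.
        elapsed_s = last_byte_s - first_byte_s
        if elapsed_s <= 0:
            raise ValueError("last byte must arrive after the first byte")
        sample_bps = size_bytes * 8 / elapsed_s
        if self.estimate_bps is None:
            self.estimate_bps = sample_bps
        else:
            # Exponentially weighted moving average over successive parts.
            self.estimate_bps = (self.alpha * sample_bps
                                 + (1 - self.alpha) * self.estimate_bps)
        return self.estimate_bps
```

Because parts are small, a single sample is noisy, which is why the sketch smooths over several parts rather than trusting the latest one.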
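The following Python sketch shows how a player might read EXT-X-RENDITION-REPORT tags out of the playlist it already has. The tag and its URI, LAST-MSN and LAST-PART attributes come from the HLS specification; the function name, the returned dictionary layout and the simple regular-expression parser are assumptions for illustration and do not handle every attribute-list edge case.

```python
import re
from typing import Dict, List, Optional

# Matches KEY=value pairs where the value is either quoted or runs to the next comma.
_ATTR = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')
_TAG = "#EXT-X-RENDITION-REPORT:"


def parse_rendition_reports(playlist: str) -> List[Dict[str, Optional[object]]]:
    """Return one entry per rendition report found in a media playlist."""
    reports = []
    for line in playlist.splitlines():
        line = line.strip()
        if not line.startswith(_TAG):
            continue
        attrs = {key: value.strip('"') for key, value in _ATTR.findall(line[len(_TAG):])}
        reports.append({
            "uri": attrs["URI"],
            "last_msn": int(attrs["LAST-MSN"]),
            # LAST-PART is treated as optional in this sketch.
            "last_part": int(attrs["LAST-PART"]) if "LAST-PART" in attrs else None,
        })
    return reports
```

With the last media sequence number and part of the target rendition known up front, the player can ask for the right playlist update immediately instead of discovering its position first.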
Increasing Accessibility Using Subtitles and Captions
When LL-DASH first became available, most implementations did not support subtitles. Even today, doing subtitles in LL-DASH is hard, and the same is true for LL-HLS. LL-HLS supports WebVTT, IMSC1 and CEA-608. CEA-608 data is embedded in the video data, which makes timely delivery easy, whereas WebVTT and IMSC1 are delivered in parallel with the video. This means they can be out of sync with the main content.
Knowing when a subtitle cue is to be shown isn’t hard, but it is often unknown when a cue has to be hidden. For example, what if someone suddenly says something quickly? This leads to shorter cues. The solution could be to use short subtitle cues and repeat them until they have to be hidden. However, when video data is available but subtitle data is not, should the player stall? Likely not, but if the subtitle data becomes available shortly after, subtitles will disappear and reappear, giving a blinking effect. Within the THEOplayer UVP solution, we are working with our partners to resolve these problems. The most promising approach is to deliver subtitles in parts of the same size as the video/audio parts, which allows for the timely retrieval of data. Players can keep a cue in view for half a part duration after its end if that end coincides with the end of the part. If a new cue appears in the meantime, it replaces the cue being held.
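The hold rule described above can be sketched in a few lines of Python. This is only one reading of the rule: the function name, the parameters and the 1 millisecond tolerance used to decide that a cue “coincides” with the end of a part are assumptions.

```python
from typing import Optional


def cue_hide_time(cue_end_s: float,
                  part_end_s: float,
                  part_duration_s: float,
                  next_cue_start_s: Optional[float] = None,
                  tolerance_s: float = 0.001) -> float:
    """Return the media time at which a cue should disappear (hypothetical helper)."""
    hide_at_s = cue_end_s
    # A cue ending exactly on a part boundary may be continued by the next
    # part, so keep it on screen for half a part duration longer.
    if abs(cue_end_s - part_end_s) <= tolerance_s:
        hide_at_s = cue_end_s + part_duration_s / 2
    # A new cue arriving during the hold period replaces the held cue.
    if next_cue_start_s is not None and cue_end_s <= next_cue_start_s < hide_at_s:
        hide_at_s = next_cue_start_s
    return hide_at_s
```

Holding the cue briefly trades a small risk of showing text slightly too long against the blinking effect of hiding and re-showing it.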
Protecting Your Content with Studio Approved DRM
DRM in LL-HLS works exactly the same as DRM in HLS. In HLS the EXT-X-KEY tag allows you to mark encrypted segments. You can use this in combination with Google Widevine, Microsoft PlayReady and, of course most commonly, Apple FairPlay DRM, which allows you to target Apple devices.
In Low Latency HLS, when a segment or part is encrypted, a license is needed before it can be played. Once the player notices the encryption, it starts requesting the license, the certificate, and any other information needed to start the stream. The license needs to be negotiated with a license server. This is where the problem occurs: because buffers are smaller with low latency, the license negotiation needs to happen quickly. The player only has the buffer time to complete the request, which is the amount of media between the current playback position and the start of the encrypted segment. The license must be obtained before playback of that segment can start. Requesting a license or certificate introduces a delay, which can be very small, but which has to be taken into account nonetheless. This delay is important for two reasons: startup time and key rotation. An increased startup time often means an increased latency at startup. However, it’s key rotation which has the biggest impact on viewer experience.
When using DRM, it is often advised to occasionally rotate your keys. This means you change the decryption key at a certain point in your stream. As a result, a new license containing the new key can be required. Rotating DRM keys in a low latency environment can complicate your solution. When a new license has to be requested, it is important that your download buffer is large enough to bridge the time between the initial license request and the response. If the player fails to receive the license, extract the key and decrypt the content before the buffer runs out, a playback stall will occur. As retrieving a license and starting up decryption with a new key can often take a while (especially on “older” connected devices such as Smart TVs), it is crucial that you can issue this request in time to avoid a stall. For this reason, it is important to flag key rotations well in advance when working with low latency.
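The timing constraint can be written down as simple arithmetic. The Python sketch below assumes the player has rough estimates for the license round trip and for the time needed to set up decryption; both function names and the 0.25 second margin are hypothetical and only meant to illustrate why key rotations need to be announced early.

```python
def required_lead_time_s(license_round_trip_s: float,
                         decrypt_setup_s: float,
                         margin_s: float = 0.25) -> float:
    """Minimum notice the player needs before media using a new key starts."""
    return license_round_trip_s + decrypt_setup_s + margin_s


def rotation_causes_stall(time_until_new_key_s: float,
                          license_round_trip_s: float,
                          decrypt_setup_s: float,
                          margin_s: float = 0.25) -> bool:
    """True when the new key cannot be ready before playback reaches it.

    time_until_new_key_s: media time between the playhead and the first
        segment or part encrypted with the new key, at the moment the player
        learns about the rotation.
    """
    needed_s = required_lead_time_s(license_round_trip_s, decrypt_setup_s, margin_s)
    return needed_s > time_until_new_key_s
```

For example, with a 0.5 second license round trip and 0.25 seconds of decryption setup, a player holding only 0.75 seconds of media before the key change would be expected to stall, while one that learned about the rotation 2 seconds ahead would not.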
Monetizing Your Content Using Server Side Ad Insertion (SSAI)
Server Side Ad Insertion (SSAI) with LL-HLS is quite similar to SSAI in HLS. In traditional HLS everything is sent out in segments. In low latency HLS, however, the segments are divided into parts so the client can access parts before the entire segment is available. If the encoding of the ad is different from that of your main content, you will need an EXT-X-DISCONTINUITY tag, as mandated by the HLS specification, to indicate this difference. Similarly, when DRM configurations change, EXT-X-KEY tags can be added when starting and ending an ad to mark changes in configuration. This is all identical to the approach in traditional HLS.
The ads that get inserted into content can also be pre-recorded ads. These are stored as a large file rather than a file per segment, and they can be reused for different services. An LL-HLS playlist can still use this by specifying the segments and parts as byte ranges in this larger file. For example, the first segment could lie between byte 0 and 700 and the second segment between 701 and 1500.
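To show how the example ranges map onto a playlist, the short Python sketch below converts a first and last byte position into an EXT-X-BYTERANGE tag, which expresses a range as length@offset. It assumes the byte positions in the example above are inclusive; the function name is hypothetical.

```python
def byterange_tag(first_byte: int, last_byte: int) -> str:
    """Build an EXT-X-BYTERANGE tag from an inclusive byte range."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    length = last_byte - first_byte + 1  # inclusive range, so add one
    return f"#EXT-X-BYTERANGE:{length}@{first_byte}"


# The two segments from the example above:
first_segment = byterange_tag(0, 700)     # "#EXT-X-BYTERANGE:701@0"
second_segment = byterange_tag(701, 1500) # "#EXT-X-BYTERANGE:800@701"
```

Each tag is followed in the playlist by the URI of the same large ad file, so one stored asset can serve every segment and part.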
When configuring your SSAI setup, it is important to keep as much of the configuration as homogeneous as possible. For example, ads must have the same part size as the main content in order to ensure a player does not need to suddenly modify its buffer. It’s possible that if the parts are suddenly larger, the need to download this larger part will cause the buffer to become smaller for a brief period. This is in a sense similar to segments being the same size in traditional HLS, but it’s important to keep in mind in order to guarantee a smooth playback experience.
Combining DRM with SSAI is extra tricky. With DRM, the player can only play back the content after the license is received. With ads, you might need to do a license request when entering or (more likely) exiting an ad. Without SSAI the request for a license is done once at the beginning of playback, but with SSAI (or key rotation for that matter), it suddenly needs to be done during playback. As mentioned previously, the buffer is small in low latency cases, so the requests need to happen quickly to avoid the player stalling. As such it is important to get information on key rotations or new license needs to the player as quickly as possible, and to prepare your license servers to deliver a large number of licenses in a short timespan when an ad break ends.
In an earlier version of LL-HLS, video content and playlists needed to be on the same server. Thanks to the addition of the EXT-X-PRELOAD-HINT tag, this is no longer a limitation, and playlist manipulation services (which alter the playlist to contain the ad segments) can live outside of the content CDN.
In general, SSAI with low latency is not easy to deploy for LL-DASH either. For LL-HLS, we expect it will still take a while before all vendors are ready to deploy.
As the tools are not fully productized and matured yet, it is extremely important to test everything properly moving forward. We are expecting further updates on LL-HLS around Apple’s GA release, when it launches in their products with iOS 14 (now expected October 2020). THEO currently has a beta player available that supports the most recent LL-HLS update. If you’re interested, don’t hesitate to contact our LL-HLS experts.
If you would like to learn more about the specification, you can watch our on demand webinar with our CTO and Founder Pieter-Jan Speelmans. We also have an upcoming LL-HLS Roundtable Webinar with our partners Wowza and Fastly.
Interested in learning more about implementing a Low Latency HLS solution?
On-Demand Webinar: Is the Industry Ready for LL-HLS?
A Roundtable discussion with THEO's CTO Pieter-Jan Speelmans, Chris Buckley, Senior Sales Engineer at Fastly and Jamie Sherry, the Senior Product Manager at Wowza.
On-Demand Webinar: Apple’s LL-HLS is Finally Here
Hosted by our CTO Pieter-Jan Speelmans