THEO Blog

In-stream thumbnail support for (DVR) DASH streams

A thumbnail is a reduced-size image of a specific frame that helps a viewer seek to a specific event or scene in a video. Typically, thumbnails are added to a stream using a separate (WebVTT) file that references pre-rendered images. While this approach works fine for static VOD content, it is inconvenient for DVR/live content: a single, separate file is hard to keep in sync with the contents of a dynamic stream.

DASH-IF IOP 4.2 section 6.2.6 defines the notion of image-based tracks in DASH. This allows thumbnails to be specified in the stream itself, as an additional track next to audio, subtitles and video. This approach makes it possible to keep JPEG-based thumbnails in sync with a dynamic stream and to update them along with it. But how does it work? How do you generate these image-based tracks? And why do you need thumbnails anyway?


Why are thumbnails important?

End User Impact: User Experience

The most obvious and straightforward advantage of thumbnails is that they allow your viewers to seek back or forward in the video. Take Facebook Live for instance. When you discover the video in your Facebook feed, it might have already been playing for several minutes. After a few seconds of watching, people might want to know what happened at the very beginning of the live stream. As Facebook live feeds often come online several minutes before the actual video starts, seeking to the exact start of the video using thumbnails simplifies this action significantly.

Or, imagine you’re tuning into a soccer game where the scoreboard shows ‘2-3’. You might want to quickly catch up on the goals you've missed without missing too much of the game's ending.

Business Impact: Bandwidth reduction

Thumbnails have a second advantage: they can reduce your bandwidth usage. As your viewers more easily find the content they are looking for by seeking through your video, less content needs to be buffered. Although the saving is limited to a few percent of bandwidth, with growing consumption every percent saved results in significant cost savings.


Generating thumbnails

The easiest way to generate thumbnails is to create them from the key frames of your 2-3 second DASH segments. Advanced packagers today support this feature and can add a thumbnail stream to your manifest files. Have a look at the example below, which shows a DASH manifest. Next to the audio and video adaptation sets, a third, image-based adaptation set is present in the file. In the example, an image is generated for every 2-second segment. Using the $RepresentationID$/$Number$.jpg template, a video player with in-stream thumbnail support knows where to find each image and can include it in the player.

...

<Period id="p0" start="PT0S">
  <AdaptationSet contentType="audio" lang="eng" mimeType="audio/mp4" segmentAlignment="true" startWithSAP="1">
    <Role schemeIdUri="urn:mpeg:dash:role:2011" value="main" />
    <SegmentTemplate duration="2" initialization="$RepresentationID$/init.mp4" media="$RepresentationID$/$Number$.m4s" startNumber="0" />
    <Representation audioSamplingRate="48000" bandwidth="48000" codecs="mp4a.40.2" id="A48">
      <AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2" />
    </Representation>
  </AdaptationSet>
  <AdaptationSet contentType="video" maxFrameRate="60/2" maxHeight="360" maxWidth="640" mimeType="video/mp4" minHeight="360" minWidth="640" par="16:9" segmentAlignment="true" startWithSAP="1">
    <Role schemeIdUri="urn:mpeg:dash:role:2011" value="main" />
    <SegmentTemplate duration="2" initialization="$RepresentationID$/init.mp4" media="$RepresentationID$/$Number$.m4s" startNumber="0" />
    <Representation bandwidth="300000" codecs="avc1.64001e" frameRate="60/2" height="360" id="V300" sar="1:1" width="640" />
  </AdaptationSet>
  <AdaptationSet contentType="image" mimeType="image/jpeg">
    <SegmentTemplate duration="2" media="$RepresentationID$/$Number$.jpg" startNumber="0" />
    <Representation bandwidth="10000" height="90" id="thumbs" width="160">
      <EssentialProperty schemeIdUri="http://dashif.org/guidelines/thumbnail_tile" value="1x1" />
    </Representation>
  </AdaptationSet>
</Period>

...
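To make the template lookup concrete, here is a minimal sketch of how a player could turn a playback position into a thumbnail URL for the image adaptation set above. The `ThumbnailTemplate` interface and the `thumbnailUrlAt` function are hypothetical names made up for this illustration, not part of any player API. The sketch assumes a timescale of 1 (the MPD default when the attribute is absent) and only handles the plain `$RepresentationID$` and `$Number$` identifiers.

```typescript
// Hypothetical description of an image-based SegmentTemplate. The field names
// mirror the MPD attributes, but the interface itself is an assumption.
interface ThumbnailTemplate {
  media: string;            // e.g. "$RepresentationID$/$Number$.jpg"
  representationId: string; // the Representation's id attribute
  duration: number;         // segment duration in timescale units
  timescale: number;        // 1 when the attribute is absent from the MPD
  startNumber: number;
}

// Resolve the image URL (relative to the manifest) for a playback time given
// in seconds from the start of the Period.
function thumbnailUrlAt(template: ThumbnailTemplate, timeSeconds: number): string {
  const segmentSeconds = template.duration / template.timescale;
  const segmentIndex = Math.floor(Math.max(0, timeSeconds) / segmentSeconds);
  const segmentNumber = template.startNumber + segmentIndex;
  // Width-formatted identifiers such as $Number%05d$ are out of scope here.
  return template.media
    .split("$RepresentationID$").join(template.representationId)
    .split("$Number$").join(String(segmentNumber));
}

// Values taken from the image adaptation set in the manifest above.
const thumbs: ThumbnailTemplate = {
  media: "$RepresentationID$/$Number$.jpg",
  representationId: "thumbs",
  duration: 2,
  timescale: 1,
  startNumber: 0,
};

console.log(thumbnailUrlAt(thumbs, 7)); // "thumbs/3.jpg"
```

Seven seconds into the period falls in the fourth 2-second segment, which carries number 3 because numbering starts at 0.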


Single Thumbnail Approach vs Tiled Images Approach

The above approach generates one thumbnail for every 2 seconds. But what if we want more thumbnails? This is where the <EssentialProperty> element helps. A referenced image does not have to contain just one thumbnail, so the value of the <EssentialProperty> element describes the grid of thumbnails inside each image. The value “1x1” states that each image contains exactly one thumbnail. This is called the Single Thumbnail Approach.

<AdaptationSet id="thumbnails" mimeType="image/jpeg">
  <EssentialProperty schemeIdUri="http://dashif.org/guidelines/thumbnail_tile" value="1x1" />
  <SegmentTemplate media="$RepresentationID$/thumb$Number$.jpg" timescale="1" duration="5" startNumber="1"/>
  <Representation id="thumbnails_320x180" width="320" height="180" frameRate="1/5">
  </Representation>
</AdaptationSet>


However, the value could just as well be “10x1” or “10x10”. The manifest below describes a “10x1” matrix: each image holds ten thumbnails side by side in a single row. This is called the Tiled Images Approach.

<AdaptationSet id="3" mimeType="image/jpeg" contentType="image">
  <SegmentTemplate media="$RepresentationID$/tile_$Number$.jpg" duration="100" startNumber="1"/>
  <Representation bandwidth="12288" id="thumbnails_320x180" width="3200" height="180">
    <EssentialProperty schemeIdUri="http://dashif.org/guidelines/thumbnail_tile" value="10x1"/>
  </Representation>
</AdaptationSet>


When this is correctly described in the manifest file, the video player knows that each image holds a matrix of X by Y thumbnails covering a period of Z seconds. As a content provider, you can experiment with these factors. For instance, combining all images into a single matrix might be a good approach for VOD because of its limited bandwidth requirements, but it will not work for live streaming, since the matrices have to be generated on the fly.
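As an illustration of the X, Y and Z relationship, the sketch below works out which tiled image covers a given playback time and which region of that image to crop. `TiledThumbnails`, `TileLocation` and `locateTile` are hypothetical names chosen for this example. It assumes a timescale of 1, a value of the form "columns x rows", and tiles ordered in time from left to right and then top to bottom; check your packager's output before relying on that ordering.

```typescript
// Hypothetical description of a tiled thumbnail Representation.
interface TiledThumbnails {
  tileLayout: string;  // EssentialProperty value, e.g. "10x1" (columns x rows)
  imageWidth: number;  // width of the whole tiled image in pixels
  imageHeight: number; // height of the whole tiled image in pixels
  duration: number;    // seconds covered by one tiled image (timescale 1)
  startNumber: number;
}

interface TileLocation {
  segmentNumber: number; // value to substitute for $Number$
  x: number;
  y: number;
  width: number;
  height: number;
}

function locateTile(tiles: TiledThumbnails, timeSeconds: number): TileLocation {
  const parts = tiles.tileLayout.split("x");
  const columns = Number(parts[0]);
  const rows = Number(parts[1]);
  const tileCount = columns * rows;
  const tileSeconds = tiles.duration / tileCount; // time span of one thumbnail

  const t = Math.max(0, timeSeconds);
  const segmentIndex = Math.floor(t / tiles.duration);
  const offset = t - segmentIndex * tiles.duration;
  // Clamp to the last tile to guard against rounding at segment boundaries.
  const tileIndex = Math.min(tileCount - 1, Math.floor(offset / tileSeconds));

  const tileWidth = tiles.imageWidth / columns;
  const tileHeight = tiles.imageHeight / rows;
  return {
    segmentNumber: tiles.startNumber + segmentIndex,
    // Assumed ordering: left to right, then top to bottom.
    x: (tileIndex % columns) * tileWidth,
    y: Math.floor(tileIndex / columns) * tileHeight,
    width: tileWidth,
    height: tileHeight,
  };
}

// Values taken from the tiled manifest above.
const tiled: TiledThumbnails = {
  tileLayout: "10x1",
  imageWidth: 3200,
  imageHeight: 180,
  duration: 100,
  startNumber: 1,
};

console.log(JSON.stringify(locateTile(tiled, 137)));
// {"segmentNumber":2,"x":960,"y":0,"width":320,"height":180}
```

With ten tiles per 100-second image, every thumbnail covers 10 seconds, so 137 seconds maps to the fourth tile of the second image.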


Thumbnails example


Fig. 1 - Example of thumbnails


THEOplayer’s Thumbnail Support

THEOplayer supports both the Single Thumbnail and the Tiled Images Approach. In-stream thumbnail support can easily be activated if your packager is configured to comply with DASH-IF IOP 4.2 section 6.2.6.

The player automatically picks up the image-based track and makes it available in its list of text tracks. If you are using the THEOplayer UI, you can show the thumbnails by setting the track's mode to ‘showing’. The API also exposes the content of each available image as a list of timed cues. This allows you to use the thumbnails for other use cases, like updating your EPG icon or creating a custom thumbnail display.
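The flow described above can be sketched roughly as follows. The `ThumbnailCue` and `ThumbnailTrack` interfaces are simplified, hypothetical stand-ins rather than THEOplayer's actual type definitions, and identifying the thumbnail track by the label "thumbnails" is an assumption made for this example. In a real integration you would read the track list from the player and consult the TextTrack API documentation for the exact property names.

```typescript
// Hypothetical, simplified shapes; not THEOplayer's real type definitions.
interface ThumbnailCue {
  startTime: number; // seconds
  endTime: number;   // seconds
  content: string;   // assumed to describe the thumbnail image for this span
}

interface ThumbnailTrack {
  label: string;
  mode: string;      // the text above names 'showing' as the visible mode
  cues: ThumbnailCue[];
}

// Find the thumbnail track by label (an assumption) and switch it on.
function showThumbnails(tracks: ThumbnailTrack[], trackLabel: string): ThumbnailTrack | undefined {
  for (const candidate of tracks) {
    if (candidate.label === trackLabel) {
      candidate.mode = "showing";
      return candidate;
    }
  }
  return undefined;
}

// Return the cue covering a playback time, e.g. for a custom thumbnail display.
function cueAt(track: ThumbnailTrack, timeSeconds: number): ThumbnailCue | undefined {
  for (const cue of track.cues) {
    if (cue.startTime <= timeSeconds && timeSeconds < cue.endTime) {
      return cue;
    }
  }
  return undefined;
}

// Mock data standing in for the player's text track list.
const demoTracks: ThumbnailTrack[] = [
  { label: "English", mode: "disabled", cues: [] },
  {
    label: "thumbnails",
    mode: "hidden",
    cues: [
      { startTime: 0, endTime: 2, content: "thumbs/0.jpg" },
      { startTime: 2, endTime: 4, content: "thumbs/1.jpg" },
    ],
  },
];

const shown = showThumbnails(demoTracks, "thumbnails");
if (shown) {
  const activeCue = cueAt(shown, 3);
  console.log(shown.mode, activeCue ? activeCue.content : "none"); // showing thumbs/1.jpg
}
```

Keeping the cue lookup separate from the mode switch mirrors the two use cases in the text: the built-in UI only needs the mode, while a custom display or an EPG icon only needs the cue.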

If you would like to activate thumbnails within your streaming infrastructure, learn more about the THEOplayer TextTrack API here. Feel free to reach out to our video streaming experts as well. They will be happy to help you with all your video streaming challenges.
