Low Latency Check-Up: LL-HLS, LL-DASH and HESP

Watch the recording and explore the WHAT, the WHY, and the HOW of HTTP-based Low Latency Streaming Protocols

  • Explore an in-depth discussion on low latency streaming protocols with experts Johan and Pieter-Jan.
  • Delve into critical parameters such as GOP-size, chunk-size, and segment size for protocols like low latency DASH, low latency HLS, and HESP.
  • Uncover insights into implementation strategies, testing methodologies, and performance metrics of various low latency streaming solutions.
  • Compare and contrast the benefits and trade-offs of different protocols to assist in selecting the most suitable approach for your use case.

Webinar transcript

Dive into an insightful conversation delving into the realm of low latency streaming protocols with experts Johan and Pieter-Jan. The discussion traverses through the intricacies of various protocols such as low latency DASH, low latency HLS, and HESP, shedding light on critical parameters like GOP-size, chunk-size, and segment size. 

Pieter-Jan: Hello, we're almost at the zero mark. So, I propose that we very slowly get started. That's probably a good idea. So, hello and welcome. If you haven't heard me before, I was giving a quick intro mentioning that you could grab something to drink. If you don't have something yet, you're a little bit too late right now because we're going to get started. So, you have me here: Pieter-Jan, or PJ, depending on what's easiest for you to pronounce, CTO and one of the co-founders of THEO Technologies. We're known for THEOplayer, but we're of course known for a lot of other things, including a lot of innovations that we're bringing. For that reason, I also have Johan here with me. He's our VP of Innovation, so he's inventing all of the cool stuff. So yeah, we're happy that everybody is joining, and let's get started.

One of the things probably important to note here as well, a little bit of housekeeping. After the presentation, we'll have a very brief survey to ask how we've been doing. Hopefully we can improve our webinars that way. But also, if you have any questions, we'll be having Q&A after this webinar. So, you're good to go, you can already start submitting all of your questions - just use that Q&A form. So that we can pick them up immediately once we hit the Q&A part of things.

So, welcome everybody! As a lot of people have probably noticed, there were a lot of big sporting events recently, Euro 2020, the Olympics, and the streaming numbers were quite huge. But if you dig through all of the comments that were there in the news and on all the articles, you would see that there was a lot of disgruntlement around quality of experience, a lot of messages, people wondering - hey, why was my neighbour shouting before I was? It's like the typical thing that everybody says, of course, in a latency webinar, but also people trying to watch the Olympics, different channels, trying to follow it all at the same point in time, but switching was taking a lot of time, and they would miss parts of the games. A lot of interesting feedback there from the viewers.

And of course, a lot of things have been said and done about latency in this industry. We've been talking about it for what seems like ages, but there are still a lot of questions popping up - some very trivial. So, let's hope that today we can actually start explaining a little bit more. To make sure that everybody can get their viewers that leanback experience with low latency that everybody wants and that everybody has become so familiar with when watching broadcasts, but let's try and bring it also to OTT.

So yeah, let's get started: we'll talk about some of the common streaming protocols today, some of the pros and cons and we'll have a demo at the end. It won't be a live demo. We're fearing the demo gods a little bit, but let's start and dive in.

So, to get started actually, first things first - the big mandatory overview of all the latencies that everybody knows. Of course, everybody also knows that you don't want to be on the left side of this graph, that legacy latency 30 seconds and up. You don't really want to be there anymore if you're starting up a new service or if you're trying to improve your new service. What people really want to be today, they really want to be basically on par with the broadcast experience, potentially a little bit faster latency wise so that they can start adding in interactivity. And that's also what we'll be focusing on today. We won't be focusing on the real time part of things, not for video conferencing, yes, that's a use case as well. It's a very important use case, latency is very important there. But that's usually at a lot lower scale. What we want to talk about today are the high-scale cases, the cases not with 10 people watching, but the cases with hundreds, thousands, and more viewers at the same point in time.

And that's actually why those HTTP-based streaming protocols that we will talk about today were designed. HLS, DASH, HESP, they all run over HTTP. If you want to go lower, to real-time latency, there are protocols like WebRTC, SRT, RIST, all those kinds of things, but they're a little bit less suited for distribution at scale. So that's why we will not talk about those today, but focus on the HTTP-based protocols, which usually have a lot lower cost if you try to scale them.

Understanding Latency in Broadcast and Streaming Chains

Pieter-Jan: So where does latency come from? Probably more or less everybody in here knows as well: you have latency being added throughout the entire pipeline of the broadcast chain or of the streaming chain. It's usually related to the minimal units that you can process at every point. In the encoder, that's usually the set of frames that you would allow your encoder to refer to when it's trying to get the highest quality out of your bits. But in your distribution part, it's something else - usually, it is the minimal transfer unit, and in HTTP-based protocols that used to be your segment. These days, it's a little bit different: it's a chunk, a part, a packet, whatever the name, depending on the protocol. And of course, a very big source of latency as well is your buffer. The buffer on the player side is a very important thing. Why? Because you don't want stalls to happen, you don't want interruptions to happen if there's a network hiccup. And of course, there are all kinds of other reasons why you want that buffer to be there, and other limitations, for example decoders only starting playback at an independently decodable sample.

And all of these parts, this entire chain has a huge impact. I mean, for latency, everything has an impact. It usually feels very easy to make low latency happen: you just start reducing all of these bits, smaller packets, smaller segments, smaller buffers. But once you start digging into that, it's not really that easy. And it's also not just about latency. It's also about a lot of other things, because once you start reducing these different units and these different chunks in this chain, you will have a lot of impact on other aspects. One of those things is, for example, the startup time. Startup time is something which has a lot of names, some people call it the channel change time, some people call it the zapping time, but it is very important. It's very important for a simple reason - because it's one of the biggest reasons of viewer churn. People don't like to wait. They don't like to sit around, watch loading spinners all day, they want content to start. If it doesn't start fast, then they will basically go somewhere else, or they will simply go away. Not something that you want.

And those big startup times, they actually have a big impact on latency as well. Because when you have a decoder, it has to start, as I mentioned earlier, on an independent frame. And when you want to start up your live feed, you basically have two options: you either start at the last known independent frame or you wait for the next one. If you want to wait for the next one so that you can get a lower latency when you start playing, you might need to wait for a very long time because you need that independent frame. And to get that independent frame, the problem there is it's linked to your GOP size. We'll be talking about GOP size a lot more in this webinar. It's a very important parameter when you're looking at low latency.
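As a rough illustration of the two startup options described above, the following Python sketch compares them for a given GOP duration. The function name and its parameters are hypothetical, and the model assumes a fixed GOP duration and ignores download time.

```python
def startup_options(gop_s: float, time_since_last_idr_s: float) -> dict:
    """Compare the two startup strategies for a live stream.

    gop_s: GOP duration in seconds (interval between independent frames).
    time_since_last_idr_s: how long ago the most recent IDR was produced.
    """
    if not 0 <= time_since_last_idr_s < gop_s:
        raise ValueError("time since last IDR must lie within one GOP")
    return {
        # Option 1: start at the last known IDR. Playback can begin right
        # away, but it begins this far behind the live edge.
        "start_at_last_idr_extra_latency_s": time_since_last_idr_s,
        # Option 2: wait for the next IDR. Playback begins at the live edge,
        # but the viewer waits this long before anything is shown.
        "wait_for_next_idr_delay_s": gop_s - time_since_last_idr_s,
    }


print(startup_options(gop_s=2.0, time_since_last_idr_s=0.5))
```

In both cases the worst case grows with the GOP size, which is why the GOP size keeps coming back as a key parameter.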

The other alternative that I was mentioning: you start fast, you just pick up the last known independent frame. It's a very easy way to get started, but it will have an impact on the latency that you will see when you start up. It might be too high; it might actually be too low as well. Not very ideal, not the thing that you really want. Where that startup time also is very important, or at least the ability to very quickly start with a new independent frame, is actually when you start looking at network capabilities and network resilience more specifically. Adaptive bitrate switching, it always sounds very simple: you basically sense whatever the bandwidth is, or the network capabilities are that you have at a certain point in time, and you try to adapt and switch qualities up or down whenever you notice that it's changing. But when you try to do these switches, you also need to start from independent frames. And that is a pretty tricky thing actually. It's not that difficult if you want to switch up in quality because your network has improved. At that point in time, you can basically do the same as I mentioned earlier: you can wait until the next independent frame comes by, you download it and you can just surf your way up. The other alternative is that you re-download everything from the last independent frame which was available on the new quality that you want to reach, but that means that you need to download a lot of extra bytes. And all of that, as I mentioned, is okay when your network is improving. But if your network is going down and your bandwidth is going down, your buffer will be crucial. Your buffer will be emptying out, and in a lot of cases you won't be able to stay on the current quality and just keep downloading there until the next IDR frame. So, you really need to be able to switch down, and you need to switch down before your buffer runs out.
And that's why that buffer is so crucial because you don't want to stall. Once you get a stall, that's of course another very big reason why people actually churn. People don't want to look at loading spinners, but they also don't want to look at spinners because your player has stopped.
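To make the switch-down problem more concrete, here is a minimal sketch, under simplifying assumptions, of whether a player can stay on its current quality until the next IDR without stalling. The function and parameter names are hypothetical, and the model assumes constant bandwidth and constant bitrate, which real networks and encoders do not provide.

```python
def survives_until_next_idr(buffer_s: float,
                            time_to_next_idr_s: float,
                            bandwidth_bps: float,
                            current_bitrate_bps: float) -> bool:
    """Return True if the buffer does not run empty before the next IDR.

    While waiting, the player keeps downloading the current quality.
    Each wall-clock second it receives bandwidth/bitrate seconds of media
    and plays back one second of media.
    """
    fill_rate = bandwidth_bps / current_bitrate_bps
    remaining_s = buffer_s + time_to_next_idr_s * (fill_rate - 1.0)
    return remaining_s > 0


# Bandwidth dropped to half the current bitrate, next IDR is 2 s away.
print(survives_until_next_idr(1.0, 2.0, 2_000_000, 4_000_000))  # buffer too small
print(survives_until_next_idr(1.5, 2.0, 2_000_000, 4_000_000))  # enough buffer
```

Under this toy model, the buffer needed to ride out a bandwidth drop scales with the time until the next independent frame, which is the link between buffer size and GOP size.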

So, all of that, it's very important. And one of the easy answers, one of the quick fixes that you could be thinking about right now is - this all depends on the number of independent frames. So that's very easy - let's just start pushing more and more independent frames in our feed. That's a very good idea, except for one thing: the big problem that you will get is that your quality will actually go down. The number of independent frames that you have has a big impact on that perceptual quality or the bandwidth that you need to stream out. There are some overviews here on this slide. And I know this varies very heavily depending on the type of content that you're actually broadcasting - for some content, adding in more key frames, more independent frames, will not matter that much; but for most content, this is a very important parameter. In the past, people would have very big GOPs, very large intervals between independent frames. If you have a more dynamic piece of content, you can put it a lot shorter or you will put it a lot lower. But all in all, you cannot just go around adding in more and more independent frames. Bandwidth will increase, which basically means increased cost, or perceptual quality will go down significantly. And that's, of course, not something that you want either.

Navigating Low Latency Protocols: A Brief Overview of LL-DASH, LL-HLS, and HESP

Pieter-Jan: Before we can dig in a little bit deeper in this aspect, let's very briefly, I promise it will be briefly, do a quick recap on low latency DASH, low latency HLS, and HESP, so that we can talk you through what kind of parameters are valuable for these kind of protocols. And then we can also start really comparing with a demo.

So, let's start with low latency DASH: if you look at the low latency HTTP-based protocols, it is often seen as the oldest one of them, because the IOP, the Interoperability Guide, was published in 2018. It's not really the oldest HTTP-based low latency protocol, because low latency DASH is actually using the same technique that we presented in 2016 together with Periscope at Streaming Media West. Periscope had built the low latency HLS or a first version of LL-HLS as a protocol. And the approach that they took is, as I mentioned, identical to what low latency DASH does, or very similar. So, it chops up the segments into smaller chunks. For low latency DASH, those are smaller fragments than before, so smaller CMAF chunks, but you could do it with WebM or other container formats as well. And those are just streamed out as one big segment over chunked transfer encoding.

So, it's a very simple approach. It's not something shocking, but it has a very big impact. It basically means that you can download all of the content as it is being produced, and you can start using it immediately. So, it means that your latency can go down quite dramatically. As another aspect, with LL-HLS, we would announce segments early, with low latency DASH, you have your availability start time, you can anticipate when a new segment becomes available. You do need some clock synchronization there. But you can download everything very quickly, and all a player really has to do is identify where the player wants to start. So which segment to start with. Usually, it's just - pick the last segment, you start downloading it, data comes flowing in and then depending on how well you picked and how lucky you are, you will have a low latency, or you might need to adjust still a little bit for low latency. And there are different techniques here. You could also, as I mentioned earlier, wait until an independent frame comes by and a new segment gets loaded with LL-DASH and then pick it up from there. That's basically low latency DASH explained in one slide.
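As an illustration of how a player can anticipate which segment to request from the availability start time, here is a simplified sketch. It assumes fixed-duration, numbered segments and a synchronized clock, and it ignores period start, presentation time offsets and availability time offsets. The function name and the example values are hypothetical.

```python
import math
from datetime import datetime, timedelta, timezone


def live_edge_segment(availability_start: datetime,
                      now: datetime,
                      segment_duration_s: float,
                      start_number: int = 1) -> int:
    """Number of the segment that is being produced at time `now`."""
    elapsed_s = (now - availability_start).total_seconds()
    if elapsed_s < 0:
        raise ValueError("stream has not started yet")
    return start_number + math.floor(elapsed_s / segment_duration_s)


start = datetime(2021, 1, 1, tzinfo=timezone.utc)
now = start + timedelta(seconds=7.3)
print(live_edge_segment(start, now, segment_duration_s=2.0))
```

The need for clock synchronization shows up directly here: if `now` is off by a second, the player may request a segment that does not exist yet or start one segment too late.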

And as I misspoke with LL-HLS, let's go to that one on the next slide: low latency HLS does not use the chunked transfer approach. So, it's very important actually when you're talking about LL-HLS that you're not confusing it with the LHLS Periscope or community approach. You also shouldn't confuse it with the low latency HLS specification which Apple produced in 2019, because that actually used HTTP/2 push. They are also not doing that anymore. Still, a few people are confused about that.

But how LL-HLS is actually working in the spec as announced by Apple in 2020 is that they're splitting up the segments into parts, but those parts are being delivered at line speed, all individually, and they are being announced with a preload hint. With a blocking request, you make a request and, once the server is ready, it will send you the data from that part, and that's how everything is working now.
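For readers who want to see what a blocking request looks like on the wire, the sketch below builds a blocking playlist reload URL using the `_HLS_msn` and `_HLS_part` delivery directives from the HLS specification. The helper name and the example URL are hypothetical, and a real client would also need to check that the server advertises support for blocking reloads.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def blocking_reload_url(playlist_url: str, next_msn: int, next_part: int) -> str:
    """Ask the server to hold the response until the given part is listed.

    next_msn:  media sequence number of the segment the client waits for.
    next_part: index of the part inside that segment.
    """
    parts = urlsplit(playlist_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [("_HLS_msn", str(next_msn)), ("_HLS_part", str(next_part))]
    return urlunsplit(parts._replace(query=urlencode(query)))


print(blocking_reload_url("https://example.com/live/video.m3u8", 273, 2))
```

The server answers this request only once the playlist contains the requested part, so the player learns about new parts without polling.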

There's one very big difference between LL-DASH and LL-HLS, and that is the fact that all of the parts are being delivered at line speed over HTTP/2 (we'll get to the HTTP/2 part in a bit). But the fact that it's at line speed, that's also very important. One of the biggest struggles with low latency streaming is of course the bandwidth metrics. You have a lot less time to make sure that you're adapting to the right quality. And that's also where Apple saw the opportunity to transfer everything at line speed. So, there's no idle time in between data from a part being sent, which makes calculating bandwidth a little bit easier. There are other ways to do it for LL-DASH, and those metrics are pretty good these days. But this is a very big difference.
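The effect of idle time on bandwidth measurement can be shown with a small numeric sketch. The numbers are hypothetical: a one-second part encoded at 2 Mbit/s, delivered either in one burst at line speed or trickled out as the encoder produces it.

```python
def throughput_bps(num_bytes: int, transfer_time_s: float) -> float:
    """Naive throughput estimate: bits received divided by transfer time."""
    return num_bytes * 8 / transfer_time_s


PART_BYTES = 250_000  # hypothetical: 1 second of video at 2 Mbit/s

# Delivered at line speed: the transfer time reflects the network.
line_speed = throughput_bps(PART_BYTES, 0.125)

# Trickled over chunked transfer at production speed: the transfer time
# reflects the encoder, so the estimate collapses to the encoding bitrate.
trickled = throughput_bps(PART_BYTES, 1.0)

print(f"line speed: {line_speed / 1e6:.1f} Mbit/s")
print(f"trickled:   {trickled / 1e6:.1f} Mbit/s")
```

This is only the naive estimator; as mentioned, there are other ways to measure bandwidth for chunked delivery that correct for the idle time between chunks.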

The disadvantage of all of these parts... it's about one of the properties of HLS, where you need to list every segment in a playlist. You also need to list every part in a playlist, which means that if you try to make your parts very short, then you will have a lot of playlist updates that you need to do. That's also where the HTTP/2 part kicks in. With HTTP/2, you can send multiple requests over the same connection. And that's what HLS is using here, because you will be making a lot of requests for a lot of playlists. That's actually the biggest downside with LL-HLS. It will increase the number of requests that you need to do quite significantly, especially once your parts become small. There are ways around this, I won't go into too much detail, but if you use range requests and if you use byte ranges to annotate your parts, then you can cut that number of requests in half. But even then, that number of requests will remain quite high. A nice advantage is that you're listing all of the parts individually, and that means that you can also identify which parts contain independent frames, which then again makes it a little bit easier and gives you a little bit more insight to get started quicker or to pick the right segment or the right part to get started with.
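The request overhead can be estimated with a back-of-the-envelope sketch. It assumes one viewer and one rendition, one playlist request per part, and either one media request per part or, with byte-range addressed parts, one media request per segment. The function name is hypothetical and audio renditions are ignored.

```python
def llhls_requests_per_second(part_s: float,
                              segment_s: float,
                              byte_range_parts: bool = False) -> float:
    """Approximate steady-state request rate of one LL-HLS player."""
    playlist_requests = 1.0 / part_s          # one playlist update per part
    if byte_range_parts:
        media_requests = 1.0 / segment_s      # one request spans a whole segment
    else:
        media_requests = 1.0 / part_s         # one request per part
    return playlist_requests + media_requests


for part_s in (1.0, 0.5, 0.25):
    plain = llhls_requests_per_second(part_s, segment_s=4.0)
    ranged = llhls_requests_per_second(part_s, segment_s=4.0, byte_range_parts=True)
    print(f"part {part_s:>4} s: {plain:.2f} req/s, with byte ranges {ranged:.2f} req/s")
```

Under these assumptions the byte-range variant approaches half the request rate as parts get shorter, but the playlist requests remain.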

And then a very quick refresher still on HESP, the third protocol that we're talking about today: it is driven by the HESP Alliance and actually hit standardization, or at least the standard has been published earlier this year with the IETF. It is also a CMAF-based protocol, very similar to what HLS and DASH are doing, and actually also using chunked transfer encoding with fragmented MP4 in there.

But there is one very big difference with HESP - HESP actually makes use of two streams, one to get started quickly and to get an independent frame very quickly, and one stream for the bulk of the viewing, which is called the continuation stream. And the reason for that is actually because HESP doesn't just focus on the low latency aspect, but it also aims to increase the streaming quality, to allow you to have better network adaptivity, but also to allow you to do those fast channel changes, so that you can basically reduce your buffer even further. And well, with HESP, you can actually even hit the sub-second latency range. The nice thing here is that the GOP size is decoupled from the segment size, decoupled from the chunk size, and decoupled from latency in general. So that's really one of the nice things that you have here.

But all in all, if you look at these three protocols, they're actually more or less the same. You can use identical bit streams for the bulk of the viewing. There are some differences in the manifest, there are of course some differences in the initialization stream that you have for HESP. But actually, very similar all in all. So, Johan, maybe it's time that we dig into the rest.

Enhancing Viewer Experience: the Multifaceted Approach to Low Latency Streaming

Johan: Absolutely. Thank you, Pieter-Jan! So before maybe really diving into what are exactly chunk sizes, segment sizes, GOP sizes, as we learn in a couple of slides. A very quick recap, because it's not just about latency, it's about the complete viewer experience. And yes, low latency is a critical part of this complete viewer experience, because low latency means we can increase interactivity and we can deliver OTT-content actually completely in sync with standard broadcast services.

So, what's low latency about? Making sure that your video images make it as fast as possible from source to screen. But we also need fast channel changes, fast startup times, fast seeks, because as Pieter-Jan already mentioned, spinners or loading hourglasses, that's really killing the viewer experience, and that makes viewers abandon the service. So, as we will explain in a couple of slides, timely availability of these independent frames, or IDRs, as they are called, which is linked to the GOP size, is critical. And we will explain what would be a good GOP size for a low latency stream.

Networks are not always perfect. And also, in the case that you have a network that has some jitter, that has some bandwidth variation, that might suffer from some packet loss, we don't want the video to stall. We don't want to see stutters. And that's why we don't just need low latency streaming, but we need dynamic buffer management, we need adaptive bitrate to cope with these failures.

And of course, and maybe I should have started with that, video quality is a key element. Now, high video quality is feasible. And if you want to combine high video quality with a lot of IDRs, which we need for the fast channel change, that is still perfectly feasible. But then you need to work with a pretty high bitrate, and that comes with a large network cost. So, of course it's not acceptable to have these high network costs. And that's why, as we will see in a few slides, we need to find a good trade-off between bitrate on one hand and quality on the other hand.

So, in short, for a good viewer experience, we need low latency, short start times, ABR, video quality and low bitrate. All of that needs to work out together. And if we have that, that's probably what you will call the high efficiency streaming paradigm, that's optimizing for the complete viewer experience without compromises. Now in the next slides we will see how we can get there: what configuration should we apply to LL-DASH, to LL-HLS, to HESP to get the most optimal viewer experience and the lowest latency stream.

Deciphering Critical Parameters for Low Latency Streaming

Johan: Well, before elaborating on how to configure the LL-DASH, LL-HLS and HESP, we need to focus on what are critical parameters. And basically, three parameters are extremely important: that's the chunk size, the GOP-size, and the segment size.

Chunk size - that's important for obvious reasons because chunk size has a direct impact on your latency. Chunks are the units, the small packets that are generated by the encoder, managed by the packager, sent through the CDN, and taken in by the player and put in the buffer by the player. Ultimately, your latency will be a multiple of the chunk size. So clearly, the smaller the chunk size, the better. There are some caveats. We'll come to that later.

The GOP size, that's a second important parameter. Now, let me first briefly explain the concept of a GOP. If you have a video, some images are encoded independently. That means that these images can be decoded completely standalone. They contain all information to be decoded and to be displayed. Of course, that means that they are large. They're called IDRs. They are large because they contain a lot of information. And that's why most codecs also make use of other images that are encoded differently. Let's call them delta images. These images are not independently encoded. They only contain the differences. Well, it's a bit more subtle than that, but they contain the differences. So, you first need to have a first image, then you decode the differences, and then you apply these differences, and once you have applied the differences, you have the next image. What's great about these delta images is that they are much smaller: they could be five times or even 10 times smaller than these IDRs. So that's great for your bitrate and that's why codecs use them.

Now, what's the GOP-size? Well, after an IDR, you can have multiple of these delta images. Actually, you can put as many as you want. The GOP size defines how many of these delta images there are following the IDR. So, if you have a GOP-size of one second, you have one IDR in that second and all the other images are these delta images. If you have a GOP-size of two seconds, then you have one IDR for these two seconds and all the other images are the delta images that are, as said before, much smaller. So, the GOP-size clearly has two direct impacts. First of all, if your GOP-size is smaller, your bitrate will go up because you have more IDRs, and those are the larger images. Or, if you keep the bitrate constant, your quality will go down. The second important impact is that the GOP-size determines how fast you can start up and how fast you can react to network changes. Because clearly, if you have an IDR, let's say, every second, you can react much faster to a change than if you would only have an IDR every, let's say, ten seconds.
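The bitrate impact of the GOP-size can be illustrated with a toy model. It assumes every delta image has the same size and every IDR is a fixed multiple of that size (the transcript mentions five to ten times), which real encoders do not guarantee. The function name and the 25 fps frame rate are assumptions.

```python
def relative_bitrate(frames_per_gop: int, idr_to_delta_ratio: float) -> float:
    """Bitrate relative to a hypothetical stream made only of delta images.

    One IDR costs `idr_to_delta_ratio` delta images, the remaining
    frames_per_gop - 1 frames cost one delta image each.
    """
    if frames_per_gop < 1:
        raise ValueError("a GOP contains at least the IDR itself")
    return (idr_to_delta_ratio + (frames_per_gop - 1)) / frames_per_gop


FPS = 25  # assumed frame rate
for gop_s in (1, 2, 10):
    ratio = relative_bitrate(gop_s * FPS, idr_to_delta_ratio=10.0)
    print(f"GOP {gop_s:>2} s: {ratio:.3f}x")
```

In this model, with an IDR ten times the size of a delta image, a one-second GOP costs about 36% more than the all-delta baseline, a two-second GOP about 18%, and a ten-second GOP under 4%. Actual numbers depend heavily on the content and the encoder.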

The third important parameter is the segment size. For HESP and low latency HLS, it's mainly there for backward compatibility, but for low latency DASH, the segment is the basic unit that you can address, the basic unit that you can request from the server. So, for LL-DASH, the segment size directly influences the startup time and the time that it takes to respond to network changes.

Now these three parameters are not completely independent. They are constrained by one another. It doesn't make sense to have a chunk size that's larger than the GOP-size. Also, because a segment is the basic unit for LL-DASH, you need an IDR in a segment. So, it doesn't make sense to make the GOP-size larger than the segment size.
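These constraints can be written down as a small sanity check, sketched here with a hypothetical helper that takes all three sizes as durations in seconds.

```python
def check_sizes(chunk_s: float, gop_s: float, segment_s: float) -> list[str]:
    """Return a list of violated constraints (empty if the sizes are consistent)."""
    problems = []
    if chunk_s > gop_s:
        problems.append("chunk size is larger than the GOP size")
    if gop_s > segment_s:
        # A segment needs to contain an IDR, so a GOP should fit in it.
        problems.append("GOP size is larger than the segment size")
    return problems


print(check_sizes(chunk_s=0.2, gop_s=2.0, segment_s=2.0))
print(check_sizes(chunk_s=0.2, gop_s=4.0, segment_s=2.0))
```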

Now, okay, where does that bring us? We have our three crucial parameters. We'll now see how these parameters influence the latency. As Pieter-Jan already explained, we have latency in the complete chain: you have latency in the transcoder, you have latency in the network, just because it takes some time to transfer the bits, and you have latency in the buffer. And the buffer size, so the latency that's spent in a buffer, is directly impacted by the chunk size, by the GOP-size, by the segment size.

Now let's first start with the easy situation. That's a steady state situation, a situation where the network characteristics are not changing. Then life is easy in that case, because you need a buffer just to be sure that you have the time to download the next chunk, because you always want to continue playing. So, you want to be sure that you have a bit of safety margin to get the next chunk. And that can be pretty short, especially if your chunks are short. Now, all networks have jitter, some more than others. And we want to take that into account as well, of course. And also, the size of images and chunks may vary, so that means that the time to download one chunk can be different than the time to download another chunk. So, we want to have a larger safety margin, but all in all, I would say, if you have a good network, we can say that the latency can be as small as, let's say, three times the chunk size. Looks great. Let's go for pretty small chunks, three times the chunk size, perfect low latency.

But networks are not stable, networks are unfortunately not perfect. The network conditions change over time. So, the bandwidth capacity can change over time. There is heavy jitter, there is some packet loss, and to cope with those variations we need ABR, adaptive bitrate. We need an algorithm that selects a different quality. If your bandwidth drops too much, we will select a quality with a lower bitrate that meets the available capacity of the network. So, then we're in a different situation, because then we cannot just simply download the next chunk of that new quality. We need to make sure that chunk can be decoded, and that's when this notion of IDR, of independent frame, comes into the picture, because it means that the buffer should be large enough not just to download the next chunk, but to ensure smooth playback until we reach the next IDR. So worst case, that means that you need to have a buffer roughly the size of the GOP. And yes, there is also a bit of other overhead, like the initializer in DASH or like the playlists in HLS, but let's say roughly that the latency is bound by the GOP size. Now, to be fair, we could apply some advanced measures and maybe shave off a bit of the buffer so that we are not at a full GOP size and maybe a bit lower. But all in all, the main message is - we won't cut it if we just have a buffer, and hence a latency, of three chunks. We will come pretty close to the GOP size if we want to have low latency but especially guaranteed smooth playback.
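The two rules of thumb from this part of the talk, roughly three chunks in steady state and roughly one GOP when ABR switches must be survived, can be combined into a small sketch. The function name is hypothetical and the result is only a coarse estimate: it ignores encoder and network latency as well as the overhead of initializers and playlists.

```python
def buffer_target_s(chunk_s: float, gop_s: float, abr_enabled: bool = True) -> float:
    """Rough player buffer target in seconds.

    Steady state: about three chunks of safety margin.
    With ABR: the buffer should last until the next IDR, so worst case
    about one GOP.
    """
    steady_state = 3 * chunk_s
    if not abr_enabled:
        return steady_state
    return max(steady_state, gop_s)


print(buffer_target_s(chunk_s=0.25, gop_s=2.0, abr_enabled=False))
print(buffer_target_s(chunk_s=0.25, gop_s=2.0))
```

Once the GOP term dominates, shrinking the chunks further no longer lowers the buffer, which is the caveat on chunk size mentioned earlier.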

Optimizing Parameters for Low Latency Protocols: LL-DASH, LL-HLS, and HESP

Johan: So now let us see what that means for low latency DASH. Well, for low latency DASH, we can actually make the chunks as small as we want. We can start from the segment size and then we can go down, in the extreme even to a single image, and the steady state latency will go down. But as we discussed, at some point further reducing the chunk size does not really have further impact, because we want to be sure that we can support switches at IDRs, and also simply because there is the network latency. So that's why we will not go down to a single frame; we will have a somewhat larger chunk size and a larger buffer size.

The GOP size, as already explained, has two conflicting impacts. First, the larger the GOP size, the lower the bitrate. Secondly, unfortunately, the larger the GOP size, the longer it will take to fetch the next IDR frame, and we need that for an ABR switch or channel change. So, we'll have to find a compromise between those two conflicting requirements. Finally, segment size: as already said, the segment should at least be the GOP size, but for low latency streaming it doesn't make sense to make the segment larger than the GOP size, because for low latency DASH the segment is the smallest unit we can address. So, we take the segment equal to the GOP-size.

And that brings us to the following values that are optimized for low latency: a chunk size of 100 to 200 milliseconds, and a GOP and segment size of two seconds. That will lead to pretty good low latency performance, where we also take into account that we can do an ABR switch.

Now for low latency HLS: well, low latency HLS doesn't work with chunks, it works with parts, but actually it's the same - it's the whole segment cut into smaller pieces. And again, as with DASH, the smaller the parts, the lower the latency. Now, the difference with DASH is that, on top of the limits that we already had for low latency DASH, where after some point it doesn't make sense to go lower in chunk size and latency, there is another impact - that's the playlist. Because as Pieter-Jan already explained, every part is announced in the playlist. So, for every part, the playlist is updated and we have requests going back and forth. So, the smaller the part size, the more overhead we have. That means that a part size will be a bit larger than the chunk size that we could have for low latency DASH. GOP size - the reasoning is exactly the same as for low latency DASH, same trade-off to be made.

When it comes to segment size, low latency HLS has a nice feature, the so-called independent parts - that means that we are not limited to segment boundaries to start playback. We can start playback at any of these independent parts. And as a consequence, there is actually no need for the segment size to be limited to the GOP size; we can afford a larger segment size.
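To show how a player could find these start points, the sketch below scans a playlist for `EXT-X-PART` tags carrying the `INDEPENDENT=YES` attribute. The playlist fragment and its URIs are made up for illustration and deliberately incomplete, and the line-based parsing is a simplification rather than a full attribute-list parser.

```python
import re

# Hypothetical fragment: 1-second parts, an independent frame every 2 seconds.
SAMPLE_PLAYLIST = """\
#EXTM3U
#EXT-X-TARGETDURATION:6
#EXT-X-PART-INF:PART-TARGET=1.0
#EXT-X-PART:DURATION=1.0,URI="seg42.part0.mp4",INDEPENDENT=YES
#EXT-X-PART:DURATION=1.0,URI="seg42.part1.mp4"
#EXT-X-PART:DURATION=1.0,URI="seg42.part2.mp4",INDEPENDENT=YES
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="seg42.part3.mp4"
"""


def independent_parts(playlist: str) -> list[str]:
    """Return the URIs of all parts that start with an independent frame."""
    uris = []
    for line in playlist.splitlines():
        if line.startswith("#EXT-X-PART:") and "INDEPENDENT=YES" in line:
            match = re.search(r'URI="([^"]+)"', line)
            if match:
                uris.append(match.group(1))
    return uris


parts = independent_parts(SAMPLE_PLAYLIST)
print("possible start points:", parts)
print("closest to live:", parts[-1])
```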

So, if you combine all of this information, what would we say is a decent recommendation? A part size of one second, which by the way is also the recommendation of Apple. Honestly, going lower works, but you gradually start having a lot of overhead because of the requests and the playlist updates. Then a recommended GOP size of 2 seconds, same as for low latency DASH, and then segment size - we can go to 5-6 seconds, I'm going to say 6 seconds to be a multiple of the GOP-size.
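The recommended values for LL-DASH and LL-HLS can be put side by side in a small sketch. The class and field names are hypothetical, and the LL-DASH chunk size uses the upper end of the 100 to 200 millisecond range mentioned in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LowLatencyProfile:
    protocol: str
    unit_s: float      # chunk (LL-DASH) or part (LL-HLS) duration in seconds
    gop_s: float       # interval between IDRs in seconds
    segment_s: float   # segment duration in seconds

    def units_per_segment(self) -> int:
        return round(self.segment_s / self.unit_s)

    def gops_per_segment(self) -> int:
        return round(self.segment_s / self.gop_s)


LL_DASH = LowLatencyProfile("LL-DASH", unit_s=0.2, gop_s=2.0, segment_s=2.0)
LL_HLS = LowLatencyProfile("LL-HLS", unit_s=1.0, gop_s=2.0, segment_s=6.0)

for profile in (LL_DASH, LL_HLS):
    print(profile.protocol,
          profile.units_per_segment(), "chunks/parts per segment,",
          profile.gops_per_segment(), "GOP(s) per segment")
```

The LL-HLS segment holds several GOPs because playback can start at independent parts, whereas the LL-DASH segment is kept equal to one GOP.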

And now for HESP, what should be the decent parameters to get low latency streaming there? Well, the important thing to understand for HESP is that the streaming is relying on the so-called continuation stream. And as Pieter-Jan explained, the continuation stream, that's just a regular chunked CMAF-stream, just like low latency DASH. So, when it comes to chunk size we can do the same - we can go as low as we can go for LL-DASH. But there is good news: whereas for LL-DASH, we are constrained at some moment in time because we want to do ABR switches and so on, and the buffer should be large enough to go to the next IDR, we don't need that for HESP. So, for HESP, we really can go very, very low in chunk size, down to a single frame. And that's the major difference for HESP, because we have this initialization feed, and this initialization feed consists of all these IDRs. We use an IDR of that feed to start streaming. And because we have all these IDRs, we are no longer limited by the GOP-size. We can start streaming whenever we want, whether it is for a seek, whether it is to change channel, whether it's to change qualities for ABR. Thanks to the initialization stream, HESP allows us to start at any given moment, because we have these IDR images available from the initialization feed.

So that means that we can choose these extremely low chunk sizes, the buffer size can be kept low, but also, and that's a good thing - the GOP size of the continuation feed, and that's the one that's normally used, can hence be much larger than for low latency HLS or low latency DASH, because we don't need to wait for the IDR in the continuation feed. And that longer GOP-size means that we can go for lower bitrates for the same quality.

So basically, what HESP allows us to do is to decouple the latency from the GOP-size. Whereas with LL-HLS and LL-DASH, the GOP size somehow constrains the latency, because we want the stream to work in varying network conditions when the ABR kicks in. With HESP, we decouple those, and so, as a consequence, the latency for an HESP feed will mainly come from the network latency and the time that it takes to download an initialization packet.
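To make the decoupling concrete, here is a minimal sketch of the worst-case wait for the next usable IDR when a player joins or switches quality. It is only a simplified model: the function names are hypothetical, 25 FPS is used as an example frame rate, and network and download time are ignored.

```python
def gop_bound_wait_s(gop_seconds: float) -> float:
    """Worst-case wait for the next IDR when a player can only start
    decoding at GOP boundaries (the LL-DASH / LL-HLS style constraint)."""
    return gop_seconds


def per_frame_idr_wait_s(fps: float) -> float:
    """Worst-case wait when an IDR is available for every image position
    (the role of the HESP initialization feed): at most one frame."""
    return 1.0 / fps


if __name__ == "__main__":
    print(gop_bound_wait_s(2.0))       # 2-second GOP -> up to 2.0 s
    print(per_frame_idr_wait_s(25.0))  # 25 FPS -> up to 0.04 s
```

In this model, growing the GOP from two to six seconds triples the first number but leaves the second one unchanged, which is the point made above.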

That brings us to the following recommendations: chunk size - one frame, the lowest you can get. Recommended GOP size - of course, you can do 2 seconds, which would be the case if you really want to have a fully compatible feed, if you want the HESP feed to be reused for LL-DASH and LL-HLS as well. But otherwise, we can go to four, five, six seconds GOP size and benefit from the bitrate reduction. And the segment size - we can pick whatever segment size we want, larger than the GOP size: we can go for the six seconds that we have for LL-HLS, we can go to 20 seconds, we can go higher as well. There, I would say the size limitation does not come from latency, but comes from the fact that some CDNs like smaller files more than larger files, so there we are more constrained by the CDN than by latency metrics.

Comparative testing of Low Latency Protocols

Johan: So of course, we want to test that - we start from a video with a burnt-in timecode. We transcoded that video into three different feeds: an RTMP feed, a low latency DASH feed, and a low latency HLS feed. We take the RTMP feed and send it to a cloud-based HESP packaging environment. So, the RTMP feed is sent to the cloud, there is an HESP encoder, there is an HESP packager, and then that HESP feed is sent over a CDN.

And so, then we have the three feeds: an HESP feed, an LL-DASH feed, an LL-HLS feed. These three feeds are then displayed on a laptop. We also have the reference image that comes directly from the source video. And then just by comparing the two images, because remember there was this burnt-in timecode, we will have a very accurate way of telling what the end-to-end latency is, and we then know that for the three different protocols. And of course, because we have the three feeds, we can also measure the bitrate.
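As a rough illustration of that measurement, the sketch below subtracts the timecode visible in a player from the timecode visible in the reference image. The `HH:MM:SS:FF` timecode format, the 25 FPS frame rate, and the sample values are assumptions for illustration only, not the actual test data.

```python
def timecode_to_frames(timecode: str, fps: int = 25) -> int:
    """Convert an 'HH:MM:SS:FF' timecode into a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return (hours * 3600 + minutes * 60 + seconds) * fps + frames


def latency_seconds(reference_tc: str, player_tc: str, fps: int = 25) -> float:
    """Latency = how far the player image lags behind the reference image."""
    lag_frames = timecode_to_frames(reference_tc, fps) - timecode_to_frames(player_tc, fps)
    return lag_frames / fps


if __name__ == "__main__":
    # Hypothetical readings taken from one screen capture frame.
    print(latency_seconds("00:10:05:10", "00:10:05:00"))  # 10 frames at 25 FPS
```

Working in whole frames and dividing once at the end keeps the result free of accumulated rounding errors.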

So that's the setup for our comparison. What we have done now is - we did a screen capture of the test environment, and we are going to show that screen capture. So, the same source displayed three times, as HESP feed, as LL-DASH feed, as LL-HLS feed, and the latency is displayed for these three feeds. So, as you can see, with HESP we achieve the lowest latency, and then DASH and HLS are a couple of seconds behind. Now, that latency difference of a couple of seconds might not look too large, but if you really compare them side by side, and that would be the case if you are watching a sports game on protocol A, and your neighbours are watching the same sports game on protocol B, then these few seconds really make a difference, because on HESP you would see the winner first and then you would see the winner in DASH and just a split second later you would see it in HLS. And the few seconds really make a difference between your neighbours cheering and you asking why, who won the game.

Comparison of Low Latency Protocols: Key Parameters and Performance Metrics

Pieter-Jan: So, to wrap up, let's put all of these results in a table. So, comparing low latency DASH, low latency HLS and HESP. Again, just to recap on the three important parameters: GOP size, chunk size, segment size. So, for the GOP size: low latency DASH and low latency HLS, two seconds, as I said before; HESP, a larger GOP size, six seconds, which will lead, we'll see it in a minute, to a lower bitrate. Chunk size: low latency DASH, 200 milliseconds. LL-HLS, one-second parts; again, it could go lower, but then the overhead of the playlist update and all the requests that are associated becomes too high. And HESP, one frame as a chunk. Segment size - two seconds for DASH because the segment is the smallest unit that we can use to address the feed. For LL-HLS, let's say six seconds. For HESP, we put it to 24 seconds just to indicate that the segments can be large.

So, what does that give us then in terms of latency? Well, HESP, we managed to get a protocol latency, so protocol latency being the latency between the output of the transcoder and the display on the screen of roughly half a second. Low latency DASH, the test that we have done was around 2.5 seconds, low latency HLS, 3 seconds.

Now, of course, this is in a pretty good environment. If your network is worse, then these values might be a bit higher. Clearly, if you would stream, let's say, from Australia to Belgium, where we are, your RTT or just your network delay will of course impact these values.

Startup times: HESP had a startup time of 400 milliseconds. For low latency DASH and low latency HLS, we have a range of startup times because it depends on what you optimize for. If you really optimize for the fastest possible startup time, then we also have the same startup times there, often half a second, maybe 700 milliseconds, that order of magnitude. But then, at the moment of the startup, you are not at the buffer depth that you want. You're also not at the target latency that you want to have. If you want to start streaming low latency DASH or low latency HLS right away with the correct targets - buffer size and target latency - then you need to allow a bit more time before the playback can actually start. And that's why we indicate that range there.

Bandwidth - HESP, because of the six-second GOP, has the lowest bitrate. For the video that we've selected, DASH and HLS are only 5% more in bandwidth. That's because the content was pretty dynamic, as it's a sports game. With other content, the overhead can be much larger: DASH or HLS can require up to 10% or even more bandwidth than HESP.

Now, finally, the number of requests. For LL-DASH, we have 60 requests per minute. That's just the consequence of the segment size: we have two-second segments, so that means 30 segments per minute, and we need two requests per segment, one to get the segment and another one for the manifest update. We can find a solution to reduce the number of manifest fetches, but that would take us too far. For LL-HLS, we have a request per part, and we have parts of one second, so that's already 60 requests per minute, but we need a request per part to get the content and a request to update the playlist. So that brings us to 120 per minute. HESP ultimately just needs a request per segment, and with segment sizes of 24 seconds, that's about three requests per minute that we have.
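The arithmetic behind those request counts can be written out as a small sketch. The function name is hypothetical; the unit durations and requests per unit are the values quoted in the talk.

```python
def requests_per_minute(unit_seconds: float, requests_per_unit: int) -> float:
    """Requests per minute when every addressable unit (segment or part)
    of `unit_seconds` costs `requests_per_unit` HTTP requests."""
    return 60.0 / unit_seconds * requests_per_unit


if __name__ == "__main__":
    # LL-DASH: 2-second segments, one media request + one manifest update.
    print(requests_per_minute(2, 2))   # 60.0
    # LL-HLS: 1-second parts, one media request + one playlist update.
    print(requests_per_minute(1, 2))   # 120.0
    # HESP: 24-second segments, one request per segment.
    print(requests_per_minute(24, 1))  # 2.5, i.e. "about three"
```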

Pieter-Jan: Yeah, so I think that that's a very nice overview. I'm very happy to see that a lot of people are sticking with us because we're coming up close to the hour. Don't worry, we'll go over a little bit for the Q&A as well.

But yeah, as you can see from those results, it's more than just the latency as well that you want to optimize. It's really what we call the high-efficiency streaming, getting low latency, but also getting that low startup time, getting that ABR and that video quality and really making sure that you get that full experience. Reducing the latency, that's good if you want to avoid the desynchronization, but you really want to make sure that you don't get those spinners, that you don't get stalls. And that's the full superior viewer experience without any compromises and preferably at a cost that actually scales for your business. Those are the things that we really think are important, especially moving forward for most setups out there.

And that's actually us reaching the end of the talk itself. So, I hope we did manage to transfer some of the knowledge that we've built up. And well, there are definitely questions because I can see them coming in. So, let's get started with the Q&A session. And if we cannot tackle your question because you have to leave and you don't have time beyond the hour, just let us know. We'll try to pick it up over email as well and we'll get you that answer. So yeah, let's get started with the Q&A session is what I would propose.


Pieter-Jan: So, let's see what kind of questions we have and let's just go top to bottom, I think. So, first question:

“Does HESP support HEVC or AV1 codecs? And what kind of audio codecs does it support?”

In theory, HESP can actually support all of the codecs. Most of the testing that we're doing right now is happening with H.264. We've done some testing with HEVC already as well. Of course, the question will always be what kind of packagers do you have in your setup? There are multiple packagers for HESP at this point in time, but yeah, it's going to be the vendor game, of course, we need to keep expanding and keep working together with the HESP Alliance to broaden support there. Johan do you want to add something as well?

Johan: No, it's fully correct.

Pieter-Jan: Now let's go to the next one. And that looks like it's one for you:

“What's the impact of TCP acknowledgements and retransmissions and the influence on latency?”

Johan: That's a complicated question and complex to answer. The simple answer is the worse your network is, of course, the more it will influence latency. Because if you have packet loss, that will lead to retransmissions at the TCP level, and that will immediately impact the round-trip time or the time that it takes for the video image to go from the server, let's call it like that, to the displaying device. So that will manifest itself as a longer network delay and then, of course, directly impact latency. Also, because it's a TCP network, if there is a lot of packet loss, the TCP windowing will kick in, and your actual throughput will also go down. Actual throughput going down, of course, well, yeah, you might even have to go as far as switching to a lower quality, but it also means that we have a bit less headroom for the different chunks to make it from the server to the player buffer. Just making up a number: if you have a four megabit per second video feed and a four dot something kind of network in practice because of the TCP windowing, then you are really pretty close to the limit and probably the numbers that I gave are already over the limit. Whereas if you have an abundance of bandwidth, then you have more headroom and your chunks will move very fast, the buffer gets refreshed rapidly. And so, you can then live with a smaller buffer than what you would have in case of a bad network.
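A back-of-the-envelope sketch of that headroom argument follows. It is a deliberately simplified model with hypothetical function names: it treats a chunk as fully available on the server and ignores round trips, retransmissions and request overhead. The 4 Mbps feed and the "four dot something" network are the made-up numbers from the answer; the 200 ms chunk and the 20 Mbps network are added here for comparison.

```python
def bandwidth_headroom(throughput_mbps: float, video_bitrate_mbps: float) -> float:
    """Ratio of effective network throughput to video bitrate.
    Values close to 1.0 leave almost no margin for the player buffer."""
    return throughput_mbps / video_bitrate_mbps


def chunk_transfer_time_s(chunk_duration_s: float,
                          video_bitrate_mbps: float,
                          throughput_mbps: float) -> float:
    """Time to move one chunk of video over the network, in seconds."""
    chunk_megabits = chunk_duration_s * video_bitrate_mbps
    return chunk_megabits / throughput_mbps


if __name__ == "__main__":
    # Tight network: a 200 ms chunk takes almost 200 ms to arrive.
    print(bandwidth_headroom(4.2, 4.0), chunk_transfer_time_s(0.2, 4.0, 4.2))
    # Abundant bandwidth: the same chunk arrives in a fraction of its duration.
    print(bandwidth_headroom(20.0, 4.0), chunk_transfer_time_s(0.2, 4.0, 20.0))
```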

Pieter-Jan: Yeah, so indeed it all boils down to the buffer, I think. And of course, chunk size and all those kinds of things that match up with that as well.

So, let's look at the next question:

“In unstable network examples, like LTE radio networks, small frames will be faster. But what about the quality versus packet and frame loss? Will it compromise quality or packet loss?”

I think that's very similar to the question that we just answered. So again, it will be down to the buffer.

“Are the requirements for latency distribution (ex. CTE) supported by the most well-known CDNs?”

They are actually. It does start to vary once you start going into the compatibility mode, for example, between HLS, DASH and HESP. If you start using, for example, byte range requests on low latency HLS, byte range requests in combination with chunked transfer encoding aren't always supported. And the same can actually be said about HTTP/2. Most CDNs these days support it. You might bump into some limits and some extremes. Actually, when we started working with low latency years ago, my answer here would have been no: most well-known CDNs don't support everything that you need. But these days I think we can say pretty easily that almost all CDNs do support all of the requirements that you need here. So that should not be a problem.

And then:

“What is the reason not to use RTP streaming?” 

Johan: There are many, many reasons for that. But I would say the first reason is that we are focusing on OTT streaming to a wide range of devices. And OTT video is mostly consumed in web browsers, and RTP and web browsers don't go together well. So that's why we need to have a streaming protocol that can be displayed in web browsers. And that rules out RTP to some extent. RTP, on the other hand, is of course mainly used in IPTV networks, which are managed networks, and there RTP clearly has advantages. So, I don't think it's a matter of why not use RTP; RTP has very good use cases. We need to work with HTTP-based streaming protocols because we want to address, among others, browsers.

Pieter-Jan: Yeah, and it's not just browsers, of course. Keep in mind it's also webOS, Tizen devices, all those. They all run on that same stack which is limited like web browsers. And of course, there's also one other good reason that I can come up with. You have the traditional broadcasts and traditional distribution, which is happening over RTP for a big part of it. But that means that you need to set up two parallel silos again. You need that IPTV distribution, you need HTTP distribution in parallel, which is also not really optimal. If we ever want to be able as an industry to really bring those together, we'll need to find one solution that works across the board. And that might be still a little bit trickier than if we try to use RTP-based streaming, I think.

So, a lot of questions on HESP it appears:

“So which encoder and origin devices software support is there with setup for HESP?”

If you really want more details on HESP by the way, I really recommend you go to the HESP Alliance website. There's a lot of information there. We've done specific webinars on HESP in the past. Of course, if you have other questions, just let us know. We're more than happy to answer them. But what is the support for HESP at this point in time? We have a few different packaging vendors who are working on it. We have, for example, Synamedia, also a part of the HESP Alliance - they've set up their packager to also produce HESP feeds. And there are many more. We're also working, for example, together with Videon, they're adding support for HESP right out of their EdgeCaster boxes. So, there's quite a lot popping up there, and that's looking very good. The other thing, which is very nice, HESP just uses the standard CDNs, the standard origins. So, you can really use commodity software there, it's already available. It's often a lot cheaper than setting up specific CDNs for specific protocols like you would have to do with WebRTC. So, we're really building a lot of support there, I think that's some good news! And that probably also already answers part of this question:

“What kind of Packager support is there for HESP? What kind of player support is there?”

We definitely support it, of course. There's a lot more to come. We don't have too many players yet, but of course with the HESP Alliance, we're really trying to build that out. So, if you look on the HESP Alliance website, you'll probably see already some names that you're familiar with, and well, we can grow further from there.

So, the next one:

“I was wondering if there have been any discussions or plans on video streaming metrics logging within the HESP specification, drawing a parallel to the QUIC/HTTP3 group where logging is being discussed as we speak. Is HESP planning on defining a set of metrics which can be tackled to analyse performance?”

That's a good question. Don't know if you want to tackle it or if I can tackle it.

Johan: Go ahead.

Pieter-Jan: Well, I think there are definitely options that we have here. Do we have specific metrics at this point in time? The answer is no. Is it fully compatible with things like common media client data, those kinds of things? Yes, absolutely. We actually have a connector built already in our player that can send you all of those same metrics with HESP, with HLS, with DASH, whatever protocol you are using. But there is still a lot of room for further deepening this. If you have good ideas, if you have good questions or metrics that you would like to see, well, talk with us, talk with the Alliance. We're happy to discuss how we can expand the protocol. I mean, this is a protocol that we really want to make as open as possible. So that's definitely something we're open to discuss!

The next question, that sounds like a good one for you:

Johan: “Does the initialization feed of HESP have an impact on required bandwidth?”

The answer is no, when it comes to the bandwidth that's going, let's say from the server to the player. The reason for that is that normally it's the continuation feed that's being used. The initialization feed is only used once, and that is to start the stream. So, at that moment, we get an IDR from the initialization feed, we bring it to the player, to initialize the player, and then, immediately thereafter, the continuation feed takes over. So, it's only one IDR that's added. And that one IDR, you need it anyway, because without IDR, you cannot start decoding. So, there is no impact on the bandwidth required towards the player.

Pieter-Jan: Yeah, if anything, there's actually a positive impact. Because if you wouldn't get that IDR and you would need to re-download a part of it, then you would need to actually download a lot more. And of course, because we have that initialization feed, as Johan actually explained, you can have a lot bigger GOP sizes. So, the impact is actually a positive impact, if you would ask me.

Johan: Yeah, absolutely.

Pieter-Jan: “Does the need for DRM affect my choice of HLS versus DASH versus HESP?”

It really depends on what kind of platforms you're targeting, because if you do all of this correctly and you start setting it up with CMAF chunks, what you can do these days is you can actually make one uniform stream using CMAF with CBCS-encryption, and you can have HLS with PlayReady, FairPlay, and Widevine, you can have DASH with PlayReady, FairPlay, and Widevine, and you can actually have HESP with PlayReady, FairPlay, and Widevine as well. So, the need for DRM doesn't really impact the protocol that you have to pick. Of course, if you need to support older players as well, or older devices usually, then you cannot always use CBCS. So, you might need to set up a secondary stream with CTR-encryption in there. But that's going really deep in the DRM side of things. I think the very short answer here is that the protocol and the DRM, they're not really influenced anymore by each other these days.

“What platforms support HESP play-outs - Fire, Roku, etc?”

Well, this is a very good question. It's actually a good question for all of the protocols at this point in time. If I purely look at what our player does, our player does low latency HLS, low latency DASH, and HESP across all of the platforms, except for one, which is Roku. On Roku, you always have to use the native Roku player, so there our SDK also has to leverage the same streaming pipeline. There you cannot play out low latency HLS, low latency DASH or low latency HESP. Well, you can in compatibility mode, but it will not be playing at a low latency. So, I think that's the big disadvantage there. But other than that, the Fire TV platform, all the web browsers, the smart TVs, all of them can play out low latency HLS, low latency DASH and HESP. I think that's one of the nice things about the compatibility that we have between the three of them.

“Isn't this latency a little bit theoretical, as it would typically have a tiered CDN networking adding latency to each step?”

Something we discussed earlier today as well.

Johan: Yes, absolutely. The answer there is no, because the measurements that we have for HESP are actually with a CDN in between. So, what we do is - we have a video feed, we encode that, we turn it into an RTP feed, we send that RTP feed to an AWS-based transcoder and packager. We send that into a commercial CDN, and then it's fed back into just a web-based player. And there, I mean, just a couple of days ago, we did the measurement, and it was actually 510 milliseconds, if I'm not mistaken, difference glass-to-glass. So, the values are already there. Now, of course, you are fully right - each step in between will add latency. Just as if you have a bad network, then you will have to increase your buffer size a bit to avoid stalls and stutters. For HESP, we download each chunk one after the other. If you really have such a bad network that it takes half a second between two consecutive images to be downloaded for whatever reason, then if your buffer isn't at least half a second, you will have a stall. So, if your network is decent, then these values are correct. And yes, for HESP, there is a CDN in between.
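The buffer argument can be illustrated with a toy simulation: frames arrive one by one, playback drains the buffer in real time, and a stall is counted whenever the gap between two arrivals is longer than what is buffered. This is a simplified model with hypothetical names and made-up arrival gaps, not how any particular player implements buffering.

```python
from typing import Iterable


def count_stalls(arrival_gaps_s: Iterable[float],
                 frame_duration_s: float,
                 initial_buffer_s: float) -> int:
    """Count playback stalls for a sequence of gaps between frame arrivals."""
    buffer_s = initial_buffer_s
    stalls = 0
    for gap in arrival_gaps_s:
        buffer_s -= gap                # playback keeps draining while we wait
        if buffer_s < 0:
            stalls += 1                # buffer ran dry before the frame arrived
            buffer_s = 0.0             # playback resumes from an empty buffer
        buffer_s += frame_duration_s   # the newly arrived frame is buffered
    return stalls


if __name__ == "__main__":
    frame = 0.04  # 25 FPS
    # Steady delivery with one half-second hiccup in the middle.
    gaps = [frame] * 10 + [0.5] + [frame] * 10
    print(count_stalls(gaps, frame, initial_buffer_s=0.3))  # buffer too small
    print(count_stalls(gaps, frame, initial_buffer_s=0.6))  # buffer absorbs it
```

With the 0.3-second buffer the hiccup causes one stall; with 0.6 seconds it is absorbed.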

Pieter-Jan: Yeah. So, I think the answer there is: it's not theoretical, it's what we actually measure. So that's good!

“Can a CMAF segment shared by HLS and DASH also be consumed by HESP?”

There the answer is yes. We already mentioned it a few times, I think. But the nice thing is that the HESP continuation stream can actually be a normal low latency HLS or a low latency DASH stream as well. As Johan did mention, you do lose the ability to use very big GOPs if you want the low latency capabilities of HLS and DASH to be shared with HESP. But the nice thing here is yes, it is all the same. So, you can have all of those segments, you can have them cached just once on your CDN, getting very good hit ratios, something CDN vendors probably will like a lot. And usually actually viewers also like because it means that your round-trip times will reduce if you don't have to go to origin. So that's the impact that you were just discussing as well. So, the long answer was that. The short answer was yes.

“Maybe outside of these conversations, but are all of these numbers start after our encoder delay? If we are comparing a live video stream to the final output, even HESP will be longer than 0.5 seconds because of encoder delay. Are there faster encoders versus slower encoders?”

Johan: So, the 510 milliseconds was with the encoder included. But yes, there are faster encoders and slower encoders. It really depends on what the encoder is made for. First, I would say some proof points that you really have fast encoders - we have been using FFmpeg just as a workhorse, with 60 milliseconds of delay introduced to do the transcode. So yes, that's 60 milliseconds, but I think it's pretty much acceptable. You also have encoders like the Videon 1 that are meant for contribution feeds. These encoders are really super fast, even faster than these 60 milliseconds achieved by FFmpeg in a specific mode. Also, if you would have the encoders that are used for video conferencing, all of these are damn fast. On the other hand, you also have slower encoders. Those are encoders that basically made a different trade-off, that make use of more optimizations in the encoder, that make use of multiple passes in the encoder. And the reason why these encoders made that design choice is because they want to get as high a quality as possible for the lowest possible bitrate. So, it really depends on what the encoder is optimized for.

Pieter-Jan: Yep. I see we still have quite a few questions. So, we'll try to go through all of them. At a certain point, we will probably have to cut off, but let's try to tackle the ones that are outstanding at this point.

“What is the overhead for error recovery?” And I'm more or less assuming here that for error recovery, we’re talking about packet loss, which we already mentioned. It might also be a connection which is basically completely disconnecting at a certain point in time. That's something we haven't talked about yet. For all of that, what you really need is buffer. Just to give you an example, I go outside here when we're in the office. When I'm in meetings, I go outside to have a walk. Even then, I notice that if my connection switches from Wi-Fi to 4G, I have a network disconnect in the video conference. If you have that kind of setup, a real network disconnect, you will always have an issue. If that disconnect isn't too long, protocols like HESP, HLS, DASH will be able to recover in time as long as the buffer was big enough. If the buffer wasn't big enough, then you will have impact. But if you cannot push bits to a network for a few seconds, then I think there's very little that you can do about that, unfortunately.

“Does the init stream only contain IDRs and continuation stream does not have IDRs?”

Johan: Well, the continuation stream also contains IDRs. The continuation stream is just a regular CMAF stream. Any, I would say, DASH player is capable of playing the HESP continuation stream because it's really a regular CMAF-CTE stream. So, it contains IDRs. The only difference compared to an optimized LL-DASH CMAF stream is that our GOP size can be larger, with the advantage already explained of lower bitrate and so on. The init stream is actually, yeah, we call it 'stream', but it's never continuously streamed. It’s there, it’s on the server and it’s only used when we want to start playback, we just ask - give me one IDR. And so yes, the init stream, that's an intra-only coded stream. So, for each image position, there is an IDR. So, whenever we want to start playback, whether it's at T0 or 40 milliseconds later, we request the initialization stream's current IDR, and we will get it as a response to our request.

Pieter-Jan: Okay, weird sentence, but :

“You don't have to request per part. You can do full segment requests after the initial byte range request for one second parts.”

Definitely true for LL-HLS. You can actually cut the number of requests that we mentioned in our test results in half if you start using byte range setups, where you only have to request the media once for every segment and the playlist once for every part. Then you can basically have one preload hint request and, in parallel, just playlist requests. So that's definitely true. The same can actually be true for DASH as well. We had the setup where we were loading both the manifest and the segment. For DASH, you can actually just request the segments if you have time-based or number-based segments. Then you just load the manifest once at startup, similar to how we do it with HESP, and then you can just keep loading the segments one after the other without having to load the manifest. So definitely true.
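For the number-based DASH case, a player can derive the next segment URL from values it read once from the manifest, which is why no manifest reload is needed. The sketch below assumes a `$Number$`-style segment template; the template string, start time and helper names are hypothetical, and real manifests carry more details (timescale, offsets, representation IDs) that are left out here.

```python
def current_segment_number(now_s: float,
                           availability_start_s: float,
                           segment_duration_s: float,
                           start_number: int = 1) -> int:
    """Number of the segment that contains the live edge at time `now_s`."""
    elapsed_s = now_s - availability_start_s
    return start_number + int(elapsed_s // segment_duration_s)


def segment_url(template: str, number: int) -> str:
    """Fill a '$Number$' placeholder in a segment URL template."""
    return template.replace("$Number$", str(number))


if __name__ == "__main__":
    # Hypothetical stream: started 125 s ago, 2-second segments.
    number = current_segment_number(now_s=125.0, availability_start_s=0.0,
                                    segment_duration_s=2.0)
    print(segment_url("video_$Number$.m4s", number))  # video_63.m4s
```

From there the player simply increments the number for every following request.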

“LL-HLS with byte range request can't we reduce the number of requests?”

LL-HLS with byte range, so indeed the same question. So, the number of requests can be reduced, but keep in mind that even if you cut those 120 requests per minute in half, it will still be quite significant. It will still be 60 requests, and keep in mind that you can cut it in half with DASH as well, so DASH then becomes 30 requests. That's still an order of magnitude bigger compared to what HESP does, and there's still a pretty significant multiple there between HLS and DASH as well.

“How do you see big guys like Netflix, YouTube, etc. adopting HESP?”

That's a very good question. I do think that there's one extremely big difference as well. Netflix, YouTube, mostly known of course for VOD indeed. There we still have benefits, for example, seeking time, also the ability to switch qualities very quickly. But of course, with video on demand, you can build a lot bigger buffers. So, the need probably is a little bit smaller there. They will definitely be interested, I think, in increasing the quality. But even there, for video on demand, that GOP size, it's not that important. You can make it pretty big. So, unless they want to start optimizing the seeking time, I don't think that it's something that they have to adopt immediately. But for all the live streaming, it is extremely important, or at least it's extremely interesting. Even if you have a live feed like YouTube Live, what they could actually do is they could just store the continuation stream afterwards, chop it up into regular big-segment HLS and DASH feeds, and serve it out like that as well. So, they could have the best of both worlds and have compatibility there.

“Doesn't inserting an IDR from the initialization feed in the middle of a GOP from the continuation stream cause the following B/P frames to become inconsistent?”

That's a very good question.

Johan: That's a very, very good question. We always get that indeed. If you do it carelessly, the answer to that is yes. If you do it properly, the answer is no. So, I'm going to just skip the B frames for the time being because it's easier to explain with P frames. We impose a little constraint on the P frames, because a P frame can reference one or more previous images. If we constrain the referencing to only reference one previous image, in that case the keyframe injection works, because the P frame references the previous image and will then simply reference the newly injected IDR. And that works perfectly well. So of course, we need to constrain it so that it only references one previous image, which, as our tests have shown, hardly impacts quality versus bitrate. If we have B frames, then the situation becomes a bit more complex because, as you know, a B frame can reference both a previous image and a next image. If we have B frames, then for this famous keyframe injection, well, we have several options, but in that case the keyframe injection cannot happen at every given moment, but can only happen at what we would call sub-GOP boundaries. So, that would mean that if you would have a GOP structure with an I, three Bs and a P, we would not be able in that case to inject an IDR at every B frame position, but only at the sub-GOP boundary associated with those B frame positions, which means that in the example that I gave, we can do a keyframe injection not at every image position, but let's say at one image position out of four. And the HESP player is able to sort that out just by the way the IDR initialization stream packets are constructed.
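The effect of B frames on where a keyframe can be injected can be sketched by listing the allowed image positions inside one GOP. This is an illustrative model only: the function name is hypothetical, and placing the first boundary at position 0 is an assumption, since the exact alignment depends on the encoder's GOP structure.

```python
from typing import List


def injection_positions(gop_frames: int, b_frames_per_sub_gop: int) -> List[int]:
    """Image positions within a GOP where an IDR could be injected.

    Without B frames every position qualifies. With N consecutive B frames
    followed by a P frame, only one position per sub-GOP of N + 1 images does.
    """
    sub_gop_frames = b_frames_per_sub_gop + 1
    return list(range(0, gop_frames, sub_gop_frames))


if __name__ == "__main__":
    print(injection_positions(12, 0))  # P frames only: all 12 positions
    print(injection_positions(12, 3))  # three Bs and a P: [0, 4, 8]
```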

Pieter-Jan: Yeah. And there's actually another approach that we can take there as well. There's still some debate. We will probably have an update of the HESP specification in October or November, which will most likely detail more about the B-frames. Because the alternative is actually sending an IDR frame and a B-frame in the initialization segment. And then we could handle the two references of the B-frame as well. So, there are options here. And if you're concerned about the quality that Johan mentioned as well, don't worry about that. We actually have been working together with the HESP Alliance and with some universities, which have been researching exactly this. So, there are papers available if you're interested to read more about that.

Johan: Also, with respect to the B frames, if you have B frames, then reordering is necessary to do the decode. So, introducing B frames will automatically introduce additional latency. So, if you have, let's say, three or four B frames, your reordering will impact three or four image positions. At 25 FPS, that is four times 40 milliseconds, so 160 milliseconds, which is not a lot for, say, normal HLS or DASH. But if your target is sub-second, or say even as aggressive as the 500 milliseconds we referred to, in that case 160 milliseconds is a lot. So what we see is that the prospects we are talking to for really ultra-low latency streaming really want to squeeze out the most of it, and that they would rather put, let's say, 100 milliseconds in the buffer to be a bit more resilient towards nasty networks than put 100 milliseconds in the decode reordering to have B frames.
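The arithmetic above can be written out as a small Python sketch. The function names are hypothetical, and the model is deliberately simple: it assumes reordering delays playback by one frame duration per B frame, as in the example given.

```python
def reorder_latency_ms(b_frames: int, fps: float) -> float:
    """Extra decode latency from B-frame reordering, in milliseconds.

    Assumption (simplified model from the discussion): each B frame
    adds one frame duration of reordering delay.
    """
    if b_frames < 0 or fps <= 0:
        raise ValueError("b_frames must be >= 0 and fps must be > 0")
    frame_duration_ms = 1000.0 / fps
    return b_frames * frame_duration_ms


def share_of_budget(latency_ms: float, budget_ms: float) -> float:
    """Fraction of an end-to-end latency budget consumed by latency_ms."""
    if budget_ms <= 0:
        raise ValueError("budget_ms must be > 0")
    return latency_ms / budget_ms


if __name__ == "__main__":
    latency = reorder_latency_ms(b_frames=4, fps=25)
    print(f"reordering latency: {latency:.0f} ms")
    print(f"share of a 500 ms target: {share_of_budget(latency, 500):.0%}")
```

With four B frames at 25 FPS this gives 160 milliseconds, roughly a third of a 500 millisecond target, which is why that budget is often spent on buffer instead.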

Pieter-Jan: Yeah, that's indeed the encoding chunk that we discussed earlier, the packet that you need, or how far you want to be able to reference frames.

“IDR and its following frames are actually in sequence. Is it not the same to deliver the init stream and continuation stream within one stream, as traditional protocols do?”

Yes and no. There's also, of course, cacheability here that we want to keep in mind. We don't want to stream out all of them one after the other. So, you could in theory stitch the initialization stream and then the continuation stream together on the edge in one request, but that would not be as ideal for caching. We really want standard CDNs without edge compute to be able to just download the continuation stream given a certain starting range.

And then I see that there was also another question.

Johan: Yeah, it's a question we answered. “Do you really see B-frames for live streaming?”

Not really if we really want to go for the lowest possible latency.

Closing Remarks

Pieter-Jan: Yeah, and then I think we're more or less reaching the end of all of the questions. So that's good.

So, I would say a big thank you to everyone for joining. I found it extremely fun. Keep in mind, we still have this survey, and it would help us a lot if you're willing to answer it. And if you have any other questions, or questions specific to your use cases that you would want to see answered, you can always reach out to us through the website. We would be more than happy to set up a conversation and to talk about how we can further help you with any latency questions that you have. So, thanks everybody for joining. And I hope to see all of you again, hopefully at one point in time in person as well. Let's see, I have very good hopes for the upcoming trade shows, and especially those following next year.

Johan: So, have a good day or evening, depending on the region you're in, and thank you for staying with us and listening to us.


Pieter-Jan Speelmans


Founder & CTO at THEO Technologies

Pieter-Jan is the Founder and the head of the technical team at THEO Technologies. He is the brain behind THEOplayer, HESP and EMSS. With a mission to ‘Make Streaming Video Better Than Broadcast’, he is innovating the way video is delivered online, from playback all the way to ultra-low latency streaming. Pieter-Jan is committed to enabling media companies to easily offer exceptional video experiences across any device.

Johan Vounckx


VP of Innovation at THEO Technologies

Johan Vounckx is VP of Innovation for THEO Technologies. He works on inventive methods to improve the delivery of streaming video. Prior to joining THEO, he worked for major players in the video and broadcast industry, such as EVS Broadcast Equipment and Telenet (Liberty Global Group). Dr Vounckx received an MSc and a PhD from the University of Leuven, Belgium.


THEO Technologies

Founded in 2012, THEO is the go-to technology partner for media companies around the world. We aim to make streaming video better than broadcast by providing a portfolio of solutions that enable easy delivery of exceptional video experiences across any device or platform. Our multi-award-winning THEO Universal Video Player Solution has been trusted by hundreds of leading pay-TV and OTT service providers, broadcasters, and publishers worldwide. As the leader of Low Latency video delivery, THEO supports LL-HLS and LL-DASH and has invented the High Efficiency Streaming Protocol (HESP), allowing for sub-second latency streaming using low bandwidth with fast zapping. Going the extra mile, we also work to standardise metadata delivery through the invention of the Enriched Media Streaming Solution (EMSS).
