How to Simplify Your OTT
App Testing Cross-Platform

Watch the recording and master automated video playback testing

  • Navigating the cross-platform maze: understand the complexities of testing across different operating systems, versions, and hardware.
  • Finding the optimal balance between device/platform coverage and test depth.
  • Establishing an automated video playback testing system with a single interface and test suite.
  • Exploring real-world stories and experiences in video playback testing and debugging.

Webinar Transcript


Benny: Hello everybody, a warm and friendly welcome to all of you. My colleague Michel and I are happy to host today's webinar from THEOplayer on how to simplify your OTT app testing cross-platform. I see people are dialing in from all over the world, which is great. Here in Belgium, we have just put a long Easter weekend behind us, and considering that we are based in Belgium, there is always plenty of chocolate here. So, we are hopefully fueled up to do this webinar today. Let me have a look: I see lots of people are still dialing in, so thank you all for joining. It's great to see so many people on the webinar. Maybe while we wait and give other people the chance to join, let's do some introductions first. Michel, do you want to start off?

Michel: Sure, Benny, sure. Intrigued by the Easter weekend and the chocolate. But welcome to the webinar. I'm Michel, VP Engineering at THEO for four years now; apparently it's around my anniversary, LinkedIn tells me. Before that, I spent 14 years in telco with Telenet and Liberty Global, always encountering test automation, which is always a fun topic. So I'm pretty excited to talk today. Over to you, Benny.

Benny: Thank you, Michel. So, I'm Benny, I'm part of the Customer Success team here at THEO. A large chunk of my career has been centered around video and quality assurance, actually. So as a QA consultant, I've worked with numerous cable operators across Europe, diving into their video delivery ecosystems. And it was actually during one of these engagements that I had the pleasure of meeting Michel. So, it was really great to be reunited with Michel here at THEO, and even better that we can do this webinar, and share our collective experiences here today.

So, without further ado, I would say let's get started. Maybe first, some practical things for this webinar. This is the screen you're looking at now. You see there is a ‘react’ button; we really love it when it's interactive and people react, so feel free during the slides to show some emotions there. Next to that, at two points in the webinar we'll do a poll. You will see the poll pop up on your screen, and if you access the "polls" section, you can see the results and the outcome of the polls. And thirdly, there is an option to ask questions. You can also use that option if you give a specific answer in a poll and want to provide some additional context; feel free to just put it in the question box. There are points in the webinar where we'll pause to answer these questions, and at the end of the webinar we'll come back and answer all the questions. If there was something really specific, we might reach out to you after the webinar, maybe set up a meeting to follow up.

So that's all the practical stuff behind us. Now, let's start off immediately with a poll. So, the question here is what are the top challenges you face today while you're testing your OTT ecosystem? So, give everybody a minute to answer here. Go ahead.

Michel: Look at those results coming in. Always interesting.

Benny: So, is it a bit what we expected, Michel?

Michel: Yeah, it's in line with my expectations. It's even more pronounced than I expected. That's very, very interesting.

Benny: Yeah. So, as we can see here at the moment, the strong lead is that too many platforms and specific devices are needed. So that's indeed the one we expected, but maybe not so prevalent as Michel described. Second, but at only 15%, is addressing the different real-life customer conditions. And then all the others follow closely behind, but there was a clear separation on the platforms and the specific devices. So that's really interesting, and we really hope that after this webinar we can give you some insights into how we tackle this problem at THEO. So, thank you all for providing your answers. It's really useful.

So, let's look at the agenda. We'll start off with looking at the cross-platform maze, the answer most of you just selected, to situate the problem a bit. Then we'll look at the approach we took at THEO to tackle this. Then there is a showcase from Michel with a live demo of how we implement the automation. Afterwards we look at the value of all of this, some key takeaways, and then a round of questions and answers, of course.

Navigating the Evolving Landscape of Cross-Platform Testing

Benny: So, let's get going. First of all, the cross-platform maze. I think when I started working about 15 years ago, it was all about set-top boxes. So typically, you had a platform that was consistent across your user base. And I personally experienced the growth in the devices you had to support: first, of course, the addition of web, then mobile devices, then casting, then smart TVs. And next to that, you saw that a lot of companies wanted to do their own thing, like the birth of Fire TV and Roku, for instance. So many different things to take into consideration.

Next to this, there was also the aspect of the size of your release versus the speed. I think that's a well-known thing in software development nowadays: the more features you add to a certain release, the more complicated it becomes to get it out. And typically, product and marketing departments just include a few more features in the release, and then at the end quality suffers and you cannot get the release out anymore, which ultimately impacts your customers. That is something, of course, we want to avoid at all costs.

Next to that is the mandate for cross-platform presence, which has actually become a necessity today. When you don't support your app cross-platform and you say, okay, let's for instance not support connected TVs, or let's not support a given platform, that's almost a guaranteed churn risk at the moment, because customers expect that you deliver a similar experience across all platforms. That's definitely a big challenge in today's market.

Next to that, of course, we have the different streaming technologies that we have to take into consideration. And then in general, the device fragmentation: when you think about it, all the different operating systems, but also the versions of those operating systems. What today is becoming significantly more complex is the versioning of smart TVs; people buy a new smart TV maybe every eight to ten years. So we need to support a lot of older operating systems and also older hardware, which makes it very complex to deliver that consistent user experience. Michel, over to you.

Michel: This is a very interesting topic. I think for everybody who has ever experienced test automation, one of the key elements I've always seen is the mandate to be on top of your results: basically the discipline to always check that every result marked green or red is actually correct. There's nothing as bad as a failed test being marked green, but the opposite is just as bad. Before you know it, you have a large population of devices with different versions in the rack that you're maintaining, and if you allow the results to deteriorate, if you allow them to start showing false positives or false negatives, then before you know it, you're not able to master your results anymore. So you really need to make sure that you can always trust what your results are telling you. And basically that means a sort of mandate to look at the results every day and be diligent about understanding what is happening. And what we've always seen is that, of course, in a perfect world you test completely black box, like a customer would. But in the end, you mainly want to get to stable results.

And that's a topic we'll also cover a bit later today: how do you manage all of these platforms and still get accurate results? So, what we balanced is: how do we get the most coverage while still being able to master the results? And what test depth are we able to achieve? Sometimes you do want to sacrifice a bit of test depth to get wider coverage across your platforms, just to stay on top of the results.

Poll: Current OTT Testing Methods?

Benny: All right, so it's time for another poll already. Now that we've dived in a bit deeper, how are you currently testing your OTT ecosystem today? So again, giving you all a minute to answer.

Michel: I’m starting to believe that you really like polls, Benny.

Benny: Well, it's very interesting considering the background I came from, because we had to tackle these challenges many times in the past. So it's always interesting to hear how others do it.

Michel: You can't get away from your history.

Benny: Again, it seems that there was a very strong winner here, and that is: we test in-house before every update. Interesting, Michel, right?

Michel: That's a very interesting one to see indeed. And combining that with the previous one, too many platforms and specific devices, I can imagine some people are having a headache with every release.

Benny: Yeah, headache is an understatement, I think.

Michel: For me it's still, with every release, like you said earlier: your releases become bigger, you're not able to get them out because you need to cover your full test scope, your marketing team is already on top of you because the release is late, you want to shout to the market, and then it goes to market and you still find bugs coming in through customer services.

Benny: And we are not even considering specifically the video aspect here, because, of course, testing your app goes beyond the video part, you know, all the bells and whistles, the features, the UI. So yeah, that's a big headache.

Michel: Well, maybe a little preview of what is still to come: if you look at your OTT app, 80% is your video player.

Benny: Yeah. When we look at the results, the in-house answer is almost 60%. And then second, at almost 20%: we're already using an automated test system, which is also interesting. So, if you want to disclose it, we would be really interested: please just pop in the question section which system you're using today, or whether you built your own system based on open-source solutions. Just let us know; it's interesting to learn.

All right. Thank you all again for providing these insightful answers. Let's move on. So, Michel, over to you again.

Deliberate Choices in Test Automation Approach

Michel: Really, it's me again?! Okay, so our test automation approach. Let's see what we're going to talk about. So, we made some deliberate choices. At THEO, I think we've been doing test automation for a bit over three years. When I started at THEO, we still had a very big QA team actually, but the work quickly grew beyond us. At THEO we have four SDKs, for people who don't know: a Web SDK, an iOS SDK, an Android SDK, and a Roku SDK, which cover all of the platforms from smart TVs to web browsers to Fire TV sticks and those kinds of things. So, if you want to start testing all of the new features, or you certainly want to test all of the regression at the pace that we are releasing, that becomes a lot. So, we made some deliberate choices about the test depth we go to, in order to get more stability in our results. Because actually, we do it with a very small team, and we moved almost all of the quality assurance responsibilities to our development teams. But to be able to do that, you do need to make sure that the maintenance on your test automation system is pretty low and that you can really rely on it.

So, as I will demo a little bit later today, we made a deliberate choice to validate based on APIs and events that are exposed by the player and by the platform. We are deliberately not going into the visual aspect, because you will always have that 5%, maybe even less, of noise that is actually not a failure, and it will creep up on the developers that are looking at your results. They will get frustrated, and before you know it, they give up. So, test depth versus testability was a deliberate choice.

I think the second one, and it's interesting that number two in the poll was that you're already using automated testing systems: there is a lot of technology out there. And maybe already a spoiler alert: the tool doesn't matter. We built a tool ourselves at THEO, but actually you can do it with any tool. The tool is not a silver bullet. But what we really chose at THEO is to make sure that we are able to run tests using one technology, being JavaScript. Everybody knows it. Ideally also with a framework that everybody can Google, which I think is one of the biggest principles we set at THEO: the answer needs to be Googleable. If you have built something proprietary and you always need to ask the same person, that's not going to scale; that's not going to work.

Then on the other end, to control the devices, like I said, there is a lot available. You can use Appium for your mobile devices; iOS comes with its own tooling, Android comes with its own tooling, Roku has its own tooling. You have Cypress, and a gazillion things for web. But all of those will also require maintenance. And that is one of the reasons that at THEO we basically took the approach, which I'll show you, to build that ourselves: to abstract away all of those technology differences and basically harmonize the test interface, so that you have one API, one interface to talk to, which is supported across all of these devices.
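The "harmonized test interface" idea can be sketched as a small contract that every per-platform test client fulfils, so the runner never touches Appium, Cypress, or vendor tooling directly. The method names below are assumptions for illustration, not THEO's actual interface:

```javascript
// Hypothetical shared contract for a test client. Each platform (web, iOS,
// Android, Roku) implements the same operations; only the internals differ.
const requiredMethods = ["reload", "setSource", "play", "pause", "getState"];

function isTestClient(candidate) {
  // A candidate qualifies when it implements every method of the shared interface.
  return requiredMethods.every((m) => typeof candidate[m] === "function");
}

// A web client and a Roku client would differ internally, but both would pass:
const webClient = {
  reload() {},
  setSource(src) { this.src = src; },
  play() {},
  pause() {},
  getState() { return { paused: true }; },
};
console.log(isTestClient(webClient));     // → true
console.log(isTestClient({ play() {} })); // → false
```

Because the runner only depends on this contract, swapping the underlying control technology on one platform never changes a single test.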

And then lastly, very important: don't reinvent the wheel. I had the honor to go to New York for THEO last week, which is the picture you're seeing there, where I also spoke about this. It's the same with a video player; it's the same with any other technology. Don't reinvent the wheel. Focus on what makes you strong, focus on what makes you unique; that's where you should be spending your effort. The same goes for the technology you use for your test automation: don't reinvent the wheel. And the same, for example, for the player in your ecosystem: don't reinvent the wheel in the testing that is being done there, and rely on what is being provided by your partner.

Defining the Test Scope at THEO

Michel: Let's see what's next. The test scope. Benny, do you know what we are testing at THEO?

Benny: Yeah, and it's interesting because there was already a question from one of our listeners asking what our focus areas are in our testing. I think people involved in testing know very well about functional versus non-functional. The functional tests, which will become very apparent when Michel does the live demo, check of course the more tangible things. And the non-functional part, which is really important for a video player, is where we actually validate stability. We leave the player running for a longer time and see if the player continues to run, of course, but also if it continues to perform to the standards that we set here at THEO, which are really high. So, that's a bit of the overview: it's not only functional, but also non-functional testing. And the second one is very valuable as well, considering the fact that we are a video player company.

Michel: I think it's maybe nice to also add that during the onboarding cycle of our customers, and people will recognize this, we always ask: do you have a reference stream? Do you have a test stream? The reason we do that is of course to really help our customers during the onboarding to get to the level where they should be. But sneakily, we also typically put that reference stream in our test automation framework. We will not test all the features, but we will make sure that every release we check: does it still play? Does the base functionality still work?

CI/CD Pipeline and Release Flow at THEO

Michel: So, if you then look at our release flow in our CI/CD pipeline, on which we did get questions from a lot of customers: if you look at our branching structure, we have three main branches: develop, release, and master. Every feature development happens on a feature branch, where continuous integration tests run that are not executed on actual devices but purely in the Bitbucket pipeline. All the feature branches then go to the develop branch, which is our bleeding edge, where all the latest features are, which changes almost daily and where the team works during the sprint. Every night on that branch, our nightly tests run: the nightly tests that run against actual devices with the framework and the tooling that we will show and explain.

The next branch is our release branch; some people will call it a staging branch. That's basically the branch that the code gets merged to at the end of the sprint: at the end of the sprint, we merge everything which is on develop to that release branch. On that release branch, the same nightly tests run at the same moment as on the develop branch. Exactly the same set, although of course new features may have been added, and they come with their own test automation tests, which are also merged according to this branch structure.

The last branch we have speaks for itself: master is what we actually have in production, the last release you will find in our changelog online. But also there, we are running the same set of tests every night. Why do we do this? Because this means that every morning, you step in and you're able to see three results of the same tests against three different branches. This can easily help you spot: okay, my develop branch is red, but release is green and master is green, so I already know in a split second this must be a regression. So this is a bit of the mandate you have to really stay on top of your results: to make sure that your results are correctly green or red, but also that you maintain your results daily so that you're able to compare between these three versions. That means your tests need to be 100% stable, your devices need to be 100% reliable, and you need to look at this every day. From the moment you allow yourself to say, look, we know this sometimes fails, you cannot use the power that comes from comparing these three versions.

If you look at our release schedule, and that's what I said earlier about how fast we release: every two weeks we release a new player version, which is actually something we could even speed up. We could go even faster, but right now we release every two weeks. You will also see that at the moment we merge from the release branch to the master branch, we also run a more extended set of tests, where the reference streams of our customers that I just mentioned are also quickly validated across the devices. Afterwards, if everything is green, we deploy to the package managers, we update our documentation, and we update our changelog. So that's a bit how our release flow and our CI/CD work. The tests against real devices run on a schedule every night, against three different versions every day. That's, I think, what I want people to remember from this slide.

Self-Managed Device Lab and Maintenance Resilience

Michel: If you then look at the devices, that's also a very interesting topic. We have our own self-managed device lab. I can already hear people think: yeah, but you're saying don't reinvent the wheel; there are things like BrowserStack out there. Which is true. If it fits your use case, and I would even tell people, if you're building an OTT app, the experience of that OTT app is what sets you apart, that's what makes you different. So, if you're able to validate that functionality using these cloud-managed or remote-managed device labs, certainly do so. Only for our use case, where we are building a video player which behaves differently on every different TV of every different year and every different model, it is really required that we're able to get access to the device. Because we also need to be able to condition that device, put it into a certain network condition, and control every aspect around that device. That is why we chose to build our own device lab, with the help of our customers of course. And what you will find is that of every major device family, and every is probably too strong a word, we have the different models in our office. And typically the most low-spec one, because if it works there, it will probably also work on the higher-spec ones.

What else do I want to say here? This self-managed device lab also comes with maintenance, and I think one of the critical things we built is the required resilience in all of this. We have scripts running every night that reinitialize all of the devices, to make sure that anything left behind by any test is completely cleaned up and that you can always reliably trust the availability.

I went a bit too quickly; I should also say that with a self-managed device lab, we also allow our customers during their deployment or onboarding to use our devices for their testing. So, if you have certain devices in the field that you cannot test and don't have yourself, we will gladly help with that. Note that we are not a test automation company, certainly not, but we do have the devices and we are, of course, open to helping our customers.

Benny: Yeah. Thanks, Michel. Maybe, sorry to interrupt you, but before you dive into your live demo, there was an interesting question from one of our listeners asking how long one of these runs, specifically the testing part, takes.

Michel: That's a very interesting point. And to be honest, it's a tedious effort to properly scale all of this. Like I said, it runs every night, meaning we typically have a window from, let's say, eight in the evening, when the last person has certainly left, until eight in the morning, because we're of course also heavily hammering our internet connection when all the devices start playing a four-megabit stream. So, we have a limited window to test three different versions. Divide 12 hours by three, and you have four hours for every branch. Now, that is the window the test runs can take; I think on every branch we would actually be testing for around three hours with our automated tests, because there are also longevity tests in there where we let the stream run for a longer period. But on the other hand, by doing this every day and comparing the three different branches every day, we could, if everything is green, even release every day. So, I hope that gives a bit of an answer. I'm not able to say it takes three minutes and 45 seconds, but I can tell you that with the approach we take, we could release the software every day if we would like.

Test Automation Framework Overview
What did we build at THEO?

Benny: All right. Thanks, Michel. Now, the floor is yours.

Michel: The floor is again mine. Let's see. What did we build at THEO? It's probably something we already talked about today, but I'm also going to show you a bit of the concept. So don't expect the most brilliant code from me, and I won't completely start from scratch: I'll show you a minimalistic version that I made on the plane back from New York, but it shows the concepts. We call our test framework XAGGET, which is a bit of a practical joke that goes back to our previous experience and previous employers, et cetera. But I'll show a simplified implementation.

With simplicity comes great power. And like I said earlier, reuse what is available; don't reinvent the wheel; use the tools that are available. The framework we will be showing basically contains four elements. First, we have a test client per platform. What do I mean with that? We have built a test client for iOS, one for Android, one for web, which we use on browsers but also on smart TVs, and one for Roku. What does that test app do? That test app, in our case, wraps our video player. And in all honesty, our competitors' video players are also embedded into that test application, because it allows us to benchmark.

But what that test application really does is abstract away the underlying platform and technology, because they all implement the same interface: the same interface to talk to the device, but also for the device to let us know what is happening. So that's what we call the test client. In the demo, I'll show you the HTML one, because that's the simplest. Then we have one test runner that covers all of these platforms. This test runner just implements the interface that each of these test clients also implements. It is a simple Jest JavaScript test runner; you can Google Jest, you can see how it works and what it can all do. It basically runs the tests against that test client. And then to glue the two together, we have MQTT, which is probably known to a lot of people. It's something that mostly comes from the IoT sphere: basically a message broker, which you can access through WebSockets, and which really enables bidirectional communication. That makes it super strong, because your test client can publish to the cloud everything that's happening, all the events, all the metrics, all the data, but your test runner is also able, through the MQTT broker, to target a test client and tell it: hey, start playing, pause, stop, and those kinds of things.
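That bidirectional event/action round trip can be sketched with an in-memory stand-in for the broker. A real setup would use an MQTT client library (for example Paho or MQTT.js) against RabbitMQ or HiveMQ; the topic names and message shapes below are illustrative, not THEO's actual code:

```javascript
// Minimal in-memory stand-in for the MQTT broker, just to show the pattern:
// the client subscribes to its action topic and reports on its event topic,
// while the runner does the mirror image.
class FakeBroker {
  constructor() { this.subscribers = {}; }
  subscribe(topic, handler) {
    (this.subscribers[topic] = this.subscribers[topic] || []).push(handler);
  }
  publish(topic, message) {
    (this.subscribers[topic] || []).forEach((h) => h(message));
  }
}

const broker = new FakeBroker();
const deviceId = "player-1";

// Test client side: listen for actions, report the resulting state as events.
broker.subscribe(`player/${deviceId}/action`, (msg) => {
  // The real client would call into the player here (play, pause, ...).
  broker.publish(`player/${deviceId}/event`, { type: `${msg.action}ing` });
});

// Test runner side: send an action, observe the event coming back.
const received = [];
broker.subscribe(`player/${deviceId}/event`, (msg) => received.push(msg.type));
broker.publish(`player/${deviceId}/action`, { action: "play" });
console.log(received); // → [ 'playing' ]
```

The unique device ID in the topic is what lets one runner target one specific client out of a whole rack of devices.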

The last one, and you'll see that it doesn't have a yellow box, I'll not show today because it leads us a bit too far. But the other part that we connected to MQTT is our ELK stack: Elasticsearch, Logstash, and Kibana. Why? Because through Logstash, you can just tap into the MQTT broker, take all the messages that go over it, and pump them into Elasticsearch, which is the database. And you use Kibana to make nice dashboards, which then allow you to really see, okay, what was the device doing? But also, out of the box, you get very cool test results and test dashboards that you can populate. So don't reinvent the wheel; don't start building dashboards and your own charts, et cetera. We use ELK. So, a very simple framework. I propose we dive in, if that's okay, Benny?

Benny: I would say let's go ahead.

XAGGET Framework in Action

Michel: Let's go ahead. Let's see if I can share my screen; it's always an interesting adventure. Exciting moment. Here we go. So, as I said, it's a very simplified version, and don't expect the nicest code here. We will also share it later on; certainly if you would like to see it or know more about it, we're happy to help. I just want to show you the concepts today. Let's see what I have running locally. Let me locate Docker, because we use Docker to host all of our applications. It abstracts things away, and you're then also able to run it in a Kubernetes cluster, et cetera. It just makes everything 10 times easier. Even our test runner runs in Docker, because then we can deploy it anywhere and scale it as much as we want.

But what do I have running? I have a RabbitMQ running locally, which I use as the MQTT broker. The MQTT broker is not really the topic today: a simple Docker Compose spins up my RabbitMQ, and that's it. Nothing more, nothing less. Typically, you run this in the cloud, or you take a managed service from HiveMQ, for example, and that will also do the trick.

Then let's see what we have next. Next, I have my clients. My client in this case is a web application. A web application that implements the test interface.

So, what am I using? Let's have a look in my package.json: I'm using Parcel as my bundler, I'm using THEOplayer as the player, of course, because I want to have a good player, and I'm using Paho MQTT to connect to my local MQTT broker.

So, let's go one by one. My index.html: nothing fancy here, just a div to put the THEOplayer in. And I also have a canvas element. I'll not show that in the test today, but through the canvas element I'm also able to take screenshots of what the player is actually doing, which is kind of neat.

And then I have my JavaScript file. Of course, you need JavaScript, and we do everything in TypeScript.

So, what am I doing in the client? Remember, this is the client that runs on your smart TV or in a web browser, so this is the one that will be executing commands that come from the cloud. What do I have? Basically, every one of our devices gets a unique ID, because you need to be able to target one device, one client. I let it listen to two topics: one is the event topic, and the other is the action topic. I presume people can guess: the event topic is to publish everything that happens, and the action topic is to listen for what the client needs to do. Makes sense, right?

Then, license for THEOplayer. Don't copy it, because then you have a free THEOplayer.

What do I do next? I create my MQTT clients.

Then when my client is connected, I presume everybody knows how promises work.

Once the client is connected, I subscribe to my action topic, because my client of course needs to listen for what it needs to do. And I publish an event that I'm online. Remember this; I'll show you in a second.

Then I create a player.

Then I let my client listen for all the messages that come on the MQTT broker and handle them. I'll show you in a second.

But also very important, I attach listeners to the player.

And everything that happens in the player, in this case I just took a few events, I publish to MQTT. I publish them all to the broker for whoever wants to listen to them.
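That "republish every player event" step can be sketched as attaching one listener per event type and forwarding each firing to the device's event topic. The event names below follow common HTML5 media events; the publish callback and topic layout are illustrative assumptions, not the actual client code:

```javascript
// Forward a fixed set of player events to the broker, tagged with the device ID.
const FORWARDED_EVENTS = ["play", "pause", "playing", "timeupdate", "sourcechange"];

function forwardPlayerEvents(player, publish, deviceId) {
  FORWARDED_EVENTS.forEach((type) => {
    player.addEventListener(type, () =>
      publish(`player/${deviceId}/event`, { type, currentTime: player.currentTime })
    );
  });
}

// Exercise it against a fake player to show the forwarding.
const published = [];
const listeners = {};
const fakePlayer = {
  currentTime: 1.5,
  addEventListener(type, cb) { listeners[type] = cb; },
};
forwardPlayerEvents(fakePlayer, (topic, msg) => published.push({ topic, msg }), "player-1");
listeners.playing(); // simulate the player firing a "playing" event
console.log(published[0].topic, published[0].msg.type); // → player/player-1/event playing
```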

I also do the same for all the track events, where I will get notified if the quality changes and those kinds of things.

Then I want to go back, because I also had here handleMessage, which basically handles every incoming message. And this is a very simple implementation; you can do it much nicer, I know. But if an action comes in, for example to set a source or to start playing, it's handled here. Very simply, if I follow it: if the cloud tells me to play, I call player.play. Not rocket science, I would say.
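A minimal sketch of such an incoming-message handler is a thin dispatch from action messages to player API calls. The action names and the player object here are illustrative, not THEOplayer's actual API surface:

```javascript
// Map incoming action messages onto player calls.
function createMessageHandler(player) {
  return function handleMessage(message) {
    switch (message.action) {
      case "setSource":
        player.source = message.source; // load a new stream
        break;
      case "play":
        player.play();
        break;
      case "pause":
        player.pause();
        break;
      default:
        // Unknown actions are ignored, so a newer runner cannot crash an old client.
        break;
    }
  };
}

// Exercise it against a fake player to show the mapping.
const calls = [];
const fakePlayer = {
  source: null,
  play: () => calls.push("play"),
  pause: () => calls.push("pause"),
};
const handle = createMessageHandler(fakePlayer);
handle({ action: "setSource", source: "https://example.com/stream.mpd" });
handle({ action: "play" });
console.log(fakePlayer.source, calls); // → https://example.com/stream.mpd [ 'play' ]
```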

And I will show you in a second. So, if I go here, you see a very big black screen. This is my test client, basically, which currently has the player loaded, but nothing is playing. A bit further on, I have a tool which connects to my local MQTT broker. And I'll show you: if I publish, it says my client is not connected, so I should connect, of course. Look at that, I'm connected. And I will send a message to myself. Look at that: hi there, I'm connected. Yay. And you saw in my code that when the test application comes online, it publishes: hey, I'm there. So, let's have a look. I refreshed, and it tells me on the topic player with this ID, on the event topic: hey, I'm online, and this is my user agent. This we use at THEO to basically address certain devices, or to see who's online, for example.

OK, good. So, I showed you the test client and how to spy on MQTT; I'll leave this open because I'm going to use it in a second. Next, I'll show you where the magic happens: the test runner. The test runner is just a simple test tool, nothing more than that. What I did build on the plane is a player module, which basically simplifies my connection to MQTT (it's way too much code for what it should be). It just allows my test to very simply talk to that client on the opposite side. And you'll see the same topics here again: events, to listen on, and action, to publish to. Very simple.
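A remote-player module like this usually boils down to a small helper that waits for a matching message with a timeout. Here is one possible minimal sketch; the `"message"` event name follows Node's EventEmitter and MQTT.js style, and the rest is an assumption, not THEO's actual module:

```javascript
// Resolve when a message matching `predicate` arrives on the emitter,
// or reject after `timeoutMs` so a hung test fails instead of stalling.
function waitForEvent(emitter, predicate, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const onMessage = (msg) => {
      if (!predicate(msg)) return; // not the message we are waiting for
      clearTimeout(timer);
      emitter.removeListener("message", onMessage);
      resolve(msg);
    };
    const timer = setTimeout(() => {
      emitter.removeListener("message", onMessage);
      reject(new Error(`timed out after ${timeoutMs} ms`));
    }, timeoutMs);
    emitter.on("message", onMessage);
  });
}

module.exports = { waitForEvent };
```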

But then, this is where it really matters. This is my test, and it's simply Jest. You will see here that before the test starts, I connect to player one, which is the identifier of the player. Of course, when everything is finished, I disconnect; that's nothing more than good habit. What am I going to test? I'm going to reload the application, just to make sure that it's bootstrapped, and I know it's online by listening for the online event that I just showed you. Once that happens, I check that the player is not muted. So, I listen for the state of the player, then I ask for the state of the player, and once I get it, I validate that the current time is set to 0 and muted is false. I mute the player, I check that it's muted, then I load the best movie ever, Big Buck Bunny, which in this case is a DASH stream, and I wait until the player tells me: hey, I've changed source. I check that this actually happened, and then I start playing Big Buck Bunny. I let it play for three seconds, and I compare before and after. Again, this is a very simplified version: I check that my current time progressed and that I'm not paused, so I'm still playing. You could add a screenshot request here, or validate the current time more precisely, etc. But I just want to show you the concept today. And then we stop playing Big Buck Bunny.

I hope, Benny, that you're excited to see how that looks when we run it, and I hope everyone else on the call is too. Using my Mac, I can put both side by side. So now we have the big black screen, very cool, and on the right side, my test. Are you ready, Benny?

Benny: I'm ready.

Michel: Sure. There we go. So, npm run test. Here we go. There is Big Buck Bunny. And Big Buck Bunny is gone. And you'll see that Jest ran my test. The reload worked. The state was not muted; then I muted it, and it was muted. It loaded Big Buck Bunny, played it for three seconds, and then stopped. So, everything passed in my case. I hope nobody expected something to fail here, because this is, of course, the result that I wanted. Lastly, using Jest, I can also attach, for example, HTML reporters, and you'll see that it generated my very nice test report.

So, I'm now thinking: have I forgotten anything? And I just remembered that I forgot to show you this. I was still spying on my MQTT broker, and here you see all the messages that came by during my test. For example, what do we have here? Here we have a state message, which exposes the state of the player at that moment. But you also find a lot more information here. You'll see a track change in the audio, so it switched the audio track, for example. And that is also how we check: is the right quality playing? So, if I remove this window from my split view again, I can show you one example.

So, I have an example here, which is a track change. This also comes over MQTT, and I'm able to run my validations against it. This is a video track change, and what it basically tells me is that the active quality changed. The player in this case is playing at this bandwidth, this codec, this frame rate, etc., and these are all the other available qualities. And that's basically how we validate at THEO, to give you an idea: here, too, we check whether the right quality is playing, and we do that based on the events.
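As an illustration of that kind of validation, a check that the active rendition is the best one allowed under a bandwidth cap might look like the sketch below. The field names (`activeQuality`, `qualities`, `bandwidth`) are assumptions modeled on the event Michel shows, not THEOplayer's exact API:

```javascript
// Given a video-track-change payload, verify the player is on the highest
// rendition whose bandwidth fits under the given cap.
function isBestAllowedQualityActive(evt, maxBandwidth) {
  // All renditions that fit under the bandwidth cap
  const allowed = evt.qualities.filter((q) => q.bandwidth <= maxBandwidth);
  if (allowed.length === 0) return false; // nothing fits the cap
  const best = allowed.reduce((a, b) => (a.bandwidth >= b.bandwidth ? a : b));
  return evt.activeQuality.bandwidth === best.bandwidth;
}

module.exports = { isBestAllowedQualityActive };
```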

Next to that (and this is the test setup that you're seeing), in all honesty, we do also have cameras pointed at the TVs, just to be able to record and check afterwards whether everything was okay. But we don't rely on them for the verdicts, because we really want super stable test results, so that our developers can trust what comes out and stay motivated. I think that's what I wanted to show. Did it make sense, Benny?

Benny: For me, it made sense.

Benchmarking MQTT vs. Hardware Remote Control Units

Benny: And I think there was an interesting question that touches on this a bit, and I'll partially answer it already. One of the listeners asked whether the testing code is included in the repository; more specifically, could it be that the development branch is testing other things, other streams, than the production branch in these nightly runs?

Michel: A super cool question, to be honest. When I joined THEO, I was flabbergasted that there was a concept of a monorepo. Me being a microservices adept, I couldn't stand the monorepo, but in the end, it does make sense. In the beginning, all of these tests were in a separate repo, but that still creates a gap between your developers and your testers. To really get ownership with your development team, we afterwards changed that, and we put the test code together with the player code. For the simple reason that, A) the developers take ownership, because they work in the same code base, but B) your tests are also properly versioned together with the player. So, if you're testing develop and you have a new feature, that feature's tests are automatically included there and are not yet running on the release branch or on master.

How do we make sure that we are not testing something else on develop than on release? We, of course, have our regular development practices. Every pull request is reviewed by three people. So, there is a pretty diligent check there that nothing is changed that should not be changed. I hope I answered the question a bit.

Benny: Thank you, Michel. I see the thumbs up, so that's a good sign. I see more questions are popping up. I'll probably continue a bit now with the slides and then come back to these other questions at the end in the Q&A session.

Michel: Can I maybe take one more, Benny, because I find a few interesting.

Benny: Of course.

Michel: There is a question whether we benchmarked MQTT against hardware remote control units. That's a very interesting one. At THEO, we did do that, just to validate that everything maps correctly. In previous experiences we used it as well, and there we did the same benchmarking. You might think this setup is only nice for a player, something easy and simple, but in a previous experience we used it in a React Native application. There we implemented something similar: through MQTT you could navigate through the application and simulate going from the TV guide back to the home screen, and so on, which worked exactly the same. At that moment, and also in an experience before that, we did have both side by side, so you could switch between, okay, now I want to run my tests with an IR-controlled remote, now I want to use my own proprietary interface, just to make sure that from time to time you benchmark it. But typically, if you go through an IR- or RF-controlled remote, you're back to driving the test step by step through the UI. So, we do benchmark that everything works correctly, but we rely on MQTT for the majority of the tests. Sorry, Benny, I stole your thunder a bit.

Benny: No worries, no worries.

Michel: I'm going back in time. Let's go forward in time.

Benny: That works better, yeah. All right. So, thank you very much, Michel. At least I always find these things very cool. The more technical, the better. So, I hope all our listeners enjoyed it. Again, if you have more specific questions on this, technical things, feel free to use the question section or just hit us up on LinkedIn or something afterwards. We'll be happy to talk through.

Unified User Experience

Benny: Now, let's talk about the value and business impact of all of this. First, I want to go back to the first slide, where we showed the waterfall and discussed how making every release big causes releases to become very slow and prone to bugs. So, what's the value of having such an automated solution?

So, first of all, there is a unified user experience. With the solution we have here at THEO, we make sure that our customers are relieved of any video-related concerns. They can focus on delivering their brand experience, the important parts: making sure the app functions correctly from a functional point of view, and that the UIs look the same across the different platforms. We take away the video concern.

Next to that, your releases can be done more quickly and more frequently, because we take part of that video concern out of your hands.

And then, thirdly, in-field testing and customer support. We work, for instance, with customers that say: okay, we have a certain set of devices that we need to support, for instance quite old connected TVs, 2018 models, which are challenging to support. So, we include those in our test setups, and we also include, for instance, streams from our customers. All of that is automatically part of our test set, and every night these tests are run as well. If something goes wrong there, on an older model of TV for instance, we know instantly and act on it even before it reaches any of our customers.

So, those are the three main values we deliver with this solution.

Key Takeaways

Benny: Key takeaways, maybe a bit of a summary of it all. Michel?

Michel: Yes. And I think what's really important: we are not a test automation company, not at all. As THEO, we put what we build at the service of our customers, of course. And if you're interested in building something like this, we're happy to help, happy to discuss, happy to guide.

But I think the key takeaway today, and let me emphasize this: in an OTT application (and that's also why we called the webinar not just testing video playback, but testing your OTT application), the video playback component is a big part of it. I heard from one of our recent customers that about 80% of the time is spent in the video player. So having that 80% already battle tested across all of the platforms and devices is a big win. And typically, the difference between devices shows up in the video player experience, not in how you display your metadata or how the user flow works. So, if you take that away, and you have a partner that helps you with it or assures it is done well, that's already a very big part of your release cycle that you simplify.

What I also spoke about in New York: focus on what sets you apart. And what sets you apart is your unique core business, the things that only you can do. Your fans, your viewers, that's where you want your focus to be. That's also where you want your people, your very specialized developers, to put their minds and their effort. Not on testing a video player.

Testing tools and devices are, honestly, not the point. I showed you one tool; I could build 10 other tools, and you could use 20 others. That's not going to make the difference. The difference is made by putting the right people at the right part of the stack, knowing what you're testing, knowing what you should expect, and, next to that, looking at it every day and staying on top of what you're doing. So, what you get with us is a good tool, yes, but it's the knowledge from video experts that makes the difference.

Taking away the complexity here will shorten the release cycles, because you won't have to go to a gazillion labs and external partners for testing. And as I said already, we're happy to share our experience around video playback. We're happy to talk about it at upcoming events, wherever we meet, or on LinkedIn. Just reach out, happy to discuss.


Benny: Thank you, Michel. That was really insightful. Questions are pouring in, which is great, so let's get started. Let me take the first one myself. There's a question about how many full-time people we have maintaining this testing infrastructure, next to the actual testing. Putting a number on it is hard, because everybody from the development teams contributes. As Michel alluded to earlier, we actually make the development teams themselves responsible.

Of course, we all know that next to development, you also need people that focus on the QA part: people dedicated to, for instance, looking at the reports on a daily basis, but also on a bi-weekly basis, before we release a new sprint. People look into the reports, and if something goes wrong, they give feedback to the teams and make sure things are set in motion to either fix it or investigate why something turned red.

So, it's hard to put a number on it. But on average, we have a few people looking into the results. Their main responsibility is to make sure feedback reaches the teams and to keep an overall guard on the quality that comes out of our solution. But I want to stress that we want to make the developers themselves responsible for maintaining this system and making sure it's updated.

Michel: Let me maybe add a few things to that. I showed you a very minimalistic framework, but of course, things get built around it over time: tools to throttle the network, tools to generate streams, the devices here, and tooling to bootstrap the devices. All of that costs effort to build, but once it's built, we typically don't have a lot of maintenance on it. If you really want to put a number on it (I'm pointing over there because our devices are over there), then for maintaining all the devices I would say we have one person, not even full time, who just maintains the infrastructure. The actual tests, though, that's the developers, that's the development team. And through the choice for test depth over ever-wider coverage, we also know that when something is flagged red, we won't waste effort, because it's a genuine failure.

Benny: All right, thanks, Michel. The next one is for you. The question is whether we perform testing on production as well, by using MQTT to report issues, for example monitoring and collecting anomalies or predictable issues like buffering events. And if yes, how do we manage the log data streams? Are we using ELK for that as well?

Michel: I love the question. It's something we completely forgot during the webinar, to be honest. Through MQTT we get a very nice advantage: we can test anywhere in the world, on any device. And actually, we've done that. Customers overseas, customers somewhere else in Europe, whatever device was showing issues we could not reproduce here, we asked them to put our client on it. That even happened at an end customer of our customer, very weird, but then we're also able to test in those conditions. And over time (I've not shown this during the demo, although it was in there), we've grown the testing tool from just testing into diagnostics and troubleshooting. For example, overriding fetch in your browser gives you access to all the network traffic that is happening. Getting screenshots is also very insightful. Using that, we're able not only to abstract away technology like an old WebKit, which comes with ancient dev tools, because we now do it with our own tool, but also to test anywhere in the world. And yes, indeed, we also use it in production. We have played with the idea of taking this even a step further: really using the data we get from this system during a field trial of our customers and doing something with it.
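As a rough illustration of the fetch-override idea mentioned above, a wrapper around `fetch` can report every request's URL, status, and timing, for example to republish over MQTT for diagnostics. The report shape and installation snippet are assumptions, not THEO's implementation:

```javascript
// Wrap a fetch implementation so each request/response pair is reported.
function instrumentFetch(report, fetchImpl = globalThis.fetch) {
  return async function instrumentedFetch(input, init) {
    const started = Date.now();
    const response = await fetchImpl(input, init);
    // Report URL, HTTP status and rough timing for each request
    report({ url: String(input), status: response.status, ms: Date.now() - started });
    return response;
  };
}

// In a browser test client you might install it with something like:
//   window.fetch = instrumentFetch((entry) =>
//     client.publish(eventTopic, JSON.stringify(entry)));

module.exports = { instrumentFetch };
```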

Those ideas have not yet fully developed, and to be honest, I'm very excited about initiatives like CMCD. If you, for example, add CMCD into your test stack, you can validate that the right data is being published by the client, but you can also learn how that data behaves, so that you recognize patterns. That means that in production, too, you're much more swiftly able to use that data to see what is going on. So yes, it brings us very close to bridging the gap between just seeing that something goes wrong in production and actually getting to the root cause. We use this in a controlled environment, but we're not yet at the point of offering it as a solution to our customers. That might happen someday, but right now I would say: combine it with something like CMCD or your own analytics solution, and benchmark from there what is happening. We do that with analytics vendors as well: we benchmark through this system how the player behaves and make sure it's correctly reflected there.

So that would be my answer there.

Benny: Yeah, thanks, Michel. The next one I will pick myself again. The question is whether we perform any manual testing next to this. Also a really interesting question, of course. The automated part that we just discussed focuses very much on the core functionality of the player. Next to that, we have tons of integrations with analytics and ads, but also player features like picture-in-picture and casting, and there is of course more complexity in covering those in an automated way. Some parts are covered in EVE, but other parts are tested manually within each sprint. The teams are responsible for covering that and making sure all the features and integrations are working. For instance, we have a React Native SDK, which has really gained a lot of traction in the last two years. There the team is responsible: are all the connectors, as we call them, and all the features still working as expected? Can we still do casting? Is that still working across all the platforms? Can we build the builds? Is nothing failing, is everything turning green? That's all the responsibility of the development team. So yes, there is still a fair portion of manual testing involved.

Michel: I see a very interesting question, Benny: whether we have a set of test assets that intentionally provide a bad user experience. It's a very interesting question. To be honest, I want to go back in time. Four years ago, I had a very big QA team at THEO, very smart and very good people. They were testing with test assets they found on the internet, which then, of course, all of a sudden go down, and then is it the player or is it the stream? Right now, we have test assets from our customers, but we also have test assets of our own.

Now, creating test assets can be tricky, because you want to play with codecs, with timescales, with frame rates, with discontinuities, with different MPEG-DASH periods to simulate SSAI conditions. So we built tooling there as well; you need to be a bit handy and know what you're doing. For example, we built tooling that allows us to transform an HLS or DASH VOD asset into a live stream, which gives us both VOD and live, and that allows us to generate a VOD with, for example, FFmpeg and Shaka Packager under all different conditions. And then you can indeed create streams that cause a bad user experience. You can delete segments, for example, or put segments of the wrong quality there. That's something we actually did a lot. If you joined one of our webinars in the past, that's also how we uncover the black box which a lot of smart TVs are. For many smart TVs, you cannot call the person that built them, because they're no longer there, the devices are so old. So, a lot of the compatibility that we built into THEOplayer is actually thanks to this tooling, because we were able to feed in different variations of a stream and see what actually makes the difference.
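As a small illustration of deliberately breaking a stream, the sketch below removes one segment from an HLS media playlist, simulating the missing-segment condition Michel mentions. The playlist handling is heavily simplified (it only treats non-comment lines as segment URIs) and is not THEO's actual tooling:

```javascript
// Drop the n-th segment (its URI line plus the preceding #EXTINF line)
// from an HLS media playlist, to produce an intentionally broken asset.
function dropSegment(playlistText, n) {
  const lines = playlistText.split("\n");
  const out = [];
  let seg = 0;
  for (const line of lines) {
    // Non-comment, non-empty lines are treated as segment URIs here
    if (!line.startsWith("#") && line.trim() !== "") {
      if (seg++ === n) {
        // Also remove the #EXTINF duration line belonging to this segment
        if (out[out.length - 1]?.startsWith("#EXTINF")) out.pop();
        continue;
      }
    }
    out.push(line);
  }
  return out.join("\n");
}

module.exports = { dropSegment };
```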

Very long answer to just say yes. I should just learn to say yes, but that's the answer.

Benny: No, I think that's great, Michel. We always love to provide more details and insights. All right. I see the person who asked the question is really happy with the answer, so job well done, Michel.

Michel: I see another one. I want to do another one. I see a question if we use it.

Benny: We still have one minute.

Michel: Do we use tools like Sentry in development or in production? No. We love to be a partner of our customers, but we cannot put things into the application of our customer that the customer is not aware of. Things like GDPR and privacy become a little bit tricky there, so we cannot do that. In my previous experiences, to be honest, we heavily connected field trials to this: you test a lot in-house with this automation, you pretty quickly go into a field trial with friendly customers who give you valuable feedback, and you put the right monitoring tools in place to get the data out that you need. But as a partner of our customers, that's not something we can do. And that's why we keep saying during this webinar: talk to us, we're happy to help, we're happy to share experience, because that's really necessary. It's a really important step.

Closing Remarks

Benny: Yeah. All right. Thanks, Michel. So, I see we hit the hour mark. Time flies.

Benny: So, before we call it a day, and talking about time flying: NAB is around the corner. I can't believe it, but we're already leaving 10 days from now, Michel. Both Michel and myself will be there. If you're there, let us know and come by our booth. We can continue to talk about test automation, talk about video in general, grab a coffee together, or, if you want to meet in the evening for a beer, that's also possible. I see a question whether we have some free tickets for NAB to give away; I fear that won't be possible, I would have to check with our marketing team. But anyhow, if you're there, definitely let us know. What did you say, Michel?

Michel: We can always discuss, we can always negotiate.

This is a cool one, Benny.

Benny: I give the last word to you, Michel. I really enjoyed this. Thank you all. Go ahead, Michel.

Michel: Can I?

Benny: Yes.

Michel: Really? So, let's stay in touch for sure. If you have questions, reach out on LinkedIn, by email, whatever, contact us. Happy, happy, happy to discuss, even if it's not video player related; we have the experience and we're happy to help. There is a next super cool webinar coming up around advertising. I'm very excited about monetization and about how things can be done differently; it's one of the focus areas on our roadmap as well. So certainly join that webinar. That's going to be a killer one, even more than this one. Not that this one wasn't, but that's really going to be a very interesting one. So, I'd like to invite you all to join our next webinar around monetization, because next to focusing on your brand, you also want to make money. Monetization is a pretty important topic. So again, please join!

Benny: All right. Thank you, Michel. I'm already looking forward to that one. One more big thank you to all our listeners. I hope you all enjoyed it, and I hope we can meet up later; just let us know if you want to discuss. Have a great day, everybody. See you next time. Bye-bye.


The Team Behind the Webinar

Michel Roofthooft
VP Engineering
Benny Tepfer
Technical Customer Success Manager

Want to deliver high-quality online video experiences to your viewers, efficiently?

We’d love to talk about how we can help you with your video player, low latency live delivery and advertisement needs.