HLS to MP4 using ffmpeg in a browser

Hello! For more than two months I have been spending my free time building a web application that converts HLS and DASH to MP4 using emscripten and ffmpeg, and I want to share how I managed it.

In this article I will not show the source code of my ffmpeg edits and patches: most of them were hacked together quickly, and I'm not very good at C. There are already enough articles on that topic to help you.

Introduction


Two years ago I set out to combine an audio track and a video track into a single mp4 file. That was when I first dove into emscripten, and as a starting point I found the ffmpeg.js repository, from which I learned a lot. I almost reached the goal back then, even though I could barely find my way around C.
After studying the ffmpeg source code, I made a patch for its file system layer: a read from a file called an asynchronous js function that read the file's blob and passed the buffer back, and a write called a js function that sent the buffer data to storage.
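
The read side of such a patch might look roughly like the sketch below. It is not the author's code: `BlobLike`, `clampRange`, and `readChunk` are hypothetical names, and the way the C side invokes the function is left out. The sketch only shows how a read request (position and length) could map onto a slice of the blob.

```typescript
// Minimal structural type so the sketch also runs outside a browser.
// A real browser Blob or File satisfies it.
interface BlobLike {
  size: number;
  slice(start: number, end: number): { arrayBuffer(): Promise<ArrayBuffer> };
}

// Clamp a read request to the bounds of the file.
// Returns [start, end); start === end means there is nothing left to read.
function clampRange(size: number, position: number, length: number): [number, number] {
  const start = Math.min(Math.max(position, 0), size);
  const end = Math.min(start + Math.max(length, 0), size);
  return [start, end];
}

// Asynchronously read up to `length` bytes starting at `position`.
async function readChunk(blob: BlobLike, position: number, length: number): Promise<Uint8Array> {
  const [start, end] = clampRange(blob.size, position, length);
  const buffer = await blob.slice(start, end).arrayBuffer();
  return new Uint8Array(buffer);
}
```

Because `readChunk` returns a promise, the wasm side has to pause until it resolves, which is exactly where the asynchronous-call problems described next come from.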

But there was a problem with the asynchronous functions that I could not solve properly. They went through asyncify (fastcomp), which misbehaved in some cases: the wasm code kept executing without waiting for the result of the js function, and everything broke. I worked around this with the EMTERPRETIFY_WHITELIST flag, which apparently moved the listed code from wasm to asm and slowed it down at the same time; on every exception I had to debug the call stack and add the broken function to the list.

With problems like these, it could not be called a working solution, so it all remained a small demo.



One and a half years later


After watching a Google Dev Summit talk about new features in WebAssembly, I went to see how emscripten was doing and saw this message:
Emscripten emits WebAssembly using the upstream LLVM wasm backend, since version 1.39.0 (October 2019), and the old fastcomp backend is deprecated

I wanted to try rebuilding my format repacker. I spent about a week googling how to fix the new compilation problems and finally put it all together. There were not many changes, but the build kept failing because of the new library linker. Already desperate to build at least something, I simply cut out the problem libraries (as it turned out, the libraries are now linked automatically and no longer need to be specified by hand).

And then the moment came when it built and ran! The problem with the asynchronous code was gone, there was no need to debug anything, and it worked as it should from the start.

So I seemed to have reached my goal, but ... a new one appeared.

Rewriting the HTTP Protocol


This idea had been on my mind for a long time. It makes it possible to download HLS or DASH: not only a finished playlist but also a live stream. And I had never seen anything like it on the Internet.

It took me about three weeks, with short breaks, to get at least something working. I knew C only superficially (with zero hands-on experience), and there were a lot of problems with pointers (it is hard to keep track of what points where, especially in someone else's code), but at last something compiled without errors. These first successes gave me even more enthusiasm to finish the idea.

Just a couple more weeks, and I finally had the first working iteration of the http protocol. That would seem to be all, right?

When the hardest part is over


At this point I had a working core and a small html form with a url input field and a start button; basically, it worked. But I still had to write an extension to bypass CORS and load data, build storage that writes data in chunks, make an interface that displays progress, and debug all of it to fix problems in different browsers. In general, the time had come to finally make it usable.

First of all, I made a userscript that acted as a proxy for ffmpeg's fetch requests to download data.

A couple of days later, an extension for Chrome and Firefox was ready; using webRequest, it collects all the hls links that the browser loads while a video is playing.
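
The link-collecting part can be sketched as a small filter. This is an illustration under assumptions, not the extension's actual code: it guesses that a playlist is recognized by the `.m3u8` extension in the URL path, and `stripQueryAndHash`, `isHlsUrl`, and `HlsLinkCollector` are hypothetical names.

```typescript
// Drop the query string and fragment so the extension check sees only the path.
function stripQueryAndHash(url: string): string {
  const cut = url.search(/[?#]/);
  return cut === -1 ? url : url.slice(0, cut);
}

// Heuristic (an assumption): a URL is an HLS playlist if its path ends in ".m3u8".
function isHlsUrl(url: string): boolean {
  return stripQueryAndHash(url).toLowerCase().endsWith(".m3u8");
}

// Remembers each playlist URL once, in the order it was first seen.
class HlsLinkCollector {
  private links: string[] = [];

  // Returns true if the URL was new and looked like an HLS playlist.
  add(url: string): boolean {
    if (!isHlsUrl(url) || this.links.indexOf(url) !== -1) return false;
    this.links.push(url);
    return true;
  }

  // Returns a copy so callers cannot modify the internal list.
  list(): string[] {
    return this.links.slice();
  }
}
```

In an extension, `add` would presumably be called with `details.url` from a webRequest listener such as `onBeforeRequest`; that wiring is omitted here.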

In Firefox, as it turned out, the extension API does not provide power management, so you cannot prevent the computer from going to sleep, alas.

The extension looks like this:



I also improved the site's page a little, added material-ui, and polished all the places that had been thrown together in a hurry.



After testing different ways of storing data, I found a number of problems:

Blob - Chrome keeps them in RAM and dumps them to disk when memory overflows, but on OSX, when memory overflows, the OS logs the user out and closes all open applications. And Firefox always kept the data in memory.

Cache storage - it works like IndexedDb, but in Chrome the blobs remain in RAM after the data is written (either a bug or a feature). So the data is written to cache storage (on disk), and when memory fills up, the same volume is also dumped to disk as blobs.

IndexedDb - works like clockwork, can store blobs, and writes data without surprises, but Firefox strictly limits the size to 2 GB.
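
Whatever the backend, writing data in chunks can be sketched as follows. The `ChunkStore` interface and both classes are hypothetical; `MemoryStore` only stands in for a real backend such as IndexedDb, whose API is asynchronous and is not shown here.

```typescript
// Anything that can persist one chunk under a sequential key.
interface ChunkStore {
  put(key: number, chunk: Uint8Array): void;
}

// In-memory stand-in used only to demonstrate the writer.
class MemoryStore implements ChunkStore {
  chunks: Uint8Array[] = [];

  put(key: number, chunk: Uint8Array): void {
    this.chunks[key] = chunk;
  }
}

// Accumulates small writes and hands them to the store in larger chunks.
// Every stored chunk holds at least `chunkSize` bytes, except possibly the last.
class ChunkedWriter {
  private store: ChunkStore;
  private chunkSize: number;
  private pending: Uint8Array[] = [];
  private pendingBytes = 0;
  private nextKey = 0;

  constructor(store: ChunkStore, chunkSize: number) {
    this.store = store;
    this.chunkSize = chunkSize;
  }

  write(data: Uint8Array): void {
    // Copy: the caller may reuse its buffer (for example a view into wasm memory).
    this.pending.push(data.slice());
    this.pendingBytes += data.length;
    if (this.pendingBytes >= this.chunkSize) this.flush();
  }

  // Join everything buffered so far into one chunk and store it.
  flush(): void {
    if (this.pendingBytes === 0) return;
    const chunk = new Uint8Array(this.pendingBytes);
    let offset = 0;
    for (const part of this.pending) {
      chunk.set(part, offset);
      offset += part.length;
    }
    this.store.put(this.nextKey, chunk);
    this.nextKey += 1;
    this.pending = [];
    this.pendingBytes = 0;
  }
}
```

A final `flush()` after the last write stores whatever is left in the buffer. Keeping chunks bounded means no single blob has to hold the whole output in memory.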

I worked on it a bit more: I managed to implement interruption of the ffmpeg process (via a pointer), figured out how to offer a choice of formats (ffprobe), and added network error handling.
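
How exactly the pointer is used is not described. One common pattern, shown here purely as an assumption, is to reserve a byte in wasm memory, have the C side poll it (for example from a callback such as ffmpeg's `AVIOInterruptCB`), and let js set that byte to request a stop. In the sketch a plain `Uint8Array` stands in for the emscripten heap (`HEAPU8`); `InterruptFlag`, `runSteps`, and the address used in the example are hypothetical.

```typescript
// Wraps one byte of a heap: non-zero means "stop requested".
class InterruptFlag {
  private heap: Uint8Array;
  private ptr: number;

  constructor(heap: Uint8Array, ptr: number) {
    if (ptr < 0 || ptr >= heap.length) throw new RangeError("pointer outside heap");
    this.heap = heap;
    this.ptr = ptr;
    this.heap[this.ptr] = 0; // start in the "keep running" state
  }

  // Called from the UI, for example by a cancel button.
  request(): void {
    this.heap[this.ptr] = 1;
  }

  // What the polling side would check between blocking operations.
  isRequested(): boolean {
    return this.heap[this.ptr] !== 0;
  }
}

// Stand-in for a long-running loop that checks the flag on every step.
// Returns the number of steps completed before the stop request was seen.
function runSteps(flag: InterruptFlag, steps: number, cancelAt: number): number {
  let done = 0;
  for (let i = 0; i < steps; i++) {
    if (i === cancelAt) flag.request(); // simulate the user pressing cancel
    if (flag.isRequested()) break;
    done++;
  }
  return done;
}
```

For example, `runSteps(new InterruptFlag(new Uint8Array(64), 16), 10, 3)` stops after three completed steps, because the flag is raised before the fourth one starts.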





And now, you can try the result yourself here

For me, this is an indispensable tool when I need to record a stream on Twitch or download a VOD. It also works with a laptop, Mixer, and any other site that broadcasts content in HLS or DASH (alas, the DASH implementation in ffmpeg is quite limited, and live streams may not download correctly because it does not take the fragment request interval into account).

Thank you for reading!

If you have questions about ffmpeg and emscripten, write to me and I will try to answer.
