A scalable WebRTC-based framework for remote video collaboration applications

Multimedia Tools and Applications (MTA)

Published August 11, 2018

Stefano Petrangeli, Dries Pauwels, Jeroen van der Hooft, Matúš Žiak, Jürgen Slowack, Tim Wauters, Filip De Turck

Remote video collaboration is common nowadays in conferencing, telehealth and remote teaching applications. To support these low-latency and interactive use cases, Real-Time Communication (RTC) solutions are generally used. WebRTC is an open-source project for real-time browser-based conferencing, developed with a peer-to-peer architecture in mind. In this peer-to-peer architecture, each sending peer needs to encode a separate, independent stream for each receiving peer participating in the remote session, which makes this approach expensive in terms of encoders and unable to scale to a large number of users. This paper proposes a WebRTC-compliant framework to solve this scalability issue without impacting the quality delivered to the remote peers. In the proposed framework, each sending peer is equipped with only a limited number of encoders, much smaller than and independent of the number of receiving peers. Consequently, each encoder transmits to a multitude of receivers at the same time, which improves scalability. A centralized node based on the Selective Forwarding Unit (SFU) principle, called the conference controller, forwards the best stream to each receiving peer based on its bandwidth conditions. Moreover, the conference controller dynamically recomputes the encoding bitrates of the sending peers to maximize the quality delivered to the receiving peers. This approach makes it possible to closely follow the long-term bandwidth variations of the receivers, even with a limited number of encoders at the sender side, and to increase the delivered video quality. An integer linear programming formulation for the bitrate recomputation problem is presented, which can be solved optimally when the number of receivers is small. An approximate, scalable method based on the K-means clustering algorithm is also proposed.
The gains brought by the proposed framework have been confirmed in both simulation and emulation, through a testbed implementation using the Google Chrome browser and the open-source Jitsi-Videobridge software. In particular, we focus on a remote collaboration scenario where the interaction among the remote participants is dominated by a single peer, as in a remote teaching scenario. When a single sending peer equipped with three encoders transmits to 28 receiving peers, the proposed framework improves the average received video bitrate by up to 15%, compared to a static solution where the encoding bitrates do not change over time. Moreover, the dynamic bitrate recomputation is more efficient than a static association in terms of encoders used at the sender side. For the same configuration, the static case needs four encoders to obtain the same received bitrate that the dynamic case obtains with three encoders.
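
The two mechanisms named in the abstract, stream forwarding by the conference controller and clustering-based bitrate recomputation, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`forward_bitrate`, `kmeans_1d`, `recompute_bitrates`) are hypothetical, and so is the rule of setting each encoder to the lowest bandwidth in its cluster. That rule is an assumption made here so that every receiver in a cluster can sustain its stream; the paper's actual objective and constraints may differ. The sketch also assumes a non-empty list of receiver bandwidth estimates in kbit/s.

```python
from typing import List


def forward_bitrate(encoder_bitrates: List[float], bandwidth: float) -> float:
    """Pick the highest encoder bitrate that fits in the receiver's bandwidth.

    Falls back to the lowest available bitrate when none fits (assumption).
    """
    feasible = [b for b in encoder_bitrates if b <= bandwidth]
    return max(feasible) if feasible else min(encoder_bitrates)


def kmeans_1d(values: List[float], k: int, iterations: int = 50) -> List[List[float]]:
    """Plain Lloyd's K-means on scalars; returns the non-empty clusters."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation: centroids at evenly spaced quantiles.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters: List[List[float]] = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in ordered:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Empty clusters keep their previous centroid.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return [c for c in clusters if c]


def recompute_bitrates(bandwidths: List[float], num_encoders: int) -> List[float]:
    """One encoding bitrate per cluster: the cluster's minimum bandwidth (assumption)."""
    return sorted(min(c) for c in kmeans_1d(bandwidths, num_encoders))


# Hypothetical receiver bandwidth estimates in kbit/s, three encoders at the sender.
bandwidths = [500, 550, 600, 1500, 1600, 1700, 3000, 3200, 3500]
bitrates = recompute_bitrates(bandwidths, num_encoders=3)
received = [forward_bitrate(bitrates, bw) for bw in bandwidths]
```

Re-running `recompute_bitrates` whenever the bandwidth estimates drift is the dynamic behaviour the abstract contrasts with a static bitrate assignment. The bandwidth values above are made up for illustration and say nothing about the reported 15% gain.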