MCU (video conferencing architecture)

MCU (Multipoint Control Unit) is a video conferencing architecture in which media streams are processed on the server. After collecting the primary video streams from all endpoints, the server performs the following steps separately for each endpoint:

  1. Combines thumbnail videos of the participants into a single video stream, using the layout requested by the endpoint.
  2. Encodes the video stream at a quality that fits the bandwidth currently available between the MCU and that endpoint.
  3. Sends the resulting video stream to the endpoint.
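
The three per-endpoint steps can be sketched roughly as follows. This is only an illustration and not how any particular MCU is implemented: the names `Endpoint`, `compose`, `pick_bitrate`, `mcu_tick` and the bitrate ladder are hypothetical, the "video" is represented by plain strings, and the sketch assumes that a participant does not receive their own thumbnail.

```python
from dataclasses import dataclass

# Hypothetical bitrate ladder in kbit/s; a real MCU uses its own encoder settings.
BITRATE_LADDER_KBPS = [256, 512, 1024, 2048]


@dataclass
class Endpoint:
    name: str
    layout: str          # layout requested by the endpoint, e.g. "grid"
    bandwidth_kbps: int  # current estimate of MCU-to-endpoint bandwidth


def compose(streams: dict, viewer: Endpoint) -> list:
    """Step 1: arrange thumbnails in the viewer's layout.

    Assumption: the viewer's own video is left out of the mix.
    """
    return [f"{viewer.layout}:{src}" for src in sorted(streams) if src != viewer.name]


def pick_bitrate(bandwidth_kbps: int) -> int:
    """Step 2: highest ladder rung that fits the bandwidth, else the lowest rung."""
    fitting = [b for b in BITRATE_LADDER_KBPS if b <= bandwidth_kbps]
    return max(fitting) if fitting else BITRATE_LADDER_KBPS[0]


def mcu_tick(streams: dict, endpoints: list) -> dict:
    """Run steps 1-3 once, separately for every endpoint."""
    outgoing = {}
    for ep in endpoints:
        tiles = compose(streams, ep)
        bitrate = pick_bitrate(ep.bandwidth_kbps)
        # Step 3: exactly one mixed stream is sent to each endpoint.
        outgoing[ep.name] = {"tiles": tiles, "bitrate_kbps": bitrate}
    return outgoing
```

Note that the mixing and encoding work is repeated for every endpoint, which is why the server load grows with each participant.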

This makes it possible to adjust the number of conference participants an endpoint displays to the limits of its equipment and communication channel. In theory, an MCU scales more efficiently than an SFU (Selective Forwarding Unit), because each user always sends and receives only one stream. In practice, however, an MCU-based system requires a lot of computing power and does not scale well, even when possible virtualization scenarios are taken into account. Connecting new users to such a system is also expensive, because on average each new user needs a separate logical core of the server CPU.

Advantages of the MCU architecture

  • Does not require high bandwidth on the client side. The bitrate depends on the amount of data sent and received in the single stream, not on the number of participants.
  • The client connects to the media server, not directly to other participants.
  • Server-side recording is available.
  • The server sends a single media stream, which makes it possible to join conferences from low-powered devices (smartphones or tablets).
  • Devices using the H.323 or SIP protocols can participate in conferences.

Disadvantages of the MCU architecture

  • An expensive server is required to mix multiple media streams into one.
  • The number of participants in a conference directly depends on the performance of the media server, so in practice, this architecture rarely allows conferences with more than thirty participants.
  • The user cannot control the video layout or disable video reception from a specific participant.
  • In order to differentiate participants in video layout, each participant’s video is labeled, which might negatively affect video quality.

Comparison: 16 participants, on the terminal

                            MCU     SFU
Outgoing streams            1       1
Incoming streams            1       15
Outgoing channel, Mb/s      1.0     1.2
Incoming channel, Mb/s      1.0     1.2
CPU load                    20%     70%

Comparison: 16 participants, on the server

                            MCU     SFU
Outgoing streams            16      240
Incoming streams            16      16
Outgoing channel, Mb/s      16.0    19.2
Incoming channel, Mb/s      16.0    19.2
vCPU load                   1600%   0%
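
The server-side stream counts follow from simple counting: an MCU sends one mixed stream to each participant, while an SFU forwards every participant's stream to every other participant. The sketch below reproduces that arithmetic. The function name `server_load` is hypothetical, the MCU CPU figure assumes one fully loaded logical core per participant (as stated earlier in this article), and the SFU figure of 0% simply mirrors the idealized value in the comparison, since an SFU does no mixing or transcoding.

```python
def server_load(participants: int) -> dict:
    """Estimate server-side stream counts and vCPU load for MCU and SFU."""
    n = participants
    mcu = {
        "incoming_streams": n,         # one stream from each participant
        "outgoing_streams": n,         # one mixed stream to each participant
        "vcpu_load_percent": 100 * n,  # assumption: one logical core per participant
    }
    sfu = {
        "incoming_streams": n,             # one stream from each participant
        "outgoing_streams": n * (n - 1),   # each stream forwarded to all others
        "vcpu_load_percent": 0,            # idealized: forwarding only, no mixing
    }
    return {"MCU": mcu, "SFU": sfu}


if __name__ == "__main__":
    for name, load in server_load(16).items():
        print(name, load)
```

For 16 participants this gives 16 outgoing streams for the MCU and 240 for the SFU, which shows how SFU server traffic grows quadratically while MCU cost grows in CPU instead.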

For a more in-depth understanding of how this and other video conferencing architectures work, we recommend watching this video:

You can also learn more about other video conferencing architecture types on our website.