What is Video Conferencing: Architecture, Protocols and Common Challenges

Video conferencing is a great way to communicate with your colleagues, partners, and clients. Online meetings are more engaging than traditional phone calls, email conversations or instant messages and can really boost your team's productivity. However, video meetings are also more demanding: they place higher requirements on both your video conferencing endpoints and your communication channels.

What Affects the Quality of Your Video Conferences?

Bandwidth is possibly the most important asset for a successful video conference, and we are used to judging video conferencing quality by bandwidth alone. However, this is not the whole picture. The connection speed can change rapidly during a meeting, dropping or shifting depending on the transmission mode, whereas a video conference needs data streams that are stable, smooth and predictable.

A video conferencing system can easily adjust the bandwidth it uses from 64 kbit/s to 4 Mbit/s depending on the conferencing mode and the signal quality of the participants. It is much more difficult to adapt the stream to the constantly changing network conditions of each conference participant.
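As a rough illustration of this kind of adjustment, the sketch below smooths noisy bandwidth samples and picks a target bitrate inside the 64 kbit/s to 4 Mbit/s range mentioned above. The smoothing factor, the 80% headroom and all function names are assumptions made for the example; they do not describe how any particular product works.

```python
MIN_KBPS = 64     # lower bound quoted in the text
MAX_KBPS = 4000   # upper bound quoted in the text (4 Mbit/s)


def smooth(previous_kbps, sample_kbps, alpha=0.2):
    """Exponentially weighted moving average of bandwidth samples.

    alpha is an assumed smoothing factor: higher values react faster
    but make the estimate less stable.
    """
    return (1 - alpha) * previous_kbps + alpha * sample_kbps


def target_bitrate(estimate_kbps, headroom=0.8):
    """Use a fraction of the estimate, clamped to the supported range."""
    wanted = estimate_kbps * headroom
    return int(max(MIN_KBPS, min(MAX_KBPS, wanted)))


def adapt(samples_kbps, start_kbps=1000.0):
    """Return the target bitrate chosen after each bandwidth sample."""
    estimate = start_kbps
    targets = []
    for sample in samples_kbps:
        estimate = smooth(estimate, sample)
        targets.append(target_bitrate(estimate))
    return targets


if __name__ == "__main__":
    # The measured bandwidth drops from about 1 Mbit/s to 500 kbit/s.
    print(adapt([1000, 500, 500, 500]))
```

Smoothing is what keeps the stream predictable: a single bad sample lowers the target only slightly, while a sustained drop pulls it down step by step.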

Video conferencing architecture and its ability to operate under constantly changing conditions play the key role in ensuring video conferencing quality. Here are some common video conferencing challenges that may negatively affect your meetings:

  • CPU power of your endpoint. During video conferences, users might simultaneously run resource-intensive tasks that load the endpoint's CPU.
  • How video is captured by your endpoint's camera. Even high-resolution cameras can produce a grainy picture in low-light conditions.
  • How the video conference is displayed on your endpoint's screen. For example, if a user exits full-screen mode, there is no need to send him or her high-resolution video.
  • Channel bandwidth between your video conferencing server and conference participants. This is the most common issue. For example, a colleague starts downloading large amounts of data and reduces the network resources available for the video meeting. Or you are running a video conference on your smartphone and enter a very crowded space, where your provider is unable to maintain the same speed and connection quality.

How can you prevent the most common video conferencing challenges? The simplest, and the most expensive, solution is to impose fixed requirements on both the hardware and the network resources of your video conferencing system.

Fortunately, science and technology are evolving fast, and modern video conferencing systems can maintain good connection quality under changing conditions thanks to their advanced software architecture.

Video Conferencing Architecture

In any group video conference, data has to be transmitted between the participants in some way. Direct connections between conference participants are rarely practical because of the common video conferencing challenges listed above, so we need a system that supports a star topology and acts as an intermediary, i.e. a video conferencing server.

All solutions were previously divided into two categories: software and hardware. This approach has been considered outdated since 2014. First, the clear separation between hardware and software solutions has simply disappeared: there are hardware systems that use a typical software architecture (switching and SVC) and software systems with a built-in MCU. Second, leading video conferencing vendors tend to deliver their video infrastructure as software in a virtualized environment.

Mixing-Based Video Conferencing Architecture (MCU)

During a video conference, the server receives a stream from each participant, decodes it and reduces its resolution, composes a new image of the required quality and resolution for each participant (adjusted for the common video conferencing challenges), encodes the resulting stream and sends it. All these stages require massive computational power, add processing delay on the server and might impair video quality as a result of recompression. The scalability of such an architecture is extremely low, even taking virtualization into account, so the price of this infrastructure is extremely high and hard to justify.
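To make the cost concrete, the sketch below counts the codec operations in a mixing-based conference as described above. It assumes that every participant receives an individually composed layout, which is the worst case; the function name and the returned fields are hypothetical and only serve the illustration.

```python
def mcu_work(participants):
    """Count codec operations per conference for a mixing server (MCU).

    Assumption: the server decodes every incoming stream once and
    encodes one individually composed stream per participant.
    """
    return {
        "server_decodes": participants,
        "server_encodes": participants,
        "server_out_streams": participants,
        "endpoint_decodes": 1,  # each endpoint receives a single mixed stream
    }


if __name__ == "__main__":
    for n in (3, 10, 50):
        print(n, mcu_work(n))
```

The endpoint side stays cheap because it only ever decodes one stream, while the server's decode and encode work grows with every participant added.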

Multiplexing-Based Video Conferencing Architecture (Switching)

A classic example of this architecture is a software video conferencing system, such as Skype. Unlike an MCU, the video conferencing server does not recompress the video; instead it creates copies of the incoming streams and sends them to the other participants "as is". Thus, each endpoint receives several streams in full quality and cannot display them all simultaneously in their original resolution. The endpoint has to either reduce the resolution of each incoming video stream on its side or ask the sender to reduce it before sending, which affects video quality for all other participants.

This approach has one particular advantage: the infrastructure is not resource-demanding, and even an ordinary PC can run hundreds of such conferences simultaneously. However, the disadvantages outweigh it: an endpoint (usually an ordinary PC) has to decode several streams simultaneously, and the video server requires several times more outgoing channel bandwidth to send all the copies of the streams it creates.

Under real conditions, we get a system that can hardly hold a video conference with more than 3 participants and that impairs video quality for everyone when a mobile device is unable to process the original video quality it receives from other participants.
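The bandwidth side of this trade-off is simple arithmetic, sketched below. The example assumes that every participant sends one stream at the same bitrate and receives a copy of every other participant's stream; the function name and field names are made up for the illustration.

```python
def switching_traffic(participants, stream_kbps):
    """Traffic and decode load for a switching (multiplexing) server.

    Assumption: each participant sends one stream of stream_kbps and
    receives an unmodified copy of every other participant's stream.
    """
    others = participants - 1
    return {
        "server_in_kbps": participants * stream_kbps,
        "server_out_kbps": participants * others * stream_kbps,
        "endpoint_down_kbps": others * stream_kbps,
        "endpoint_decodes": others,
    }


if __name__ == "__main__":
    # Four participants, each sending 1 Mbit/s.
    print(switching_traffic(4, 1000))
```

With four participants at 1 Mbit/s each, the server takes in 4 Mbit/s but has to send 12 Mbit/s, and every endpoint downloads and decodes three full streams. The outgoing traffic grows roughly with the square of the number of participants.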

SVC-Based Video Conferencing Architecture

This type of architecture offers the advantages of the mixing approach while avoiding the drawbacks of multiplexing-based systems. It is affordable and easily scalable, and it runs on any platform thanks to advanced signal processing and data compression technologies.

Here's what SVC-based architecture does: an endpoint compresses its video stream in layers, and each additional layer adds video resolution, quality and frame rate. If the channel between an endpoint and the video conferencing server provides high bandwidth, the endpoint sends the maximum number of layers. An SVC stream differs in bandwidth from an equivalent non-SVC stream by only 15-20%, and it requires much less bandwidth than the switching approach.
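The layered sending described above can be sketched as follows. The three-layer ladder, its resolutions and its bitrates are invented for the example and are not taken from any codec specification or product; the only rule the sketch encodes is that the base layer is always sent and enhancement layers are added while the uplink has room.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    index: int   # 0 is the base layer
    height: int  # vertical resolution reached with this layer
    fps: int     # frame rate reached with this layer
    kbps: int    # extra bitrate this layer adds


# Hypothetical ladder used only for this example.
LADDER = [
    Layer(0, 180, 15, 150),
    Layer(1, 360, 30, 350),
    Layer(2, 720, 30, 1000),
]


def layers_to_send(uplink_kbps, ladder=LADDER) -> List[Layer]:
    """Pick the base layer plus as many enhancement layers as fit."""
    chosen = []
    used = 0
    for layer in ladder:
        # The base layer is always sent; stop at the first layer that
        # would exceed the uplink, because higher layers depend on it.
        if chosen and used + layer.kbps > uplink_kbps:
            break
        chosen.append(layer)
        used += layer.kbps
    return chosen


if __name__ == "__main__":
    for uplink in (100, 600, 2000):
        print(uplink, [layer.height for layer in layers_to_send(uplink)])
```

Layers are taken strictly in order because an enhancement layer cannot be decoded without the layers beneath it.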

After receiving a layered SVC stream, the video conferencing server cuts off excess layers without transcoding, simply by discarding the corresponding data packets. In this way it creates an individual set of streams for each participant of a group video conference on the fly, in accordance with that participant's actual connection conditions, available resources, requested layout, screen resolution and so on. This in turn makes the conference highly resilient.
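A minimal sketch of this server-side step is shown below. It assumes that every packet is tagged with the index of the layer it belongs to and that the server knows the cumulative bitrate of each layer; the `Packet` type and both function names are hypothetical. The point of the example is that forwarding is a filter: payloads are never decoded or re-encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    layer: int      # 0 is the base layer
    payload: bytes  # encoded video data, never touched by the server


def max_layer_for(downlink_kbps, cumulative_kbps):
    """Highest layer whose cumulative bitrate fits the downlink.

    cumulative_kbps[i] is the assumed total bitrate of layers 0..i and
    must be increasing. The base layer is always forwarded.
    """
    best = 0
    for index, needed in enumerate(cumulative_kbps):
        if needed <= downlink_kbps:
            best = index
    return best


def forward(packets, max_layer):
    """Drop packets above max_layer and pass the rest through unchanged."""
    return [p for p in packets if p.layer <= max_layer]


if __name__ == "__main__":
    stream = [Packet(0, b"base"), Packet(1, b"mid"), Packet(2, b"top")]
    cumulative = [150, 500, 1500]  # hypothetical bitrates in kbit/s
    for downlink in (200, 600, 3000):
        limit = max_layer_for(downlink, cumulative)
        print(downlink, [p.layer for p in forward(stream, limit)])
```

Because the decision is made per participant, the same incoming stream can be forwarded in full to a desktop on a fast link and cut down to the base layer for a phone on a congested one.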

The Use of Advanced Protocols and Codecs

Standard data transfer protocols are used to hold video conferences between software systems and hardware endpoints from third-party manufacturers.

Video Conferencing Protocols
  • H.239 is a communication standard that supports two media streams from different sources. It is used for video conferences where the picture is displayed on two different screens (e.g. two screens in a meeting room, one displaying the speaker and the other displaying the shared content).
  • H.323 is a standard for multimedia communication over networks that do not guarantee bandwidth, applied in both personal and group video conferences.
  • SIP is a signaling protocol for setting up sessions between client applications from different vendors. SIP has largely superseded H.323 and is used in video conferencing and IP telephony.
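To give a feel for what SIP signaling looks like on the wire, the sketch below assembles a bare INVITE request in the text format defined by RFC 3261. The addresses and host name are placeholders, the helper name is invented, and the session description (SDP) that a real call would carry in the message body is left out, so this is an illustration of the message shape rather than a working client.

```python
import uuid


def build_invite(caller, callee, local_host):
    """Assemble a minimal SIP INVITE with the headers RFC 3261 requires.

    The body is empty here; a real INVITE normally carries an SDP
    description of the offered audio and video streams.
    """
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]  # prefix required by RFC 3261
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag={uuid.uuid4().hex[:8]}",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",
    ]
    # SIP uses CRLF line endings and a blank line after the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    # Placeholder addresses for illustration only.
    print(build_invite("alice@example.com", "bob@example.com", "pc1.example.com"))
```

Because the message is plain text with well-known headers, endpoints from different vendors can parse each other's requests, which is what makes SIP suitable for interoperability.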

Compression and playback of video and audio during a video session is carried out through the use of audio and video codecs.

Audio and video codecs
  • H.264 is a video compression standard that provides a high compression level while keeping the picture close to the original quality.
  • H.264 Scalable Video Coding (SVC) is an extension of H.264 that encodes video as several layers and can compensate for missing data. It is resistant to network errors, e.g. packet loss.
  • H.265 is a video compression standard that features more efficient encoding algorithms than H.264, so it needs less bitrate for comparable quality. This standard supports UltraHD formats: 4K and 8K.
  • Opus is an audio compression codec with exceptional performance that can adapt its bitrate to changes in the internet connection during a session.
  • G.722.1 Annex C is a super-wideband (14 kHz) audio compression standard.
  • VP8 is a video codec with high-speed decoding and increased resistance to frame loss.
  • VP9 is an open, royalty-free video compression format developed as the successor to VP8. Compared with VP8, the developers' main objective was to cut the bitrate by 50% without losing video quality. Compared with H.265, the main goal was to achieve better video stream compression efficiency.

Summary

Ready to give video meetings a shot? Try TrueConf Server, which meets all necessary requirements, integrates with your workflow, easily scales and solves all video conferencing challenges. Learn more here.