End-to-End Encryption (E2EE)

Encryption transforms information so that it remains concealed from outsiders (unauthorized parties). This form of information protection has been known at least since Ancient Rome. For example, Julius Caesar (100-44 BCE) used a shift cipher when writing to his generals.
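The shift cipher mentioned above can be sketched in a few lines. This is only an illustration of the idea; the function name and the shift of 3 are assumptions chosen for the example, and a cipher like this offers no real protection today.

```python
import string

def caesar_shift(text: str, shift: int) -> str:
    """Shift each Latin letter by `shift` positions; leave other characters unchanged."""
    result = []
    for ch in text:
        if ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            result.append(ch)
    return "".join(result)

# The shift value plays the role of the key: the same number encrypts and decrypts.
ciphertext = caesar_shift("ATTACK AT DAWN", 3)
plaintext = caesar_shift(ciphertext, -3)
```

Because there are only 25 useful shifts, anyone can try them all, which is why modern ciphers rely on very large key spaces.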

Illustration showing a hacker intercepting data between two people communicating online, symbolizing a man-in-the-middle attack.

There are three types of encryption:

Symmetric — the same key is used to encrypt and decrypt messages (traffic).

The following encryption algorithms are widely used: AES, ChaCha20-Poly1305.

The advantage of symmetric encryption is its speed: symmetric algorithms are computationally much simpler than asymmetric ones. In addition, modern consumer CPUs include dedicated instructions (for example, AES-NI on x86 processors) that accelerate AES at the hardware level.

Asymmetric encryption uses two different keys: the public key, which is used to encrypt messages or traffic (this key is open, known to everyone, and can be sent via an unprotected channel), and the private key, which is required for decryption.

RSA is a widely used asymmetric encryption algorithm.

The strength of asymmetric encryption is that this approach allows the public key to be sent via unprotected channels.
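The relationship between the two keys can be illustrated with textbook RSA on very small numbers. This is a sketch only: the primes 61 and 53, the exponent 17, and the function names are assumptions picked for readability, and real RSA uses padding and primes that are hundreds of digits long.

```python
from math import gcd

# Toy parameters (assumption): far too small for real use.
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)    # used only while generating the keys
e = 17                     # public exponent, must be coprime with phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)

public_key = (n, e)        # can be published over an unprotected channel
private_key = (n, d)       # stays on the owner's device

def rsa_apply(value: int, key) -> int:
    """Raise `value` to the key's exponent modulo n (encrypts or decrypts)."""
    modulus, exponent = key
    return pow(value, exponent, modulus)

message = 65                                  # a message encoded as a number below n
ciphertext = rsa_apply(message, public_key)   # anyone can do this
recovered = rsa_apply(ciphertext, private_key)  # only the key owner can do this
```

Recovering d from (n, e) requires factoring n, which is trivial here but infeasible for moduli of 2048 bits and more.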

The private key is a sequence of bits of a required length (for example, as of 2023, common recommendations for RSA call for keys of 2048 bits or longer). In online communications, a private key is usually:

Generated on the devices of users participating in the communication session.
Stored only on these devices.
Never shared with anyone, even with other user devices.
Generated separately for each message or message chain of a certain length (depending on the implementation by a specific vendor).

The public key is given to everyone who should be able to send messages. It is impossible, at least within a reasonable time, to derive the private key from the public key in order to decrypt or substitute data.

Hybrid encryption usually refers to a method of data transfer in which the data itself is encrypted with a symmetric algorithm and a secret key, while that key is sent to the recipient in encrypted form (an asymmetric cipher is used for this).
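A minimal sketch of the hybrid idea, using only the Python standard library. Everything here is a stand-in chosen for illustration: the toy RSA numbers, the function names, and the XOR keystream built from SHAKE-256, which takes the place of a real symmetric cipher such as AES or ChaCha20-Poly1305.

```python
import hashlib
import secrets

# Recipient's toy RSA key pair (assumption: tiny primes 61 and 53).
N, E = 3233, 17
D = pow(E, -1, 3120)   # 3120 = (61 - 1) * (53 - 1)

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a keystream derived from the key.
    The same call encrypts and decrypts. Never reuse a key in this scheme."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

def hybrid_encrypt(message: bytes):
    session_key = secrets.token_bytes(16)              # fresh symmetric key
    wrapped_key = [pow(b, E, N) for b in session_key]  # key protected asymmetrically
    return wrapped_key, keystream_xor(session_key, message)

def hybrid_decrypt(wrapped_key, ciphertext: bytes) -> bytes:
    session_key = bytes(pow(c, D, N) for c in wrapped_key)  # needs the private key
    return keystream_xor(session_key, ciphertext)

wrapped, body = hybrid_encrypt(b"quarterly report, strictly confidential")
restored = hybrid_decrypt(wrapped, body)
```

The slow asymmetric operation touches only the 16-byte key, while the bulk of the data goes through the fast symmetric step.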

Encryption transforms data into a set of bits that is useless to an attacker because it can be decrypted only with the secret key. However, encryption as implemented by cloud-based video conferencing vendors does not ensure full protection against potential leaks, since metadata and media data routed through the provider's server can be decrypted there and viewed by third parties.

This article shows how to safeguard yourself against potential leaks and explains why existing implementations of end-to-end encryption are not the best strategy if you want communications that are both as secure as possible and effective.

What Is End-to-End Encryption?

End-to-end encryption (also known as E2EE) is a method of protecting data from unauthorized access and modification. Only the users who take part in the conversation have access to the messages or data.

End-to-end encryption ensures that user information (text, video and audio streams, shared files) is completely unavailable even to the servers involved in the transfer of information. Messages are encrypted and decrypted directly on the devices used in the communication session. So, no one except the intended recipient can read or substitute the data being sent.

As a rule, E2EE is based on a hybrid approach: the transmitted data is encrypted with a symmetric algorithm, and the symmetric key is protected asymmetrically. This approach lowers hardware requirements, since encrypting all traffic with asymmetric keys would be far more demanding. In practice, the shared symmetric key is often not sent at all but derived independently on both sides with a key agreement protocol such as Diffie–Hellman.
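Diffie–Hellman, mentioned above, is a key agreement protocol: both sides exchange public values and each computes the same secret without ever transmitting it. The sketch below uses a small Mersenne prime and the base 3 purely as assumptions for illustration; real deployments use standardized groups of 2048 bits or more, or elliptic curves.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy prime modulus (assumption): much too small for real use
G = 3            # toy base (assumption)

def dh_keypair():
    """Return (private, public); only the public value leaves the device."""
    private = secrets.randbelow(P - 3) + 2   # random value in 2..P-2
    public = pow(G, private, P)
    return private, public

def dh_shared_key(own_private: int, peer_public: int) -> bytes:
    """Combine our private value with the peer's public value into a symmetric key."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()

alice_private, alice_public = dh_keypair()
bob_private, bob_public = dh_keypair()

# Each side uses its own private value and the other side's public value.
alice_key = dh_shared_key(alice_private, bob_public)
bob_key = dh_shared_key(bob_private, alice_public)
```

A server relaying alice_public and bob_public learns neither private value, but it could still substitute its own public values, which is why key authentication is needed as well.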

When Should End-to-End Encryption Be Used?

End-to-end encryption protects against:

Unauthorized access: If E2EE is used, no one can read the data while it is transmitted over the network because only the sender and recipient have the keys needed to decrypt messages. Although an intermediate server that relays a message can see the encrypted message, it cannot read its content.

Key authentication is required to prevent “man-in-the-middle” (MITM) attacks. In particular, the participants need to compare the fingerprints of the public key, which can be done via an open channel. For example, in Telegram, emojis are used as fingerprints. If the fingerprints match on both devices, the channel is considered secure.

Data falsification: E2EE combined with the AES algorithm in GCM mode ensures data integrity. This means that a message cannot be altered unnoticed: any attempt to change the data is detected when the message is decrypted. This matters, for example, when large financial transactions are carried out, since falsification of transmitted data (e.g., the transaction amount or the account number) may lead to significant risks for business partners.
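Tamper detection can be illustrated with an authentication tag. AES-GCM is not part of the Python standard library, so this sketch substitutes HMAC-SHA256 for the GCM tag; it shows the integrity check only, not the encryption. The function names and the sample payment string are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an authentication tag that depends on both the key and the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def is_authentic(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = secrets.token_bytes(32)               # shared only by sender and recipient
payment = b"amount=1000;account=12345"      # hypothetical transaction data
tag = make_tag(key, payment)

# An intermediary changes the amount in transit but cannot forge a matching tag.
tampered = b"amount=9000;account=12345"
original_ok = is_authentic(key, payment, tag)
tampered_ok = is_authentic(key, tampered, tag)
```

In GCM mode the cipher produces a comparable tag as part of encryption, and decryption fails if the tag does not match.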

Please note that all these advantages can be achieved only if malware is not installed on users’ devices and hackers do not have remote access to these devices. So, it is critical to make sure that only the software certified by the vendor is used. If users are communicating via a browser, all of them should go to the correct web page of the video conferencing service (they should carefully check the server address).

What Cannot Be Protected With End-to-End Encryption?

The protection does not extend to metadata: for example, the date and time when a message was sent cannot be concealed. Similarly, 100% protection cannot be guaranteed if endpoints are hacked, if vulnerable intermediaries are used (e.g., push notification services or account confirmation technologies), or if the system contains backdoors.

Simple Example of End-to-End Encryption

Let us imagine a mail service whose employees open envelopes and read letters before delivering them to recipients. Such an idea would probably never occur to you, yet employees could not only read a letter out of curiosity but also share its content with other people or even replace the letter with a different one.

The management of the mail service may promise that its employees will not read letters. However, one cannot rely on such a promise because an unscrupulous employee may not follow the rules. Users have to be sure that the mail service takes adequate measures to protect confidentiality and integrity of each letter. Unfortunately, one cannot be absolutely certain.

The encryption method is called “end-to-end” because none of the communications providers that stand between users can decrypt the message. Let us suppose that instead of sending a letter in an envelope, someone sends it in a locked safe that can be opened only with a PIN known to the recipient.

Now, it will be physically impossible for anyone to read the letter, except the person for whom the letter was intended. That is what can be called true encryption.

How Does It Work?

The method described below can also apply to group chats or calls. The only difference is that in such a case, a larger number of recipients will have to send their public keys to the server in order to exchange them with other participants.

End-to-end encryption works in the following way:

At the beginning, when the communication session is created, the receiving application generates two keys: a public key and a private key.
The public key is sent to the server. As mentioned above, this key is used only to encrypt data, so having it does not enable the server to read the sender’s message. However, it is critical that the server cannot replace the original pair of private and public keys with its own pair.

This requirement is critical for eliminating the risk of the MITM attack mentioned earlier. So, one has to make sure that both the vendor and the server it provides are trustworthy.

The sending application downloads the public key for encrypting the message.
An encrypted message is sent to the server.
The receiving application downloads the message and decrypts it with its own private key.
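The five steps above can be simulated in a single process. This is a sketch under stated assumptions: RelayServer, the user name "bob", and the helper functions are hypothetical, and the byte-by-byte textbook RSA with tiny primes is used only so the example runs without external libraries. The fingerprint comparison shows how a substituted key would be noticed.

```python
import hashlib

def make_keypair(p: int, q: int, e: int = 17):
    """Toy RSA key pair from two small primes (illustration only)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (n, e), (n, d)

def fingerprint(public_key) -> str:
    return hashlib.sha256(repr(public_key).encode()).hexdigest()[:16]

def encrypt(message: bytes, public_key):
    n, e = public_key
    return [pow(b, e, n) for b in message]

def decrypt(ciphertext, private_key) -> bytes:
    n, d = private_key
    return bytes(pow(c, d, n) for c in ciphertext)

class RelayServer:
    """Holds public keys and ciphertext only; it never receives a private key."""
    def __init__(self):
        self.public_keys, self.mailbox = {}, {}
    def upload_key(self, user, public_key):
        self.public_keys[user] = public_key
    def download_key(self, user):
        return self.public_keys[user]
    def deliver(self, user, ciphertext):
        self.mailbox.setdefault(user, []).append(ciphertext)
    def fetch(self, user):
        return self.mailbox.pop(user, [])

server = RelayServer()
bob_public, bob_private = make_keypair(61, 53)           # step 1: receiver generates keys
server.upload_key("bob", bob_public)                     # step 2: public key goes to the server
key_for_bob = server.download_key("bob")                 # step 3: sender downloads it
assert fingerprint(key_for_bob) == fingerprint(bob_public)  # compared out of band
server.deliver("bob", encrypt(b"hi Bob", key_for_bob))   # step 4: only ciphertext is sent
received = decrypt(server.fetch("bob")[0], bob_private)  # step 5: receiver decrypts

# If the server swapped in its own key, the fingerprints would no longer match.
rogue_public, _ = make_keypair(67, 71)
substitution_detected = fingerprint(rogue_public) != fingerprint(bob_public)
```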

Diagram explaining how public key encryption works in message exchange between a sender and recipient through a server

Difference Between E2EE and TLS

TLS (Transport Layer Security) is an encryption protocol designed to secure communication over the Internet. Like E2EE, it uses public-key cryptography: during the TLS handshake, the server and, if necessary, the client are authenticated.

However, TLS secures communication between a user and a server, not directly between users. It ensures secure data transfer to and from the server, but the data is decrypted on the server. As a rule, the server needs access to the data for the web application to work correctly.

This approach does not keep information confidential from the server and therefore cannot be applied when highly sensitive data is involved. This argument is particularly relevant when users want to exchange messages in such a way that the service provider (server) cannot view the chat.

End-to-End Encryption During Video Conferencing

MCU

End-to-end encryption is impossible when the classical MCU (multipoint control unit) scheme is used, since the video conferencing server does not act only as an intermediary. It is also responsible for the following processes:

Decoding incoming video streams in order to scale them down.
Merging video streams into a single layout in which all meeting participants are visible.
Encoding the received video with the bitrate required by the recipient.

At each of these stages, the server has to work with a decrypted video stream, which creates potential loopholes for hackers who can view the data or use it in a malicious way. So, at a certain stage of the transcoding process, sensitive media data is in an open and unprotected state.

Diagram showing a video conference system where an MCU server receives individual video streams and sends a combined stream to participants.

SVC

Scalable video coding (SVC) technology is a cutting-edge approach to the transfer of media streams; it allows multiple sub-streams of different quality to be sent in a single (major) stream. Below, we will discuss a theoretical implementation of a video conference involving both SVC and E2EE.

The general approach can also apply to other technologies such as Simulcast. In that case, however, instead of dividing the video into multiple layers, the sender encrypts each of several separately encoded streams.

To optimize encryption, we can use a common key for all streams (symmetric encryption, for example, AES-256). This key can be securely passed with the help of asymmetric encryption (RSA).

The scheme will work in the following way:

When a conference is created, the client application of the owner receives the public keys of the conference participants.
The owner generates a random session key and encrypts it separately for every conference participant with that participant's public key.
The encrypted session key is sent to conference participants.
Conference participants decrypt the received packet with their private key.
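The key distribution steps above can be sketched as follows. The participant names, the small primes, and the helper functions are assumptions for illustration, and textbook RSA applied byte by byte stands in for a real key-wrapping scheme.

```python
import secrets

def make_keypair(p: int, q: int, e: int = 17):
    """Toy RSA key pair from two small primes (illustration only)."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

def wrap(key: bytes, public_key):
    n, e = public_key
    return [pow(b, e, n) for b in key]

def unwrap(packet, private_key) -> bytes:
    n, d = private_key
    return bytes(pow(c, d, n) for c in packet)

# Every participant generates a key pair on their own device.
participants = {
    "alice": make_keypair(61, 53),
    "bob": make_keypair(67, 71),
    "carol": make_keypair(73, 79),
}

# Step 1: the owner receives only the public halves.
public_keys = {name: pair[0] for name, pair in participants.items()}

# Step 2: the owner creates one 256-bit session key and wraps it per participant.
session_key = secrets.token_bytes(32)
packets = {name: wrap(session_key, pk) for name, pk in public_keys.items()}

# Steps 3-4: each participant receives their packet and unwraps it locally.
recovered = {name: unwrap(packets[name], participants[name][1]) for name in participants}
```

After this exchange every participant holds the same session key, while the server that relayed the packets holds only wrapped copies it cannot open.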

Now, participants and the conference owner use the agreed-upon session key to exchange the encrypted video stream:

On the sender’s device:
- The original content (source video stream) is encoded with the help of SVC technology and is divided into layers with different quality.
- Every layer is encrypted with the session key.
Encrypted media data are sent to a media server.
The server determines the bandwidth of each recipient and forwards a set of encrypted layers so that each participant can receive video in the appropriate quality.
Every recipient decrypts a video stream with the session key and is able to play it.
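The forwarding step performed by the server can be sketched as follows. The layer names, bitrates, and the selection rule are hypothetical; the point is that the server chooses layers using only unencrypted metadata and never touches the payload.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EncryptedLayer:
    name: str
    bitrate_kbps: int   # extra bandwidth this layer adds on top of the lower ones
    payload: bytes      # encrypted with the session key; opaque to the server

def select_layers(layers, bandwidth_kbps: int):
    """Forward the base layer plus as many enhancement layers as the link can carry.
    `layers` must be ordered from the base layer upward."""
    selected, used = [], 0
    for layer in layers:
        if used + layer.bitrate_kbps > bandwidth_kbps:
            break            # higher layers depend on this one, so stop here
        selected.append(layer)
        used += layer.bitrate_kbps
    return selected

stream = [
    EncryptedLayer("base 360p", 500, b"\x8f\x01..."),
    EncryptedLayer("enhancement 720p", 1000, b"\x3c\x7a..."),
    EncryptedLayer("enhancement 1080p", 2000, b"\xd4\x19..."),
]

mobile = [layer.name for layer in select_layers(stream, 1800)]
desktop = [layer.name for layer in select_layers(stream, 4000)]
```

Since no transcoding is involved, the server does not need the session key, which is what makes SVC compatible with end-to-end encryption in principle.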

Diagram illustrating secure SVC video streaming, where encrypted quality layers (720p to 4K) are decoded by devices using a session key.

Limitations

As of 2023, many popular messengers like WhatsApp and Telegram have implemented E2EE only in point-to-point calls. Why is it difficult to implement end-to-end encryption in group calls? Given the principles described above, the approach does not scale easily to a group conference: each stream (audio and video) has to be encrypted and decrypted, and this load has to be carried entirely by the users’ devices.

Please note that even the vendors claiming to have implemented “true” E2EE (without a server) in group conferences have to restrict the number of participants in an event. Besides, multiple features provided by the server become unavailable, e.g., conference recording and streaming.

For example, in a point-to-point call, the outgoing stream is encrypted while the incoming stream is decrypted. In a group call, the number of streams grows and the number of decryption operations increases with it. Let us suppose that there are 10 conference participants. In such a case, every device not only has to encrypt 1 stream and decrypt 9 streams; it also has to place the decrypted videos in the layout. Additionally, one has to consider the load of such features as chats, file transfer, content sharing, and slideshow.
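The arithmetic behind this example generalizes to any conference size. The helper names below are hypothetical, and the model assumes one outgoing stream per participant and no server-side mixing.

```python
def per_device_load(participants: int) -> dict:
    """Streams a single device must encrypt and decrypt in a fully E2EE call."""
    return {"encrypt": 1, "decrypt": participants - 1}

def total_decryptions(participants: int) -> int:
    """Decryption operations across the whole conference: each of the n devices
    decrypts the streams of the other n - 1 participants."""
    return participants * (participants - 1)

point_to_point = per_device_load(2)
group_of_ten = per_device_load(10)
conference_wide = total_decryptions(10)
```

The per-device load grows linearly with the number of participants, and the conference-wide load grows quadratically, which is one reason vendors cap the size of E2EE group calls.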

Given this analysis, we can identify two important points:

Either media data should be on a trusted server (self-hosted instead of cloud-based) at the time when it is processed and opened.
Or, the cloud service should support end-to-end encryption for SVC (unfortunately, we do not know of any such services).

So, if an on-premise video conferencing server (e.g., TrueConf Server) is used, you can enjoy high-quality video conferencing with optimized streams thanks to SVC. At the same time, you will not be limited in terms of features. Encryption of streams will be unnecessary because you have full control over the data.

Conclusion

As of 2025, end-to-end encryption is effectively implemented only in point-to-point calls, and when it is used, multiple features become unavailable. For example, in Zoom and Teams, the following features cannot be used:

Cloud-based recording
Streaming
Audio transcription
Breakout rooms
Surveys and others

In video conferences, encryption is used between a client and the server; this protects traffic only up to the point where it reaches the server. Hackers who have access to the server can view all passing traffic. Since large vendors promote mostly cloud-based solutions (Zoom, MS Teams), the lack of information about encryption methods is compounded by the risk of data leaks from servers that are not controlled by users.

For these reasons, we suggest using TrueConf Server, an enterprise-grade video conferencing server. From the standpoint of communications security, it will offer you the following advantages:

AES-256 is used to encrypt media streams sent via the TrueConf protocol, SRTP is used for connections via SIP and from a browser (WebRTC), and H.235 is used to encrypt H.323 connections.
The TLS 1.3 protocol is used for the coordination of communication protocols and exchange of session keys (e.g., for HTTPS).
Scalable video coding (SVC) technology based on the VP8 codec, which lowers the requirements for channel bandwidth and performance of client devices.
No need for a permanent Internet connection, which allows the solution to work in closed networks.
