Deep Dive: How to Break the Congestion Barrier – Achieving low latency with high throughput for safe teledriving

Apr 14, 2023

By Ralf Globisch (Tech Lead of Real-time Transport), Damien Feneyrou (Principal Software Engineer) and Jesús Gonzalez Tejeria (Engineering Manager for Connectivity)

As the world becomes increasingly connected, the demand for fast and reliable internet connectivity has never been greater. From streaming high-definition video to cloud gaming, remote surgery and of course teledriving, there are countless applications that require high bandwidth and low latency (on the order of tens of milliseconds). Maintaining low latency is especially challenging when the network is congested.

Congestion control algorithms are a crucial part of internet transport protocols. They determine how sending applications adapt the rate of data sent into the network based on indicators of congestion such as packet loss and/or delay. The aim is to maximize link utilization while keeping latency low. Loss-based congestion control algorithms use packet loss as the signal to reduce the sending rate. A later extension to TCP/IP called Explicit Congestion Notification (ECN) [1] does not rely on packet loss and instead indicates congestion by marking certain bits in the IP header. It was standardized in the early 2000s but found limited deployment on the internet [2][3]. As a follow-up to ECN, the Internet Engineering Task Force (IETF) came up with L4S.

What is L4S?

L4S is short for Low Latency, Low Loss, and Scalable throughput. In January 2023, the IETF published RFC9330 [4], which describes how to achieve low latency even for high-throughput applications such as video streaming. It repurposes the bits in the IP header that are used for ECN, with the goal of remaining compatible with existing internet congestion control mechanisms.

Without L4S, current Active Queue Management (AQM) algorithms in network devices can still result in latencies of several hundred milliseconds [4], which is not acceptable in applications such as online gaming, conversational video and teledriving. It is worth highlighting that L4S can be applied to any network, not just mobile networks.
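
To make the "bits in the IP header" concrete, the following minimal sketch reads and writes the two-bit ECN field, which occupies the two least significant bits of the IPv4 TOS (or IPv6 Traffic Class) byte as defined in RFC3168 [1]. The helper names are hypothetical and chosen for illustration only; the note that L4S senders identify themselves with the ECT(1) codepoint comes from the L4S specifications rather than from this article.

```python
from enum import IntEnum


class EcnCodepoint(IntEnum):
    """The four ECN codepoints carried in the IP header (RFC 3168)."""
    NOT_ECT = 0b00  # transport is not ECN-capable
    ECT_1 = 0b01    # ECN-capable; used by L4S senders as their identifier
    ECT_0 = 0b10    # ECN-capable, classic ECN
    CE = 0b11       # Congestion Experienced, set by a congested node


def ecn_from_tos(tos: int) -> EcnCodepoint:
    """Extract the ECN codepoint from a TOS / Traffic Class byte."""
    return EcnCodepoint(tos & 0b11)


def with_ecn(tos: int, codepoint: EcnCodepoint) -> int:
    """Return the TOS byte with its ECN field replaced, keeping the DSCP bits."""
    return (tos & 0xFC) | int(codepoint)


if __name__ == "__main__":
    tos = with_ecn(0xB8, EcnCodepoint.ECT_1)  # 0xB8 is an example DSCP value
    print(hex(tos), ecn_from_tos(tos).name)
```

On a real socket the byte would typically be set through the operating system's TOS socket option; that part is platform-specific and left out here.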

How does L4S work?

L4S builds on top of ECN. Once the latency of L4S-enabled queues in the Radio Access Network (RAN) exceeds a predefined threshold (typically 4 ms), packets are gradually marked with Congestion-Experienced (CE) markings to indicate the early onset of congestion. The marking probability increases until the queue delay reaches an upper threshold, at which point all packets are marked. The CE markings are observed by the receiving application, which sends this information back to the sender via transport protocol feedback such as RTCP [7] so that the sender can adapt its sending bitrate accordingly. Failure to do so may result in additional delay, which, depending on the application, may need to be avoided.
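
The marking behaviour described above can be sketched as a ramp between the two thresholds. This is only an illustration under stated assumptions: the 4 ms lower threshold is the typical value mentioned in the text, while the 20 ms upper threshold, the linear shape of the ramp and the function names are hypothetical and not taken from any particular RAN implementation.

```python
import random


def marking_probability(queue_delay_ms: float,
                        lower_ms: float = 4.0,
                        upper_ms: float = 20.0) -> float:
    """Probability that a packet is CE-marked at a given queue delay.

    Below lower_ms nothing is marked, above upper_ms everything is marked,
    and in between the probability grows linearly (an assumed ramp shape).
    """
    if upper_ms <= lower_ms:
        raise ValueError("upper_ms must be greater than lower_ms")
    if queue_delay_ms <= lower_ms:
        return 0.0
    if queue_delay_ms >= upper_ms:
        return 1.0
    return (queue_delay_ms - lower_ms) / (upper_ms - lower_ms)


def should_mark(queue_delay_ms: float, rng=random.random) -> bool:
    """Decide per packet whether to set the CE codepoint."""
    return rng() < marking_probability(queue_delay_ms)


if __name__ == "__main__":
    for delay in (2.0, 4.0, 8.0, 12.0, 20.0, 30.0):
        print(f"{delay:5.1f} ms -> p(mark) = {marking_probability(delay):.2f}")
```

Because marking starts at a low probability, the sender receives a graded signal well before the queue is full, rather than a single late signal in the form of packet loss.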

Mobile Networks & L4S

Mobile network infrastructure is usually designed for achieving high download rates, while uploads have traditionally been less important. In the past couple of years, live streaming services such as Twitch and YouTube Live have become more prevalent, and these types of services also have high upload rate requirements. One of the main sources of jitter in mobile networks is the RAN, due either to congestion or to other RAN-specific factors such as scheduling jitter or handover [5]. This is why it is becoming increasingly important for mobile network operators to provide a more reliable network. The L4S approach provides finer-grained congestion information, allowing the application to react and adapt its bitrate accordingly.

Teledriving and mobile network requirements

One core component of any teledriving technology is video streaming over mobile networks. A good teledriving experience requires a certain video quality and latency. While 4G and 5G already support the video quality and latency required for teledriving (Vay has already received permission to drive on public roads without a person inside the car [https://vay.io/press-release/a-historic-moment-the-first-car-drives-without-a-person-in-the-vehicle-on-a-european-public-road/]), L4S helps deal with network congestion events and increases the quality of the video stream thanks to explicit feedback about buffer build-up at the radio interface, which is usually the bottleneck in mobile applications. Eventually, L4S may also reduce the number of redundant networks needed without jeopardizing safety.

Vay, Deutsche Telekom, Ericsson & L4S

In joint work with Deutsche Telekom and Ericsson, Vay explored the benefits of managed latency using L4S and the SCReAM congestion control algorithm [6] for teledriving. SCReAM is a congestion control algorithm standardized for real-time conversational video. The SCReAM implementation is based on the open source code available on GitHub [9].

Vay’s technology stack uses multiple mobile networks to handle potential connectivity problems that might occur in practice: if one network fails, the signal can still be received via another mobile network. Furthermore, the connection between the vehicle and the teledriver is continuously monitored, and if issues arise that compromise safe control of the vehicle by the teledriver, the vehicle automatically conducts a so-called Minimum Risk Manoeuvre (MRM).

For the showcase with Deutsche Telekom and Ericsson, Vay explored driving on a single mobile network, but using the L4S technology to be able to detect congestion early.

The result of the collaboration was showcased at the 2023 Mobile World Congress, where a car in our testing center in Tegel (part of the Urban Tech Republic facilities in Berlin) was teledriven from Barcelona. The data traffic was routed via Sweden and London, covering a total of about 5000 km between Barcelona and Berlin. The benefits of using L4S were clearly evident:

The overall latency was low and stable even after adding background traffic to the cell that simulated the presence of other users of the mobile network, e.g. downloading or streaming movies or having video calls.
In spite of the distance between the car and the teledriver (over 5000 km), teledriving was feasible.
L4S reduced the level of network redundancy required: the car was driven over a single mobile network.

Results of teledriving using L4S and SCReAM at MWC

This section dives into the details and analyzes the results of teledriving a car in Berlin from Barcelona over a single mobile network, with traffic routed via Sweden and London (about 5000 km), with and without L4S. (Due to the large distances involved, the results presented in this article are not representative of what Vay experiences during daily operations.)

For this analysis the focus is on two key elements:

Video frame latency: it is defined as the latency from the time an image is captured by the camera sensor until it is rendered on the screen for the teledriver. The higher the frame latency or the greater the variation in frame latency, the more difficult it is for the teledriver to operate the vehicle remotely.
Queue delay: it represents the amount of time the packets are buffered in the network. It is estimated by SCReAM.
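
As a rough illustration of how an endpoint can estimate queue delay without help from the network, the sketch below subtracts the smallest one-way delay seen so far (taken as the delay of an empty path) from the current one-way delay. This is a simplified sketch of the general delay-based approach and not the SCReAM implementation; the class and parameter names are hypothetical. A constant clock offset between sender and receiver cancels out because only differences between one-way delays are used.

```python
from typing import Optional


class QueueDelayEstimator:
    """Estimate queueing delay as one-way delay minus the observed minimum."""

    def __init__(self) -> None:
        self.base_owd_ms: Optional[float] = None  # lowest one-way delay seen

    def update(self, send_ts_ms: float, recv_ts_ms: float) -> float:
        """Feed one packet's timestamps and return the estimated queue delay."""
        owd_ms = recv_ts_ms - send_ts_ms
        if self.base_owd_ms is None or owd_ms < self.base_owd_ms:
            self.base_owd_ms = owd_ms
        return owd_ms - self.base_owd_ms


if __name__ == "__main__":
    estimator = QueueDelayEstimator()
    samples = [(0, 50), (10, 75), (20, 65), (30, 90)]  # (send, receive) in ms
    for send_ts, recv_ts in samples:
        print(estimator.update(send_ts, recv_ts))
```

A production estimator would also need to age out the baseline so that route changes or clock drift do not leave it stuck at a stale minimum; that is omitted here for brevity.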

Figure 1 shows the frame latency measured at the receiver during the queue delay build-up shown in Figure 2. Without L4S, denoted in blue, the queues build up more than one second of delay, even though the SCReAM algorithm has been designed to detect congestion in the network and adapt the bitrate accordingly. This is because, without L4S, SCReAM has to rely on over-the-top heuristics to adapt the bitrate, whereas L4S is part of the network and signals congestion explicitly. In Vay’s teledriving application, such latency spikes would trigger an MRM, as it is no longer safe to operate the vehicle remotely.

In contrast, the red line in Figure 1 shows the video frame latency measured when the network supports L4S. Here, the frame latency never exceeds 200 ms, there are no spikes, and the frame latency variation is much smaller.

Figure 1: video frame latency during queuing in the base station

Figure 2: queueing delay estimated by SCReAM

In Figure 2, the red line also shows how the queueing delay is kept in a controlled, much smaller range, whereas without L4S the queueing delay often exceeds a second. With the support of the network, i.e. L4S, the SCReAM algorithm is able to detect congestion earlier and can therefore make swifter and more accurate rate adjustments to maintain low latency.
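
One way to picture the "more accurate rate adjustments" is a sender that scales its reduction to the fraction of CE-marked packets reported in each feedback interval, instead of applying one fixed back-off. The sketch below is a generic, hypothetical illustration of such a proportional response; it is not SCReAM's actual control law, and the function name, rate limits, step size and the beta factor are all assumptions.

```python
def adapt_bitrate(current_bps: float,
                  ce_marked: int,
                  total_packets: int,
                  min_bps: float = 500_000,
                  max_bps: float = 20_000_000,
                  increase_bps: float = 100_000,
                  beta: float = 0.5) -> float:
    """Return the next target bitrate from one interval of ECN feedback.

    With no CE marks the rate grows by a fixed step. With marks it is
    reduced in proportion to the marked fraction: few marks lead to a
    small correction, and a fully marked interval cuts the rate by beta.
    """
    if total_packets <= 0:
        return current_bps  # no feedback in this interval, keep the rate
    marked_fraction = ce_marked / total_packets
    if marked_fraction > 0:
        target = current_bps * (1.0 - beta * marked_fraction)
    else:
        target = current_bps + increase_bps
    return max(min_bps, min(max_bps, target))


if __name__ == "__main__":
    rate = 10_000_000.0
    for marked in (0, 5, 50, 100, 0):
        rate = adapt_bitrate(rate, ce_marked=marked, total_packets=100)
        print(f"{marked:3d}/100 marked -> {rate / 1e6:.2f} Mbit/s")
```

The key design point is that the size of the correction tracks the strength of the congestion signal, which is what allows the queue to stay short without large swings in video bitrate.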

Challenges ahead and next steps

Beyond the advantages of L4S presented in this article, it is time to look to the future and at how L4S can be deployed broadly in commercial networks.

For widespread adoption of L4S, it is important that none of the networking equipment in the path clears the ECN bits in the IP packets (also known as bleaching). Bleaching effectively prevents the adoption of L4S. According to [8], the clearing of the ECN bits amounts to 10.8%. The silver lining is that service providers and mobile network operators are in control of this and can prevent it.
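
Bleaching can be detected by comparing the ECN codepoint a sender put on a probe packet with the codepoint the receiver observed. The following minimal sketch classifies that outcome; the function name and the category labels are hypothetical, and a real measurement would additionally need a feedback channel that reports the received codepoint back to the sender.

```python
# ECN codepoints as defined in RFC 3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def classify_ecn_traversal(sent: int, received: int) -> str:
    """Classify what the path did to the ECN field of one probe packet."""
    if sent == received:
        return "preserved"
    if received == CE and sent in (ECT_0, ECT_1):
        return "congestion-marked"  # legitimate marking by a congested node
    if received == NOT_ECT:
        return "bleached"           # ECN field was cleared on the path
    return "remarked"               # changed to another, unexpected codepoint


if __name__ == "__main__":
    print(classify_ecn_traversal(ECT_1, ECT_1))
    print(classify_ecn_traversal(ECT_1, CE))
    print(classify_ecn_traversal(ECT_1, NOT_ECT))
```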

Additionally, not all the networking equipment in the path needs to implement L4S, i.e. to set the ECN bits when congestion is detected. It is sufficient to implement L4S in the bottleneck node, which is likely to be in the RAN.

Beyond SCReAM, other congestion control algorithms that already support L4S and will further boost the ecosystem are TCP Prague [10], Apple QUIC [11] and BBRv2 (Bottleneck Bandwidth and Round-trip time) [12].

As outlined in this article, L4S is a promising technology that can improve the performance of real-time applications such as gaming or remote driving that require reliable and low-latency connectivity. With ongoing research and development, L4S is expected to become more widely available and adopted, opening up new opportunities for innovation and growth in various industries. Overall, L4S represents a significant step forward in the evolution of the internet.

References

[1] K. Ramakrishnan, S. Floyd and D. Black, “The Addition of Explicit Congestion Notification (ECN) to IP”, RFC3168, 2001

[2] B. Trammell, M. Kühlewind, D. Boppart, I. Learmonth, G. Fairhurst and R. Scheffenegger, “Enabling Internet-Wide Deployment of Explicit Congestion Notification”, vol. 8995, pp. 193-205, 2015, doi: 10.1007/978-3-319-15509-8_15.

[3] A. M. Mandalari, A. Lutu, B. Briscoe, M. Bagnulo and O. Alay, “Measuring ECN++: Good News for ++, Bad News for ECN over Mobile,” in IEEE Communications Magazine, vol. 56, no. 3, pp. 180-186, March 2018, doi: 10.1109/MCOM.2018.1700739.

[4] B. Briscoe, K. De Schepper, M. Bagnulo and G. White, “Low Latency, Low Loss, and Scalable Throughput (L4S) Internet Service: Architecture”, RFC9330, 2023

[5] Deutsche Telekom and Ericsson, “Enabling time-critical applications over 5G with rate adaptation”, White Paper, 2021

[6] I. Johansson and Z. Sarker, “Self-Clocked Rate Adaptation for Multimedia”, RFC8298, 2017

[7] M. Westerlund, I. Johansson, C. Perkins, P. O’Hanlon and K. Carlberg, “Explicit Congestion Notification (ECN) for RTP over UDP”, RFC6679, 2012

[8] Hyoyoung Lim, Seonwoo Kim, Jackson Sippe, Junseon Kim, Greg White, Chul-Ho Lee, Eric Wustrow, Kyunghan Lee, Dirk Grunwald and Sangtae Ha, “A fresh look at ECN Traversal in the Wild”, 2022

[9] https://github.com/EricssonResearch/scream

[10] Bob Briscoe, Koen De Schepper, Olivier Tilmans, Mirja Kühlewind, Joakim Misund, Olga Albisser and Asad Sajjad Ahmed, “Implementing the ‘Prague Requirements’ for Low Latency Low Loss Scalable Throughput (L4S)”

[11] Reduce networking delays for a more responsive app, https://developer.apple.com/videos/play/wwdc2022/10078/

[12] BBR2, https://github.com/google/bbr
