An Evaluation of Tail Loss Recovery Mechanisms for TCP

By: 
M. Rajiullah, P. Hurtig, A. Brunstrom, A. Petlund, M. Welzl
Appears in: 
CCR January 2015

Interactive applications do not require more bandwidth to go faster. Instead, they require less latency. Unfortunately, the current design of transport protocols such as TCP limits possible latency reductions. In this paper we evaluate and compare different loss recovery enhancements to fight tail loss latency. We consider the two recently proposed mechanisms "RTO Restart" (RTOR) and "Tail Loss Probe" (TLP), as well as a new mechanism (TLPR) that applies the logic of RTOR to the TLP timer management. The results show that the relative performance of RTOR and TLP when tail loss occurs is scenario-dependent, with TLP having potentially larger gains. The TLPR mechanism reaps the benefits of both approaches, and in most scenarios it shows the best performance.

Public Review By: 
Joel Sommers

"Tail loss" in TCP connections is a situation that occurs when packets at the end of a burst are lost. These losses cannot be recovered through TCP’s fast retransmit mechanism, thus must be recovered through retransmission timeouts (RTOs). Recent work [11] found that tail loss is surprisingly common and can have a significant impact on flow completion times. There are two current proposals for addressing tail loss in TCP to reduce flow completion delays: tail loss probe (TLP), which was introduced in [11] and is currently an Internet Draft[9], and RTO restart (RTOR), which also currently exists as an Internet Draft [14]. In this paper by Rajiullah et al., the authors seek to empirically compare the effectiveness of these two approaches. They also evaluate a combined approach, which they call TLPR. In laboratory experiments with losses that are controlled to explicitly occur in the tail of a burst, and in experiments in which losses arise purely due to congestion, the authors find that the combined approach almost always performs better than either TLP or RTOR alone. Since any modifications to TCP can have widespread impact, the reviewers of this paper pressed the authors to provide the most clear explanations possible in all phases of their work, and to be concrete and quantitative in their comparisons. Interestingly, it was the reviewers of this paper that suggested using a combination of TLP and RTOR which, as the paper shows, results in the most effective approach. There are still questions regarding how realistic the evaluations are and how the mechanisms will perform in the wild, but it is likely that some form of these proposed modifications to TCP will eventually be standardized and that this work will be instrumental in that process. Public review written by