Industry observers expect VoIP to eventually replace most of the existing land-line telephone connections. Currently, however, quality and reliability concerns largely limit VoIP usage to either personal calls on cross-domain services such as Skype and Vonage, or to single-domain services such as trunking, where a core ISP carries long-distance voice as VoIP only within its backbone, to save cost with a unified voice/data infrastructure. This paper investigates the factors that prevent cross-domain VoIP deployments from achieving the quality and reliability of existing land-line telephony (PSTN). We ran over 50,000 VoIP phone calls between 24 locations in the US and Europe over a three-week period. Our results indicate that VoIP usability is hindered as much by BGP's slow convergence as by network congestion. In fact, about half of the unintelligible VoIP samples in our data occur within 10 minutes of a BGP update.
Voice over IP (VOIP) is now part of everyday life almost as much as e-mail and the web. We've been trying to get it to work for at least a quarter of a century. The voice funnel was an early device to packetize speech and was integrated into early ARPANET experiments by BBN. There is a direct line of descent from those experiments, via the Network Voice Protocol, through to today's Real-time Transport Protocol.
Much of the work in the early days (indeed until the early 1990s) revolved around proposals to modify the Internet layer to provide QoS directly through the packet forwarding and end-to-end service interfaces. Thus the ST and ST-II protocols were developed, along with RSVP and a whole plethora of packet scheduling algorithms (worst-case fair, weighted fair queueing and so forth), their associated admission control algorithms and, most relevant here, route pinning.
In the inter-domain world in which we live, packets will traverse multiple ISPs for the vast majority of end-to-end communications sessions. Even where some ISPs have deployed QoS or over-provisioned their links, a user cannot be assured of this in general. Thus their traffic may be subject to congestion or to re-routing. While web browsers are somewhat insensitive to variation in throughput or latency during a download, and e-mail users are really quite oblivious to it, VOIP users will perceive the impact of either congestion or re-routing directly in their quality of experience. Jitter, packet re-ordering, and packet loss are all things that VOIP applications are designed to cope with. However, there are limits to the amount that a play-out buffer can adapt before the user will simply hang up.
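As a rough illustration of the jitter that a play-out buffer has to absorb, the sketch below computes the running interarrival jitter estimate defined in RFC 3550 for RTP receivers. It is not taken from the paper under review. The function name `interarrival_jitter`, the use of milliseconds, and the sample delays are assumptions made purely for illustration.

```python
def interarrival_jitter(send_times, recv_times):
    """Return the running RFC 3550 jitter estimate after each packet.

    send_times and recv_times are per-packet timestamps in the same unit
    (milliseconds here, as an assumption). The first packet only seeds the
    comparison, so the result has one entry fewer than the inputs.
    """
    jitter = 0.0
    estimates = []
    for i in range(1, len(send_times)):
        # D: change in one-way transit time between consecutive packets.
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        # Exponential smoothing with gain 1/16, as specified in RFC 3550.
        jitter += (abs(d) - jitter) / 16.0
        estimates.append(jitter)
    return estimates


# Hypothetical call: one packet every 20 ms, with the one-way delay
# stepping from 50 ms to 90 ms halfway through, as a re-route might cause.
send = [20 * i for i in range(10)]
transit = [50] * 5 + [90] * 5
recv = [s + t for s, t in zip(send, transit)]
print(interarrival_jitter(send, recv))
```

A single step change in delay shows up as one spike in the estimate that then decays slowly. A receiver can adapt to that. A long burst of loss or a run of large delay swings is much harder to hide.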
In the past, most work has concentrated on the impact of queuing on the play-out delay, loss concealment and hence the resulting audio quality. This paper is one of the first to pin down how much inter-domain re-routing impacts VOIP. And it shows: BGP is a significant part of the problem. Indeed, the paper shows, through systematic experimentation over a fairly large number of paths, that chaos following BGP updates can account for as many as 50% of problems for VOIP calls; worse, these are the most serious problems and account, potentially, for 90% of dropped calls. There are limitations to an automated experimental approach, such as the one employed in this paper, that mean we cannot tell if this latter figure reflects actual user behaviour, but the model employed by the authors certainly supports a result in that sort of region.
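To make this kind of attribution concrete, the sketch below computes the fraction of poor-quality samples that fall within a fixed window of any BGP update. It is not the authors' analysis code. The function name `fraction_near_update`, the timestamps in seconds, and the choice of a symmetric window (before or after an update) are all assumptions made for illustration; only the 10-minute window length comes from the abstract.

```python
from bisect import bisect_left


def fraction_near_update(bad_sample_times, update_times, window_s=600.0):
    """Fraction of bad samples within window_s seconds of any BGP update.

    Assumption: "within" is taken as symmetric, i.e. a sample counts if an
    update occurred up to window_s seconds before or after it.
    """
    if not bad_sample_times:
        return 0.0
    updates = sorted(update_times)
    near = 0
    for t in bad_sample_times:
        i = bisect_left(updates, t)
        # The nearest update is either the one just before t or the one at/after t.
        candidates = updates[max(i - 1, 0):i + 1]
        if any(abs(t - u) <= window_s for u in candidates):
            near += 1
    return near / len(bad_sample_times)


# Hypothetical timestamps (seconds since the start of a measurement run).
updates = [1000.0, 5000.0]
bad_samples = [900.0, 1500.0, 3000.0, 5600.0, 5601.0]
print(fraction_near_update(bad_samples, updates))
```

A measure of this sort captures only coincidence in time. It cannot show on its own that an update caused a given bad sample, which is one reason to treat the headline percentages with care.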
This is a serious problem, since it is unlikely that paths between users will suddenly traverse fewer ISPs (the ISP economic landscape cannot change that quickly, even if AT&T take over most of the world). The BGP protocol world would seem to be the next place to look for solutions. Perhaps the IETF needs to consider some VOIP-aware approximate route pinning mechanism. Perhaps someone is already working on this, and will write a follow-up paper to explain the solutions.
If you are, we would like to hear from you. We must fix BGP!