A practical solution to the client-LDNS mismatch problem

By: 
Cheng Huang, Ivan Batanov, Jin Li
Appears in: 
CCR April 2012

Internet services are often deployed in multiple (tens to hundreds of) geographically distributed data centers. They rely on Global Traffic Management (GTM) solutions to direct clients to the optimal data center based on criteria such as network performance, geographic location, and availability. GTM solutions, however, have a fundamental design limitation in their ability to accurately map clients to data centers: they use the IP address of the local DNS resolver (LDNS) used by a client as a proxy for the true client identity, which in some cases causes suboptimal performance. This issue is known as the client-LDNS mismatch problem. We argue that recent proposals to address the problem suffer from serious limitations. We then propose a simple new solution, named "FQDN extension", which can solve the client-LDNS mismatch problem completely. We build a prototype system and demonstrate the effectiveness of the proposed solution. Using JavaScript, the solution can be deployed immediately for some online services, such as Web search, without modifying either the client or the local resolver.

Public Review By: 
Renata Teixeira

In Web services with multiple data centers, clients are often assigned to the “best” server based on the IP address of the local DNS resolver, not the true client IP address. In some cases, this mismatch may lead to a sub-optimal choice of server. This paper proposes that clients obtain a cluster identifier for each service and add this identifier to hostnames. Cluster IDs should capture the client location better than the local resolver does (for instance, all clients attached to a given point-of-presence should have the same cluster ID). An evaluation using clients deployed on PlanetLab shows that this solution reduces object load time by tens of milliseconds when loading objects from Microsoft’s data centers. All reviewers acknowledge that this paper presents a solution to a practical problem of current content delivery platforms. Some reviewers also appreciated that the solution can be deployed immediately. Reviewers also point out some limitations and open problems. First, one reviewer had concerns about the novelty of the solution when compared to the IETF proposal to extend DNS. Second, the authors’ early study with Microsoft data showed that today less than 10% of clients are directed to sub-optimal servers, so the benefit of this solution in practice is unclear. Third, this solution will cause more fragmentation in the local DNS cache. Finally, clients must keep track of one cluster ID per service, which adds complexity on the client. The authors have revised the paper to address each of these issues. Although a better understanding of the complexity and practicality of the solution will require further evaluation, this paper presents a novel solution to a practical problem. Hence, we decided to publish this paper.
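
To make the idea of adding a cluster identifier to hostnames concrete, the following is a minimal sketch and not the authors' implementation. The label format (a `c` prefix followed by the cluster ID, prepended as the leftmost DNS label) and the function names `extend_fqdn` and `cluster_id_from_query` are assumptions made purely for illustration; the paper itself defines the actual encoding.

```python
from typing import Optional


def extend_fqdn(hostname: str, cluster_id: str) -> str:
    """Client side: prepend a cluster-ID label to a service hostname.

    The 'c<ID>' label format is a hypothetical choice for this sketch.
    """
    if not cluster_id:
        raise ValueError("cluster ID must not be empty")
    label = "c" + cluster_id.lower()
    # A DNS label is at most 63 octets; this sketch allows only ASCII
    # letters and digits to keep the label trivially valid.
    if not (label.isascii() and label.isalnum()) or len(label) > 63:
        raise ValueError("cluster ID must form a valid DNS label")
    return f"{label}.{hostname}"


def cluster_id_from_query(qname: str, service_hostname: str) -> Optional[str]:
    """Authoritative side: recover the cluster ID from a queried name.

    Returns None when the query carries no cluster label, in which case
    a GTM system would presumably fall back to the resolver's address.
    """
    qname = qname.rstrip(".").lower()
    suffix = "." + service_hostname.lower()
    if not qname.endswith(suffix):
        return None
    label = qname[: -len(suffix)]
    # Expect exactly one extra label of the assumed form 'c<ID>'.
    if "." in label or not label.startswith("c") or len(label) < 2:
        return None
    return label[1:]
```

Under these assumptions, a client in cluster `1234` would resolve `c1234.www.example.com` instead of `www.example.com`. Because the cluster ID travels inside the query name, an unmodified local resolver forwards it to the authoritative server as is. The same property explains the cache fragmentation the reviewers mention: each cluster produces a distinct name, and therefore a distinct cache entry, at the resolver.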