Xiaoqi Ren

Hopper: Decentralized Speculation-aware Cluster Scheduling at Scale

By: Xiaoqi Ren, Ganesh Ananthanarayanan, Adam Wierman, Minlan Yu
Appears in: CCR August 2015

As clusters continue to grow in size and complexity, providing scalable and predictable performance is an increasingly important challenge. A crucial roadblock to achieving predictable performance is stragglers, i.e., tasks that take significantly longer than expected to run. At this point, speculative execution has been widely adopted to mitigate the impact of stragglers. However, speculation mechanisms are designed and operated independently of job scheduling when, in fact, scheduling a speculative copy of a task has a direct impact on the resources available for other jobs.
