Achieving low and predictable execution times for short jobs in Hadoop clusters has attracted considerable attention because such jobs are important to system productivity and user experience.
However, disk I/O interference is a major obstacle to this goal.
We observed that disk writes unintentionally block latency-sensitive short jobs and cause unexpectedly high latency.
Unfortunately, previous approaches, including disk read bandwidth throttling, do not suffice to mitigate such interference.
This paper proposes application-assisted writeback, which allows the Hadoop framework to control asynchronous writebacks.
We applied application-assisted writeback to optimize short jobs by preventing asynchronous writebacks when they are expected to interfere with those jobs.
In our evaluation, this reduced the average and 99th-percentile execution times of short jobs by 22% and 40%, respectively, without imposing unacceptable overhead on co-running throughput-oriented batch jobs.
In addition, combining application-assisted writeback with user-level disk bandwidth throttling can further accelerate short jobs.
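
As an illustration only, the idea of preventing asynchronous writebacks while short jobs are at risk can be sketched as a simple per-node decision. This is a minimal sketch under stated assumptions, not the mechanism evaluated here: the `NodeState` fields, the `allow_async_writeback` function, and the dirty-data cap that forces writeback to proceed are all hypothetical names and choices introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical per-node view that a framework could maintain."""
    short_jobs_running: int   # latency-sensitive short jobs active on this node
    dirty_bytes: int          # dirty page-cache data awaiting writeback
    dirty_limit_bytes: int    # assumed cap beyond which writeback must proceed


def allow_async_writeback(state: NodeState) -> bool:
    """Decide whether asynchronous writeback may run on this node now.

    Assumed policy: writeback runs freely when no short job is active;
    otherwise it is held back until dirty data reaches the assumed cap.
    """
    if state.short_jobs_running == 0:
        return True
    # A short job is active: defer writeback unless the cap has been reached.
    return state.dirty_bytes >= state.dirty_limit_bytes


if __name__ == "__main__":
    idle = NodeState(short_jobs_running=0, dirty_bytes=10, dirty_limit_bytes=100)
    busy = NodeState(short_jobs_running=2, dirty_bytes=10, dirty_limit_bytes=100)
    print(allow_async_writeback(idle), allow_async_writeback(busy))
```

The cap is included in the sketch because deferring writeback indefinitely would let dirty data accumulate without bound; where such a threshold should sit is a tuning question that the sketch does not answer.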