With the explosive growth of WWW traffic, there is an increasing demand for high-performance Web servers that provide a stable Web service to users. A cluster-based Web server is a solution for coping with heavy user access, since the server can easily be scaled according to the load. In a cluster-based Web server, a back-end node may not be able to serve some HTTP requests directly because it does not have the requested contents in its main memory. In this case, the back-end node has to retrieve the requested contents from its local disk or from other back-end nodes in the cluster.
To reduce service latency, we introduce a new prefetch scheme. The back-end nodes predict the next HTTP requests and prefetch the contents of the predicted requests before those requests arrive. We develop three prefetch algorithms based on useful information gathered from many clients' HTTP requests. In trace-driven simulation, the Time and Access Probability-based Prefetch ($TAP^2$) algorithm, which uses the access probability and the inter-reference time of Web objects, shows the best performance among the proposed prefetch algorithms. With the $TAP^2$ algorithm, the service latency is reduced by 20.1% with a small memory size and by 1.5% with a large memory size, compared with a non-prefetch mechanism.
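To illustrate the idea, the following is a minimal sketch of how a prefetch candidate might be scored from the two signals the abstract names, access probability and inter-reference time. The exact $TAP^2$ formula is not given here, so the scoring function, the `WebObject` fields, and the `tap2_rank` helper are all illustrative assumptions, not the paper's actual algorithm.

```python
from dataclasses import dataclass

@dataclass
class WebObject:
    name: str
    access_prob: float     # estimated probability this object is requested next (assumed)
    inter_ref_time: float  # mean time between successive references, in seconds (assumed)

def tap2_rank(objects, now, last_access):
    """Rank prefetch candidates (hypothetical TAP^2-style scoring).

    Favors objects that are likely to be requested and whose
    inter-reference interval has nearly elapsed, i.e. whose next
    reference is expected soon.
    """
    def score(obj):
        elapsed = now - last_access[obj.name]
        # Expected time remaining until the object's next reference;
        # clamp to a small positive value to avoid division by zero.
        remaining = max(obj.inter_ref_time - elapsed, 1e-9)
        return obj.access_prob / remaining
    return sorted(objects, key=score, reverse=True)

# Example: object "a" is popular and due to be referenced again soon,
# so it is ranked ahead of "b" for prefetching.
a = WebObject("a", access_prob=0.8, inter_ref_time=10.0)
b = WebObject("b", access_prob=0.5, inter_ref_time=10.0)
ranked = tap2_rank([b, a], now=10.0, last_access={"a": 1.0, "b": 9.0})
```

A back-end node would prefetch the top-ranked objects (from disk or from peer nodes) until its prefetch budget or memory is exhausted.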