Using fault injection to evaluate the performability of cluster-based services
Technical documentation   Open access

Kiran Nagaraja, Xiaoyan Li, Bin Zhang, Ricardo Bianchini, Richard P. Martin and Thu Nguyen
Rutgers University
2002
DOI: https://doi.org/10.7282/t3-c3g3-bw33

Abstract

We propose a two-phase methodology for quantifying the performability (performance + availability) of cluster-based Internet services. In the first phase, evaluators use a fault-injection infrastructure to measure the impact of faults on the server’s performance. In the second phase, evaluators use an analytical model that combines an expected fault load with the measurements from the first phase to assess the server’s performability. Using this model, evaluators can study the server’s sensitivity to different design decisions, fault rates, and other environmental factors. To demonstrate our methodology, we study the performability of four versions of the PRESS web server against five classes of faults. We use Mendosus, a new fault-injection and network-emulation infrastructure, to carry out the first phase of our methodology. We then use our model to quantify the gain or loss in performability as PRESS was modified to increase its performance. We also use our model to study how reducing live operator support and adding RAIDs affect PRESS’s performability.
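As a rough illustration of how the second phase can combine a fault load with first-phase measurements, the sketch below computes an expected average throughput from per-fault-class inputs. It is not the report's model: the formula (time-weighted throughput, with faults assumed not to overlap), the names FaultClass and average_throughput, and all field names are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FaultClass:
    """Hypothetical per-fault-class inputs (names are illustrative)."""
    name: str
    mttf_s: float         # expected fault load: mean time to failure, seconds
    degraded_s: float     # phase-1 measurement: time the service runs degraded per fault
    degraded_tput: float  # phase-1 measurement: mean throughput while degraded


def average_throughput(normal_tput: float, faults: Iterable[FaultClass]) -> float:
    """Time-weighted expected throughput.

    Simplifying assumptions (not taken from the report): faults of
    different classes never overlap, and each class spends a fraction
    degraded_s / mttf_s of the time in its degraded state.
    """
    faults = list(faults)
    fractions = [f.degraded_s / f.mttf_s for f in faults]
    total_degraded = sum(fractions)
    if total_degraded > 1.0:
        raise ValueError("fault load too high for the non-overlap assumption")
    # Time spent fault-free contributes the normal throughput.
    expected = normal_tput * (1.0 - total_degraded)
    # Time spent under each fault class contributes its measured throughput.
    expected += sum(frac * f.degraded_tput for frac, f in zip(fractions, faults))
    return expected


def relative_throughput(normal_tput: float, faults: Iterable[FaultClass]) -> float:
    """Expected throughput as a fraction of fault-free throughput."""
    return average_throughput(normal_tput, faults) / normal_tput
```

Under these assumptions, varying mttf_s or degraded_s for one class while holding the others fixed gives a simple view of sensitivity to fault rates or to faster repair, which is the kind of question the abstract says the model is used to study.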
PDF: dcs-tr-491 (216.29 kB)