Data replication is an important technique for ensuring the scalability and availability of clustered computer systems. This paper describes a new approach to data replication management called Robust Snapshot Replication. It combines an update-anywhere approach (updates can be evaluated on any replica, which spreads their load) with lazy update propagation and snapshot isolation concurrency control. The innovation is how we employ snapshot isolation in the replicas to provide consistency and fail safety while achieving high scalability for both readers and updaters, using a system design that requires no middleware or group communication infrastructure. We implemented our approach using the PostgreSQL database system and conducted an extensive experimental evaluation with a small database cluster of 8 nodes. Our results demonstrate the scalability of our algorithm and its performance benefits compared to a standard consistent replication system based on synchronous propagation. We also evaluated the cost of adding a new cluster node and the robustness of our approach against node failures. The evaluation shows that our approach sits at a sweet spot between scalability, consistency and availability: it offers almost perfect speed-up and load balancing for our 8-node cluster, allows dynamic extension of a cluster with new nodes, and is robust against any number of replica node failures or a master failure.
Cite as: Rohm, U., Cahill, M., Jung, H., Rodley, M. and Fekete, A. (2013). Robust Snapshot Replication. In Proc. Database Technologies 2013 (ADC 2013), Adelaide, Australia. CRPIT, 137. Wang, H. and Zhang, R. Eds., ACS. 91-92.