Achieving Strong Consistency and Low-Cost Fault Tolerance using Logical Clocks in Distributed Transactional Systems

Masoomeh - Lehigh University Javidi Kishi; Roberto - Lehigh University Palmieri

About this Digital Document

Distributed transactional systems are the standard medium to let client (or application) requests interact with a shared state, in the presence of concurrency. These systems require clients to wrap the application logic that acts on the shared state around a well-defined block of code, called transaction. Transactions are guaranteed to execute atomically and in isolation by the transactional system. Practical workloads in distributed transactional systems often involve a high number of interacting users whose requests need to be processed safely while maintaining high performance, scalability, and fault tolerance of the system.The concurrency control is the component responsible for ensuring that operations from concurrent transactions are performed while preserving safety and ensuring a desired consistency level. A consistency level provided by a concurrency control directly affects the simplicity in which programmers develop distributed applications. The stronger the consistency level, the closer the behavior of the system to a serial one. For a programmer, a desirable consistency level is the one that does not violate application semantics and prevents the exposure of behaviors (or anomalies) that cause clients interacting with the system to observe ``unrealistic" results (e.g., where read operations return obsolete values). In addition to providing a desirable consistency level, a transactional system should also provide fault tolerance, which refers to its capability of surviving failures without compromising the shared state.The traditional approach to ensure fault tolerance is introducing redundant resources (e.g., storage) to preserve multiple copies. However, two problems arise from adopting redundant resources. First, these copies should be kept consistent, which adds additional design challenges to the synchronization protocol. As a result, system performance might be negatively affected, especially if the assumption is that no centralized source of synchronization is deployed. Second, storage cost increases due to redundancy. This aspect can be neglected if the data repository is not very large. However, modern workloads, such as those produced by social networking applications, replicating the data repository would introduce an unfeasible increase in storage cost.The research included in this dissertation has a twofold aim applied to distributed transactional systems, i) improving the data freshness of transactional read operations while adopting a fully decentralized design where no centralized source of synchronization is assumed; and ii) improving the cost of fault tolerance, namely the space overhead needed to ensure normal operation despite the presence of failures. These two properties are both appealing and orthogonal, therefore can be independently adopted by transactional systems.In this dissertation, we are interested in the external consistency and Parallel Snapshot Isolation (PSI) consistency levels as they are widely accepted by a variety of real-world applications, spanning from On-Line Transaction Processing (OLTP) to social networking applications. Ensuring high level of data freshness in these consistency levels is the first key driving factor of the presented research. The level of data freshness is a property of the concurrency control formalizing the guarantees of a read operation in terms of obsoleteness of the returned value. Given the result of a read operation, the gap between the returned version of a shared object and its latest version quantifies the level of freshness of that read. The second key driving factor of the research included in this dissertation is to address the aforementioned problem of increased storage space due to redundant resources.In order to improve the level of data freshness, we present two transactional systems named SSS and FPSI.SSS is a transactional key-value store that guarantees external consistency. SSS supports abort-free read-only transactions and its novelty is in the deployment of a new technique, named snapshot-queuing, to propagate established serialization orders among concurrent transactions. Such a propagation enables update transactions to be serialized along with read-only transactions in a unique order where reads always return values written by the last update transaction returned to its client, hence providing the highest level of data freshness.FPSI is a distributed transactional in-memory key-value store whose primary goal is to enable read operations to read fresher than existing implementations of the well-known Parallel Snapshot Isolation (PSI) consistency level, in the absence of a synchronized clock service among nodes. At the core of FPSI, a novel concurrency control allows abort-free read-only transactions to access the latest version of objects upon their first access to a node. FPSI does that by implementing a visible read technique, which lets read-only transactions to advance their logical clock upon the first access to a node, while retaining safety.The third contribution in this dissertation overcomes the lack of theoretical framework to reasoning about correctness of distributed concurrency control implementations based on their data freshness.The first two contributions of this research aim at providing fault tolerance through replication, which might cause a significantly large amount of data to be replicated. To address this challenge, we present EPSI, a transactional key-value store that integrates a layered concurrency control connected to an existing PSI to handle storage access. The general idea of EPSI is to deploy the erasure coding technique in its read and write operations and provide a desired system resiliency level through a lower storage cost, in comparison with the traditional replication techniques deployed by state-of-the-art transactional systems.

Full Title

Achieving Strong Consistency and Low-Cost Fault Tolerance using Logical Clocks in Distributed Transactional Systems

Member of

Theses and Dissertations

Contributor(s)

Creator: Javidi Kishi, Masoomeh - Lehigh University

Creator: Palmieri, Roberto - Lehigh University

Date Issued

2020

Language

English

Type

Genre

Media type

Subject (LCSH)

Javidi Kishi, . M., & Palmieri, . R. (2020). Achieving Strong Consistency and Low-Cost Fault Tolerance using Logical Clocks in Distributed Transactional Systems (1–). https://preserve.lehigh.edu/lehigh-scholarship/graduate-publications-theses-dissertations/theses-dissertations/achieving-0

Javidi Kishi, Masoomeh, and Roberto Palmieri. 2020. “Achieving Strong Consistency and Low-Cost Fault Tolerance Using Logical Clocks in Distributed Transactional Systems”. https://preserve.lehigh.edu/lehigh-scholarship/graduate-publications-theses-dissertations/theses-dissertations/achieving-0.

Javidi Kishi, Masoomeh, and Roberto Palmieri. Achieving Strong Consistency and Low-Cost Fault Tolerance Using Logical Clocks in Distributed Transactional Systems. 2020, https://preserve.lehigh.edu/lehigh-scholarship/graduate-publications-theses-dissertations/theses-dissertations/achieving-0.