Distributed Locking in Database

Database

A Database is an organized collection of information, usually stored electronically in a computer system and accessed by various tools or software. A Database is typically managed by a Database Management System (DBMS). The term "Database System," which is frequently abbreviated to "Database," refers to the combination of the data, the DBMS, and the applications that are connected to it.

To simplify processing and data querying, the most popular types of Databases currently in use typically model their data as rows and columns in a set of tables. The data may then be handled, updated, regulated, and structured with ease. For writing and querying data, most Databases employ Structured Query Language (SQL).

SQL (Structured Query Language):

Almost all relational databases use SQL, a language for querying, managing, and defining data as well as for providing access control. SQL was first developed at IBM in the 1970s, with Oracle playing a significant role in its development, and it became an ANSI standard in 1986. Since then, SQL has inspired numerous extensions from vendors including IBM, Microsoft, and Oracle. Despite the continued popularity of SQL, new query languages are starting to emerge.

Locking in Database

In Database Management Systems, locks are used for Concurrency Control: a data item that is shared by more than one transaction at a time is locked so that conflicting operations cannot interleave. A lock is a variable associated with a data item that determines whether the item can currently be read or written. There are various locking algorithms, such as One-phase Locking and Two-phase Locking.
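The idea of a lock as a variable attached to a data item can be sketched as a small lock table. This is a minimal illustration, not a real DBMS component: the class name `LockTable`, the mode letters "S" (shared, for reads) and "X" (exclusive, for writes), and the rule that several readers may share an item are assumptions added here for the example.

```python
class LockTable:
    """Hypothetical in-memory lock table: one lock variable per data item."""

    def __init__(self):
        # item -> (mode, set of transaction ids holding the lock)
        self._locks = {}

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted. mode is 'S' (read) or 'X' (write)."""
        if mode not in ("S", "X"):
            raise ValueError("mode must be 'S' or 'X'")
        held = self._locks.get(item)
        if held is None:
            # Nobody holds the item, so any request is granted.
            self._locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if holders == {txn}:
            # The sole holder may keep its lock or upgrade it to exclusive.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self._locks[item] = (new_mode, holders)
            return True
        if mode == "S" and held_mode == "S":
            # Several readers can share the item.
            holders.add(txn)
            return True
        # Any other combination conflicts, so the request is refused.
        return False

    def release(self, txn, item):
        """Drop txn's lock on item, removing the entry when no holder remains."""
        held = self._locks.get(item)
        if held is None:
            return
        _, holders = held
        holders.discard(txn)
        if not holders:
            del self._locks[item]
```

In this sketch a refused request simply returns False; a real lock manager would normally queue the request and block the transaction until the lock becomes available.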

Distributed Database Systems:

When a Database is distributed over several devices or systems, it is called a Distributed Database System. A Distributed Database System also relies on locking for Concurrency Control. Some of the Concurrency Control methods are discussed below:

Distributed Two-phase Locking Algorithms in Database:

In a Two-phase Locking Algorithm, each transaction executes in two phases. In the first (growing) phase, the transaction acquires all the locks it requires, and in the second (shrinking) phase, it releases them; once a transaction has released a lock, it cannot acquire a new one. In the Distributed Two-phase Locking Algorithm, certain sites act as lock managers. These lock managers receive requests from transactions to grant or release locks, and they coordinate between the different nodes of the distributed network to manage the locks and monitor all the transactions. Depending on the variant, one or more sites are dedicated to the lock manager role.
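The two-phase rule itself can be sketched in a few lines. This is only an illustration of the phase discipline, assuming a hypothetical `TwoPhaseTransaction` class that tracks its own locks; it does not talk to a lock manager or handle conflicts between transactions.

```python
class TwoPhaseTransaction:
    """Hypothetical transaction that enforces the two-phase rule on itself."""

    def __init__(self, name):
        self.name = name
        self.held = set()       # items currently locked by this transaction
        self.shrinking = False  # becomes True after the first release

    def lock(self, item):
        # Locks may only be acquired during the first (growing) phase.
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot acquire a lock after releasing one"
            )
        self.held.add(item)

    def unlock(self, item):
        # The first release moves the transaction into the second phase.
        self.shrinking = True
        self.held.discard(item)


t = TwoPhaseTransaction("T1")
t.lock("a")
t.lock("b")    # still in the first phase
t.unlock("a")  # second phase begins here
# t.lock("c") would now raise RuntimeError
```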

There can be three types of Distributed Two-phase Locking Algorithms based on the number of lock managers, which are described below:

  • Centralized Two-phase Locking:

In this type of algorithm, one site in the system is assigned as the central lock manager, and all the other nodes acquire and release locks through this central node. So, all lock management is done by the central lock manager.

  • Primary Copy Two-phase Locking:

In this algorithm, several sites are assigned as lock control centers. Each lock control center manages a predefined set of locks, and each center knows which sets of locks are managed by the other centers. So, locks of different data items are managed by different lock centers.

  • Distributed Two-phase Locking:

In this type of system, there is a lock manager at each site in the Distributed System. Each lock manager is responsible for managing the locks of the data items stored at its local site. So, for each node, the management of locks is done by its local lock manager, and the number of lock managers is equal to the number of sites.
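The difference between the three variants comes down to which site a lock request is sent to. The following sketch makes that routing explicit. The function name `lock_manager_for`, the scheme labels, and the site names are all hypothetical, and the two dictionaries stand in for catalog information that a real system would maintain.

```python
def lock_manager_for(item, scheme, *, central_site="site-0",
                     primary_of=None, stored_at=None):
    """Return the site whose lock manager handles `item` (illustrative only).

    primary_of: item -> lock control center assigned to that item
    stored_at:  item -> site where the item is stored
    """
    if scheme == "centralized":
        # One central lock manager handles every item.
        return central_site
    if scheme == "primary_copy":
        # Each lock control center manages a predefined set of items.
        return primary_of[item]
    if scheme == "distributed":
        # Each site manages the locks of the items it stores locally.
        return stored_at[item]
    raise ValueError(f"unknown scheme: {scheme}")


primary_of = {"x": "site-1", "y": "site-2"}
stored_at = {"x": "site-3", "y": "site-1"}

print(lock_manager_for("x", "centralized"))                        # site-0
print(lock_manager_for("x", "primary_copy", primary_of=primary_of))  # site-1
print(lock_manager_for("x", "distributed", stored_at=stored_at))     # site-3
```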

Apart from the Locking Algorithms in the Distributed System, we have other methods for Concurrency Control which are described below:

Distributed Timestamp Concurrency Control in Database

In Centralized Systems, a single local clock is enough to generate timestamps, but in distributed systems each site has its own clock, and these clocks may show different times. So, a timestamp is formed by combining the site's local clock reading with the site id, which keeps timestamps unique across sites. A transaction manager then uses a queue to provide resources on a first come, first served basis: each transaction request is placed in the queue in increasing order of its timestamp.
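A timestamp built from a local clock reading and a site id can be sketched as a tuple, with the queue kept in timestamp order by a heap. The names `make_timestamp` and `TransactionQueue` are hypothetical, and comparing the clock reading first with the site id as a tie-breaker is an assumption made for this example.

```python
import heapq


def make_timestamp(local_clock, site_id):
    """Combine a local clock reading with the site id (illustrative only).

    Tuples compare element by element, so the clock reading decides the
    order and the site id breaks ties between sites with equal readings.
    """
    return (local_clock, site_id)


class TransactionQueue:
    """Hypothetical queue that serves requests in increasing timestamp order."""

    def __init__(self):
        self._heap = []

    def submit(self, timestamp, txn):
        heapq.heappush(self._heap, (timestamp, txn))

    def next(self):
        # Pop the request with the smallest timestamp.
        return heapq.heappop(self._heap)[1]


queue = TransactionQueue()
queue.submit(make_timestamp(5, 2), "T1")
queue.submit(make_timestamp(5, 1), "T2")  # same clock reading, lower site id
queue.submit(make_timestamp(3, 9), "T3")  # earliest clock reading

print([queue.next() for _ in range(3)])  # ['T3', 'T2', 'T1']
```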

Distributed Optimistic Concurrency Control Algorithm

In this algorithm, a transaction is first validated and then committed. The Distributed Optimistic Concurrency Control Algorithm has two rules, or phases, which are described below:

  • Rule 1: In the first phase, every transaction is validated locally at the site where it executes. If the transaction is not valid, it is aborted; if it is valid, it is passed on for global validation. Local validation ensures serializability at the site where the transaction is executed.
  • Rule 2: When a transaction passes local validation, it enters global validation. In this validation scheme, if two conflicting transactions are running together at different sites, they are synchronized until both commit. One transaction may therefore have to wait for another to maintain this synchronous behavior. As a result, the algorithm is not purely optimistic, because transactions may have to wait before committing.
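The two rules can be illustrated with a simplified sketch. The text above does not say how validity is checked, so the test used here is an assumption: a transaction fails local validation if it read an item that another transaction committed a write to while it was running. Global validation is reduced to requiring every participating site to have validated the transaction locally; the waiting and synchronization between conflicting transactions is not modeled. The names `validate_locally` and `can_commit` are hypothetical.

```python
def validate_locally(read_set, committed_write_sets):
    """Rule 1 (assumed check): valid if nothing this transaction read was
    overwritten by a transaction that committed while it was running."""
    return all(read_set.isdisjoint(ws) for ws in committed_write_sets)


def can_commit(local_results):
    """Rule 2 (simplified): commit only if every participating site
    validated the transaction locally. local_results maps site -> bool."""
    return all(local_results.values())


# Transaction T read x and y; at site A a concurrent commit wrote z,
# at site B a concurrent commit wrote y.
site_a = validate_locally({"x", "y"}, [{"z"}])  # True, no overlap
site_b = validate_locally({"x", "y"}, [{"y"}])  # False, y was overwritten

print(can_commit({"A": site_a, "B": site_b}))  # False
```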




