Why is recovery needed in DBMS?

Basically, the operating system is in charge of ensuring or verifying that all operations that must be carried out as part of a transaction have been completed successfully and that their effects are either recorded in the database or the transaction has no impact on the database or any other transactions whenever a transaction is submitted to a DBMS for execution. Hence to prevent data loss recovery approaches based on postponed updating and immediate update or backing up data can be implemented. The availability of a unique file called a system log is crucial for recovery methods. It includes details about each transaction's beginning, middle, and end as well as any changes that take place throughout the transaction. Every transaction operation that modifies the values of database objects is recorded in the log. To recover from transaction failure, you need to know this information.

Although database systems might fail, just like any other type of computer system, the information contained inside must always be accessible. A database must have the tools necessary for quick recovery in the event of failure. It must also be atomic, meaning that either the transaction has no impact on the database or it is properly finished and committed (the effect is permanently stored in the database). Both data backup and recovery from failure conditions can be done in automatic and non-automatic methods. Database recovery procedures are methods for recovering data that has been lost as a result of system failures, transaction errors, infections, catastrophic failure, improper command execution, etc.

The DBMS cannot allow certain transaction T operations to be applied to the database while leaving other T operations unapplied. This essentially may occur if a transaction fails after performing some but not all of its processes.

Types of Failure

Failure of a database is the inability of the database to carry out the requested transaction or the loss of database data. A DBMS is susceptible to several failures, and each of these faults requires a unique management strategy. Database failures can occur for a variety of causes, including network problems, system crashes, natural catastrophes, negligence, sabotage (intended data corruption), software faults, etc.

Transaction failure

Failure in a transaction is defined as when a transaction is unable to complete or reaches a point where it can no longer be completed.

Cause of a DBMS transaction failure:

Logical error: A logical error happens when a transaction cannot be carried out owing to a bug in the code or because of an internal issue.
System error: When a database system terminates an active transaction on its own because of a system problem or because the database management system is unable to complete the transaction. For instance, if a deadlock condition or resource shortage occurs, the system terminates an operational transaction.

System Failure

A hardware or software failure is the most common reason for a system crash. Other issues that come from outside the system and cause it to cease working suddenly or finally crash include operating system faults, power outages, main memory crashes, and transaction failures.

These failures, sometimes known as soft failures, are what cause the data losses in volatile memory. The fail-stop assumption, which is made, assumes that a system crash has no impact on the data kept on non-volatile storage.

Data-transfer Failure

Data-transfer failures are defined as disc failures that occur during data transfer operations and cause the loss of material from disc storage. Disk head crashes, disc unreachability, the development of faulty sectors, read-write problems on the disc, etc. are a few other causes of disc failures.

The backup copy of the data kept on other tapes or discs can be utilized to swiftly recover from a disc failure that occurred in the middle of a data transfer process. Thus regularly backing up your data is a smart idea.

System Error

This sort of error is caused by an operation that is carried out as part of the transaction, such as divide by zero or an integer. This kind of failure is sometimes referred to as a transaction, and it can happen as a result of incorrect parameter values or logical programming mistakes. Moreover, a user may stop the execution in progress, which might cause the transaction to fail.

Local Error

When we do a transaction but certain circumstances arise that can cause the transaction to be cancelled, this basically occurs. In essence, this kind of error falls under the category of local error. An easy illustration of this is the possibility that the transaction's data won't be located. When we attempt to debit funds from an account with an inadequate balance, our request or transaction is cancelled. And in order for this exception to not be seen as a failure, it should be coded into the transaction itself.

Enforcement of concurrency control

Because the transaction essentially violates serializability, the concurrency control method may opt to abort it, to restart it, or we may argue that numerous processes are in a deadlock.

Disk failure

This kind of failure essentially happens when a disc loses its data as a result of a read or writes error or as a result of a read/write head crash. This could take place when the transaction is being read or written to.

Catastrophes

Physical issues are another name for catastrophes. It generally refers to the seemingly limitless list of issues, which includes theft, sabotage, and fire, overwriting discs or tapes unintentionally, and mounting the incorrect tape by the operator.

Data recovery techniques in DBMS are used to recover the data in such situations of system failure. Even if the database system fails, the data in the database must be recoverable to the most recent state prior to the loss of the system. The atomicity and durability of the database are maintained through recovery methods in DBMS. If a system crashes during a transaction and loses all of its data, it is not considered durable, and if just a portion of the data is updated during the transaction, it is not considered atomic. The data recovery strategies in DBMS ensure that the data is always recoverable to safeguard the durability and that its state is retained to safeguard its atomic property.

Log-based recovery

Each DBMS has its own system logs, which keep track of all system activities and include timestamps for the events as they happened. Several log files are handled by databases for database activity like failures, queries, and other changes. The log is kept in the following file formats:

The structure [start_transaction, T] denotes the start of execution of transaction T.
[write_item, T, X, old_value, new_value] shows that the value of the variable, X is changed from old_value to new_value by the transaction T.
[read_item, T, X] represents that the value of X is read by the transaction T.
[commit, T] indicates the changes in the data are stored in the database through a commit and can't be further modified by the transaction. There will be no error after a commit has been made to the database.
[abort, T] is used to show that the transaction, T is aborted.

These logs allow us to track how the status of the data changed throughout a transaction and restore it to either its original state or a new one. To investigate the [write item, T, X, old value, new value] action and obtain the data's previous state, utilize the undo operation. Only if the [commit, T] action is carried out is it feasible to undertake a redo operation to restore data that was lost due to a system failure to its new state.

Conceded Update Method

The adjustments to the data are not done in the approved update method until the transaction has finished, or at the commit operating. The data is altered and then permanently saved in the main memory following this procedure. The logs are kept up-to-date throughout the process and are utilized to identify the failure point in the event of a failure. This gives us an advantage since, even if the system crashes before the commit step, the status will still be handled and the database's contents will not be changed. Unlike the method required for an undo operation, if the system fails after the commit stage, we can quickly reapply the modifications to the new stage.

Many databases have automated logging setup, but we may also manually configure it. To configure logging into a MySQL database, do the following steps in the MySQL terminal:

Make a variable to hold the location of the.log file where the logs should be saved.

Format the log file.

The database's general logging option has to be activated.

The system now keeps track of every database activity and logs it in the general.log file. The command below may be used to check the configuration for this, which is kept in the general log file variable:

The data is updated simultaneously in the rapid update technique before the transaction enters the commit step. As soon as modifications are made to the data, the logs are also kept track of them. If a transaction fails, the data can still be in progress, but undo operations can be used to recover the data. Moreover, we may use SQL instructions to indicate the transaction's status and restore our data to that state. To do so, use the instructions listed below:

The current state of the data in a transaction is saved using the SAVEPOINT command. This command's syntax is,

To return the status of the data to the save point given by the command, use the ROLLBACK command. The command's syntax is,

Quick Update Technique

The current state of the data in a transaction is saved using the SAVEPOINT command. This command's syntax is,

To return the status of the data to the save point given by the command, use the ROLLBACK command. The command's syntax is

What distinguishes an immediate update from a deferred update?

The transaction log files of the DBMS are maintained using two database recovery techniques: deferred updates and immediate updates.
With the Delayed update, the database's state of the data is not updated right once after any transaction is completed; rather, it is altered once the commit has been performed and the changes have been logged in the log file.
With the Instant update, the database is directly updated after each transaction, and a log file is also kept that contains the old and new information.

Immediate Update	Deferred Update
Changes made to the data during a transaction are not instantly implemented.	As soon as the transaction takes place, the database is immediately updated.
The modifications that will be implemented are stored in the log file.	The modifications, together with the new and old values, are stored in the log file.
With this strategy, buffering and caching are employed.	This approach makes advantage of shadow paging.
When a system malfunctions, it takes longer to restore the data.	To handle the logs during the transaction, several I/O activities are made.
If a rollback is performed, the log files are deleted and the database is left unchanged.	If a rollback is performed, the records in the log file are used to restore the data to its previous state.

What backup strategies are there?

A backup is a copy of the database's current state that is kept in a different place. When the system is damaged or destroyed due to a natural disaster, this backup is helpful. The database may be restored to the previous backup state using these backups. The following are some of the several backup techniques used:

Copies that are stored on hard drives or other devices are considered an immediate backup. We can utilize this data to recover the data in the event of a disc crash or any other technical error.
A duplicate of the database is preserved on big storage systems or in cloud settings for archival backups. When a natural calamity damages the system, they are employed to restore the data.

How do Transaction Logs work?

The transactions that have updated the data in the database are all recorded in the transaction logs. To retrieve the data using transaction logs, perform these steps:

All of the log files are searched by the recovery manager to identify transactions that have a start transaction stage but no commit stage.
With the aid of the logs and the rollback command, the transactions in the aforementioned example are rolled back to the previous state.
Changes to the database are made through transactions that include the commit command, and these changes are documented in the logs. The undo function will also be used to undo these modifications.

Shadow Paging

A database is divided into n- different pages in shadow paging, each of which corresponds to a fixed-size disc memory.
Similar to that, a number of shadow pages that are replicas of the actual database are also produced.
The state of the database is transferred to the shadow pages at the start of a transaction.
Only the real database will be changed during the transaction; the shadow pages won't.
The updates are made to the shadow pages after the transaction has reached the commit step. The adjustments are made in such a way that if the i-th sector of the hard drive is altered, the i-th shadow page is likewise altered.
The database's genuine pages and shadow pages are compared in the event of a system failure, and recovery activities are carried out.
A group of buffers referred to as DBMS buffers are present in the logical memory when using the Caching/Buffering approach. All process logs are stored in buffers, and when the transaction reaches the commit step, the main log file is updated.

Conclusion

Failures in a database system might be brought about by issues with drives, systems, or transactions.
When the new state of the data is lost or all of the data in the system is lost, recovery procedures in DBMS are employed to recover the data.
The operations in a database are recorded in logs, which are then utilized to retrieve the data from the database.
In general, start transaction and commit are the first and last words in a log entry, respectively.
The data is updated at a later or quicker time using various transaction update methods including postponed and instant update. To recover data, these modifications are treated differently.
A checkpoint is used to save all of the database's data in a permanent state that may be changed at any moment.
All database data is recorded using the SAVEPOINT command, and the database is reset to the saved point using the ROLLBACK TO command.
In order to retrieve data in the event of a natural catastrophe, backups are employed to keep copies of the data.
To restore the data, the shadow paging approach compares two copies of the disc memory.

These are a few examples of backup methods:

Backup of the entire database - This backup of the entire database, including the data and the database, includes full-text catalogues and backs up the Meta data necessary to recover the entire database in a preset time series.

Differential backup: It only saves the data that has changed since the most recent complete database backup. The most current version of the altered data is stored in a differential backup when some data has changed several times since the last complete database backup. We need to restore a complete database backup for this first.

Backups of all database activities, including a record of each and every statement run, are made in the transaction log. It contains all transactions that had occurred to the database and is a backup of transaction log entries. This allows the database to be restored to a certain point in time. If the data files are lost but not a single committed transaction, it is also feasible to perform a backup using a transaction log.

Next Topic#

← prev next →