Redundancy in DBMS

In this article, we will learn about redundancy in DBMS. First, let us understand data redundancy.

Data redundancy means the occurrence of duplicate copies of the same data. It may be done intentionally, to keep the same piece of data in different places, or it may occur accidentally.

What is data redundancy in a database management system?

In a DBMS, data redundancy occurs when the same data is stored in more than one place, for example in different tables or in several rows of one table.

Sometimes, redundancy is introduced on purpose, for backup and recovery of data or for faster access to data. Even then, redundant data costs extra money, demands higher storage capacity, and requires extra effort to keep every copy up to date.

Sometimes, unintentional duplication of data keeps the database from working properly, or makes it harder for the end user to access data. Redundant data also occupies space unnecessarily by storing identical copies, and the resulting space constraints are one of the major problems.

Let us understand redundancy in DBMS properly with the help of an example.

Student_id	Name	Course	Session	Fee	Department
101	Devi	B. Tech	2022	90,000	CS
102	Sona	B. Tech	2022	90,000	CS
103	Varun	B. Tech	2022	90,000	CS
104	Satish	B. Tech	2022	90,000	CS
105	Amisha	B. Tech	2022	90,000	CS

In the above example, there is a "Student" table that contains the columns "Student_id", "Name", "Course", "Session", "Fee", and "Department". As you can see, the values of "Course", "Session", "Fee", and "Department" are repeated in every row, which causes redundancy.
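The repetition can also be checked mechanically. The following is a minimal Python sketch, not part of any DBMS: the helper name `repeated_columns` is hypothetical, and the fee is stored as the plain integer 90000 for simplicity. It reports every column in which one value is stored more than once.

```python
from collections import Counter

# Rows copied from the Student table above (fee written as an integer).
COLUMNS = ["Student_id", "Name", "Course", "Session", "Fee", "Department"]
ROWS = [
    (101, "Devi", "B. Tech", 2022, 90000, "CS"),
    (102, "Sona", "B. Tech", 2022, 90000, "CS"),
    (103, "Varun", "B. Tech", 2022, 90000, "CS"),
    (104, "Satish", "B. Tech", 2022, 90000, "CS"),
    (105, "Amisha", "B. Tech", 2022, 90000, "CS"),
]


def repeated_columns(columns, rows):
    """Map each column that holds a repeated value to (value, times stored)."""
    result = {}
    for index, name in enumerate(columns):
        counts = Counter(row[index] for row in rows)
        value, count = counts.most_common(1)[0]
        if count > 1:
            result[name] = (value, count)
    return result


REPEATED = repeated_columns(COLUMNS, ROWS)
for name, (value, count) in REPEATED.items():
    print(f"{name}: {value!r} is stored {count} times")
```

Only "Student_id" and "Name" hold a different value in every row; the other four columns carry the same value five times.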

Problems caused by redundancy in the database

Redundancy in DBMS gives rise to anomalies, which we will study next. In a database management system, anomalies are the problems that occur when inserting, deleting, or updating data in the database.

We will understand these anomalies with the help of the following student table:

student_id	student_name	student_age	dept_id	dept_name	dept_head
1	Shiva	19	104	Information Technology	Jaspreet Kaur
2	Khushi	18	102	Electronics	Avni Singh
3	Harsh	19	104	Information Technology	Jaspreet Kaur

1. Insertion Anomaly:

An insertion anomaly arises when you try to insert some data into the database but cannot, because other data that it is stored with is missing.

Example: If you want to add a new student to the above table, you must also know the details of the student's department; otherwise, you will not be able to add the row, because student details and department details are stored together.
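The insertion anomaly can be reproduced with Python's built-in sqlite3 module. This is a sketch under assumptions: the article does not give a schema, so the NOT NULL constraints on the department columns are our own choice, and the student "Meera" is a made-up row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: every row must carry its department details.
conn.execute("""
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL,
        student_age  INTEGER,
        dept_id      INTEGER NOT NULL,
        dept_name    TEXT NOT NULL,
        dept_head    TEXT NOT NULL
    )
""")


def try_insert_without_department(connection):
    """Try to add a student whose department is not known yet."""
    try:
        connection.execute(
            "INSERT INTO student (student_id, student_name, student_age) "
            "VALUES (?, ?, ?)",
            (4, "Meera", 18),  # hypothetical student, not from the article
        )
        return True
    except sqlite3.IntegrityError as error:
        print("Insert rejected:", error)
        return False


INSERTED = try_insert_without_department(conn)
ROW_COUNT = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Under these assumptions the database rejects the row, so the student cannot be recorded until a department is chosen.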

2. Deletion Anomaly:

A deletion anomaly arises when you delete some data from the database and some unrelated data is deleted along with it; that is, data is lost unintentionally.

Example: If we delete the student with student_id 2 from the above table, we also lose unrelated data: the only record of the department with dept_id 102 (Electronics, headed by Avni Singh).
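A short sqlite3 sketch makes the loss visible. The rows come from the student table above; the helper name `known_departments` is hypothetical and simply lists the departments the table still knows about.

```python
import sqlite3

ROWS = [
    (1, "Shiva", 19, 104, "Information Technology", "Jaspreet Kaur"),
    (2, "Khushi", 18, 102, "Electronics", "Avni Singh"),
    (3, "Harsh", 19, 104, "Information Technology", "Jaspreet Kaur"),
]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY, student_name TEXT, student_age INTEGER,
        dept_id INTEGER, dept_name TEXT, dept_head TEXT
    )
""")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?, ?)", ROWS)


def known_departments(connection):
    """Return the dept_id values that appear in at least one row."""
    cursor = connection.execute(
        "SELECT DISTINCT dept_id FROM student ORDER BY dept_id"
    )
    return [row[0] for row in cursor]


BEFORE = known_departments(conn)
# Removing one student also removes the last trace of department 102.
conn.execute("DELETE FROM student WHERE student_id = 2")
AFTER = known_departments(conn)
print("before:", BEFORE, "after:", AFTER)
```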

3. Update Anomaly:

An update anomaly arises when you update some data in the database, but the data is partially updated, which causes data inconsistency.

Example: If we want to change the dept_head for dept_id 104 from Jaspreet Kaur to Ankit Goyal, we have to update it in every row where it appears (student_id 1 and 3 in the above table); otherwise, the data is only partially updated, which causes data inconsistency.
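The following sqlite3 sketch illustrates the update anomaly on the same rows. It first changes only one row, as a careless update might, and then changes every row of the department. The helper name `heads_of` is hypothetical.

```python
import sqlite3

ROWS = [
    (1, "Shiva", 19, 104, "Information Technology", "Jaspreet Kaur"),
    (2, "Khushi", 18, 102, "Electronics", "Avni Singh"),
    (3, "Harsh", 19, 104, "Information Technology", "Jaspreet Kaur"),
]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY, student_name TEXT, student_age INTEGER,
        dept_id INTEGER, dept_name TEXT, dept_head TEXT
    )
""")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?, ?)", ROWS)


def heads_of(connection, dept_id):
    """Return the distinct dept_head values stored for one department."""
    cursor = connection.execute(
        "SELECT DISTINCT dept_head FROM student WHERE dept_id = ?", (dept_id,)
    )
    return sorted(row[0] for row in cursor)


# Partial update: only one of the two rows for department 104 is changed.
conn.execute("UPDATE student SET dept_head = ? WHERE student_id = ?",
             ("Ankit Goyal", 1))
AFTER_PARTIAL = heads_of(conn, 104)

# Complete update: every row of department 104 is changed.
conn.execute("UPDATE student SET dept_head = ? WHERE dept_id = ?",
             ("Ankit Goyal", 104))
AFTER_FULL = heads_of(conn, 104)
print(AFTER_PARTIAL, AFTER_FULL)
```

After the partial update the table names two different heads for the same department, which is exactly the inconsistency described above.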

Advantages of data redundancy in DBMS

Provides Data Security: Data redundancy can enhance data security because data kept in several locations is harder to destroy in a single attack or failure, provided that every copy is itself protected.
Provides Data Reliability: Redundant copies improve reliability because organizations can compare them to check and confirm whether data is correct.
Creates Data Backup: Redundant copies serve as a backup from which data can be restored if the original is lost.

Disadvantages of data redundancy in DBMS

Data corruption: When the same data is stored in several places, the copies can fall out of sync, which raises the chance of inconsistent or corrupted data.
Wastage of storage: Redundant data occupies storage space that a single copy would not need.
High cost: Large storage is required to store and maintain redundant data, which is costly.

How to reduce data redundancy in DBMS

We can reduce data redundancy using the following methods:

Database Normalization: Normalization breaks the data down into pieces; that is, a large table is divided into two or more smaller tables so that repeated data is stored in only one place. Normalization removes the insert anomaly, update anomaly, and delete anomaly.
Deleting Unused Data: Data that is no longer needed, such as copies left behind in an old location, adds to data redundancy in the DBMS. It is a good practice to remove such unwanted data to reduce redundancy.
Master Data: The data administrator shares master data across multiple systems. Although this does not remove data redundancy, it ensures that the redundant data is updated whenever the master data changes.
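To illustrate the normalization method from the list above, here is a minimal sqlite3 sketch that splits the earlier student table into two tables. The split into `student` and `department` and the foreign key are our assumptions about how one might normalize it; the article does not prescribe a particular design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only if enabled
conn.executescript("""
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL,
        dept_head TEXT NOT NULL
    );
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL,
        student_age  INTEGER,
        dept_id      INTEGER REFERENCES department(dept_id)
    );
""")
# Each department is now stored exactly once.
conn.executemany("INSERT INTO department VALUES (?, ?, ?)", [
    (102, "Electronics", "Avni Singh"),
    (104, "Information Technology", "Jaspreet Kaur"),
])
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", [
    (1, "Shiva", 19, 104),
    (2, "Khushi", 18, 102),
    (3, "Harsh", 19, 104),
])

# Update: the department head changes in a single row.
conn.execute("UPDATE department SET dept_head = ? WHERE dept_id = ?",
             ("Ankit Goyal", 104))
# Delete: removing student 2 no longer removes department 102.
conn.execute("DELETE FROM student WHERE student_id = 2")

HEADS_104 = [row[0] for row in conn.execute(
    "SELECT DISTINCT d.dept_head FROM student s "
    "JOIN department d ON d.dept_id = s.dept_id WHERE d.dept_id = 104"
)]
DEPARTMENTS = [row[0] for row in conn.execute(
    "SELECT dept_id FROM department ORDER BY dept_id"
)]
print(HEADS_104, DEPARTMENTS)
```

In this design a new department can also be added before it has any students, and a student can be added with only a dept_id, so the insertion example from earlier no longer applies.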

Conclusion:

You have read this article about data redundancy in database management systems. You have understood that data redundancy refers to the repetition of the same data, which may be intentional or accidental.
You have studied the problems caused by data redundancy, such as delete anomaly, insert anomaly, and update anomaly.
You have studied the advantages and disadvantages of data redundancy in DBMS.
You have studied some of the methods which reduce data redundancy in DBMS.
