First Normal Form (1NF)
Example: Relation EMPLOYEE is not in 1NF because of multi-valued attribute EMP_PHONE. EMPLOYEE table:
The above EMPLOYEE table is an unnormalized relation as it contains multiple values corresponding to EMP_PHONE attribute i.e. these values are non-atomic. So relations with multi value entries are called unnormalized relations. To overcome this problem, we have to eliminate the non atomic values of EMP_PHONE attribute. The decomposition of the EMPLOYEE table into 1NF has been shown below:
There are 3 ways to achieve first normal form. Method 1: To remove the repeating values for a column, the EMPLOYEE table was converted to a flat relation EMPLOYEE_1 table by repeating the pair (EMP_ID, EMP_NAME) for every entry in the table. Now the new relation does not contain any non-atomic values so the table is said to be normalized and is in First Normal Form. Method 2: Another method is to remove the attributes that violate 1NF and place it in a separate relation along with primary key. So the unnormalized relation, EMPLOYEE table is decomposed into two sub-relations EMP_DETAILS and EMP_PERFORMANCE EMP_DETAILS
EMP_PERFORMANCE
The main idea of decomposing the relations is to keep the different types of information in their separate relation as first normal form disallows multivalve attribute that are composite in nature. In the EMP_DETAILS relation, the attribute (EMP_ID) acts as a primary key and in the EMP_PERFORMANCE relation the attributes (EMP_ID, EMP_PHONE) act as a primary key. Now it satisfies both the conditions for a relation to be in 1NF. The relation is decomposed according to the following rules:
Method 3: The third method of normalizing a unnormalized relation into 1NF will be explained with following example where skills of an employee of some company are fixed. Suppose an employee can have maximum of five skills. EMP_SKILL relation
Here the EMP_SKILL relation is not 1NF as the skill attribute contains a set of values. So to remove this problem, we define multiple Skill columns as shown. The above relation is decomposed into 1NF in the following example.
The above representation is in 1NF but this technique is not preferred as it may cause problems such as:
To sum up, all the three approaches are correct because they transform any unnormalized table into a first normal form table. However, the second approach where table is decomposed into relations is more efficient as minimizes the duplicacy of the data. So for a relation to be in first normal form, each set of repeating groups should appear in its own table and every relation should have a primary key. Anomalies in first normal form Whereas the first normal form was concerned with the structure of the representation of relation, the second normal form is concerned with the eliminating redundancy in these relations. The various anomalies can be divided as:
These anomalies have got their name from the relational operations they perform on a relation. Let us take a following example of ORDER_BOOK relation:
Insertion anomaly: Suppose that we want to insert information about a new book into the ORDER_BOOK relation. But we cannot insert this information until some order is placed for it because in this relation the primary key is composed of two attributes Order_No and B_Name which are called composite keys. So neither the Order_No nor B_Name can contain null values because it is against the principle of entity integrity rule. So we cannot insert the information of a new book whose order has not been placed yet because in that case, the attribute Order_No will contain null value which is against the entity integrity rule i.e. primary key cannot null values. This is shown in following figure:
Relations which exhibit such kind of undesirable property are said to suffer from insertion anomaly. Deletion anomaly: Suppose that an order whose order number is 4154 is cancelled due to certain reasons. Therefore, we would have to delete this order information from the ORDER_BOOK relation. As we can see from the relation that this particular order contain information about the book whose name is “IT”. So on deletion of this record from the relation would result in loss of information about the “IT” book. This may lead to loss of vital information as it is the only record which contains information about the book “IT”. But if we try to remove any other record from the relation then it would cause no problem as it still contains information of the book in other record. For Example: Deleting record whose Order_No = 4154 as shown in following figure:
Relations which exhibit such kind of undesirable property are said to suffer from deletion anomaly. Updation Anomaly: Modifying some values in the relations may also prove cumbersome. Suppose that if the price of the book C is modified to 190 then every tuple referring to this book have to be updated and multiple updating always carries some risk of inconsistencies. In the ORDER_BOOK relation, the updation seems to be very easy because it contains only two tuples having B_price as 175. But if in case relation has thousands of tuples containing a large number of redundant data, the updations may lead to inconsistency as humans are prone to errors. Relations which exhibit such kind of undesirable property are said to suffer from updation anomaly. The above considerations leads us to a conclusion that relations in 1NF have undesirable data manipulation properties hence bringing relation to 1NF would not terminate logical database design. Further transformations are needed to eliminate this kind of anomalies form a set of original relations. So this bring us the concept of second normal form. Next TopicDBMS 2NF |