First Normal Form (1NF)

  • A relation will be 1NF if it contains an atomic value.
  • It states that an attribute of a table cannot hold multiple values. It must hold only single-valued attribute.
  • First normal form disallows the multi-valued attribute, composite attribute, and their combinations.

Example: Relation EMPLOYEE is not in 1NF because of multi-valued attribute EMP_PHONE.

EMPLOYEE table:

EMP_IDEMP_NAMEEMP_PHONEEMP_STATE
14John7272826385,
9064738238
UP
20Harry8574783832Bihar
12Sam7390372389,
8589830302
Punjab

The above EMPLOYEE table is an unnormalized relation as it contains multiple values corresponding to EMP_PHONE attribute i.e. these values are non-atomic. So relations with multi value entries are called unnormalized relations.

To overcome this problem, we have to eliminate the non atomic values of EMP_PHONE attribute.

The decomposition of the EMPLOYEE table into 1NF has been shown below:

EMP_IDEMP_NAMEEMP_PHONEEMP_STATE
14John7272826385UP
14John9064738238UP
20Harry8574783832Bihar
12Sam7390372389Punjab
12Sam8589830302Punjab

There are 3 ways to achieve first normal form.

Method 1:

To remove the repeating values for a column, the EMPLOYEE table was converted to a flat relation EMPLOYEE_1 table by repeating the pair (EMP_ID, EMP_NAME) for every entry in the table. Now the new relation does not contain any non-atomic values so the table is said to be normalized and is in First Normal Form.

Method 2:

Another method is to remove the attributes that violate 1NF and place it in a separate relation along with primary key. So the unnormalized relation, EMPLOYEE table is decomposed into two sub-relations EMP_DETAILS and EMP_PERFORMANCE

EMP_DETAILS

EMP_IDEMP_NAME
14John
20Harry
12Sam

EMP_PERFORMANCE

EMP_IDEMP_PHONEEMP_STATE
147272826385UP
149064738238UP
208574783832Bihar
127390372389Punjab
128589830302Punjab

The main idea of decomposing the relations is to keep the different types of information in their separate relation as first normal form disallows multivalve attribute that are composite in nature. In the EMP_DETAILS relation, the attribute (EMP_ID) acts as a primary key and in the EMP_PERFORMANCE relation the attributes (EMP_ID, EMP_PHONE) act as a primary key. Now it satisfies both the conditions for a relation to be in 1NF.

The relation is decomposed according to the following rules:

  • One relation consists of the primary key (EMP_ID) of the original relation (i.e. EMPLOYEE) and non repeating attributes of the original relation (i.e. EMP_NAME).
  • The other relation consists of copy of the primary key of the original relation and all the repeating attributes of the original relation.

Method 3:

The third method of normalizing a unnormalized relation into 1NF will be explained with following example where skills of an employee of some company are fixed. Suppose an employee can have maximum of five skills.

EMP_SKILL relation

EMP_IDSkill
14DBMS, C, C++
20JAVA, C
12DBMS, HTML, VB, MS OFFICE

Here the EMP_SKILL relation is not 1NF as the skill attribute contains a set of values. So to remove this problem, we define multiple Skill columns as shown.

The above relation is decomposed into 1NF in the following example.

EMP_IDSkil_1Skill_2Skill_3Skill_4Skill_5
14DBMSCC++--
20JAVAC---
12DBMSHTMLVBMS OFFICE-

The above representation is in 1NF but this technique is not preferred as it may cause problems such as:

  • It would be difficult to query the relation. For Example, it would be difficult to answer the queries like “which employee share a skill?”, “Which employees have skill C?”
  • Restriction of employee skills to 5. If employee with more skills appears, it would be left unrecorded.

To sum up, all the three approaches are correct because they transform any unnormalized table into a first normal form table. However, the second approach where table is decomposed into relations is more efficient as minimizes the duplicacy of the data. So for a relation to be in first normal form, each set of repeating groups should appear in its own table and every relation should have a primary key.

Anomalies in first normal form

Whereas the first normal form was concerned with the structure of the representation of relation, the second normal form is concerned with the eliminating redundancy in these relations.

The various anomalies can be divided as:

  • Insertion anomaly
  • Deletion anomaly
  • Updation anomaly

These anomalies have got their name from the relational operations they perform on a relation.

Let us take a following example of ORDER_BOOK relation:

Order_NoB_NameQuantityPrice
4253C15175
4253Database20225
4154IT30200
4256C50175
4186Database15225

Insertion anomaly: Suppose that we want to insert information about a new book into the ORDER_BOOK relation. But we cannot insert this information until some order is placed for it because in this relation the primary key is composed of two attributes Order_No and B_Name which are called composite keys.

So neither the Order_No nor B_Name can contain null values because it is against the principle of entity integrity rule. So we cannot insert the information of a new book whose order has not been placed yet because in that case, the attribute Order_No will contain null value which is against the entity integrity rule i.e. primary key cannot null values. This is shown in following figure:

Order_NoB_NameQuantityPrice
4253C15175
4253Database20225
4154IT30200
4256C50175
4186Database15225
NULLOperating System30300

Relations which exhibit such kind of undesirable property are said to suffer from insertion anomaly.

Deletion anomaly: Suppose that an order whose order number is 4154 is cancelled due to certain reasons. Therefore, we would have to delete this order information from the ORDER_BOOK relation.

As we can see from the relation that this particular order contain information about the book whose name is “IT”. So on deletion of this record from the relation would result in loss of information about the “IT” book. This may lead to loss of vital information as it is the only record which contains information about the book “IT”. But if we try to remove any other record from the relation then it would cause no problem as it still contains information of the book in other record.

For Example: Deleting record whose Order_No = 4154 as shown in following figure:

Order_NoB_NameQuantityPrice
4253C15175
4253Database20225
4256C50175
4186Database15225
NULLOperating System30300

Relations which exhibit such kind of undesirable property are said to suffer from deletion anomaly.

Updation Anomaly: Modifying some values in the relations may also prove cumbersome. Suppose that if the price of the book C is modified to 190 then every tuple referring to this book have to be updated and multiple updating always carries some risk of inconsistencies.

In the ORDER_BOOK relation, the updation seems to be very easy because it contains only two tuples having B_price as 175. But if in case relation has thousands of tuples containing a large number of redundant data, the updations may lead to inconsistency as humans are prone to errors.

Relations which exhibit such kind of undesirable property are said to suffer from updation anomaly.

The above considerations leads us to a conclusion that relations in 1NF have undesirable data manipulation properties hence bringing relation to 1NF would not terminate logical database design. Further transformations are needed to eliminate this kind of anomalies form a set of original relations. So this bring us the concept of second normal form.


Next TopicDBMS 2NF