Normalization in DBMS

In this course, we will study about normalization and its types: first normal forms, second normal forms, third normal forms, Boyce and Codd normal forms, fourth normal form, fifth normal form and domain key normal form.

Normalization in DBMS

Normalization is a process of organizing the data inside the database in such a way that we can ignore data redundancy, insertion anomaly, update anomaly & deletion anomaly.

It can be seen as a process used for minimizing redundancy from a relation or set of relations.

Redundancy in relation leads to insertion, deletion and update anomalies. To eliminate or reduce redundancy in database tables normal forms are used.

In normalization, we split a large table into smaller units or tables and then define the relationships between them to increase the clarity in data.

Normalization in DBMS

There are various levels of normalization:

  1. First normal form (1NF)
  2. Second normal form (2NF)>
  3. Third normal form (3NF)
  4. Boyce and Codd normal form (BCNF)
  5. Fourth normal form (4NF)
  6. Fifth normal form (5NF)
  7. Domain key normal form (DKNF)

First normal form

Data in the database is stored in the form of tables that contain rows and columns.
As per the rule of the first normal form, an attribute or the column of a table can only hold single values.
It can hold only atomic values. In case if a relation comprises a composite or multi-valued attribute then it is said to violate the rules of first normal form or relation.

Student number Student name Grade Phone number State
1 Anand A 6286436876, 8286876436 Punjab
2 Kamal A+ 9257939700 Andhra Pradesh
3 Mohan B 8736475208 Rajasthan

As we know that there should be only one value in each column. So we need to normalize the table.

Student number Student name Grade Phone number State
1 Anand A 6286436876 Punjab
2 Anand A 8286876436 Punjab
3 Kamal A+ 9257939700 Andhra Pradesh
4 Mohan B 8736475208 Rajasthan

Second normal form

  1. For a relation to be in second normal form it must be in first normal form.
  2. There should be no partial dependency.
  3. Partial dependency means that non-prime attributes can be determined using a proper subset of the primary key.

Consider an example, R(PQRS) in which PQ->R and Q->S
In this example, PQ is the candidate key and therefore are a part of the prime attributes and R and S are the non-prime attributes. S not being dependent on the entire candidate key depends on part of the candidate key and this type of dependency is known as partial dependency.

Now to make it in the second normal form we will break the tables and make new relations out of it. R(PQRS)->R(PQR) and R(QS).Now we have it in second normal form as there is no partial dependency.

Third normal form

  1. For a relation to be in third normal form it must be in the second normal form that means it should be free from partial dependency.
  2. No non-prime attribute should be driving another non-prime attribute.
  3. There should be no transitive dependency.
Transitive dependency: if A->B and B->C are two functional dependencies then A->C is called transitive dependency.

Consider an example, R(PQRS) in which PQ->R and R->S
In this example, PQ is the candidate key and therefore is a part of prime attributes and R and S are non-prime attributes. S is dependent on another non-prime attribute and this type of dependency is known as a transitive dependency.

Now to make it in the third normal form we will break the tables and make new relations out of it. R(PQRS)->R(PQR) and R(RS). Now we have it in the third normal form as there is no transitive dependency.

Boyce Codd normal form

Consider an example, R(PQR) in which PQ->R and R->Q In this example, PQ and PR are the candidate keys and therefore are a part of prime attributes. R is not a super key and hence it is not in BCNF.

Now to make it in the Boyce and Codd normal form we will break the tables and make new relations out of it. R(PQR)->R(PR) and R(RQ) because there should be a common attribute among the decomposed tables and in at least one of the decompositions the common attribute should be candidate key. Now we have it in the Boyce and Codd normal form(BCNF).

Fourth normal form

Fifth normal form

For a relation to be in the fifth normal form it must be in fourth normal form.
When there is no division of tables possible to reduce redundancy.

Domain key normal form

When all the insertion and deletion anomalies are removed then the domain key normal form is attained.