In this course, we will study about normalization and its types: first normal forms, second normal forms, third normal forms, Boyce and Codd normal forms, fourth normal form, fifth normal form and domain key normal form.

Normalization is a process of organizing the data inside the database in such a way that we can ignore data redundancy, insertion anomaly, update anomaly & deletion anomaly.

It can be seen as a process used for minimizing redundancy from a relation or set of relations.

Redundancy in relation leads to insertion, deletion and update anomalies. To eliminate or reduce redundancy in database tables normal forms are used.

In normalization, we split a large table into smaller units or tables and then define the relationships between them to increase the clarity in data.

**There are various levels of normalization:**

**First normal form (1NF)****Second normal form (2NF)**>**Third normal form (3NF)****Boyce and Codd normal form (BCNF)****Fourth normal form (4NF)****Fifth normal form (5NF)****Domain key normal form (DKNF)**

Data in the database is stored in the form of tables that contain rows and columns.

As per the rule of the first normal form, an attribute or the column of a table can only hold single values.

It can hold only atomic values. In case if a relation comprises a composite or multi-valued attribute then it is said to violate the rules of first normal form or relation.

Student number | Student name | Grade | Phone number | State |
---|---|---|---|---|

1 | Anand | A | 6286436876, 8286876436 | Punjab |

2 | Kamal | A+ | 9257939700 | Andhra Pradesh |

3 | Mohan | B | 8736475208 | Rajasthan |

As we know that there should be only one value in each column. So we need to normalize the table.

Student number | Student name | Grade | Phone number | State |
---|---|---|---|---|

1 | Anand | A | 6286436876 | Punjab |

2 | Anand | A | 8286876436 | Punjab |

3 | Kamal | A+ | 9257939700 | Andhra Pradesh |

4 | Mohan | B | 8736475208 | Rajasthan |

- For a relation to be in second normal form it must be in first normal form.
- There should be no partial dependency.
- Partial dependency means that non-prime attributes can be determined using a proper subset of the primary key.

**Consider an example, R(PQRS) in which PQ->R and Q->S**

In this example, PQ is the candidate key and therefore are a part of the prime attributes and R and S are the non-prime attributes. S not being dependent on the entire candidate key depends on part of the candidate key and this type of dependency is known as partial dependency.

Now to make it in the second normal form we will break the tables and make new relations out of it. R(PQRS)->R(PQR) and R(QS).Now we have it in second normal form as there is no partial dependency.

- For a relation to be in third normal form it must be in the second normal form that means it should be free from partial dependency.
- No non-prime attribute should be driving another non-prime attribute.
- There should be no transitive dependency.

**Consider an example, R(PQRS) in which PQ->R and R->S**

In this example, PQ is the candidate key and therefore is a part of prime attributes and R and S are non-prime attributes. S is dependent on another non-prime attribute and this type of dependency is known as a transitive dependency.

Now to make it in the third normal form we will break the tables and make new relations out of it. R(PQRS)->R(PQR) and R(RS). Now we have it in the third normal form as there is no transitive dependency.

- For a relation to be in Boyce Codd normal form it must be in the third normal form that means it should be free from transitive as well as partial dependency.
- For every relation, the left-hand side should be a super key.
- It is free from redundancy.
- Sometimes there can be a loss of functional dependency during decomposition.
- When all the attributes in a relation are prime attributes then it must be in BCNF.
- In every functional dependency if on the left-hand side there is a super key then it is in BCNF.

**Consider an example, R(PQR) in which PQ->R and R->Q**
In this example, PQ and PR are the candidate keys and therefore are a part of prime attributes. R is not a super key and hence it is not in BCNF.

Now to make it in the Boyce and Codd normal form we will break the tables and make new relations out of it. R(PQR)->R(PR) and R(RQ) because there should be a common attribute among the decomposed tables and in at least one of the decompositions the common attribute should be candidate key. Now we have it in the Boyce and Codd normal form(BCNF).

- For a relation to be in the fourth normal form it must be in Boyce Codd normal form.
- A relation should not contain more than one multivalued attribute.
- It eliminates independent many to one relationship between the columns.

For a relation to be in the fifth normal form it must be in fourth normal form.

When there is no division of tables possible to reduce redundancy.

When all the insertion and deletion anomalies are removed then the domain key normal form is attained.