Category: 06. Normalization in Database Designing

https://cdn3d.iconscout.com/3d/premium/thumb/data-trends-3d-icon-png-download-12809862.png

Difference Between 4NF and 5NF

Normalization is an essential part of designing efficient and reliable databases. Most discussions focus on the first three normal forms and Boyce-Codd Normal Form (BCNF), however the journey does not stop there. The Fourth Normal Form (4NF) and the Fifth Normal Form (5NF) take normalization to even higher levels. 4NF and 5NF address specific types of dependencies like multivalued and join dependencies. Read this chapter to understand these advanced normal forms in detail.

What is Fourth Normal Form (4NF)?

The fourth normal form helps in eliminating multivalued dependencies from a database table. A multivalued dependency occurs when two attributes are independent of each other but depend on a third attribute. For instance, if one value in a table implies multiple rows, then there may be redundancy caused by multivalued dependencies.

Rules of 4NF

For a table to satisfy 4NF −

It must already be in Boyce-Codd Normal Form (BCNF).
The table must not have more than one multivalued dependency.

What is Multivalued Dependency?

If A →→ B (A multi-determines B), this means for a single value of A, there can be multiple values of B. But B is independent of any other attribute.

Example: Multivalued Dependency in Action

Consider a table that stores information about a Person, their Mobile Numbers, and their Food Preferences −

Person	Mobile	Food_Likes
Mahesh	9893	Burger
Mahesh	9424	Pizza
Ramesh	9191	Pizza

Here −

Person →→ Mobile
Person →→ Food_Likes

Both Mobile and Food_Likes are independent of each other, however they depend on Person. This results in redundancy because the same Person is repeated unnecessarily for each Mobile and Food_Likes.

Normalizing a Table to 4NF

To bring this table into 4NF, we can separate the dependencies into two tables −

Table 1 − Person-Mobile

Person	Mobile
Mahesh	9893
Mahesh	9424
Ramesh	9191

Table 2 − Person-Food_Likes

Person	Food_Likes
Mahesh	Burger
Mahesh	Pizza
Ramesh	Pizza

By decomposing the table, we eliminate redundancy. And each table is now in 4NF.

What is Fifth Normal Form (5NF)?

After 4NF, we must understand the concept of Fifth Normal Form, which is also known as the Projected Normal Form (PJNF). It addresses the join dependencies.

A join dependency exists when a table can be split into two or more tables. And the original table can be reconstructed by joining them without any data loss. In 5NF, every join dependency must be implied by the table’s candidate keys.

Rules of 5NF

For a table to satisfy 5NF −

It must already be in 4NF.
It must not contain any join dependency that cannot be implied by its candidate keys.

What is Join Dependency?

A join dependency occurs when a table can be decomposed into smaller tables, but joining those smaller tables recreates the original table without any data loss or spurious rows.

Example: Join Dependency in Action

Let us consider a table that stores the data on Agents, the Companies they work with, and the Products they sell.

Agent	Company	Product
A1	PQR	Nut
A1	PQR	Bolt
A1	XYZ	Nut
A1	XYZ	Bolt
A2	PQR	Nut

Here, if an Agent works with a Company and the Company sells a particular Product, then the Agent is assumed to sell that product.

Normalizing a Table to 5NF

To eliminate the join dependency, we decompose the table into three smaller tables −

Table 1 − Agent-Company

Agent	Company
A1	PQR
A1	XYZ
A2	PQR

Table 2 − Company-Product

Company	Product
PQR	Nut
PQR	Bolt
XYZ	Nut
XYZ	Bolt

Table 3 − Agent-Product

Agent	Product
A1	Nut
A1	Bolt
A2	Nut

When these three tables are again joined back using their shared attributes (Company and Product), the original table is reconstructed without any spurious data.

Comparing 4NF and 5NF

The following table highlights how 4NF differs from 5NF −

Aspect	4NF	5NF
Dependency Type	Multivalued Dependency	Join Dependency
Purpose	Eliminates redundancy from multivalued dependencies.	Ensures lossless decomposition of join dependencies.
Example Scenario	Person with multiple phones and food preferences.	Agent, company, and product relationships.

Practical Considerations in Using 4NF and 5NF

Although 4NF and 5NF provide the highest levels of normalization, they are used quite rarely. 4NF and 5NF are not always necessary for every database. because −

Complexity − Splitting the tables into smaller parts can make the database structure more complicated.
Performance − Highly normalized databases might require more joins, which may slow down the query performance.
Use Cases − Applications with simpler data relationships often do not need to go beyond BCNF or 4NF.

However, when it is paramount to maintain accuracy and avoid any sort of data redundancy (while maintaining financial or scientific databases, for example), it becomes a necessity to normalize the tables to 4NF and 5NF.

November 25, 2025

Boyce-Codd Normal Form (BCNF) in DBMS

The Third Normal Form (3NF) eliminates many redundancy issues, but still there are cases where 3NF is not strict enough. In such cases, we have to apply the Boyce-Codd Normal Form (BCNF), which is also known as 3.5NF.

BCNF is a more restrictive version of 3NF that addresses potential anomalies. Read this chapter to understand the basic concepts of BCNF, its rules, and how to apply it in practice.

What is Boyce-Codd Normal Form?

The Boyce-Codd Normal Form (BCNF) is a special case of 3NF. As we know in 3NF, it allows some flexibility with non-prime attributes and functional dependencies. The BCNF tightens the rules. It ensures that every functional dependency in a table adheres to stricter conditions.

Rules of Boyce-Codd Normal Form

For a table to be in BCNF they must follow the following rules −

The table must be in 3NF.
For every functional dependency, the left-hand side (LHS) must be a candidate key or super key.

In 3NF we know that there will be no transitive dependency. Where the RHS can be a prime attribute to satisfy the rule. In BCNF it demands that the LHS always includes a candidate key. This eliminates any chance of redundancy caused by improper dependencies.

Understanding BCNF with an Example

Let us consider the following Student table −

Roll Number	Name	Voter ID	Age
1	John	V001	20
2	Alice	V002	22
3	Bob	V003	21

The table has the following candidate keys and functional dependencies −

Candidate Keys − Roll Number and Voter ID (each uniquely identifies a row).
Functional Dependencies −
- Roll Number → Name
- Roll Number → Voter ID
- Voter ID → Age
- Voter ID → Roll Number

Checking Each Dependency against BCNF

Now let us analyze each functional dependency step by step and check whether the table is BCNF compliant.

Roll Number → Name −

LHS (Roll Number) is a candidate key.
This dependency is valid under BCNF.

Roll Number → Voter ID −

Again, the LHS is a candidate key, so it satisfies BCNF.

Voter ID → Age −

LHS (Voter ID) is also a candidate key.
This is valid under BCNF.

Voter ID → Roll Number −

Here, the LHS is a candidate key (Voter ID), making it compliant with BCNF.

Since all the functional dependencies have a candidate key or super key on the LHS, this table is in BCNF.

Importance of Boyce-Codd Normal Form

BCNF eliminates anomalies that might persist even in 3NF. For instance −

Redundancy − A non-prime attribute depending on something other than the candidate key can lead to duplicate data.
Update Anomalies − Incorrect updates to redundant data might create inconsistencies.
Deletion Anomalies − Removing a record could unintentionally delete important relationships.

BCNF Compared to Other Normal Forms

Let us compare BCNF with the normal forms that we have discussed so far −

First Normal Form (1NF)	Ensures no multi-valued attributes.Every cell must contain atomic values.
Second Normal Form (2NF)	Builds on 1NF.Eliminates partial dependencies.
Third Normal Form (3NF)	Builds on 2NF.Eliminates transitive dependencies.Allows non-prime attributes on the RHS if the LHS is a candidate key or super key.
Boyce-Codd Normal Form (BCNF)	Builds on 3NF.No exceptions: the LHS of every functional dependency must be a candidate key or super key.

Relationship between Normal Forms

The following diagram shows how the different Normal Forms are related −

BCNF is the most restrictive form, sitting inside 3NF. 3NF sits inside 2NF and 2NF sits inside 1NF. This hierarchy means every table in BCNF is also in 3NF. However, not every table in 3NF is necessarily in BCNF.

Another Example: BCNF in Practice

Consider a relation {A, B, C, D} with these functional dependencies −

AB → C
C → D

Step 1: Identify the Candidate Keys

From AB → C, we see AB is a candidate key because it determines C.
From C → D, we can derive AB → D via transitivity, so AB determines all attributes.

Step 2: Check Each Dependency

AB → C −

LHS is a candidate key.
Satisfies BCNF.

C → D −

LHS (C) is not a candidate key.
Violates BCNF because C is a non-prime attribute.

Step 3: Split the Table

To satisfy BCNF, we split the table into two −

Table 1 − {AB → C}

Table 2 − {C → D}

BCNF – Key Points to Note

Make a note of the following key points on BCNF –

Stricter than 3NF − BCNF removes any dependency where the LHS is not a candidate key.
No Exceptions − While 3NF allows non-prime attributes on the RHS, BCNF does not tolerate them unless the LHS is a candidate key.
Requires Splitting − Many tables in 3NF need to be split further to achieve BCNF.

November 25, 2025

Third Normal Form (3NF) in DBMS
Once a table meets the requirements of Second Normal Form (2NF), the next step is to convert it into Third Normal Form (3NF). This is used to eliminate transitive dependency. Read this chapter to get a clear understanding of 3NF and transitive dependency.

What is Third Normal Form (3NF)?

In relational database, the third normal form has two certain conditions −
- The table is already in Second Normal Form (2NF).
- The table does not have any transitive dependency.
What is Transitive Dependency?

A transitive dependency occurs when a non-prime attribute depends on another non-prime attribute rather than directly depending on a candidate key.

To put it simply, in a transitive dependency, one non-prime attribute is indirectly connected to the candidate key through another non-prime attribute. This will may make unnecessary redundancy and inconsistency in the data.

To understand this better, let us elaborate some of the important terms −
- Candidate Key − A minimal set of attributes that uniquely identify each row in the table.
- Prime Attribute − An attribute that is part of a candidate key.
- Non-Prime Attribute − An attribute that is not part of any candidate key.
Example of Transitive Dependency

Consider the Student table with the following data

Roll Number State City
1 Punjab Mohali
2 Punjab Ludhiana
3 Karnataka Bangalore
4 Maharashtra Mumbai

The Functional Dependencies are −
- Roll Number → State
- State → City
In this example, the candidate key is the Roll Number because it uniquely identifies each row. State and City are non-prime attributes.

The Roll Number determines the State. And, the State determines the City. It creates a transitive dependency. Here, the Roll Number indirectly determines the City through the State.

While “Roll Number → State” is valid, this dependency “State → City” violates 3NF because it is a relationship between two non-prime attributes.

How to Eliminate Transitive Dependency?

To remove transitive dependency, we need to split the table into smaller tables or decomposition.

Step 1: Divide the Table

We create two separate tables: one table links the candidate key to the first non-prime attribute (State), and the other table links the first non-prime attribute (State) to the second non-prime attribute (City).

Step 2: Identify the Keys

In the Student Table, the candidate key remains Roll Number. The State is a non-prime attribute. In the State Table, the primary key is State, but we can see there are duplicate States. So, we can add another key attribute called state id, and that can be used inside the Student table as well.

By splitting the table, we can ensure that all non-prime attributes directly depend on the candidate key or are part of a separate table where they follow the same rule.

Another Example of 3NF: Table and Functional Dependencies

Consider a relation {A, B, C, D} with the following dependencies −
- AB → C
- C → D
Step 1: Identify the Candidate Key

To find the candidate key −
- AB → C means AB determines C.
- C → D means C determines D.
From AB, we can easily determine all the attributes: {AB → C → D}. So, the candidate key is {AB}.

Step 2: Classify the Attributes

Next, let’s classify the attributes −
- Prime Attributes − A, B (since they form the candidate key).
- Non-Prime Attributes − C, D.
Step 3: Check for Transitive Dependency
- AB → C: This is valid because the candidate key determines a non-prime attribute.
- C → D: This creates a transitive dependency since:
- D is a non-prime attribute.
- D is determined by another non-prime attribute (C).
Step 4: Eliminate Transitive Dependency

Split the table into two −

Now,
- In the Main Table, AB is the candidate key and C is a non-prime attribute directly dependent on it.
- In the Derived Table, C is the candidate key and D depends directly on it.
How to Check for 3NF Compliance

To determine if a table is in 3NF, follow the steps give below –
- Ensure the Table is in 2NF − No partial dependency should exist.
- Check Functional Dependencies − For each dependency, ensure that either:
  - The left-hand side (LHS) is a candidate key or super key, or
  - The right-hand side (RHS) is a prime attribute.
A quick rule of thumb − if a non-prime attribute depends on another non-prime attribute, it is a transitive dependency. This is violating 3NF.

Practical Steps to Achieve 3NF

Follow the steps given below to turn a table into its Third Normal Form –
- Find the Candidate Keys − Use closure methods to identify all the candidate keys.
- Classify the Attributes − Divide the attributes into prime and non-prime categories.
- Check the Dependencies − If the LHS of a functional dependency is not a candidate key or super key, check if the RHS is a prime attribute.
- Eliminate the Transitive Dependencies − Split the table into smaller ones if necessary.
November 25, 2025
Second Normal Form (2NF) in DBMS
Normal Forms ensure that the data in the tables remain structured and efficient. After achieving First Normal Form (1NF), the next step in normalization is the Second Normal Form (2NF) that helps eliminate certain types of redundancy by addressing partial dependency. Read this chapter to learn in detail what is 2NF and how it is applied.

What is Second Normal Form (2NF)?

The rules for 2NF are straightforward. According to E. F. Codd, the father of relational databases,
- The table must already be in First Normal Form (1NF).
- There should be no partial dependency in the table.
Let’s break down these rules further −
- 1NF Requirement − It means the table should not have multi-valued attributes. Each cell should have one and only one value.
- No Partial Dependency − A partial dependency occurs when a non-prime attribute depends on only a part of a composite candidate key; not only on the entire key.
To understand this better, let us elaborate some of the important terms −
- Candidate Key − A minimal set of attributes that can uniquely identify each row.
- Prime Attributes − Attributes that are part of a candidate key.
- Non-Prime Attributes − Attributes that are not part of any candidate key.
What is Partial Dependency?

A partial dependency exists when a non-prime attribute is dependent on just a part of a composite candidate key. It creates redundancy and anomalies in the database.

Example of Partial Dependency

Let’s consider the following table −

Customer ID Store ID Location
1 1 Delhi
2 1 Delhi
3 2 Bangalore
4 3 Mumbai

In this case,
- Candidate Key − The combination of Customer ID and Store ID uniquely identifies each row.
- Prime Attributes − Customer ID, Store ID
- Non-Prime Attribute − Location
If we look closely, the attribute “Location” is dependent only on “Store ID”. For example, wherever Store ID is 1, the location is Delhi. It violates the second rule of 2NF because Location does not depend on the entire composite key (Customer ID, Store ID).

How to Convert a Table to 2NF?

Whenever a table violates 2NF, the solution is to split it into smaller tables. We call this process “Decomposition“. Let us see how it works with the above example.

Step 1: Divide the Table

We break the table into two smaller tables −
- One table stores the composite key along with its prime attributes.
- Another table stores the partial dependency.
Step 2: Identify the Keys
- In the Customer Table, the candidate key remains Customer ID, Store ID.
- In the Store Table, the primary key is Store ID.
Now, the non-prime attribute Location depends fully on the candidate key of its table (Store ID). This is ensuring that both tables are in 2NF.

General Steps for Achieving 2NF

Given below are the general steps to make any table comply with 2NF −
- Identify the Candidate Key − Use functional dependencies to find all possible candidate keys.
- Classify the Attributes − Prime attributes are part of the candidate key. Non-prime attributes are everything else.
- Check for Partial Dependencies − If any non-prime attribute depends on only part of a composite candidate key. It is a partial dependency.
- Split the Table − Create separate tables to resolve partial dependencies.
Another Example: Functional Dependency

Let us explore a more complex example with functional dependencies.

Table and Functional Dependencies

We have a relation: {A, B, C, D, E, F} with these dependencies −
- C → F
- E → A
- E, C → D
- A → B
Step 1: Identify the Candidate Key

To find the candidate key, we must look at the right-hand side (RHS) of the functional dependencies: {F, A, D, B}. Attributes not on the RHS (E and C) must be part of the candidate key.

Start with E, C −
- E → A
- C → F
- E, C → D
- A → B
Thus, the closure of E, C covers all attributes: {E, C, A, F, D, B}. So, the candidate key is {E, C}.

Step 2: Prime and Non-Prime Attributes
- Prime Attributes − E, C.
- Non-Prime Attributes − A, B, D, F.
Step 3: Check for Partial Dependencies

A partial dependency occurs when −
- The left-hand side (LHS) of a functional dependency is a proper subset of the candidate key.
- The RHS is a non-prime attribute.
In our example,
- C → F: C is a proper subset of {E, C}, and F is non-prime. Partial dependency exists.
- E → A: E is a proper subset of {E, C}, and A is non-prime. Partial dependency exists.
Step 4: Split the Table

In this step, to eliminate partial dependencies, we divide the table into smaller tables −

The derived tables are given below –

Each table satisfies 2NF because all non-prime attributes fully depend on the candidate key of their respective tables.

Key Concepts in 2NF

Following are the key concepts in 2NF –
- Prime Attribute − Part of a candidate key.
- Non-Prime Attribute − Not part of any candidate key.
- Partial Dependency − A non-prime attribute depends on part of a composite candidate key.
- Full Functional Dependency − A non-prime attribute depends on the entire candidate key.
Conclusion

In this chapter, we explained in detail the concept of Second Normal Form (2NF) and how to address partial dependency in relational databases. We started with the basic rules of 2NF, highlighting the importance of achieving 1NF first.

Through detailed examples, we explored the concept of partial dependency and demonstrated how to split tables into smaller ones to remove redundancy. By ensuring 2NF, we make the databases more efficient by reducing the data redundancy and setting the stage for higher levels of normalization.
November 25, 2025
Normalization
Functional Dependency

Functional dependency (FD) is a set of constraints between two attributes in a relation. Functional dependency says that if two tuples have same values for attributes A1, A2,…, An, then those two tuples must have to have same values for attributes B1, B2, …, Bn.

Functional dependency is represented by an arrow sign (→) that is, X→Y, where X functionally determines Y. The left-hand side attributes determine the values of attributes on the right-hand side.

Armstrong’s Axioms

If F is a set of functional dependencies then the closure of F, denoted as F⁺, is the set of all functional dependencies logically implied by F. Armstrong’s Axioms are a set of rules, that when applied repeatedly, generates a closure of functional dependencies.
- Reflexive rule − If alpha is a set of attributes and beta is_subset_of alpha, then alpha holds beta.
- Augmentation rule − If a → b holds and y is attribute set, then ay → by also holds. That is adding attributes in dependencies, does not change the basic dependencies.
- Transitivity rule − Same as transitive rule in algebra, if a → b holds and b → c holds, then a → c also holds. a → b is called as a functionally that determines b.
Trivial Functional Dependency
- Trivial − If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a trivial FD. Trivial FDs always hold.
- Non-trivial − If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial FD.
- Completely non-trivial − If an FD X → Y holds, where x intersect Y = Φ, it is said to be a completely non-trivial FD.
Normalization

If a database design is not perfect, it may contain anomalies, which are like a bad dream for any database administrator. Managing a database with anomalies is next to impossible.
- Update anomalies − If data items are scattered and are not linked to each other properly, then it could lead to strange situations. For example, when we try to update one data item having its copies scattered over several places, a few instances get updated properly while a few others are left with old values. Such instances leave the database in an inconsistent state.
- Deletion anomalies − We tried to delete a record, but parts of it was left undeleted because of unawareness, the data is also saved somewhere else.
- Insert anomalies − We tried to insert data in a record that does not exist at all.
Normalization is a method to remove all these anomalies and bring the database to a consistent state.

First Normal Form

First Normal Form is defined in the definition of relations (tables) itself. This rule defines that all the attributes in a relation must have atomic domains. The values in an atomic domain are indivisible units.

We re-arrange the relation (table) as below, to convert it to First Normal Form.

Each attribute must contain only a single value from its pre-defined domain.

Second Normal Form

Before we learn about the second normal form, we need to understand the following −
- Prime attribute − An attribute, which is a part of the candidate-key, is known as a prime attribute.
- Non-prime attribute − An attribute, which is not a part of the prime-key, is said to be a non-prime attribute.
If we follow second normal form, then every non-prime attribute should be fully functionally dependent on prime key attribute. That is, if X → A holds, then there should not be any proper subset Y of X, for which Y → A also holds true.

We see here in Student_Project relation that the prime key attributes are Stu_ID and Proj_ID. According to the rule, non-key attributes, i.e. Stu_Name and Proj_Name must be dependent upon both and not on any of the prime key attribute individually. But we find that Stu_Name can be identified by Stu_ID and Proj_Name can be identified by Proj_ID independently. This is called partial dependency, which is not allowed in Second Normal Form.

We broke the relation in two as depicted in the above picture. So there exists no partial dependency.

Third Normal Form

For a relation to be in Third Normal Form, it must be in Second Normal form and the following must satisfy −
- No non-prime attribute is transitively dependent on prime key attribute.
- For any non-trivial functional dependency, X → A, then either −
- A is prime attribute.
We find that in the above Student_detail relation, Stu_ID is the key and only prime key attribute. We find that City can be identified by Stu_ID as well as Zip itself. Neither Zip is a superkey nor is City a prime attribute. Additionally, Stu_ID → Zip → City, so there exists transitive dependency.

To bring this relation into third normal form, we break the relation into two relations as follows −

Boyce-Codd Normal Form

Boyce-Codd Normal Form (BCNF) is an extension of Third Normal Form on strict terms. BCNF states that −
- For any non-trivial functional dependency, X → A, X must be a super-key.
In the above image, Stu_ID is the super-key in the relation Student_Detail and Zip is the super-key in the relation ZipCodes. So,

Stu_ID → Stu_Name, Zip

and

Zip → City

Which confirms that both the relations are in BCNF.
November 25, 2025

Roll Number	State	City
1	Punjab	Mohali
2	Punjab	Ludhiana
3	Karnataka	Bangalore
4	Maharashtra	Mumbai

Customer ID	Store ID	Location
1	1	Delhi
2	1	Delhi
3	2	Bangalore
4	3	Mumbai

Category: 06. Normalization in Database Designing

Difference Between 4NF and 5NF

What is Fourth Normal Form (4NF)?

Rules of 4NF

What is Multivalued Dependency?

Example: Multivalued Dependency in Action

Normalizing a Table to 4NF

What is Fifth Normal Form (5NF)?

Rules of 5NF

What is Join Dependency?

Example: Join Dependency in Action

Normalizing a Table to 5NF

Comparing 4NF and 5NF

Practical Considerations in Using 4NF and 5NF

Boyce-Codd Normal Form (BCNF) in DBMS

What is Boyce-Codd Normal Form?

Rules of Boyce-Codd Normal Form

Understanding BCNF with an Example

Checking Each Dependency against BCNF

Importance of Boyce-Codd Normal Form

BCNF Compared to Other Normal Forms

Relationship between Normal Forms

Another Example: BCNF in Practice

Step 1: Identify the Candidate Keys

Step 2: Check Each Dependency

Step 3: Split the Table

BCNF – Key Points to Note

Third Normal Form (3NF) in DBMS

What is Third Normal Form (3NF)?

What is Transitive Dependency?

Example of Transitive Dependency

How to Eliminate Transitive Dependency?

Step 1: Divide the Table

Step 2: Identify the Keys

Another Example of 3NF: Table and Functional Dependencies

Step 1: Identify the Candidate Key

Step 2: Classify the Attributes

Step 3: Check for Transitive Dependency

Step 4: Eliminate Transitive Dependency

How to Check for 3NF Compliance

Practical Steps to Achieve 3NF

Second Normal Form (2NF) in DBMS

What is Second Normal Form (2NF)?

What is Partial Dependency?

Example of Partial Dependency

How to Convert a Table to 2NF?

Step 1: Divide the Table

Step 2: Identify the Keys

General Steps for Achieving 2NF

Another Example: Functional Dependency

Table and Functional Dependencies

Step 1: Identify the Candidate Key

Step 2: Prime and Non-Prime Attributes

Step 3: Check for Partial Dependencies

Step 4: Split the Table

Key Concepts in 2NF

Conclusion

Normalization

Functional Dependency

Armstrong’s Axioms

Trivial Functional Dependency

Normalization

First Normal Form

Second Normal Form

Third Normal Form

Boyce-Codd Normal Form