Difference Between 4NF and 5NF

Normalization is an essential part of designing efficient and reliable databases. Most discussions focus on the first three normal forms and Boyce-Codd Normal Form (BCNF), however the journey does not stop there. The Fourth Normal Form (4NF) and the Fifth Normal Form (5NF) take normalization to even higher levels. 4NF and 5NF address specific types of dependencies like multivalued and join dependencies. Read this chapter to understand these advanced normal forms in detail.

What is Fourth Normal Form (4NF)?

The fourth normal form helps in eliminating multivalued dependencies from a database table. A multivalued dependency occurs when two attributes are independent of each other but depend on a third attribute. For instance, if one value in a table implies multiple rows, then there may be redundancy caused by multivalued dependencies.

Rules of 4NF

For a table to satisfy 4NF −

  • It must already be in Boyce-Codd Normal Form (BCNF).
  • The table must not have more than one multivalued dependency.

What is Multivalued Dependency?

If A →→ B (A multi-determines B), this means for a single value of A, there can be multiple values of B. But B is independent of any other attribute.

Example: Multivalued Dependency in Action

Consider a table that stores information about a Person, their Mobile Numbers, and their Food Preferences −

PersonMobileFood_Likes
Mahesh9893Burger
Mahesh9424Pizza
Ramesh9191Pizza

Here −

  • Person →→ Mobile
  • Person →→ Food_Likes

Both Mobile and Food_Likes are independent of each other, however they depend on Person. This results in redundancy because the same Person is repeated unnecessarily for each Mobile and Food_Likes.

Normalizing a Table to 4NF

To bring this table into 4NF, we can separate the dependencies into two tables −

Table 1 − Person-Mobile

PersonMobile
Mahesh9893
Mahesh9424
Ramesh9191

Table 2 − Person-Food_Likes

PersonFood_Likes
MaheshBurger
MaheshPizza
RameshPizza

By decomposing the table, we eliminate redundancy. And each table is now in 4NF.

What is Fifth Normal Form (5NF)?

After 4NF, we must understand the concept of Fifth Normal Form, which is also known as the Projected Normal Form (PJNF). It addresses the join dependencies.

join dependency exists when a table can be split into two or more tables. And the original table can be reconstructed by joining them without any data loss. In 5NF, every join dependency must be implied by the table’s candidate keys.

Rules of 5NF

For a table to satisfy 5NF −

  • It must already be in 4NF.
  • It must not contain any join dependency that cannot be implied by its candidate keys.

What is Join Dependency?

join dependency occurs when a table can be decomposed into smaller tables, but joining those smaller tables recreates the original table without any data loss or spurious rows.

Example: Join Dependency in Action

Let us consider a table that stores the data on Agents, the Companies they work with, and the Products they sell.

AgentCompanyProduct
A1PQRNut
A1PQRBolt
A1XYZNut
A1XYZBolt
A2PQRNut

Here, if an Agent works with a Company and the Company sells a particular Product, then the Agent is assumed to sell that product.

Normalizing a Table to 5NF

To eliminate the join dependency, we decompose the table into three smaller tables −

Table 1 − Agent-Company

AgentCompany
A1PQR
A1XYZ
A2PQR

Table 2 − Company-Product

CompanyProduct
PQRNut
PQRBolt
XYZNut
XYZBolt

Table 3 − Agent-Product

AgentProduct
A1Nut
A1Bolt
A2Nut

When these three tables are again joined back using their shared attributes (Company and Product), the original table is reconstructed without any spurious data.

Comparing 4NF and 5NF

The following table highlights how 4NF differs from 5NF −

Aspect4NF5NF
Dependency TypeMultivalued DependencyJoin Dependency
PurposeEliminates redundancy from multivalued dependencies.Ensures lossless decomposition of join dependencies.
Example ScenarioPerson with multiple phones and food preferences.Agent, company, and product relationships.

Practical Considerations in Using 4NF and 5NF

Although 4NF and 5NF provide the highest levels of normalization, they are used quite rarely. 4NF and 5NF are not always necessary for every database. because −

  • Complexity − Splitting the tables into smaller parts can make the database structure more complicated.
  • Performance − Highly normalized databases might require more joins, which may slow down the query performance.
  • Use Cases − Applications with simpler data relationships often do not need to go beyond BCNF or 4NF.

However, when it is paramount to maintain accuracy and avoid any sort of data redundancy (while maintaining financial or scientific databases, for example), it becomes a necessity to normalize the tables to 4NF and 5NF.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *