Denormalization is a process of adding redundant data to normalized tables in order to avoid unnecessary join operations. This improves the performance of read operations as there is no need to join multiple tables, however this requires extra storage space for redundant data, also it can cause data inconsistencies in database, if the redundant data is updated frequently.
Note:
1. Denormalization is not a reverse of normalization in DBMS.
2. Denormalization cannot be used in any scenario (we discussed this in detail in this article after the following example).
Denormalization Example
There are two tables Department
and Employee
, where Department
table contains the data for department id represented by Dept_Id
, name (attribute name: Dept_Name
), employee id (attribute name Emp_Id
). The Employee
table contains fields such as employee id, name, age.
Department Table
Dept_Id Dept_Name Emp_Id D01 Sales E101 D02 Marketing E102 D03 Retail E102 D04 IT E103 D05 HR E104
Employee Table
Emp_Id Emp_Name Emp_Age E101 Ram 29 E102 Shyam 28 E103 Veer 30 E104 Mohan 27
Now every time when we need to access the department information along with the employee details such as employee name, we need to join these two tables. One way of avoiding the unnecessary join operation is to denormalise the Department table like this:
Department table:
Dept_Id Dept_Name Emp_Id Emp_Name D01 Sales E101 Ram D02 Marketing E102 Shyam D03 Retail E102 Shyam D04 IT E103 Veer D05 HR E104 Mohan
Employee Table:
Emp_Id Emp_Name Emp_Age E101 Ram 29 E102 Shyam 28 E103 Veer 30 E104 Mohan 27
After this denormalization, whenever we need to get the department data along with the employee name, we do not need to join these tables as the Employee details are already present in the Department
table. This way, we avoided the join operations but we had to store the extra data in the database. Along with that
When you should use Denormalization?
As discussed in the beginning, denormalization can cause data inconsistencies in database so we must be very careful when using this process. Let’s point out the cases where you can use denormalization safely:
1. When the redundant data doesn’t require to be updated frequently or doesn’t update at all. In our example above, the redundant data is employee name and name doesn’t change frequently, thus it is an ideal case where the denormalization can be safely used.
2. When there is a need to join multiple tables frequently in order to get meaningful data. In this case, denormalization can significantly boost the performance of read operations at the cost of extra storage space in the database.
Advantages of Denormalization
1. Read Operations are faster as table joins are not required for most of the queries.
2. Write query is easy to write to perform read, write, update operations on database.
Disadvantages of Denormalization
1. Requires more storage as redundant data needs to be written in the tables.
2. Data write operations are slower due to redundant data.
3. Data inconsistencies are present due to redundant data.
4. It requires extra effort to update the database. This is because when redundant data is present, it is important to update the data in all the places else data inconsistencies may arise.