Decomposition Algorithms

In the previous section, we discussed decomposition and its types with the help of small examples. In the actual world, a database schema is too wide to handle. Thus, it requires algorithms that may generate appropriate databases.

Here, we will get to know the decomposition algorithms using functional dependencies for two different normal forms, which are:

  • Decomposition to BCNF
  • Decomposition to 3NF

Decomposition using functional dependencies aims at dependency preservation and lossless decomposition.

Let's discuss this in detail.

Decomposition to BCNF

Before applying the BCNF decomposition algorithm to the given relation, it is necessary to test if the relation is in Boyce-Codd Normal Form. After the test, if it is found that the given relation is not in BCNF, we can decompose it further to create relations in BCNF.

There are following cases which require to be tested if the given relation schema R satisfies the BCNF rule:

Case 1: Check and test, if a nontrivial dependency α -> β violate the BCNF rule, evaluate and compute α+ , i.e., the attribute closure of α. Also, verify that α+ includes all the attributes of the given relation R. It means it should be the superkey of relation R.

Case 2: If the given relation R is in BCNF, it is not required to test all the dependencies in F+. It only requires determining and checking the dependencies in the provided dependency set F for the BCNF test. It is because if no dependency in F causes a violation of BCNF, consequently, none of the F+ dependency will cause any violation of BCNF.

Note: Case2 does not work if the relation gets decomposed. It means during the testing of the given relation R, we cannot check the dependency of F for the cause of violation of BCNF.

BCNF Decomposition Algorithm

This algorithm is used if the given relation R is decomposed in several relations R1, R2,…, Rn because it was not present in the BCNF. Thus,

For every subset α of attributes in the relation Ri, we need to check that α+ (an attribute closure of α under F) either includes all the attributes of the relation Ri or no attribute of Ri-α.

result={R};
done=false;
compute F+;
while (not done) do
	if (there is a schema Ri in result that is not in BCNF)
		then begin
			let α->β be a nontrivial functional dependency that holds 
			on Ri such that α->Ri is not in F+, and α ꓵ β= ø;
			result=(result-Ri) U (Ri-β) U (α,β);
		end
		else done=true;

Note: If some set of attributes α in Ri violates the specified condition in the algorithm, in such case consider the functional dependency α->( α+ - α) ꓵ Ri. Such dependency can be present in the F+ dependency.

This algorithm is used for decomposing the given relation R into its several decomposers. This algorithm uses dependencies that show the violation of BCNF for performing the decomposition of the relation R. Thus, such an algorithm not only generates the decomposers of relation R in BCNF but is also a lossless decomposition. It means there occurs no data loss while decomposing the given relation R into R1, R2, and so on…

The BCNF decomposition algorithm takes time exponential in the size of the initial relation schema R. With this, a drawback of this algorithm is that it may unnecessarily decompose the given relation R, i.e., over-normalizing the relation. Although decomposing algorithms for BCNF and 4NF are similar, except for a difference. The fourth normal form works on multivalued dependencies, whereas BCNF focuses on the functional dependencies. The multivalued dependencies help to reduce some form of repetition of the data, which is not understandable in terms of functional dependencies.

Difference between Multivalued Dependency and Functional Dependency

The difference between both dependencies is that a functional dependency expels certain tuples from being in a relation, but a multivalued dependency does not do so. It means a multivalued dependency does not expel or rule out certain tuples. Rather it requires other tuples of certain forms to exist in relation. Due to such a difference, the multivalued dependency is also referred to as tuple-generating dependency, and the functional dependency is referred to as equality-generating dependency.

Decomposition to 3NF

The decomposition algorithm for 3NF ensures the preservation of dependencies by explicitly building a schema for each dependency in the canonical cover. It guarantees that at least one schema must hold a candidate key for the one being decomposed, which in turn ensures the decomposition generated to be a lossless decomposition.

3NF Decomposition Algorithm

let Fc be a canonical cover for F; 
i=0;
for each functional dependency α->β in Fc
	i = i+1;
R = αβ;
If none of the schemas Rj, j=1,2,…I holds a candidate key for R 
Then
	i = i+1;
	Ri= any candidate key for R;
/* Optionally, remove the repetitive relations*/
Repeat
	If any schema Rj is contained in another schema Rk
 Then
/* Delete Rj */
Rj = Ri;
i = i-1;
until no more Rjs can be deleted
return (R1, R2, . . .  ,Ri)

Here, R is the given relation, and F is the given set of functional dependency for which Fc maintains the canonical cover. R1, R2, . . . , Ri are the decomposed parts of the given relation R. Thus, this algorithm preserves the dependency as well as generates the lossless decomposition of relation R.

A 3NF algorithm is also known as a 3NF synthesis algorithm. It is called so because the normal form works on a dependency set, and instead of repeatedly decomposing the initial schema, it adds one schema at a time.

Drawbacks of 3NF Decomposing Algorithm

  • The result of the decomposing algorithm is not uniquely defined because a set of functional dependencies can hold more than one canonical cover.
  • In some cases, the result of the algorithm depends on the order in which it considers the dependencies in Fc.
  • If the given relation is already present in the third normal form, then also it may decompose a relation.

Next TopicDBMS Tutorial