4 SYSTEMS DESIGN
4.3 NORMALISATION
Note. The Student_id is shown in bold and underlined to signify that it is a unique identifier (primary key). Foreign key attributes, if present, are shown in italics.
Initially, when deriving relations, not all attributes may be known, so ‘skeleton relations’ are produced which include only the primary and foreign key attributes.
At the logical design stage the relations may be in one of the following states:
- Un-normalised (UNF),
- 1st Normal Form (1NF),
- 2nd Normal Form (2NF),
- 3rd Normal Form (3NF).
By applying the rules, a relation is transformed until it is in 3rd normal form, which is commonly referred to as “normalised”.
An un-normalised relation contains a repeating group of attributes. Any repeating group needs to be separated from the original relation and placed into a new relation with its own primary key. The identifying attribute for the new relation remains in the original relation to act as a foreign key to provide a link to the new relation. This process can be illustrated as follows:
The following example shows an un-normalised table called STUDENT:
Student Id  Student Name  Module Id  Module name       Module Grade  Course Id  Course name
101         J Smith       DB         Databases         75            CMP        Computing
101         J Smith       SA         Systems Analysis  65            CMP        Computing
101         J Smith       PR         Programming       70            CMP        Computing
102         A Khan        DB         Databases         55            CMP        Computing
102         A Khan        PR         Programming       60            CMP        Computing
103         R Berger      DB         Databases         50            BIT        Business IT
103         R Berger      WD         Web Development   55            BIT        Business IT
Fig 4.11 Un-normalised table
Download free eBooks at bookboon.com
The attributes Module Id, Module name and Module Grade are repeated for each student, so to transform the STUDENT table into 1NF these repeating attributes are removed and placed into a new table called MODULE GRADE. The primary key for the new table will be a compound key consisting of the Module Id and the Student Id (which is also a foreign key). This results in the following 1NF tables:
STUDENT
Student Id (PK)  Student Name  Course Id  Course name
101              J Smith       CMP        Computing
102              A Khan        CMP        Computing
103              R Berger      BIT        Business IT

MODULE GRADE
Student Id (FK)  Module Id  Module name       Module Grade
101              DB         Databases         75
101              SA         Systems Analysis  65
101              PR         Programming       70
102              DB         Databases         55
102              PR         Programming       60
103              DB         Databases         50
103              WD         Web Development   55
Compound primary key: Student Id + Module Id
Fig 4.12 1NF tables
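The move from Fig 4.11 to Fig 4.12 can be sketched in Python. This is only an illustrative sketch: the tuple layout and the names `unf_rows`, `student` and `module_grade` are assumptions made for this example and are not part of the design method itself.

```python
# Rows of the un-normalised STUDENT table from Fig 4.11.
# Assumed column order for this sketch:
# student_id, student_name, module_id, module_name, grade, course_id, course_name
unf_rows = [
    (101, "J Smith", "DB", "Databases", 75, "CMP", "Computing"),
    (101, "J Smith", "SA", "Systems Analysis", 65, "CMP", "Computing"),
    (101, "J Smith", "PR", "Programming", 70, "CMP", "Computing"),
    (102, "A Khan", "DB", "Databases", 55, "CMP", "Computing"),
    (102, "A Khan", "PR", "Programming", 60, "CMP", "Computing"),
    (103, "R Berger", "DB", "Databases", 50, "BIT", "Business IT"),
    (103, "R Berger", "WD", "Web Development", 55, "BIT", "Business IT"),
]

student = {}       # keyed on Student Id (primary key)
module_grade = {}  # keyed on (Student Id, Module Id), the compound primary key

for sid, sname, mid, mname, grade, cid, cname in unf_rows:
    # Non-repeating attributes stay in STUDENT, one entry per student.
    student[sid] = (sname, cid, cname)
    # The repeating group moves to MODULE GRADE, one entry per student/module pair.
    module_grade[(sid, mid)] = (mname, grade)

print(len(student), "STUDENT rows")
print(len(module_grade), "MODULE GRADE rows")
```

Keying the dictionaries on the primary key mirrors the rule that a key value identifies exactly one row.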
Transforming 1NF tables into 2NF involves checking for part-key dependencies and removing these to form a new table. To explain what a part-key dependency is, let us look at the 1NF MODULE GRADE table in Fig 4.12.
This table has a multi-part key consisting of the Student Id and the Module Id. A check is made to identify whether any other columns are dependent on only one part of the key (i.e. determined by a part of the key). On inspection, it becomes apparent that the Module name can be determined by just the Module Id, so there is a part-key dependency. This results in the Module Id and Module name being used to form a new 2NF table called MODULE. Contrast this with the Module Grade column: this can only be determined by having both the Student Id and the Module Id, so it is dependent on the full key and remains in the MODULE GRADE table.
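The part-key check described above can be tried out against the sample rows. The sketch below is a hedged illustration: the helper `determines` and the column names are hypothetical, chosen for this example. Sample data can only disprove a dependency, never prove one, because a dependency is ultimately a rule of the business rather than a property of a few rows.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, the columns in lhs determine column rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        if key in seen and seen[key] != row[rhs]:
            return False  # same determinant value, different dependent value
        seen[key] = row[rhs]
    return True

# 1NF MODULE GRADE rows from Fig 4.12 (column names assumed for this sketch).
columns = ("student_id", "module_id", "module_name", "grade")
module_grade = [dict(zip(columns, values)) for values in [
    (101, "DB", "Databases", 75),
    (101, "SA", "Systems Analysis", 65),
    (101, "PR", "Programming", 70),
    (102, "DB", "Databases", 55),
    (102, "PR", "Programming", 60),
    (103, "DB", "Databases", 50),
    (103, "WD", "Web Development", 55),
]]

# Module name depends on Module Id alone: a part-key dependency.
print(determines(module_grade, ["module_id"], "module_name"))        # True
# Module Grade is not determined by Module Id alone.
print(determines(module_grade, ["module_id"], "grade"))              # False
# Module Grade is determined by the full key.
print(determines(module_grade, ["student_id", "module_id"], "grade"))  # True
```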
As the STUDENT and MODULE tables each have a single-column key (Student Id and Module Id respectively), they are, by default, already in 2NF. The 2NF tables are as follows:
STUDENT
Student Id (PK)  Student Name  Course Id  Course name
101              J Smith       CMP        Computing
102              A Khan        CMP        Computing
103              R Berger      BIT        Business IT

MODULE GRADE
Student Id  Module Id  Module Grade
101         DB         75
101         SA         65
101         PR         70
102         DB         55
102         PR         60
103         DB         50
103         WD         55

MODULE
Module Id (PK)  Module name
DB              Databases
SA              Systems Analysis
PR              Programming
WD              Web Development
Fig 4.13 2NF tables
The final step is to ensure that the tables are in 3NF. This is to ensure that all non-key attributes are dependent on the key and only the key. To illustrate this, let us look at the 2NF STUDENT table from Fig 4.13 above.
The Student name and Course Id are determined by the key, Student Id. However, the Course name can be determined by the non-key attribute Course Id, so the Course name can be placed in a separate table called COURSE, along with the Course Id, which will be that table's primary key. The Course Id also remains in the STUDENT table as a foreign key. The other 2NF tables are already in 3NF so they remain unchanged.
This results in the following normalised set of tables:
STUDENT
Student Id (PK)  Student Name  Course Id (FK)
101              J Smith       CMP
102              A Khan        CMP
103              R Berger      BIT

MODULE GRADE
Student Id (FK)  Module Id (FK)  Module Grade
101              DB              75
101              SA              65
101              PR              70
102              DB              55
102              PR              60
103              DB              50
103              WD              55

MODULE
Module Id (PK)  Module name
DB              Databases
SA              Systems Analysis
PR              Programming
WD              Web Development

COURSE
Course Id (PK)  Course name
CMP             Computing
BIT             Business IT
Fig 4.14 3NF tables
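The 3NF step applied to the STUDENT table can be sketched in the same way. The names `student_2nf`, `student_3nf` and `course` are assumptions for this illustration. Because a dictionary keyed on Course Id holds one entry per key value, each course is recorded once however many students are enrolled on it.

```python
# 2NF STUDENT rows from Fig 4.13:
# student_id, student_name, course_id, course_name
student_2nf = [
    (101, "J Smith", "CMP", "Computing"),
    (102, "A Khan", "CMP", "Computing"),
    (103, "R Berger", "BIT", "Business IT"),
]

# STUDENT keeps Course Id as a foreign key but drops Course name.
student_3nf = [(sid, sname, cid) for sid, sname, cid, _ in student_2nf]

# COURSE is keyed on Course Id, so repeated courses collapse to one entry.
course = {}
for _, _, cid, cname in student_2nf:
    course[cid] = cname

print(student_3nf)
print(course)
```

With this structure, renaming a course means changing one COURSE entry rather than every STUDENT row that mentions it.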
Note. In some situations, a decision may be made to denormalise the tables for performance reasons but this should be avoided if at all possible.
See Appendix B, Normalisation template, which will assist you with the normalisation process.
Remember attributes in a relation should depend on:
1NF – the key
2NF – the whole key
3NF – nothing but the key
The set of tables from Fig 4.14 can be represented by the following ERD:
Fig 4.15 Student ERD
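The primary and foreign keys of the Fig 4.14 tables could also be written down as a database schema. The sketch below uses SQLite through Python's standard `sqlite3` module; the choice of SQLite, the lowercase table and column names, and the column types are all assumptions made for illustration. Joining the four tables reproduces the rows of the original un-normalised table, which suggests that no information was lost during normalisation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE course (
    course_id   TEXT PRIMARY KEY,
    course_name TEXT NOT NULL
);
CREATE TABLE student (
    student_id   INTEGER PRIMARY KEY,
    student_name TEXT NOT NULL,
    course_id    TEXT NOT NULL REFERENCES course(course_id)
);
CREATE TABLE module (
    module_id   TEXT PRIMARY KEY,
    module_name TEXT NOT NULL
);
CREATE TABLE module_grade (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    module_id  TEXT NOT NULL REFERENCES module(module_id),
    grade      INTEGER NOT NULL,
    PRIMARY KEY (student_id, module_id)  -- compound primary key
);
""")

conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [("CMP", "Computing"), ("BIT", "Business IT")])
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(101, "J Smith", "CMP"), (102, "A Khan", "CMP"),
                  (103, "R Berger", "BIT")])
conn.executemany("INSERT INTO module VALUES (?, ?)",
                 [("DB", "Databases"), ("SA", "Systems Analysis"),
                  ("PR", "Programming"), ("WD", "Web Development")])
conn.executemany("INSERT INTO module_grade VALUES (?, ?, ?)",
                 [(101, "DB", 75), (101, "SA", 65), (101, "PR", 70),
                  (102, "DB", 55), (102, "PR", 60),
                  (103, "DB", 50), (103, "WD", 55)])

# Join the normalised tables back into the shape of the un-normalised table.
rows = conn.execute("""
    SELECT s.student_id, s.student_name, m.module_id, m.module_name,
           g.grade, c.course_id, c.course_name
    FROM module_grade AS g
    JOIN student AS s ON s.student_id = g.student_id
    JOIN module  AS m ON m.module_id  = g.module_id
    JOIN course  AS c ON c.course_id  = s.course_id
    ORDER BY s.student_id, m.module_id
""").fetchall()

for row in rows:
    print(row)
```

With foreign key enforcement switched on, an attempt to record a grade for a student or module that does not exist is rejected by the database.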