sophryu99/TIL

Database normalization

Opened this issue · 5 comments

Normalisation is a bottom-up technique for database design, normally based on an existing system (which may be paper-based).

Process of organizing data in a relational database used to reduce data redundancy and increase data integrity. This is important in creating an efficient and flexible database

Database is considered normalized when it meets the requirements of 3NF

  • First Normal Form: contain only atomic and single values
  • Second Normal Form: one primary key is allocated only to one row in a table
  • Third Normal Form: all attributes that are not a primary key have to be mutually independent.

Normalization Example

Screen Shot 2021-11-10 at 2 05 36 PM

Will conduct database normalization step-by-step

Step 1

Select the datasource and convert it to unnormalized table (UNF)

  • Create column headings for the table for each data item on the report (excluding calculated fields: total staff, average hourly rate)
  • Identify a key for the table: Project Code
  • Remove duplicate data
    • In this example, for the chosen key of Project Code, the values for Project Code, Project Title, Project Manager and Project Budget are duplicated if there are two or more employees working on the same project.
    • Project Code chosen for the key and duplicate data, associated with each project code, is removed.
    • Do not confuse duplicate data with repeating attributes which is described in the next step.

Resulting Table

Screen Shot 2021-11-10 at 2 59 57 PM

Step 2

Transform the above unnormalized data into first normal form (1NF)
Condition of 1NF: contain only atomic and single values: repeating attributes should be separated to a new table

  • Identify repeating attributes: Employee No. Employee Name, Department No., Department Name, Hourly Rate
  • Remove these repeating attributes to a new table together with a copy of the key from the UNF table
  • Assign a key to the new table (and underline it). The key from the original unnormalised table always becomes part of the key of the new table. A compound key is created. The value for this key must be unique for each entity occurrence.

Resulting Table

Screen Shot 2021-11-10 at 3 05 36 PM

Step 3: Transform 1NF data -> 2NF form

2NF condition: one primary key is allocated only to one row in a table

  • Partial key dependencies need to be removed

Q. Is this attribute dependent on one part of the key?

  • If yes, remove the attribute to a new table with a copy of the part of the key it is dependent upon. : Employee name, Department No. Department Name are dependent upon Employee no. only.
  • Hourly Rate is dependent upon both Project code and Employee No as it depends on different hourly rate and type of work
  • First table is already 2NF

Resulting Table
Screen Shot 2021-11-10 at 3 12 52 PM

Step 4: Transform 2NF -> 3NF data

3NF condition: all attributes that are not a primary key have to be mutually independent.

  • Remove to a new table any non-key attributes that are more dependent on other non-key attributes than the table key.

Q.If a non-key attribute is more dependent on another non-key attribute than the table key:

  • Move the dependent attribute, together with a copy of the non-key attribute upon which it is dependent, to a new table
  • Make the non-key attribute, the key in the new table.
  • Leave the non-key attribute in the original table and mark it as a foreign key
  • Department name is more dependent upon Department no. than Employee no. -> moved to a new table

Resulting Table
Screen Shot 2021-11-10 at 3 19 47 PM

Screen Shot 2021-11-10 at 3 20 04 PM