Normalization 1,2,3NF

Normalization of Database

Database Normalization is a technique of organizing the data in the database. Normalization is a systematic approach of decomposing tables to eliminate data redundancy(repetition) and undesirable characteristics like Insertion, Update and Deletion Anamolies. It is a multi-step process that puts data into tabular form, removing duplicated data from the relation tables.
Normalization is used for mainly two purposes,
  • Eliminating reduntant(useless) data.
  • Ensuring data dependencies make sense i.e data is logically stored.

1st Normal Form Definition

A database is in first normal form if it satisfies the following conditions:
  • Contains only atomic values
  • There are no repeating groups
An atomic value is a value that cannot be divided. For example, in the table shown below, the values in the [Color] column in the first row can be divided into "red" and "green", hence [TABLE_PRODUCT] is not in 1NF.
A repeating group means that a table contains two or more columns that are closely related. For example, a table that records data on a book and its author(s) with the following columns: [Book ID], [Author 1], [Author 2], [Author 3] is not in 1NF because [Author 1], [Author 2], and [Author 3] are all repeating the same attribute.

1st Normal Form Example

How do we bring an unnormalized table into first normal form? Consider the following example:
Unnormalized Table Example
This table is not in first normal form because the [Color] column can contain multiple values. For example, the first row includes values "red" and "green."
To bring this table to first normal form, we split the table into two tables and now we have the resulting tables:
1st Normal Form Example
Now first normal form is satisfied, as the columns on each table all hold just one value

Second Normal Form (2NF)

  • In the 2NF, relational must be in 1NF.
  • In the second normal form, all non-key attributes are fully functional dependent on the primary key
Example: Let's assume, a school can store the data of teachers and the subjects they teach. In a school, a teacher can teach more than one subject.
TEACHER table
TEACHER_IDSUBJECTTEACHER_AGE
25Chemistry30
25Biology30
47English35
83Math38
83Computer38
In the given table, non-prime attribute TEACHER_AGE is dependent on TEACHER_ID which is a proper subset of a candidate key. That's why it violates the rule for 2NF.
To convert the given table into 2NF, we decompose it into two tables:
TEACHER_DETAIL table:
TEACHER_IDTEACHER_AGE
2530
4735
8338
TEACHER_SUBJECT table:
TEACHER_IDSUBJECT
25Chemistry
25Biology
47English
83Math
83Computer

Third Normal Form (3NF)

  • A relation will be in 3NF if it is in 2NF and not contain any transitive partial dependency.
  • 3NF is used to reduce the data duplication. It is also used to achieve the data integrity.
  • If there is no transitive dependency for non-prime attributes, then the relation must be in third normal form.
A relation is in third normal form if it holds atleast one of the following conditions for every non-trivial function dependency X → Y.
  1. X is a super key.
  2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.
  3. Example:
    EMPLOYEE_DETAIL table:
    EMP_IDEMP_NAMEEMP_ZIPEMP_STATEEMP_CITY
    222Harry201010UPNoida
    333Stephan02228USBoston
    444Lan60007USChicago
    555Katharine06389UKNorwich
    666John462007MPBhopal
    Super key in the table above:
    1. {EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}....so on  
    Candidate key: {EMP_ID}
    Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime.
    Here, EMP_STATE & EMP_CITY dependent on EMP_ZIP and EMP_ZIP dependent on EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) transitively dependent on super key(EMP_ID). It violates the rule of third normal form.
    That's why we need to move the EMP_CITY and EMP_STATE to the new <EMPLOYEE_ZIP> table, with EMP_ZIP as a Primary key.
    EMPLOYEE table:
    EMP_IDEMP_NAMEEMP_ZIP
    222Harry201010
    333Stephan02228
    444Lan60007
    555Katharine06389
    666John462007
    EMPLOYEE_ZIP table:
    EMP_ZIPEMP_STATEEMP_CITY
    201010UPNoida
    02228USBoston
    60007USChicago
    06389UKNorwich
    462007MPBhopal

Popular posts from this blog

18CS45 object oriented concept (OOC) notes, question paper

python application program 15CS664 notes question paper , important question

Operation Research Notes 15CS653