Normalization

September 28, 2007 12:22 by Young Park

This article is a summary of Normalization from the book 'Database Systems' by Thomas Connolly (Publisher: Addison Wesley)

Why Needed?
Normalization is needed in designing Relational database system to minimize data redundancy and update anomalies.

- Minimizes cost (Reduce file storage space)
- Updates can be performed with minimal number of operations

Functional Dependency

  • A relationship between attributes

Describes the relationship between attributes in a relation. For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A -> B), if each value of A is associated with exactly one value of B. (A and B may each consist of  one or more attributes.)

  •  A -> B (A is determinant of B)

Type of functional dependency

  • Full functional dependency : Indicates that if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A,but not on any proper subset of A.
  • Transitive dependency : A condition where A, B, and C are attributes of a relation such that if A->B and B->C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).
  • Multi-Valued Dependency (MVD): Represents a dependency between attributes (for example, A, B, and C) in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, the set of values for B and C are independent of each other.
  • Join Dependency: Describes a type of dependency. For example, for a relation R with subsets of the attributes of R denoted as A,B, ... , Z, a relation R satisfies a join dependency if and only if every legal value of R is equal to the join of its projections on A, B, ..., Z.

The process of Normalization

Unnormalized Form (UNF): A table that contains one or more repeating groups.

First Normal Form (1NF): A relation in which the intersection of each row and column contains one and only one value.

Second Normal Form (2NF): A relation that is in First Normal Form and every non-primary-key attribute is fully functionally dependent on the primary key.

Third Normal Form (3NF): A relation that is in First and Second Normal Form and in which no non-primary-key attribute is transitively dependent on the primary key.

Fourth Normal Form (4NF): A relation that is in Boyce-Codd normal form and does not contain nontrivial multi-valued dependencies.

Fifth Normal Form (5NF): A relation that has no join dependency.

Armstrong's Axioms

  1. Reflexivity : If B is a subset of A, then A -> B
  2. Augmentation : If A -> B, then A,C -> B,C
  3. Transitivity : If A -> B and B -> C, then A -> C
  4. Self-determination : A -> A
  5. Decomposition : If A -> B,C, then A -> B and A -> C
  6. Union : If A -> B and A -> C, then A -> B,C
  7. Composition : If A -> B and C -> D then A,C -> B,D

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags:
Categories: Database
Actions: E-mail | Permalink | Comments (0) | Comment RSSRSS comment feed

Brief Summary of Database Management System

September 10, 2007 11:38 by Young Park

This summary of the database system refers to the book called 'Database Systems - A practical approach to design, implementation, and management (Publisher: Addison Wesley / Author: Thomas Connolly)'

 

Database : a collection of related data
DBMS (DataBase Management System) : the software that manages and controls access to the database

Example of database uses:
Purchases from the supermarket, purchases using your credit card, booking a holiday at the travel agent, using the local library, and etc.

Before DBMS became popular for data store, file-based system was used.

File-based system: A collection of application programs that perform services for the end-users such as the production of reports. Each program defines and manages its own data

Limitations of the File-based approach
1. Separation and isolation of data
2. Duplication of data
3. Data dependence
4. Incompatible file formats
5. Fixed queries/proliferation of application programs

Definition of Database

A shared collection of logically related data, and a description of this data, designed to meet the information needs of an organization.
( A Self-described collection of integrated records )

Description of data

System Catalog / data dictionary / metadata

Data abstraction of data

The approach taken with the database systems, where the definition of data is separated from the application programs, is similar to the approach taken in modern software development (Object oriented programming), where an internal definition of an object and a separate external definition are provided. This is known as Data Abstraction. The user of an object see only the external definition and are unaware of how the object is defined and how it functions.

Some terms

Entity: a distinct object in the organization that is to be represented in the database
Attribute: a property that describes some aspects of the object that we wish to record
Relationship: an association between entities

 

Definition of DBMS

A software system that enables users to define, create, maintain, and control access to the database

Database Application Programs

A computer program that interacts with the database by issuing an appropriate request (typically an SQL statement) to the DBMS

Views

- Views provide a level of security (Views can be set up to exclude certain data that some users should not see)
- Views provide a mechanism to customize the appearance of the database
- A view can present a consistent, unchanging picture of the structure of the database, even if the underlying database is changed

ROLES IN THE DATABASE ENVIRONMENT

DA (Data Administrator) : Responsible for the management of the data resource including database planning, development and maintenance of standards, policies and procedures, and conceptual/logical database design

DBA (Database Administrator) : Responsible for the physical realization of the database, including physical database design and implementation, security and integrity control, maintenance of the operating system, and ensuring satisfactory performance of the applications for users

Logical Database Designer : Responsible for identifying the data, the relationships between the data a, and the constraints of the data that is to be stored in the database

Physical Database Designer : Decides how the logical database design is to be physically realized

Application Developer : Develops an application program that provides required functionality for the end-users must be implemented

End-users: Naive users / Sophisticated users


Currently rated 5.0 by 1 people

  • Currently 5/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags:
Categories: Database
Actions: E-mail | Permalink | Comments (2) | Comment RSSRSS comment feed


Sponsors