• Tidak ada hasil yang ditemukan

COP 5725 Fall 2012 Database Management Systems

N/A
N/A
Protected

Academic year: 2019

Membagikan "COP 5725 Fall 2012 Database Management Systems"

Copied!
51
0
0

Teks penuh

(1)

COP 5725 Fall 2012

Database Management

Systems

(2)

Schema Refinement

and Normal Forms

(3)

Review: Database Design Steps

• Requirements Analysis

• Conceptual Database Design

– E/R model

• Logical Database Design

– E.g., relational data model

– Output: logical/conceptual schema • Schema Refinement

(4)

Why Schema Refinement?

• After Conceptual and Logical Database Design  relational schemas + IC’s

• Problems caused by bad schema design – Redundant storage

– Update anomaly : one occurrence of a fact is changed, but not all occurrences.

(5)

Example of Bad Design

Drinkers(name, addr, beersLiked, manf, favBeer)

name addr beersLiked manf favBeer Janeway Voyager Bud A.B. WickedAle Janeway ??? WickedAle Pete’s ???

Spock Enterprise Bud ??? Bud

Data is redundant, because each of the ???’s can be figured

(6)

This Bad Design Also Exhibits

Anomalies

name addr beersLiked manf favBeer Janeway Voyager Bud A.B. WickedAle Janeway Voyager WickedAle Pete’s WickedAle Spock Enterprise Bud A.B. Bud

(7)

Decompositions

• Integrity constraints, in particular

functional dependencies

, can be used to identify schemas with such problems

and to suggest refinements. • Main refinement technique:

(8)

Problems Related to

Decompositions

• Do we need to decompose a relation?

Use FDs and normal forms

• What problems (if any) does the decomposition cause?

– 2 important properties:

• Lossless-join

(9)

Functional Dependencies

• X -> A is an assertion about a relation R

• Say “X determines A holds in R.”

– Convention: …, X, Y, Z represent sets of

attributes; A, B, C,… represent single attributes.

(10)

Example

name addr beersLiked manf favBeer

Janeway Voyager Bud A.B. WickedAle

Janeway Voyager WickedAle Pete’s WickedAle

Spock Enterprise Bud A.B. Bud

Because name -> addr Because name -> favBeer

Because beersLiked -> manf

Drinkers(name, addr, beersLiked, manf, favBeer)

(11)

FD’s With Multiple Attributes

• No need for FD’s with > 1 attribute on right.

(12)

Where Do FD’s Come From?

• Key constraints on entities and relationships

K

->

A holds for R (candidate keys)

– many-one, one-one relationships – domain knowledge of the

(13)

Inferring FD’s

• We are given a set of FD’s F : X1 ->

A

1,

X

2 ->

A

2,…, Xn ->

A

n , we want to know whether an FD Y ->

B must hold in

every relation instance that satisfies the given FD’s in F. If so, we say Y ->

B is

implied by F.

(14)
(15)
(16)
(17)

Closure Test

• An easier way to test is to compute the

closure of

Y

, denoted

Y

+.

• Basis:

Y

+ =

Y

.

• Induction: Look for an FD X ->

A, which

has left side

X

that is a subset of the

(18)

Y+

new Y+

(19)

Exercise, FD inference

• R(A,B,C,D,E): AB->C, BC->AD, D->E, CF->B, can AB->D be inferred?

Approach 1: (Using AA and additional rules)

AB->C => AB->BC (augmentation) BC->AD => BC->D (decomposition)

(20)
(21)

Projecting FD’s

• Finding All Implied FD’s on a Projected Schema

• Motivation: Decomposition

• Example:

ABCD

with FD’s AB ->

C,

C

->

D, and D

->

A.

– Decompose into ABC, AD. What FD’s hold in

(22)

Basic Idea

1. Start with given FD’s and find all

nontrivial FD’s that follow from the

given FD’s.

(23)

Simple, Exponential Algorithm

1. For each set of attributes

X

, compute

X

+. 2. Add X ->

A for all

A

in

X

+ -

X

.

3. However, drop XY ->

A whenever we

discover X ->

A.

 Because XY ->A follows from X ->A in any projection.

(24)

A Few Tricks

• No need to compute the closure of the empty set or of the set of all attributes. • If we find

X

+ = all attributes, so is the

(25)

Example

ABC

with FD’s A ->

B and B

->

C.

Project onto

AC

.

– A +=ABC ; yields A ->B, A ->C.

• We do not need to compute AB + or AC +. – B +=BC ; yields B ->C.

– C +=C ; yields nothing.

(26)

Example --- Continued

• Resulting FD’s: A ->

B, A

->

C,

and B ->

C.

• Projection onto

AC

: A ->

C.

(27)
(28)

Boyce-Codd Normal Form

• Relation R with FDs F is in BCNF if, for all X ->

A in F

+

X ->A is a trivial (X contains A) FD, or

X contains a key for R.

(29)

Example

Beers(name, manf, manfAddr)

FD’s: name->manf, manf->manfAddr • Only key is {name} .

(30)

Example (II)

Drinkers(name, addr, beersLiked, manf, favBeer)

FD’s: name->addr favBeer, beersLiked->manf

• Only key is {name, beersLiked}. • In each FD, the left side is

not

a

superkey.

(31)

Redundancy in BCNF

• BCNF ensures that no redundancy can be detected using FD information alone.

– Most desirable normal form (in terms of redundancy)

– 1. If XY->A

– 2. If X->A X Y A

x y1 a

(32)

Decomposition into BCNF

• Given: relation

R

with FD’s

F

.

• Look among the given FD’s for a BCNF violation X ->

B.

– If any FD following from F violates BCNF, then there will surely be an FD in F itself that violates BCNF.

(33)

Decompose

R

Using

X

->

B

• Replace

R

by relations with schemas: 1. R1 = X +.

2. R2 = R – (X + – X ).

Project given FD’s

F

onto the two new relations.

(34)

Decomposition Picture

R-X + X X +-X

R2

(35)

Example

Drinkers(name, addr, beersLiked, manf, favBeer)

F = name->addr, name -> favBeer, beersLiked->manf

Steps for BCNF decomposition: 1. Key: name, beersliked

2. find BCNF violation FD X->A: name->addr 3. compute X+: {name, addr, favbeer}

(36)

Example, Continued

• We are not done; we need to check Drinkers1 and Drinkers2 for BCNF.

For Drinkers1(name, addr, favBeer)

1. Projecting FD’s: name->addr, name->favbeer 2. Key: name

(37)

Example, Continued

For Drinkers2(name, beersLiked, manf), 1. Projecting FD’s: beersliked->manf

2. Key: name, beersliked

3. find BCNF violation FD X->A: beersliked->manf

4. compute X+: {beersliked, manf} 5. decompose to R3 and R4

R3: Drinkers3(beersLiked, manf) FDs? key?

(38)

Example, Concluded

• The resulting decomposition of Drinkers :

1. Drinkers1(name, addr, favBeer) 2. Drinkers3(beersLiked, manf)

3. Drinkers4(name, beersLiked)

• Notice: Drinkers1 tells us about drinkers,

Drinkers3 tells us about beers, and

Drinkers4 tells us the relationship between

(39)

Review: Problems Related to

Decompositions

• Do we need to decompose a relation?

Use FDs and normal forms

• What problems (if any) does the decomposition cause?

– 2 important properties:

• Lossless-join

• Dependency-preservation (decide efficiency in

checking the IC’s)

(40)

Problem with BCNF

• There is one structure of FD’s that

causes trouble when we decompose.

AB

->

C and C

->

B.

– Example: A = street address, B = city,

C = zip code.

• There are two keys, {

A

,

B

} and {

A

,

C

}.

(41)

We Cannot Enforce FD’s

• The problem is that if we use

AC

and

BC

as our database schema, we cannot enforce the FD AB ->

C

by checking

FD’s in these decomposed relations.

• Example with

A

= street,

B

= city, and

(42)

An Unenforceable FD

street zip 545 Tech Sq. 02138 545 Tech Sq. 02139

city zip Cambridge 02138 Cambridge 02139

Join tuples with equal zip codes.

street city zip 545 Tech Sq. Cambridge 02138 545 Tech Sq. Cambridge 02139

(43)

3NF

• 3rd Normal Form (3NF) modifies the BCNF condition so we do not have to decompose in this problem situation.

• An attribute is prime if it is a member of any key.

(44)

Example

• In our problem situation with FD’s AB ->

C and C

->

B, we have keys

AB

and

AC

.

• Thus

A

,

B

, and

C

are each prime.

(45)
(46)
(47)

Decomposition Properties

• There are two important properties of a decomposition:

1. Lossless-Join: it should be possible to project the original relations onto the

decomposed schema, and then reconstruct the original.

2. Dependency Preservation : it should be

possible to check in the projected relations

(48)

3NF vs. BCNF

• If R is in 3NF, some redundancy is possible.

• It is a compromise, used when BCNF not achievable (e.g., no ``good’’

decomp, or performance considerations).

(49)

3NF vs. BCNF (II)

• We can get (1) with a BCNF decomposition. • We can get both (1) and (2) with a 3NF

decomposition.

• But we can’t always get (1) and (2) with a BCNF decomposition.

(50)

Exercises

R(C,S,J,D,P,Q,V)

Key is C, JP->C, SD->P, J->S

Does JP->C violates BCNF? 3NF? What is a BCNF decomposition?

(51)

Summary

• If a relation is in BCNF, it is free of

redundancies that can be detected using FDs.

• If a relation is not in BCNF, we can try to decompose it into a collection of BCNF relations. All FDs are preserved?

• If a lossless-join, dependency preserving

Referensi

Dokumen terkait

Tirto, Siwalan dan Wonokerto 1 Paket Peningkatan Kualitas Permukiman Kumuh Kec Sragi 1 Paket Stimulan Perumahan Swadaya bagi RTM Desa Wonokerto Wetan, Kec. Wonokerto

Bahasa C adalah bahasa pemrograman yang dapat dikatakan berada antara bahasa tingkat rendah (bahasa yang berorientasi pada mesin) dan bahasa tingkat tinggi (bahasa yang

Data rata-rata dari hasil minat siswa antara kelas kontrol yaitu 8D dan kelas perlakuan yaitu 8E menunjukkan dari setiap indikator bahwa kelompok perlakuan

Adapun topik bahasan yang dipilih dalam Laporan Tugas Akhir ini adalah mengenai rangkaian mikrokontroler dan pemrograman bahasa C yang input kendalinya berasal dari water flow

ASUHAN KEPERAWATAN LANJUT USIA Pada Ny.P DENGAN DIAGNOSA MEDIS RHEUMATOID ARTHTRITIS DI UPT PELAYANAN.. SOSIAL LANJUT USIA PASURUAN BABAT

Rencana Terpadu dan Program Investasi Infrastruktur Jangka Menengah (RPI2-JM) Kabupaten Batu Bara merupakan dokumen rencana dan program pembangunan infrastruktur

Skripsi ini disusun untuk memenuhi salah satu syarat dalam menempuh ujian akhir Program Studi S.1 Keperawatan Fakultas Ilmu Kesehatan Universitas Muhammadiyah

Solidifikasi/Stabilisasi Limbah Slag Yang Mengandung Chrom (Cr) Dan Timbal (Pb) Dari Industri Baja Sebagai Campuran Dalam Pembuatan Concrete (Beton).. Dibuat untuk melengkapi