
SURROGATE KEYS AND DATA INTEGRITY

From the document Database Security: Problems and Solutions (pages 37-40)

the corresponding data inconsistencies have been resolved. For example, if Penny now updates her office address from “137 Main” to “417 Main,” that involves changing the data in exactly one location (in the Realtor table). The office address is not duplicated anywhere else, so there is now no possibility of a data inconsistency involving the new address and previous address. In a similar manner, if we were to now change the price of the “123 Big Lane” property, that change is applied to exactly one location (in the Property table) and there is now no possible data inconsistency involving the new price and previous price.

A potential data inconsistency, and with it a database integrity concern, still remains if a change is made to one of our primary key values. The reason is that, in our scenario, we still have a natural key chosen as the primary key for the Realtor table. A natural key is one whose values have meaning within the scenario. In Realtor, we chose Phone as the primary key, and Phone has such a meaning (that is, Phone contains the actual phone number of a realtor).

To see how the potential for a data inconsistency still remains, consider that Bob changes his phone number to “555-3333.” That data value exists not only in the Realtor table but potentially in multiple other database locations as a foreign key, in our case in the Listing table. There are ways to eliminate this type of data duplication and its potential for data inconsistency.

One way is to define and enforce referential integrity constraints with cascading updates: if a primary key value changes, the database system automatically applies that change to every foreign key reference to that primary key. With a cascading update defined between Realtor and Listing in our real estate listing scenario, if Bob changes his phone number and we apply that change to the Phone column in the Realtor table, the database system will automatically apply the same change to the Phone column of every Listing row that holds the previous value. While cascading updates are effective, they carry a performance overhead: if a primary key value appears as a foreign key value a large number of times, the cascade may take a significant amount of time to reach all occurrences of that foreign key value.
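The cascading update described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the layout of Listing before surrogate keys (Phone, PropAdr, PropCity), the reduced Realtor columns, and the helper names build_phone_keyed_db, change_phone, and listing_count are not taken from the text, and only a few of the scenario's rows are loaded.

```python
import sqlite3


def build_phone_keyed_db():
    """Create the design in which Phone is the natural primary key."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys (and cascades) only when this is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Realtor (
            Phone       TEXT PRIMARY KEY,
            RealtorName TEXT NOT NULL
        );
        -- Assumed layout of Listing before surrogate keys are introduced.
        CREATE TABLE Listing (
            Phone    TEXT NOT NULL
                     REFERENCES Realtor(Phone) ON UPDATE CASCADE,
            PropAdr  TEXT NOT NULL,
            PropCity TEXT NOT NULL,
            PRIMARY KEY (Phone, PropAdr, PropCity)
        );
        INSERT INTO Realtor VALUES ('555-1111', 'Penny'), ('555-2222', 'Bob');
        INSERT INTO Listing VALUES
            ('555-1111', '17 Highland',  'CityA'),
            ('555-2222', '5 Lighthouse', 'CityB'),
            ('555-2222', '190 Brown',    'CityC'),
            ('555-2222', '123 Big Lane', 'CityA');
    """)
    return conn


def change_phone(conn, old_phone, new_phone):
    """Update the primary key once; the cascade rewrites the foreign keys."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE Realtor SET Phone = ? WHERE Phone = ?",
            (new_phone, old_phone),
        )


def listing_count(conn, phone):
    """Count the Listing rows that reference a given phone number."""
    row = conn.execute(
        "SELECT COUNT(*) FROM Listing WHERE Phone = ?", (phone,)
    ).fetchone()
    return row[0]


conn = build_phone_keyed_db()
change_phone(conn, "555-2222", "555-3333")
print(listing_count(conn, "555-2222"), listing_count(conn, "555-3333"))
```

Only the Realtor row is updated explicitly; the database system rewrites every matching Listing row itself, which is where the overhead arises when a key value is referenced many times.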

A second way to resolve such data duplication and data inconsistencies, and to avoid the cascading overhead, is to introduce a surrogate key. A surrogate key is an added primary key that is not part of the original data and does not contain values meaningful to the scenario, but is instead introduced to provide a unique value in each row. We will introduce a surrogate key named RealtorID in the Realtor table, so the Realtor relation has the structure and data given in Figure 2.12. The RealtorID column holds unique values such as those shown.

Note that RealtorID is now the primary key, rather than Phone. However, Phone can still exist as a candidate key as an alternative means to retrieve a unique row in Realtor.
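As a rough sketch of this arrangement, the Realtor relation of Figure 2.12 could be declared with RealtorID as the primary key and Phone kept unique, so that it still identifies a single row. The use of SQLite, the TEXT column types, and the helper names build_realtor_table, find_by_phone, and add_realtor are assumptions made for illustration.

```python
import sqlite3


def build_realtor_table():
    """Create Realtor with a surrogate primary key and Phone as a candidate key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Realtor (
            RealtorID   TEXT PRIMARY KEY,      -- surrogate key
            RealtorName TEXT NOT NULL,
            OfficeAdr   TEXT NOT NULL,
            OfficeCity  TEXT NOT NULL,
            Phone       TEXT NOT NULL UNIQUE   -- candidate key
        );
        INSERT INTO Realtor VALUES
            ('R001', 'Penny', '137 Main', 'CityA', '555-1111'),
            ('R002', 'Bob',   '455 Oak',  'CityB', '555-2222');
    """)
    return conn


def find_by_phone(conn, phone):
    """Return (RealtorID, RealtorName) for a phone number, or None."""
    return conn.execute(
        "SELECT RealtorID, RealtorName FROM Realtor WHERE Phone = ?",
        (phone,),
    ).fetchone()


def add_realtor(conn, realtor_id, name, office_adr, office_city, phone):
    """Insert a realtor; return False if a key constraint rejects the row."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Realtor VALUES (?, ?, ?, ?, ?)",
                (realtor_id, name, office_adr, office_city, phone),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_realtor_table()
print(find_by_phone(conn, "555-2222"))
```

Because Phone is declared UNIQUE, a lookup by phone number returns at most one row, and a second realtor with an existing phone number is rejected, even though Phone is no longer the primary key.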

We also have similar database integrity concerns with the use of PropAdr and PropCity as a natural primary key for Property and the duplication of its values as foreign keys (although changes to PropAdr and PropCity may be less likely, unless a misspelled or inaccurate value was initially provided and needs correcting). As with Realtor, we can introduce a surrogate key for Property to avoid such integrity concerns. We will also see another advantage of surrogate keys: a surrogate key simplifies a composite primary key by reducing it to just one column. With a surrogate key named PropID introduced in the Property table, we have the Property table structure and data given in Figure 2.13.

Property (PropID, PropAdr, PropCity, NBeds, Area, Price)

Property
PropID  PropAdr        PropCity  NBeds  Area  Price
P001    17 Highland    CityA     3      2000  220000
P002    1565 State Rd  CityB     4      2900  290000
P003    997 George     CityA     4      2200  240000
P004    123 Big Lane   CityA     8      5000  750000
P005    5 Lighthouse   CityB     4      2000  230000
P006    190 Brown      CityC     2      1700  140000

FIGURE 2.13 Property table with surrogate key.

Realtor (RealtorID, RealtorName, OfficeAdr, OfficeCity, Phone)

Realtor
RealtorID  RealtorName  OfficeAdr  OfficeCity  Phone
R001       Penny        137 Main   CityA       555-1111
R002       Bob          455 Oak    CityB       555-2222

FIGURE 2.12 Realtor table structure and data with surrogate key.

In addition to reducing the database integrity concerns that come with data duplication, reducing the data inconsistencies that may arise when a primary key value changes, and reducing the size of a primary key, a surrogate key can also help enforce referential integrity when a foreign key value is specified for an added or changed row. As an example, if one were to add a new row to Listing but mistype the phone number “555-1111” as “555-1112,” we would have a referential integrity constraint violation, because “555-1112” is not a primary key value in the current Realtor table. Likewise, because we now have a surrogate key in Property, we no longer have to specify a composite foreign key in Listing, which might otherwise result in mistyped values and referential integrity constraint violations. Because we now use surrogate keys for Realtor and Property, such mistyping of foreign key values may become less likely. The new Listing table structure and its data are as shown in Figure 2.14.

Listing (RealtorID, PropID)

Listing
RealtorID  PropID
R001       P001
R001       P002
R001       P003
R001       P004
R002       P005
R002       P006
R002       P004

FIGURE 2.14 Referencing surrogate keys.
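To illustrate how a referential integrity constraint catches a mistyped foreign key value, the following sketch declares Listing with foreign keys to the two surrogate keys and attempts a few inserts. SQLite, the reduced column sets, the subset of rows, and the helper names build_listing_db and try_add_listing are assumptions for illustration only.

```python
import sqlite3


def build_listing_db():
    """Create Realtor, Property, and Listing keyed on surrogate keys."""
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Realtor (
            RealtorID TEXT PRIMARY KEY,
            Phone     TEXT NOT NULL UNIQUE
        );
        CREATE TABLE Property (
            PropID   TEXT PRIMARY KEY,
            PropAdr  TEXT NOT NULL,
            PropCity TEXT NOT NULL
        );
        CREATE TABLE Listing (
            RealtorID TEXT NOT NULL REFERENCES Realtor(RealtorID),
            PropID    TEXT NOT NULL REFERENCES Property(PropID),
            PRIMARY KEY (RealtorID, PropID)
        );
        INSERT INTO Realtor VALUES ('R001', '555-1111'), ('R002', '555-2222');
        INSERT INTO Property VALUES
            ('P001', '17 Highland',  'CityA'),
            ('P004', '123 Big Lane', 'CityA');
        INSERT INTO Listing VALUES ('R001', 'P001'), ('R001', 'P004');
    """)
    return conn


def try_add_listing(conn, realtor_id, prop_id):
    """Insert a Listing row; return False if a constraint rejects it."""
    try:
        with conn:  # rolls back automatically if the insert fails
            conn.execute(
                "INSERT INTO Listing VALUES (?, ?)", (realtor_id, prop_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_listing_db()
print(try_add_listing(conn, "R002", "P004"))  # both keys exist
print(try_add_listing(conn, "R003", "P004"))  # no realtor R003
```

The second insert is refused because R003 is not a primary key value in Realtor. Note that the constraint only rejects values that do not exist; a mistyped value that happens to match another existing key would still be accepted.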

Duplication of Phone values no longer exists, which helps enforce integrity. A change to a Phone value now requires only one change, in the Realtor table.
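A minimal sketch of this single-location change, using the Listing data of Figure 2.14: the phone number is updated in one Realtor row, and a join picks up the new value without any Listing row being touched. SQLite, the omission of the Property table, and the helper names build_surrogate_db, update_phone_by_id, and phones_for_property are assumptions made to keep the example short.

```python
import sqlite3


def build_surrogate_db():
    """Create Realtor and Listing, with Listing referencing RealtorID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Realtor (
            RealtorID   TEXT PRIMARY KEY,
            RealtorName TEXT NOT NULL,
            Phone       TEXT NOT NULL UNIQUE
        );
        -- Property is omitted to keep the sketch short, so PropID is
        -- stored here without its own foreign key constraint.
        CREATE TABLE Listing (
            RealtorID TEXT NOT NULL REFERENCES Realtor(RealtorID),
            PropID    TEXT NOT NULL,
            PRIMARY KEY (RealtorID, PropID)
        );
        INSERT INTO Realtor VALUES
            ('R001', 'Penny', '555-1111'),
            ('R002', 'Bob',   '555-2222');
        INSERT INTO Listing VALUES
            ('R001', 'P001'), ('R001', 'P002'), ('R001', 'P003'),
            ('R001', 'P004'), ('R002', 'P005'), ('R002', 'P006'),
            ('R002', 'P004');
    """)
    return conn


def update_phone_by_id(conn, realtor_id, new_phone):
    """Change one realtor's phone; return the number of rows updated."""
    with conn:
        cur = conn.execute(
            "UPDATE Realtor SET Phone = ? WHERE RealtorID = ?",
            (new_phone, realtor_id),
        )
    return cur.rowcount


def phones_for_property(conn, prop_id):
    """List the phone numbers of the realtors listing a property."""
    rows = conn.execute(
        "SELECT r.Phone FROM Listing AS l "
        "JOIN Realtor AS r ON r.RealtorID = l.RealtorID "
        "WHERE l.PropID = ? ORDER BY r.Phone",
        (prop_id,),
    ).fetchall()
    return [phone for (phone,) in rows]


conn = build_surrogate_db()
print(update_phone_by_id(conn, "R002", "555-3333"))
print(phones_for_property(conn, "P004"))
```

Since Listing stores only RealtorID, no cascade is needed: the seven Listing rows stay exactly as they were, and queries that need the phone number read it from Realtor.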

2.3 NORMALIZATION, ACCESS RESTRICTIONS, AND
