
Database Management System


Academic year: 2023


(1)

Database Management System

Lecture 7 Index - Hashing

* Some materials adapted from R. Ramakrishnan, J. Gehrke and Shawn Bowers

(2)

Today’s Agenda

• Index

Hash


(3)

Hash Indexes (Hashing)

(4)

Basic Database Architecture


(5)

Hash Indexes (Hashing)

• Again, 3 alternatives for data entry k*

Data record with key value k

<k, RID of record having k>

<k, list of RIDs of records having k>

• Hash-based indexes are best for equality selections

Cannot support range searches

• Static and dynamic hashing techniques

Trade-offs are similar to those of ISAM vs. B+ trees
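The three alternatives for a data entry k* can be sketched in a few lines. This is only an illustration: the records, the RIDs, and the choice of age as the search key are assumptions, not part of the lecture.

```python
from collections import defaultdict

# Hypothetical records: (rid, name, age). The search key k is age.
records = [(0, "Abby", 18), (1, "Brown", 45), (2, "Fox", 18)]

# Alternative 1: the data entry is the data record itself.
alt1 = defaultdict(list)
for rid, name, age in records:
    alt1[age].append((name, age))

# Alternative 2: one <k, RID> pair per record.
alt2 = sorted((age, rid) for rid, name, age in records)

# Alternative 3: one <k, list of RIDs> entry per distinct key value.
alt3 = defaultdict(list)
for rid, name, age in records:
    alt3[age].append(rid)
```

An equality selection such as age = 18 is a single probe of `alt3`, whereas a range such as age > 17 would have to inspect every key, which is why hash-based entries suit equality selections.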


(6)

Static Hashing

• Number of primary pages fixed

Allocated sequentially

Never de-allocated

Overflow pages if needed

• A hash function h : K → Int

h(k) mod M = bucket to which the data entry for key k belongs (M = # of buckets)
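A minimal sketch of the bucket computation, assuming string keys, M = 4 buckets, and CRC32 as the hash function h (the lecture does not prescribe a particular h):

```python
import zlib

M = 4  # number of primary bucket pages (assumed value)

def h(key: str) -> int:
    # Assumed hash function: CRC32 of the UTF-8 bytes, a non-negative integer.
    return zlib.crc32(key.encode("utf-8"))

def bucket_of(key: str, m: int = M) -> int:
    # The data entry for `key` belongs to bucket h(key) mod m.
    return h(key) % m
```

Because M is fixed in static hashing, `bucket_of` always maps a given key to the same primary page.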


(7)

Static Hashing

[Figure: a key is passed through h (mod M) to select one of the primary bucket pages 0, 1, 2, …, M-1; each primary bucket page holds data entries k* and may be followed by overflow pages (buckets).]

(8)

Static Hashing

• Buckets contain data entries

• Hash function on search key field of records

Distributes values over range 0 ... M-1 (no guarantee about the distribution)

• Overflow chains can develop & degrade performance

Extendible and Linear hashing: dynamic techniques to fix this
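How overflow chains develop can be seen in a small simulation. This is a sketch under stated assumptions: integer keys, h(k) = k, and a hypothetical page capacity of two data entries; `StaticHashIndex` is an illustrative name, not something from the lecture.

```python
class StaticHashIndex:
    """Static hashing: a fixed number of primary pages plus overflow pages."""

    def __init__(self, num_buckets, page_capacity):
        self.m = num_buckets
        self.cap = page_capacity
        # Each bucket is a chain of pages; the first page is the primary page.
        self.buckets = [[[]] for _ in range(num_buckets)]

    def _bucket(self, key):
        return key % self.m  # assumes integer keys and h(k) = k

    def insert(self, key, rid):
        chain = self.buckets[self._bucket(key)]
        if len(chain[-1]) >= self.cap:
            chain.append([])  # last page is full: allocate an overflow page
        chain[-1].append((key, rid))

    def search(self, key):
        """Return (matching RIDs, number of pages read)."""
        chain = self.buckets[self._bucket(key)]
        rids = []
        for page in chain:
            rids.extend(rid for k, rid in page if k == key)
        return rids, len(chain)


idx = StaticHashIndex(num_buckets=4, page_capacity=2)
for rid, key in enumerate([0, 4, 8, 12, 16]):  # all keys hash to bucket 0
    idx.insert(key, rid)
```

Here `idx.search(16)` has to read three pages (one primary, two overflow), while a search in any other bucket reads one page. A skewed key distribution therefore degrades lookups toward a linear scan of the chain.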


(9)

Terminology: Dense Index

• One index entry for each data record

Index (search key is “name”): Anneli, David, Fox, Melady, Optimus, Snoop, Ugene, Volde

(10)

Terminology: Clustered Index

• Records are sorted based on search key


<Anneli, 25, 1990>

<David, 31, 1980>

<Fox, 1, 2000>

<Melady, 2, 2001>

<Optimus, 30, 2010>

<Snoop, 13, 1999>

<Ugene, 10, 1997>

<Volde, 22, 2002>

Data file

(11)

Terminology: Dense Clustered Index

• One index entry for each data record

• Records are sorted based on search key


<Anneli, 25, 1990>

<David, 31, 1980>

<Fox, 1, 2000>

<Melady, 2, 2001>

<Optimus, 30, 2010>

<Snoop, 13, 1999>

<Ugene, 10, 1997>

<Volde, 22, 2002>

Data file

Index (search key is “name”): Anneli, David, Fox, Melady, Optimus, Snoop, Ugene, Volde
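A dense clustered index on name might be sketched as follows. The page layout (three records per page) and the (page, slot) form of a record ID are assumptions made for illustration.

```python
import bisect

# Data file sorted on name; assumed layout of three records per page.
pages = [
    [("Anneli", 25, 1990), ("David", 31, 1980), ("Fox", 1, 2000)],
    [("Melady", 2, 2001), ("Optimus", 30, 2010), ("Snoop", 13, 1999)],
    [("Ugene", 10, 1997), ("Volde", 22, 2002)],
]

# Dense index: one <name, (page, slot)> entry for every data record.
dense_index = [
    (rec[0], (p, s))
    for p, page in enumerate(pages)
    for s, rec in enumerate(page)
]
keys = [k for k, _ in dense_index]

def dense_lookup(name):
    """Binary-search the index, then fetch the record it points to."""
    i = bisect.bisect_left(keys, name)
    if i < len(keys) and keys[i] == name:
        p, s = dense_index[i][1]
        return pages[p][s]
    return None
```

Since every record has an entry, a missing name can be rejected from the index alone without touching the data file.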

(12)

Terminology: Sparse Clustered Index

• One index entry per page of data

• Records sorted on search key

Index (search key is “name”): Anneli, Melady, Ugene

<Anneli, 25, 1990>

<David, 31, 1980>

<Fox, 1, 2000>

<Melady, 2, 2001>

<Optimus, 30, 2010>

<Snoop, 13, 1999>

<Ugene, 10, 1997>

<Volde, 22, 2002>

Data file
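A sparse clustered index keeps only the first key of each page, so a lookup finds the candidate page and then searches inside it. A minimal sketch, assuming three records per page (which matches the index entries Anneli, Melady, Ugene):

```python
import bisect

# Data file sorted on name; assumed layout of three records per page.
pages = [
    [("Anneli", 25, 1990), ("David", 31, 1980), ("Fox", 1, 2000)],
    [("Melady", 2, 2001), ("Optimus", 30, 2010), ("Snoop", 13, 1999)],
    [("Ugene", 10, 1997), ("Volde", 22, 2002)],
]

# Sparse index: one entry per page, holding the first name on that page.
first_keys = [page[0][0] for page in pages]

def sparse_lookup(name):
    """Find the last page whose first key is <= name, then scan that page."""
    i = bisect.bisect_right(first_keys, name) - 1
    if i < 0:
        return None  # name sorts before the first record in the file
    for rec in pages[i]:
        if rec[0] == name:
            return rec
    return None
```

This only works because the file is sorted on the search key: the page chosen by the index is the only page that can contain the name.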

(13)

Terminology: Unclustered Index

• Records NOT sorted on search key


<Anneli, 25, 1990>

<David, 31, 1980>

<Fox, 1, 2000>

<Melady, 2, 2001>

<Optimus, 30, 2010>

<Snoop, 13, 1999>

<Ugene, 10, 1997>

<Volde, 22, 2002>

Data file

If the search key is year, this becomes an unclustered index on year.

(14)

Combinations

Can we define

• A sparse unclustered index?

• A dense unclustered index?

HW: Find the answer and give reasons for your answer (1 week).

(15)

Terminology: Dense unclustered Index

• One entry per search key, records NOT sorted on search key


<Anneli, 25, 1990>

<David, 31, 1980>

<Fox, 1, 2000>

<Melady, 2, 2001>

<Optimus, 30, 2010>

<Snoop, 13, 1999>

<Ugene, 10, 1997>

<Volde, 22, 2002>

Data file

1980 1990 1997 1999 2000 2001 2002 2010

Index

Search key is “Year”
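The same data file can carry a dense unclustered index on year. In this sketch the three-records-per-page layout and the (page, slot) record IDs are assumptions; the point is that neighbouring index entries point to scattered pages.

```python
import bisect

# Data file sorted on name (NOT on year); assumed three records per page.
pages = [
    [("Anneli", 25, 1990), ("David", 31, 1980), ("Fox", 1, 2000)],
    [("Melady", 2, 2001), ("Optimus", 30, 2010), ("Snoop", 13, 1999)],
    [("Ugene", 10, 1997), ("Volde", 22, 2002)],
]

# Dense index on year: one <year, (page, slot)> entry per record, sorted by year.
year_index = sorted(
    (rec[2], (p, s))
    for p, page in enumerate(pages)
    for s, rec in enumerate(page)
)
years = [y for y, _ in year_index]

def lookup_year(year):
    """Return all records with the given year (equality selection)."""
    lo = bisect.bisect_left(years, year)
    hi = bisect.bisect_right(years, year)
    return [pages[p][s] for _, (p, s) in year_index[lo:hi]]

# Page number referenced by each index entry, in index order.
pages_in_index_order = [p for _, (p, _) in year_index]
```

`pages_in_index_order` jumps back and forth between pages, which is exactly what makes the index unclustered.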

(16)

I/O cost of using dense unclustered index

• For a range query:

Every data record costs I/O

You may have to re-read some pages

Cost to scan file in sort order equal to number of records in the file!

(this is really bad, which is why we’ll also talk about external sorting)

[Figure labels: internal pages, leaf pages, records in file]
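The cost difference can be made concrete by counting page reads while scanning the whole file in index order. This is a sketch under assumptions: three records per page, and a buffer that holds only the most recently read page.

```python
# Data file sorted on name; assumed layout of three records per page.
pages = [
    [("Anneli", 25, 1990), ("David", 31, 1980), ("Fox", 1, 2000)],
    [("Melady", 2, 2001), ("Optimus", 30, 2010), ("Snoop", 13, 1999)],
    [("Ugene", 10, 1997), ("Volde", 22, 2002)],
]

def build_index(field):
    """Dense index on the given field position: sorted <key, (page, slot)>."""
    return sorted(
        (rec[field], (p, s))
        for p, page in enumerate(pages)
        for s, rec in enumerate(page)
    )

def page_reads(index):
    """Count data-page reads for a full scan in index order,
    assuming only the most recently read page stays in the buffer."""
    reads, last = 0, None
    for _, (p, _) in index:
        if p != last:
            reads += 1
            last = p
    return reads

clustered_reads = page_reads(build_index(0))    # name: file is sorted on it
unclustered_reads = page_reads(build_index(2))  # year: file is not sorted on it
```

With these eight records the clustered scan reads 3 pages and the unclustered scan reads 7, with some pages re-read several times. Without any buffering the unclustered cost would be one read per record, i.e. 8.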

(17)

Composite Search Keys

• Which indexes should be used to answer these queries? Why?

Age = 18

Age = 18 and Salary = 22

Age = 18 and Salary > 15

Age > 17 and Salary > 23

Data file <Name, Age, Sal>, clustered on Name:

<Abby, 18, 24>
<Brown, 45, 80>
<Fox, 18, 22>
<Susan, 34, 75>

Index on <Age, Sal>: <18, 22>, <18, 24>, <34, 75>, <45, 80>

Index on <Sal, Age>: <22, 18>, <24, 18>, <75, 34>, <80, 45>

Index on <Age>: <18>, <18>, <34>, <45>

Index on <Sal>: <22>, <24>, <75>, <80>
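One way to reason about these queries is to treat a composite index as a list sorted on the full key and ask which predicates select a contiguous run of entries. The sketch below uses the <Age, Sal> index; integer ages and salaries and the helper `range_scan` are assumptions for illustration.

```python
import bisect

# Composite index on <Age, Sal>, sorted lexicographically.
age_sal = [(18, 22), (18, 24), (34, 75), (45, 80)]

def range_scan(index, lo, hi):
    """Return the contiguous run of entries e with lo <= e < hi."""
    return index[bisect.bisect_left(index, lo):bisect.bisect_left(index, hi)]

# Age = 18: a prefix match, contiguous in <Age, Sal> order.
q1 = range_scan(age_sal, (18,), (19,))

# Age = 18 and Salary = 22: a full-key equality match.
q2 = range_scan(age_sal, (18, 22), (18, 23))

# Age = 18 and Salary > 15: equality on the prefix, range on the suffix.
q3 = range_scan(age_sal, (18, 16), (19,))

# Age > 17 and Salary > 23: only the Age range is contiguous here, so the
# Salary condition has to be checked on every entry in that range.
q4 = [e for e in range_scan(age_sal, (18,), (10**9,)) if e[1] > 23]
```

In the <Sal, Age> index the entries with Age = 18 are not guaranteed to be adjacent, so a query on Age alone cannot use it as a single range scan.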

(18)

Indexes (almost) final words

• A DBMS often creates clustered indexes on primary keys

i.e., the file is sorted on the declared keys

Since foreign keys must reference primary key values, these indexes are used in joins (e.g., Employee.cid = Company.id)

• Only one clustered index is created per table ... Why?

• As a DB designer/administrator, you must determine whether you need additional (unclustered) indexes

Including composite indexes

You have to consider trade-offs between query & update

This is one reason why DBAs get paid so well!


(19)

For Next Week

• Read

Ch. 12: 12.3.3

Ch. 14: 14.4 (up to 14.4.2)
