Struktur Data & Algoritme (Data Structures & Algorithms)

Academic year: 2021

Denny ([email protected])

Suryana Setiawan ([email protected])

Faculty of Computer Science, Universitas Indonesia, Even Semester 2004/2005

Version 2.0 - Internal Use Only

B Trees

SDA/BTREE/V2.0/2

Outline

• Motivation
• B-Tree
• B+Tree

Motivation

• Consider the following case:
  • You have to write a database program that stores the yellow pages for the Jakarta area, say 500,000 entries.
  • Each entry holds a name, address, telephone number, etc. Assume each entry is stored in one record of 512 bytes.
  • Total file size = 500,000 * 512 bytes = 256,000,000 bytes (256 MB).
    • too large to keep in memory (primary storage)
    • has to be stored on disk (secondary storage)
• If we use a disk for storage, we have to use the disk's block structure to store the database.
  • Secondary storage is divided into blocks of equal size, typically 512 bytes, 2 KB, 4 KB, or 8 KB.
  • A block is the unit of transfer between disk and memory. Even if a program reads only 10 bytes from disk, one whole block is read from disk and placed in memory.

Motivation

• Suppose one disk block is 8,192 bytes (8 KB).
• Number of blocks required = 256,000,000 bytes / 8,192 bytes per block = 31,250 blocks.
• Each block stores 8,192 / 512 = 16 records.
• A disk access is very expensive because of mechanical limitations.
• Disk access is approx. 10,000 times slower than main memory. One disk access is worth about 200,000 instructions.
• The number of disk accesses will dominate the running time.
• Goal: a multiway search tree that minimizes disk accesses.
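
The block arithmetic in the Motivation slides can be checked with a few lines of Python. This is only a sketch: the constant names are illustrative, and the values are the ones stated in the slides (500,000 records of 512 bytes, 8,192-byte blocks).

```python
# Figures taken from the slides; the names are illustrative only.
NUM_RECORDS = 500_000
RECORD_SIZE = 512      # bytes per record
BLOCK_SIZE = 8_192     # bytes per disk block (8 KB)

file_size = NUM_RECORDS * RECORD_SIZE
records_per_block = BLOCK_SIZE // RECORD_SIZE
# Ceiling division: a partially filled last block still occupies a block.
blocks_needed = -(-NUM_RECORDS // records_per_block)

print(file_size)          # 256000000
print(records_per_block)  # 16
print(blocks_needed)      # 31250
```

A sequential scan of the whole file would therefore touch 31,250 blocks, which is the cost the tree index is meant to avoid.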

B-Tree

• A B-tree is an (a,b)-tree in which b = 2a - 1. a and b are usually fairly large numbers, for example a = 100 and b = 199.
• B-trees are widely used as external (on-disk) data structures.
• Each node is sized to match the disk block size, for example 1 block = 8 KB.
• The goal: minimize the number of block transfers.

B Trees

• The problem with binary trees is balance: the tree can easily degenerate into a linked list, and the reduced search times are then lost. B-Trees overcome this problem.
• The B is commonly read as Balanced: all the leaves are the same distance from the root, so B-Trees guarantee a predictable efficiency.
• There are several varieties of B-Trees; most applications use the B+Tree.

B Tree

• A B Tree of degree m has the following properties:
  • All non-leaf nodes (except the root, which is not bound by a lower limit) have between ⌈m/2⌉ and m non-empty children.
  • A non-leaf node that has n branches contains n-1 keys.
  • All leaves are at the same level, that is, at the same depth from the root.

[Figure: example B Tree with keys 0625, 1000, 1250, 1277, 1282, 1291, 1425, 2000. Branch labels: < (left), >= (right).]
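
The three properties above can be expressed as a small checker. The sketch below is an illustration, not part of the original slides: the `Node` class, the `check_btree` function, and the sample tree are all hypothetical, and key ordering inside nodes is deliberately not checked.

```python
class Node:
    """Hypothetical in-memory node: a leaf has no children."""
    def __init__(self, keys, children=()):
        self.keys = list(keys)
        self.children = list(children)

def check_btree(root, m):
    """Return True if the tree satisfies the three degree-m properties."""
    min_children = -(-m // 2)          # ceil(m / 2)
    leaf_depths = set()

    def visit(node, depth, is_root):
        if not node.children:          # leaf: only its depth matters
            leaf_depths.add(depth)
            return True
        n = len(node.children)
        # Property 1: between ceil(m/2) and m children (root has no lower limit).
        if n > m or (not is_root and n < min_children):
            return False
        # Property 2: n branches means n - 1 keys.
        if len(node.keys) != n - 1:
            return False
        return all(visit(c, depth + 1, False) for c in node.children)

    # Property 3: every leaf is at the same depth.
    return visit(root, 0, True) and len(leaf_depths) == 1

# Sample tree of degree 4 (values chosen for illustration only).
good = Node([50], [
    Node([20, 30], [Node([10]), Node([25]), Node([40])]),
    Node([70], [Node([60]), Node([80])]),
])
# Same shape, but one subtree is a bare leaf, so leaf depths differ.
bad = Node([50], [Node([20], [Node([10]), Node([25])]), Node([60])])

print(check_btree(good, 4))
print(check_btree(bad, 4))
```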

B+Tree

• A B+Tree is a variant of the B-tree:
  • all key values are stored in the leaves.
  • an additional pointer links the leaf nodes together as a linear linked list.
    • This structure allows sequential access to the data in the tree without moving up and down its hierarchy.
  • internal nodes are used as an 'index'. Some …

B+Tree

[Figure: example B+Tree with root key 1250, index keys 0625, 1000, 1425, 2000, and leaf nodes holding 0350, 0625, 1000, 1250, 1300, 1425, 1600, 2000. Branch labels: < (left), >= (right).]

B+Tree Node Structure

A high level node (internal node):

P1 | K1 | P2 | K2 | ... | Pn-1 | Kn-1 | Pn

• P1: pointer to subtree for keys < K1
• P2: pointer to subtree for keys >= K1 and < K2
• Pn-1: pointer to subtree for keys >= Kn-2 and < Kn-1
• Pn: pointer to subtree for keys >= Kn-1

A leaf node (every key value appears in a leaf node):

P1 | K1 | P2 | K2 | ... | Pn-1 | Kn-1 | Pn

• P1: pointer to record (block) with key K1
• P2: pointer to record (block) with key K2
• Pn-1: pointer to record (block) with key Kn-1
• Pn: pointer to leaf with smallest key greater than Kn-1
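
One possible in-memory rendering of the two node layouts is sketched below. The class and field names (`LeafNode`, `InternalNode`, `records`, `next_leaf`) are assumptions made for illustration; an on-disk implementation would pack the same fields into one block instead.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Union

@dataclass
class LeafNode:
    # K1 .. Kn-1 in sorted order.
    keys: List[int] = field(default_factory=list)
    # P1 .. Pn-1: records[i] is the record (or block) for keys[i].
    records: List[Any] = field(default_factory=list)
    # Pn: the leaf holding the next larger keys, or None for the last leaf.
    next_leaf: Optional["LeafNode"] = None

@dataclass
class InternalNode:
    # K1 .. Kn-1 in sorted order.
    keys: List[int] = field(default_factory=list)
    # P1 .. Pn: children[i] covers keys >= keys[i-1] and < keys[i].
    children: List[Union["InternalNode", LeafNode]] = field(default_factory=list)

# A two-leaf tree with illustrative keys and placeholder records.
right = LeafNode(keys=[89, 123], records=["rec-89", "rec-123"])
left = LeafNode(keys=[18, 67], records=["rec-18", "rec-67"], next_leaf=right)
root = InternalNode(keys=[89], children=[left, right])

print(len(root.children) == len(root.keys) + 1)
```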

Example of a B+Tree

[Figure: the example B+Tree with each key in the leaf nodes pointing to its actual data record. Branch labels: < (left), >= (right).]

Queries on B+Trees

• Find all records with a search-key value of k.
  1. Start with the root node.
     1. Examine the node for the smallest search-key value > k.
     2. If such a value exists, call it Ki. Then follow Pi to the child node.
     3. Otherwise k ≥ Km-1, where there are m pointers in the node. Then follow Pm to the child node.
  2. If the node reached by following the pointer above is not a leaf node, repeat the above procedure on that node and follow the corresponding pointer.
  3. Eventually reach a leaf node. If for some i, key Ki = k, follow pointer Pi to the desired record (or bucket). Otherwise, no record with search-key value k exists.
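
The lookup procedure can be sketched in Python as follows. The `Node` class, the `find` function, and the placeholder record strings are hypothetical; `bisect_right` returns the position of the smallest key greater than k, which is exactly the pointer the procedure says to follow.

```python
from bisect import bisect_right

class Node:
    """Hypothetical node: internal nodes set children, leaves set records."""
    def __init__(self, keys, children=None, records=None):
        self.keys = keys
        self.children = children
        self.records = records

def find(root, k):
    """Return the record stored under key k, or None if k is absent."""
    node = root
    while node.children is not None:
        # Index of the smallest key > k; if none exists, the last pointer.
        node = node.children[bisect_right(node.keys, k)]
    for key, record in zip(node.keys, node.records):
        if key == k:
            return record
    return None

def leaf(*keys):
    return Node(list(keys), records=[f"record-{key}" for key in keys])

# Illustrative tree: one root, two internal nodes, five leaves.
leaves = [leaf(18, 34), leaf(36, 55), leaf(67, 87), leaf(89, 99), leaf(104, 123)]
root = Node([89], children=[
    Node([36, 67], children=leaves[:3]),
    Node([104], children=leaves[3:]),
])

print(find(root, 87))  # record-87
print(find(root, 50))  # None
```

Every lookup touches exactly one node per level, so the number of block reads equals the height of the tree.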

Queries on B+Trees: Range Query

• Find all records with a search-key value > k and < l (range query).
  • Locate the leaf node for k, as in the query for a single search-key value k.
  • While the next search-key value is < l, follow the corresponding pointer to the records.
    • When the current search-key is the last search-key in a node, follow the last pointer Pn to the next leaf node.
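
A range query can be sketched as one descent followed by a walk along the leaf chain. The `Node` class, its `next` field, and `range_query` are hypothetical names, and the sketch returns keys rather than records to stay short.

```python
from bisect import bisect_right

class Node:
    """Hypothetical node: leaves have children=None and a next-leaf link."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children
        self.next = None

def range_query(root, k, l):
    """Return every key strictly between k and l, in ascending order."""
    node = root
    while node.children is not None:       # descend to the leaf for k
        node = node.children[bisect_right(node.keys, k)]
    result = []
    while node is not None:
        for key in node.keys:
            if key >= l:                   # past the end of the range
                return result
            if key > k:
                result.append(key)
        node = node.next                   # last pointer: next leaf
    return result

# Illustrative tree with five linked leaves.
leaves = [Node([18, 34]), Node([36, 55]), Node([67, 87]),
          Node([89, 99]), Node([104, 123])]
for a, b in zip(leaves, leaves[1:]):
    a.next = b
root = Node([89], [Node([36, 67], leaves[:3]), Node([104], leaves[3:])])

print(range_query(root, 36, 99))  # [55, 67, 87, 89]
```

Only the first step goes through the index; after that the scan never returns to an internal node.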

Insertion on B+Trees

• Find the leaf node in which the search-key value would appear.
• If the search-key value is already in the leaf node (non-unique search-key):
  • the record is added to the data file, and
  • if necessary, the search-key and the corresponding pointer are inserted into the leaf node.

Insertion on B+Trees

• If the search-key value is not there, add the record to the data file, then:
  • If there is room in the leaf node, insert the (key-value, pointer) pair into the leaf node, keeping the entries sorted.
  • Otherwise, split the node (along with the new (key-value, pointer) entry) as shown in the next slides.
• Splitting a node:
  • Take the (search-key value, pointer) pairs, including the one being inserted, in sorted order. Place the first ⌈n/2⌉ in the original node and the rest in a new node.
  • When splitting a leaf, promote the middle (median) key to the parent of the node being split, but retain a copy in the leaf.
  • When splitting an internal node, promote the middle (median) key to the parent of the node being split, but DO NOT retain a copy in the split node.
  • If the parent is full, split it and propagate the split further up.

Building a B+Tree

Insert the keys 67, 123, 89, 18, 34, 87, 99, 104, 36, 55, 78, 9 in that order. In the figures a node splits when it would hold a fourth key; left branches hold keys <, right branches hold keys >=.

• 67, 123, 89: a single leaf node, which is also the root node: [67 89 123].
• 18: the leaf overflows and splits into [18 67] and [89 123]. The split at leaf nodes: promote but retain a copy. New root node: [89]. Why promote 89, not 67?
• 34: leaf nodes [18 34 67] [89 123]; root node [89].
• 87: leaf [18 34 67] splits into [18 34] and [67 87], and 67 is promoted. Root node: [67 89].
• 99: leaf nodes [18 34] [67 87] [89 99 123]; root node [67 89].
• 104: leaf [89 99 123] splits into [89 99] and [104 123], and 104 is promoted. Root node: [67 89 104].
• 36: leaf nodes [18 34 36] [67 87] [89 99] [104 123]; root node [67 89 104].
• 55: double node split. Leaf [18 34 36] splits into [18 34] and [36 55], and 36 is promoted, which overflows the root [36 67 89 104]. The split at non-leaf nodes: promote and do not retain a copy. The root splits into [36 67] and [104], and 89 becomes the new root node.

The splitting of nodes proceeds upwards until a node that is not full is found. In the worst case the root node is split, increasing the height of the tree by 1.

Observations about B+Trees

• The B+Tree contains a relatively small number of levels (logarithmic in the number of data items), so searches can be conducted efficiently.
• In processing a query, a path is traversed in the tree from the root to some leaf node.
• If there are K search-key values in the file, the path is no longer than ⌈log_⌈m/2⌉(K)⌉, where m is the index blocking factor (the maximum number of pointers in a node).
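
To get a feel for the bound, the sketch below evaluates ⌈log_⌈m/2⌉(K)⌉ with integer arithmetic, which avoids floating-point rounding in the logarithm. The function name `max_path_length` is hypothetical, and the sample values reuse numbers from earlier slides (500,000 entries, b = 199).

```python
def max_path_length(num_keys: int, m: int) -> int:
    """Smallest h with ceil(m/2) ** h >= num_keys, i.e. ceil(log_{ceil(m/2)}(K))."""
    fanout = -(-m // 2)                 # ceil(m / 2), the minimum fan-out
    if num_keys < 1 or fanout < 2:
        raise ValueError("need num_keys >= 1 and m >= 3")
    height, reach = 0, 1
    while reach < num_keys:
        reach *= fanout
        height += 1
    return height

print(max_path_length(500_000, 199))  # 3
print(max_path_length(12, 4))         # 4
```

With a minimum fan-out of 100, half a million keys are reachable in at most three levels, so a lookup costs only a handful of disk accesses.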

Deletion on B+Trees

• Remove the (search-key value, pointer) pair from the leaf node.
• If the node has too few entries after the removal (minimum requirement: ⌈m/2⌉ children) and the entries in the node and a sibling fit into a single node, then:
  • Merge the two nodes into a single node.
  • Delete the pair (Ki-1, Pi), where Pi is the pointer to the deleted node, from its parent, recursively using the above procedure.

Deletion on B+Trees

• Otherwise, if the node has too few entries after the removal and the entries in the node and a sibling do not fit into a single node, then:
  • Redistribute the pointers between the node and a sibling so that both have at least the minimum number of entries.
  • Update the corresponding search-key value in the parent of the node.
• The node deletions may cascade upwards until a node that has ⌈m/2⌉ or more pointers is found.
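
The choice between merging and redistributing can be sketched for two adjacent leaves. This is a simplified illustration, not the full deletion algorithm: the `rebalance` function is a hypothetical name, leaves are plain sorted key lists, the limit of 3 keys per leaf is an assumption, and updating the parent is left to the caller.

```python
def rebalance(left, right, max_keys):
    """Fix an under-full leaf using its adjacent sibling.

    left and right are the sorted key lists of two neighbouring leaves.
    Returns (new_left, new_right, new_separator); new_right and
    new_separator are None when the two leaves were merged.
    """
    combined = left + right
    if len(combined) <= max_keys:
        # Everything fits in one node: merge. The caller must then delete
        # the separator key and the pointer to the right leaf from the parent.
        return combined, None, None
    # Otherwise split the entries as evenly as possible between the two
    # leaves and report the new separator for the parent.
    half = (len(combined) + 1) // 2
    new_left, new_right = combined[:half], combined[half:]
    return new_left, new_right, new_right[0]

# After deleting 99 from [89, 99], the sibling [67, 78, 87] can lend a key.
print(rebalance([67, 78, 87], [89], 3))  # ([67, 78], [87, 89], 87)
# After deleting 87 from [67, 87], the leaf fits into its sibling [36, 55].
print(rebalance([36, 55], [67], 3))      # ([36, 55, 67], None, None)
```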

Summary

• B Tree is mostly used as an external data structure for databases.
• A B Tree of degree m has the following properties:
  • All non-leaf nodes (except the root, which is not bound by a lower limit) have between ⌈m/2⌉ and m children.
  • A non-leaf node that has n branches contains n-1 keys.
  • All leaves are at the same level, that is, at the same depth from the root.
• B+Tree is a variant of B Tree in which all key values are stored in the leaves.

Further Reading

• B+Tree simulation applet:
  • http://www.cs.msstate.edu/~cs2314/global/BTreeAnimation/visualization.html

What’s Next
