Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases 11
Computer Science 202 Computer Science 202
Database Systems:
Database Systems:
Database Design Database Design
2 2 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Objectives Objectives
To learn what an information system is.
To learn what an information system is.
To learn what a Database Life Cycle To learn what a Database Life Cycle (DBLC) is.
(DBLC) is.
To learn what a Systems Development To learn what a Systems Development Life Cycle (SDLC) is.
Life Cycle (SDLC) is.
To demonstrate the relationship between To demonstrate the relationship between the DBLC and the SDLC.
the DBLC and the SDLC.
To learn how database design fits into the To learn how database design fits into the design of an information system.
design of an information system.
3 3 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Data
Data vs. Information Information
Data Data
n
n
Raw facts stored in a database Raw facts stored in a database
n
n
Little to no meaning in isolation Little to no meaning in isolation
n
n
Require organisation in order to be useful Require organisation in order to be useful Information
Information
n
n
Used for decision making Used for decision making
n
n
Result of a transformation of data Result of a transformation of data
n
n
Consists of data processed and organised in Consists of data processed and organised in a useful way
a useful way – – e.g. tables, graphs e.g. tables, graphs
4 4 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
The Information System The Information System
A system containing a repository of data, A system containing a repository of data, which is able to process the data and which is able to process the data and produce informational output
produce informational output
Manages data collection, storage and Manages data collection, storage and retrieval
retrieval
Also consists if the people and processes Also consists if the people and processes related to the management of data related to the management of data
5 5 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Data processing Data processing
6 6 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Systems Analysis Systems Analysis
Systems Analysis
Systems Analysis is the process that is the process that establishes the need for and the extent of a establishes the need for and the extent of a Information System
Information System
The process of creating an information system is The process of creating an information system is Systems Development
Systems Development
Database design and development are an Database design and development are an integral part of the systems development integral part of the systems development process
process
Systems development follows a lifecycle (SDLC)
Systems development follows a lifecycle (SDLC)
7 7 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Systems Development Life Cycle Systems Development Life Cycle
8 8 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
SDLC SDLC
Planning Planning Analysis Analysis
Detailed systems design Detailed systems design Implementation
Implementation Maintenance Maintenance
9 9 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
SDLC
SDLC - - Planning Planning
Planning of the project Planning of the project Definition of
Definition of
n n
Scope Scope
n n
Budget Budget
n
n
Resources Resources
10 10 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
SDLC
SDLC - - Analysis Analysis
Break down the project into manageable Break down the project into manageable components
components
Identify specifics regarding components Identify specifics regarding components Gather Requirements
Gather Requirements
11 11 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
SDLC
SDLC – –Detailed Design Detailed Design
In detail specification of In detail specification of
n
n
HCI components HCI components
n
n
Data structures Data structures
n n
API API
n
n
Internal Component functionality Internal Component functionality Testing Framework designed Testing Framework designed
12 12 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
SDLC
SDLC - - Implementation Implementation
Implementation of the System Implementation of the System
Can be on a component by component Can be on a component by component level (eXtreme Programming)
level (eXtreme Programming)
Implementation includes development of Implementation includes development of
n
n
Test harnesses Test harnesses
n
n
Unit Tests Unit Tests
n
n
System Tests System Tests
13 13 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
SDLC
SDLC – – Maintenance Maintenance
Corrective process Corrective process
Remedy errors found in testing Remedy errors found in testing Iterative
Iterative
Can also follow post deployment Can also follow post deployment
14 14 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DB Development Lifecycle DB Development Lifecycle
15 15 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC DBLC
Initial study Initial study Database Design Database Design
n
n
Conceptual Design Conceptual Design
Data analysis Data analysisERD construction and data normalisation ERD construction and data normalisation Data model verification
Data model verification ER Verification ER Verification
n
n
DBMS software selection DBMS software selection
n
n
Logical Design Logical Design
n
n
Physical Design Physical Design
Implementation and Loading Implementation and Loading Testing and evaluation Testing and evaluation Operation
Operation
Maintenance and Evaluation Maintenance and Evaluation
16 16 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC
DBLC – – Initial Design Initial Design
Analyse company situation Analyse company situation
n
n
Current Operating environment Current Operating environment
n
n
Organisational Structure Organisational Structure
n
n
Information Flow Information Flow Define the problem Define the problem
Define scope and constraints of project Define scope and constraints of project Define design objectives
Define design objectives
17 17 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC Initial Design DBLC Initial Design
18 18 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC
DBLC – – DB DESIGN DB DESIGN
Database Design Database Design
n
n
Conceptual Design Conceptual Design
n
n
DBMS software selection DBMS software selection
n
n
Logical Design Logical Design
n
n
Physical Design Physical Design
Critical part of whole process Critical part of whole process Focus on the Data requirements Focus on the Data requirements Final design should satisfy the Final design should satisfy the requirements
requirements
19 19 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC
DBLC – – DB DESIGN DB DESIGN
20 20 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC
DBLC – – DB DESIGN DB DESIGN Conceptual Conceptual
Develop a high
Develop a high- -level abstraction of the level abstraction of the data model
data model
Conceptual Design consists of four distinct Conceptual Design consists of four distinct parts
parts
Data analysis Data analysis
ERD construction and data normalisation ERD construction and data normalisation Data model verification
Data model verification ER Verification ER Verification
21 21 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Data Analysis and Requirements Data Analysis and Requirements
Focus on:
Focus on:
n
n
Information needs Information needs
n
n
Information users Information users
n
n
Information sources Information sources
n
n
Information constitution Information constitution Data sources
Data sources
n
n
Developing and gathering end Developing and gathering end- -user data views user data views
n
n
Direct observation of current system Direct observation of current system
n
n
Interfacing with systems design group Interfacing with systems design group Business rules
Business rules
22 22 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Entity Relationship Entity Relationship Modeling and Normalization Modeling and Normalization
23 23 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
E
E- - R Modeling is Iterative R Modeling is Iterative
Figure 6.8
24 24 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Concept Design: Tools and Concept Design: Tools and
Sources
Sources
25 25 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Data Model Verification Data Model Verification
E
E- - R model is verified against proposed system R model is verified against proposed system processes
processes
n
n
End user views and required transactions End user views and required transactions
n
n
Access paths, security, concurrency control Access paths, security, concurrency control
n
n
Business Business- -imposed data requirements and constraints imposed data requirements and constraints Reveals additional entity and attribute details Reveals additional entity and attribute details Define major components as modules Define major components as modules
n
n
Cohesively Cohesively
n
n
Coupling Coupling
26 26 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
E
E- - R Model Verification Process R Model Verification Process
27 27 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Iterative Process of Iterative Process of
Verification Verification
Figure 6.10
28 28 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Distributed Database Design Distributed Database Design
Design portions in different physical Design portions in different physical locations
locations
Development of data distribution and Development of data distribution and allocation strategies
allocation strategies
29 29 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC
DBLC – – DB DESIGN DB DESIGN Software Selection Software Selection
Selection of the correct software is critical Selection of the correct software is critical For possible candidates, both advantages For possible candidates, both advantages and disadvantages need to be compared and disadvantages need to be compared Common influencing factors:
Common influencing factors:
n
n
Cost Cost – – purchase and ongoing support purchase and ongoing support
n
n
DBMS features and management tools DBMS features and management tools
n
n
Hardware/Platform requirements Hardware/Platform requirements
n
n
Underlying Model ( e.g.. OOD , RD) Underlying Model ( e.g.. OOD , RD)
30 30 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC
DBLC – – DB DESIGN DB DESIGN Logical Design Logical Design
Process of converting the conceptual model into Process of converting the conceptual model into an appropriate form for the DBMS chosen an appropriate form for the DBMS chosen Maps conceptual objects to logical DB Maps conceptual objects to logical DB constructs ( tables and relations) constructs ( tables and relations) Common components:
Common components:
n n
Tables Tables
n n
Indexes Indexes
n n
Views Views
n
n
Access Control Access Control
31 31 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC
DBLC – – DB DESIGN DB DESIGN Physical Design Physical Design
Physical layout of data on partitions Physical layout of data on partitions Highly technical
Highly technical
Can have an big impact on performance Can have an big impact on performance Most designers favour transparency of Most designers favour transparency of physical layout
physical layout
Complicated on distributed systems Complicated on distributed systems
32 32 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Physical Design Physical Design
33 33 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC
DBLC - - Implementation and Implementation and Loading
Loading
Creation of special storage
Creation of special storage- -related constructs related constructs to house end
to house end- -user tables user tables Data loaded/imported into tables Data loaded/imported into tables Other issues to be considered Other issues to be considered
n
n
Performance Performance
n n
Security Security
n
n
Backup and recovery Backup and recovery
n n
Integrity Integrity
n
n
Company standards Company standards
n
n
Concurrency controls Concurrency controls
34 34 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC
DBLC - - Testing and evaluation Testing and evaluation
Database is tested and fine
Database is tested and fine- -tuned against a tuned against a number of criteria, including
number of criteria, including
n
n
performance, integrity, concurrent access security performance, integrity, concurrent access security constraints
constraints
Done in parallel with application development Done in parallel with application development Corrective actions taken to correct problems:
Corrective actions taken to correct problems:
n
n
Fine Fine- -tuning based on reference manuals tuning based on reference manuals
n
n
Modification of physical design Modification of physical design
n
n
Modification of logical design Modification of logical design
n
n
Upgrade or change DBMS software or hardware Upgrade or change DBMS software or hardware
35 35 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC
DBLC - - Operation Operation
Database considered operational Database considered operational
‘Go
‘Go- -live’ procedures live’ procedures
Possible running in parallel with old systems Possible running in parallel with old systems Starts process of system evaluation Starts process of system evaluation Unforeseen problems may surface Unforeseen problems may surface
Demand for change is constant and must be Demand for change is constant and must be managed
managed
Importance of adherence to the original Importance of adherence to the original specification.
specification.
36 36 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
DBLC
DBLC - - Maintenance and Maintenance and Evaluation
Evaluation
Preventative maintenance Preventative maintenance Corrective maintenance Corrective maintenance Adaptive maintenance Adaptive maintenance
Assignment of access permissions Assignment of access permissions Generation of database access statistics to Generation of database access statistics to monitor performance
monitor performance
Periodic security audits based on system Periodic security audits based on system - - generated statistics
generated statistics Periodic system usage
Periodic system usage- -summaries summaries
37 37 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Design Strategies Design Strategies
Top Top- -down down
n
n
Identify the data sets Identify the data sets
n
n
Define the data elements Define the data elements
Bottom Bottom- -up up
n
n
Identify data elements Identify data elements
n
n
Group elements into data sets Group elements into data sets
38 38 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Design Strategies Design Strategies
39 39 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Centralised vs. Decentralised Centralised vs. Decentralised
Design Design
Centralized design Centralized design
n
n
Typical of simple databases Typical of simple databases
n
n
Conducted by single person or small team Conducted by single person or small team Decentralized design
Decentralized design
n
n
Larger numbers of entities and complex Larger numbers of entities and complex relations
relations
n
n
Spread across multiple sites Spread across multiple sites
n
n
Developed by teams Developed by teams
40 40 Barry Irwin August 2003
Barry Irwin August 2003 Rhodes University CS202 : DatabasesRhodes University CS202 : Databases
Decentralised
Decentralised Design Design