Database Management
Foundations of Information Systems
MS PowerPoint 4.0 Presentation MS PowerPoint 4.0 Presentation
MS PowerPoint 4.0 Presentation MS PowerPoint 4.0 Presentation
Chapter Objectives
• Specify the elements of the data hierarchy.
• Understand what can be accomplished with files and
what the limitations of a file-based environment are.
• Understand the advantages of a database
environment and the role of a database management
system (DBMS).
• Specify the three levels at which data are defined in
databases.
• Compare the three data models.
MS PowerPoint 4.0 Presentation
MS PowerPoint 4.0 Presentation MS PowerPoint 4.0 Presentation
Chapter Objectives, cont.
• Understand the role of the data dictionary.
• Explain the components of information
resource management.
• Describe the developmental trends in
database management, including distributed
databases, data warehouses, and
Hierarchy of Data
• Data are the Principal Resources of an
Organization
• Data Stored in Computer Systems form
a Hierarchy
– Extending from a Single Bit to a Database
– The Major Record Keeping entity of a firm
– Each higher rung of this hierarchy is
Hierarchy of Data
• Data are logically organized into:
• 1. Bits (Characters)
• 2. Fields
• 3. Records
• 4. Files
Hierarchy of Data
• Bit (Character)
– The smallest unit of data representation
– Value of a Bit may be 0 or 1
– 8 Bits make a Byte which can represent a
character or a special symbol in a
Hierarchy of Data
• Field
– A grouping of characters
– A data field represents an attribute,
Hierarchy of Data
• Record
– A collection of attributes that describe a
real world entity
Hierarchy of Data
• File
– A group of related records
– Classified by the application for which
they are used
• Example: Employee File
Hierarchy of Data
• Database
– An integrated collection of logically related records or
files
– Consolidates records previously stored in separate files
into a common pool of data records that provides data
for may applications
– The Data is managed by systems software called
Database Management Systems (DBMS)
Hierarchy of Data Organization in
Computer Storage
Hierarchy of Data Organization in
Computer Storage
Component of Data Organization Logical Components Physical (Storage) Components Database File Record Field (attribute) Byte Bit Example SUPPLIERS PARTS SHIPMENTS SUPPLIERS
NO. NAME STREETADDRESS CITY ST ZIP 13 Gasket Co. 50 Oak Tifflin OH 44883
3251 Reliable Supp. 11 Cedar Teaneck NJ 07666 13 Gasket Co. 50 Oak Tifflin OH 44883
Reliable Suppliers
01000001 (represents “A” in the ASCII-8 character code) 0
File Environment and Its
Limitations
• File Organization
– Data files are organized so as to facilitate
access to records and to ensure efficient
storage
• A Tradeoff between these two requirements
generally exists
File Environment and Its
Limitations
• Access to a record for reading it is the essential
operation on data
• 1. Sequential Access
– Records are accessed in the order they are stored
– The main access mode only in Batch Systems where files are used and updated at regular intervals
• 2. Direct Access
– Required by On-Line Processing
– A Record can be accessed without including the records between it and the beginning of the file
File Environment and Its
Limitations
• File Organization Methods
• 1. Sequential Organization
File Environment and Its
Limitations
• Sequential Organization
– Records are physically stored in a specified order according to a key field in each record
– Advantages
• Fast
• Efficient for dealing with large volumes data that need to be processed periodically (batch system)
– Disadvantages
• Requires that all new transactions be sorted into the proper sequence for sequential access processing
• Locating, Storing, Modifying, Deleting, or Adding Records in the file requires rearranging the file
File Environment and Its
Limitations
• Indexed Sequential Organization
– Records are physically stored in
sequential order on a magnetic disk or
other direct access storage device based
on the key field of each record
File Environment and Its
Limitations
• Direct Organization
– Provides the fastest direct access to records
– Records do not have to be arranged in any particular
sequence on storage media
– Computers must keep track of the storage location
of each record using a variety of direct organization
methods to retrieve data
– New transactions’ data does not have to be sorted
– Processing that requires immediate responses or
File Environment and Its
Limitations
• Limitations of a File Oriented
Environment
– Fair amount of data redundancy
– Example of Three Systems
• Supplier
• Shipment
• Inventory
Database Environment
• A Database is an organized collection of
interrelated data that serves a number of
applications in an enterprise.
– Stores both the values of various entities
and their relationships
– Managed by a Database Management
System DBMS
Database Environment
• DBMS
• Helps organize data for effective access
by a variety of users with different access
needs and for efficient storage
• Makes it possible to create, access,
maintain, and control databases
Database Environment
• Advantages of a Database Management
Approach
• Avoiding uncontrolled data redundancy
and preventing inconsistency
• Program-Data independence
• Flexible access to shared data
A Database Environment
A Database Environment
Reports
Reports DBMS Application
Program A
Application Program X End Users
Query
Update
Database
Levels of Data Definition in
Databases
• User view of a DBMS is the basis for
modeling steps
• Data models define the logical
relationships among the elements to
support the basic process
Levels of Data Definition in
Databases
• DBMS defines a database
• Schema
– Overall logical view of the relationships between data in a database
• Subschema
– Logical view of data relationships needed to support specific end user applications to access the database
• Physical
– How data is physically arranged, stored, and accessed on the magnetic disks and other secondary storage devices of a
Levels of Data Definition in
Databases
• DBMS provides the language, DDL
Data Definition Language to define the
databases on the three levels (Schema,
Subschema, and Physical), and the
DML Data Manipulation Language to
access records, change values of
Data Models or How to
Represent Relationships
Between Data
• Data Model is a method for organizing databases
on the logical, schema, and subschema levels.
• The main concern is how to represent
relationships among database records
• Relationships are based on logical data structures
or models
• DBMS are designed to provide end users with
quick, easy access to information in the
Data Models or How to Represent
Relationships
Between Data
• Data Model Structures
• Hierarchical
• Network
Data Models or How to Represent
Relationships
Between Data: Hierarchical
• Used by early mainframe DBMS
• Relationships between records form a hierarchy or tree-like structure.
• Records are dependent and arranged in multilevel structures of one root record and any number of subordinate levels.
• Relationships among the records are one-to-many as each data element is related only to one element above it.
• The data element or record at the highest level is the root element. Any data element can be accessed by moving
Data Models or How to Represent
Relationships
Between Data: Hierarchical
• Advantages
– Ease with which data can be stored and retrieved in structured, routine types of transactions
– Ease with which data can be extracted for reporting purposes – Routine types of transaction processing are fast and efficient.
• Disadvantages
– Hierarchical one-to-many relationships must be specified in advance and are not flexible.
A Hierarchical Database
A Hierarchical Database
CLASSCLASS
MIS 101 1 525 MWF 10-11 a.m.
PROFESSOR PROFESSOR PROFESSOR NUMBER PROFESSOR NUMBER
221 Turkel 02579 Young
Data Models or How to Represent
Relationships
Between Data: Network
• Can represent more complex logical
relationships and is used by mainframe
DBMS packages
Data Models or How to Represent
Relationships
Between Data: Network
• Advantages
– More flexible than the hierarchical model
– Ability to provide sophisticated logical relationships among the records
• Disadvantages
– Network many-to-many relationships must be specified in advance
– User is limited to retrieving data that can be access using the established links between records
PROFESSOR PROFESSOR 221 PROFESSOR NUMBER PROFESSOR NAME
MIS 101 1 525 MWF 10-11 a.m.
COURSE NUMBER SECTION ROOM NUMBER MEETING DAYS MEETING HOURS 01171 STUDENT NUMBER STUDENT NAME Turkel Johnson STUDENT STUDENT
Links to Records of Other Classes Taught by This Professor
Links to Records of Other Classes Taken by This Student
Links to Records of Other Students Taking This Class
A Network Database
Data Models or How to Represent
Relationships
Between Data: Relational
•
Most popular Database Structure
•
Used by most microcomputer DBMS packages as
well as many minicomputer and mainframe
systems
•
Data elements within the database are stored in the
form of simple tables which are related if they
contain common fields
Data Models or How to Represent
Relationships
Between Data: Relational
• Advantages
– Flexible in that ad hoc information request can be handled – Easy for programmers
– End users can use Relational Structures with little effort or training.
– Easier to maintain than the hierarchical and network models
• Disadvantages
Relational Databases
• A collection of tables that is relatively easy to use and understand offering flexibility for the data and ability to modify
• All records in a relational database must have a unique primary key
• Relational Systems support three principal operations on tables without any predefined access paths
• Select- from a specified row that satisfies a given condition • Project- selects the attribute values
Relational Databases
• The power of the relational model
derives from the join operation.
– Records are related through a join
operation rather than links so that a
predefined access path is not needed
– The join operation is time consuming
SQL- A Relational Query
Language
• International standard access language fro defining and
manipulating data in databases
• Data-Definition-and-Management Language for DBMS
and some nonrelational ones
• SQL is used an independent query language to define
objects in a database, enter, and access the data
• Embedded SQL is for programming in procedural
languages (host) such as C, COBOL, PL/1, PL/SQL to
access from an application program
SQL- A Relational Query
Language
• Principal Facilities of SQL
• Data Definition
Designing a Relational
Database
• Database design progresses from the design of the
logical of the schema and the subschema to the physical
level.
• Logical Design or Data Modeling is to design the
schema and the subschema.
• A relational database consists of tables (relations)
describing the attributes of a particular class of entities.
• Logical Design begins with identifying the entity classes
Designing a Relational
Database
• Entity-Relationship (E-R) Diagrams
are used to perform data modeling.
Designing a Relational
Database
• Normalization is the simplification of the
logical view of data in relational databases
meaning that all fields contain single data
elements, distinct records, and each table
describes only a single class of entities.
Designing a Relational
Database
• Physical Design is after the Logical Design of the
database.
• Fields are specified as to their length and the
nature of the data (numeric, characters).
• A principal objective is to minimize the number of
time consuming disk accesses that are necessary in
order to answer typical database queries.
The Data Dictionary
• A software module and database
containing descriptions and definitions
of the structure, data elements,
interrelationships, and other
The Data Dictionary
•
Schema, Subschemas, and Physical Schema
•
Which applications and users may retrieve and/or modify
the specific data
•
Cross reference information such as which programs use
what data and the users receive particular reports
•
Where individual data elements originate, and who is
responsible for maintaining the data
•
Standard naming conventions for database entities
•Integrity rules for the data
•
Where the data are stored in geographically distributed
The Data Dictionary
• Contains all the data definitions and
information necessary to identify data
ownership
Managing the Data Resource
of an Organization
• Database technology enables organizations to
control their data as a resource, but it does not
automatically produce organizational control of data
• Components of Information Resources Management:
Organizational and Technical Means
• Ensure that a firm systematically accumulates data
in its databases
Managing the Data Resource
of an Organization
• Principal Components of Information
Resource Management
• Organizational Processes
– Information planning and data modeling
• Enabling Technologies
– DBMS and a Data Dictionary
• Organizational Functions
Managing the Data Resource
of an Organization
• Database Administration Functional
Units to Manage the Data
• Data Administrator (DA)
Managing the Data Resource of an
Organization:
Data Administrator
• Person who has the central responsibility for an organization’s data
• Establish policies and specific procedures for collecting, validating, sharing, and inventorying data to be stored and
databases and for making information accessible to members of the organization and possible persons outside of it
• Data administration is a policy making function and the DA should have access to senior corporate management.
• Key person involved in the strategic planning of the data resource
Managing the Data Resource of an
Organization:
Database Administrator
• A specialist responsible for maintaining standards for the development, maintenance and security of an organization’s databases
• Creating the databases and carrying out the policies of the data administrator
• In large organizations, the DBA function is performed by a group of professionals; in a small firm, a programmer/ analyst may
perform the DBA function, while one of the managers acts as the DA.
Managing the Data Resource of
an Organization
• Joint Responsibilities of the DA and DBA
• Maintaining the Data Dictionary
• Standardizing names and other aspects of data
definition
• Providing Backup
• Provide security and privacy for the data stored
in the database
Developmental Trends in
Database Management
• Distributed Databases
• Data Warehousing
Developmental Trends in
Database Management
• Distributed Databases
– Spread across several physical locations
– Data are placed where they are used most often – Entire database is available to each authorized user – Local work groups (LAN)
– Departments at regional offices (WAN)
• Branch offices, manufacturing plants, and other work sites
Developmental Trends in
Database Management
• Data Warehouse Databases
– Stores data from current and previous years that has been extracted from operational and management databases
– Central source of data
– Standardized and integrated for use by managers and other end user professionals
– Objective of a corporate data warehouse
• To continually select data from the operational databases • Transform the data into a uniform format
• Open the warehouse to the ends through a friendly and consistent interface
– Data mining
Developmental Trends in
Database Management
• Systems supporting a data warehouse • Extract and Prepare Data
– The first subsystem extracts the data from the operational systems – Many are older legacy systems that get “scrubbed” by removing
errors and inconsistencies
• Store Data in the Warehouse
– The second support component is the DBMS that will manage the warehouse data.
• Provide Access and Analysis Capabilities
– The third subsystem is made up of the query tools to access the data – Includes OLAP (OnLine Analytical Processing) and other DSS tools
Developmental Trends in
Database Management
• Object-Oriented and Other Rich Databases
– With expanded capabilities of information technology, the content of databases is becoming richer
– Traditional databases include largely numerical data or short fragments of text organized into well structured records
– As processing and storage capabilities expand with growing telecommunications, the knowledge is supported with rich data
– Geographic Information Systems – Object-Oriented Databases
– Hypertext and Hypermedia Databases
Tele-communications communications Network Network
Site 1 Site 3
Site 2
Users have access to the entire database over the network
Database Fragment 3 Database Fragment 1 Database Fragment 2 . . . . . .
A System with a Distributed Database
Bit Byte Field Record File Primary Key Database
Access (to a record) Sequential Access Direct Access Sequential File Indexed-Sequential File Direct File Database Management System (DBMS) Program-Data Independence Schema Subschema Bit Byte Field Record File Primary Key Database