Data flow diagrams model system processes and are used frequently by systems analysts. They can be used to model both the existing system and the proposed system. Diagrams have partially replaced reports as a form of communication in systems development for several reasons: diagrams are less ambiguous, show relationships better, and summarize material.
Data flow diagrams can either show the physical details of implementation or be logical diagrams without the physical details. They can be used during the following phases (Figure 2.3):
• In the problem definition phase during analysis (physical data flow diagrams);
• To view the current system logically (logical data flow diagrams);
• In the requirements definition phase (logical data flow diagrams);
• To describe alternative solutions (semiphysical data flow diagrams);
• In the design stage (physical data flow diagrams).
Logical data flow diagrams are implementation-independent models. They remove biases that stem from the way the existing system is implemented or the way that one person thinks that the system should be implemented. Implementation-independent models reduce the risk of missing functional requirements because of a preoccupation with technical details. They allow the analyst to communicate with the user in a nontechnical way when gathering requirements. Four symbols used in data flow diagrams are shown in Figure 2.4.
Hierarchy of Data Flow Diagrams
Data flow diagrams (DFDs) are developed in increasing levels of detail for a system.
Context Diagram
The context diagram has one process and delineates the system boundaries (Figure 2.5). The process describes the system that is being examined. It could be the company or a subsystem or department within the company; it all depends on the scope of the system being examined. The external entities or agents could even be departments or subsystems within the same organization if the system being examined is a subsystem. No data stores are shown. Every data flow is named.
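The rules stated above for a context diagram (one process, no data stores, every data flow named) can be checked mechanically. The following is a minimal sketch, not a standard tool: the class and function names are hypothetical, and the direction of the two sample flows is assumed, because the text names the flows of Figure 2.5 but not their direction.

```python
from dataclasses import dataclass, field


@dataclass
class DataFlow:
    name: str
    source: str
    target: str


@dataclass
class Diagram:
    processes: list
    entities: list
    data_stores: list = field(default_factory=list)
    flows: list = field(default_factory=list)


def context_diagram_problems(diagram):
    """Return the context-diagram rules that the given diagram breaks."""
    problems = []
    if len(diagram.processes) != 1:
        problems.append("a context diagram has exactly one process")
    if diagram.data_stores:
        problems.append("a context diagram shows no data stores")
    if any(not flow.name.strip() for flow in diagram.flows):
        problems.append("every data flow must be named")
    return problems


# Part of the wine club context diagram; flow directions are assumed.
wine_club = Diagram(
    processes=["Fairdinkum Wine Club"],
    entities=["Member", "Suppliers"],
    flows=[
        DataFlow("Order", "Member", "Fairdinkum Wine Club"),
        DataFlow("Purchase order", "Fairdinkum Wine Club", "Suppliers"),
    ],
)
print(context_diagram_problems(wine_club))  # prints []
```

An empty list means the diagram satisfies all three rules; each broken rule adds one message.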
Subsystem Diagram
The context diagram does not show much detail. If the process from the context diagram is broken down further, subsystems can be shown. The subsystem diagram is sometimes referred to as the systems diagram or the overview diagram. This diagram shows the data flows between subsystems and may even include shared data stores. You can see that this diagram shows more information than the context diagram (Figure 2.6).
Figure 2.3 Physical and logical stages of systems development. (Analysis phase: physical, then logical if required; requirements phase: logical; alternative solutions: semiphysical; systems design: physical.)
Figure 2.4 Data flow diagrams. (a) The data flow symbol shows the flow of data; (b) the process symbol shows the processing of data; (c) the entity shows data coming into and going from the system; and (d) the data store is a symbol to show the storage of data.
Figure 2.5 Context diagram. (Labels: Fairdinkum Wine Club, Member, New member, Suppliers, Stores and distribution, Application, Application response, Order, Invalid order, Promotion information, Purchase order, Dispatch details.)
Detailed Data Flow Diagrams
The subsystem diagram can be decomposed further into middle-level diagrams and then primitive-level diagrams. The middle-level diagrams are not always necessary; it just depends on the size and detail of the systems under study. Primitive-level diagrams show processes that are detailed and show the processing of a self-contained task for the system (Figures 2.7 and 2.8).
Decomposition Diagram
It is not recommended that you start by drawing one large data flow diagram. It is better to start with a decomposition diagram (Figure 2.9). This diagram starts with a box that includes the name of the system. The next level shows the major subsystems. The final level shows the processes that are part of each subsystem unless there are middle-level diagrams. The decomposition diagram acts as a guide for developing the set of DFDs.
Figure 2.6 Subsystems diagram. (Labels: New applicant, Member, Supplier, Memberships, Promotions, Order processing, Purchase orders, Stores, Accounts, Member orders, Application, Application response, Order, Invalid order, Promotions information, Member details, Order details, Purchase order, Dispatch note, Invoice.)
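As a rough illustration of how a decomposition diagram guides the set of DFDs, the sketch below holds part of Figure 2.9 as a nested structure and lists the diagrams it implies. The function name is hypothetical, and the assignment of processes to subsystems is an assumption drawn from Figures 2.7 and 2.8.

```python
# Part of the decomposition diagram. Which processes belong to which
# subsystem is assumed from Figures 2.7 and 2.8.
SYSTEM = "Fairdinkum Wine Club"
SUBSYSTEMS = {
    "Memberships": [
        "Store application details",
        "Process application",
        "Assign credit level",
    ],
    "Order processing": [
        "Check credit",
        "Check stock",
        "Process order",
        "Create backorder",
    ],
    "Accounts": [],    # not decomposed in this sketch
    "Promotions": [],  # not decomposed in this sketch
}


def diagrams_to_draw(system, subsystems):
    """List the DFDs implied by the decomposition, top level first."""
    plan = [
        f"Context diagram: {system}",
        f"Subsystem diagram: {', '.join(subsystems)}",
    ]
    for name, processes in subsystems.items():
        if processes:
            plan.append(f"Detailed diagram: {name} ({len(processes)} processes)")
    return plan


for line in diagrams_to_draw(SYSTEM, SUBSYSTEMS):
    print(line)
```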
Figure 2.7 Order processing diagram. (Processes: Check order details, Check credit, Request prepayment, Check stock, Release order pending, Process order, Create backorder. Other labels: Member, Orders awaiting payment, Stock, Member orders, Backorders, Order, Incorrect order, Order without credit, Backorder details, Pending order details, Checked order, Backorder notification.)
Figure 2.8 New members diagram. (Processes: Store application details, Process application, Assign credit level. Other labels: New applicant, Accounts, Application, Applications, Rejection, Acceptance, Credit level, Incomplete application.)
Figure 2.9 Decomposition diagram for part of the wine club ordering system. (System: Fairdinkum Wine Club. Subsystems: Order processing, Memberships, Accounts, Promotions, Purchasing system, Stores. Processes: Store application details, Process application, Assign credit level, Check credit, Check stock, Process order, Request prepayment, Release order pending, Check credit details, Create backorder.)
Data Dictionary Entries
Processes need to be documented because many involve a number of steps. These will not be obvious from the DFD. Many of these processes will form the software component of the new system and hence need to be specified in detail so that the programmers can accurately implement them. The processes that are not implemented in the software are manual procedures that still need to be recorded.
While the processes could be explained by a process description in text format, that description might be ambiguous to another reader. Structured English and decision tables are two ways of unambiguously presenting the steps involved in any process.
Decision Tables
A decision table is a method of describing process logic (Figure 2.10 and Table 2.5). The stages in developing a decision table are as follows:
1. Identify the conditions.
2. Identify the rules and their values.
3. Specify the actions.
4. Complete the bottom right corner of the decision table by identifying the actions that result from the combination of conditions.
Decision tables can be simplified. When one condition has every option covered and all the other conditions are the same, resulting in the same action, then the rules can be collapsed into one.
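The collapsing of rules can be sketched in code. The example below stores the twelve rules of Table 2.5 as a dictionary and merges rules that differ only in one condition and lead to the same action. The function names and the use of a dash for a condition that no longer matters are assumptions made for illustration, and the greedy pass shown is not guaranteed to find the smallest possible table.

```python
Y, N = True, False

# Key: (credit rating, over 25?, employed over 2 years?) -> action.
rules = {
    ("A", Y, N): "Credit level 2", ("B", Y, N): "Credit level 1",
    ("C", Y, N): "Credit level 1", ("A", N, N): "Credit level 1",
    ("B", N, N): "Credit level 1", ("C", N, N): "No credit",
    ("A", Y, Y): "Credit level 2", ("B", Y, Y): "Credit level 1",
    ("C", Y, Y): "Credit level 1", ("A", N, Y): "Credit level 2",
    ("B", N, Y): "Credit level 1", ("C", N, Y): "Credit level 1",
}


def collapse(table, position, options):
    """Merge rules that differ only at `position` and share one action."""
    remaining = dict(table)
    for key, act in table.items():
        siblings = [key[:position] + (opt,) + key[position + 1:] for opt in options]
        if all(table.get(s) == act for s in siblings):
            for s in siblings:
                remaining.pop(s, None)
            # "-" marks a condition whose value no longer affects the action.
            remaining[key[:position] + ("-",) + key[position + 1:]] = act
    return remaining


def decide(table, rating, over_25, employed_2y):
    """Look up the action for one case, honouring "-" entries."""
    case = (rating, over_25, employed_2y)
    for key, act in table.items():
        if all(k == "-" or k == c for k, c in zip(key, case)):
            return act
    raise LookupError(case)


simplified = collapse(collapse(rules, 2, (Y, N)), 1, (Y, N))
print(len(rules), len(simplified))  # prints 12 7
```

Every case still receives the same action after simplification; for example, any applicant with credit rating B now matches the single rule ("B", "-", "-").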
Structured English
Process logic can also be defined using structured English. The decision tables are used for defining business policies and structured English can be used to define business procedures. The two are complementary techniques (Figure 2.11).
Structured English uses a restricted set of terms and file and attribute names to describe the high-level logic. There is no set vocabulary for structured English, but its style must be brief and to the point. It relies on the use of repetitions, conditions, and sequence statements.
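Because structured English is built from sequence, condition, and repetition statements, it maps closely onto program code. The following is a minimal sketch of the credit procedure of Figure 2.11 in Python; the function name, the field names, and the sample applications are hypothetical.

```python
def assign_credit(rating, age, years_employed):
    """Return the credit decision for one application (after Figure 2.11)."""
    young_and_new = age < 25 and years_employed < 2
    if rating == "A":
        return "Credit level 1" if young_and_new else "Credit level 2"
    if rating == "B":
        return "Credit level 1"
    if rating == "C":
        return "No credit" if young_and_new else "Credit level 1"
    raise ValueError(f"unknown credit rating: {rating!r}")


# Hypothetical applications, used only to exercise the procedure.
applications = [
    {"rating": "A", "age": 22, "years_employed": 1},
    {"rating": "A", "age": 40, "years_employed": 10},
    {"rating": "C", "age": 22, "years_employed": 1},
]

# "For each Credit Application" is the repetition statement.
for application in applications:
    application["credit"] = assign_credit(
        application["rating"], application["age"], application["years_employed"]
    )

print([application["credit"] for application in applications])
```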
Conditions    Condition values
Actions       Action values
Figure 2.10 The structure of a decision table.
Table 2.5 Decision Table
Conditions              Rules
Credit rating           A  B  C  A  B  C  A  B  C  A  B  C
Over 25?                Y  Y  Y  N  N  N  Y  Y  Y  N  N  N
Employed over 2 years?  N  N  N  N  N  N  Y  Y  Y  Y  Y  Y
Actions
No credit               .  .  .  .  .  X  .  .  .  .  .  .
Credit level 1          .  X  X  X  X  .  .  X  X  .  X  X
Credit level 2          X  .  .  .  .  .  X  .  .  X  .  .
For each Credit Application
    CASE 1 (Credit Rating A) then
        If Age < 25 and Employed < 2 years Then
            Assign Credit level 1
        Else
            Assign Credit level 2
        End If
    CASE 2 (Credit Rating B) then
        Assign Credit level 1
    CASE 3 (Credit Rating C) then
        If Age < 25 and Employed < 2 years Then
            Assign No Credit
        Else
            Assign Credit level 1
        End If
    End Case
Figure 2.11 Example of structured English.
Data Modeling
Data modeling is a method of organizing and documenting a system’s data. The models produced are considered to be logical models because they are implementation independent. Modeling is done in the early stages of the database design phase. Although the data is always changing in a business, the types of data collected are fairly stable. Data is usually more stable than processes, and hence some methods put the emphasis on data modeling. One methodology that does this is James Martin’s Information Engineering [1].
Entity Relationship Diagrams
The entity relationship diagram (ERD) is a data modeling technique that shows the data and the relationships between the data within a business (Figure 2.12).
Figure 2.12 Entity relationship diagram. (Entities: Member, Order, Order item, Backorder, Backorder item, Product, Purchase item, Purchase order, Supplier. Relationship ends are labeled 1:1, 1:M, or 0:M.)
It is not a technique to show how data is implemented, created, modified, or deleted.
A data entity is anything, real or abstract, about which we want to store data. A rectangle is used to denote a data entity (Figure 2.13). Each entity has a list of attributes that describe it, with one or several of them acting as the key or unique identifier. The following is a list of example entities:
Applicant Borrower Contractor
Client Creditor Customer
Book Course Machine
Project Purchase Order Quote
Building Campus State
A data relationship is shown by a line between the entities. The entity relationship diagram of Figure 2.14 shows products being ordered from suppliers. It can be read in the following ways:
• A purchase order is filled by one and only one supplier.
• A supplier fills zero or more orders.
• A product is contained on zero or more purchase orders.
• A purchase order contains at least one product.
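The four readings above follow a fixed pattern: each relationship is read once in each direction, using the cardinality shown at the far end of the line. The sketch below makes that pattern explicit. The class name, the phrase table, and the placement of each cardinality at a particular end of the line are assumptions inferred from the readings themselves.

```python
from dataclasses import dataclass

PHRASES = {
    "1:1": "one and only one",
    "0:M": "zero or more",
    "1:M": "one or more",
}


@dataclass(frozen=True)
class Relationship:
    left: str
    left_card: str   # cardinality shown at the left entity's end
    right: str
    right_card: str  # cardinality shown at the right entity's end


def read(rel):
    """Read a relationship in both directions."""
    return [
        f"One {rel.left} relates to {PHRASES[rel.right_card]} {rel.right} occurrence(s).",
        f"One {rel.right} relates to {PHRASES[rel.left_card]} {rel.left} occurrence(s).",
    ]


figure_2_14 = [
    Relationship("supplier", "1:1", "purchase order", "0:M"),
    Relationship("purchase order", "0:M", "product", "1:M"),
]

for relationship in figure_2_14:
    for sentence in read(relationship):
        print(sentence)
```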
Student (an entity)
    Student_Number (key)
    Student_Fname
    Student_Lname
    Address
    City
    Postcode
    Telephone
Figure 2.13 An entity and its attributes.
Many-to-Many Relationships
Many-to-many relationships in ERDs are considered undesirable (Figure 2.14) because they create difficulties in the database design and implementation stages. The purchase_order/product relationship has a many-to-many link. In this case, there are pieces of data for each ordered item, such as quantity ordered and price at the time of order. This can be resolved by creating another entity, which is shown in Figure 2.15. A purchase order has one or many ordered products. Zero or many ordered products relate to a product type; that is, one product type can be ordered as zero (it may not be an ordered product at this moment) or many ordered products.
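The resolution can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample rows below are hypothetical; the point is only that the new ordered_product entity carries the quantity and price for each product on each order, and that its key combines the keys of the two original entities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE purchase_order (order_no TEXT PRIMARY KEY);
    CREATE TABLE product (product_no TEXT PRIMARY KEY, description TEXT);
    -- The new entity: one row per product on a purchase order.
    CREATE TABLE ordered_product (
        order_no   TEXT NOT NULL REFERENCES purchase_order(order_no),
        product_no TEXT NOT NULL REFERENCES product(product_no),
        quantity   INTEGER NOT NULL,
        price      REAL NOT NULL,
        PRIMARY KEY (order_no, product_no)
    );
""")
conn.execute("INSERT INTO purchase_order VALUES ('PO1')")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("W1", "Shiraz"), ("W2", "Riesling"), ("W3", "Merlot")],
)
conn.executemany(
    "INSERT INTO ordered_product VALUES (?, ?, ?, ?)",
    [("PO1", "W1", 12, 9.5), ("PO1", "W2", 6, 11.0)],
)

# Count how many times each product type has been ordered.
times_ordered = conn.execute("""
    SELECT p.product_no, COUNT(op.order_no)
    FROM product AS p
    LEFT JOIN ordered_product AS op ON op.product_no = p.product_no
    GROUP BY p.product_no
    ORDER BY p.product_no
""").fetchall()
print(times_ordered)  # prints [('W1', 1), ('W2', 1), ('W3', 0)]
```

Product W3 appears with a count of zero, which corresponds to a product type that is not an ordered product at this moment.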
The following steps are used to develop an entity relationship diagram:
1. Identify entities from either interviews or forms and files.
2. Define identifiers (keys) for each entity.
3. Draw a rough draft of the ERD.
4. Identify data attributes.
5. Match attributes to entities.
6. Attempt to resolve many-to-many relationships.
Supplier 1:1 ---- 0:M Purchase order 0:M ---- 1:M Product
Figure 2.14 An entity relationship diagram to show the ordering of products from suppliers.
Purchase order 1:1 ---- 1:M Ordered product (new entity) 0:M ---- 1:1 Product
Figure 2.15 Resolving many-to-many relationships.
Data Analysis
After the initial data modeling has been completed, each entity has to be analyzed to assess whether all of the attributes describe the one entity. The process is called data analysis or normalization. Normalization rules are designed to prevent update anomalies and data inconsistencies. We will deal with three levels of normalization termed first, second, and third normal form. There are two further levels (fourth and fifth), but we will not be concerned with those because they deal with special cases of data that you are less likely to come across.
First Normal Form
First normal form excludes variable repeating fields and groups. If you look at the data in Table 2.6 you can see a repeating component. This format would create much redundancy in a database because all of the order details would have to be repeated for every part or, if the cells were left blank, they would create ambiguity. The solution to this problem is to split the entity into two separate entities.
The data in Table 2.7 is in first normal form. The objective of doing this first step is to reduce the many-to-many relationships to one-to-many relationships, thereby alleviating update and redundancy problems. The creation of the new entity Order_Line reduces the many-to-many to a one-to-many relationship. The key combines the keys of the original table and the repeating group.
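The split into two entities can be sketched as follows, using the two orders of Table 2.6. The function and field names are hypothetical; each order carries its repeating group as a list, and the function produces one set of rows for Order and one for Order_Line.

```python
# Unnormalized orders as in Table 2.6: each order holds a repeating group of parts.
unnormalized = [
    {"order_no": "0001", "date": "6/3/90", "company": "J. Smith",
     "parts": [("P1", 10), ("P2", 30), ("P7", 10)]},
    {"order_no": "0002", "date": "6/4/90", "company": "XYZ",
     "parts": [("P2", 10), ("P7", 20)]},
]


def to_first_normal_form(orders):
    """Split each order into one Order row and one Order_Line row per part."""
    order_rows = []
    order_line_rows = []
    for order in orders:
        order_rows.append((order["order_no"], order["date"], order["company"]))
        for part_no, qty in order["parts"]:
            # Key of Order_Line = key of Order + key of the repeating group.
            order_line_rows.append((order["order_no"], part_no, qty))
    return order_rows, order_line_rows


order_rows, order_line_rows = to_first_normal_form(unnormalized)
print(order_rows)
print(order_line_rows)
```

The order details now appear once per order, and no cell has to be left blank or repeated for every part.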
Table 2.6 Data in Unnormalized Form
Order
Ord-No  Date    Company   Part-No  Qty
0001    6/3/90  J. Smith  P1       10
                          P2       30
                          P7       10
0002    6/4/90  XYZ       P2       10
                          P7       20
Table 2.7 Order Data in First Normal Form
Order
Order-No  Date    Company
0001      6/3/90  J. Smith
0002      6/4/90  XYZ
0003      6/4/90  A. Capp
Order_Line
Order-No  Part-No  Qty
0001      P1       10
0001      P2       30
0001      P7       10
0002      P2       10
0002      P7       20
Second Normal Form
Second normal form is violated when a nonkey field is a fact about a subset of a key. It is only relevant when the key is composite, that is, when it consists of several fields.
The problem with the data in Table 2.8 is that Warehouse_Address is a fact about the warehouse alone and not the part. Warehouse_Address is repeated in every record that has a part in that warehouse. If the address of the warehouse changes, every record for that warehouse must be updated. Data may become inconsistent because of the redundancy. At some time there may be no parts stored in the warehouse and, therefore, no record in which to keep the warehouse address. The record should be decomposed into two. The data in Table 2.9 is in second normal form.
Table 2.8 Part and Warehouse Data
Part_Warehouse
Part  Warehouse  Quantity  Warehouse_Address
P1    Alpha1     4,000     20 Singleton Road, Bunbury
P2    Beta2      250       10 Desert Road, Geraldton
P3    Alpha1     2,235     20 Singleton Road, Bunbury
P4    Alpha1     965       20 Singleton Road, Bunbury
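The decomposition of the Table 2.8 data can be sketched as below. The function name is hypothetical. The address, which depends on the warehouse alone, moves to its own structure keyed by warehouse, and the sketch raises an error if the redundant copies disagree, since that is the inconsistency second normal form is meant to prevent.

```python
# Rows of Table 2.8: (part, warehouse, quantity, warehouse address).
part_warehouse = [
    ("P1", "Alpha1", 4000, "20 Singleton Road, Bunbury"),
    ("P2", "Beta2", 250, "10 Desert Road, Geraldton"),
    ("P3", "Alpha1", 2235, "20 Singleton Road, Bunbury"),
    ("P4", "Alpha1", 965, "20 Singleton Road, Bunbury"),
]


def to_second_normal_form(rows):
    """Separate facts about the whole key from facts about the warehouse alone."""
    quantities = [(part, warehouse, qty) for part, warehouse, qty, _ in rows]
    warehouses = {}
    for _, warehouse, _, address in rows:
        # Each warehouse address is stored once; conflicting copies are rejected.
        if warehouses.setdefault(warehouse, address) != address:
            raise ValueError(f"conflicting addresses for {warehouse}")
    return quantities, warehouses


quantities, warehouses = to_second_normal_form(part_warehouse)
print(quantities)
print(warehouses)
```

A warehouse address now changes in one place, and a warehouse can keep its address even when no parts are stored in it.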
Table 2.9 Part and Warehouse Data in Second Normal Form
Part_Warehouse
Part  Warehouse  Quantity
P1    Alpha1     4,000
P2    Beta2      250
P3    Alpha1     2,235
P4    Alpha1     965
Warehouse
Warehouse  Warehouse_Address
Alpha1     20 Singleton Road, Bunbury
Beta2      10 Desert Road, Geraldton
Third Normal Form
Third normal form is violated when a nonkey field is a fact about another nonkey field. In Table 2.10, Department_Location is a fact about the department and not about Employee_No. The main problem with these data is the repetition of Department_Location. The redundancy of this data has the potential to lead to data inconsistency.
Table 2.10 Employee Data
Employee
Employee No  Department           Department_Location  Date of Birth
1011         Accounting           Joondalup            01/23/59
1012         Information Systems  Churchlands          11/17/60
1017         Accounting           Joondalup            01/10/75
The problem is resolved by breaking up the data into two entities (Table 2.11). Department is included in the Employee table and acts as a foreign key as it forms a link between the two tables. A foreign key is the name given to a key from one entity that is repeated in another entity to form a link.
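The role of the foreign key can be sketched with the data of Table 2.10. The function name and the exact shape of the two resulting structures are assumptions made for illustration. Department stays in each employee row and is used to look up the single copy of the department's location.

```python
# Rows of Table 2.10: (employee no, department, department location, date of birth).
employees = [
    (1011, "Accounting", "Joondalup", "01/23/59"),
    (1012, "Information Systems", "Churchlands", "11/17/60"),
    (1017, "Accounting", "Joondalup", "01/10/75"),
]


def to_third_normal_form(rows):
    """Move the fact about the department (its location) out of Employee."""
    employee = [(emp_no, dept, dob) for emp_no, dept, _, dob in rows]
    department = {dept: location for _, dept, location, _ in rows}
    return employee, department


employee, department = to_third_normal_form(employees)

# Follow the foreign key from an employee row to the department data.
emp_no, dept, dob = employee[2]
print(emp_no, dept, department[dept])  # prints 1017 Accounting Joondalup
```

Because the location is held once per department, a change of location is a single update that every employee in that department sees through the foreign key.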