Business processes operate on data. Explicitly representing data, data types, and data dependencies between activities of a business process puts a business process management system in a position to control the transfer of relevant data as generated and processed during process enactment.
3.7.1 Modelling Data
Data modelling is at the core of database design. The Entity Relationship approach is used to classify and organize data in a given application domain.
M2: Metamodel (data metamodel, e.g., Entity Relationship Model)
M1: Model (data model)
M0: Instance (data values)
Notation (data model notation, e.g., ER notation, UML class diagrams)
Each level describes the level below it, and each lower level is an instance of the level above it; the notation expresses the concepts of the metamodel.
Fig. 3.23. MOF levels of modelling data
The Entity Relationship modelling approach belongs to the metamodel level, as depicted in Figure 3.23, because it provides the required concepts to express data models. Data modelling will be illustrated by a sample application domain, namely by order management.
In a modelling effort, the most important entities are identified and classified. Entities are identifiable things or concepts in the real world that are important to the modelling goal. In the sample scenario, orders, customers, and products are among the entities of the real world that need to be represented in the data model.
Entities are classified as entity types if they have the same or similar properties. Therefore, orders are classified by an entity type called Orders.
Since each order has an order number, a date, a quantity, and an amount, all order entities can be represented by this entity type. Properties of entities are represented by attributes of the respective entity types.
The entities classified in an entity type need to have similar, but not identical structure, because attributes can be optional. If the application domain allows, for instance, for an order to have or not to have a discount, then the discount attribute is optional. This means that two orders are classified in entity type Orders even if one has a discount attribute while another does not.
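The idea of an entity type with an optional attribute can be sketched in Python. This is only an illustration, not part of the Entity Relationship approach itself: the class name `Order`, the field names, and the sample values are assumptions chosen to mirror the order attributes named above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    """Illustrative entity type; each instance stands for one order entity."""
    order_number: int
    date: str
    quantity: int
    amount: float
    discount: Optional[float] = None  # optional attribute: may be absent


# Two entities of the same entity type, one without and one with a discount.
# All values are made up for the example.
plain_order = Order(order_number=42, date="2024-01-15", quantity=3, amount=150.0)
discounted_order = Order(order_number=43, date="2024-01-16", quantity=1,
                         amount=50.0, discount=0.1)
```

Both objects belong to the same class although only one of them carries a discount value, which corresponds to entities of similar but not identical structure.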
Entity types in the Entity Relationship metamodel need to be represented in a notation by a particular symbol. While there are variants of Entity Relationship notations, entity types are often represented by rectangles, marked with the name of the entity type. Figure 3.24 shows an entity type Orders at the centre of the diagram. Other entity types in the sample application domain are Customers and Products. The attributes are represented as ellipses attached to entity types.
Entities are associated with each other by relationships. For instance, a customer “Miller” requests an order with the order number 42; this link between the two entities is a relationship. Just as there are many customer entities and many order entities, there are many customer-order relationships. To represent these relationships, a relationship type requests classifies them all. In Entity Relationship diagrams, relationship types are typically represented by diamond symbols, connected to the respective entity types by edges.
Entity types: Customers, Orders, Products. Relationship types: requests, ships. Attributes shown: cid, cname, city, discnt, ordno, month, qty, dollars, pid, pname, city, quantity, price.
Fig. 3.24. Entity relationship diagram involving customers, orders, and products, O’Neil and O’Neil (2000)
The complex nature of data in a given application domain can be well represented by Entity Relationship Diagrams. These diagrams can be used to create relational database tables, using transformation rules. Once the respective database tables have been created in a relational database, application data can be stored persistently. The data can be retrieved efficiently using declarative query languages, for instance Structured Query Language.
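A minimal sketch of this step, using Python's built-in `sqlite3` module, is given below. It assumes one common transformation rule: each entity type becomes a table, and a relationship type in which every order is requested by exactly one customer becomes a foreign key column. The table layout and the sample values are assumptions for illustration; only the attribute names are borrowed from Figure 3.24.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        cid   TEXT PRIMARY KEY,
        cname TEXT NOT NULL
    );
    CREATE TABLE orders (
        ordno INTEGER PRIMARY KEY,
        cid   TEXT NOT NULL REFERENCES customers(cid),  -- 'requests' relationship
        qty   INTEGER
    );
""")

# Store application data (made-up values).
conn.execute("INSERT INTO customers VALUES (?, ?)", ("c001", "Miller"))
conn.execute("INSERT INTO orders VALUES (?, ?, ?)", (42, "c001", 5))

# Declarative retrieval: which customer requested which order?
rows = conn.execute(
    "SELECT c.cname, o.ordno FROM customers c JOIN orders o ON o.cid = c.cid"
).fetchall()
```

The query states what is to be retrieved, not how; the database system decides on the access strategy.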
While this discussion focuses on data modelling in the context of database applications, the same data modelling method can be used to represent data structures in business process management. Based on these data structures, data dependencies between activities in business processes can be captured precisely.
Data modelling is also the basis for the integration of heterogeneous data.
In the enterprise application integration scenarios discussed above, one of the main issues was the integration of data from heterogeneous data sources. Once data models are available for these data sources, the data integration problem can be addressed. There are advanced data integration techniques that also take into account data at the instance level, but explicit data models in general are essential to addressing data integration.
Data integration can then be realized by a mapping between the data types. For instance, there might be applications on top of database systems A and B, such that these systems have tables CustomerA and CustomerB, respectively, that differ. For instance, while CName is the attribute of the CustomerA table, referring to the name of the customer, CustN might be the respective attribute in the CustomerB table. In order to integrate both tables, the attributes need to be mapped. In this case, CustomerA.CName is mapped to CustomerB.CustN.
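Such an attribute mapping can be sketched as a simple lookup table. The dictionary `A_TO_B` and the function `to_customer_b` are hypothetical names introduced for illustration; a real integration project would typically also need type conversions and transformation rules.

```python
# Hypothetical attribute mapping from the CustomerA schema to the CustomerB schema.
A_TO_B = {"CName": "CustN"}


def to_customer_b(row_a: dict) -> dict:
    """Rename mapped attributes; unmapped attributes are reported, not guessed."""
    unmapped = [name for name in row_a if name not in A_TO_B]
    if unmapped:
        raise KeyError(f"no mapping defined for attributes: {unmapped}")
    return {A_TO_B[name]: value for name, value in row_a.items()}


# A CustomerA row (made-up value) expressed in the CustomerB schema.
row_b = to_customer_b({"CName": "Miller"})
```

Raising an error for unmapped attributes is a design choice of this sketch: it makes gaps in the mapping visible instead of silently dropping data.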
In data integration projects, complex integration problems are likely to emerge. There might be attributes that cannot be mapped, but there might also be attributes that need to be mapped to different tables, often by using transformation rules. The hardest set of problems, however, stems from semantic heterogeneity. There are assumptions on the data that are not explicit in the data model or in the actual data stored in the database. These semantic differences can only be taken into account when investigating the meaning of the attributes in detail, often during interviews with the persons involved in the data modelling and database design of the systems to integrate.
Semantic specification of data can be used to solve these data integration problems. However, complete semantic specification of data requires considerable resources, and the completeness of the semantic specification cannot be proven automatically. Therefore, further research is required to evaluate the possibilities of semantics-based data integration.
In graph-based approaches to business process modelling, data dependencies are represented by data flow between activities. Each process activity is assigned a set of input and a set of output parameters. Upon its start, an activity reads its input parameters, and upon its termination it writes data it generated to its output parameters. These parameters can be used by follow-up activities in the business process.
The transfer of data between activities is known as data flow. By providing graphical constructs to represent data flow between activities, the data perspective can be visualized and used to validate and optimize business processes. These aspects are covered in more detail in Section 4.6, which introduces graph-based process languages.
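The reading of input parameters at the start of an activity and the writing of output parameters at its termination can be sketched as follows. The `Activity` class, the two sample activities, and the approval limit of 1000 are assumptions made for illustration; they do not correspond to any particular process language.

```python
from typing import Callable, Dict, List


class Activity:
    """A process activity with declared input and output parameters."""

    def __init__(self, name: str, inputs: List[str], outputs: List[str],
                 work: Callable[..., Dict[str, object]]):
        self.name = name
        self.inputs = inputs
        self.outputs = outputs
        self.work = work

    def run(self, data: Dict[str, object]) -> None:
        # On start: read the input parameters.
        args = {name: data[name] for name in self.inputs}
        result = self.work(**args)
        # On termination: write the generated data to the output parameters.
        for name in self.outputs:
            data[name] = result[name]


compute_amount = Activity(
    "compute amount", ["quantity", "price"], ["amount"],
    lambda quantity, price: {"amount": quantity * price},
)
check_limit = Activity(
    "check limit", ["amount"], ["approved"],
    lambda amount: {"approved": amount <= 1000},
)

process_data = {"quantity": 3, "price": 50}
for activity in (compute_amount, check_limit):  # sequential control flow
    activity.run(process_data)
```

The output parameter `amount` of the first activity is the input parameter of the second one; this dependency is the data flow between the two activities.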
3.7.2 Workflow Data Patterns
To organize data-related issues in business process management, workflow data patterns have been introduced. Workflow data patterns formulate characteristics on how to handle data in business processes. They are organized according to the dimensions data visibility, data interaction, data transfer, and data-based routing.
Data visibility is very similar to the concept of scope in programming languages because it characterizes the area in which a certain data object is available for access. The most important workflow data patterns regarding data visibility are as follows.
• Task data: The data object is local to a particular activity; it is not visible to other activities of the same process or other processes.
• Block data: The data object is visible to all activities of a given subprocess.
• Workflow data: The data object is visible to all activities of a given business process, but access is restricted by the business process management system, as defined in the business process model.
• Environment data: The data object is part of the business process execution environment; it can be accessed by process activities during process enactment.
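The analogy to scopes in programming languages can be illustrated with nested lookups. The sketch below uses `collections.ChainMap` from the Python standard library; the data object names and values are made up, and the access restrictions that a business process management system would enforce on workflow data are deliberately left out.

```python
from collections import ChainMap

# One dictionary per visibility level (hypothetical data objects).
environment_data = {"currency": "EUR"}
workflow_data = {"order_number": 42}
block_data = {"shipping_method": "express"}


def activity_view(task_data: dict) -> ChainMap:
    """Data visible to one activity: its own task data first, then the enclosing scopes."""
    return ChainMap(task_data, block_data, workflow_data, environment_data)


check_stock = activity_view({"retries": 0})  # task data local to this activity
ship_order = activity_view({})               # a second activity in the same block

retries_hidden_from_other_activity = "retries" not in ship_order
shared_order_number = ship_order["order_number"]
```

Each activity sees its own task data plus the block, workflow, and environment data, while the task data of another activity stays invisible to it.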
Data interaction patterns describe how data objects can be passed between activities and processes. Data objects can be communicated between activities of the same business process, between activities and subprocesses of the same business process, and also between activities of different business processes.
Data can also be communicated between the business process and the business process management system.
Data transfer is the next dimension to consider. Data transfer can be performed by passing values of data objects and by passing references to data objects. These data transfer patterns are very similar to call-by-value and call-by-reference, concepts used in programming languages to invoke procedures and functions.
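The difference between the two transfer styles can be sketched in Python, where passing a deep copy approximates transfer by value and passing the object itself corresponds to transfer by reference. The function `add_item` and the sample order are hypothetical.

```python
import copy


def add_item(order: dict) -> None:
    """An activity that modifies the data object it receives."""
    order["items"].append("gift wrap")


order = {"ordno": 42, "items": ["book"]}

# Transfer by value: the activity works on a copy, the original is unchanged.
add_item(copy.deepcopy(order))
after_value = list(order["items"])

# Transfer by reference: the activity modifies the shared data object.
add_item(order)
after_reference = list(order["items"])
```

With transfer by reference, changes made by the receiving activity become visible to every other holder of the reference, which is exactly what transfer by value avoids.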
In data-based routing, data can have different implications on process enactment. In the simplest case, the presence of a data object can enable a process activity. Data objects can also be used to evaluate conditions in business process models, for instance, to decide on the particular branch to take in a split node.
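Both cases can be sketched in a few lines. The function names, the branch names, and the threshold of 1000 are assumptions chosen for illustration, not part of the workflow data patterns themselves.

```python
def is_enabled(required_object: str, data: dict) -> bool:
    """Simplest case: an activity is enabled once its data object is present."""
    return required_object in data


def choose_branch(data: dict) -> str:
    """Split node: evaluate a condition on a data object to select a branch."""
    return "manual_approval" if data["amount"] > 1000 else "automatic_approval"


process_data = {"amount": 150}
selected_branch = choose_branch(process_data)
```

Here the presence of `amount` enables the decision, and its value determines which branch of the split is taken.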
Workflow data patterns are an appropriate means to organize aspects of business processes related to the handling of data.