The proposed framework for implementing a scalable BI system is given in Figure 2-1. Each step of the framework is derived from the existing studies focused on implementing BI systems as discussed in the previous chapter.
The steps of the proposed framework are discussed in the sections that follow.
[Figure 2-1 steps: Requirements Identification -> Data Warehouse Creation -> BI Tool Selection -> Report Templates Creation -> BI Portal Creation -> Testing -> Deployment -> Support and Maintenance]
Figure 2-1 Proposed framework for creating a scalable BI system
A framework for implementing a scalable business intelligence system
Organisations must have a culture of working with information and information technology (IT) to build and implement BI systems [51]. As a result, it is vital to first establish a broad vision for the BI system and analyse the organisation's readiness [12], [51], [56]. This stage entails defining an organisation's informational demands as well as paying close attention to key IT decision-makers and specialists.
The first step is to consider the organisation's readiness for a BI system. Readiness for BI can be thought of as a risk analysis of the BI system's development, with the goal of increasing the potential return on investment [56], [57]. A readiness assessment is required for two reasons:
• Identifying areas that are not ready for BI processing to avoid wasting resources [56].
• Indicating the approaches required to address these issues to successfully implement the BI system [56].
Various approaches have been proposed to measure organisational readiness for BI system implementation. The primary elements used to determine organisational readiness are as follows:
a) Culture
• Refers to the organisational culture of learning, resistance to change as well as the habit of making data-driven decisions [56].
b) Individual
• Refers to the skills and knowledge of the people within the organisation [56].
c) Management
• Refers to the support of management and their willingness to provide resources for the successful implementation of the BI system [56].
d) Strategy
• Refers to a well-defined plan of the organisation to implement the BI system [56].
When developing a BI system, it is vital to have strong backing from an organisation's senior management, especially because a BI system is a large investment initiative [12]. Senior managers are also possible end-users of the BI system, therefore understanding their vision for the BI system's potential impact will help increase the potential value of the implemented system.
Once the readiness has been determined, the subsequent step is to collect and organise the business requirements, as well as to define the project scope of the BI system [12]. This includes developing a budget to meet the financial costs associated with implementing the BI system.
Finally, it is necessary to form a team that will oversee implementing the BI system [12]. The team should ideally include both technical specialists and business users. Throughout the implementation, it is critical that team members communicate on the project's progress as well as obstacles.
The most important component of a BI system is the data warehouse [6], [12], [42]. Thus, it is critical that it is properly designed and implemented. Numerous studies [12], [46], [48], [50], [53] agree that the most effective method for designing and implementing a data warehouse is Kimball's method, which is based on dimensional modelling. Dimensional modelling provides both understandability and fast query performance [12], [46].
Dimensional modelling is a logical design technique for arranging data in a user-friendly manner. Kimball et al. [12] argue that simplifying the database is crucial because it ensures that users can readily comprehend it. A detailed overview of dimensional modelling is given in Section 1.1.2.
Figure 2-2 depicts the steps involved in creating a data warehouse.
Figure 2-2 Data warehouse life cycle [12]
BUSINESS REQUIREMENTS DEFINITION
Almost every decision made during the implementation of a BI system is influenced by business requirements. Talking to the business users is the first step in the entire approach to requirements definition. To guarantee that the BI system meets their demands, the goal is to learn about their jobs, objectives, and difficulties, as well as gain a deeper understanding of what they do and how they make decisions [12].
In-person meetings with business users are essential for fully comprehending their needs. They assist in the development of positive working relationships, which significantly influences the users' eventual acceptance of the BI system. Kimball et al. [12] suggest interviews as an interactive strategy for acquiring requirements.
The goal of the interviews is to learn more about the organisation's goals, success measures, major concerns, and analysis requirements, such as the type of analytics needed, the purpose of the analytics, and the amount of historical data required.
The interview findings should subsequently be summarised and linked to the organisation's fundamental business processes. A business process is the most basic activity that a company engages in, such as accepting orders, shipping, and invoicing. The summarised findings are then organised into a high-level enterprise data warehouse bus matrix [12].
[Figure 2-2 steps: Business Requirements Definition -> Dimensional Modelling -> Physical Design -> ETL Design & Development]
The bus matrix's rows represent the business processes, and the columns correspond to the data's conformed dimensions. These are the common terms used to filter a business process; filtering bills by customer is one example. The matrix cells are then marked to show which dimensions are associated with which business process.
Table 2-1 shows an example of a bus matrix.
Table 2-1 High level enterprise data warehouse bus matrix for restaurant
Date Customer Waiter Food
Menu Billing x x x x
Waiter Service Tips x x x
Order Complaints x x x
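As a rough illustration of the structure described above, a bus matrix can be held as a mapping from each business process to the set of dimensions marked in its row. The sketch below is only an assumption about how this might be represented; the process and dimension names are hypothetical and are not taken from Table 2-1.

```python
# Illustrative bus matrix: business processes (rows) against conformed
# dimensions (columns). All names are hypothetical.
BUS_MATRIX = {
    "Order taking": {"Date", "Customer", "Product"},
    "Shipping": {"Date", "Customer", "Product", "Carrier"},
    "Invoicing": {"Date", "Customer"},
}


def dimensions_for(process):
    """Return the dimensions marked for one business process, sorted by name."""
    return sorted(BUS_MATRIX[process])


def processes_sharing(dimension):
    """Return the business processes whose row is marked for a dimension."""
    return sorted(p for p, dims in BUS_MATRIX.items() if dimension in dims)


def render(matrix):
    """Render the matrix as text, with an 'x' in every marked cell."""
    columns = sorted(set().union(*matrix.values()))
    lines = ["Process".ljust(14) + " ".join(c.ljust(9) for c in columns)]
    for process, dims in matrix.items():
        marks = " ".join(("x" if c in dims else "").ljust(9) for c in columns)
        lines.append(process.ljust(14) + marks)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(BUS_MATRIX))
```

Dimensions that appear in several rows, such as Date and Customer here, are the ones that need to be conformed across business processes.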
DIMENSIONAL MODELLING
Dimensional modelling is a dynamic and iterative process. Following a few preparatory stages, the design process begins with an initial graphical model built from the requirements definition's bus matrix. Dimensional modelling involves four major phases [12]:
• Preparation
• Develop a high-level model
• Develop a detail dimensional model
• Review the dimensional model
Dimensional models develop over time through several design meetings, with each pass resulting in a more complete and robust design that has been repeatedly evaluated against the organisation's demands. When the model clearly fits the business requirements, the process is complete. The four major phases of dimensional modelling are discussed in the sub-sections to follow.
PREPARATION
The preparation phase involves identifying the roles and participants needed, analysing the business requirements documentation, setting up the modelling environment, establishing naming standards, and procuring the necessary facilities and supplies. Before initiating the modelling process, it is critical to review the business requirements, as these will be transformed into a scalable dimensional model [12].
Another significant topic to consider during this phase is the establishment of naming conventions [12]. The labels applied to data must be informative and consistent in conveying a strong business orientation, especially when users browse the data models through the BI system. A poor naming standard will jeopardise the usability provided by dimensional modelling [12].
The subsequent stage in the preparation phase is to identify the source data. This includes comprehending the source data's structure as well as its contents, relationships, and derivation rules. This also involves ensuring that the data exists and it is usable, or that its flaws can be managed, and that there is an understanding of what it will take to convert it into the dimensional model [12].
DEVELOP A HIGH-LEVEL MODEL
Designing dimensional data models is a four-step procedure [12]. The process does not take place in a single design session, but rather offers context for a succession of design sessions. The bus matrix created during the requirements definition process specifies the business processes and their related dimensions, making it an important input to this four-step process.
The four-step process is depicted in Figure 2-3.
Figure 2-3 Four-step dimensional modelling [12]
The first step in developing a high-level model is to choose which business process to model. According to the Kimball approach, the data warehouse should be built iteratively, one business process at a time [12]. The second step declares the fact table's granularity, or grain, which states exactly what a single fact table row represents, for example an individual transaction against an insurance policy [12].
Building fact tables at the most detailed level is essential in dimensional modelling, since detailed data supports the widest range of dimensions and scales best [12]. The third step determines the fact table's dimensions. Every dimension in the bus matrix should be checked against the grain at this point to determine whether it fits.
Identifying the facts or measures from the business process is the final step in the modelling process. It's crucial not to add facts that don't match the granularity of the fact table because this will complicate the BI system and produce problems [12].
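The four steps can be captured in a small data structure. The following is a minimal sketch and not part of the Kimball method itself: the class name, field names, and the rule that each fact carries the grain at which it is measured are assumptions made for illustration. The sample values reuse the insurance policy transaction mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class FactTableDesign:
    """Hypothetical record of the four design decisions for one fact table."""

    business_process: str                            # step 1: select a business process
    grain: str                                       # step 2: declare the grain
    dimensions: list = field(default_factory=list)   # step 3: identify the dimensions
    facts: dict = field(default_factory=dict)        # step 4: fact name -> grain measured at

    def add_fact(self, name, measured_at):
        """Accept a fact only if it is measured at the declared grain."""
        if measured_at != self.grain:
            raise ValueError(
                f"{name!r} is measured per {measured_at}, not per {self.grain}"
            )
        self.facts[name] = measured_at


design = FactTableDesign(
    business_process="Policy transactions",
    grain="policy transaction",
    dimensions=["Date", "Policy", "Customer"],
)
design.add_fact("transaction_amount", "policy transaction")
```

A fact measured at another grain, such as a monthly total, is rejected by `add_fact`, which mirrors the advice not to mix grains within one fact table.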
Figure 2-4 shows an example of a high-level model diagram. Once the high-level model is created, a list of attributes and metrics for the dimension and fact tables must then be created.
[Figure 2-3 steps: Select a Business Process -> Declare the Grain -> Identify the Dimensions -> Identify the Facts]
Figure 2-4 High-level model example
Once a dimensional model has been designed, the next step is to develop a high-level dimensional model diagram, which is a graphical depiction of the business process's dimension and fact tables.
Each component should be properly labelled, the grain of the fact table should be stated, and all measurements related to the fact table should be displayed.
DEVELOP A DETAIL DIMENSIONAL MODEL
Developing a detailed dimensional model entails adding missing data to the high-level model, resolving design concerns, and validating the model against the business requirements to ensure completeness. A critical component of this approach is determining the best data sources to feed the target design. The following criteria are often used to determine appropriate data sources [12]:
• Data Accessibility
• Data Availability
• Data Accuracy
REVIEW THE DIMENSIONAL MODEL
This step entails examining the model with various users who have varying degrees of business and technical knowledge, to obtain feedback from interested individuals throughout the organisation [12]. This review is likely to involve several parties, including:
• IT experts
• Data experts
• Business users
[Figure 2-4 labels: Sugar Level; Date; Patient; Nurse Ward; Diagnosis]
PHYSICAL DESIGN
The elements of the logical database design were the emphasis of the preceding phase. This phase describes how to convert a logical design into a physical database. The physical data model and database implementation specifics are heavily dependent on project-specific elements such as the logical data model, data warehouse platform selection, and access tools.
Figure 2-5 gives an overview of the main steps involved in this phase.
Figure 2-5 Physical design process
DEVELOP STANDARDS
It is critical to establish standards for the various BI system components [12], as a complicated BI system becomes considerably easier for business users and developers to manage when standards are applied consistently. This includes ensuring that the data warehouse's table and column names match the logical model's names as closely as possible.
Other considerations include deciding whether to use nulls in the data warehouse dimension tables. Business users, according to Kimball et al. [12], are confused by nulls, especially when creating queries that filter on an attribute that is occasionally null.
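If the standard adopted is to avoid nulls in dimension attributes, one option is to substitute a descriptive placeholder while the dimension rows are being prepared. The sketch below assumes rows are held as Python dictionaries; the function name and the placeholder label "Unknown" are illustrative choices, not prescribed by the source.

```python
UNKNOWN = "Unknown"  # assumed placeholder label; choose one that suits the business


def fill_missing_attributes(row, attributes, placeholder=UNKNOWN):
    """Return a copy of a dimension row with None or blank attributes replaced."""
    cleaned = dict(row)
    for name in attributes:
        value = cleaned.get(name)
        if value is None or (isinstance(value, str) and not value.strip()):
            cleaned[name] = placeholder
    return cleaned


customer = {"customer_key": 7, "name": "Acme Ltd", "region": None, "segment": " "}
filled = fill_missing_attributes(customer, ["region", "segment"])
```

A query that filters on `region` then sees an explicit "Unknown" value instead of silently dropping the row.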
DEVELOP PHYSICAL DATA MODEL
Although some changes in the layout of tables and columns may be necessary to accommodate the hard restrictions of your selected data warehouse platform and access tools, the physical model should try to match the logical model as closely as possible [12]. The physical and logical models differ primarily in the detailed specification of physical database properties, such as data types.
To meet the special requirements of your access tool or increase query efficiency, you can diverge from the logical model [12]. Each dimension is frequently condensed into a single table in the logical data model, which is commonly arranged as a star schema. According to Kimball's method, this should also apply to the physical model [12].
It is also recommended that the physical data model be developed using a data modelling tool. This will make the flow of information through the ETL system and into the metadata more natural and more understandable for the user [12].
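To make the step from logical to physical model concrete, the sketch below instantiates a small star schema with explicit data types, using Python's built-in sqlite3 module. The table names, column names, and sample values are hypothetical, and SQLite is used only because it needs no setup; a real data warehouse platform would have its own data types and constraints.

```python
import sqlite3

# Hypothetical star schema for customer orders: each dimension is a single
# table, and the fact table references the dimensions through their keys.
DDL = """
CREATE TABLE dim_date (
    date_key      INTEGER PRIMARY KEY,
    full_date     TEXT NOT NULL,
    month_name    TEXT NOT NULL
);
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    city          TEXT NOT NULL
);
CREATE TABLE fact_order_line (
    date_key      INTEGER NOT NULL REFERENCES dim_date (date_key),
    customer_key  INTEGER NOT NULL REFERENCES dim_customer (customer_key),
    quantity      INTEGER NOT NULL,
    amount        REAL NOT NULL
);
"""


def create_star_schema(connection):
    """Create the dimension and fact tables on the given connection."""
    connection.executescript(DDL)


conn = sqlite3.connect(":memory:")
create_star_schema(conn)
conn.execute("INSERT INTO dim_date VALUES (20240105, '2024-01-05', 'January')")
conn.execute("INSERT INTO dim_customer VALUES (1, 'Acme Ltd', 'Pretoria')")
conn.execute("INSERT INTO fact_order_line VALUES (20240105, 1, 3, 150.0)")

# A typical dimensional query: join the fact table to a dimension and aggregate.
total_by_city = conn.execute(
    """
    SELECT c.city, SUM(f.amount)
    FROM fact_order_line AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    GROUP BY c.city
    """
).fetchall()
```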
CREATE THE DATABASE
The final stage is to instantiate the physical database. Physical implementation details differ significantly from project to project and platform to platform. Configuring the hardware, software, and infrastructure for a typical data warehouse takes time. The system must also be optimised and managed, which takes a long time as well. Traditional data warehouses are typically harder to manage [16].
[Figure 2-5 steps: Develop Standards -> Develop Physical Data Model -> Create Database]
A cloud data warehouse is built to accommodate a growing number of users and applications [16].
The advantage of using a cloud data warehouse is that scaling is rapid and easy, allowing capacity to be scaled up or down as business needs change [16].
In the cloud, setting up a data warehouse is simple, whereas setting up a traditional data warehouse takes a long time. Cloud data warehouses are particularly beneficial for analytics because they employ parallel processing capabilities which boost query performance [16].
Traditional data warehouses are concerned with managing data, whereas cloud data warehouses enable organisations to move their focus from management to analysis.
ETL DESIGN AND DEVELOPMENT
The ETL system is an important part of the BI system. The core activities of ETL are collecting data from several datasets and feeding it into the target table, but the transformation stage is where ETL brings value [13].
Putting the ETL system in place involves two key steps. The first is to devise a high-level plan for mapping the source data to the target tables. The second is to either select an ETL tool or create one from scratch. These steps are discussed in detail below [12], [13].
CREATE A HIGH-LEVEL PLAN
The high-level plan outlines how to map source data to target tables. Its purpose is to note the main transformation challenges for each source, as well as the target tables into which the data will be loaded [12]. The plan can also describe how to clean and conform data, such as removing duplicates and standardising attribute values.
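Cleaning and conforming can be illustrated with a short sketch that standardises attribute values and then removes the duplicates this exposes. The column names and the synonym mapping are assumptions made for the example; a real plan would list the rules agreed for each source.

```python
# Assumed mapping of source spellings to one conformed value (illustrative only).
COUNTRY_SYNONYMS = {
    "sa": "South Africa",
    "rsa": "South Africa",
    "south africa": "South Africa",
}


def standardise_country(value):
    """Map known spellings to the conformed value; otherwise just trim whitespace."""
    key = value.strip().lower()
    return COUNTRY_SYNONYMS.get(key, value.strip())


def clean_and_conform(rows):
    """Standardise attribute values, then drop duplicates, keeping the first row."""
    seen = set()
    result = []
    for row in rows:
        conformed = {
            "customer": row["customer"].strip().title(),
            "country": standardise_country(row["country"]),
        }
        key = (conformed["customer"], conformed["country"])
        if key not in seen:
            seen.add(key)
            result.append(conformed)
    return result


source_rows = [
    {"customer": " acme ltd", "country": "RSA"},
    {"customer": "Acme Ltd", "country": "south africa"},
    {"customer": "Beta", "country": "Kenya"},
]
conformed_rows = clean_and_conform(source_rows)
```

Standardising before de-duplicating matters here: the first two source rows only become recognisable as duplicates once their spellings have been conformed.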
Figure 2-6 below shows an example of a high-level strategy for an ETL system that will process customer orders.
[Figure 2-6 content: Source: Customer orders spreadsheet -> Transformation: Does ordered item exist? -> Target: Orders, Order date, Order item]
Figure 2-6 Example of high-level data staging plan
The data source is a spreadsheet with customer orders. The transformation phase will have to check if the ordered items are valid orders. Once the transformation is performed, the data will either be rejected or loaded into the data warehouse. The orders, order date, and order item target tables will be updated.
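The plan described above could be sketched as three small functions. This is a minimal illustration under stated assumptions, not the implementation used in this study: CSV text stands in for the spreadsheet, the list of known items and all column names are hypothetical, and the target tables are simple in-memory lists.

```python
import csv
import io

# Assumed reference data: the items that count as valid orders.
KNOWN_ITEMS = {"burger", "salad", "coffee"}


def extract(source):
    """Read order rows from CSV text standing in for the spreadsheet."""
    return list(csv.DictReader(io.StringIO(source)))


def transform(rows, known_items=KNOWN_ITEMS):
    """Split rows into accepted and rejected by checking that the item exists."""
    accepted, rejected = [], []
    for row in rows:
        item = row["item"].strip().lower()
        target = accepted if item in known_items else rejected
        target.append({**row, "item": item})
    return accepted, rejected


def load(rows, warehouse):
    """Append accepted rows to the three in-memory target tables."""
    for row in rows:
        warehouse["orders"].append({"order_id": row["order_id"]})
        warehouse["order_date"].append({"order_id": row["order_id"], "date": row["date"]})
        warehouse["order_item"].append({"order_id": row["order_id"], "item": row["item"]})


SAMPLE = "order_id,date,item\n1,2024-01-05,Burger\n2,2024-01-05,Pizza\n"
warehouse = {"orders": [], "order_date": [], "order_item": []}
accepted, rejected = transform(extract(SAMPLE))
load(accepted, warehouse)
```

Keeping the rejected rows in their own list, rather than discarding them, makes it possible to report back to the source owner which orders failed the check.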
CHOOSE OR CREATE AN ETL TOOL
An ETL process can be implemented in two ways: by buying a tool from a vendor or by building one from the ground up [13]. Each choice has its own advantages. Buying a tool saves time, whereas building one can save money [13].
ETL tools are widely available from a variety of major vendors. These tools typically carry a licensing fee, although some are open source. They often serve several purposes; some are effective at extracting data from a variety of sources, including flat files, ODBC, and OLEDB (see footnote 11), among others.
The advantages of using a tool rather than making your own include the following:
• Self-documentation from graphical tools.
• Advanced transformation logic.
• Improved system performance with lower level of expertise [13].
The advantage of building an ETL tool from the ground up is that it is extremely flexible and can readily solve all the unique business requirements, as opposed to using a vendor product with a limited set of features [13].
Building an ETL tool demands programming expertise, and as a result, maintainability and extensibility can be a drawback, especially if the previous developer has left, and the ETL system needs to be maintained by a new developer [13].
Choosing a BI tool can be a difficult undertaking. Vendors today provide a diverse range of BI products, from simple reporting tools to sophisticated BI platforms [51]. The chosen BI tool should therefore be modern enough to still satisfy the organisation's needs a few years from now. At this point, a thorough understanding of the BI market is essential [51].
Easy-to-use functionality that supports the entire analytic workflow, from data preparation through to insight discovery, is a hallmark of modern BI solutions. In the BI sector, vendors range from established companies to start-ups backed by venture capital funds.
Larger companies are linked with more comprehensive products that include data management tools [58].
Footnote 11: Difference Between OLEDB and ODBC. Source: http://www.differencebetween.net/technology/web-applications/difference-between-oledb-and-odbc/
According to analysts at Gartner, there are several critical capability areas of BI platforms that users need to consider [58]. These include:
• Security
• Manageability
• Cloud
• Data source connectivity
• Data preparation
• Model complexity
• Catalogue
• Automated insights
• Data visualisation
• Natural language query
• Data storytelling
• Embedded analytics
• Natural language generation
• Reporting
Figure 2-7 below shows different BI platform vendors as evaluated by Gartner for 2021 [58].
Figure 2-7 Gartner’s magic quadrant for BI platforms [58]
The four categories of the vendors and tools are defined as follows:
• Leaders demonstrate a solid understanding of product capabilities, are committed to customer success, and have a sound pricing model [58].
• Challengers are well-positioned to succeed but are limited to specific use cases [58].
• Visionaries offer extensive capabilities in the areas they target, but there are gaps when it comes to broader functionality [58].
• Niche Players perform effectively in a specific market segment, such as financial BI [58].
Although the Magic Quadrant is not itself a tool for selecting vendors and products, it is one of many resources available to help choose a suitable vendor and product.
The cost of the tool, as well as the learning curve for the tool's end user, are two more factors to consider when choosing a BI tool. Because BI tools are such a large investment for businesses, the pricing of the tool must make financial sense for them. The tool's learning curve will have an impact on system usage and, eventually, acceptance. The end-user may be discouraged from using the BI system if the selected tool is too difficult for them.
Most BI systems produce standard reports as their primary output [12]. They usually have a defined format and are parameter driven. These reports provide a great deal of versatility in terms of parameter selection and content setting. They are the first place to look for information about an organisation's analytics [12]. As a result, report templates are utilised to streamline reports.
Users can build reports using an existing template as a starting point for the layout, data model, and queries of a new report. Templates are thus critical for standardising the report preparation process [38]. By creating and using report templates, users can easily replicate appealing layouts and other report characteristics, and share them with others.
In general, creating report templates involves three steps [12], as depicted in Figure 2-8.
Figure 2-8 Report template creation process
Identifying user interactions entails determining the data the user will need to access, the parameters the user must supply, and the interactions the user must be able to perform. A user may, for example, require access to information about customer orders.
The user may be required to supply dates, among other parameters, to filter customer orders.
[Figure 2-8 steps: Define user interactions -> Create queries -> Apply formatting rules]
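The three steps can be illustrated with a parameter-driven report over customer orders. The sketch below is one possible shape for such a template, assuming an `orders` table with hypothetical columns and using Python's built-in sqlite3 module; the template dictionary and the `run_report` helper are illustrative names rather than features of any particular BI tool.

```python
import sqlite3

# Hypothetical template: the parameters the user supplies (user interactions),
# a fixed query, and formatting rules applied per column.
ORDERS_REPORT = {
    "title": "Customer orders",
    "parameters": ("start_date", "end_date"),
    "query": (
        "SELECT customer, order_date, amount FROM orders "
        "WHERE order_date BETWEEN :start_date AND :end_date "
        "ORDER BY order_date"
    ),
    "formats": {"amount": "{:.2f}"},
}


def run_report(connection, template, **params):
    """Run a template's query with user parameters and format each column."""
    missing = [p for p in template["parameters"] if p not in params]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    cursor = connection.execute(template["query"], params)
    columns = [description[0] for description in cursor.description]
    rows = []
    for record in cursor:
        rows.append([
            template["formats"].get(column, "{}").format(value)
            for column, value in zip(columns, record)
        ])
    return columns, rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("Acme", "2024-01-05", 120.5), ("Beta", "2024-02-10", 80.0)],
)
columns, rows = run_report(
    conn, ORDERS_REPORT, start_date="2024-01-01", end_date="2024-01-31"
)
```

Because the query and formatting rules live in the template, a new report with the same layout only needs a different set of parameter values.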