Staff at the University of KwaZulu-Natal for their time to share their views on data quality. Globally, organizational management is increasingly faced with the need for data quality to make informed decisions.
LIST OF TABLES
LIST OF ABBREVIATIONS
CHAPTER ONE
INTRODUCTION TO THE RESEARCH
- Introduction
- Data and Information
- Data Quality Policy: Data Quality Initiative
- Data Quality Practice and Problems
- The Management Information (MI) / Institutional Intelligence (II) Function as a Catalyst for Change
- Statement of the Problem
- Objectives of the Study
- Research Questions
- Motivation for the Research
- Research Methods
- Beneficiaries of the Research and Outcomes
- Chapter Outline
- Literature Review
- Data Analysis
- Conclusions and Recommendations
More recent data crunch reports have highlighted the extent of data quality costs (Data Crunch Report (Australia), 2011; Data Crunch Report (UK), 2011). Therefore, this study sought to determine the perspectives of IS users on the sustainability of data quality activities.
CHAPTER TWO
LITERATURE REVIEW
Introduction
DATA GOVERNANCE
- Introduction and Definition
- Corporate and IT Governance
- Data Governance
- Decision Domains
- Data Governance - Data Quality
A discussion of data quality and its position within data governance requires an examination of data governance's position within business management. Organizations around the world are becoming more and more aware of the importance of data management (Brunelli, 2012).
DATA QUALITY - AWARENESS, CHARACTERISTICS, CAUSES AND IMPACT (Objective 1)
- Awareness of Data Quality
- Characteristics of Data Quality
- Other Data Quality Dimension Approaches
- Trade-offs within Dimensions
- Dependencies within Dimensions – Relationships of Data Quality Dimensions to Data Quality Improvement
- Causes of Data Quality
- Impact of Data Quality
Knowledge of the current state of data quality would make users aware of the need for ongoing maintenance. One data quality dimension model that deserves mention is the CIA (Confidentiality, Integrity, and Availability) model.
ACCOUNTABILITY AND DATA QUALITY ROLE PLAYERS (Objective 2)
- Data Roles
- Data Stewards
- Data Custodians
- Connecting Data Roles to Data Quality Dimensions
- Data Quality Board
- Dimensions of Common Responsibility
- RACI Charting in Allocating Roles in Data Quality
For business data, an individual can be called the business owner of the data" http://www.information-management.com/glossary/d.html. Loshin (2001b, ibid.) acknowledges that the role of a data administrator also includes a business side.
DATA QUALITY PRACTICE / SPECIFIC ISSUES
- Performance
- The Comparative Approach, first implemented by Pipino, Lee and Wang (2002), involves the use of data quality surveys and quantifiable data quality
- An Alternative Comparative Approach uses aggregated results of data quality surveys to analyse and prioritise key areas of improvement. It includes
- Policy and Strategy – Way Forward
There are also other approaches to analyze roles and activity in terms of data quality, e.g. the COBIT framework (Barnier, 2012). He suggests that the data quality tolerance threshold should be related to the business impact of a data problem. Loshin proposes Service Level Agreements (SLAs) on data quality; his model is illustrated in Figure 2.12 below.
Very little subsequent research has been conducted on the relationship between data quality and KPIs.
DATA QUALITY COSTS
This study will focus only on the time ('hours'), namely the labor spent on the data quality activity. The objective is to obtain an estimate of cost savings through a data quality improvement exercise. This research study adopts a formula to determine labor costs in terms of hours spent on data quality activity.
A limitation of the study is that because a wide variety of factors driving data quality were examined, the scope of the study only allowed for the calculation of labor costs of data quality activities.
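The labor-cost calculation described above can be sketched as follows. The study's own formula is not reproduced in this text, so this is only a minimal sketch assuming that cost equals hours spent multiplied by an hourly rate. The names `RoleEffort`, `annual_labor_cost` and `labor_cost_range`, the default of 46 working weeks, and all figures in the usage example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RoleEffort:
    """Time one respondent group spends on data quality activity (hypothetical inputs)."""
    role: str
    headcount: int
    hours_per_week: float  # hours per person per week spent on data quality work


def annual_labor_cost(effort, hourly_rate, weeks_per_year):
    """Assumed formula: headcount x weekly hours x working weeks x hourly rate."""
    return effort.headcount * effort.hours_per_week * weeks_per_year * hourly_rate


def labor_cost_range(efforts, low_rate, high_rate, weeks_per_year=46):
    """Return (lower, upper) annual labor cost across all groups.

    The two rates stand in for a lower and an upper salary assumption.
    """
    low = sum(annual_labor_cost(e, low_rate, weeks_per_year) for e in efforts)
    high = sum(annual_labor_cost(e, high_rate, weeks_per_year) for e in efforts)
    return low, high


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the survey.
    efforts = [
        RoleEffort("Data Custodian", headcount=3, hours_per_week=6.0),
        RoleEffort("Data Steward (Business)", headcount=10, hours_per_week=4.0),
        RoleEffort("Data Steward (Technical)", headcount=5, hours_per_week=2.5),
    ]
    low, high = labor_cost_range(efforts, low_rate=150.0, high_rate=400.0)
    print(f"Estimated annual labor cost: R {low:,.0f} to R {high:,.0f}")
```

Keeping the hours and the rate as separate inputs means the hours baseline can be re-measured after an improvement exercise without revisiting the salary assumptions.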
UNIVERSITY OF KWAZULU-NATAL - SITUATIONAL OVERVIEW
- Institutional Data Roles
- Roles and Functions ‘Augmented’ for the Purposes of this Research Study from the Institutional Data Quality Policy
Source: O'Riain, C and Helfert, M (2005), An Evaluation of Data Quality-related problem patterns in healthcare information systems, School of Computing, Dublin City University, Ireland. A list has been compiled of the three information user groups and their functions related to data quality, as outlined in the Institutional Data Quality Policy (Data Quality Principles and Guidelines, 2011). Advises DC DST on system problems, allowing data quality problems to be addressed at the source.
Pursuant to the Institutional Data Quality Policy, the Data Custodian is entrusted to coordinate the Data Quality initiative.
CONCLUSION
Preparing procedure manuals for recording, storing and maintaining data in the relevant transactional databases. For this research, the concept of Data Steward (as proposed by Karel, 2007) has been differentiated into Data Steward Technical ('DST') and Data Steward Business ('DSB'), also known as the Data Owner. This forms the basis for the differentiated approach that will unfold as the research data is analyzed and the perspectives of each of these roles are obtained.
CHAPTER THREE
RESEARCH METHODOLOGY
- Introduction
- Research Approach
- Research Design
- Research Process: Academic and Work
- Questionnaire
- Questionnaire Distribution - Survey Monkey
- Pilot Study
- Sampling and Population
- Summary of Research
- Data Collection Process, Tools and Analysis
- Ethical Considerations
- Conclusion
The questions in the section relating to the sustainability of data quality improvements are based on a four-point scale. Including all the stakeholders directly involved in the data quality process means that the numbers in each of the categories are representative. Three approaches are used in the literature to study data quality: (1) intuitive, (2) theoretical and (3) empirical (Wang and Strong, Ibid).
Differences in resources were assessed to illustrate different perspectives on the sustainability of data quality activities.
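The comparison of group perspectives on a four-point scale can be illustrated with a short descriptive-statistics sketch. It assumes each response is recorded as a role label plus an integer score from 1 to 4. The direction of the scale, the function name `mean_by_group` and the sample responses are assumptions made for illustration, not details taken from the questionnaire.

```python
from collections import defaultdict
from statistics import mean

VALID_SCORES = {1, 2, 3, 4}  # four-point scale; direction of the scale is assumed


def mean_by_group(responses):
    """Return the mean score per role from (role, score) pairs."""
    grouped = defaultdict(list)
    for role, score in responses:
        if score not in VALID_SCORES:
            raise ValueError(f"score {score!r} is outside the four-point scale")
        grouped[role].append(score)
    return {role: round(mean(scores), 3) for role, scores in grouped.items()}


if __name__ == "__main__":
    # Hypothetical responses for a single question, not survey data.
    sample = [
        ("Data Custodian", 3), ("Data Custodian", 3),
        ("Data Steward (Business)", 2), ("Data Steward (Business)", 3),
        ("Data Steward (Technical)", 1), ("Data Steward (Technical)", 2),
    ]
    for role, value in mean_by_group(sample).items():
        print(f"{role}: {value}")
```

Group means of this kind only describe the responses; judging whether the differences between the three groups are significant would need a separate statistical test.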
CHAPTER FOUR
DATA ANALYSIS
Introduction
BIOGRAPHICAL DETAILS
This is a significant number as it provides a good indication of management perceptions of the research issues, especially among Data Stewards (Business). Seventy-seven percent of the Data Stewards (Technical) who participated in the survey have more than 15 years of experience.
Additionally, 66.7% of Data Custodians have more than ten years of experience, compared with 60% of Data Stewards (Business) and 89% of Data Stewards (Technical).
AWARENESS AND COMMUNICATION
The low percentage of Data Stewards (Technical) (37.5%) may reflect the fact that they were not exposed to data quality workshops. All Data Custodians (100%) reported experiencing data quality issues as part of their daily work. Seventy-three percent of Data Stewards (Business) stated that they have problems with data quality in their work.
This is directed by the Data Stewards (Business) as Data Owners, while technical issues are handled by the Data Stewards (Technical).
ISSUES RELATING TO ACCOUNTABILITY AND DATA QUALITY PRACTICE / MANAGEMENT
- ACCOUNTABILITY
The Data Stewards (Business) responded that they refer the problem elsewhere (40%) or try to solve or work around it (53%). For example, 62.5% of Data Stewards (Technical) reported that they had referred the problem elsewhere, and 25% tried to work around the problem. They can only be 'part of the picture' in that they can assist the Data Custodians and Data Stewards (Business) with parts of the data quality system programming.
Data Stewards (Technical) reported a higher level of concern than the Data Custodians or Data Stewards (Business).
COST OF DATA QUALITY
The time that Data Stewards (Business) spend on data quality varies; 8% of these respondents reported spending more than 50% of their time on data quality. The tables below (Tables 4.24a to 4.24c) show the time spent on data quality by the Data Custodians, Data Stewards (Business) and Data Stewards (Technical). These costs may be conservative as data quality initiatives at the institution have not reached their peak.
Several areas remain to be integrated that will strain current resources for the data quality initiative.
SUSTAINABILITY OF DATA QUALITY IMPROVEMENTS
If these results are extrapolated to the total IS users in the survey, the labor costs amount to R 1 031 420 per year at the lower range and R 2 663 623 per year at the higher range. In addition, functional areas such as Human Resources and Finance are not incorporated in terms of errors in the current data quality audit system, and will contribute significantly to the labor costs calculated in this exercise. Furthermore, this exercise does not consider the cost of technical solutions and changing business processes to improve data quality.
Data quality initiatives / awareness will promote a data and information culture / information literacy (Question 24)
At this institution there are sufficient people to support the data quality initiative with the necessary skills and knowledge to guide
A mean of 2.21 suggests that IS users tend to be skeptical about the adequacy of the institutional skills and knowledge base to support data quality improvement. The means for the Data Custodians, Data Stewards (Business) and Data Stewards (Technical) are 3.0, 2.14 and 1.875 respectively.
At this Institution training among data capturers / data owners is adequate to support the attainment of better data quality (Question
The mean of 2.66 indicates that a significant number of IS users believe that the level of training in data quality at the institution is inadequate and needs more attention. The averages for the Data Custodians, Data Stewards (Business) and Data Stewards (Technical) are 3.0, 2.61 and 2.625 respectively. The average of 2.69 suggests that the teams and working groups for data quality projects are not functioning as efficiently as expected.
The means for the Data Custodians, Data Stewards (Business) and Data Stewards (Technical) are 2.85 respectively.
In this organisation individuals have become comfortable with change and do not seek stability (Question 28)
The mean of 2.83 suggests that IS users are not comfortable with change and implicitly express a "desire" for a stable data environment. The means for the Data Custodians, Data Stewards (Business) and Data Stewards (Technical) are 2.75 respectively.
Management views data quality as important (Question 29)
The means for the Data Custodians, Data Stewards (Business) and Data Stewards (Technical) are …, … and 2.12 respectively.
There are enough people in the Institution to lead a data quality initiative (Question 30)
There are enough people in the Institution that care about data quality (Question 31)
Expectations about achieving data quality improvement are reasonable (Question 32)
Data quality activity / processes will over the longer term be successful and sustainable (Question 33)
This chapter provided an analysis and interpretation of the survey data using descriptive statistics regarding issues affecting data quality awareness, data quality practices, and accountability; an estimate of the cost of correcting bad data; and a statistical perspective on the sustainability of data quality improvements in relation to Data Custodians and Data Stewards. The analysis of the data provided a comprehensive picture of the perspectives of the three groups of IS users. To elaborate on accountabilities and roles, a "RACI" matrix applied to the data quality environment at the institution may be provided in Chapter 5 for future research.
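A RACI matrix of the kind mentioned above could be held as a simple mapping from activity to role allocations and checked for consistency. The sketch below is illustrative only. The allocations are placeholders and not the accountabilities proposed by the study, and the rule of exactly one Accountable role and at least one Responsible role per activity is a common RACI convention assumed here. `validate_raci` is a hypothetical helper name.

```python
ROLES = ("DC", "DSB", "DST")  # Data Custodian, Data Steward (Business), Data Steward (Technical)
VALID_CODES = set("RACI")     # Responsible, Accountable, Consulted, Informed

# Placeholder allocations for illustration; NOT the study's proposed accountabilities.
example_raci = {
    "Defining":               {"DC": "A", "DSB": "R", "DST": "C"},
    "Measuring":              {"DC": "A", "DSB": "R", "DST": "R"},
    "Analysing":              {"DC": "A", "DSB": "R", "DST": "C"},
    "Improving":              {"DC": "C", "DSB": "AR", "DST": "R"},
    "Controlling / Feedback": {"DC": "A", "DSB": "R", "DST": "I"},
}


def validate_raci(matrix):
    """Return a list of problems; an empty list means the matrix passes the checks."""
    problems = []
    for activity, allocation in matrix.items():
        codes = "".join(allocation.get(role, "") for role in ROLES)
        if set(codes) - VALID_CODES:
            problems.append(f"{activity}: unknown code in {codes!r}")
        if codes.count("A") != 1:
            problems.append(f"{activity}: expected exactly one Accountable role")
        if "R" not in codes:
            problems.append(f"{activity}: no Responsible role assigned")
    return problems


if __name__ == "__main__":
    for problem in validate_raci(example_raci) or ["no problems found"]:
        print(problem)
```

A check like this makes it visible when an activity has no single accountable role.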
CONCLUSIONS AND RECOMMENDATIONS
- INTRODUCTION
- COMPARISON OF THE LITERATURE vs. FIELDWORK
- OBJECTIVE 2 : ACCOUNTABILITY
- OBJECTIVE 3 : PRACTICE – DATA QUALITY PRACTICES THAT MAY SUPPORT / INHIBIT DATA QUALITY IMPROVEMENT
- OBJECTIVE 4 - COST OF DATA QUALITY
- OBJECTIVE 5 : SUSTAINABILITY of DATA QUALITY IMPROVEMENT
- FURTHER RESEARCH
- RACI DIAGRAM – PROPOSED ACCOUNTABILITIES
- DEFINING
- MEASURING
- ANALYSING
- IMPROVING
- CONTROLLING / FEEDBACK
- CONCLUSION AND RECOMMENDATIONS
3. Accuracy (Question 9) as a dimension of data quality appears to be more important than other dimensions. 5. The impact of data quality (Question 22) appears to be financial, productivity and compliance risk. In addressing the issue of data quality accountability, Data Custodians and Data Stewards (Technical) report that Data Stewards (Business) should be.
This research has assessed perspectives towards the sustainability of data quality improvement at the university.
LIST OF REFERENCES
General details about you as a computer system(s) user / information worker
- Are you employed as a
- Which of the following computer system(s) do you use or are involved with most?
- As a systems user, in what capacity are you employed?
Data Quality Awareness, Experience and Practice (reflecting experience of the problem)
- As part of your systems and/or other job related training, do you feel that the notion of data quality and its importance to our business has been
- By what means has this knowledge or awareness been acquired?
- Are you aware of any Data Quality initiatives underway or having taken place?
- Do you experience data quality problems as part of your daily work?
- Current state of data quality Perfect
- On the system(s) with which you are involved, please indicate your responsibilities with respect to changing data records
- Use of Reports - Should faster access to data / information help to discover DQ problems and to do something about it
- If yes, approximately what percentage of your week is involved in rework or resolving problems caused by bad data?
- Do you feel that the operational processes in your areas of work support /underpin work with respect to Data Quality, are robust and of
- When encountering a data quality problem, do you (choose most important option)
- Where, in which role, in your opinion, does the accountability for bad data lie?
- In terms of accountability for data and data quality success, do you believe that data quality responsibilities should be included in data owners’
- Data Problem Corrections – In order to minimise the time that data remains incorrect, do you believe that a ‘Data Correction Window Period’
- Elements that you see as Barriers to the adoption of Data Quality Initiatives (Choose the most important)
Problem Areas, Impact and Causes related to Data Quality
- In which area do you feel the data problems lie? (Choose most important option)
- Impact of Poor Data – Please choose the most important impact element of poor data
- Causes of Poor Data – Please choose the most important contributor to poor data
An Organisational Assessment / Sustainability of Data Quality Improvement
- Data quality initiatives / awareness will promote a data and information culture / information literacy
- Awareness / Structures to communicate DQ problems
- Awareness in terms of the nature of DQ, i.e. barriers to information use, causes, impact = Questions 20 (cause), 22 (impact), 21 and 23 (cause)
The perspectives of control sites and commitment to accountability will be assessed in terms of linking data quality activity to performance management and improved turnover in terms of time spent on data quality improvement (section 2). To inform this question, time (person-hours) and costs will be determined that are devoted to improving data quality to determine a "baseline" from which data quality can be improved and data quality costs can be monitored (Question 14). Are there differences in perspectives toward data quality sustainability among the three groups of data quality stakeholders, and are the differences significant?
Descriptive Statistics – Section D