Data quality problems in ETL: the state of the practice in large organisations
Date
2016
Authors
Woodall, P.
Borek, A.
Oberhofer, M.
Gao, J.
Type
Conference paper
Citation
ICIQ 2016: 21st International Conference of Information Quality proceedings, 2016, iss.7, pp.1-11
Conference Name
ICIQ 2016: 21st International Conference of Information Quality (22 Jun 2016 - 23 Jun 2016 : Ciudad Real, Spain)
Abstract
This paper reviews the data quality problems that Extract, Transform and Load (ETL) technology causes in large organisations, by observing the context in which the ETL is deployed. Using a case study methodology, it reports the data quality problems, and their context, arising from ETL deployments in six large organisations. The findings indicate that ETL deployments most commonly introduce data accessibility problems, which are caused by (1) the ETL failing partway through and not delivering the data on time, (2) the information systems being locked during ETL execution, and (3) users not being able to find data in the target because of errors in the way the primary keys are transformed. Accuracy, timeliness, believability, and representational consistency problems were also found to be caused by the ETL technology.
Rights
Copyright 2016 The Authors