Data quality problems in ETL: the state of the practice in large organisations

Files

53141234200001831.pdf (314.45 KB)
  (Published version)

Date

2016

Authors

Woodall, P.
Borek, A.
Oberhofer, M.
Gao, J.

Type

Conference paper

Citation

ICIQ 2016: 21st International Conference of Information Quality proceedings, 2016, iss.7, pp.1-11

Conference Name

ICIQ 2016: 21st International Conference of Information Quality (22 Jun 2016 - 23 Jun 2016 : Ciudad Real, Spain)

Abstract

This paper reviews the data quality problems that arise from Extract, Transform and Load (ETL) technology in large organisations by observing the context in which the ETL is deployed. Using a case study methodology, it reports the data quality problems, and the context in which they arose, for deployments in six large organisations. The findings indicate that ETL deployments most commonly introduce data accessibility problems, which are caused by (1) the ETL failing partway and not delivering the data on time, (2) the information systems being locked during ETL execution, and (3) users not being able to find data in the target because of errors in the way the primary keys are transformed. Accuracy, timeliness, believability, and representational consistency problems were also found to be caused by the ETL technology.

Rights

Copyright 2016 The Authors
