ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Data quality management and data cleaning

Uroš Podobnikar (2016) Data quality management and data cleaning. MSc thesis.


    Abstract

    Today's enterprises often struggle to manage the large amounts of data used in their business operations. Ensuring and maintaining an adequate level of data quality is an important aspect of data quality management for many reasons. On the one hand, an adequate level of data quality represents a competitive advantage; on the other hand, a low level of data quality leads to many unpleasant consequences. Frameworks, methodologies, and tools to help ensure an adequate level of data quality have been developed in the past. In addition, data quality is addressed in legislation and various standards. Despite this, some studies show a poor state of data quality in enterprises. The purpose of the thesis is to research and present the area of data quality and to show the issues that follow from low data quality. The thesis presents the consequences of, as well as the reasons for, low data quality. It also explains why data quality is important. In addition, it presents standards, legislation, and best practices that deal with the field of data quality. Data quality issues also arise in the field of the Internet of Things, which has been the subject of much recent research; the thesis therefore also presents the main issues from that point of view. The main emphasis of the thesis is on the part of the field dealing with data quality and data cleaning. The thesis presents error types and various data cleaning frameworks, and combines their main activities into a consolidated view. Furthermore, the thesis presents an overview of the existing software solutions available on the market to support data cleaning tasks. All of the above is covered in the theoretical part of the thesis.
The second part of the thesis is the practical part, which proposes a data quality improvement using a prototype of a software solution. The prototype addresses a specific part of data quality management: maintaining data accuracy by detecting errors in data and offering the possibility of eliminating them (data cleaning). In addition, the thesis proposes deploying the solution in a concrete organisation's information system, taking into account the principles and rules suggested in the literature. The conclusion gives essential approaches to help improve data quality in enterprises.

    Item Type: Thesis (MSc thesis)
    Keywords: data quality, data integrity, data quality management, DQM, data management, data cleaning, information security
    Number of Pages: 117
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname | ID | Function
    izr. prof. dr. Marjan Krisper | 51 | Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537004995)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3350
    Date Deposited: 14 Jun 2016 14:40
    Last Modified: 28 Jun 2016 14:04
    URI: http://eprints.fri.uni-lj.si/id/eprint/3350
