Providing an optimal error detection model in the ETL process

Number of pages: 75 File Format: word File Code: 31018
Year: 2013 University Degree: Master's degree Category: Computer Engineering
  • Summary of Providing an optimal error detection model in the ETL process

    Dissertation submitted for the master's degree

    Specialization: Software

    Fault tolerance is one of the most important features of today's information systems. Among the various approaches to improving fault tolerance, software methods are more complex than the others. Our target here is business intelligence systems, which play a significant role in decision support and decision-making in the business environment; because they are strategic systems, improving their fault tolerance is all the more important. In this research, we present a new software method that detects errors in business intelligence systems at the stage where information is transferred from the information sources to the destination system and the data warehouse is built. The method uses the business's performance indicators both to identify an error that has occurred and to select a healthy module. Its advantages are high flexibility for use in different parts of the system, general detection of errors that occur during the transfer process, the ability to extend it at no additional cost, and applicability to any system that needs to transfer information from one environment to another.

    This method follows the comparison-based diagnosis methodology: its goal is to detect errors quickly by comparing the transferred data with the source data within the framework of the business environment's performance indicators.

    Because this method is a software solution, it costs less than comparable hardware methods.
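
    As a rough illustration of comparison within the framework of a performance indicator, the sketch below computes one indicator on the source rows and on the rows loaded into the warehouse and flags a transfer error when the two values differ. The indicator (total_sales), the field name "amount" and the sample rows are assumptions made for this example; they are not taken from the thesis.

```python
from decimal import Decimal


def total_sales(rows):
    """Hypothetical performance indicator: sum of the 'amount' field."""
    return sum((Decimal(r["amount"]) for r in rows), Decimal("0"))


def detect_transfer_error(source_rows, warehouse_rows, kpi=total_sales):
    """Return True when the indicator differs between source and destination."""
    return kpi(source_rows) != kpi(warehouse_rows)


# Illustrative data only: the second load has a corrupted amount for id 2.
source = [{"id": 1, "amount": "120.50"}, {"id": 2, "amount": "80.00"}]
loaded_ok = [{"id": 1, "amount": "120.50"}, {"id": 2, "amount": "80.00"}]
loaded_bad = [{"id": 1, "amount": "120.50"}, {"id": 2, "amount": "8.00"}]

if __name__ == "__main__":
    print(detect_transfer_error(source, loaded_ok))
    print(detect_transfer_error(source, loaded_bad))
```

    A single aggregate indicator is cheap to compute but can miss errors that cancel each other out, so in practice several indicators would likely be checked together.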

    Keywords:

    Business intelligence systems, extract, transform and load (ETL), fault tolerance, redundancy, performance indicators, data warehouse

    Data is one of the most valuable assets of any organization. Business intelligence systems [1] make it possible to realize the real value of this data by using tools to convert data into information and then into knowledge. In a business intelligence system, the data that exists in different parts of the organization in various formats is moved to the data warehouse [3] through a process of extraction, transformation, cleaning and loading (ETL) [3], and analytical tools then provide appropriate reports to users. The ETL process accounts for almost 70% of the entire implementation of a business intelligence system, so the correctness of the whole system depends largely on the correctness of the ETL part. Business intelligence systems use performance indicators [4] to evaluate the quantitative and qualitative status of different parts of the organization. In fact, they use these indicators to monitor and control the overall status of the organization.

    Previous work on improving the reliability [5] and fault tolerance [6] of business intelligence systems is limited to the use of redundancy techniques, without reference to any error detection method. In this research, we propose a new mechanism for detecting errors based on performance indicators, taking into account the standard, commonly used architectures of business intelligence systems, in order to increase reliability and fault tolerance. We present a general, centralized software method for checking the correctness of information transfer at the various stages of transfer from the information systems to the business intelligence system. Combined with redundancy techniques, it is very effective in increasing system reliability and has several advantages, such as flexibility for changes and extension, accurate localization of errors, and general applicability to any project that needs to transfer information from one environment to another.
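
    The combination of indicator-based detection with redundancy can be sketched as follows: several redundant transfer modules are tried in order, and the first one whose output reproduces the source indicator is treated as healthy. This is only a minimal sketch under assumed names (select_healthy_module, the two toy transfer functions and the sum-based indicator are hypothetical); the thesis's actual selection mechanism is not described in this excerpt.

```python
def amount_sum(rows):
    """Hypothetical performance indicator: sum of integer amounts."""
    return sum(r["amount"] for r in rows)


def faulty_transfer(rows):
    """Toy redundant module that silently drops the last record."""
    return [dict(r) for r in rows[:-1]]


def healthy_transfer(rows):
    """Toy redundant module that copies every record unchanged."""
    return [dict(r) for r in rows]


def select_healthy_module(modules, source_rows, kpi):
    """Return (name, output) of the first module whose output matches the
    indicator computed on the source, or (None, None) if none does."""
    expected = kpi(source_rows)
    for name, transfer in modules:
        output = transfer(source_rows)
        if kpi(output) == expected:
            return name, output
    return None, None


source = [{"id": 1, "amount": 100}, {"id": 2, "amount": 250}]
modules = [("primary", faulty_transfer), ("backup", healthy_transfer)]

if __name__ == "__main__":
    name, _ = select_healthy_module(modules, source, amount_sum)
    print(name)
```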

    Problem statement

    Decision-support and decision-making systems play a strategic role in the success of a business, so the accuracy and availability of their information are very important. Generally, hardware redundancy solutions are used to increase the reliability of such systems. They are easier to implement and less complex than software solutions, but one of their general weaknesses is that their correctness cannot be guaranteed, especially for the problem at hand, i.e. transferring data from one environment to another. For example, consider transferring a number of records from a source to a destination. A hardware solution focuses on the transfer operation itself and has no understanding of the nature of the data or its values: if a value in a record changes during the transfer, the system will not notice. The proposed software method, in contrast, detects any change or error by comparing the transferred values.
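
    The record-transfer example can be made concrete with a small sketch. A check that looks only at the transfer operation (here, a record count) passes even though one value was altered, while a value-level comparison reports the affected record. The function names, the key field "id" and the sample records are assumptions for illustration.

```python
def count_check(source_rows, dest_rows):
    """Transfer-level check: only confirms that the same number of
    records arrived, without looking at their values."""
    return len(source_rows) == len(dest_rows)


def changed_records(source_rows, dest_rows, key="id"):
    """Value-level check: return the keys of source records that are
    missing from the destination or differ from their copy there."""
    dest_by_key = {r[key]: r for r in dest_rows}
    return [r[key] for r in source_rows if dest_by_key.get(r[key]) != r]


# Illustrative data: the quantity of record 2 changed from 5 to 50 in transit.
source = [{"id": 1, "qty": 10}, {"id": 2, "qty": 5}]
dest = [{"id": 1, "qty": 10}, {"id": 2, "qty": 50}]

if __name__ == "__main__":
    print(count_check(source, dest))
    print(changed_records(source, dest))
```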

    Research objective

    Our goal in this research is to present a software solution that can be implemented in any business that needs to transfer information from one environment to another. The method uses the mapping between the source environment's tables and the destination tables, and it follows the comparison-based diagnosis methodology. Because information transfer takes place in businesses where information systems are already established, the existing mappings and performance indicators can be used as well.
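
    One way to picture the use of a source-to-destination table mapping is the sketch below, which walks a mapping and compares an aggregate of each mapped column on both sides. It uses Python's standard sqlite3 module with in-memory databases; the table names, column names, the choice of SUM as the indicator and the sample rows are all hypothetical and are not taken from the thesis.

```python
import sqlite3

# Hypothetical mapping: (source table, column) -> (destination table, column).
TABLE_MAP = {
    ("sales", "amount"): ("fact_sales", "sales_amount"),
}


def column_sum(conn, table, column):
    """Sum one column. Identifiers come from the trusted mapping above,
    never from user input, so they are interpolated directly."""
    query = f"SELECT COALESCE(SUM({column}), 0) FROM {table}"
    (value,) = conn.execute(query).fetchone()
    return value


def mismatched_mappings(src_conn, dst_conn, table_map=TABLE_MAP):
    """Return the (source table, destination table) pairs whose sums differ."""
    bad = []
    for (s_table, s_col), (d_table, d_col) in table_map.items():
        if column_sum(src_conn, s_table, s_col) != column_sum(dst_conn, d_table, d_col):
            bad.append((s_table, d_table))
    return bad


def demo():
    """Build two tiny in-memory databases with one corrupted destination row."""
    src = sqlite3.connect(":memory:")
    dst = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE sales (id INTEGER, amount INTEGER)")
    dst.execute("CREATE TABLE fact_sales (id INTEGER, sales_amount INTEGER)")
    src.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 100), (2, 250)])
    dst.executemany("INSERT INTO fact_sales VALUES (?, ?)", [(1, 100), (2, 205)])
    return mismatched_mappings(src, dst)


if __name__ == "__main__":
    print(demo())
```

    Because the check is driven by the mapping, reporting the mismatched pair also indicates which table transfer the error occurred in.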

    Scope of research

    In this research, we restrict error detection and the improvement of system reliability to redundancy solutions, which we divide into two general categories: hardware and software. As for the implementation environment, the method can be used wherever information needs to be transferred from a (preferably relational) source environment to a destination environment with a relational structure. It is also assumed that the selected business environment has operational information systems and that the organization's performance indicators are well defined.

    Structure of the thesis

    The structure of this thesis is as follows. The second chapter gives an introduction to business intelligence systems, basic definitions of the data warehouse and its architecture, the ETL process and its data flow, and various aspects of a business intelligence system such as management analytical reports [7], data mining [8] and dashboard reports [9]. The third chapter introduces dependable systems, system reliability and dependability [10], and the characteristics and indicators of dependability, which include availability, safety, etc., along with various software and hardware redundancy solutions for increasing reliability, focusing on the software side; it then briefly covers the organization's performance indicators and system performance evaluation methods. Finally, the fourth chapter presents the proposed method and the fifth chapter gives the conclusion.

    Business intelligence systems

    Information systems [11] were responsible for the information support of organizations for a long time. Over time, their weaknesses became apparent: they could not support decision-making in critical situations, could not provide the conditions for predicting the future of the business, and lacked multidimensional analytical reports and the ability to infer specific information and knowledge from system data. To compensate for these shortcomings, business intelligence systems [12] were proposed. Using data from different parts of the organization and some related external data, such systems can perform important tasks: intelligent forecasting of the business environment, in-depth market forecasting and analysis, proper customer relationship management, analytical and comprehensive reporting based on integrated data collected from across the organization, and, finally, facilitating decision-making. The figure below shows the development and evolution of information systems.

    Business intelligence requires the right environment and conditions in the organization. These include a prevailing culture of decision-making based on data and on the knowledge derived from the system, and training decision-support and decision-making managers in analysis-based management methods in place of the traditional production-only view. By bringing the organization's existing data, such as purchasing, sales, finance, etc., under a centralized, integrated system and using special tools [13], business intelligence makes it possible to analyze organizational processes in order to improve decision-making. In fact, the main goal of business intelligence systems is to provide the right information for the right decision at the right time.

  • Contents & References of Providing an optimal error detection model in the ETL process

    Contents:

    Acknowledgments

    Abstract

    Table of contents

    List of tables

    List of figures

    Chapter One: Introduction

    Introduction

    Problem statement

    Research objective

    Scope of research

    Thesis structure

    Chapter Two: An overview of business intelligence systems

    Introduction

    Business intelligence systems

    Data warehouse

    Data flow architecture

    System architecture

    Data integrity

    Load frequency

    Dimensional data source

    Normalized data source

    Master Data Management (MDM)

    ETL [5 and 6]

    Views and architecture of ETL

    Analytical reports

    Data mining

    Reporting service

    Conclusion

    Chapter Three: Dependable systems

    Introduction

    Dependability

    Reliability

    Availability

    Safety

    Maintainability

    Testability

    Security

    Faults, errors and failures

    Efficiency

    Fault tolerance

    Redundancy

    Hardware redundancy

    Static hardware redundancy

    Active hardware redundancy

    Duplication with Comparison technique

    Standby sparing technique

    Pair-and-a-Spare technique

    Watchdog Timers

    Hybrid hardware redundancy

    Software fault tolerance

    Single-version software fault tolerance techniques

    Fault detection

    Fault containment

    Fault recovery

    Multi-version software fault tolerance techniques

    Design diversity

    Recovery blocks

    N-version programming

    N self-checking programming

    Distributed Recovery Blocks

    Consensus Recovery Blocks

    Acceptance voting

    Performance indicators

    Common methods for evaluating system reliability

    Series-parallel reduction

    Pivotal decomposition

    Generating minimal paths and minimal cuts

    Connection matrix

    Node removal method for generating minimal paths

    Generating minimal cuts from minimal paths

    Inclusion-exclusion method

    Sum of disjoint products method

    Disjoint terms: addition law

    Chapter Four: Proposed method

    Introduction

    A case study implemented in Khuzestan Steel Company

    Chapter Five: Conclusion and future work

    Conclusion

    Future work

    References
