Document Details

SensibleGardenia

Uploaded by SensibleGardenia

University of Northern Philippines

Tags

data warehousing inmon approach data management

Full Transcript

Inmon’s Methodolo gy Inmon Approach Bill Inmon, often regarded as the "father of data warehousing," proposes a different perspective. Inmon's approach advocates for building an enterprise-wide data warehouse that serves as a single source of truth for an organization. This data warehouse is compose...

Inmon’s Methodolo gy Inmon Approach Bill Inmon, often regarded as the "father of data warehousing," proposes a different perspective. Inmon's approach advocates for building an enterprise-wide data warehouse that serves as a single source of truth for an organization. This data warehouse is composed of normalized data structures, creating a strong foundation for consistency and integrity. Inmon Approach The Inmon methodology excels in environments where data governance and a centralized approach are critical. It serves as a reliable foundation for complex reporting and strategic decision-making, providing a comprehensive view of the entire organization. Inmon Approach The Inmon approach focuses on integrating data from various sources into a comprehensive data model. It avoids data redundancy, ensuring that data anomalies are minimized. This normalized clearly understanding of data relationships and dependencies, promoting data integrity and accuracy. Bill Inmon Approach The approach introduced by Bill Inmon is known as the Inmon approach of Data Warehousing. According to him, a Data warehouse is a “subject-oriented, nonvolatile, integrated, time-variant collection of data in support of management’s decisions.” He was the first to write a book, to hold a conference, to write a column in a magazine, and to start classes on Data Warehousing. His approach can be understood as more corporate data model-oriented. It takes care of the customer, product, and vendor. Diagramatic representation of Inmon Approach In the diagram we can see the steps of Data warehousing according to the Inmon approach, 1. Data extraction from different sources, may it be financial data, academic data, business data, or anything else. 2. Next during the data transformation step all the data cleaning, imputation of null values, de-duplication, normalization processes are done. 3. Then the normalization data is loaded into the Data Warehouse. 4. From Data Warehouse the data is further divided into multiple data marts based on different department needs Functions of Data Warehouse Data cleaning Data integration Data mapping Data extraction Data transformation Data loading Data refreshing Bill Inmon’s Data Warehouse Architecture Corporate Applications. These applications are operational systems or transaction systems that develop to support business. Transaction systems collect data from business transactions such as sales, marketing, material management and stored those data in various forms, including relational data, hierarchical data, or even spreadsheets. In Inmon’s architecture, transaction systems are also called source systems that provide data to the data warehouse. ETL Processes. To bring data from the transaction system, a process called ETL is used. ETL stands for extract, transform and load. ETL process consolidates data, transforms it into a specific standard format, and loads it into a single repository called an enterprise data warehouse or EDW. ETL processes can run as a batch process periodically or a transaction- based for near real-time data. The ETL process is referred to as data integration or data services. Enterprise data warehouse is a central element in Inmon’s data warehouse architecture. As Inmon’s data warehouse definition, an enterprise data warehouse is an integrated repository of atomic data. Data in the enterprise data warehouse is captured at the very lowest level of detail. Data in the enterprise data warehouse is stored in a relational database and uses a third normal database design. Data marts are departmental views of information with subject-oriented data. Data marts take data from the enterprise data warehouses. Aggregations can take place when data brings from the enterprise data warehouses to data marts. Data marts use dimensional design. Therefore, the data in the data marts is ready for analysis. It is important to note that all the external applications or reporting tools or business intelligence tools query data marts instead of enterprise data warehouses. Data Warehouse Models 1. Data mart 2. Virtual data 3. Enterprise data warehouse Data Warehouse Models A data mart model is used for business-line specific reporting and analysis. In this data warehouse model, data is aggregated from a range of source systems relevant to a specific business area, such as sales or finance. Data Warehouse Models A virtual data warehouse is a set of separate databases, which can be queried together, so a user can effectively access all the data as if it was stored in one data warehouse. Data Warehouse Models An enterprise data warehouse model prescribes that the data warehouse contain aggregated data that spans the entire organization. This model sees the data warehouse as the heart of the enterprise’s information system, with integrated data from all business units. Characteristics of Data Warehouse Subject-Oriented: A data warehouse uses a theme, and delivers information about a specific subject instead of a company’s current operations. In other words, the data warehousing process is more equipped to handle a specific theme. Examples of themes or subjects include sales, distributions, marketing, etc. Characteristics of Data Warehouse Integrated: Integration is defined as establishing a connection between large amount of data from multiple databases or sources. However, it is also essential for the data to be stored in the data warehouse in a unified manner. The process of data warehousing integrates data from multiple sources, such as a mainframe, relational databases, flat files, etc. Furthermore, it helps maintain consistent codes, attribute measures, naming conventions, and, formats. Characteristics of Data Warehouse Time-variant: Time-variant in a Data Warehouse is more extensive as compared to other operating systems. Data stored in a data warehouse is recalled with a specific time period and provides information from a historical perspective. Characteristics of Data Warehouse Non-volatile: In the non-volatile data warehouse, data is permanent when new data is inserted, previous data is not replaced, omitted, or deleted. In this data warehouse, data is read-only and only refreshes at certain intervals. The two data operations performed in the data warehouse are data access and data loading. Data Warehouse vs. Database Data Warehouse Database Data warehouse serves as an A database is an amalgamation of information system that contains related data. historical and communicative data from one or several sources. A data warehouse is used for analyzing A database is used for recording data. data. Data warehouse is the subject-oriented A database is an application-oriented collection of data. collection of data. Data warehouse uses Online Analytical A database uses Online Transactional Processing(OLAP). Processing(OLTP). Data warehouse tables and joins are Database tables and joins are denormalized, hence simpler. normalized, therefore, more complicated. Data modeling techniques are used for ER modeling techniques are used for designing. designing. Normalization and Denormalization Normalization is the technique of dividing the data into multiple tables to reduce data redundancy and inconsistency and to achieve data integrity. On the other hand, Denormalization is the technique of combining the data into a single table to make data retrieval faster. The two Warehouse concepts: Bill Inmon vs. Ralph Kimball Inmon vs. Kimball Comparison Inmon Kimball Need Longer time scale Immediate Drive Enterprise Business area Budget Larger Budget Smaller budget Requirements More stable and Volatile growing Customer Corporate User base Sources changeable Stable Startup cost Higher Lower Projects Cheaper than start Same cost as start up up Bill Inmon recommends building a data warehouse that follows the top-down approach. In Inmon’s philosophy, it is starting with building a big centralized enterprise data warehouse where all available data from transaction systems are consolidated into a subject- oriented, integrated, time-variant, and non-volatile collection of data that supports decision making. then data marts are built for the analytic needs of departments. The Bill Inmon design approach uses the normalized form for building entity structure, avoiding data redundancy as much as possible. This results in clearly identifying business requirements and preventing any data update irregularities. Moreover, the advantage of this top-down approach in database design is that it is robust to business changes and contains a dimensional perspective of data across data mart. Another Inmon Architecture Advantages of Inmon Method Data warehouse acts as a unified source of truth for the entire business, where all data is integrated. This approach has very low data redundancy. So, there’s less possibility of data update irregularities, making the ETL-concept based data warehouse process more straightforward and less susceptible to failure. It simplifies business processes, as the logical model represents detailed business objects. This approach offers greater flexibility, as it’s easier to update the data warehouse in case there’s any change in the business requirements or source data. It can handle diverse enterprise-wide reporting requirements. Disadvantages of Inmon Method Complexity increases as multiple tables are added to the data model with time. Resources skilled in data warehouse data modeling are required, which can be expensive and challenging to find. The preliminary setup and delivery are time-consuming. Additional ETL process operation is required since data marts are created after the creation of the data warehouse. This approach requires experts to manage a data warehouse effectively. Contrast to Bill Inmon approach, Ralph Kimball recommends building the data warehouse that follows the bottom-up approach. In Kimball’s philosophy, it first starts with mission-critical data marts that serve the analytic needs of departments. Then it is integrating these data marts for data consistency through a so-called information bus. Kimball makes uses of the dimensional model to address the needs of departments in various areas within the enterprise. Initiated by Ralph Kimball, the Kimball data model follows a bottom-up approach to data warehouse architecture design in which data marts are first formed based on the business requirements. The primary data sources are then evaluated, and an Extract, Transform and Load (ETL) tool is used to fetch data from several sources and load it into a staging area of the relational database server. Once data is uploaded in the data warehouse staging area, the next phase includes loading data into a dimensional data warehouse model that’s denormalized by nature. Kimball Data Warehouse Architecture Advantages of Kimball Method Kimball dimensional modeling is fast to construct as no normalization is involved, which means swift execution of the initial phase of the data warehousing design process. An advantage of star schema is that most data operators can easily comprehend it because of its denormalized structure, which simplifies querying and analysis. Data warehouse system footprint is trivial because it focuses on individual business areas and processes rather than the whole enterprise. So, it takes less space in the database, simplifying system management. Advantages of Kimball Method It enables fast data retrieval from the data warehouse, as data is segregated into fact tables and dimensions. For example, the fact and dimension table for the insurance industry would include policy transactions and claims transactions. A smaller team of designers and planners is sufficient for data warehouse management because data source systems are stable, and the data warehouse is process-oriented. Also, query optimization is straightforward, predictable, and controllable. Conformed dimensional structure for data quality framework. The Kimball approach to data warehouse lifecycle is also referred to as the business dimensional lifestyle approach because it allows business intelligence tools to deeper across several star schemas and generates reliable insights. Disadvantages of Kimball Method Data isn’t entirely integrated before reporting; the idea of a ‘single source of truth is lost.’ Irregularities can occur when data is updated in Kimball DW architecture. This is because in denormalization technique, redundant data is added to database tables. In the Kimball DW architecture, performance issues may occur due to the addition of columns in the fact table, as these tables are quite in-depth. The addition of new columns can expand the fact table dimensions, affecting its performance. Also, the dimensional data warehouse model becomes difficult to alter with any change in the business needs. Disadvantages of Kimball Method As the Kimball model is business process- oriented, instead of focusing on the enterprise as a whole, it cannot handle all the BI reporting requirements. The process of incorporating large amounts of legacy data into the data warehouse is complex. Thank You.

Use Quizgecko on...
Browser
Browser