DATA WAREHOUSING


What is Data Warehousing?

Data Warehousing (DW) is the process of collecting and managing data from varied sources to provide meaningful business insights. A data warehouse is typically used to connect and analyze business data from heterogeneous sources. It is the core of a business intelligence (BI) system, which is built for data analysis and reporting.

A data warehouse is a blend of technologies and components that supports the strategic use of data. It is the electronic storage of a large amount of business information, designed for query and analysis rather than transaction processing. Data warehousing transforms data into information and makes it available to users in a timely manner so that it can inform decisions.

A data warehouse system is also known by the following names:

  • Decision Support System (DSS)
  • Executive Information System
  • Management Information System
  • Business Intelligence Solution
  • Analytic Application


How does a Data Warehouse work?

A data warehouse works as a central repository where information arrives from one or more data sources. Data flows into a data warehouse from transactional systems and other relational databases.

Data may be:

  1. Structured
  2. Semi-structured
  3. Unstructured

The data is processed, transformed, and ingested so that users can access the processed data in the data warehouse through Business Intelligence tools, SQL clients, and spreadsheets. A data warehouse merges information coming from different sources into one comprehensive database.
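
The flow described above can be illustrated with a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for a warehouse. The source records, the table names, and the functions build_warehouse and revenue_by_customer are hypothetical and chosen only for illustration; a real warehouse would rely on dedicated ETL tooling and a far larger schema.

```python
import sqlite3

# Hypothetical source 1: customer records from a CRM-like system.
crm_rows = [
    {"customer_id": 1, "name": "Ada", "country": "uk"},
    {"customer_id": 2, "name": "Grace", "country": "US"},
]

# Hypothetical source 2: order lines exported as text fields.
order_rows = [
    ("1", "2024-01-05", "20.0"),
    ("2", "2024-01-06", "5.0"),
    ("1", "2024-02-01", "40.0"),
]


def build_warehouse(conn):
    """Transform both sources and load them into one database."""
    conn.execute(
        "CREATE TABLE customers ("
        "customer_id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "customer_id INTEGER, order_date TEXT, amount REAL)"
    )
    # Transform: normalize country codes to upper case before loading.
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(r["customer_id"], r["name"], r["country"].upper()) for r in crm_rows],
    )
    # Transform: convert text fields to proper numeric types before loading.
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(int(cid), day, float(amount)) for cid, day, amount in order_rows],
    )
    conn.commit()


def revenue_by_customer(conn):
    """Query the merged data the way a BI tool or SQL client might."""
    return conn.execute(
        "SELECT c.name, c.country, SUM(o.amount) "
        "FROM customers c JOIN orders o ON o.customer_id = c.customer_id "
        "GROUP BY c.customer_id ORDER BY c.customer_id"
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_warehouse(conn)
report = revenue_by_customer(conn)
print(report)
```

Once both sources sit in the same database, a single query can combine them, which is the practical benefit of merging the data in one place.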

By merging all of this information in one place, an organization can analyze its customers more holistically and be confident that it has considered all the information available. Data warehousing also makes data mining possible. Data mining is the search for patterns in the data that may lead to higher sales and profits.

Types of Data Warehouse

The three main types of data warehouse (DWH) are:

1. Enterprise Data Warehouse (EDW):

An Enterprise Data Warehouse (EDW) is a centralized warehouse. It provides decision support services across the enterprise and offers a unified approach for organizing and representing data. It also provides the ability to classify data by subject and to grant access according to those divisions.

2. Operational Data Store (ODS):

An Operational Data Store (ODS) is a data store used when neither the data warehouse nor the OLTP systems support an organization's reporting needs. In an ODS, the data is refreshed in real time. Hence, it is widely preferred for routine activities such as storing employee records.

3. Data Mart:

A data mart is a subset of the data warehouse. It is specially designed for a particular line of business, such as sales or finance. In an independent data mart, data can be collected directly from the sources.
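
One simple way to picture a data mart as a subset of the warehouse is a view that exposes only one line of business. The sketch below uses Python's standard-library sqlite3 module; the table warehouse_facts, the view sales_mart, and the function build_sales_mart are hypothetical names, and real data marts are often separate physical stores rather than views.

```python
import sqlite3


def build_sales_mart(conn):
    """Create a small warehouse table and a sales-only view on top of it."""
    conn.execute(
        "CREATE TABLE warehouse_facts ("
        "department TEXT, metric TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
        [
            ("sales", "revenue", 1200.0),
            ("finance", "expenses", 800.0),
            ("sales", "units_sold", 35.0),
        ],
    )
    # The mart exposes only the rows that belong to the sales department.
    conn.execute(
        "CREATE VIEW sales_mart AS "
        "SELECT metric, value FROM warehouse_facts "
        "WHERE department = 'sales'"
    )


conn = sqlite3.connect(":memory:")
build_sales_mart(conn)
sales_rows = conn.execute(
    "SELECT metric, value FROM sales_mart ORDER BY metric"
).fetchall()
print(sales_rows)
```

Users of the sales mart see only sales data, while the finance rows remain in the underlying warehouse table.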

Who Needs a Data Warehouse?

A data warehouse (DWH) is needed by all types of users, such as:

  • Decision makers who rely on massive amounts of data
  • Users who use customized, complex processes to obtain information from multiple data sources
  • People who want simple technology to access the data
  • People who want a systematic approach to making decisions
  • Users who need fast performance on huge amounts of data for reports, grids, or charts
  • Users who want to discover ‘hidden patterns’ of data flows and groupings, for which a data warehouse is a first step

Why Do We Need a Data Warehouse? Advantages & Disadvantages

Advantages of Data Warehouse (DWH):

  • A data warehouse allows business users to quickly access critical data from multiple sources in one place, which saves the time otherwise spent retrieving it from each source.
  • It provides consistent information on various cross-functional activities.
  • It supports ad-hoc reporting and queries.
  • It helps to integrate many sources of data, which reduces stress on the production system.
  • It helps to reduce the total turnaround time for analysis and reporting.
  • Restructuring and integration make the data easier to use for reporting and analysis.
  • It stores a large amount of historical data, which helps users analyze different time periods and trends to make future predictions.
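
As an illustration of analyzing historical data across time periods, the following minimal sketch groups stored sales by month. It uses Python's standard-library sqlite3 module; the table sales_history, its sample rows, and the function monthly_totals are hypothetical and serve only as an example.

```python
import sqlite3


def monthly_totals(conn):
    """Sum sales per calendar month, ordered from oldest to newest."""
    return conn.execute(
        "SELECT strftime('%Y-%m', sale_date) AS month, SUM(amount) "
        "FROM sales_history GROUP BY month ORDER BY month"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_history (sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_history VALUES (?, ?)",
    [
        ("2023-11-03", 100.0),
        ("2023-11-20", 50.0),
        ("2023-12-01", 200.0),
        ("2024-01-15", 125.0),
    ],
)
totals = monthly_totals(conn)
print(totals)
```

Because the history is kept rather than overwritten, the same query can be rerun over any range of months to compare periods.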

Disadvantages of Data Warehouse:

  • A data warehouse is not an ideal option for unstructured data.
  • Creating and implementing a data warehouse is a time-consuming affair.
  • A data warehouse can become outdated relatively quickly.
  • It is difficult to make changes to data types and ranges, data source schemas, indexes, and queries.
  • A data warehouse may seem simple, but it is often too complex for average users.
  • Despite best efforts at project management, the scope of a data warehousing project tends to increase.
  • Warehouse users will sometimes develop different business rules.
  • Organizations need to spend considerable resources on training and implementation.

Data Warehouse Tools

  1. MarkLogic

  2. Oracle

  3. Amazon Redshift
