Tuesday, January 21, 2020

Data Warehouse and Data Mart

Data warehouse vs Data martData warehouse and Data mart are used as a data repository and serve the same purpose. These can be differentiated through the quantity of data or information they stores. The vital difference between a data warehouse and a data mart is that a data warehouse is a database that stores information-oriented to satisfy decision-making requests whereas data mart is complete logical subsets of an entire data warehouse.

In simple words, a data mart is a data warehouse limited in scope and whose data can be obtained through summarizing and selecting the data from the data warehouse or with the help of distinct extract, transform and load processes from source data system.

Comparison Chart

Basis for comparisonData WarehouseData Mart
BasicData warehouse is application independent.Data mart are specific to decision support system application.
Type of system CentralisedDecentralised
Form of dataDetailedSummarized
Use of denormalisationThe data is slightly denormalised.The data is highly denormalised.
Data modelTop-downBottom-up
NatureFlexible, data-oriented and long life.Restrictive, project-oriented and short life.
Type of schema usedFact constellationStar and snowflake
Ease of buildingHard to buildSimple to build

Definition of Data Warehouse

The term data warehouse means a time-variant, subject-oriented, nonvolatile, and an integrated group of data that assist in decision-making process of the management. Alternatively, it a repository of information gathered from multiple sources, stored in a unified schema, at a sole site that allows integration of a variety of application systems. Once this data is collected it is stored for a long time, hence has a long life and permit access to historic information. Data warehouse architecture with staging area and data marts
Consequently, data warehouse provides the user with a single integrated interface to the data through which user can write decision-support queries easily. Data warehouse helps in turning the data into information. Designing a data warehouse includes top-down approach.
It gathers information about subjects that span the entire organisation, such as customers, sales, assets, items, and therefore its range is enterprise-wide. Generally, fact constellation schema is used in it, which covers a wide variety of subjects. A data warehouse is not a static structure and it’s evolving continuously.

Definition of Data Mart

A data mart can be called as a subset of a data warehouse or a sub-group of corporate-wide data corresponding to a certain set of users. Data warehouse involves several departmental and logical data marts which must be persistent in their data illustration to ensure the robustness of a data warehouse. A data mart is a set of tables that concentrate on a single task these are designed using a bottom-up approach.data mart developmentData mart extent is restricted to some specific chosen subject, thus its scope is department-wide. These are usually implemented on low-cost departmental servers. The implementation cycle of data marts is monitored in weeks instead of month and year.
As star and snowflake schema are driven towards single subject modelling that is why these are commonly used in the data mart. Although, the star schema is more popular than snowflake schema. Depending on the data source the data marts can be classified into two types: dependent and independent data marts.

Key Differences Between Data Warehouse and Data Mart

  1. Data warehouse is application independent whereas data mart is specific to decision support system application.
  2. The data is stored in a single, centralised repository in a data warehouse. As against, data mart stores data decentrally in the user area.
  3. Data warehouse contains a detailed form of data. In contrast, data mart contains summarized and selected data.
  4. The data in a data warehouse is slightly denormalised while in case of Data mart it is highly denormalised.
  5. The construction of data warehouse involves top-down approach. Conversely, while constructing a data mart the bottom-up approach is used.
  6. Data warehouse is flexible, information-oriented and longtime existing nature. On the contrary, a data mart is restrictive, project-oriented and has a shorter existence.
  7. Fact constellation schema is usually used for modelling a data warehouse whereas in data mart star schema is more popular.

Conclusion

Data warehouse provides enterprise view, single and centralised storage system, inherent architecture and application independency while Data mart is a subset of a data warehouse which provides department view, decentralised storage. As data warehouse is very large and integrated, it has a high risk of failure and difficulty in building it. On the other hand, the data mart is easy to build and associated failure risk is also less but data mart could experience fragmentation.

No comments:

Post a Comment