DataWarehousingGuide.com


Business Intelligence & Data warehousing Blog

Data mining, data warehousing, metadata management, BI tools, multidimensional expressions (MDX), business intelligence trends, data modelling, open source BI, Microsoft BI solutions, Reporting Services (SSRS), Analysis Services (SSAS), Integration Services (SSIS) ...

Data warehouse vs Data mart

Data mart and data warehouse are two BI terms that any two BI experts are likely to define differently. The long-running debate between Ralph Kimball and Bill Inmon, the two titans of data warehousing, only adds to the confusion.

Let me try to briefly explain what the two terms mean.

What is a data mart?

A Data Mart is a subject-oriented repository of data designed to answer specific questions for a specific set of users. An organization could therefore have multiple data marts serving the needs of marketing, sales, operations, collections, etc. A data mart is usually organized as a single dimensional model: a star schema (the structure an OLAP cube is commonly built on) made of a fact table and multiple dimension tables.
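To make the star schema concrete, here is a minimal sketch of a sales data mart using Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_date), the columns, and the sample rows are all made up for illustration; a real mart would have many more dimensions and attributes.

```python
import sqlite3


def build_sales_mart():
    """Create an in-memory star schema: one fact table, two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        -- The fact table holds measures plus a foreign key to each dimension.
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            amount REAL);
    """)
    # Hypothetical sample data.
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Apple", "Fruit"), (2, "Pear", "Fruit"), (3, "Milk", "Dairy")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20240101, 2024, 1), (20240201, 2024, 2)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(1, 20240101, 10.0), (2, 20240101, 5.0),
         (3, 20240201, 7.5), (1, 20240201, 2.5)],
    )
    return conn


def sales_by_category(conn):
    """Answer a typical mart question: total sales per product category."""
    rows = conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(sales_by_category(build_sales_mart()))
```

The point of the layout is that every business question becomes a join from the fact table to one or more dimensions followed by an aggregation.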

What is a data warehouse?

A Data Warehouse (DW) is a single organizational repository of enterprise-wide data across many or all subject areas. It is the authoritative repository of all the fact and dimension data (which is also available in the data marts) at an atomic level.

The above definitions look simple, so where is the confusion? It starts right here: there are two broad schools of thought, led by Kimball and Inmon, that disagree on the details.

Kimball School of thought

Ralph Kimball began with the Data Mart as a dimensional model for departmental data and viewed the Data Warehouse as the enterprise-wide collection of Data Marts. This is the bottom-up approach. You may begin with the Sales Data Mart, after some time put in place the Ops Data Mart, and so on and so forth. If you want, you could have even more specific Data Marts answering specific questions such as customer churn. If you take care of metadata consistency (making sure each departmental Data Mart calls an Apple an Apple) and connectivity, you have a Data Warehouse. So the Data Warehouse is really a virtual collection of Data Marts brought together on a Data Warehouse Bus, and in that sense the data flows from multiple Marts into the Warehouse.
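The "calls an Apple an Apple" requirement can be checked mechanically. The sketch below is only an illustration, not part of Kimball's method as such: it assumes each mart is described by a hypothetical dictionary mapping a dimension name to its tuple of columns, and it reports every dimension that two marts share by name but define differently.

```python
from itertools import combinations


def find_nonconformed(marts):
    """Return (dimension, mart_a, mart_b) for each shared dimension
    that the two marts define with different columns."""
    conflicts = []
    # Compare every pair of marts once, in a stable (alphabetical) order.
    for (name_a, dims_a), (name_b, dims_b) in combinations(sorted(marts.items()), 2):
        for dim in sorted(dims_a.keys() & dims_b.keys()):
            if dims_a[dim] != dims_b[dim]:
                conflicts.append((dim, name_a, name_b))
    return conflicts


# Hypothetical mart descriptions: dimension name -> column tuple.
marts = {
    "sales": {
        "customer": ("customer_id", "name", "region"),
        "product": ("product_id", "name", "category"),
    },
    "ops": {
        "customer": ("customer_id", "name", "region"),
        "product": ("sku", "description"),
    },
}

if __name__ == "__main__":
    for dim, a, b in find_nonconformed(marts):
        print(f"dimension '{dim}' differs between marts '{a}' and '{b}'")
```

Here the customer dimension is consistent across both marts, while the product dimension would have to be reconciled before the two marts could sit on the same bus.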

Inmon School of thought

Inmon’s approach is the exact opposite: it avoids the problem of metadata consistency by treating the Enterprise Data Warehouse as a single repository that feeds subject-oriented Data Marts. You still have your Sales, Marketing, Ops and Churn Data Marts containing atomic or aggregated information, but they are built from the Data Warehouse and are really subsets of the data contained therein. This is the top-down approach.
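As a rough illustration of the top-down flow, the sketch below derives a sales mart as a view over a central warehouse table, again using sqlite3. The table dw_transactions, the view mart_sales, and the sample rows are assumptions made for the example; real warehouses are normally modelled across many tables rather than one.

```python
import sqlite3


def build_warehouse():
    """Create a tiny in-memory 'warehouse' holding atomic rows for all subject areas."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE dw_transactions (
            txn_id INTEGER PRIMARY KEY,
            subject_area TEXT,
            customer_id INTEGER,
            amount REAL)
    """)
    # Hypothetical sample data.
    conn.executemany(
        "INSERT INTO dw_transactions VALUES (?, ?, ?, ?)",
        [(1, "sales", 100, 20.0), (2, "sales", 101, 30.0),
         (3, "marketing", 100, 5.0), (4, "sales", 100, 10.0)],
    )
    return conn


def derive_sales_mart(conn):
    """Define the sales mart as an aggregated subset of the warehouse and read it."""
    conn.execute("""
        CREATE VIEW mart_sales AS
        SELECT customer_id, SUM(amount) AS total_amount
        FROM dw_transactions
        WHERE subject_area = 'sales'
        GROUP BY customer_id
    """)
    return conn.execute(
        "SELECT customer_id, total_amount FROM mart_sales ORDER BY customer_id"
    ).fetchall()


if __name__ == "__main__":
    print(derive_sales_mart(build_warehouse()))
```

Because the mart is defined purely in terms of the warehouse, it cannot disagree with it about what a customer or an amount is, which is the consistency argument made above.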

Kimball’s approach is easier to implement because you are dealing with smaller subject areas to begin with, but the end result often has metadata inconsistencies and can be a nightmare to integrate. Inmon’s approach, on the other hand, does not defer the integration and consistency issues, but takes far longer to implement (which increases the risk of the project failing).

As a common observation, most organizations that are just starting to do analytics do not have the patience or commitment required for Inmon’s approach. Any BI initiative is extremely iterative in nature. Unless you are confident that you will still have the CEO’s buy-in and a budget one year down the line, it might be better to begin with a Data Mart (to start delivering, and to manage expectations), keeping the metadata consistency requirements in mind, and then scale towards the Data Warehouse.

I find these arguments worth reading and enjoy listening to the experts debate, because it only adds to our knowledge. In practice, however, a combination of the two approaches may solve the problem better.

Some interesting posts on this debate, if you want to read further:
Data Warehousing: Similarities and Differences of Inmon and Kimball
Data Warehouse Architecture: The Great Debate
Kimball Vs. Inmon

 

