Data warehouse design & Architecture - General FAQ

Q. What is Architecture?

"Often, architecture and infrastructure are used interchangeably. There’s a subtle difference. Infrastructure consists of applications, hardware, and software. Architecture, on the other hand, refers to the overlying principles and processes that lead the organization’s infrastructure deployment.

"Infrastructure is a snapshot, while architecture is a continuously evolving set of ideas and philosophies. Architecture includes infrastructure, but the two aren’t synonymous."

Q. What is Data Architecture?

Data architecture is "the blueprint of the data within your company."  It "includes things like enterprisewide data models, the meta-data catalog, and notions of data ownership."  Data infrastructure, on the other hand, covers the physical data structures and data transport mechanisms.

Q. What is EAI Architecture?

Enterprise Application Integration (EAI) architectures typically fall into one of two styles: "point-to-point" or "hub-and-spoke".

Organizations often develop piecemeal integrations incrementally, from one application to another; these are the so-called "point-to-point" application integrations.  Such integrations span a variety of tools and may occasionally conflict, because the integrations and the data aren't treated as a coherent, shared resource.

Forward-looking companies work toward a "hub-and-spoke", or enterprise, approach.  Data sources and owners are defined, integrations are centrally controlled at the hub, and target systems are registered.  If this sounds very similar to data warehousing Extraction-Transformation-Load (ETL) architecture, it should -- a datamart likewise molds diverse data into a single "version of the truth."

Suppose the company acquires a new business.  The new business is just one more spoke on the company hub.  Once the spoke is implemented, data from the new business should flow into the hub and be published through the enterprise as necessary.  For example, sales data from the new business should flow into the existing company sales datamart.
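
The acquisition scenario can be sketched in a few lines of Python.  This is only an illustration of the hub-and-spoke idea, not a real EAI product: the Hub class, its method names, the "sales" topic, and the source names core_erp and acquired_co are all hypothetical.

```python
from collections import defaultdict


class Hub:
    """Hypothetical hub: knows every data source, its owner, and every target."""

    def __init__(self):
        self.owners = {}                  # source name -> owning business unit
        self.targets = defaultdict(list)  # topic -> registered delivery functions

    def register_source(self, source, owner):
        """Define a data source (a spoke) and who owns it."""
        self.owners[source] = owner

    def register_target(self, topic, deliver):
        """Register a target system that wants records for a topic."""
        self.targets[topic].append(deliver)

    def publish(self, source, topic, record):
        """Route one record from a registered source to every target of the topic."""
        if source not in self.owners:
            raise KeyError(f"unregistered source: {source}")
        stamped = {**record, "source": source}  # keep lineage on the record
        for deliver in self.targets[topic]:
            deliver(stamped)
        return len(self.targets[topic])


# The existing sales datamart is a registered target (a plain list here).
sales_datamart = []
hub = Hub()
hub.register_target("sales", sales_datamart.append)

# An existing spoke publishes sales data through the hub.
hub.register_source("core_erp", owner="Finance")
hub.publish("core_erp", "sales", {"order_id": 1, "amount": 120.0})

# The acquired business is just one more spoke; no target has to change.
hub.register_source("acquired_co", owner="Acquired Co. Sales")
hub.publish("acquired_co", "sales", {"order_id": 9001, "amount": 75.5})
```

In this sketch, adding the new business takes one register_source call; the sales datamart receives its records without any new application-to-application integration.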

Q. What is ETL Architecture?

ETL architecture connects three areas:  source, staging, and target.  Sources are typically one or more OLTP systems. Staging is a common holding and work area.  Targets are typically a data warehouse or mart.
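
As a rough sketch of how rows might move through the three areas, the Python below uses three in-memory SQLite databases to stand in for an OLTP source, a staging area, and a warehouse target.  The table names (orders, stg_orders, fact_orders), the columns, and the cleanup steps are assumptions made for illustration only.

```python
import sqlite3


def source_to_stage(source, stage, source_name):
    """Read the source rows and copy them, unchanged, into the staging area."""
    rows = source.execute("SELECT id, customer, amount FROM orders").fetchall()
    stage.executemany(
        "INSERT INTO stg_orders (source_name, id, customer, amount) "
        "VALUES (?, ?, ?, ?)",
        [(source_name, *row) for row in rows],
    )
    return len(rows)


def stage_to_warehouse(stage, warehouse):
    """Standardize the staged rows and load them into the target."""
    rows = stage.execute(
        "SELECT source_name, id, customer, amount FROM stg_orders"
    ).fetchall()
    standardized = [
        (f"{src}-{oid}", customer.strip().upper(), round(float(amount), 2))
        for src, oid, customer, amount in rows
    ]
    warehouse.executemany(
        "INSERT INTO fact_orders (order_key, customer, amount) VALUES (?, ?, ?)",
        standardized,
    )
    return len(standardized)


# Source: a hypothetical OLTP system with untidy data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, " acme ", "19.999"), (2, "Globex", "5")],
)

# Staging: a common holding and work area.
stage = sqlite3.connect(":memory:")
stage.execute(
    "CREATE TABLE stg_orders (source_name TEXT, id INTEGER, customer TEXT, amount TEXT)"
)

# Target: a data warehouse or mart.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_orders (order_key TEXT, customer TEXT, amount REAL)"
)

staged_count = source_to_stage(source, stage, "oltp")
loaded_count = stage_to_warehouse(stage, warehouse)
```

Keeping the first step a plain copy means the source system is only read once, and all reshaping happens against the staging area.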

"The source to stage component is intended to focus the efforts of reading the source data (sourcing) and replicating the data to the staging area."

"The stage to warehouse component focuses the effort of standardizing and centralizing the data from the source systems into a single view of the organization’s information. This centralized target can be a data warehouse, data mart, operational data store, customer list store, reporting database or any other reporting/data environment."

The best practice is "to consolidate the business rules into the stage to warehouse ETL component".  This practice can tie ETL architecture to EAI architecture: keeping the rules in one centrally controlled component mirrors the way a hub centralizes integrations.
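
One way to picture this consolidation, sketched in Python under stated assumptions: the three rules below (default currency, net amount, region code) and the field names are invented for illustration, and the stage to warehouse step is reduced to a function over dictionaries.

```python
# Hypothetical business rules, kept together in one place.
def rule_default_currency(row):
    """Assume USD when the source did not supply a currency."""
    return {**row, "currency": row.get("currency") or "USD"}


def rule_net_amount(row):
    """Net amount is gross amount minus any discount."""
    net = round(row["gross_amount"] - row.get("discount", 0.0), 2)
    return {**row, "net_amount": net}


def rule_region_code(row):
    """Region codes are stored trimmed and in upper case."""
    return {**row, "region": row["region"].strip().upper()}


# The single list that the stage to warehouse component applies, in order.
BUSINESS_RULES = [rule_default_currency, rule_net_amount, rule_region_code]


def stage_to_warehouse(staged_rows, rules=BUSINESS_RULES):
    """Apply every business rule to every staged row before loading."""
    loaded = []
    for row in staged_rows:
        for rule in rules:
            row = rule(row)
        loaded.append(row)
    return loaded


staged = [
    {"gross_amount": 100.0, "discount": 10.0, "region": " emea "},
    {"gross_amount": 40.0, "region": "apac", "currency": "EUR"},
]
warehouse_rows = stage_to_warehouse(staged)
```

Because every rule lives in BUSINESS_RULES, a change to a rule is made once, and rows from any source that reaches staging are treated the same way.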