Definition of Data Warehouse:
1. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process.
2. A data warehouse is a decision support database that is maintained separately from the organization’s operational database.
3. A data warehouse supports information processing by providing a solid platform of consolidated, historical data for analysis.
Why is a data warehouse subject-oriented?
· It focuses on the modeling and analysis of data for decision makers, not on daily operations or transaction processing.
· It provides a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process.
Why is a data warehouse integrated?
· A data warehouse is constructed by integrating multiple, heterogeneous data sources such as relational databases, flat files, and online transaction records.
· Data cleaning and data integration techniques are applied to ensure consistency in naming conventions, encoding structures, attribute measures, etc. among the different data sources.
· When data is moved to the warehouse, it is converted to these consistent conventions.
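The conversion step described above can be illustrated with a minimal Python sketch. The two sources, their field names, and the gender and height encodings are all hypothetical, chosen only to show naming conventions, encoding structures, and attribute measures being made consistent on the way into the warehouse.

```python
# Hypothetical records from two sources that describe the same kind of entity
# but disagree on field names, gender encoding, and unit of measure.
crm_rows = [{"cust_id": 1, "gender": "M", "height_in": 70.0}]
billing_rows = [{"customerId": 2, "sex": "female", "height_cm": 165.0}]

# One agreed encoding for the warehouse (an assumption for this sketch).
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def from_crm(row):
    """Convert a CRM record to the warehouse naming, encoding, and units."""
    return {
        "customer_id": row["cust_id"],
        "gender": GENDER_CODES[row["gender"].lower()],
        "height_cm": round(row["height_in"] * 2.54, 1),  # inches -> centimetres
    }


def from_billing(row):
    """Convert a billing record to the warehouse naming, encoding, and units."""
    return {
        "customer_id": row["customerId"],
        "gender": GENDER_CODES[row["sex"].lower()],
        "height_cm": round(row["height_cm"], 1),
    }


# After conversion, every row follows the same conventions.
warehouse_rows = [from_crm(r) for r in crm_rows] + [from_billing(r) for r in billing_rows]
```

Real integration pipelines also handle missing values, duplicates, and conflicting records; the sketch covers only the convention-alignment part.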
Why is a data warehouse time-variant?
· The time horizon for the data warehouse is significantly longer than that of operational systems:
1. Operational database: current-value data.
2. Data warehouse data: provides information from a historical perspective (e.g., the past 5-10 years).
· Every key structure in the data warehouse contains an element of time, explicitly or implicitly, whereas the key of operational data may or may not contain a time element.
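As a rough illustration of the difference in key structure, the sketch below keeps an operational store keyed by account only and a warehouse store keyed by account plus snapshot date. The account identifier, balances, dates, and function names are hypothetical.

```python
from datetime import date

operational = {}  # key: account_id -> current value only
warehouse = {}    # key: (account_id, snapshot_date) -> value on that date


def record_balance(account_id, balance, as_of):
    """Apply one balance observation to both stores."""
    operational[account_id] = balance          # overwrites the previous value
    warehouse[(account_id, as_of)] = balance   # adds a row for this point in time


def history(account_id):
    """Return (date, balance) pairs for one account, oldest first."""
    return sorted((d, b) for (a, d), b in warehouse.items() if a == account_id)


record_balance("A-1", 100, date(2023, 1, 31))
record_balance("A-1", 250, date(2023, 2, 28))

print(operational["A-1"])  # only the latest balance survives here
print(history("A-1"))      # both dated balances are still available
```

Because the date is part of the warehouse key, a later observation never replaces an earlier one, which is what makes historical analysis possible.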
Why is a data warehouse non-volatile?
· A data warehouse is a physically separate store of data transformed from the operational environment.
· Operational update of data does not occur in the data warehouse environment, so it:
1. Does not require transaction processing, recovery, and concurrency control mechanisms.
2. Requires only two operations in data accessing: initial loading of data and access of data.
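A minimal sketch of a non-volatile store follows, assuming the two access operations are a bulk load and read-only access (the usual textbook reading). The class name `WarehouseTable` and its methods are hypothetical and not taken from any particular product.

```python
class WarehouseTable:
    """Append-only table: rows can be loaded and read, never updated in place."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        """Bulk-load rows; copies are stored so callers cannot alter them later."""
        self._rows.extend(dict(r) for r in rows)

    def query(self, predicate=lambda row: True):
        """Return copies of the matching rows; the stored data stays untouched."""
        return [dict(r) for r in self._rows if predicate(r)]


sales = WarehouseTable()
sales.load([
    {"region": "north", "year": 2022, "amount": 120},
    {"region": "south", "year": 2022, "amount": 80},
])

north = sales.query(lambda r: r["region"] == "north")
north[0]["amount"] = 0  # changes only the caller's copy, not the warehouse
```

Since there is no update or delete method, the sketch needs no locking or rollback logic, which mirrors the point that transaction processing, recovery, and concurrency control are not required.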