OLAP and multidimensional data analysis
n A
data warehouse is based on a multidimensional data model which views data in
the form of a data cube
n A
data cube, such as sales, allows data to be modeled and viewed in multiple
dimensions
n Dimension
tables, such as item (item name, brand, type), or time(day, week, month,
quarter, year)
n Fact
table contains measures (such as dollars sold) and keys to each of the related
dimension tables
n In
data warehousing literature, an n-D base cube is called a base cuboid. The top
most 0-D cuboid, which holds the highest-level of summarization, is called the
apex cuboid. The lattice of cuboids
forms a data cube.
n Modeling
data warehouses: dimensions & measures
n Star
schema: A fact table in the middle connected to a set of
dimension tables
n Snowflake
schema: A refinement
of star schema where some dimensional hierarchy is normalized into a set of
smaller dimension tables, forming a shape similar to snowflake
n Fact
constellations:
Multiple fact tables share dimension tables, viewed as a collection of
stars, therefore called galaxy schema or fact constellation
n OLAP Operations:
n Roll
up (drill-up): summarize data
n by climbing up hierarchy or by dimension reduction
n Drill
down (roll down): reverse of roll-up
n from higher level summary to lower level summary or
detailed data, or introducing new dimensions
n Slice
and dice: project and select
n Pivot
(rotate):
n reorient the cube, visualization, 3D to series of
2D planes
n Other
operations
n drill across: involving (across) more than one fact
table
No comments:
Post a Comment