Translate

Tuesday, September 25, 2012

DATA WAREHOUSING AND MINIG ENGINEERING LECTURE NOTES--OLAP and multidimensional data analysis

OLAP and multidimensional data analysis

n  A data warehouse is based on a multidimensional data model which views data in the form of a data cube

n  A data cube, such as sales, allows data to be modeled and viewed in multiple dimensions

n  Dimension tables, such as item (item name, brand, type), or time(day, week, month, quarter, year)

n  Fact table contains measures (such as dollars sold) and keys to each of the related dimension tables

n  In data warehousing literature, an n-D base cube is called a base cuboid. The top most 0-D cuboid, which holds the highest-level of summarization, is called the apex cuboid.  The lattice of cuboids forms a data cube.

n  Modeling data warehouses: dimensions & measures

n  Star schema: A fact table in the middle connected to a set of dimension tables

n  Snowflake schema:  A refinement of star schema where some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to snowflake

n  Fact constellations:  Multiple fact tables share dimension tables, viewed as a collection of stars, therefore called galaxy schema or fact constellation

n  OLAP Operations:

n  Roll up (drill-up): summarize data

n  by climbing up hierarchy or by dimension reduction

n  Drill down (roll down): reverse of roll-up

n  from higher level summary to lower level summary or detailed data, or introducing new dimensions

n  Slice and dice: project and select

n  Pivot (rotate):

n  reorient the cube, visualization, 3D to series of 2D planes

n  Other operations

n  drill across: involving (across) more than one fact table

No comments:

Post a Comment