
Dimension and Fact Types

TYPES OF DIMENSIONS

Conformed Dimension:

A conformed dimension means exactly the same thing in every fact table to which it is joined.
Eg: The date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.

Junk Dimension:

A junk dimension is a collection of miscellaneous transactional codes, flags, and/or text attributes that do not belong to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store these junk attributes.

Eg: Assume that we have a gender dimension and a marital status dimension. The fact table would need to maintain two keys, one for each of these dimensions. Instead, create a junk dimension that holds every combination of gender and marital status (cross join the gender and marital status tables to build the junk table). The fact table then needs only one key.
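The cross join described above can be sketched in a few lines of Python. This is only an illustration: the attribute values, the `junk_key` column, and the sample fact row are assumptions, not part of the original example.

```python
from itertools import product

# Hypothetical attribute values; the lists are illustrative only.
genders = ["Female", "Male"]
marital_statuses = ["Single", "Married", "Divorced"]

# Cross join: one junk-dimension row per combination, each with a surrogate key.
junk_dimension = [
    {"junk_key": key, "gender": gender, "marital_status": status}
    for key, (gender, status) in enumerate(product(genders, marital_statuses), start=1)
]

# Lookup used at load time to resolve a pair of attributes to a single key.
junk_lookup = {
    (row["gender"], row["marital_status"]): row["junk_key"] for row in junk_dimension
}

# The fact row carries one junk key instead of two separate dimension keys.
fact_row = {
    "customer_key": 1001,
    "junk_key": junk_lookup[("Female", "Married")],
    "amount": 250.0,
}
```

With two genders and three marital statuses the junk dimension has six rows, so the table stays small even though it covers every combination.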

Degenerate Dimension:

A degenerate dimension is a dimension attribute that is stored in the fact table itself and doesn't have its own dimension table.

Eg: A transaction code, such as an order or invoice number, in a fact table.

Role-playing dimension:

Dimensions that are used for multiple purposes within the same database are called role-playing dimensions. For example, a date dimension can be used for "date of sale", as well as "date of delivery", or "date of hire".
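As a rough sketch of how one date dimension can play two roles, the snippet below joins the same table twice under different aliases using Python's built-in `sqlite3` module. The table and column names (`dim_date`, `fact_sales`, `sale_date_key`, `delivery_date_key`) and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
CREATE TABLE fact_sales (sale_date_key INTEGER, delivery_date_key INTEGER, amount REAL);
INSERT INTO dim_date VALUES (20030115, '2003-01-15'), (20030120, '2003-01-20');
INSERT INTO fact_sales VALUES (20030115, 20030120, 99.0);
""")

# The single physical dim_date table is aliased once per role.
rows = conn.execute("""
SELECT sale_date.calendar_date, delivery_date.calendar_date, f.amount
FROM fact_sales AS f
JOIN dim_date AS sale_date ON f.sale_date_key = sale_date.date_key
JOIN dim_date AS delivery_date ON f.delivery_date_key = delivery_date.date_key
""").fetchall()
conn.close()
```

Each alias behaves like a separate dimension in the query, while only one date table has to be loaded and maintained.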

TYPES OF FACTS

Additive:

Additive facts are facts that can be summed up through all of the dimensions in the fact table. A sales amount is a good example of an additive fact.

Semi-Additive:

Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others.
Eg: A daily balance fact can be summed up across the customer dimension but not across the time dimension.
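
A small Python sketch of the daily balance case, using made-up rows: summing across customers for one day gives a meaningful total, while summing one customer's balance across days does not. Taking the latest balance is shown as one common alternative for the time dimension; that choice is an assumption, not something the text above prescribes.

```python
# Hypothetical daily balance fact rows.
daily_balances = [
    {"date": "2003-01-01", "customer": "A", "balance": 100},
    {"date": "2003-01-01", "customer": "B", "balance": 50},
    {"date": "2003-01-02", "customer": "A", "balance": 120},
    {"date": "2003-01-02", "customer": "B", "balance": 50},
]

def total_balance_on(date):
    """Valid: sum across the customer dimension for a single day."""
    return sum(row["balance"] for row in daily_balances if row["date"] == date)

customer_a_rows = [row for row in daily_balances if row["customer"] == "A"]

# Not meaningful: adding the same account's balance on different days.
misleading_total_for_a = sum(row["balance"] for row in customer_a_rows)

# One alternative across time: take the most recent balance instead of a sum.
closing_balance_for_a = max(customer_a_rows, key=lambda row: row["date"])["balance"]
```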

Non-Additive:

Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
Eg: Facts that hold calculated percentages or ratios.
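
To see why a ratio cannot simply be added up, consider a profit margin stored per region. The figures below are invented for illustration; the point is that the overall ratio has to be recomputed from its additive components.

```python
# Hypothetical fact rows with additive components (revenue, profit).
sales = [
    {"region": "East", "revenue": 1000.0, "profit": 100.0},
    {"region": "West", "revenue": 200.0, "profit": 100.0},
]

# Per-row ratio: this is the non-additive fact.
margins = [row["profit"] / row["revenue"] for row in sales]

# Summing the ratios produces a number with no business meaning.
summed_margin = sum(margins)

# The overall margin is recomputed from the summed components instead.
overall_margin = sum(row["profit"] for row in sales) / sum(row["revenue"] for row in sales)
```

This is why it is usually safer to store the additive components in the fact table and derive the ratio at query time.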

Factless Fact Table:

In the real world, it is possible to have a fact table that contains no measures or facts. These tables are called "Factless Fact tables".

Eg: A fact table which has only a product key and a date key is a factless fact table. There are no measures in this table, but you can still get the number of products sold over a period of time by counting rows.
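
A minimal Python sketch of counting rows in such a table follows. The key values and the integer `YYYYMMDD` date keys are assumptions chosen for the example.

```python
from collections import Counter

# Hypothetical factless fact rows: only keys, no measures.
factless_sales = [
    {"product_key": 1, "date_key": 20030115},
    {"product_key": 1, "date_key": 20030116},
    {"product_key": 2, "date_key": 20030116},
    {"product_key": 1, "date_key": 20030201},
]

def rows_in_period(rows, start_key, end_key):
    """Return the rows whose date key falls in the inclusive range."""
    return [row for row in rows if start_key <= row["date_key"] <= end_key]

january_rows = rows_in_period(factless_sales, 20030101, 20030131)

# The "measure" is simply the number of rows.
january_sales_count = len(january_rows)
january_count_by_product = Counter(row["product_key"] for row in january_rows)
```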

Fact tables that contain aggregated facts are often called summary tables.

SLOWLY CHANGING DIMENSION

TYPE 1

In Type 1 Slowly Changing Dimension, the new information simply overwrites the original information. In other words, no history is kept.
In our example, suppose we originally have the following table:
Customer Key    Name         State
1001            Christina    Illinois
After Christina moved from Illinois to California, the new information replaces the original record, and we have the following table:
Customer Key    Name         State
1001            Christina    California
Advantages:
- This is the easiest way to handle the Slowly Changing Dimension problem, since there is no need to keep track of the old information.
Disadvantages:
- All history is lost. By applying this methodology, it is not possible to trace back in history. For example, in this case, the company would not be able to know that Christina lived in Illinois before.
Usage:
About 50% of the time.
When to use Type 1:
Type 1 slowly changing dimension should be used when it is not necessary for the data warehouse to keep track of historical changes.
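
As a rough illustration of the Type 1 overwrite, the dimension can be modeled as a Python dictionary keyed by customer key. The helper name `apply_type1` is hypothetical.

```python
# Dimension modeled as {customer_key: attributes}.
customer_dim = {1001: {"name": "Christina", "state": "Illinois"}}

def apply_type1(dim, customer_key, **changes):
    """Overwrite the changed attributes in place; the old values are not kept."""
    dim[customer_key].update(changes)

apply_type1(customer_dim, 1001, state="California")
```

After the call there is still exactly one row for key 1001, and nothing in the table records that the state was ever Illinois.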

TYPE 2

In Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key.
In our example, recall we originally have the following table:
Customer Key    Name         State
1001            Christina    Illinois
After Christina moved from Illinois to California, we add the new information as a new row into the table:
Customer Key    Name         State
1001            Christina    Illinois
1005            Christina    California
Advantages:
- This allows us to accurately keep all historical information.
Disadvantages:
- This will cause the size of the table to grow fast. In cases where the number of rows for the table is very high to start with, storage and performance can become a concern.
- This necessarily complicates the ETL process.
Usage:
About 50% of the time.
When to use Type 2:
Type 2 slowly changing dimension should be used when it is necessary for the data warehouse to track historical changes.
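
The Type 2 behaviour can be sketched in Python as follows. Two columns here are assumptions that do not appear in the example table: `customer_id`, a natural key that ties the versions of one customer together, and `is_current`, a flag marking the active row. The helper name `apply_type2` is also hypothetical; the new key 1005 is taken from the example.

```python
customer_dim = [
    {"customer_key": 1001, "customer_id": "C-1", "name": "Christina",
     "state": "Illinois", "is_current": True},
]

def apply_type2(dim, customer_id, new_key, **changes):
    """Keep the old row, and append a new row with its own primary key."""
    current = next(
        row for row in dim
        if row["customer_id"] == customer_id and row["is_current"]
    )
    # Copy the current row, apply the changes, and assign the new key.
    new_row = {**current, **changes, "customer_key": new_key}
    # Assumed convention: retire the old version rather than delete it.
    current["is_current"] = False
    dim.append(new_row)
    return new_row

apply_type2(customer_dim, "C-1", new_key=1005, state="California")
```

Existing fact rows keep pointing at key 1001 (Illinois), while new fact rows are loaded with key 1005 (California), which is how the history stays accurate.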

TYPE 3

In Type 3 Slowly Changing Dimension, there will be two columns to indicate the particular attribute of interest, one indicating the original value, and one indicating the current value. There will also be a column that indicates when the current value becomes active.
In our example, recall we originally have the following table:
Customer Key    Name         State
1001            Christina    Illinois
To accommodate Type 3 Slowly Changing Dimension, we will now have the following columns:
  • Customer Key
  • Name
  • Original State
  • Current State
  • Effective Date
After Christina moved from Illinois to California, the original information gets updated, and we have the following table (assuming the effective date of change is January 15, 2003):
Customer Key    Name         Original State    Current State    Effective Date
1001            Christina    Illinois          California       15-JAN-2003
Advantages:
- This does not increase the size of the table, since new information is updated.
- This allows us to keep some part of history.
Disadvantages:
- Type 3 will not be able to keep all history where an attribute is changed more than once. For example, if Christina later moves to Texas on December 15, 2003, the California information will be lost.
Usage:
Type 3 is rarely used in actual practice.
When to use Type 3:
Type 3 slowly changing dimension should only be used when it is necessary for the data warehouse to track historical changes, and when such changes will only occur a finite number of times.
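
A minimal Python sketch of the Type 3 layout, using the columns listed above, also shows the limitation on repeated changes. The helper name `apply_type3` is hypothetical.

```python
from datetime import date

customer = {
    "customer_key": 1001,
    "name": "Christina",
    "original_state": "Illinois",
    "current_state": "Illinois",
    "effective_date": None,
}

def apply_type3(row, new_state, effective_date):
    """Update the current-value column in place; the original column is untouched."""
    row["current_state"] = new_state
    row["effective_date"] = effective_date

apply_type3(customer, "California", date(2003, 1, 15))
after_first_move = dict(customer)  # snapshot for comparison

# A second change overwrites California, so that intermediate value is lost.
apply_type3(customer, "Texas", date(2003, 12, 15))
```

After the second call the row holds only Illinois (original) and Texas (current), which matches the disadvantage described above.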

 
