Thursday, February 11, 2010

How to Build an Academic Year Time Dimension

Working as a Business Intelligence consultant is exposing me to new challenges every day. Finding the needle in the stack of hay or coming up with an innovative solution can be sometimes a very difficult task. As such, I have decided to start my own OBIEE blog, where I will write about my own found solutions or about useful articles that helped me do my daily job.

I will start with a common task, that of building a calendar dimension.
I have found the following interesting entry on John Minkjan's OBI blog, containing tips on what should the time table contain. The full thread can be found here.

Below I will provide the sql query that I have used for building an Academic Year time dimension. The requirement was:
- populate the calendar with data starting with a specific date (representing the date when the data warehouse started being populated).
- the week-start-day was considered to be a Sunday
- the academic start year is 1 April of each calendar year.
The bellow is specific to these requirements:

TABLE AY_CALENDAR
  -- day level
  time_id                           date,
  day_name                       varchar2(20),
  day_number_in_week    number,
  day_number_in_month   number,
  day_number_in_year      number,
  working_day_flag           varchar2(1),
  --week level
  week_number                 number,
  week_number_pay         varchar(3),
  week_start_date            date,
  week_end_date             date,
  -- month level
  month_name                 varchar(20),
  month_number             number,
  month_no_of_days      number,
  -- quarter level
  quarter_number          number,
  beg_of_quarter           date,
  end_of_quarter          date,
  -- year level
  year_number               number,
  year_number_of_days   number,
  year_start_date          date,
  year_end_date            date



INSERT INTO ay_calendar
  ( time_id,
    day_name,
    day_number_in_week,
    day_number_in_month,
    day_number_in_year,
    working_day_flag,
    week_number,
    week_start_date,
    week_end_date,
    month_number,
    month_no_of_days,
    --end_of_cal_month,
    month_name,
    --days_in_cal_quarter,
    beg_of_quarter,
    end_of_quarter,
    quarter_number,
    year_number,
    year_number_of_days,
    year_start_date,
    year_end_date )
SELECT
  TRUNC( sd ) time_id,
  TO_CHAR( sd, 'fmDay' ) day_name,
  TO_CHAR( sd + 1, 'D' ) day_number_in_week,
  TO_CHAR( sd, 'DD' ) day_number_in_month,
  TO_CHAR( sd - add_months(TRUNC( add_months(sd,-3), 'YEAR' ),3) + 1, '000' ) day_number_in_year,
  CASE WHEN TO_CHAR( sd, 'fmDay' ) IN ('Sunday', 'Saturday') THEN 'N' ELSE 'Y' END working_day_flag,                                
  trunc((sd - NEXT_DAY(add_months(TRUNC(add_months(sd,-3),'YEAR' ),3)-7,'SUNDAY'))/7) + 1 week_number,
  NEXT_DAY( sd - 7, 'SUNDAY' )   week_start_date,   
  NEXT_DAY( sd - 1, 'SATURDAY' ) week_end_date,   
  TO_CHAR( add_months(sd,-3), 'MM' ) month_number,
  TO_CHAR( LAST_DAY( sd ), 'DD' ) month_no_of_days,
  TO_CHAR( sd, 'FMMonth' ) month_name,
  TRUNC( sd, 'Q' ) beg_of_quarter,
  ( CASE
      WHEN TO_CHAR( sd, 'Q' ) = 1 THEN
        TO_DATE( '03/31/' || TO_CHAR( sd, 'YYYY' ), 'MM/DD/YYYY' )
      WHEN TO_CHAR( sd, 'Q' ) = 2 THEN
        TO_DATE( '06/30/' || TO_CHAR( sd, 'YYYY' ), 'MM/DD/YYYY' )
      WHEN TO_CHAR( sd, 'Q' ) = 3 THEN
        TO_DATE( '09/30/' || TO_CHAR( sd, 'YYYY' ), 'MM/DD/YYYY' )
      WHEN TO_CHAR( sd, 'Q' ) = 4 THEN
        TO_DATE( '12/31/' || TO_CHAR( sd, 'YYYY' ), 'MM/DD/YYYY' )
    END ) end_of_quarter,
  TO_CHAR( add_months(sd,-3), 'Q' ) quarter_number,
  TO_CHAR( add_months(sd,-3), 'YYYY' ) year_number,
  ( add_months(TO_DATE( '12/31/' || TO_CHAR( add_months(sd,-3), 'YYYY' ), 'MM/DD/YYYY' ),3)
    - add_months(TRUNC( add_months(sd,-3), 'YEAR' ),3) ) + 1 year_number_of_days,
  add_months(TRUNC( add_months(sd,-3), 'YEAR' ),3) year_start_date,
  add_months(TO_DATE( '12/31/' || TO_CHAR( add_months(sd,-3), 'YYYY' ), 'MM/DD/YYYY' ),3) year_end_date
FROM (
    SELECT
      TO_DATE( '05/24/2008', 'MM/DD/YYYY' ) + rownum sd
    FROM dual
      CONNECT BY level <= 3650

);

I hope this helps and gives some hints on how to tackle creating and populating the time dimension at database level.
All the best until next time!