Wednesday, April 25, 2012

OBIA Best Practices

Best Practices and Important Notes

DAC Notes
  1. To check the pmcmd command that DAC sends to Informatica to execute a task, click the Status Description in the Task Detail window. The same command can be run manually from the Linux box where the Informatica server resides.
  2. Use Analyze Repository Tables (Tools > DAC Repository Management > Analyze Repository Tables) after considerable changes have been made to the core metadata; this helps speed up the steps that follow
  3. Micro-ETL:
    - scheduled to run every couple of hours; it can run even while users are working in the OLTP system or running BI reports
    Steps:
      [] create a new Subject Area and add only the workflows we are interested in (e.g. one fact or one dimension) + Assemble
      [] create a new Execution Plan and add the Subject Area created above. Tick Micro-ETL. This is VERY IMPORTANT, because the process works like this: when Micro-ETL is ticked, the new data added in the OLTP comes into the DW even while users are still logged on and running reports, so the staging and target tables are not truncated and the indexes are not dropped. The next time the regular Incremental load runs (involving these tables), it brings in everything since the last incremental run (in the proper way, dropping and recreating indexes), just to make sure that nothing was missed during the Micro-ETL runs and that the statistics are up to date.
    Note: It does not make sense to Analyze or to drop/create indexes in the properties of this execution plan; all we want is to bring in the new records (so ideally the Micro-ETL should run in just a couple of minutes, as fast as it can)
      [] Generate parameters and Save
      [] Build and Run 
  4. Deploy to PROD
  • The DAC client can be used to connect to many DAC servers.
  • To migrate all the changes from DEV to PROD: export only the custom container (tick Logical, no other boxes checked) and import it into PROD. Note that an empty DAC repository was created when the PROD DAC server was set up, so we only need to bring in the changes (i.e. custom container, custom subject areas, custom execution plans, etc.) and then set up the parameters and connections with the PROD connection details.
Export DAC customisations from DEV:     DAC > Tools > DAC Repository Management > Export  > tick only Logical -> Select container > Export.
Import DAC customisations into PROD:    DAC > Tools > DAC Repository Management > Import
 
Informatica Notes
  1. It is good practice to version changes, so every check-in should be made under a specific revision.
  2. Before making any changes to the mappings, it is advisable to keep a copy of the original: export the SDE, save it as XML and put it under a revision control system. This ensures that if the SDEs get messed up, we can always revert to the original by importing it into the custom folder (resolving the conflicts with Replace)
  3. If we want to add conditions to the query for the Full load only, change the Full workflow's query instead of updating the mapping (SDE). The mapping is used by both workflows (Incremental and Full) and we only want the change in one of them, so it does not make sense to create another mapping just to reflect our condition.
  4. Running workflows directly from Informatica: when creating a mapping as a copy of an existing one, right-click Mapping/Mapplet > Declare Parameters and Variables and always give an Initial Value to the parameters, so that when the workflow is run in Workflow Manager it uses the default values and does not error.
In Informatica Workflow Manager: don't forget to replace the parameters with the actual OLTP/DW values
- drag the WF and, on the task, right-click > go to Properties
  -- $Source Connection Value -> change it to point to whatever we want (Use Object and select the object from the list)
  -- $Target Connection Value -> change as above
- NOTE: there is always the option to revert to the original (Use Connection Variable)
- Edit Task, go to the Mapping tab > Connections > make sure no variables are used and the actual values are passed. Save and run the workflow
 
DEV NOTE: to see large mappings, right-click on the grey area in the Mapping Designer and select Arrange All Iconic
 

OBIA Customisations

The out-of-the-box OBIA does not always fulfil all the requirements that customers have, which is why, in the majority of cases, the DW, the ETL or the BI reports need to be customised.
The customisations can be implemented at three levels:
1. Reports/dashboards (BI Presentation Server side) - i.e.: filters, new reports, new dashboards using the already existing objects in the Presentation Catalog.
2. BI Server side (creating additional KPIs, measures, dimensions/hierarchies, aggregations, etc)
3. ETL/DW side (which then involves carrying the customisations further and working out the logic on the BI Server and BI Presentation Server side)
The most complex work is carried out on the ETL/DW side as this involves bringing more columns/tables in the DW.
 
The ETL process is tightly coupled with the DW.
 
There are three types of customisations of the ETL:
  • Type 1 customisations - adding additional columns from source systems
  • Type 2 customisations - adding new facts/dimensions to the ETL (new tables in the DW) using pre-packaged adapters
  • Type 3 customisations - using the Universal adapter to load data from sources that do not have pre-packaged adapters
 
If the data that populates the DW needs to be changed (i.e. which rows are retrieved), this is done by adding filters to, or modifying the queries of, the already existing SDE/SIL/PLP mappings.
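As an illustration, such a filter is typically appended to the Source Qualifier's SQL override. The sketch below uses hypothetical table/column names (S_ORG_EXT, ACTIVE_FLG are placeholders, not taken from this document); $$LAST_EXTRACT_DATE is the standard OBIA incremental-extract parameter:

```sql
-- Sketch of an SDE Source Qualifier override with an extra custom filter
SELECT ORG.ROW_ID,
       ORG.NAME,
       ORG.LAST_UPD
FROM   S_ORG_EXT ORG
WHERE  ORG.LAST_UPD > TO_DATE('$$LAST_EXTRACT_DATE', 'MM/DD/YYYY HH24:MI:SS')  -- incremental condition
  AND  ORG.ACTIVE_FLG = 'Y'                                                    -- custom filter on the retrieved rows
```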
 

Type 1 Customisation Steps


DW Steps  -- these imply adding new columns to the DW tables (W_..._DS, W_..._D, etc.)
  • Alter table to create the new column in the staging (_DS) table
  • Alter table to create the new column in the target (_D) table
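A minimal sketch of the two ALTER statements (the table and column names here are hypothetical; match the datatype to the source column):

```sql
-- Add the new column to the staging table...
ALTER TABLE W_ORG_DS ADD (X_CUSTOM_ATTR VARCHAR2(80 CHAR));
-- ...and to the corresponding target dimension table
ALTER TABLE W_ORG_D  ADD (X_CUSTOM_ATTR VARCHAR2(80 CHAR));
```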
 

Informatica Steps  -- these imply extending the mappings

1. Create the custom folders and make a copy of the originally provided mappings (as a rule of thumb, never alter the original mappings that come out-of-the-box)

  • Informatica Repository Manager:     
  1. Create custom folders CUSTOM_SDE,CUSTOM_SIL
  2. Copy the SDE workflows (both Full and Incremental) that we want to alter to the CUSTOM_SDE folder -- if conflicts arise, resolve conflicts with REPLACE
  3. Copy the SIL workflows (both Full and Incremental) that we want to alter to the CUSTOM_SIL folder -- if conflicts arise, resolve conflicts with REPLACE
2. Modify SDE mappings
  • Informatica Designer:
  1. change mapping name: (Menu) Mapping > Edit > prefix with XX_
[]  Target Designer
- drag and drop the target that we want to alter (the _DS table)
- go to Target (Menu) > Import from Database - search for the table having the newly added columns in the DW (the _DS table we have altered in the DW steps)
- resolve conflict with Apply to All tables, then click Replace
[] Mapping Designer
- checkout the mapping that we want to alter then drag and drop it to the working area
Note that now the column appears in the _DS table (as we have just imported the table). However, this column is not mapped to any OLTP column. It is the developer's job to map it from the source all the way through the mapplet/mapping. To do so, we open the mapplet (an object that encapsulates the logic of transformations) and look for the source table from which we want to map the column to the _DS column.
- drag the column from the source to the Source Qualifier (SQ) and drop it when the + sign is displayed. Then rename the column to something meaningful (open the SQ, go to the Ports tab to change the name). In the Source Qualifier, we have to select the column to be brought in from the table
- drag the column from SQ to X_CUSTOM expression
- drag the column from X_Custom across until reaching the OUT_Transformation
- Save, then if no conflicts occurred -> check in the mapping
NOTE: repeat the process of mapping the column throughout all the paths in the mapping, following the X_CUSTOM placeholder's path (except for SA mapplets, as these are encapsulated objects used in other mappings), until reaching the final _DS object, so that the column is mapped all the way from the source table to the target (_DS) destination.
 
3. Modify SIL mappings
  • Informatica Designer
[]  Source Designer
- drag and drop the source table (_DS)
- go to Source (Menu) > Import from Database - search for the table having the newly added columns in the DW (_DS)
- resolve conflict with Apply to All tables + Replace
[]  Target Designer
- drag and drop the target that we want to alter (_D)
- go to Target (Menu) > Import from Database - search for the table having the newly added columns in the DW (_D)
- resolve conflict with Apply to All tables + Replace
[] Mapping Designer: note that now the column appears in both the _DS and _D tables, however, we have to map it from the source all the way through to the target. To do so, we open the mapplet and look for the table from which we want to map.
- drag and drop the column from the source to the SQ until the + sign is displayed. NOTE: sometimes it is not required to modify the SQL query, as it can be auto-generated; in that case, making sure that the port exists in the SQ is enough.
- drag and drop the column from SQ to the next object following the X_CUSTOM path only.
- drag the column across following the X_CUSTOM path until reaching the Target table
- Save (ctrl+s), then if no conflicts occurred -> Check In the mapping

IMPORTANT: for dimension tables, in the Unspecified mapping, always link the Update Strategy with the Target Definition column we have just added (e.g. if the column is a string, link it to ETL_UNSPEC_ST). These Unspecified mappings are present in all dimension SIL Full workflows to ensure outer-join behaviour (i.e. the fact still displays values even when there is no matching row in the dimension but there are values in the fact table)
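To see why the Unspecified row matters, consider a query joining a fact to the dimension. OBIA conventionally gives the Unspecified dimension row ROW_WID = 0, and fact rows with no real dimension match reference it, so even a plain inner join loses no fact rows (the table/column names below are hypothetical):

```sql
-- Fact rows whose dimension lookup failed point at the Unspecified row
-- (ROW_WID = 0), so this inner join still returns every fact row;
-- unmatched rows show the ETL_UNSPEC_ST value in the custom column.
SELECT d.X_CUSTOM_ATTR,
       SUM(f.REVENUE_AMT) AS total_revenue
FROM   W_REVN_F f
JOIN   W_ORG_D  d ON d.ROW_WID = f.ORG_WID
GROUP BY d.X_CUSTOM_ATTR;
```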
 
4. Modify SDE/SIL Workflows
  • Informatica Workflow Manager
- change the workflow names: drag the workflow into the WF Designer > select the Workflow menu > Edit > rename to xx_SDE_/xx_SIL_.
- in the Properties tab, change the log file name to XX_.log.
- in Workflow Designer, right-click the session > Edit > General tab > Rename. Note that the task name is still the old one; it gets renamed when you right-click the session and choose Open Task.
- in the Properties tab, the Parameter Filename needs to point to the new folder where the workflow resides (CUSTOM_SDE.XX_SDE_.txt) and to the new name of the workflow. Note that these parameter files are generated by DAC and need to be in sync with the Informatica mappings/workflows.
- to make sure the task executes a SQL query that brings in the column (as we have altered it in the mapping), Open Task > go to the Mapping tab > select the SQ mapplet > SQL Query > check the query; if it does not contain the column (it shouldn't at this stage), click REVERT, which will bring in the added column.
- repeat for all tasks, sessions, workflows (both incremental and Full).
- Repository > SAVE
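For reference, an Informatica parameter file is a plain-text file of [section] headers followed by name=value pairs; a DAC-generated one looks roughly like the sketch below (the workflow name and the values are hypothetical, not taken from this document):

```
[CUSTOM_SDE.WF:XX_SDE_ORA_ExampleDimension]
$DBConnection_OLTP=ORA_R1211
$DBConnection_OLAP=DataWarehouse
$$DATASOURCE_NUM_ID=4
$$LAST_EXTRACT_DATE=01/01/1970 00:00:00
```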

Important NOTE: after a Type 1 customisation we need to run a Full load first, so that the DW is aware of the new columns/tables and the table is not left in an inconsistent state (if only an incremental load were run, only the new rows would populate the newly added column and all the other rows would have NULL in it)
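A quick sanity check after the load (hypothetical table/column names): if only an incremental load had run, the pre-existing rows would show up here as NULLs in the new column.

```sql
-- Count rows that never received a value for the new column;
-- after a Full load this should be 0 (or only genuinely unknown rows)
SELECT COUNT(*) AS rows_missing_value
FROM   W_ORG_D
WHERE  X_CUSTOM_ATTR IS NULL;
```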
 
 
DAC Steps
1. Register Table Columns in DAC
[] Design > Tables > query for the _DS and _D tables that we have altered to add the new column
- right-click > Import From Database > Import Database Columns > selected record only
- specify Database = DataWarehouse, then click Read Columns; select the column from the list and click Import Columns -- expect to see the newly added column in the Columns tab of this table. NOTE: if all columns are imported, the properties of the imported columns might alter the properties of the already existing columns (which is not something that we want, so only import the new columns)

2. Create Logical/Physical folders in the DAC that point to our new folders in the Informatica Repository
(N) Tools > Seed Data > Task Physical Folders > NEW > enter the same names (case sensitive) as the folders we created in Informatica (CUSTOM_SDE, CUSTOM_SIL)
(N) Tools > Seed Data > Task Logical Folders > New > create 2 folders: Custom_Extract, Custom_Load
The physical folders are the actual Informatica folders that DAC needs to be aware of; the logical folders are the folders that DAC will be working with.
 
3. Map Physical to Logical folders:
(N) Design View > Source System Folders tab > New > then map each of the physical folders to the corresponding folders: CUSTOM_SDE to Custom_Extract; CUSTOM_SIL to Custom_Load and SAVE

4. Let DAC know to issue the commands that execute the custom WFs in Informatica, and change the folder from which DAC will take the tasks:
NOTE: make sure that you change the command for both the Incremental and the Full load.
(N) Design > Tasks > query for the tasks having exactly the same names as the Informatica WFs
5. SYNCHRONIZE (right click on the task > Synchronize tasks)
6. create a new subject area and add only our tasks (both SDE and SIL)
7. Assemble Subject Area
8. Create a new Execution Plan and add the custom Subject Area created at step 6. Generate the parameters and specify the data sources. Note that the custom folders are picked up here
9. Build the Execution plan.
10. Run the execution plan
 
OBIEE Steps
1. Open the RPD in Admin Tool
2. In the Physical Layer, search for the _D table that we have altered in the DW and whose additional columns we wanted to expose in BIA
3. Right-click on the table and New Object > Physical Column
4. Drag the column in the Logical layer in all the dimensions/facts involved.
5. Drag the column in the Presentation layer to expose it to the users accessing the Presentation Catalog.


 

Type 2 Customisation Steps

DW Steps
All tables created in the DW must have these 4 mandatory columns:
  1. INTEGRATION_ID          -- unique identifier of a record, as in the source table
  2. DATASOURCE_NUM_ID  -- identifier of the OLTP source
  3. ROW_WID                   -- surrogate key - unique row ID within the table
  4. ETL_PROC_WID           -- ID of the ETL process (stored in S_ETL_RUN (OLTP) and W_ETL_RUN (DW))
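A skeleton CREATE TABLE for a new fact table, showing the four mandatory columns (the table name, datatypes and the sample measure are hypothetical):

```sql
CREATE TABLE W_CUSTOM_F (
  ROW_WID            NUMBER(10)    NOT NULL,  -- surrogate key
  INTEGRATION_ID     VARCHAR2(80)  NOT NULL,  -- record identifier as in the source table
  DATASOURCE_NUM_ID  NUMBER(10)    NOT NULL,  -- identifies the OLTP source
  ETL_PROC_WID       NUMBER(10)    NOT NULL,  -- ETL run that loaded/updated the row
  AMOUNT             NUMBER(28,10)            -- example measure column
);
```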

Informatica Steps
same as Type 1, except that we need to create these tables from scratch and bring all the columns in.
also, most likely all the workflows have to be created from scratch as well!

DAC Steps
1. Register Table and Table Columns in DAC
[] Design > Tables > click New.
- right-click > Import From Database > Import Database Tables > selected record only
- right-click > Import From Database > Import Database Columns > selected record only
- right-click > Import From Database > Import Indices > selected record only
 
OBIEE Steps
1. Open the RPD in Admin Tool
2. In the Physical Layer, Import the tables in the Physical layer
3. Create the relations in the Physical Schema
4. Create the table (dim/fact) in the Business Model layer
5. Create the relations with the other objects in the BMM layer
6. Add the table (dimension/fact) to the Presentation layer to expose it to the users accessing the Presentation Catalog.
 
More on this: http://docs.oracle.com/cd/E10783_01/doc/bi.79/e10742/anyinstadmcustomizing.htm#i1027226