Wednesday, February 5, 2020

Neo4J Knowledge Graphs in OBIEE


Introduction:

The purpose of this article is to analyze and bring together insights from statistical data in a data warehouse and graph data in Neo4J. Today’s most pressing data challenges center around connections, not just discrete data. At the same time, most business data is stored in relational databases in the form of tables. The challenge, right from the inception of the graph DB, is how we narrow down our searches from one technology to the other, or integrate both, so that we can reach proper business decisions.

For example: we have modeled our graph to the best of our ability and it can answer everything you throw at it, but how should we see that data and its associations with other business dimensions at a glance?


The problem statement

  • How to narrow down search from relational model to graphical model?
  • Drill down from graphical model to relational model
  • How to see data relationships without writing complex cyphers
I have divided the above problem statement into the steps listed below, and we'll work through each of them in detail:

  1. Integrate OBIEE with Neo4J directly and do reporting on top of it
  2. Create a graph visualization in OBIEE so as to present our Neo4J data.
  3. Create oracle reports/charts & neo4j visualizations and navigate among them in a single dashboard

Objective: To integrate OBIEE with Oracle DB & Neo4J and navigate back and forth between graph and Oracle reports in a single dashboard.


Integrate OBIEE with Neo4J Database

To integrate OBIEE & Neo4J we'll leverage the neo4j-jdbc connector provided by Neo4J. You can download the driver from here. The latest driver at the time of writing is 3.4; if you are working on a newer Neo4J version you have to build the driver yourself, as mentioned in the "Building the driver yourself" section on this page.

Once downloaded, we have to follow the exact steps mentioned on this blog page. You should end up creating a connection pool with jndi/neo as the data source. Please note that since Neo4J is a graph database we'll not be able to model our RPD; we only have the option to create direct database request analyses in OBIEE.
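As a point of reference, the JDBC URL configured for the driver (behind the jndi/neo data source) typically follows the Bolt form; the host and port below are just placeholders for your own instance:

    jdbc:neo4j:bolt://<neo4j-host>:7687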


The next step is to create a direct database request analysis from OBIEE and enter the connection pool information that you have just created. In the SQL statement, we can write our regular Neo4J cyphers and get the result. (Pretty cool right, I know :P )



(Please feel free to ask if you get stuck building the driver or at any other step above)

Creating Graph visualizations in OBIEE

Once we have integrated our Neo4J instance and can fetch the data, the next step is to create the graph visualization. For this we'll build a custom visualization in JavaScript with the help of OBIEE's narrative view, as OBIEE obviously doesn't provide a graph visualization out of the box. I have already written about creating graph visualizations in OBIEE here, so please refer to it.
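To make the idea concrete, here is a minimal sketch of the JavaScript that can sit in the narrative view. It assumes the vis-network library is loaded in the narrative prefix and that the analysis returns person, relationship and movie columns; the library choice, container id and column names are illustrative assumptions, not the only way to do it.

    // Narrative prefix (illustrative): load the library and add a container
    //   <script src="https://unpkg.com/vis-network/standalone/umd/vis-network.min.js"></script>
    //   <div id="graph" style="height:500px"></div>

    var nodes = new vis.DataSet();
    var edges = new vis.DataSet();

    // Called once per result row from the narrative's per-row section
    function addRow(person, rel, movie) {
      // add each node only once; vis.DataSet.add throws on duplicate ids
      if (!nodes.get(person)) { nodes.add({id: person, label: person, group: 'Person'}); }
      if (!nodes.get(movie))  { nodes.add({id: movie,  label: movie,  group: 'Movie'}); }
      edges.add({from: person, to: movie, label: rel});
    }

    // Narrative postfix: draw the graph once every row has been added
    var network = new vis.Network(
      document.getElementById('graph'),
      {nodes: nodes, edges: edges},
      {edges: {arrows: 'to'}}
    );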

Please note, we can also create any other visualization or view that OBIEE provides with this data. Also, by now we'll have reports whose data comes either from Neo4J or from some other relational database via the regular RPD route. The next step is simply to navigate among these reports irrespective of the underlying data source.

Navigating b/w Graph visualization, charts & reports

For navigating among reports we have various options as listed below:

Filtering on Neo4J data from OBIEE Dashboard Prompt

For filtering on Neo4J data, we have to create a dashboard prompt column with a presentation variable. We can then use that presentation variable in our cypher query to filter the data. Sample query below with presentation variables nPerson & title:

MATCH (p:Person)-[r]-(m:Movie)  WHERE p.name in [@{nPerson.isAllColumnValues?NULL:nPerson}[\"@\"]{p.name}]  AND m.title in [@{title}[\"@\"]{m.title}] 
 return  p.name, type(r) as Relationship, m.title

Master Detail linking from Neo4J Graph to Table or Bar graph

Since our graph visualization acts as the master view, we have to send an event to the detail view as described below. Don't forget to add the SAutils javascript file in your narrative header.

For example, here we send the movie name 'The Matrix' on the "mName" MD channel:

        SAutils.publishMDEvent('mName', ['"Movie"."name"', "The Matrix"]);

On the detail report, just add a graph or table listening to events on this channel.
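As a rough sketch (reusing the network and nodes objects from the visualization sketch above), the master narrative can raise the event whenever a graph node is clicked; the channel name and column must of course match whatever the detail view listens on:

    network.on('click', function (params) {
      if (params.nodes.length === 0) { return; }        // ignore clicks on empty canvas
      var movieName = nodes.get(params.nodes[0]).label; // label of the clicked node
      // publish on the "mName" channel that the detail view is listening on
      SAutils.publishMDEvent('mName', ['"Movie"."name"', movieName]);
    });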

Navigating to OBIEE Dashboard page or individual report from Neo4j Graph

This is beautifully described on the Oracle page Navigating Using Javascript and on the gerardnico blog.

Individual report
GoNav(event, '/users/neo/NeoRel/Application/detailsMovie', 'Movie', 'name', 'The Matrix')

Dashboard page
PortalPageNav(event, '/users/neo/NeoRel/_portal/Movie Demo','Movie View', 'Movie', 'name', 'The Matrix')

Dashboard
PortalNav(event, '/users/neo/NeoRel/_portal/Movie Demo', 'Movie', 'name', 'The Matrix')
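These calls can likewise be wired into the graph itself; a hedged sketch below attaches the individual-report navigation to a node double-click, reusing the catalog path from the example above:

    network.on('doubleClick', function (params) {
      if (params.nodes.length === 0) { return; }
      var movieName = nodes.get(params.nodes[0]).label;
      // navigate to the detail analysis, filtering "Movie"."name" on the clicked node
      GoNav(params.event, '/users/neo/NeoRel/Application/detailsMovie', 'Movie', 'name', movieName);
    });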

Sample Screenshot



Thursday, October 25, 2018

Oracle on wheels - My DevOps story


I had the privilege to learn and work on some DevOps tools like Jenkins & GoCD when I came across FlexDeploy, which seems a perfect fit for my area of work. It supports each component of the DevOps loop.


DevOps continuous loop

Today's post will highlight some of its features & attributes. FlexDeploy is a product born out of the startup Flexagon a few years back and has picked up a lot of attention, especially in the Oracle space. FlexDeploy offers several plugins, associating itself with almost every Oracle product, and alleviates pain areas across the CI/CD horizon. You can easily build and deploy artifacts across environments based upon your technology of choice. For the purpose of this blog, I'll stick to the Oracle BI domain only.

I have explored specifically the ODI, OBIEE & Oracle DB plugins. I'll write about each briefly, but first let me explain how FlexDeploy talks with each environment, viz. Development, Testing, Production, etc.

FlexDeploy uses SSH connectivity with each environment - it puts the project-specific binaries in that environment and executes them there, whether the task is to export the artifacts or to deploy them. The diagram below shows the architecture in more depth:

FlexDeploy Architecture

ODI
The ODI plugin within FlexDeploy exports the scenarios from one environment (generally the build env, i.e. Dev, Test) and imports them into your environment of choice (Test, Prod). It supports both ODI 11g & 12c and uses a marker & regex based approach to filter out the scenarios that need to be exported. Once exported to the artifact directory (of course, you can push them to VCS repositories as well, viz. SVN, GIT) these can then be imported into the target systems just by using the SSH connection.
Bonus point: I have extended this plugin to support generating data lineage, along with several bug fixes. You can request your copy from the comments section below.

OBIEE
The OBIEE plugin is interesting in that it supports partial deployments. You can migrate all the webcatalog objects or you can choose to select and migrate only a few. It supports RPD migration as well, with the ability to change connection pools programmatically based upon the target instance. After deployment, it can restart the OBIEE services for you. Again, it is well suited for both OBIEE 11g & 12c.

Oracle DB
The Oracle DB plugin is richer in terms of functionality and support. It creates a baseline of your database in each environment and compares against that baseline for any discrepancies in metadata. You can synchronize these environments on the go or on a schedule.

The above highlights the tool's capabilities succinctly; more information can be found here.

If you have any questions pertaining to the tool, I'll be happy to answer.


Friday, August 10, 2018

ODI 12c Data Lineage Tool

Hi Guys,

After the Informatica to ODI automation tool, I am happy to present an ODI 12c documentation tool that connects to your work repository and generates documentation for the selected mappings.
That's not all - I am releasing the tool for public download. So, if you want a copy, please get in touch.

Select ODI Instance to connect:

 Select mappings to generate data lineage:

Output File:




Wednesday, June 6, 2018

Streaming Analytics through OBIEE

Hi Guys,

I am back with a new BI piece, and this time it's on a big data environment. This is not a tool but an amalgamation of multiple technologies to provide a seamless BI experience in real time and to enable quick business decisions.

Many companies today are looking for alternatives to store huge volumes of data and gather insights quickly, rather than waiting for regular BI jobs to complete before seeing the reports. At the same time, moving entirely to a different stack is not possible, as business users are already accustomed to and comfortable with the existing reporting tool. And what about the huge investments companies have made in Oracle products? 😉

To answer these, I have created a demo that shows how we can keep the existing technology and combine it with big data products at the backend to provide a seamless experience to the end users. The demo will also show how we can process data in real time and visualize it live in OBIEE.


The entire data flow works as follows:

The data is fed from multiple source systems/applications to Kafka topics, which pass it on to Apache Spark. Spark then processes the data and opens two streams for data population.



The first stream pushes the data to the data warehouse, in our case a Kudu DB maintained on a Hadoop cluster. Once the data is received in Kudu, we can further transform/aggregate it or apply predictive analytics; it finally gets visualized in OBIEE through the RPD, using Apache Impala as the ODBC connector.



The second stream pushes the data back from Spark to Kafka on a different topic. This evenly timed data is then sent to a Node server, which is responsible for pushing it to each client window with the help of Socket.IO. Finally, the data is visualized with the help of Highcharts.
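A condensed sketch of this last hop is below, assuming the kafka-node and socket.io packages on the Node server and Highcharts plus the Socket.IO client in the browser; topic names, ports and field names are purely illustrative:

    // server.js - relay Kafka messages to browsers over Socket.IO
    var http = require('http').createServer();
    var io = require('socket.io')(http);
    var kafka = require('kafka-node');

    var client = new kafka.KafkaClient({ kafkaHost: 'localhost:9092' });
    var consumer = new kafka.Consumer(client, [{ topic: 'obiee_live' }], { autoCommit: true });

    consumer.on('message', function (message) {
      // broadcast every Kafka message to all connected dashboard clients
      io.emit('tick', JSON.parse(message.value));
    });

    http.listen(3000);

    // client side (in the dashboard page) - append each point to a live Highcharts series
    //   var chart = Highcharts.chart('container', { series: [{ name: 'Sales', data: [] }] });
    //   var socket = io('http://<node-host>:3000');
    //   socket.on('tick', function (point) {
    //     chart.series[0].addPoint([point.time, point.value], true, chart.series[0].data.length > 50);
    //   });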


Some OBIEE sample reports :-






more info @
https://www.linkedin.com/pulse/streaming-analytics-obiee-parikshit-agarwal/



Friday, August 18, 2017

Informatica to ODI Conversion Tool

In this post I will give a brief overview of the Informatica to ODI conversion tool that I have been working on for a long time. Below are some of the features and screenshots of the tool. Subsequent posts will provide more insights into it.

Also guys, I have released an introductory video of the tool. You can watch it here.




Features:
  • Convert Informatica workflows to ODI interfaces seamlessly.
  • Generate documentation of Informatica workflows.
  • Accelerated ODI development with Excel.




Wednesday, September 28, 2016

Generating custom module in ODI - A Cleaner Approach

Hi Guys,

There are lots of posts on the internet on how to create your own custom fact groups and integrate them into load plans. But almost all of those require you to intervene in the ODI metadata tables, which sometimes leads to a mess.
In this post, I'll give you the alternative approach that Oracle recommends, which works well for all of your custom modules. I'll also mention all the bottlenecks that I dealt with during this phase.

Load plan generation in BIACM works through a utility called LPG, aka load plan generator, which ultimately crawls through the entire ODI code for all the checked fact groups and their associated dimension groups and creates a master load plan, which is nothing but an amalgamation of the many load plans that reside in dev components. System components define the order in which the selected dev components should be picked up. It is not as convoluted as it sounds. We'll go step by step to understand what's happening and how we can use all this to integrate our custom module.

Below are the steps delineating the process of generating a load plan out of a custom fact group:
  1. Create Fact Group & Dimension Group (through BIACM)
    1. Go to Manage Business Intelligence Applications > Business Intelligence Applications Offerings tab
    2. Check the box against Oracle Custom Analytics > Custom Functional Area for Custom Analytics.
    3. Now under associated dimension and fact groups click on the + sign to add the fact group followed by the dimension group

  2. Now the next step is to create Dev Components (individual LP consisting of scenarios) in ODI
    1. SDE Dimension LP (_DS scenarios)
    2. SDE Fact LP(_FS scenarios)
    3. SIL Dimension LP (_D scenarios)
    4. SIL Fact LP(_F scenarios)
  3. Add dev components to respective system components (remember the ordering based upon the dependency)
  4. Add the fact group & dimension group names (from step 1) to the flex-field value of each individual table at the module level. This helps LPG segregate tables based upon the module.
  5. Create foreign keys in the fact table referring to the associated dimensions.
Below are some of the issues that I have encountered.

Q: DSN not configured for LS. ODI repository not configured properly, please check ALL the logical schemas associated with the chosen PLV
Make sure that in your logical schema the below flexfield values match exactly with the logical schema of the source system defined in BIACM:
  • PRODUCT_LINE_VERSION_KEY,
  • DATA_SERVER_TYPE,
  • LS_DATASOURCE_NUM_ID
  • LS_SCHEMA_TYPE
Q: Load plan generation failing with null
The trick is to add at least a single foreign key to a dimension whose scenario is attached in the respective dev component.

Q: SDE Dimension flows are not coming in the generated load plan
Add a foreign key in the fact to that dimension which has its SDE as well.

reference: https://blogs.oracle.com/biapps/entry/load_plan_generator_an_inside

Tuesday, August 9, 2016

ODI - Load huge source data in chunks using variable loops

Hi Guys,

In this post we'll walk through an issue I faced during one of my projects, where the source data (a MySQL DB in our case) was so huge that ODI was not able to handle it and kept throwing a GC overhead limit exceeded exception.
We took out the source query, filtered the data for a specific year and ran it again. This time, since the source query was restricted to a limited set, it returned the appropriate results.
This gave us the idea to develop some kind of loop in ODI that would run the interface/mapping recursively, iterating through the years present in the source.
This article mainly focuses on building this loop using variables in ODI that refresh and run our interface/mapping iteratively.

First of all, let's focus on the different types of variables offered by ODI and how we can leverage them in our development.

Set Variable: A Set Variable step of type Assign sets the current value of a variable. We will use this to initialize our counter variable.

Refresh Variable: We will use this kind of variable extensively. A Refresh Variable step allows you to re-execute the command or query that computes the variable value. We will build our counter using this type of variable; it will increment itself every time it is refreshed.

There are other variable steps offered as well, such as evaluate, declare & increment variable, but we are not going to discuss them here as they are not required in our case. You can always google them though.

Now, we'll go through the above diagram in sequential order, explaining each of the steps in detail.

  • (Requirement) Each time the mapping runs, it should truncate the existing data first.
    • I have created a procedure with the truncate statement & tied it in as the first execution step. Guys, please note there are multiple ways you can handle this, but I found this approach easy & quick.
  • X_COUNTER: The 2nd & 3rd objects in our diagram reference the same variable but perform a different operation in each case.
    • X_COUNTER, when referenced as a set variable in the 2nd step, is used to initialize the counter to a value of -1.
    • In the 3rd step the same variable is used as a refresh variable, which increases every time it is refreshed. Let me explain the query behind this variable:
select #X_COUNTER+1 from dual where #X_COUNTER+1 < (select count(distinct <year_column>) from <source_table>)

In the above query, whatever the value of X_COUNTER is at that moment gets incremented by 1 (<year_column> and <source_table> above are placeholders for your year column and source table). Also note that we only increment this variable up to the number of distinct years present in our source table - a single loop run handles one particular year, and in this way we iterate through all the years. (Comment below for any doubts here.)
  • X_END_SCEN: This is an empty variable (you can attach a mail step instead of an empty variable, which will notify you by mail once the load finishes). It is the step X_COUNTER branches to when the above query fails to return any value, i.e. X_COUNTER has iterated through all the years and incrementing it further would take it past the count of distinct years.
  • X_CDW_ACTIVITY_YEAR: This again is a refresh variable, which outputs the specific year based upon the X_COUNTER value. Below is the query:
select distinct <year_column> from <source_table> order by 1 limit #X_COUNTER,1

The above query returns a single year based upon our X_COUNTER value: the select lists all the distinct years in increasing order, and the limit extracts the specific value based on the counter (e.g. if X_COUNTER = 2, then limit 2,1 skips two rows and fetches the third distinct year). Also, please note that our source is MySQL and the above query may not work for other DBs (it's just one way of getting a value based upon a counter; you can use another approach as well).

  • Mapping: After refreshing all the variables, the next step is the main mapping, from which the final source query is generated. We just have to make sure to add a filter in our mapping on the year column and assign the variable value to it, i.e. X_CDW_ACTIVITY_YEAR in our case. Each time this mapping runs, it fetches the latest value from the variable, i.e. it generates the query for that specific year.
Guys, please note we can have many approaches based upon the requirement. In fact, we can have nested loops where the source query is filtered through multiple variables, as shown below.



Hope you understood the concept. Please feel free to comment for any doubts or suggestions. Happy coding :)