More often when people bumped the terms ‘Data Warehousing’, ‘Business Intelligence (BI)’ or ‘Data Mining’, they think of state-of-the-art tools and complex tech stuff comprehensible only to tech geeks. They squirm just by the thought of it and they let their thoughts jump into an escape hatch with a parting words ‘Let the tech guys worry about that.’
The fundamental difference
BI has many facets. Personally, having work with databases BI is data analysis in-depth. We normally analyze data by observing the behavior of numerical data in progression over time delimited with time itself and other components associated with the numeric object such as sales over time, number of units produce over time, distance covered over time, etc. To visualize the analysis, we plot the data on the graph traditionally on two-dimensional approach the way we see data in an Excel spreadsheet.
As an example, say Branch A of ABC Bank received a total deposit amounting to $12000 for the month of January 2012, Branch B received deposits amounting to $34000 in Jan 2012, Branch A had $9000 deposits on February 2012, Branch B earned $11000 deposits on Feb 2012 and so on. We list down the branches in the column (Y-axis) of a spreadsheet against the months on rows (X-axis) and the intersection of X and Y axis is the corresponding numeric value. The intersection of the two dimensions (X and Y axis or Rows and Columns) in the graph is the corresponding numeric value..
In BI perspective, the numeric value is called the Measure (or Fact Table) and these are the amount of deposits $12000, $34000, etc. Each of the numeric value is bounded by two dimensions namely November 2012 (called the Time Element dimension) and Branch A (called the Branch dimension). Thus, traditionally we analyze data on a two-dimensional approach the way we see numeric data in a spreadsheet – the amount of deposit bounded by the X-axis (Row) and the Y-axis (Column). Hence, say $12,000 amount deposited on January 2012 for Branch A.
What if two more dimensions are added – Account Type and Customer. So to complete the statement with the added dimensions it would be – A total deposit amount of $12000 was received by Branch A on the month of January 2012 using Savings account (Account Type dimension) and deposited by Client 1 (Customer dimension). With a total of four dimensions, we can no longer use the typical 2-dimensional spreadsheet obviously because a spreadsheet could not accommodate more than three dimensions. (Yes Sir, a typical spreadsheet can accommodate a three-dimensional graph where a third dimension is plotted on a page or worksheet). A set of numeric data with more than three dimensions can only be plotted on a graph using a cube structure.
Hence the difference, as oppose to analyzing data in a typical two-dimensional approach BI is an in-depth analysis of data with three or more dimensions constructed into a multi-dimensional structure called cube. Specifically, a part of BI where the analysis of data is based on a cube structure is called On-line Analytical Processing, otherwise known as OLAP. Consequently, in a multidimensional cube structure, a cube could be tweaked to drill down or drill up from a particular slice level. To drill-down, It means you may want to see the breakdown of deposits done on Branch A on a Weekly (or even on a daily) basis. To drill-up, you may want to see the deposit made for Branch A on annual basis
BI Applied, A Practical Approach
In a Bank where I once worked, monitoring the performance of branches was a serious business.. Though some departments like Trust and Treasury could independently generate income, branches are the basic income-generating units of the Bank. For a Bank with roughly 300 branches, a simple approach to profitability is pamper these 300-strong branches to operate productively. Hence more data instruments were needed for analytical efforts in identifying improvement opportunities and inherent weaknesses of branches
As I saw it, this was a potential area where an OLAP solution could effectively provide a mechanism that could be used to analyze the performance of branches. I started with a question – ‘Where is the highest (or lowest) concentration of deposits/withdrawals based on geographical and time parameters? Now what if two more variables are added to the equation namely Account Type (i.e. Cash deposits, Check deposits, Savings withdrawal, TD withdrawal, etc) and individual client (with properties on gender, income-level or profession). With these four parameters in an OLAP app, it might be able to spot a potential trend or relationship.
I finally came up with a simple Online Analytical Processing (OLAP) solution aptly named Branch Performance Matrix. A simple yet relevant approach measures the deposit or withdrawal capacity branches in terms four parameters just mentioned. It would give insight to a decision-maker for a case in point in any of the two cases below.
Branches covering South Western region of Mindanao have a high volume of deposits during the month of December. It defies the country-wide seasonal trend when this time of year withdrawals are dominate due to shopping trend for Christmas season. Upon scrutiny, South Western Mindanao is an agricultural geographical region and the clients of these branches are dominantly farmers. At this time of the year harvest has just been made and these farmers are saving their funds for the next planting season when they would need money to buy seedlings, fertilizers, etc.
On the month of June, a wave of withdrawals is expected to be made as it is the opening of school year when cash are needed to pay for tuition fees. However some branches in Metro Mania area, the amount withdrawn are significantly low. Upon scrutiny the clients of these branches, as shown by attributes of Client dimension, are well to dot who live in exclusive villages nearby. Because of their affluence, we could infer that they have adequate cash availability.
A lot of approaches can be made in designing an OLAP app to identify a potential trend in the frequency and volume of deposit or withdrawal made from branches. The Branch Performance Matrix is just one of the ways.