Data Warehouse Problems: Do You Even Need One?
For most data-driven organizations, seeing where your business is heading means understanding where you’ve been. In other words, these businesses need to perform analytics on historical performance data.
Online analytical processing, or OLAP, has been pretty much synonymous with this kind of historical analytics for at least two decades.
OLAP has proved to be a resilient and powerful framework for extracting actionable insights from historical data. But it has its limits, particularly when it comes to the data warehousing strategies that have traditionally supported OLAP.
In part one of this two-part report, we’ll discuss some of the limits of traditional OLAP data warehouses. Part two explains how emerging technologies are enabling new kinds of historical and real-time analytics.
Here’s what we’ll cover in part one:
What Is an OLAP Data Warehouse?
OLAP refers to a category of analytics tools that allow users to view data multi-dimensionally, i.e., as values organized in a hierarchy of attributes such as customer name, sales region, date of sale, time of sale etc.
These attributes are generally termed “dimensions,” and the values they categorize are known as “measures” or “facts.”
Generally, facts are derived from transactional or online transaction processing (OLTP) databases that support mission-critical operations.
For instance, a bank needs to store data somewhere while a customer is interacting with, say, a mobile app. The OLTP database is where the app reads data from and writes data to. Transactional data from the OLTP database is then loaded into a data warehouse for storage and analysis.
Traditional data warehouses enable OLAP by organizing arrays of facts in “data cubes,” the geometric dimensions of which correspond to the attributes of the facts that the business wants to track.
For instance, the following data cube shows dimensions for analytics on data from a department store, such as the specific branch within the store (cosmetics, men’s clothing etc.), salesperson name, brand, items sold etc. The numerical values represent the measures:
Once data is in the cube, users can then “roll up” measures, or aggregate them to reveal overall trends, slice out a data subset such as sales in branch #1, dice it by viewing it from various perspectives and drill down to more granular levels of detail in the cube.
For a fuller explanation of OLAP functionality and the ways in which OLAP differs from OLTP, see our buyer’s guide.
Essentially, OLTP databases are for operations, no matter whether you’re feeding data to a mobile app or updating a customer’s account balance, while OLAP data warehouses support historical analytics.
Peter Vescuso, chief marketing officer at VoltDB (an in-memory, operational database management system that we’ll discuss in part two of this report), explains that: “OLTP applications do include analytics, but it’s a different kind of analytics—it’s not the historical analytics or statistical number-crunching that you do with OLAP, but rather operating on the live streaming transactional data.”
“Moreover, the use cases are pretty different from analytics on historical data, and tend to be around trying to gain insights and make decisions in real-time to enable consumer-facing, interactive applications (e.g., smartphone users interacting with a system to get recommendations in real time).”
Pros and Cons of Traditional Data Warehouses
Generally, the IT team handles the creation of data cubes and the management of data warehouses, which is both the primary strength and the Achilles’ heel of this strategy.
The following table shows the correlations between the benefits and the drawbacks of traditional data warehouses:
Pros and Cons of OLAP Data Warehouses
Single source of truth. By aggregating and structuring transactional data from various sources used in the business’s operations, traditional data warehouses serve as a “single source of truth” for reporting and analysis.
IT governs the warehouse. The skills needed to create a centralized data warehouse are too advanced for normal business users, which means the warehouse is going to be primarily the responsibility of the IT team. This means that business analysts frequently rely heavily on IT assistance.
Well-defined data models. Because data cubes are defined by the dimensions that the business wants to track, data modeling (i.e., defining the logical interrelationships between data in order to properly schematize and analyze it) has to be done up front, before data is loaded into the cube. This means that OLAP tools are supported by a data model defined by the business as a whole, rather than the perspective of an individual analyst.
Cubes are hard to change once defined. Data modeling in traditional data warehouses means that dimensions and drill paths need to be defined before data is loaded into the cube. If the business decides it wants to track additional dimensions, such as regions within states as well as states, data must be re-organized and re-processed, which is time-consuming and technically challenging.
Well-suited to transactional data. OLAP emerged as a way of dealing with vast quantities of historical transactional data (sales figures, procurement expenses, employee performance figures etc.), and traditional data warehouses still excel in this area.
Poorly suited to big data. Transactional data such as sales records and customer records tends to be fairly well-structured even before the data is loaded into a data warehouse. After all, it’s generally being pulled from an OLTP database. On the other hand, data sets generated by the web, the industrial internet of things, social media etc. can be too big and unstructured for traditional data warehouses.
The bottom line: Traditional data warehouses are still effective for handling analytics on transactional data for businesses with considerable IT resources. Businesses without expertise in this area, as well as businesses that need to analyze other kinds of data, should explore some of the alternatives we consider in part two.
If you’ve already decided on a traditional OLAP strategy supported by a data warehouse, you can begin shortlisting vendors that offer data warehousing capabilities and end-to-end BI platforms by using our comprehensive listing. Our site features reviews, product descriptions and pricing details for leading vendors in this space.