Math Hurts: Risks of Predictive Analytics for Small Businesses

By: Daniel Harris on November 9, 2016

After the 2016 presidential election, the limits of predictive algorithms are in the spotlight like never before.

Pollsters and statisticians gathered a wealth of data and ran sophisticated simulations of the election’s outcome. Social scientists felt very confident in their ability to predict human behavior on a massive scale, and much of the country trusted their math.

Yet the results they predicted were way off. Following election night, the country woke up to a sense that it had been deceived by big data.

So how did mathematically informed predictions based on huge data sets go so wrong?

The truth is that we’re constantly being reminded of the limits of predictive algorithms, if we only care to look. By way of example, take the weather, which human beings have been seeking to predict since civilization began:

_Weather forecasts: a familiar (and frustrating) type of predictive model (Source: GIPHY)_

How many times have you cancelled an outing on the basis of a map like this, only to see winds shift and the front change direction later in the week?

Human behavior, unsurprisingly, can be even harder to predict. This was demonstrated not only during the 2016 election, but also during the 2008-09 global recession, which was actually caused in part by—you guessed it—predictive models.

Thomas Davenport, the President’s Distinguished Professor of Information Technology and Management at Babson College, explains that mathematical models designed to predict when debtors would repay mortgage loans failed to factor in falling housing prices.

As prices fell, the models grew steadily worse at predicting debtor behavior, undermining the health of the global economy.

Despite these risks, many Fortune 500 companies are successfully using predictive models to improve margins and streamline business processes. Perhaps you’re thinking that predictive analytics can also be a shortcut to success for smaller businesses.

**In this article we’ll take a closer look at the risks of predictive analytics, including dangers identified by analysts at Gartner, a research and advisory firm that conducts in-depth research into how businesses should approach advanced technologies.**

Here’s what we’ll cover:


Is Predictive Analytics Simply Forecasting?

Risk #1: The Law of Unintended Consequences

Risk #2: The More Variables You Come Across, The More Problems You See

Risk #3: How DIY Do You Like to Get With Your IT Projects?

If You Still Want to Experiment

Is Predictive Analytics Simply Forecasting?

Most analytics look at historical data, i.e., data about your business’s past performance (customer retention, sales, supplier pricing, etc.). The reason you’re looking at this data is to understand what your business should or shouldn’t do in the future.

In other words, analytics usually has a predictive aspect to it.

This cuts against the grain of standard accounts of business analytics, which break the discipline down into the following categories:

The Basic Types of Business Analytics

[Diagram: the four basic types of business analytics: descriptive, diagnostic, predictive and prescriptive]

This diagram is based on definitions found in Extend Your Portfolio of Analytics Capabilities by Lisa Kart, Alexander Linden and W. Roy Schulte

While this taxonomy is certainly a useful way for beginners to understand analytics, it doesn’t get at why businesses perform them.

As the authors of the report note, “predictive analytics are often a natural extension of experience with descriptive and diagnostic analytics, driven by curiosity about the future and whether observed trends will continue.”

By way of example, take the trend lines in the following dashboard, which is typical for a business intelligence (BI) tool:

Sales dashboard in SAP Lumira

Trend lines are a form of descriptive analytics, since they’re based on historical data. They matter because we assume that the trends will continue into the future. In other words, if we’re concerned about sales trending downward, it’s because we’re predicting that the decline will continue.
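To make that assumption concrete, here is a minimal sketch of what "extending a trend line" amounts to: fit a straight line to past values, then read the line off at future time periods. The sales figures and the function names `fit_trend` and `extrapolate` are invented for illustration; a BI tool may well use a different fitting method.

```python
def fit_trend(values):
    """Least-squares straight line through the points (0, v0), (1, v1), ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate(values, periods_ahead):
    """Extend the fitted line past the last observed period."""
    slope, intercept = fit_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(periods_ahead)]


# Hypothetical monthly sales (in thousands); made-up numbers, not real data.
monthly_sales = [120, 115, 111, 104, 100, 95]

slope, intercept = fit_trend(monthly_sales)
forecast = extrapolate(monthly_sales, 3)

print(f"Trend: {slope:.2f} per month")
print("Next three months if the trend simply continues:",
      [round(v, 1) for v in forecast])
```

The forecast is only as good as the assumption behind it: the code has no way of knowing whether the decline will actually continue.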

But this kind of prediction doesn’t pose the risks that we discussed in the introduction. Businesses have been looking at trend lines for quite some time without any issue.

So what’s special about predictive analytics, and why does it pose so many risks?

The answer is that predictive analytics goes far beyond simply plotting trends and making informed guesses about whether these trends will continue. It wades into the swampier region of data mining, which is machine-automated recognition of patterns in massive data sets.

As Bill Hostmann explains in Seek Information Patterns With Data Mining and Predictive Analytics (content available to Gartner clients):

“Data mining can generally be categorized as descriptive data mining or predictive data mining, also referred to in this report as predictive analytics. Descriptive data mining tasks characterize the general properties/attributes and relationships within a set of information. Predictive analytics performs inference on the current information sets to create models to be used to make predictions on future information sets.”

The risks of predictive analytics are hidden in little words such as “data mining” and “models.” You’re not simply guessing that a line is going to continue to curve upwards instead of dropping off a cliff in Q4. Now, you’re making guesses about the line using sophisticated algorithmic models.

Let’s move on to examining some of the risks of using such models in business contexts.

Risk #1: The Law of Unintended Consequences

Ever heard of a “flash crash”? These are stock market crashes that involve dramatic losses over a very brief period of time—say 30 minutes.

In one of the most infamous flash crashes, which happened on May 6, 2010, the Dow Jones Industrial Average plunged nearly 1,000 points within minutes before recovering most of the loss. At the time, no one even knew whom to blame.

Recent research, such as the 2014 paper by Tommi A. Vuorenmaa of Triangle Intelligence and Liang Wang of the University of Cambridge, reveals how the event was driven by high-frequency traders (firms that use predictive algorithms to automate the buying and selling of stocks at very high speeds).

According to Vuorenmaa and Wang, when a major firm sold an abnormally large number of futures contracts over a short period of time, a “feedback loop” arose between high-frequency traders. Algorithms blindly passed the same “hot potato” shares back and forth between high-frequency trading firms until the whole stock market had been severely disrupted.

_Feedback loop created by predictive algorithms, leading to stock market crash (Network view of the market during the simulated flash crash by Tommi A. Vuorenmaa and Liang Wang is licensed under CC BY 3.0)_

While these high-frequency trading firms hopefully understood how their algorithms worked in theory, real-world market conditions are far more complex than mathematical models.

Thus, when the algorithms were let loose in the wild, they didn’t just hurt the firms that were trying to wield them; they temporarily undermined the entire stock market.

Predictive model usage by a small business probably isn’t going to cause this kind of domino effect. But large firms can hedge their bets by employing data scientists who run sophisticated simulations to anticipate at least some of the unintended consequences.

Does your small business have data scientists on payroll? If not, there’s always the chance that you might identify a market opportunity that turns out to be illusory (after you’ve invested in it).

Additionally, you may alienate customers by misunderstanding their behaviors, or interfere with supply chain operations and other crucial business processes.

Risk #2: The More Variables You Come Across, The More Problems You See

Remember when you were learning algebra? You started by solving fairly simple equations, and gradually added variables. The more variables you added to the equations, the harder they became to solve.

Mathematical models are essentially equations. The more variables you have in your model, the tougher it becomes to understand and use the model.

On the other hand, using a model that’s too simple has its own dangers, as we saw in the example of predictive models designed to forecast mortgage repayment that left out a crucial variable (fluctuations in housing prices).
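The missing-variable problem can be illustrated with a toy sketch. Everything here is assumed for illustration: the rule in `true_default_rate`, its coefficients, and the debt ratios are invented, and they are not the actual mortgage models Davenport describes. A model is fitted only on borrowers' debt ratios during years when house prices rose steadily, so it never sees what price changes do to repayment.

```python
def true_default_rate(debt_ratio, price_change):
    """Hypothetical 'real world' rule, invented for this example:
    defaults rise with debt and fall when house prices rise."""
    return 0.02 + 0.10 * debt_ratio - 0.50 * price_change


def fit_line(xs, ys):
    """Least-squares straight line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


debt_ratios = [0.2, 0.3, 0.4, 0.5, 0.6]

# Training data: boom years, with prices rising 5% in every observation.
boom_rates = [true_default_rate(d, 0.05) for d in debt_ratios]
slope, intercept = fit_line(debt_ratios, boom_rates)


def predict(debt_ratio):
    """The too-simple model: it only knows about debt ratio."""
    return intercept + slope * debt_ratio


# In the boom, the simple model looks essentially perfect.
boom_error = max(abs(predict(d) - r) for d, r in zip(debt_ratios, boom_rates))

# Then prices fall 10%, a situation the model has never seen.
bust_rates = [true_default_rate(d, -0.10) for d in debt_ratios]
bust_error = max(abs(predict(d) - r) for d, r in zip(debt_ratios, bust_rates))

print(f"Worst error while prices rise: {boom_error:.4f}")
print(f"Worst error once prices fall:  {bust_error:.4f}")
```

Because house prices never varied in the training data, nothing in the fitting step could reveal that they mattered. The model's track record looked flawless right up until conditions changed.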

This is why you really need a data scientist who can use statistical techniques to identify the model that best fits the business problem and further tweak the model when problems occur. Don’t have one? Then predictive modeling may not be for you.

Moreover, even if you come up with a working model (i.e., one that makes money after you put it into production), you still need an account of why it works.

Too often, business leaders are focused only on results. If the predictive model starts generating revenue after a few experiments, it’s counted as a success. The attitude is frequently “it just works,” which doesn’t account for the possibility that one day, some unforeseeable event will cause the model to fail spectacularly.

Risk #3: How DIY Do You Like to Get With Your IT Projects?

In the Hype Cycle for Data Science, 2016, by Peter Krensky, Alexander Linden and Jim Hare (content available to Gartner clients), Gartner notes that “Predictive analytics can be quite easy to use if delivered via a packaged application.”

Packaged predictive analytics modules found in larger software suites are the safest and easiest form of the technology for small businesses to work with.

Many of the rough edges of the models have been sanded away over years of experimentation, and the more complex elements of the models are shielded from user interaction.

However, Gartner goes on to note that “packaged applications do not exist for every analytics use case. Packaged applications may also often not provide enough agility or competitive differentiation. In these situations, organizations are advised to build solutions either through an external service provider (see Market Guide for Advanced Analytics Service Providers) or with typically highly skilled in-house staff using an advanced analytics platform (see Magic Quadrant for Advanced Analytics Platforms).”

In other words, if you’re looking to use predictive analytics for a specific purpose, instead of simply relying on packaged analytics modules in a larger software suite, your small business is out of luck. You don’t have the money to hire the in-house staff or pay a fancy consulting firm to build the analytics for you.

If You Still Want to Experiment

Predictive analytics usage is undoubtedly on the rise in the enterprise. We have three pieces of advice for the more adventurous small businesses out there:

1. Buy prepackaged. Predictive analytics are embedded in all types of software. The risks we’ve discussed throughout this report still apply to prepackaged analytics, but they’re not as great as with custom applications.

Highly focused predictive models that are commonly embedded in software—such as the forecasting algorithms widely used in scheduling call center agents—are generally the safest for small businesses to use.

2. Start small. In “User Experiences With Predictive Analytics Yield Useful Best Practices” (content available to Gartner clients), Nigel Rayner notes that “all organizations that had implemented predictive analytics started with a limited scope project and expanded that as users gained confidence in the predictive models.”

Limiting the scope of the project won’t just help keep the costs associated with building the analytics down, but will also contain the business impact of the model if it backfires.

3. Use help boxes and wizards. Software vendors understand that predictive analytics are complex. Thus, many BI and data science platforms include dialogue boxes to provide further details on algorithmic models. These boxes are intended to help users better understand how to use models as well as the limits of specific models. Some platforms can even recommend appropriate models automatically via wizards that help build data mining workflows.

While such dialogue boxes can help reduce risks, bear in mind that the “novice users” they’re aimed at are seasoned analysts, rather than complete beginners.