Find the best Data Discovery Tools

Tableau

Tableau is an integrated business intelligence (BI) and analytics solution that helps to analyze key business data and generate meaningful insights. The solution helps businesses to collect data from multiple source points such as...Read more about Tableau

Microsoft Power BI

Microsoft Power BI is a web-based business analytics and data visualization platform that is suitable for businesses of all sizes. It monitors important organizational data, as well as data from all apps used by the organization. Microsoft P...Read more about Microsoft Power BI

Qlik Sense

Qlik Sense is a business intelligence (BI) and visual analytics platform that supports a range of analytic use cases. Built on Qlik’s unique Associative Engine, it supports a full range of users and use-cases across the life-cycle...Read more about Qlik Sense

Looker

Looker, now part of Google Cloud, is a cloud-based business intelligence (BI) platform designed to explore and analyze data. The solution helps businesses to capture and analyze data from multiple sources and make data-driven deci...Read more about Looker

Software Advice is free because vendors pay us when they receive sales opportunities.
This allows us to provide comprehensive software lists and an advisor service at no cost to you.

LiquidText

LiquidText is a note-taking solution that helps businesses collate ideas, create note relationships, handle search processes, and more from within a unified platform. It allows staff members to share notes with other team members and...Read more about LiquidText

Wolfram Mathematica

Wolfram Mathematica is a technical computing solution that provides businesses of all sizes with tools for image processing, data visualization and theoretic experiments. The notebook interface enables users to organize documents ...Read more about Wolfram Mathematica

SAP Crystal Reports

SAP Crystal Reports is a business intelligence (BI) on-premise solution for Windows that is suitable for small to midsize businesses across multiple industries. It combines the reporting capabilities of SAP Crystal Reports with th...Read more about SAP Crystal Reports

Kommo

Kommo is a multifunctional CRM that excels at taking the conversation with your customers to the next level. With messengers, the connection is personal. All major messenger platforms are supported. You can create your own chatbot...Read more about Kommo

Safetica

Safetica provides DLP solutions to secure sensitive data and be compliant with regulations. Customers can choose from on-prem (Safetica ONE) and cloud-native (Safetica NXT) solutions. Safetica NXT (cloud-native) Safetica NXT is a...Read more about Safetica

Phocas Software

Phocas is a team of passionate professionals who are committed to helping people feel good about their data. Our software brings together organizations’ most useful data from an ERP and other business systems and presents it in a ...Read more about Phocas Software

SyncSpider

SyncSpider is an application-to-application integration tool designed to help eCommerce businesses grow revenue using multichannel sales automation. It helps manage stock in a centralized place, connect with eCommerce tools to syn...Read more about SyncSpider

SAP Analytics Cloud

SAP Analytics Cloud is a business intelligence and data visualization solution designed for businesses of all sizes. It offers business planning, predictive analytics, digital boardroom and reporting functionalities within a suite...Read more about SAP Analytics Cloud

Sigma Computing

Sigma is an award-winning modern business intelligence (BI) and analytics platform purpose-built for the cloud. With Sigma, anyone can use the spreadsheet functions and formulas they already know to explore live data at cloud scal...Read more about Sigma Computing

epocrates

Epocrates is the essential clinical reference and decision support tool for healthcare professionals. Users can get access to thousands of drugs and their side effects, dosing information, and patient-facing content. Epocrates al...Read more about epocrates

Elastic Stack

Built on a foundation of free and open, Elasticsearch, Logstash, Kibana, and Beats pave the way for diverse use cases that start with logging and span as far as your imagination takes you. Elastic features like machine learning, s...Read more about Elastic Stack

Spotfire

Spotfire provides executive dashboards, data analytics, data visualization and KPI push to mobile devices. It complements existing business intelligence and reporting tools, while midsize organizations can use dashboards and analy...Read more about Spotfire

SAS Visual Analytics

SAS Visual Analytics is our flagship offering for self-service data preparation, visual discovery, interactive reporting, and dashboards--as well as easy-to-use analytics--with governance. SAS Visual Analytics allows non-technical...Read more about SAS Visual Analytics

JMP

JMP is an on-premise data analytics solution that helps scientists, engineers and data explorers understand complex data relationships and visualize them via interactive dashboards. The data acquisition and cleanup functionalities...Read more about JMP

FindNiche

FindNiche is a niche analysis tool for AliExpress and Shopify. It can help you find information about your competitors, and it can also inspire product ideas in your niche. Its power lies in the selection of big data. ...Read more about FindNiche

Pentaho

Pentaho is a business intelligence system designed to help companies make data-driven decisions, with a platform for data integration and analytics. The platform includes extract, transform, and load (ETL), big data analytics, vis...Read more about Pentaho

Buyers Guide

Last Updated: March 16, 2023

What is data discovery software?

Data discovery software is a tool that helps you to collect and combine data from multiple sources and identify patterns and trends in them. Data preparation, data modeling, visual analysis, and advanced statistical analysis are the key functions of data discovery software. Data discovery tools are primarily available as a part of business intelligence software solutions.


Data discovery is one of the fastest-growing and most rapidly changing segments of the BI market. These tools differ dramatically from the traditional systems of record that enable IT to push reports and dashboards out to the rest of the organization.

In many cases, data discovery tools are purchased by organizations that have already deployed traditional BI systems, in order to solve issues with data access, data preparation and data exploration. Data discovery solutions have also been a godsend for small businesses that can’t afford complex data warehouses and lack the expertise to build them.

The market for data discovery software is complex and highly fragmented. There are a number of different “flavors” of data discovery, and a variety of use cases in which one flavor works better than another.

In this Buyer’s Guide, we’ll explain how data discovery software differs from traditional BI and describe the categories into which these tools break down.

Here’s what we’ll cover:

How Do Data Discovery Tools Differ From Traditional BI Systems?

Capabilities of Data Discovery Software

Types of Data Discovery Tools

How Do Data Discovery Tools Differ From Traditional BI Systems?

An easy way to understand this difference is to look at the history of BI solutions.

Traditional BI systems were an attempt to solve the difficulty of writing SQL queries in order to retrieve data (sales information, customer information, shipping records etc.) stored in multiple relational databases. Before BI, users had to be highly familiar with SQL to get the data they needed out of such databases.

Thus, traditional BI systems mapped a layer of familiar business terms (known as a semantic layer) onto the relational databases’ storage schemas, thereby allowing users to retrieve and combine data without knowing SQL at all.

Traditional BI Semantic Layer

The semantic layer is a way of expressing a data model, or a schematic representation of the relationships between data in one or multiple datasets. In particular, the semantic layer schematizes the relationships between data residing in different data sources/databases. For instance, the dimension “customer” in the semantic layer may be defined as grouping together information from both the “sales orders” database as well as the “customer records” database.

BusinessObjects—later acquired by SAP—was the first BI vendor to use the semantic layer model, and remains one of the most popular semantic layer-based solutions. The semantic layer model is still suitable for large enterprises that need unified access to data stored in numerous operational databases.

The problem with this model is that the semantic layer needs to be standardized across the organization. In other words, various business units must agree on which databases and tables in these databases the dimension “customer” will pull from. Moreover, once the semantic layer has been standardized, it remains under IT control.

As you can see in the above diagram, traditional tools for ad hoc queries pass analysts’ queries through the semantic layer, which automatically translates them into SQL queries to retrieve data from SQL databases and other data sources that support SQL querying. Thus, traditional querying tools can only work with data sources that have already been integrated into the semantic layer.
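The translation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular BI product works: the `SEMANTIC_LAYER` mapping, the `JOIN_CLAUSE`, the table and column names, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical semantic layer: business terms mapped to table.column pairs.
SEMANTIC_LAYER = {
    "customer": "customer_records.name",
    "region": "customer_records.region",
    "order total": "sales_orders.amount",
}

# Hypothetical join path that IT has standardized between the two tables.
JOIN_CLAUSE = (
    "sales_orders JOIN customer_records "
    "ON sales_orders.customer_id = customer_records.customer_id"
)


def translate(dimension: str, measure: str) -> str:
    """Turn a business-term request into a SQL query string."""
    try:
        dim_col = SEMANTIC_LAYER[dimension]
        measure_col = SEMANTIC_LAYER[measure]
    except KeyError as missing:
        # Terms outside the layer cannot be queried at all.
        raise ValueError(f"{missing} is not defined in the semantic layer") from None
    return (
        f"SELECT {dim_col}, SUM({measure_col}) "
        f"FROM {JOIN_CLAUSE} GROUP BY {dim_col} ORDER BY {dim_col}"
    )


def run_demo() -> list:
    """Run one translated query against made-up in-memory tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer_records (customer_id INTEGER, name TEXT, region TEXT);
        CREATE TABLE sales_orders (customer_id INTEGER, amount REAL);
        INSERT INTO customer_records VALUES (1, 'Acme', 'East'), (2, 'Globex', 'West');
        INSERT INTO sales_orders VALUES (1, 100.0), (1, 50.0), (2, 75.0);
        """
    )
    rows = conn.execute(translate("customer", "order total")).fetchall()
    conn.close()
    return rows
```

The analyst asks for "order total" by "customer" and never sees SQL. A request for a term that is missing from the mapping fails, which mirrors the limitation that only integrated sources can be queried.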

Data sources outside the semantic layer (a spreadsheet sent in an email, a public data source on the web, 500,000 Tweets about a product recall etc.) can’t be easily integrated with the semantic layer unless IT develops new processes. And, of course, IT can’t develop a process for every new data source.

When the semantic layer is standardized across the organization, the paths that analysts follow to retrieve and combine data get frozen into place. For instance, if the organization defines “store” as a subcategory of “branch,” and “branch” as a subcategory of “sales region,” while neglecting to slot “customer” somewhere into this hierarchy, blended analysis of sales and customer data can become overly complex.

Business terms mapped to operational data in SAP BusinessObjects

Data discovery tools remedy this situation by providing direct access to the operational databases shown in our chart, instead of going through a semantic layer. This allows users to combine spreadsheets and other data sources outside the semantic layer with operational data.

Any data preparation work that needs to be done to combine data sources (e.g., converting “customer_ID” to “customer”) is done on the fly, instead of forcing IT to standardize terminology across the organization.

Additionally, users can develop their own data models during analysis, instead of being bound to the data model encoded in the semantic layer. This allows greater flexibility for sophisticated queries that depend on blending data from multiple sources.
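As a rough sketch of what this on-the-fly preparation and blending might look like, the Python below renames a spreadsheet column while loading it and then joins it to operational rows in memory. The spreadsheet contents, the `SALES_ROWS` data, and the helper names are made up for illustration.

```python
import csv
import io

# Made-up spreadsheet export; its key column is named "customer_ID".
SPREADSHEET = """customer_ID,satisfaction
Acme,4.5
Globex,3.8
"""

# Made-up rows standing in for an operational sales database.
SALES_ROWS = [
    {"customer": "Acme", "amount": 100.0},
    {"customer": "Acme", "amount": 50.0},
    {"customer": "Globex", "amount": 75.0},
]


def load_spreadsheet(text: str, renames: dict) -> list:
    """Read CSV text and rename columns while loading."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {renames.get(col, col): value for col, value in row.items()}
        for row in reader
    ]


def blend(sales: list, survey: list) -> dict:
    """Total sales per customer, joined to the survey score on "customer"."""
    totals = {}
    for row in sales:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    scores = {row["customer"]: float(row["satisfaction"]) for row in survey}
    return {
        name: {"sales": total, "satisfaction": scores.get(name)}
        for name, total in totals.items()
    }


# The rename happens at load time; no organization-wide standard is needed.
survey_rows = load_spreadsheet(SPREADSHEET, {"customer_ID": "customer"})
blended = blend(SALES_ROWS, survey_rows)
```

Here the analyst, not IT, decides that "customer_ID" and "customer" refer to the same thing, and the resulting data model exists only for this analysis.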

Capabilities of Data Discovery Software

There’s a wide range of data discovery platforms, meaning that listing specific features is pointless. Instead, let’s take a quick look at the broad capabilities that define these solutions:

(Graphical) front end for data manipulation

Allows for data access and manipulation via visualizations of data sources and patterns in data. Instead of writing a query, you can simply click on a wedge of a pie chart to drill down, or choose a heat-map visualization for your data.

In-memory processing

Processes data by storing it in RAM (random access memory) instead of writing it to disk. This gives data discovery tools the processing power to blend massive data sets on a user’s laptop, instead of doing the blends in the database as traditional BI tools do. See our data blending report for more details.

Big data connections

Supports direct connections to data sources, instead of confining access to sources within the semantic layer. Support for flat files (.xlsx, .csv etc.) is nearly universal, as is support for SQL databases. Beyond that, the range of data sources a tool can connect to is generally a point of competitive differentiation.

Data cleaning/preparation

Offers features for cleaning and preparing data, since analysts can’t rely on pre-integration of data sources via a semantic layer. These features are for normalizing dimensions, removing trailing spaces, testing the accuracy of joins etc. on the fly.
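A minimal sketch of two of these preparation steps, written in Python with made-up keys: normalizing a join key (trimming whitespace and unifying case) and then checking how many records actually match. The function names are hypothetical and not taken from any product.

```python
def clean_key(value: str) -> str:
    """Strip surrounding whitespace and normalize case for a join key."""
    return value.strip().lower()


def join_report(left_keys: list, right_keys: list) -> dict:
    """Count how many left-side keys find a match on the right side."""
    right = {clean_key(k) for k in right_keys}
    cleaned = [clean_key(k) for k in left_keys]
    matched = sum(1 for k in cleaned if k in right)
    return {
        "matched": matched,
        "unmatched": sorted({k for k in cleaned if k not in right}),
        "match_rate": matched / len(cleaned) if cleaned else 0.0,
    }


# Made-up keys: trailing spaces and inconsistent case would break a naive join.
orders = ["Acme ", "GLOBEX", "Initech", "acme"]
customers = ["acme", "Globex  "]

naive_matches = sum(1 for k in orders if k in customers)
report = join_report(orders, customers)
```

Comparing `naive_matches` with `report["matched"]` shows how many rows a join would silently drop without cleaning, and `report["unmatched"]` lists the keys that still need attention.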

Note: Several of these definitions of data discovery capabilities were adapted from Gartner research reports, specifically What Data Discovery Means for You by Joao Tapadinhas and Dan Sommer (available to Gartner clients).

Types of Data Discovery Tools

Data discovery has been an emerging market for at least a decade, but instead of solidifying around a core set of concepts and features, the market has continued to evolve.

Data discovery functionality has also been added to traditional systems that use semantic layers, though such systems will still be overkill for many small businesses.

There are essentially three categories of data discovery solutions currently on the market:

  • “Search engine”-like tools for textual searches of data

  • Visual interaction tools that provide a graphical front-end for data manipulation

  • “AI”-based tools that do the bulk of the pattern recognition for you

Visual data interaction tools are analytics tools that directly access data sources instead of going through a semantic layer. They allow users to process massive datasets on their laptops (via in-memory caching engines) and spot patterns using a visual interface.

Data visualizations in Tableau

The point of a visual data discovery tool isn’t simply to crunch numbers and then output pretty charts and graphs, which can easily be done with Excel and PowerPoint. Instead, these tools are for interactive manipulation of data via visualizations.

For example, you can click on a particular city in a heat-map to begin analyzing sales just within that city’s stores. You can then add another dimension to your map—say, aggregate payroll expenses per store—to blend sales and payroll data and spot new patterns.

As you click on visualization elements and drag and drop dimensions and measures into your visualizations, an engine within the data discovery tool translates your gestures into SQL queries. Changing the visualization automatically refreshes it with newly processed data from your databases.

These tools thus allow for highly interactive and sophisticated database querying without forcing users to learn SQL. Moreover, they allow users to access and blend data from multiple data sources that haven’t been integrated via a semantic layer.
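To make the gesture-to-SQL idea concrete, here is a hedged Python sketch. It assumes the state of a visualization can be reduced to a list of dimensions, one measure, and a set of filters. The `build_query` function, the `sales` table, and its rows are hypothetical, and real tools generate far more elaborate queries.

```python
import sqlite3


def build_query(table: str, dimensions: list, measure: str, filters: dict):
    """Build a parameterized GROUP BY query from a visualization's state.

    Identifiers are interpolated directly, so this sketch assumes they come
    from a trusted list of known columns, not from free-form user input.
    """
    dims = ", ".join(dimensions)
    sql = f"SELECT {dims}, SUM({measure}) FROM {table}"
    params = []
    if filters:
        clauses = []
        for column, value in filters.items():
            clauses.append(f"{column} = ?")
            params.append(value)
        sql += " WHERE " + " AND ".join(clauses)
    sql += f" GROUP BY {dims} ORDER BY {dims}"
    return sql, params


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (city TEXT, store TEXT, revenue REAL);
    INSERT INTO sales VALUES
        ('Austin', 'Store A', 120.0),
        ('Austin', 'Store B', 80.0),
        ('Austin', 'Store A', 30.0),
        ('Dallas', 'Store C', 200.0);
    """
)

# Initial view: revenue by city.
by_city = conn.execute(*build_query("sales", ["city"], "revenue", {})).fetchall()

# "Clicking" Austin on the map adds a filter and drills down to stores.
drill_sql, drill_params = build_query("sales", ["store"], "revenue", {"city": "Austin"})
austin_stores = conn.execute(drill_sql, drill_params).fetchall()
```

Each click or drag changes only the arguments passed to `build_query`. The user never edits the SQL, and the chart is refreshed from whatever rows the new query returns.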

Visual data interaction tools are thus known as “self-service” BI tools, since business analysts can get the data they need and analyze it in the ways they want without involving IT in the workflow.

Originally, visual data interaction tools were designed to supplement the capabilities of an existing BI system. As they’ve evolved, however, they’ve incorporated more and more of the capabilities that used to be found only in traditional systems. Many organizations—especially smaller ones—are now exclusively relying on this form of data discovery as their dominant analytics platform.

Visual data interaction tools make up the bulk of the data discovery market, and frequently data discovery is used as a synonym for business analytics via interactive visualizations.

“Search engine-like” tools are a niche category in data discovery. They’re specifically for performing keyword searches of large collections of files, and they feature an interface similar to that of web search engines such as Google and Bing. Search-based tools harness text mining technology to allow users to search keywords within files and documents:

Data discovery using keyword searches and word clouds in WebFOCUS
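The core of a keyword search over a document collection can be sketched with an inverted index, as in the Python below. This is a simplified stand-in for the text mining technology such tools use, not a description of WebFOCUS or any other product. The documents and function names are made up.

```python
import re
from collections import Counter, defaultdict

# Made-up documents standing in for files scattered across data silos.
DOCUMENTS = {
    "survey_q3.txt": "Customers praised delivery speed but disliked the packaging.",
    "recall_memo.txt": "Product recall notice: packaging defect found in batch 12.",
    "pitch_deck.txt": "Our delivery network covers twelve regional hubs.",
}


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents: dict) -> dict:
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index: dict, query: str) -> list:
    """Return documents containing every keyword in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


def word_counts(documents: dict) -> Counter:
    """Count word frequencies, the raw input for a word cloud."""
    return Counter(w for text in documents.values() for w in tokenize(text))


index = build_index(DOCUMENTS)
```

Calling `search(index, "packaging recall")` returns only the documents that contain both words, and `word_counts(DOCUMENTS)` yields the frequencies a word cloud would be drawn from.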

Search-based tools are clearly not the best choice for dealing with numerical values, which are, of course, absolutely central to business analysis. Instead, this form of data discovery is used by organizations with massive collections of unstructured textual data (surveys, documents, presentations, product literature etc.) sitting in numerous data silos.

Without search-based data discovery, employees may never be able to track down the documents they need on their own. These tools thus enable better information-sharing, at the same time cutting down on the time that information “gatekeepers” have to spend tracking down documents for co-workers. Most small businesses won’t need them.

“AI”-based tools. Visual data interaction tools can be used to support pattern recognition via machine learning (or “AI” in layman’s terms). Generally this requires integration with a variety of other tools and technologies, ranging from the statistical programming language R to Apache Spark (a cluster-computing framework often used to run machine-learning algorithms at scale).

“AI-based” data discovery tools directly leverage machine learning to spot patterns for users, instead of enabling users to spot patterns themselves through visual analysis. These tools then output visualizations and can even express the patterns they find in narrative form for users (for example, they can output a sentence stating “Q4 revenue down 2.1 percent in Kentucky branch stores served by X, Y and Z distributors.”).
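The narrative output step can be illustrated with a deliberately simple Python sketch. Note the substitution: a fixed percentage threshold stands in for the machine learning that real tools apply, so this shows only how a detected change might be turned into a sentence. The revenue figures and the `describe_changes` function are made up for illustration.

```python
def describe_changes(previous: dict, current: dict, threshold_pct: float = 2.0) -> list:
    """Return one sentence per region whose revenue moved beyond the threshold."""
    sentences = []
    for region in sorted(current):
        before = previous.get(region)
        if not before:
            continue  # Skip regions with no usable prior-period baseline.
        change_pct = (current[region] - before) / before * 100
        if abs(change_pct) >= threshold_pct:
            direction = "up" if change_pct > 0 else "down"
            sentences.append(
                f"Revenue {direction} {abs(change_pct):.1f} percent in {region}."
            )
    return sentences


# Made-up quarterly revenue figures, for illustration only.
q3 = {"Kentucky": 1000.0, "Ohio": 2000.0, "Texas": 1500.0}
q4 = {"Kentucky": 979.0, "Ohio": 2010.0, "Texas": 1650.0}
findings = describe_changes(q3, q4)
```

Small movements, such as the change for Ohio here, fall under the threshold and are not reported. An analyst would still need to judge whether the reported changes are meaningful.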

Don’t assume that a HAL 9000 will replace your analysts anytime soon, however. Human beings still need to vet the patterns to make sure that they’re truly significant, and once a pattern has been spotted, users can continue to refine the analysis by asking new questions of the tool, similar to the workflow in a visual data interaction tool.

Examples of “AI”-based data discovery tools include IBM Watson and Salesforce BeyondCore. This is still an emerging market, and while promising, these solutions are too expensive and technologically immature for SMB users at present. Most SMBs will be better served exploring the wide range of visual data interaction tools on the market.

Note: Several of these definitions of categories in the data discovery market were adapted from Gartner research reports, specifically What Data Discovery Means for You by Joao Tapadinhas and Dan Sommer (available to Gartner clients).