AI is adept at many tasks, but reading social cues isn’t always one of them. It’s notoriously bad at understanding nuance, which leads to misguided, if funny, failures.
That’s because it’s hard to teach machines what people really mean. Until someone builds formulas for sarcasm detection, training AI systems to detect emotions remains a mighty task. Luckily, advancements in text analysis are moving us in the right direction—a 2017 Gartner survey found that 79% of respondents already use or expect to use text analytics (full content available to Gartner clients).
What is text analysis?
Text analysis is the process of finding information from text sources, including emails and survey answers. These sources are unstructured data—that is, any data that’s not stored in a fixed format.
Text analysis involves reading unstructured data from a range of sources to find business insights. It can automate work your colleagues currently do by hand, which delivers faster results. In businesses, text analysis often takes one of five key forms:
- Summarization: Trying to find key content across either a range of sources or a single document
- Sentiment analysis: Assessing the tone, intent, and social context that’s relevant to a document
- Explicative: Finding the reason behind the sentiment detected in a given document
- Investigative: Reviewing the sources of a specific issue
- Classification: Confirming the subject(s) that a text source discusses
As mentioned above, text-based data is unstructured by nature. Unlike databases and log files, which follow a strict model and are therefore structured, text data has no predefined fields or numerical values. That makes it much tougher to find the insights you need manually.
So, text analysis software that finds the data for you within unstructured sources is a huge value-add. Technology that can find the key content you need, then investigate and summarize it, saves hours of manual labor.
And since text analysis captures sentiment, you can use it for a range of business needs, from modeling intent to expediting group decisions.
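To make the sentiment analysis form above concrete, here is a minimal sketch of a word-list approach in Python. The word lists and the function names `sentiment_score` and `label` are hypothetical and chosen only for illustration; real tools rely on far larger vocabularies or trained models.

```python
import re

# Hypothetical word lists, chosen only for illustration.
POSITIVE = {"great", "love", "helpful", "fast", "happy"}
NEGATIVE = {"slow", "broken", "hate", "confusing", "late"}


def sentiment_score(text: str) -> int:
    """Return the count of positive words minus the count of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(text: str) -> str:
    """Map the raw score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label("I love the fast support"))        # positive
print(label("The app is slow and confusing"))  # negative
print(label("Oh great, it is broken again"))   # neutral
```

The last example shows why nuance is hard: a sarcastic “great” cancels out “broken,” so this simple counter calls a clearly unhappy comment neutral.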
To start your search, here are four free and open source text analysis tools. To earn a spot on this list, each tool’s source code must be freely available for anyone to use, edit, copy, and/or share. You can learn more about how we chose which tools to include in our methodology below:
4 Free and Open Source Text Analysis Tools
Aylien
Best for: Businesses that want a text analysis API for Google Sheets.
Aylien text analysis is a cloud-based business intelligence (BI) tool that helps teams label documents, track issues, analyze data, and maintain models. It also allows users to extract meaning from content within public datasets. (Available on a monthly subscription.)
Aylien’s Text Analysis API integrates with tools including Google Sheets. Although Aylien discontinued its Google Sheets extension, you can still paste a pre-written script into Google Sheets and call the Text Analysis API without writing any code. Once you’ve pasted the script into the Script Editor on the Tools tab, you can save it and use the script in as many cells as you’d like.
Aylien’s text analysis extension offers language detection, hashtag suggestions, sentiment analysis, and more. It supports raw text and URL as inputs, and can remove ads to extract only the main text on a website.
Keatext
Best for: Medium to large companies that want to analyze customer sentiment in English and French.
Keatext analyzes large amounts of unstructured data collected from several sources. You can share your data with Keatext team members, who upload it to the platform on your behalf. Worried about losing access to the data you hand over? Fear not: Keatext gives you access to the platform for four weeks, and you can download reports for future use.
Keatext is ideal for teams that want to analyze sentiment without setting up and maintaining a new developer environment. The tool prides itself on grouping customer feedback into one of four buckets: Praise, Problems, Suggestions, and Questions. It also alerts users to changes in sentiment, including sentiment toward any new actions you’ve taken.
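To picture what sorting feedback into those four buckets means, here is a rough keyword-based sketch in Python. It is not how Keatext works internally; the cue words, the `bucket_feedback` function, and the “Unsorted” fallback are all assumptions made for illustration.

```python
import re

# Hypothetical cue words per bucket; a real tool would learn these from data.
BUCKET_CUES = {
    "Questions": {"how", "why", "when", "where", "can"},
    "Suggestions": {"should", "could", "wish", "add", "please"},
    "Problems": {"broken", "crash", "error", "slow", "bug"},
    "Praise": {"love", "great", "thanks", "excellent", "helpful"},
}


def bucket_feedback(comment: str) -> str:
    """Assign a comment to the bucket whose cue words it matches most often."""
    # A trailing question mark is treated as the strongest signal.
    if comment.strip().endswith("?"):
        return "Questions"
    words = set(re.findall(r"[a-z']+", comment.lower()))
    best, best_hits = "Unsorted", 0
    for bucket, cues in BUCKET_CUES.items():
        hits = len(words & cues)
        if hits > best_hits:  # ties keep the earlier bucket
            best, best_hits = bucket, hits
    return best


print(bucket_feedback("The export button is broken and throws an error"))  # Problems
print(bucket_feedback("Thanks, the new dashboard is great"))               # Praise
print(bucket_feedback("You should add a dark mode"))                       # Suggestions
print(bucket_feedback("How do I reset my password?"))                      # Questions
```

A rule list like this breaks down quickly on real feedback, which is the gap dedicated text analysis tools aim to fill.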
For those on the fence about open source software, Keatext offers strong customer support. Guides, infographics, tutorials, and API documentation are all available on its website. The site also has a help center that lets you search for articles on how to use the tool.
KNIME
Best for: Experienced dev teams who want an on-premise text analysis tool.
KNIME offers on-premise data analytics tools for small to large businesses. It uses the Apache Spark framework to let users build machine learning (ML) models to automate regression, classification, cluster analysis, and more.
KNIME’s text processing tool offers natural language processing (NLP), text mining, and information retrieval. Documentation on the six steps involved is available, as are tutorials for using custom tag sets.
But unless you’re using KNIME’s analytics platform, its text processing option will have limited value. Its support and documentation don’t surpass those of competitors, and teams who don’t work in engineering will face an especially steep learning curve.
Refinitiv
Best for: Financial enterprises that want text analytics plug-ins for WordPress or Drupal sites.
Refinitiv offers five deployment options based on business needs. Its Open Calais package is free and handles up to 100KB each of HTML, XML, and raw text. It also handles up to 5,000 submissions per day.
If you host a WordPress or Drupal website, you can install the Laiser Tag Plus WordPress plugin or the Drupal Open Calais plugin, respectively. Both plugins integrate Intelligent Tagging output into your site.
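If you plan to send documents in bulk, it can help to check them against those free-tier limits first. The Python sketch below is a local pre-flight check only and makes no calls to Refinitiv. The `can_submit` function is hypothetical, and it assumes that “100KB” means 100 × 1,024 bytes of UTF-8 text.

```python
# Free-tier limits quoted above; the byte interpretation is an assumption.
MAX_BYTES = 100 * 1024
MAX_DAILY_SUBMISSIONS = 5_000


def can_submit(text: str, submitted_today: int) -> bool:
    """Return True if the text fits the size limit and daily quota remains."""
    size = len(text.encode("utf-8"))  # measure bytes, not characters
    return size <= MAX_BYTES and submitted_today < MAX_DAILY_SUBMISSIONS


print(can_submit("Quarterly earnings rose 4%.", submitted_today=120))    # True
print(can_submit("x" * (MAX_BYTES + 1), submitted_today=120))            # False
print(can_submit("Quarterly earnings rose 4%.", submitted_today=5_000))  # False
```

Measuring encoded bytes rather than characters matters for text with accents or non-Latin scripts, where one character can take several bytes.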
If you want to test the Intelligent Tagging tool, you can upload your own content to the Live Demo. Since Refinitiv’s specialty is financial topics and themes, it’ll assign those to the unstructured data that you load into the tool. It helps you tag the subjects (whether they be people, places, facts, or anything else) within your dataset. You can access Q&A, tutorials, documentation, and more after creating an account.
One final note:
To earn a spot on this list, each tool above needed to offer source code that’s freely available for anyone to view, edit, revise, etc. But many tools on this list, along with several more, offer even more features for paying customers.
We know that cost is a key concern when shopping for software. That said, investing in the right tools can save your business lots of money in the long run. To get started with text analysis, input some data into one of the free tools above.
Then, assess the results: Do you agree with the sentiment suggestions each tool makes? Did they help you discover any changes in sentiment that you might have missed?
If you’re impressed, it’s worth assessing BI software that has text analysis and additional features. Most software vendors offer tiered pricing plans that let you scale up as needed. And in an era where we create 2.5 quintillion bytes of data each day, having a strategy and software to manage data isn’t optional.
Gartner predicts (available to clients) that by 2020, 50% of businesses will lack the AI and data literacy skills needed to build business value. Using the right text analysis tool is a great way to stay in the fast lane.
Methodology
We included text analysis products in our software catalog that both:
- Offered a free, stand-alone version of the software (not a trial version of the software that requires you to purchase the product after a limited amount of time), and
- Met our text analysis software market definition:
Text analysis is the process of finding information from text sources, including emails and survey answers.
We determined the most popular products to feature by choosing those highest ranked in Google search results during the week of September 16 – 20, 2019.