Predictive analytics involves using statistical tools to analyze data to determine the probability of future outcomes. It’s the branch of big data specifically focused on forecasting the most likely result, given a certain set of conditions.
For example, retailers use predictive analytics to determine which other products might interest a customer based on purchase history. Financial firms use predictive data analytics to capitalize on trends in the financial markets. And utilities use predictive algorithms to determine the likely effect of upcoming weather patterns on their customers and to forecast their own level of readiness.
It’s important to note that, like any big data technology, predictive analytics does not function like a crystal ball. You cannot spin up a cloud analytics instance that will tell you with 100 percent certainty who is going to win this year’s NCAA March Madness basketball tournament. However, you could use that cloud computing service to estimate the probability that a given team will win. Predictive research cannot “tell the future,” but it can make your guesses and hypotheses a lot better.
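To make the difference between certainty and likelihood concrete, here is a minimal sketch of a probability estimate. It assumes a hypothetical list of past game results and uses a simple smoothed win frequency; the function name and data are illustrative only, and a real tournament model would draw on far richer inputs.

```python
def win_probability(past_results, prior_wins=1, prior_losses=1):
    """Estimate a team's chance of winning its next game.

    past_results: list of booleans, True for a win (hypothetical data).
    prior_wins / prior_losses: smoothing terms that keep the estimate
    strictly between 0 and 1, since no outcome is ever certain.
    """
    wins = sum(1 for won in past_results if won)
    losses = len(past_results) - wins
    return (wins + prior_wins) / (wins + losses + prior_wins + prior_losses)


# Hypothetical record: 8 wins and 2 losses.
record = [True] * 8 + [False] * 2
estimate = win_probability(record)
print(f"Estimated chance of winning the next game: {estimate:.0%}")
```

Even a team with a perfect record gets an estimate below 100 percent here, which mirrors the point above: the output is a likelihood, not a guarantee.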
The process of applying predictive analytics to a given problem looks a lot like other big data analytics projects. First, companies must collect a large quantity of data related to the question at hand — the more data, the better. Then, they clean that data and get it into a format that allows them to use their predictive analytics tools. They often use a data integration strategy to combine inputs.
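The cleaning and integration step might look something like the sketch below. The two sources, their field names (customer_id, amount, region) and the cleaning rules are all assumptions made for illustration, not a prescribed method.

```python
def clean_and_merge(purchases, profiles):
    """Join two hypothetical sources on customer_id, dropping bad rows."""
    # Index profiles by customer ID for fast lookup.
    profiles_by_id = {p["customer_id"]: p for p in profiles}
    merged = []
    for row in purchases:
        profile = profiles_by_id.get(row.get("customer_id"))
        amount = row.get("amount")
        # Skip rows with no matching profile or a missing/negative amount.
        if profile is None or amount is None or amount < 0:
            continue
        merged.append({
            "customer_id": row["customer_id"],
            "amount": float(amount),                      # consistent type
            "region": profile["region"].strip().lower(),  # consistent text
        })
    return merged


purchases = [
    {"customer_id": 1, "amount": 25.0},
    {"customer_id": 2, "amount": None},   # missing value, dropped
    {"customer_id": 3, "amount": 12.5},   # no matching profile, dropped
    {"customer_id": 1, "amount": 40},
]
profiles = [
    {"customer_id": 1, "region": " North "},
    {"customer_id": 2, "region": "south"},
]
dataset = clean_and_merge(purchases, profiles)
print(dataset)
```

Only the two valid purchases from customer 1 survive, each carrying the normalized region from the second source.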
Next comes the process of developing and training a model. It might be a classification model, which sorts like things into groups, or a regression model, which assigns a number or a score to a set of variables. Organizations might use traditional statistical techniques and sophisticated data mining to generate these models, or they might rely on today’s more advanced machine learning techniques.
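The difference between the two model types can be sketched in a few lines. The example below assumes hypothetical data on customer visits, spending and churn; it uses a least-squares line for the regression model and a nearest-centroid rule for the classification model, which are just two simple choices among many.

```python
def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_centroids(xs, labels):
    """Classification: store the average x value for each label."""
    groups = {}
    for x, label in zip(xs, labels):
        groups.setdefault(label, []).append(x)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}


def classify(centroids, x):
    """Assign x to the label whose centroid is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


# Hypothetical data: monthly visits vs. monthly spend and churn status.
visits = [1, 2, 3, 8, 9, 10]
spend = [10, 20, 30, 80, 90, 100]
status = ["churned", "churned", "churned", "retained", "retained", "retained"]

slope, intercept = fit_line(visits, spend)
predicted_spend = slope * 5 + intercept      # regression yields a number
centroids = fit_centroids(visits, status)
predicted_status = classify(centroids, 9)    # classification yields a group
print(predicted_spend, predicted_status)
```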
Once data scientists have developed a model that they think will perform well, they deploy it into production. They then monitor its performance and make incremental improvements to the model so that it will become more accurate over time.
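Monitoring can be as simple as comparing predictions with what actually happened. The sketch below assumes a hypothetical AccuracyMonitor class with an arbitrary window size and accuracy threshold; production systems typically track more metrics than accuracy alone.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions (a rolling window)."""

    def __init__(self, window=100, threshold=0.8):
        self.outcomes = deque(maxlen=window)  # True when a prediction was right
        self.threshold = threshold

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        if not self.outcomes:
            return None  # nothing observed yet
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        acc = self.accuracy()
        return acc is not None and acc < self.threshold


monitor = AccuracyMonitor(window=5, threshold=0.8)
for predicted, actual in [(1, 1), (0, 0), (1, 0), (1, 0), (0, 0)]:
    monitor.record(predicted, actual)
print(monitor.accuracy(), monitor.needs_retraining())
```

When the rolling accuracy falls below the threshold, the team has a signal that the model may need new training data or other adjustments.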
Predictive Analytics Trends
Predictive analytics isn’t actually all that new. After all, you can think of credit scores as the result of predictive analytics that attempts to determine the likelihood that a person will default on a loan — and credit scores have been around for years. Vegas casinos have long used a form of predictive analytics to help them decide what kinds of odds to offer on various bets.
What is new is the quality of technology available to do predictive analytics, as well as the vast quantities of big data informing those analytics. Predictive analytics often uses a data warehouse and both structured and unstructured data.
On the infrastructure side, the last decade has seen the development of ever more powerful graphics processing units (GPUs). Although originally created to improve computer graphics capabilities, GPUs excel at the kinds of parallel processing activities necessary for predictive analytics and machine learning.
At the same time, storage prices have continued to drop, making it more affordable to store large quantities of data. And improvements in networking speed and the advent of cloud computing have made it possible, and much less expensive, for organizations of all sizes to run advanced analytics.
On the software side, companies have created more powerful machine learning, data mining and analytics tools that take advantage of these hardware advances. Today’s applications are capable of performing very complex analysis on very large data sets.
Currently, the leading big data trend in predictive analytics is the eagerness with which companies are adopting it. Surveys indicate that as many as 90 percent of businesses are interested in using predictive analytics. Business leaders believe that predictive analytics could help them glean valuable insights and gain competitive advantage. However, actual production deployment of predictive analytics remains slow, meaning that a few companies are getting ahead of their industries, while the rest struggle to catch up.
Other key predictive analytics trends include the following:
- Greater use of artificial intelligence and machine learning: Enterprises are so interested in artificial intelligence and machine learning that vendors have rushed to slap these labels on their products — sometimes even when their products don’t have true AI or machine learning capabilities. Still, the number of advanced AI and machine learning products on the market is growing, as is the number of companies deploying them.
- Artificial neural networks (ANNs): One of the most popular AI technologies for predictive analytics is the artificial neural network. Designed to mimic the human brain, these tools rely on a layered architecture that yields incrementally improving results.
- Industry solutions: The first wave of predictive analytics brought a lot of general-purpose tools to the market. These tools were powerful but often required a data scientist to operate them well. The next generation of tools is more tailored for the needs of specific industries or specific use cases, and they are designed to be used by ordinary business professionals who don’t have specialized or advanced degrees.
- Cloud computing: The advanced hardware and software necessary for predictive analytics are generally much more affordable when purchased as a cloud computing service. And the top vendors like Amazon Web Services, Microsoft Azure, Google Cloud and IBM Cloud are locked in a race to offer increasingly advanced tools.
- Edge computing: While cloud dominates predictive analytics, some jobs are moving out to the edge of the network, particularly in Internet of Things (IoT) environments. In some cases, analyzing data where it is generated and transmitting only the most important insights back to the cloud can prove more efficient and effective.
- Real-time dashboards and visualizations: Older predictive analytics tools relied on historical data and batch processes, but the ability to incorporate streaming data is becoming increasingly popular.
- DataOps: Data management teams are starting to adopt the DevOps and microservices practices that have transformed other parts of IT. As data pipelines become more complex and more critical to business success, enterprises are looking for ways to improve collaboration and streamline processes.
- Improved governance: The combination of high-profile data breaches and the implementation of GDPR has enterprises more concerned than ever about privacy, security and compliance. More enterprises now have a chief data officer (CDO) who is responsible for overseeing big data efforts, including predictive analytics.
Related Technologies
Predictive analytics is closely related to several other big data technologies, including machine learning, AI, data mining and prescriptive analytics. Here are the differences among the terms:
Predictive Analytics vs. Machine Learning
Machine learning is the branch of computer science that gives systems the ability to learn without being explicitly programmed. These techniques can be very helpful in building, training and improving predictive models. However, machine learning can also be used for tasks other than predictive analytics.
Predictive Analytics vs. Artificial Intelligence
Machine learning is a subset of artificial intelligence, which attempts to create systems that are good at the kinds of thinking and tasks that humans have traditionally been better at than machines. Voice recognition, image recognition and robotics are all examples of artificial intelligence. As with machine learning, artificial intelligence can be used for creating predictive analytics models, but AI can also be used for many, many other things.
Predictive Analytics vs. Data Mining
Data mining is the process of finding patterns and relationships in data. It is often part of the predictive analytics process. Organizations will use data mining to find patterns in historical data, and then predictive analytics goes the next step of using those patterns to forecast what is likely to come next. Companies can use data mining to help with predictive analytics, or they can use data mining alone.
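As a rough illustration of that hand-off, the sketch below first mines hypothetical purchase histories for products bought together, then uses those patterns to forecast what a customer is likely to want next. The function names and basket data are invented for the example, and co-occurrence counting is only one simple pattern-finding technique.

```python
from collections import Counter
from itertools import combinations


def mine_pairs(baskets):
    """Data mining step: count how often two products are bought together."""
    pair_counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def likely_next(pair_counts, product):
    """Predictive step: the product most often bought alongside `product`."""
    scores = Counter()
    for (a, b), count in pair_counts.items():
        if a == product:
            scores[b] += count
        elif b == product:
            scores[a] += count
    return scores.most_common(1)[0][0] if scores else None


# Hypothetical purchase histories.
baskets = [
    ["tent", "stove", "lantern"],
    ["tent", "stove"],
    ["tent", "boots"],
    ["stove", "fuel"],
]
patterns = mine_pairs(baskets)
suggestion = likely_next(patterns, "tent")
print(suggestion)
```

The mined pair counts are useful on their own, for example to decide shelf placement, which is the sense in which data mining can also be used alone.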
Predictive Analytics vs. Prescriptive Analytics
If you think of predictive analytics as taking data mining to the next step, you can think of prescriptive analytics as the next step beyond predictive analytics. Prescriptive analytics tells you not only the likelihood of future outcomes but also the likely results of actions you might take in reaction to those future events. Essentially, it tells you both what might happen and what you should do about it. Few prescriptive analytics solutions are on the market today, but many industry watchers believe they will become more common in the coming years.
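One simple way to picture the prescriptive step is as an expected-value comparison layered on top of a forecast. The sketch below assumes a hypothetical storm probability and made-up costs for a utility; commercial prescriptive tools are considerably more sophisticated.

```python
def best_action(outcome_probs, payoffs):
    """Pick the action with the highest expected payoff.

    outcome_probs: {outcome: probability}, e.g. from a predictive model.
    payoffs: {action: {outcome: payoff}}, supplied by the business.
    """
    def expected(action):
        return sum(p * payoffs[action][o] for o, p in outcome_probs.items())

    expected_values = {action: expected(action) for action in payoffs}
    return max(expected_values, key=expected_values.get), expected_values


# Hypothetical utility example: a 30 percent chance of a severe storm.
forecast = {"storm": 0.3, "clear": 0.7}
payoffs = {
    "add_crews": {"storm": -20, "clear": -20},   # fixed staffing cost
    "no_change": {"storm": -100, "clear": 0},    # outage cost if a storm hits
}
action, expected_values = best_action(forecast, payoffs)
print(action, expected_values)
```

The forecast alone says what might happen; comparing the expected cost of each action is what turns it into a recommendation.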
Predictive Analytics Examples
Predictive analytics examples are numerous. Industries as diverse as health care, entertainment, financial services, manufacturing, education, retail, transportation and many others are all using predictive analytics. Even local governments are getting into the act with predictive analytics tools that can forecast likely crime locations or identify children who are at high risk for abuse.
Predictive Analytics Techniques
Data scientists and business analysts use both traditional statistical techniques and more advanced predictive algorithms for predictive analytics. For example, they might use decision trees, linear or logistic regression, neural networks, Bayesian analysis, gradient boosting, partial least squares and many others.
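To give a flavor of one of these techniques, here is a minimal sketch of logistic regression trained by gradient descent. The data (late payments versus loan default), the learning rate and the number of training passes are all assumptions chosen for illustration; in practice analysts would normally rely on an established statistics or machine learning library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y   # prediction minus actual label
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical data: late payments in the past year vs. default (1 = yes).
late_payments = [0, 1, 2, 3, 4, 5]
defaulted = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(late_payments, defaulted)
low_risk = sigmoid(w * 0 + b)    # estimated default probability, 0 late payments
high_risk = sigmoid(w * 5 + b)   # estimated default probability, 5 late payments
print(f"{low_risk:.2f} {high_risk:.2f}")
```

The model outputs a probability rather than a yes-or-no answer, which is what makes it useful for scoring tasks such as estimating default risk.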
Predictive Analytics Tools
A very long list of startups and more established vendors offer predictive analytics tools. Some of the best known include SAS, IBM, KNIME and RapidMiner. The upcoming Datamation Predictive Analytics Tools page will have many more options and details about the various products.