This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach.
A major national retailer suffered periodic outages in its gift card processing application that left cashiers unable to process customer cards for an hour or more at a time. After each outage, experts in applications, VMware, SQL Server, Windows Server and networking spent hours poring over system logs and calling tech support lines, without finding the root cause.
Then IT turned to a predictive analytics solution to churn through gigabytes of application-related system and network log data. The discovery: application/database connection errors correlated in time with a network spike on two of the application's three VMware hosts. With the information provided, IT was able to trace the root cause to a VLAN misconfigured to run both application traffic and VMware's vMotion function. Every time vMotion kicked off a VM move, it flooded the network and prevented the gift card application from accessing the database.
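The time correlation described above can be illustrated with a short Python sketch. It is not the retailer's or any vendor's actual method. The host names, the per-minute sample values and the 0.8 cutoff are all made-up assumptions, and a plain Pearson correlation stands in for whatever analysis the product really performs.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation signal
    return cov / sqrt(var_x * var_y)

# Hypothetical per-minute counts of application/database connection errors.
db_connection_errors = [0, 0, 1, 0, 14, 19, 0, 0, 0, 16, 0, 0]

# Hypothetical per-minute network throughput (Mbps) on three hosts.
host_traffic_mbps = {
    "host-a": [110, 120, 115, 118, 940, 960, 125, 119, 121, 950, 117, 122],
    "host-b": [105, 112, 109, 111, 930, 955, 118, 108, 110, 945, 113, 109],
    "host-c": [100, 140, 95, 130, 120, 105, 135, 98, 125, 110, 138, 102],
}

correlations = {
    host: pearson(db_connection_errors, traffic)
    for host, traffic in host_traffic_mbps.items()
}

# Hosts whose traffic rises and falls with the error count (assumed cutoff: 0.8).
suspect_hosts = sorted(h for h, r in correlations.items() if r > 0.8)
print(suspect_hosts)
```

With this invented data, the traffic on two of the three hosts tracks the error count closely while the third does not, which is the sort of pattern that would point an investigator at a shared network segment. Correlation alone only narrows the search; finding the misconfigured VLAN still took human follow-up.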
This is a prime example of the power of predictive analytics for solving seemingly intractable application performance issues. But the real beauty of predictive analytics for IT is that once it's running it can actually discover and provide the information to address application performance issues before they're even noticed. This is important for mission-critical applications, because it can prevent loss of revenue and customers.
Today's composite web-based applications have scores of dependencies, including web, application and database servers running on virtualization hypervisors, not to mention fraud prevention services, legacy applications, and all the hardware and network infrastructure running underneath.
In such a complex environment, tracing the root cause of a performance issue can be an overwhelming task. The siloed nature of IT expertise doesn't help: few people have both the experience and the overall picture of system relationships and event chains needed to spot the anomalies that indicate trouble. The KPIs and thresholds IT depends on are hit-or-miss predictors of performance issues as well, often generating too many false alerts while missing the problems that do occur.
What IT teams really need to know is how all those system relationships and event chains underlying applications and transactions function normally and what changes indicate a budding performance problem. Predictive analytics solutions for IT are geared just for this purpose.
Predictive analytics uses a variety of techniques, including machine learning, modeling, and data mining, to predict events based on current and historical information. It is used for many purposes today, including predicting customer behavior and detecting fraud.
For application performance, predictive analytics uses machine learning and big data analysis techniques to ingest and analyze large volumes of system and infrastructure machine data. The goal is to baseline typical application component relationships and event chains, and then to detect anomalies that predict a performance issue. Ideally, solutions can run this analysis either on historical data or in real time. In real time, they can predict and help address performance issues before they have any business or productivity impact.
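As a rough illustration of the baseline-then-detect idea, here is a minimal Python sketch. It assumes a single metric (a hypothetical database response time in milliseconds) and a simple z-score test with an assumed threshold of three standard deviations. Real products model many metrics and their relationships at once, so treat this as a toy version of the concept rather than how any particular solution works.

```python
from statistics import mean, stdev

def build_baseline(history):
    """Summarize historical samples as (mean, sample standard deviation)."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    return mean(history), stdev(history)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu  # any change from a perfectly flat history is unusual
    return abs(value - mu) / sigma > threshold

# Hypothetical historical database response times in milliseconds.
history_ms = [20, 22, 19, 21, 20, 23, 18, 21]
baseline = build_baseline(history_ms)

# Score new samples against the learned baseline.
new_samples_ms = [22, 95]
flags = [is_anomalous(v, baseline) for v in new_samples_ms]
print(list(zip(new_samples_ms, flags)))
```

The same two functions can be applied to stored logs after the fact or to each sample as it arrives, which mirrors the historical versus real-time distinction made above.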
The beauty of the best predictive analytics solutions is they don't require the lengthy, complex installation and configuration process of most of today's application and network management tools. Nor do they rely on the user to make judgments about which KPIs and thresholds to monitor.
Instead, they use machine learning algorithms to learn about your systems, and all the underlying relationships and event chains, on their own. Further algorithms then estimate the probability of various events and alert you when highly improbable ones occur.
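One simple way to picture "learn the event chains, then alert on improbable events" is a first-order transition model, sketched below in Python. The event names, the training sequence, the add-one smoothing and the 0.05 probability cutoff are assumptions chosen for illustration. Commercial tools may use quite different and far more elaborate models.

```python
from collections import Counter, defaultdict

def learn_transitions(events):
    """Count how often each event type is followed by each other event type."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(events, events[1:]):
        counts[prev][nxt] += 1
    return counts

def transition_probability(counts, vocabulary, prev, nxt):
    """Add-one smoothed estimate of P(nxt | prev)."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return (followers[nxt] + 1) / (total + len(vocabulary))

def improbable_transitions(counts, vocabulary, events, threshold=0.05):
    """Return (prev, nxt, probability) for transitions rarer than threshold."""
    flagged = []
    for prev, nxt in zip(events, events[1:]):
        p = transition_probability(counts, vocabulary, prev, nxt)
        if p < threshold:
            flagged.append((prev, nxt, p))
    return flagged

# Hypothetical event types and a hypothetical "normal" history.
vocabulary = {"request", "db_query", "response", "db_timeout"}
normal_history = ["request", "db_query", "response"] * 50
counts = learn_transitions(normal_history)

# A new chain in which a request is followed by a timeout instead of a query.
live_events = ["request", "db_query", "response",
               "request", "db_timeout", "response"]
alerts = improbable_transitions(counts, vocabulary, live_events)
for prev, nxt, p in alerts:
    print(f"unusual: {prev} -> {nxt} (estimated probability {p:.3f})")
```

Note that nobody told the model that a timeout is bad. It is flagged only because a request followed by a timeout almost never appeared in the learned history, which is the point the article makes about not relying on hand-picked KPIs.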
Unlike KPI thresholds, which may alert you a thousand times to some minor abnormality in one system, predictive analytics tools look for abnormalities across different systems that typical thresholds and alerts often miss. And because they learn about systems over time, they can continually adjust themselves to legitimate changes in your IT environment. With typical management tools today, the user has to make those adjustments manually.
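The self-adjusting behavior can be sketched with an exponentially weighted moving average, shown below in Python. This is one common way to build a baseline that drifts with the data, used here purely as an illustration. The class name, the smoothing factor, the warm-up length and the sample values are all assumptions, not details from any product.

```python
class AdaptiveBaseline:
    """Baseline that tracks a metric with exponentially weighted statistics."""

    def __init__(self, alpha=0.2, threshold=3.0, warmup=5):
        self.alpha = alpha          # weight given to the newest sample
        self.threshold = threshold  # allowed deviation, in standard deviations
        self.warmup = warmup        # samples to absorb before alerting
        self.mean = None
        self.var = 0.0
        self.seen = 0

    def observe(self, value):
        """Return True if value looks anomalous, then fold it into the baseline."""
        if self.mean is None:
            self.mean = float(value)
            self.seen = 1
            return False
        deviation = value - self.mean
        std = self.var ** 0.5
        anomalous = (
            self.seen >= self.warmup
            and std > 0
            and abs(deviation) > self.threshold * std
        )
        # Exponentially weighted updates of the mean and variance.
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        self.seen += 1
        return anomalous

# Hypothetical metric: steady around 100, then a sustained shift to 200
# (for example, after a legitimate capacity or configuration change).
samples = [100, 102, 98, 101, 99, 100, 102, 98, 101, 99] + [200] * 20
detector = AdaptiveBaseline()
flags = [detector.observe(v) for v in samples]
print([i for i, flagged in enumerate(flags) if flagged])
```

In this sketch the first sample at the new level is flagged, after which the baseline absorbs the shift and stops alerting without anyone editing a threshold. The trade-off is that a sustained genuine problem is absorbed in the same way, so the smoothing factor would need tuning in practice.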
Aside from predicting and addressing performance issues, predictive analytics solutions are great at providing insight into system relationships and problems you never even knew existed. They've been called the "Donald Rumsfeld of application performance management" because they discover the "unknown unknown" behaviors in your application infrastructure.
A recent survey by TRAC Research found that 60% of IT organizations report a success rate of less than half when it comes to preventing performance issues that have an impact on end users. It also found that, on average, 46.2 hours are spent each month on "war room" scenarios.
Many of these organizations already have application performance management and numerous other management tools deployed. Adding a predictive analytics solution can raise that success rate substantially and reduce or even eliminate those time-consuming, resource-intensive war room scenarios. In doing so, it can improve customer satisfaction and allow IT to spend less time troubleshooting performance issues and more time on strategic initiatives that add value to the business and improve its competitive position.
Jaffe is a serial software entrepreneur who has been instrumental in the success of numerous software companies.