Predictive Maintenance: How Machine Learning Models Help Reduce Prevent/Repair Costs and Break Limitations of Traditional Maintenance Approaches

On the huge promise of predictive maintenance.

In industry, machinery downtime can lead to penalties, degrade a company’s core operations, and cause serious reputational damage.

It’s essential for both small enterprises and the Walmarts of the world to have a well-rounded, well-tested maintenance strategy in place to reduce the likelihood of sudden outages or breakdowns happening and mitigate all associated risks.

A proper approach to maintenance can quickly improve a company’s overall reliability and performance while substantially reducing its operational costs.

In 2018, for example, Amazon suffered an outage that lasted about 60 minutes and reportedly cost it nearly $100m in sales.

The recent proliferation of connected technologies and predictive machine learning algorithms has had a profound effect on how companies conduct their equipment management; these recent advancements have enabled firms to ditch, almost completely, the practices of traditional reactive (RM) and preventive (PM) modes of maintenance and start embracing data-driven predictive maintenance (PdM) instead.

In this article, we’ll explain accessibly the key differences between RM, PM, and PdM and discuss some of the AI models commonly used for condition-based machinery monitoring.

So, What Are Predictive Maintenance (PdM), Preventive Maintenance, and Reactive Maintenance?

Reactive Maintenance is when each piece of equipment at a plant is run to failure. The most appealing aspect of the method is that you get to achieve maximum utilization (receive maximum production output) and not spend a penny on maintenance until the machine fails to operate.

However, the costs of repairs after crashes tend to outweigh machines’ production output and, besides, a faulty, overheating piece of equipment can compromise the operability of other elements in the infrastructure, which might lead to additional damages.

The firms practicing RM must always either have extensive inventories of parts and components available or rely on a vendor in the vicinity who can deliver equipment promptly and thus help avoid critical operational disruptions. Neither of those options is very cost-effective.

Preventive maintenance resolves some of RM’s key issues. Using this approach means performing regular checkups on specific pieces of equipment and routinely replacing their components to decrease the probability of unexpected failures.

Those using PM must schedule and conduct maintenance activities even on fully operational devices, so, occasionally, they will spend time and resources on repairs that are completely unnecessary.

In short, the PM process consists of two steps:

  1. an investigation of equipment failure characteristics (time-series data is typically used for this)
  2. an attempt to maximize both the reliability and availability of machinery through the implementation of optimal maintenance policies.

The main difference between these two approaches is that the firms using RM experience operational disruptions due to outages and failures and thus they can’t predict how much they’ll have to spend to fix the unexpected damages, but those relying on PM can plan their downtime.

The latter method helps reduce the costs of corrective activities and replacements (or even avoid them altogether) but you will still incur costs associated with regular inspections, preventive replacements, etc.

Predictive maintenance (data-centered method). The goal of PdM is to predict, with as much precision as possible, when a piece of equipment is going to fail, help pick proper maintenance measures and achieve the optimal trade-off between the cost of repairs and maintenance frequency.

In this method, the data from a variety of sensors – vibration, heat, ultrasonic data, thermal images, etc. – is fed to a predictive AI model that identifies trends and patterns within it and helps determine the optimal time for repairing (or retiring) the machine or its components.

When implemented properly, PdM allows you to prevent machine crashes (and avoid subsequent repairs) without having to spend money on maintenance/replacement until such measures are necessary.

PdM aims to cut both planned and unplanned downtime and to increase operational efficiency while reducing maintenance costs. The downside is that it requires purchasing equipment monitoring devices and setting up and managing a complex data gathering/analysis infrastructure.

Rolling Out a Predictive Maintenance Program, Our Approach

Again, PdM is when your equipment and assets are monitored through sensors that produce data (vibration, temperature, acoustic, etc.) that informs your maintenance managers about their exact condition. And then this data is plugged into a predictive machine learning model that helps you identify trends and patterns within it to calculate the optimal time to run maintenance activities on a piece of equipment, without incurring unnecessary costs.

When working with our clients on PdM implementation, the first thing we do is establish the baselines; it’s crucial to define the low and high limits of the machines’ optimal operating conditions before we move forward.
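As a rough illustration of the baseline step, the sketch below derives low and high limits from readings taken while a machine is known to be healthy and then flags new readings that fall outside them. The "mean plus or minus three standard deviations" rule, the function names, and the sample temperatures are assumptions made for this example, not the procedure used on real projects.

```python
from statistics import mean, stdev

def baseline_limits(healthy_readings, k=3.0):
    """Return (low, high) limits as mean -/+ k sample standard deviations."""
    mu = mean(healthy_readings)
    sigma = stdev(healthy_readings)
    return mu - k * sigma, mu + k * sigma

def find_breaches(readings, low, high):
    """Return (index, value) pairs for readings outside [low, high]."""
    return [(i, x) for i, x in enumerate(readings) if x < low or x > high]

# Hypothetical boiler temperatures (deg C) recorded during normal operation.
healthy = [80.1, 79.8, 80.4, 80.0, 79.9, 80.2, 80.3, 79.7]
low, high = baseline_limits(healthy)

# New readings; the last one drifts well above the baseline.
new_readings = [80.0, 80.2, 79.9, 83.5]
breaches = find_breaches(new_readings, low, high)
print(f"limits: {low:.2f}..{high:.2f}, breaches: {breaches}")
```

In practice the limits would more likely come from the manufacturer’s specifications or a longer observation period; the point is only that a breach is defined relative to an explicit baseline.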

Next, we install the appropriate IoT devices (for example, we slap temperature sensors onto a boiler or a vibration meter onto a piece of geared mechanical equipment at a plant) and collect a sufficient amount of sensor data, which might take some time. We then connect the devices, through a gateway solution that we pick after careful research, to a sophisticated analytics platform; the software helps us derive actionable insights from the data, which is the whole point of PdM.

A machine learning model, which we’ll also choose and deploy carefully, will allow us to spot baseline breaches, identify the events that precede them (and when those are likely to occur), create a proper work order, and schedule maintenance activities at an optimal time. This means you’ll get the most out of your equipment and only spend money on preventive repairs when necessary; there won’t be any more unneeded, routine part replacements.

By doing all this, we’ll also create a feedback cycle: the AI algorithm learns continuously from the sensor data and from the maintenance activities performed on the equipment, so that, over time, it spots machines operating outside their normal conditions more quickly and makes more accurate predictions.

Predictive Maintenance – Machine Learning Methods


Numerous methods, both from traditional ML and deep learning, have been applied to tackle PdM tasks. Here are some of them:


Support Vector Machines – a widely known supervised method that creates representations of data objects as points in space in a way that instances belonging to different categories are separated by a wide gap. Then, when new data points are introduced, the algorithm assigns them to one of two classes – positive or negative – based on the side of the gap they fall into.

SVM is easy to implement, excellent at generalization, and provides high classification accuracy, which makes it appealing for fault detection and remaining useful life (RUL) estimation.
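To make the "which side of the gap" idea concrete, here is a minimal sketch of a linear classifier trained on the hinge loss, the loss function behind linear SVMs. It is deliberately not a full SVM: it omits the regularisation term and kernels, so it separates the two classes with a margin but does not necessarily find the widest one. The sensor values, labels, and function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_hinge_classifier(X, y, lr=0.1, epochs=500):
    """Fit w, b by sub-gradient descent on the hinge loss max(0, 1 - y*(w.x + b))."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        updated = False
        for xi, yi in zip(X, y):
            # A point inside the margin (or misclassified) pulls the boundary.
            if yi * (dot(w, xi) + b) < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
                updated = True
        if not updated:  # every training point now lies outside the margin
            break
    return w, b

def predict(w, b, x):
    """Return +1 (faulty) or -1 (healthy) depending on the side of the boundary."""
    return 1 if dot(w, x) + b >= 0 else -1

# Hypothetical, already-normalised (vibration, temperature) readings.
X = [(0.2, 0.3), (0.3, 0.2), (0.25, 0.35),   # healthy
     (0.8, 0.9), (0.9, 0.8), (0.85, 0.75)]   # faulty
y = [-1, -1, -1, 1, 1, 1]

w, b = train_hinge_classifier(X, y)
print(predict(w, b, (0.1, 0.1)), predict(w, b, (1.0, 1.0)))
```

For real workloads, a library implementation with regularisation and kernel support would be the more sensible choice; this sketch only shows the decision rule and the margin condition.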

In this paper, for example, researchers use an SVM model, along with other techniques, to predict the correct maintenance schedule for gears in gas turbines: they represent the time until the next maintenance activity as a response variable. This approach, according to the authors, produces sufficiently accurate predictions for recurring maintenance operations (under normal operating conditions) and enables companies to boost production efficiency while lowering downtime.

Support vector regression (SVR), which, like SVM, relies on kernels and margin control, is also useful for industrial purposes: it lets you specify how much error is acceptable in the model and finds the line (or, in higher dimensions, the hyperplane) that fits the data within that tolerance. Perhaps not as popular as the standard SVM method, SVR has also been employed widely for fault prognosis. Khelif et al. propose an SVR-based remaining useful life estimation method that relies on data from sensors and doesn’t require preliminary estimations of degradation states, health indicators (which are typically necessary), or failure thresholds. In this scenario, SVR is used to model explicitly the relationships between sensors’ values and health indicators, and it allows estimating RUL at any point in the degradation process.
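The notion of "acceptable error" in SVR is usually expressed through the epsilon-insensitive loss: prediction errors smaller than a tolerance epsilon cost nothing, and larger ones are penalised only by the amount they exceed it. The sketch below computes that loss for a few RUL predictions; the RUL values and the tolerance of 5 cycles are made up for illustration and do not come from the Khelif et al. paper.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample loss max(0, |y - y_hat| - epsilon) used by SVR."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]

# Hypothetical remaining-useful-life values, in operating cycles.
true_rul = [120.0, 95.0, 60.0, 30.0]
pred_rul = [118.0, 101.0, 52.0, 31.0]

# Errors of up to 5 cycles are treated as acceptable and contribute no loss.
losses = epsilon_insensitive_loss(true_rul, pred_rul, epsilon=5.0)
print(losses)  # [0.0, 1.0, 3.0, 0.0]
```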

K-NN. K-nearest neighbors, one of the most extensively used techniques in machine learning, assigns unseen instances of data to a category with K most similar data objects (a sort of majority vote is formed). The similarity can be defined by Euclidean, Manhattan, Chebyshev, or other distance measures, depending on what’s more suitable for a given dataset.
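A minimal sketch of the majority vote, with the three distance measures mentioned above as interchangeable functions. The training readings, labels, and function names are hypothetical, and the features are left unscaled for brevity (real data would normally be normalised first so that no single sensor dominates the distance).

```python
from collections import Counter

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Label the query by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (vibration in mm/s, temperature in deg C) readings.
train_X = [(0.2, 40.0), (0.3, 42.0), (0.25, 41.0),
           (1.1, 70.0), (1.3, 75.0), (1.2, 72.0)]
train_y = ["healthy", "healthy", "healthy", "faulty", "faulty", "faulty"]

for metric in (euclidean, manhattan, chebyshev):
    print(metric.__name__, knn_predict(train_X, train_y, (1.0, 68.0), distance=metric))
```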

The method, too, has promise for fault diagnosis and RUL estimation. This work by Xiong et al. explains the effectiveness of an innovative information fusion method, based on K-NN and dimensionless indicators with a static discounting factor, for fault diagnosis in petrochemical rotating machinery. The researchers use evidence reasoning to process the accuracy and uncertainty of the information with the help of the K-NN algorithm, and utilize static indicators to turn the equipment’s input signals into the reliability of the structure framework. The approach, according to the paper, can drastically reduce the impact of unreliable factors in the fusion results and thus enable better decision-making with regard to maintenance planning.

As for K-NN-based RUL prediction, this paper proposes a novel predictive model, which the authors call the Volterra k-nearest neighbor optimally pruned extreme learning machine (OPELM) prediction model (VKOPP). In the paper, they apply the method to RUL estimation and degradation trace identification for insulated gate bipolar transistors. Specifically, they use the Volterra series combined with the minimum entropy rate to reconstruct the phase space for the equipment’s aging samples and then use a combination of K-NN and least-squares estimation (LSE) to determine the output weights of OPELM and predict RUL.

The approach shows superior performance – a smaller error rate and higher prediction accuracy – compared to most traditional RUL estimation methods while being substantially more cost and time-efficient.

Deep Learning

Auto-Encoder. AE is a neural net that uses encoding and decoding phases when mapping inputs to outputs. At first, the input is mapped by the encoder to the model’s hidden layers (where low-dimensional, non-linear representations of the original data points are created), and then the compressed version of the data is reconstructed (as close to the original input as possible) by the decoder. AE training is essentially the process of tuning the model’s parameters to achieve minimal reconstruction error.
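The encode, decode, and minimise-reconstruction-error loop can be sketched with a deliberately tiny model. The example below is a linear autoencoder with a single hidden unit and tied weights (the decoder reuses the encoder’s weights), trained by plain gradient descent; real AEs use non-linear activations, more layers, and a deep learning framework. The data, initial weights, and learning rate are all assumptions chosen for illustration.

```python
def encode(w, x):
    """Compress the input to a single number (the hidden code)."""
    return sum(wj * xj for wj, xj in zip(w, x))

def decode(w, z):
    """Reconstruct the input from the code using the same (tied) weights."""
    return [z * wj for wj in w]

def reconstruction_error(w, x):
    x_hat = decode(w, encode(w, x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def mean_error(w, data):
    return sum(reconstruction_error(w, x) for x in data) / len(data)

def train_autoencoder(data, w_init, lr=0.05, steps=2000):
    """Gradient descent on the mean squared reconstruction error."""
    w = list(w_init)
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x in data:
            z = encode(w, x)
            e = [xj - z * wj for xj, wj in zip(x, w)]   # reconstruction residual
            ew = sum(ej * wj for ej, wj in zip(e, w))
            for j in range(len(w)):
                grad[j] += -2.0 * (ew * x[j] + z * e[j]) / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Hypothetical centred readings from two sensors that move together when healthy.
healthy = [(-1.0, -1.0), (-0.5, -0.5), (0.5, 0.5), (1.0, 1.0)]
w0 = (0.3, 0.1)
w = train_autoencoder(healthy, w0)

print("mean error before/after:", mean_error(w0, healthy), mean_error(w, healthy))
print("healthy-like sample:", reconstruction_error(w, (0.8, 0.8)))
print("anomalous sample:   ", reconstruction_error(w, (1.0, -1.0)))
```

A sample that breaks the pattern seen during training reconstructs poorly, which is why reconstruction error is a common anomaly score in AE-based condition monitoring.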

Typically combined with some other classification algorithms, AEs are utilized often for degradation process estimation as well as other PdM tasks; they provide accurate estimates of the distance between a machine’s state and its optimal, healthy condition.

Described here is an AE-based method for machinery fault detection: a deep learning model is trained to choose impulse responses automatically from long-term vibration signals, and then a health index (based on the cosine distance measure) is used to determine the similarity between dynamic feature vectors.
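The cosine distance itself is simple to compute. The sketch below compares a baseline feature vector with two later ones; the vectors are invented for illustration and stand in for whatever features the trained model would extract, so this is not the health index defined in the paper.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0 for same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical feature vectors extracted from vibration signals.
baseline = (1.0, 0.5, 0.2)     # machine in a known healthy state
recent = (1.1, 0.55, 0.22)     # same shape, slightly larger amplitude
degraded = (0.4, 0.9, 1.0)     # energy has shifted to other components

print("recent:  ", cosine_distance(baseline, recent))
print("degraded:", cosine_distance(baseline, degraded))
```

Because the measure depends on direction rather than magnitude, a uniformly louder but otherwise unchanged signal stays close to the baseline, while a change in the signal’s shape moves away from it.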

The method has helped the researchers identify early the indicators of gradual degradation of equipment and, unlike many similar approaches, it performs well under time-varying conditions.

And here, a stacked sparse autoencoder method (combined with a logistic regression operation) is proposed as an effective method for RUL prediction in aircraft engines. The AE’s role is to extract useful features (in terms of performance degradation measurement) from multi-sensor data with complex correlations and fuse them through multilayer self-learning. The logistic regression is then employed to calculate the remaining useful life of the equipment and, as per the paper, the algorithm can reach up to 83% prediction accuracy.

Recurrent neural networks. RNNs, both the classic ones and their enhanced versions (LSTMs, GRUs, etc.), are designed specifically to process sequential data. They build connections between their hidden units and preserve information from previous inputs.
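How a recurrent network "preserves information from previous inputs" can be shown with the classic (Elman-style) update h_t = tanh(W_x x_t + W_h h_{t-1} + b). The sketch below runs that update over two short sequences that end with the same input but have different histories; the weights are fixed, hypothetical numbers rather than trained values, and LSTM or GRU gating is not modelled.

```python
import math

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrent update: mix the current input with the previous hidden state."""
    h_new = []
    for i in range(len(h_prev)):
        total = b[i]
        total += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        total += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new

def run_rnn(sequence, W_x, W_h, b):
    """Feed a sequence through the cell, starting from a zero hidden state."""
    h = [0.0] * len(b)
    for x_t in sequence:
        h = rnn_step(x_t, h, W_x, W_h, b)
    return h

# Hypothetical weights: 1 input feature, 2 hidden units.
W_x = [[0.5], [-0.3]]
W_h = [[0.8, 0.0], [0.1, 0.5]]
b = [0.0, 0.0]

seq_a = [[1.0], [0.0]]   # a spike followed by a quiet reading
seq_b = [[0.0], [0.0]]   # two quiet readings

print(run_rnn(seq_a, W_x, W_h, b))
print(run_rnn(seq_b, W_x, W_h, b))
```

Both sequences end with the same reading, yet the final hidden states differ, because the earlier spike is still carried in the state.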

These networks’ architecture makes them well suited to fault diagnosis and RUL estimation; RNNs often outperform other methods on sequence learning problems.

In this paper, the researchers use a deep recurrent neural network for fault diagnosis in rotating machinery. The model is created by grouping together stacks of recurrent hidden layers and trained to extract features automatically from spectrum sequences that have been fed to it as input. Afterward, a softmax classification algorithm is applied for fault identification. The method shows better performance compared to other intelligent fault prediction techniques.

Similarly, this work proposes a GRU (gated recurrent unit) based fault diagnosis method that consists of three stages:

  1. Dividing raw data into sequence units (a moving horizon is used as input for the model).
  2. Creating a deep GRU network and training it with batch normalization so that it derives dynamic features from the sequence units without being affected by covariate shift.
  3. Applying softmax regression for fault recognition (based on the dynamic features).
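The first and last stages of such a pipeline are easy to sketch without a deep learning framework: cutting a raw signal into overlapping windows (a moving horizon) and turning class scores into probabilities with softmax. The GRU network in between is omitted here; the signal, the class scores, and the fault labels are hypothetical placeholders.

```python
import math

def sliding_windows(signal, window, step=1):
    """Split a signal into overlapping sequence units of fixed length."""
    return [signal[i:i + window] for i in range(0, len(signal) - window + 1, step)]

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Stage 1: hypothetical raw vibration readings cut into units of length 3.
signal = [0.1, 0.2, 0.4, 0.8, 1.6]
units = sliding_windows(signal, window=3)

# Stage 3: hypothetical scores a trained network might output for one unit.
labels = ["bearing fault", "gear fault", "normal"]
probs = softmax([2.0, 1.0, 0.1])
predicted = labels[probs.index(max(probs))]
print(units, predicted)
```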

Again, the results of the research show the method’s vast superiority over conventional approaches.

LSTMs, which we’ve covered in detail in this article, have also been applied successfully to both fault recognition and equipment RUL estimation. In this research, a bidirectional Long Short-Term Memory is shown to be capable of making full use of sensor data sequences (bidirectionally) and delivering better performance, in terms of forecast precision, than the conventional methods for intelligent RUL estimation. Unlike the traditional algorithms, the model can quickly expose hidden patterns in the complex data that involves multiple degradation models, working conditions, and fault patterns.


A company’s maintenance strategy affects its ability to achieve optimal performance, stay flexible, competitive, and cost-efficient, and deliver high-quality goods. PdM, which we’ve covered at a high level in this article, is a novel paradigm that proposes performing maintenance activities only after an analytical model predicts degradation and/or equipment failure. By helping companies leverage machine learning models, traditional ones as well as neural nets, we help them realize all the benefits PdM has to offer.

If you, too, are looking into employing AI techniques for equipment availability and reliability maximization, operational cost reduction, and multiobjective optimization – reach out to our expert right now for a free consultation.

