Predictive Maintenance with MATLAB: A Prognostics Case Study




Hello everyone, and welcome to today's session on predictive maintenance with MATLAB. My name is Adam Fillion, and I'm an application engineer at MathWorks. In today's session we're going to take a look at how we can monitor the health of our equipment and forecast when to do maintenance, so that we can avoid unexpected failures.

Before we begin, I have a few logistical details for you. If you have any problems seeing or hearing today's webinar, please enter those in the chat tab and the webinar host will help you sort through them. If you have any technical questions on the content of today's presentation, please enter those in the Q&A panel at any time. At the end of the presentation we will go offline for a few moments, compile those, and then come back online and answer as many as we have time for.

I'd like to start today by showing you a quick example of what can happen if we don't get our maintenance right. What we have here are some electricity-generating wind turbines, and one of them is about to have a problem. Let's take a look at that again. What happened here is there's a braking system in the turbine to keep the blades from spinning too fast during a high-wind situation. The braking system had a failure, and during a storm the blades spun so fast, you can see it right here, that they actually tore themselves apart from the rotational forces. This shifted the center of mass, and the result blew the turbine apart. Obviously this is a huge problem: wind turbines cost millions of dollars to build and install. But even more so, these types of failures can be very dangerous; debris was actually found half a mile away from the site of the explosion. Fortunately, in this case nobody was hurt, but there was a very real risk of damage to people and property, and the main suspect behind the cause was that maintenance had not been done often enough. So you might ask, well, why don't we just do more maintenance? Unfortunately it's not that simple. Not only is maintenance itself very
expensive, but it can actually be dangerous for the maintenance crews. These wind turbines can be very tall; the ones you were looking at in that video were 70 meters off the ground, so we have to find some way to get our crews up there. Sometimes we even have to drop them on top from a helicopter. And if this is an offshore wind farm, where the turbines are installed out in the ocean, then all of this gets even more complicated, because now we have to balance on top of a boat that is moving and bobbing in the water. So how can we build some kind of system so that we can both avoid these potentially dangerous failures and also not have to do maintenance too often?

Let's take one quick look at something like that which we could build in MATLAB. This is just a simple desktop app that we built for monitoring a jet engine. We've got a number of different sensors reading data off of this engine while it's running, and behind the scenes we have a machine learning model that we've trained to understand how far away this piece of equipment is from having a failure. In this case we're bucketing it into a couple of different groups: does it have a really long time until failure, medium, short, or is it about to have a failure and really urgently in need of maintenance? You can see right here it enters the urgent region, so we can automate having alarms sent out to different people once that happens, and then at the last data point here, this is where this particular piece of equipment actually had a failure. We're not going to focus on building graphical front ends today; we're really going to focus on how we can use MATLAB to build the analytics necessary to understand this type of sensor data, to make some forecasts about the health of our equipment and when to do maintenance, and then how we can take that and integrate it with the equipment itself and larger enterprise systems. Before we get into that, I want to take one quick step back to talk about some of the different ways, or
styles, of doing maintenance, and why we really want to do predictive maintenance. One way of doing maintenance is by being what we call reactive: basically, we're doing maintenance once there's a problem. An everyday example of this is your car battery. I don't know anyone who even thinks about their car battery until it has a problem, at which point we usually just replace it. The main problem with being reactive is that you run the risk of having a lot of unexpected failures, which can be very expensive and potentially dangerous. So what most people do instead is scheduled maintenance, where we do maintenance at a regular rate; an everyday example is the advice that we change our car's oil every 5,000 miles. However, if you're doing maintenance on all your equipment at a set rate regardless of its actual condition, then I can pretty much guarantee you that some of the maintenance is going to be unnecessary, meaning that it's wasteful, and even so, you may not eliminate all failures. So what many people are now trying to do instead is predictive maintenance, where we forecast when problems will arise ahead of time. Many people are trying to do this; just one example is General Motors, which now has some car models where the car will actually monitor the battery, fuel pump, and starter motor, and can even email the driver with recommendations on when to perform maintenance. So this is an example of how we can not only do predictive maintenance for ourselves, but it can even potentially be a service that we provide to others. However, predictive maintenance isn't without its own difficulties, namely that it can be very difficult to make accurate forecasts for complex equipment. This is really what we're going to try to tackle today, because if we can get this right, the potential benefits are huge: we can increase the uptime, the availability, and also the safety of our equipment; we can minimize the maintenance costs; and we can also optimize our supply chain, because if I know what type of
maintenance I need to do, and when and where, I can make sure that all the right parts are at the right place at the right time. What this really leads to is increased reliability of our equipment, a lower cost of ownership for the operators, and a better reputation for our organization and our products. These are the goals that we're really chasing today.

So who's really doing this? What does success look like? I always like to share at least one example of someone who's using MATLAB to do this in the real world, and one organization is Snecma. Snecma is one of the world's leading manufacturers of commercial jet engines, and MATLAB is a central part of their monitoring system. They monitor many different parts of the engine to try to detect failures ahead of time and predict how long until maintenance is needed. They do this with two main goals in mind: one is to improve aircraft availability, such as on-time departures and arrivals, and the other is to reduce maintenance costs. And they use MATLAB throughout this workflow. They started on the desktop doing ad hoc analysis, sort of like we're going to do today; then they took their MATLAB applications and compiled them so they could share them with their colleagues; and eventually they took their MATLAB application and integrated it into this larger enterprise system. They gave a great presentation at our MATLAB Virtual Conference in 2015, so if you would like to see a talk by them on how they integrated MATLAB into their large real-world system, you can go on our website and watch that at any time.

What we're going to do today is focus on a similar example. We have some data provided by NASA as public data, and it's sensor data coming from 100 different turbofan jet engines of the same model. These engines have different parts, and we have various sensors throughout the engine measuring things like temperatures, pressures, rotational speeds, and so on. If you don't know
anything about jet engines, don't worry, you don't need to; we're not doing anything today that is specific to them. We're going to treat this sort of like a black-box piece of equipment that just has some sensor data coming out of it. The workflow we'll go through: we're going to look at quickly importing and visualizing the sensor data; we're going to train some machine learning models to try to understand its condition and predict when we should do maintenance; and then we'll talk about how we can take what we've done and deploy it to run on live sensor data, either in some kind of server or cloud infrastructure, or on the actual piece of equipment itself. And we'll look at this in the context of two different scenarios.

The first scenario we're going to look at is when we don't have any data from failures. Failure really just means whenever my equipment starts doing something I don't want it to do. This could be something more catastrophic, like in the wind turbine example we saw earlier, but for our data set today it really just means the aircraft equivalent of your car's check-engine light turning on unexpectedly. That's the kind of failure we're talking about today. So when we don't have any data from failures, typically the situation is we're performing scheduled maintenance, and we're performing it often enough that we haven't had any failures happen. That's good, but again, if you're doing scheduled maintenance this often, I can almost guarantee you that if you go talk to your maintenance crews, they're going to tell you that many, if not most, of your engines could run for longer without having maintenance done. So the question is: can we be smarter about how we schedule our maintenance, without knowing what failures actually look like in data? This is the first scenario we're going to look at today. Let's go ahead and pop over to MATLAB. I've got a couple of scripts here to help us walk through our examples today. You notice here that I'm
using what's called a live script. This is something new in the R2016a release of MATLAB; it gives me a single place to organize not just code but richly formatted text, images, equations, plots, outputs, and so on, and if you're interested in learning more, I encourage you to check out the website. We're going to start today by just reading in and visualizing the data from one engine. I always encourage people to start small, prototype, and then build up from there. I've got engine data from one engine sitting in a CSV file in my current directory, so we're just going to read that into a table. If we look at our workspace, you can see we've got a table of data here that we've read into MATLAB. I can see which engine this is; we have a timestamp, and in this case the timestamp is a flight, so each data point is a flight. You can think of all the different sensors as sort of like a summary statistic of what happened during that flight, but we're not doing anything specific to that way of measuring time; you can think of this as just any kind of generic time step. And then we've got a number of different sensors throughout the engine, 21 in total. The first thing I always like to do when I get some new data is to make a plot of it, to visualize it. Now, we've got 21 sensors in this case, which is a bit much to put up on the screen at one time, so let's make a quick subplot of the first nine sensors. You notice I can have this plot either pulled out to the right or actually inlined in my live script along with all the text and code; I'll keep it over to the right. You can see that some of these sensors are actually constant throughout, from zero, which is our first timestamp where we started recording the data, all the way until the 125th flight. That was the pace of the scheduled maintenance in this case: we started using the engine, and after 125 flights we took it offline to do maintenance. Some of these sensors are constant throughout.
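As a rough illustration of this import step, here is a Python/pandas analogue of MATLAB's readtable, using a made-up three-sensor snippet in place of the real NASA file (the column names and values here are hypothetical, just to show the one-row-per-flight shape):

```python
import io
import pandas as pd

# Hypothetical stand-in for one engine's CSV file; the real data set has
# 21 sensor columns, we use 3 here just to illustrate the table shape.
csv_text = """Engine,Flight,Sensor1,Sensor2,Sensor3
1,1,518.67,641.82,1589.70
1,2,518.67,642.15,1591.82
1,3,518.67,642.35,1587.99
"""

# Analogue of readtable(...): one row per flight, one column per sensor.
data = pd.read_csv(io.StringIO(csv_text))
n_flights, n_columns = data.shape
```

In the actual session the same step is a single readtable call on the CSV in the current directory.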
If sensors are constant and never change, well, they're probably not too useful for understanding the condition of this piece of equipment, so we probably want to throw those out. Some of the other sensors are changing over time, but they look a little noisy, and in fact, if you go read the paper that NASA published along with the data set, you'll see that these do have a little bit of random noise in them that we probably want to smooth out. So we'll start by just picking out the variables from our table that we want to keep, and we'll use a simple trailing moving average to smooth out some of the data. Now let's take a look at the result. You see here on top, this was the original data; we threw out the constant sensors, and the other sensors we smoothed out a bit. Obviously there's a lot more pre-processing work that we could do, and MATLAB is very powerful for that, but let's say that this is enough for what we're doing today. The question now becomes: how can we use these sensor signals to help us understand the current state of our equipment, and how can we use them to help us make better decisions about maintenance, without knowing what failures actually look like? Well, one of the traditional ways of monitoring this type of equipment is using something called a control chart. In a control chart we have a center line, which is ideally what we would like our data to be at; we have upper and lower control limits, which are the limits our sensor data should stay within; and then if our sensor goes outside of those limits over time, we might mark that as a potential problem. If you have a small number of sensors and you have a good idea of how to set these upper and lower control limits, then this might work just fine. But for us today, once we throw out the constant-value sensors, we have 14 different sensors left; that means we would have to set up and monitor 14 different control charts. We also don't have a good idea of how to set those upper and lower control limits.
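The pre-processing just described, dropping the constant sensors and applying a trailing moving average, could be sketched as follows in Python/pandas (a toy two-sensor frame stands in for the real table; the window length of 3 is arbitrary, not the value used in the talk):

```python
import pandas as pd

# Toy sensor frame: one constant sensor and one noisy sensor.
data = pd.DataFrame({
    "Sensor1": [100.0, 100.0, 100.0, 100.0, 100.0],  # constant -> drop
    "Sensor2": [1.0, 3.0, 2.0, 4.0, 3.0],            # noisy -> smooth
})

# Drop sensors that never change; zero variance carries no information.
kept = data.loc[:, data.std() > 0]

# Trailing (causal) moving average over a 3-flight window, analogous to
# MATLAB's movmean with a trailing window; min_periods=1 keeps the
# earliest flights instead of producing NaNs.
smoothed = kept.rolling(window=3, min_periods=1).mean()
```

A trailing window matters here: it only uses past flights, so the same smoothing could run online on a live engine.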
So to help us with our problem today, I'm going to briefly introduce one class of modeling called machine learning. Machine learning has become very popular for a certain class of problems, namely problems that have too many variables to easily understand (in this case we have 14 sensors, which I consider to be too many) and where the system is too complex to know the governing equations, so I can't sit down with first principles, like Newton's second law, and derive an equation that describes how this thing is going to behave. You see examples of machine learning coming up everywhere: in speech recognition, in image recognition such as tagging faces, in financial analysis, and certainly in lots of places in engineering. Machine learning can be broadly split into two subgroups: what's called supervised learning and unsupervised learning. Supervised learning is all focused around predictive modeling, around making forecasts. The idea here is that we have a historical record of both inputs and outputs. The inputs are what's readily available to us; for us, that's the sensors we read off of our engine. We also need a historical record of outputs, of correct answers, and this is important because if we have correct answers, then we can supervise our model while it learns: we can see that for this type of input I need to give this type of output, and for this different type of input I need to give this different type of output. In our example, the output would be data from failures, and we don't actually have that right now; we'll look at it a little later. But for the scenario we're currently in, if we don't have data from failures, then we can't actually use supervised learning; we have to use the other kind, what's called unsupervised learning. In unsupervised learning we're actually not trying to forecast anything. What we're doing is looking at the natural structure of the data to see if there's any sort of
groupings or interpretations that we can make to inform us about our system, and this is what we're going to start with today. Now, there are many different types of unsupervised learning out there; we're going to look at just one of the more common ones, which is called principal component analysis. Principal component analysis tries to summarize as much information as possible in a lower-dimensional set; it's trying to reduce the number of dimensions that we need to work with. Let's consider a simpler three-dimensional problem, where we have this sort of cigar-shaped cloud of data points. The first thing principal component analysis does is ask: what straight line drawn through the origin will explain the maximum amount of variance, or spread, in the data? We get something like this, and this is called the first principal component. By moving along this straight line, I can explain more of the data than by moving along any other straight line. But we haven't explained everything, so the next step is to ask: what straight line orthogonal to my first principal component will explain the maximum amount of variance that's left? And we might get something like this. What this really gives me is a new set of coordinates, a new set of axes in which I can measure and visualize my data. I think of this as rotating my axes so that they point in the directions in which the data varies the most, and that then lets me summarize as much information as possible in a smaller number of dimensions. Let's take a look at how we could do this in MATLAB. There are two things we're going to do. The first one is to read in all the available data that we have. Right now we were looking at just one particular engine, but we actually have a hundred different engines organized in this folder here, and I'd like to read all hundred of them in. We're going to use something called a datastore. This gives me a really easy way to read in data from multiple files; it even has tools to help me if the data is too big to fit into memory at once.
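For a small data set like this one, the datastore-plus-readall pattern could be approximated in Python by globbing the per-engine files and stacking them (here we create two tiny temporary files to stand in for the 100 engine CSVs; the file names and columns are made up for illustration):

```python
import glob
import os
import tempfile
import pandas as pd

# Create two tiny per-engine CSVs in a temp folder, standing in for the
# folder of 100 engine files described in the talk.
tmpdir = tempfile.mkdtemp()
for engine_id in (1, 2):
    pd.DataFrame({"Engine": [engine_id] * 3,
                  "Flight": [1, 2, 3],
                  "Sensor1": [0.1, 0.2, 0.3]}).to_csv(
        os.path.join(tmpdir, f"engine{engine_id}.csv"), index=False)

# Analogue of datastore + readall: gather every file, stack into one table.
files = sorted(glob.glob(os.path.join(tmpdir, "*.csv")))
all_data = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)
```

Unlike this sketch, a MATLAB datastore can also stream files chunk by chunk when the combined data does not fit in memory.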
But for today it will all fit, so we'll use a simple readall command to read all the data from the datastore into MATLAB. Now that we've read all the data in and reapplied the smoothing to get rid of the noise, you can see the result: this is all 100 engines plotted on top of each other, and now there actually isn't any clear signal here; there's really just kind of a range, a band moving through time. This is one of the things that principal component analysis is going to help us understand. The other thing we need to do is standardize our data. Our sensors are being measured on different scales and in different units: some of these are measuring temperatures, some are measuring pressures, some are dimensionless ratios. Principal component analysis involves measuring the distances between data points, so in order for those distances to make sense, we need to put these sensors on an equal footing somehow. Now, there are many different ways to standardize or normalize data; it really depends on the characteristics of your data and the type of analysis you want to do. In this particular example our signals are fairly close to normal, in the normal-distribution sense of the word, so we're going to give all of them a mean of 0 and a standard deviation of 1. But again, there are many different ways to do this; it all depends. All right, now that we've read in all the data and standardized it, we're ready to perform our principal component analysis. It's a very simple function in MATLAB: just pca on our sensor data. We'll take a look at two different plots. This first plot is showing us how much the principal components are explaining. In orange are the individual principal components, and you see the first principal component actually explains over 75% of the total variance in the data.
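The two steps just described, z-scoring each sensor and then running PCA, can be sketched with plain NumPy (a synthetic correlated "cigar-shaped" data set stands in for the 14 standardized sensors; the variance fractions below are properties of this toy data, not the 75%/11% figures from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "sensor" matrix: 200 observations of 3 channels, where two channels
# track the same underlying signal, so most variance lies on one axis.
t = rng.normal(size=200)
X = np.column_stack([t + 0.05 * rng.normal(size=200),
                     2 * t + 0.05 * rng.normal(size=200),
                     0.05 * rng.normal(size=200)])

# z-score each column: mean 0, standard deviation 1, as in the talk.
Xz = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via SVD of the centered data; explained[i] is the fraction of total
# variance carried by principal component i (sorted descending).
U, s, Vt = np.linalg.svd(Xz, full_matrices=False)
explained = s**2 / np.sum(s**2)
scores = Xz @ Vt.T   # the data re-expressed in principal-component axes
```

Because two of the three channels are nearly duplicates, the first component here carries roughly two-thirds of the variance, the same "one line explains most of the spread" effect the talk observes on the engine data.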
So even though I have 14 sensors, a 14-dimensional data set, I can explain over 75% of its variance with just a single straight line drawn through the origin. The second principal component can explain about another 11%, and if I add the two of them together, cumulatively they combine to explain almost 90%, meaning that even though I have a 14-dimensional data set, I can explain almost 90% of its variance with a two-dimensional point. And this is what that data, projected onto the two-dimensional plane, looks like. These are all the data points from all 100 engines, only now, instead of plotting them in terms of the original raw sensors, we're plotting them in terms of the first and second principal components. Here we can see that it looks like we have maybe one big cloud of data points that are all pretty densely packed, but we also have some data points that wander pretty far away from it. From this we might take our first initial guess; we might form a hypothesis. We have a lot of engines running for a long time, and we know that most of them are in normal condition most of the time, so this cloud of densely packed data points might represent normal behavior. However, we also know that some of the engines, as they approached their scheduled maintenance date, started to deteriorate a bit, and when our maintenance crews got out there, they noted that some of the engines were badly in need of maintenance. So some of these points that wander away from the central group might represent engines that were starting to deteriorate and were in need of maintenance. Let's say that's our initial guess; what can we do to investigate it? The first thing we might do is make a simple plot looking at the first and last data points for each engine. What we're looking at here in blue, down here, is the first data point for each engine: the engines are running, we start recording data, and in blue is the first data point that we get off of each of them. You
notice that they all reside in that densely packed region. The red data points are the last data points we took from each engine before its scheduled maintenance date. Some of them overlap, and that's actually to be expected: if some of our engines are perfectly fine, in normal condition, at the point when scheduled maintenance is performed, then I would expect them to overlap with the first data points from when we first started recording. Some of the other data points were very far away from where they started by the end. So this might give us our first impression that, as engines deteriorate and move away from normal conditions, they generally move in this direction in the principal components plot. Next we might want to take a look at just the problematic engines. Many of the engines were fine at the time of scheduled maintenance, but some of them were actually badly in need of maintenance, and our maintenance crews identified four of them as being in particularly bad shape. Here we'll take a look at what the last 20 flights from each of those engines look like. Those are the ones we've highlighted here in red, and you can see that those are definitely the ones that wander the farthest away from this big cloud of data points. We can even look at how these different engines evolved over time. What we're looking at here is the path that a particular engine, engine 39, took through the principal components plot while it was running. It started at this yellow point, the first data point; it wiggled around here in the normal region for a while, and then started to move away, first slowly and then more quickly towards the end. And this red data point here is the last data point prior to maintenance. So using what we've learned here, we might take our first initial guess at some criteria to determine when a piece of equipment is transitioning from normal conditions to something else, and
our first guess might look something like this. Our big cloud of data points, we might consider this to be normal conditions; this is the green area we've outlined here, and as long as we're inside of it, everything's fine, nothing to worry about. If we start to wander outside of this region into this yellowish-orange region, we might say, well, it's no longer normal, but it's not in really bad shape yet, so maybe we want to issue some kind of warning to keep an eye on this piece of equipment, or maybe prioritize it to have maintenance soon. If it wanders into the red alarm region, this might mean that the engine has really started to deteriorate badly and we need to make sure that we do maintenance very quickly. I think of this as a two-dimensional version of the control chart that we looked at earlier, but now, rather than plotting our data points relative to their raw signal values, we're projecting them into this principal components plot. But how can we take this and implement it when we still don't actually know what failure looks like in the data? Well, there's no one right answer; there are lots of different ways to do it, but I'd like to quickly sketch out just one idea for how we might implement this. You can think of the data that we got as coming from round one: we have a series of engines, we started running them, we were recording data, and after 125 flights we went ahead and performed our scheduled maintenance. We then took this data, brought it into MATLAB, analyzed it, and developed our criteria. Now the same engines, after having maintenance performed, are going to start running in round two. As they run, we can monitor them and see whether they are in the normal, warning, or alarm regions. Now it's up to us to decide, based on this feedback, when we might want to do maintenance. For example, we might say that if an engine goes outside of the normal region into the warning region for a couple of data points in a row, then we might want to go ahead and prioritize doing maintenance on it.
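One way to read the nested normal/warning/alarm regions is as thresholds on a point's position in the principal-component plane. A minimal sketch, assuming rectangular regions centered at the origin with made-up half-widths of 2 and 4 (the talk deliberately leaves the exact boundaries open):

```python
def classify(pc1, pc2, normal=2.0, warning=4.0):
    """Toy 2-D 'control chart' in principal-component space: points inside
    the inner box are normal, inside the outer box are warning, and
    anything beyond that is an alarm. The half-widths 2 and 4 are
    hypothetical thresholds, not values from the talk."""
    r = max(abs(pc1), abs(pc2))   # box-norm distance from the center
    if r <= normal:
        return "normal"
    if r <= warning:
        return "warning"
    return "alarm"

states = [classify(p1, p2) for p1, p2 in [(0.5, -1.0), (3.0, 1.0), (5.5, 0.0)]]
```

As the talk notes, the true normal region is probably not a perfect rectangle; the same three-way structure works with an ellipse or any other boundary shape once more data comes in.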
Otherwise, as long as it's normal, it should be fine. But if this is the very first time we're doing this, then we probably don't want to let the engines just run forever, so we might still be doing scheduled maintenance, but we'll push back the pace of it: instead of 125 flights, maybe 130 or 135 flights for round two. We'll take all this sensor data, we'll take the feedback from the maintenance crews, and we'll use it to evolve our criteria, to update our models for what is normal, what is not normal, and when it transitions from one to the other. Then, as we get into subsequent rounds, like round three, some of our engines might not last too long and other ones might keep running, and so over time we can continue to push back the pace of this regularly scheduled maintenance until we get to a point where we're not doing any scheduled maintenance at all, and we're only scheduling our maintenance based on this type of criteria. Along the way we'll gather more data, we'll gather more feedback from our maintenance crews, and we'll continue to evolve this. As our initial guess we made some rectangles here, but odds are it's probably not a perfect rectangle; it might be circular, it might be irregular, and so on. So if this was really the case, if we really didn't have any data from failures, then we might have to stop at this point and go implement something similar to what I just described. But we've been a little bit misleading: we do actually have data for each of these engines running all the way until it hits failure conditions; we were just ignoring what happened after flight 125 so that we could look at this type of scenario where we don't have any failure data. This is a little bit like cheating, but since we do have data from running all the way to failure, let's see what would have happened if we had held these conditions constant and let all the engines run until they failed. What we're looking at in this plot here
is all 100 engines run until failure. They're synchronized so that they fail at time zero, and these negative values down here are flights prior to failure. Along the y-axis is the fraction of the engines that are currently in normal, warning, or alarm conditions. You can see that this starts over 350 flights prior to failure, so at least some of these engines ran for over 350 flights before hitting failure conditions; we lost a lot of availability by doing maintenance after only 125 flights. Around 150 flights prior to failure, some of the engines started to wander outside of the normal region and into the warning region. Around 70 or 75 flights prior to failure, some of them started getting into the alarm region. By around 25 flights prior to failure, all of them had left the normal region; none of them were in normal conditions anymore. And by around nine flights prior to failure, all of them had entered the alarm region, so every single engine would have generated an alarm for at least nine consecutive flights prior to failure. There's a lot we can learn from this type of plot. I think a lot of people would ideally like to see these transitions from one state to the next all happen at once; maybe, for example, at 100 flights prior to failure all of the engines at once would move from the normal to the warning conditions. But that's not what we see, and there are two potential causes for that. Number one, this is using our initial guess for what the conditions should be for normal, warning, and so on, and those certainly could be improved. But another possibility is that this represents what's really going on: some of the engines start to deteriorate fairly early on but do so at a slower rate, and it takes them a long time to hit failure, whereas other ones might be totally fine all the way up to the very end and then very quickly transition from normal conditions into failure.
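The fraction-in-each-state curves just described reduce to a simple computation once every engine's per-flight state is known. A rough NumPy sketch, using a tiny hypothetical state matrix in place of the real 100 engines:

```python
import numpy as np

# Toy state matrix: rows = engines, columns = flights prior to failure
# (leftmost = earliest). 0 = normal, 1 = warning, 2 = alarm.
states = np.array([[0, 0, 1, 2],
                   [0, 1, 1, 2],
                   [0, 0, 0, 2]])

# At each flight index, the fraction of engines currently in each state;
# these are the three curves stacked in the plot described in the talk.
frac_normal  = (states == 0).mean(axis=0)
frac_warning = (states == 1).mean(axis=0)
frac_alarm   = (states == 2).mean(axis=0)
```

In this toy data, as in the talk's plot, everything starts normal and every engine is in alarm by the final flight before failure, with a staggered transition in between.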
Looking at this type of analysis, we can identify which engines are which, pull them aside, and look at them to see if there are any conclusions we can draw. Finally, we might ask: how much uptime did we really gain? That depends on exactly how we're going to do our maintenance. For example, if we did maintenance upon the first warning condition, we might gain a little bit of additional uptime versus doing it as scheduled maintenance; if we did it upon the first alarm signal, we'd gain some more; and then there's the maximum amount we could gain by doing maintenance one sample before failure, which is like perfect foresight. Now, if you've done this type of work before, these numbers might look a little unrealistic; it turns out that NASA curated this data so that it starts fairly close to failure. However, the trend tends to hold true: there's usually a more conservative implementation that might only gain us a little bit over our scheduled maintenance, there's a more aggressive one that can get us more but is a bit riskier, and then there's the maximum amount we could possibly get if we had perfect foresight. So in this first scenario we were looking at what we could do if we don't have any data from failures, but as I just mentioned, we actually do have data from failures in this data set, so let's take a look at what we could do now that we have failure data. This is going to be our second scenario for today: when we actually have data from failures. When this happens, typically we're not being reactive; we are performing scheduled maintenance, but failures are still occurring, and this may actually be by design, because we may find that in order to eliminate all failures we'd have to do scheduled maintenance at an unrealistic rate. So we can search our records for when the failures occurred and gather the data that preceded those failure events, and the question now is: can we predict how long we have until failures occur, now that we have this data from failures? This puts us into the
Now that I have both inputs and outputs, I can make some forecasts. Supervised learning has two main subcategories: regression and classification. Regression is where the output is continuous, so this could be something like power or distance or time; in our example, a regression approach would try to predict exactly how long we have until failure. Classification is a little different: the output comes from a small discrete set. The output could be binary, yes or no: will this piece of equipment fail in the next ten cycles? Or the output could come from a few different choices: is this piece of equipment urgently in need of maintenance, or does it have a short, medium, or long time until failure? Both regression and classification are valid approaches, both are used in industry, and each has its own set of pros and cons. For the sake of time, today we're just going to look at one of them: classification.

I'll also make one quick note on how this data was recorded. Each engine was run for an unknown amount of time before data recording started. So, some unknown amount of time after initial use or prior maintenance, but still in normal conditions, we start recording engine one and continue until it hits failure conditions. For engine two, after a different unknown amount of time following initial use or prior maintenance, we start recording all the way through, and so on through engine 100. Now that we have all this data running all the way to failure, the goal is this: given a new engine, I want to be able to start recording, stop at any moment, and make a forecast so that I can smartly plan my maintenance. Let's hop back over to MATLAB and take a look at our classification problem. Earlier we looked at importing data into MATLAB, visualizing it, and cleaning it up; since we've already done that, we're going to skip it and just load in that data.

If we want to perform classification, we need to set up the classes: what are the buckets we're going to try to predict each data point belongs in? There's no one right way to determine what classes to use; today we're going to use four of them. If something is really close to failing, we'll call that urgent, and to help with longer-term planning we'll also create buckets for engines that have a short, medium, or long time until failure. An important point is where to draw the boundaries between these buckets: when does one transition into the next? This is something we can't determine purely from the sensor data. Usually we have to go talk with other groups, like our maintenance workers, and ask them: if I tell you that you need to do maintenance in the next three flights, is that easy to do? Does that deserve an urgent rating? And so on. For today we just chose a couple of arbitrary boundaries between these buckets: for example, if something is within 50 flights of failing we'll call it urgent, and if it's beyond 50 flights but within 125 we'll call it short, and so on.

Now let's take a look at all of our data running all the way to failure and what these buckets look like. Here again are nine of the sensors coming off this particular engine; all 100 engines are plotted, synchronized so that they all fail at time zero, and the negative values are flights prior to failure. The different colors are the different classes we're going to try to predict for each data point. The challenging thing is that while there does look to be a clear trend in some of these as they approach failure, it's tough to say when one class crosses over into the other, because it's kind of a thick bar as we move from left to right.
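The bucket assignment itself is a simple thresholding step. In Python it might look like the sketch below; the 50- and 125-flight boundaries are the ones given in the talk, while the 200-flight boundary between medium and long is a placeholder assumption.

```python
def label_rul(flights_until_failure):
    """Map remaining useful life (in flights) to one of four classes."""
    if flights_until_failure <= 50:
        return "urgent"
    if flights_until_failure <= 125:
        return "short"
    if flights_until_failure <= 200:  # assumed medium/long boundary
        return "medium"
    return "long"

print([label_rul(r) for r in (10, 50, 90, 140, 300)])
# ['urgent', 'urgent', 'short', 'medium', 'long']
```

Applying this to every recorded flight of every engine yields the class label that the sensor readings at that flight will be trained to predict.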
So this is where we bring in machine learning again, and I'm actually going to bring up an app to help us. MATLAB has a number of apps that come with its various toolboxes (and you can also build your own) for solving many common types of problems: machine learning, neural networks, control system design, signal processing, image processing, computer vision, connecting with hardware, and so on. I'm a big fan of the apps because they give us a really quick and easy way to set up our problems and try things out, and most of them can automatically generate MATLAB code for us, so we get a programmatic way to replicate what we just did interactively. In this case I'm going to open the Classification Learner app, which is for solving classification machine learning problems. We'll start by importing a variable from our workspace: you'll notice I've got 14 sensors here, which I've picked out as predictors, and the last variable we created, the urgent/short/medium/long label, is our response.

We also need to determine how we're going to test our model. The idea here is that we have some historical data and we want to train this model on our desktop, but eventually we want to use the model in the real world, where it will get inputs it has never seen before and where we're not sure what the correct answer is, and we want some level of confidence in how accurate its predictions will be on that data. So we take our historical data and split it into two groups. One group is called the training data; this is the data we use to teach our model the relationships between inputs and outputs. The other is called the test or validation data, and we hold it aside; we essentially pretend it is future data we did not know about.
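The random holdout described above can be sketched in a few lines. This Python version fixes a random seed for reproducibility, something the interactive app handles for you.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=0):
    """Randomly hold out a fraction of records for testing."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

data = list(range(100))          # stand-in for the historical records
train, test = train_test_split(data)
print(len(train), len(test))     # 75 25
assert not set(train) & set(test)  # the two groups never overlap
```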
Once we've trained our model, we take the test inputs, feed them through the model to get test predictions, and compare those predictions with the test outputs to get a sense of how the model will perform on inputs it has never seen before. There are a couple of different ways we might do this; we're simply going to take a random 25% of the data and hold it aside to test with, and train with the other 75%.

OK, so now we've got our data into the app. We can take a quick look at how some of these variables relate to one another, do some manual feature selection by turning variables on and off, or do principal component analysis and apply machine learning to the results. We can also just go ahead and pick some machine learning algorithms to try. Most of the more common machine learning techniques are available here. If I want to try something like a medium-sized decision tree, I can pick that out and train it, and it tells me it's about 65 percent accurate overall. If I want to try something else, like a linear discriminant, I can pick that out; it doesn't do quite as well, at about 51 percent accuracy overall. And this is a really important part of a machine learning workflow: one of the most challenging things about machine learning is that there's no way to know ahead of time which model will be best for my data until I try a bunch of them and see. This gives me a really easy way to pick through some pre-made options, or I can even open the advanced options to fine-tune them myself. Since I've run this before, I know that in this particular case a k-nearest neighbors algorithm gets the best result, but again, there's no way to know that until you try several and compare. Now, as many ways as there are to evaluate accuracy, the most common is what's called a confusion matrix.
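The try-several-models loop looks roughly like this. The app's decision trees, discriminants, and k-nearest-neighbor models are MATLAB built-ins, so this Python sketch substitutes two toy stand-ins, a majority-class baseline and a from-scratch 1-nearest-neighbor classifier on a single feature, just to show the compare-on-held-out-data pattern.

```python
def majority_class(train_pairs):
    """Most common label in the training set."""
    labels = [label for _, label in train_pairs]
    return max(set(labels), key=labels.count)

def predict_majority(train_pairs, x):
    return majority_class(train_pairs)   # ignores the input entirely

def predict_1nn(train_pairs, x):
    # nearest neighbor on a single scalar feature, for brevity
    nearest = min(train_pairs, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def accuracy(predict, train_pairs, test_pairs):
    hits = sum(predict(train_pairs, x) == y for x, y in test_pairs)
    return hits / len(test_pairs)

# Toy data: (health indicator, class); lower indicator means closer to failure.
train = [(0.1, "urgent"), (0.2, "urgent"), (0.5, "short"),
         (0.7, "long"), (0.8, "long"), (0.9, "long")]
test = [(0.15, "urgent"), (0.55, "short"), (0.85, "long")]

for name, model in [("majority baseline", predict_majority), ("1-NN", predict_1nn)]:
    print(name, accuracy(model, train, test))
```

As in the webinar, the nearest-neighbor approach wins on this toy data, but the only way to know was to run both and compare.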
A confusion matrix lets us compare the predicted class, on the horizontal, with the true class, on the vertical, and see where we're getting right answers and what types of mistakes we're making, and we can compare different techniques with it. Here you can see that when the correct answer is urgent, this medium decision tree correctly predicts urgent about 75% of the time. When the correct answer was urgent and I incorrectly predicted short, which you can think of as a false negative, that happened about 24 percent of the time. When the correct answer was short and I incorrectly predicted urgent, sort of a false positive, that happened about three percent of the time. So we can use the confusion matrix to compare not just overall accuracy but the different types of mistakes each technique makes.

There's a lot we can do to try to improve the accuracy of our machine learning models, whether on the pre-processing side with noise and outlier removal, by creating additional features or signals to work with, or by trying out different machine learning models. Eventually we'll be happy with what we have, so let's say this is good enough for today. In the end, I can either take this model and export it back to my workspace so I can use it, or generate some MATLAB code. This generates a new MATLAB function for me that takes the training data I want to use and retrains the model, and you can see the command right here. This is a huge time-saver: set things up interactively in the app, then have it generate all the MATLAB code you need. In this case I was able to just copy and paste it over into my live script to retrain our k-nearest neighbors algorithm, and you can see the results down here. Now, there's a little bit of randomization in the process of training a machine learning model, so you'll notice the results change a little from run to run, but they should be generally the same.
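The confusion matrices shown here are just tables of counts, so computing one from predictions is a small counting exercise. A Python sketch with toy labels follows; the 3-of-4 urgent rate loosely mirrors the roughly 75% figure mentioned earlier, and the class set is trimmed to two for brevity.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows: true class. Columns: predicted class. Entries: counts."""
    counts = Counter(zip(true_labels, predicted_labels))
    return {t: {p: counts[(t, p)] for p in classes} for t in classes}

classes   = ["urgent", "short"]
truth     = ["urgent", "urgent", "urgent", "urgent", "short", "short"]
predicted = ["urgent", "urgent", "urgent", "short",  "short", "urgent"]

cm = confusion_matrix(truth, predicted, classes)
print(cm["urgent"])  # {'urgent': 3, 'short': 1}: right 3 times out of 4
print(cm["short"])   # {'urgent': 1, 'short': 1}: one false positive
```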
In this case, when the correct answer is urgent, we predict urgent about 92% of the time, and about 8% of the time we make some kind of mistake. Now, by default, machine learning models treat all of these different kinds of errors as equal, and that's rarely the case in the real world. For example, the urgent cases are where our equipment is in a really bad place: it badly needs maintenance and we're running a high risk of an unexpected failure. Missing those is potentially a much more expensive mistake than, say, predicting a medium time until failure when the true answer is long. What we can do is specify what's called a cost matrix, which lets us state the actual cost of making each type of mistake. For the mistakes that are more costly, like failing to flag an urgent engine, I can increase the weight in the cost matrix, and when I retrain my model it will penalize those types of mistakes more heavily. So let's retrain the model; all I have to do is pass one additional argument giving it the cost matrix. There we go, and let me pop this out so we can see them side by side. On the left is the original result, and on the right is the result of using the cost matrix. You can see we went from about 92% accurate to over 99% accurate in the urgent case, because I told the algorithm during training that those mistakes are very expensive to make. However, this does not come without a cost of its own: notice that for the short class, we originally got it right about 85 or 86 percent of the time, and now only about 68 percent. So this isn't reducing the number of mistakes we make; it's shifting the mistakes to areas where they are less costly to make.
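The evaluation side of this idea, scoring models by the total cost of their mistakes rather than the count of them, can be sketched as below. The cost values are illustrative assumptions, and note that in the webinar the cost matrix is also used at training time, which this sketch does not replicate.

```python
# COST[true][predicted]: the cost of predicting `predicted` when truth is `true`.
COST = {
    "urgent": {"urgent": 0, "short": 10},  # a missed urgent engine is expensive
    "short":  {"urgent": 1, "short": 0},   # a false alarm is mildly expensive
}

def total_cost(true_labels, predicted_labels):
    return sum(COST[t][p] for t, p in zip(true_labels, predicted_labels))

truth   = ["urgent", "urgent", "short", "short", "short", "short"]
model_a = ["urgent", "short",  "short",  "short",  "short", "short"]   # 1 mistake
model_b = ["urgent", "urgent", "urgent", "urgent", "short", "short"]   # 2 mistakes

print(total_cost(truth, model_a))  # 10: fewer mistakes but higher total cost
print(total_cost(truth, model_b))  # 2: more mistakes, all of them cheap
```

Model B makes twice as many mistakes yet costs a fifth as much, which is exactly the trade the cost matrix pushes the trained model toward.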
But if the cost matrix we defined really represents what these different mistakes cost us, then the result on the right is the better one, because it is not optimizing for the minimum number of mistakes; it is optimizing for the minimum cost of our mistakes, and that's what we're really interested in. We could keep iterating on this, and there are lots of other things we could do to improve the accuracy of our models, but let's say this is good enough for today. The question now becomes: how can I take this outside my desktop copy of MATLAB and use it to actually make forecasts on my equipment as I get new data? There are a couple of different ways we might do that.

MATLAB can do a lot of things: math and algorithms, visualizations, even graphical interfaces, and I can deploy almost everything, about 99% of what you do with MATLAB and its toolboxes, through MATLAB Compiler. MATLAB Compiler is an add-on to MATLAB that generates a wrapper around your MATLAB code and compiles it into a software library: an executable, a C++ library, a .NET assembly, a JAR file, a Python library, and so on. You can then use these libraries wherever you would normally use them, but they're running MATLAB code behind the scenes. You can give them to your colleagues and run them wherever you like: on the desktop, on a server, in the cloud. Since they're actually running MATLAB code under the hood, they need access to something called the MATLAB Runtime. The MATLAB Runtime is a freely distributable copy of the MATLAB engine that can run these MATLAB Compiler-built components. Once you build them, both the components and the engine needed to run them are licensing- and royalty-free, and you can run them wherever you like. However, since this is just running MATLAB code against the MATLAB engine, it can really only work on hardware that could normally run MATLAB.

A second option, if I only care about the math, the really core algorithms, is to use something called MATLAB Coder. MATLAB Coder is an actual code-translation product: it translates your MATLAB code into standalone ANSI C code. It supports a much smaller subset of MATLAB and its toolboxes, but still covers a really wide array of the math and core algorithms. You can take that C code and do whatever you normally do with it: build it into libraries, even run it on embedded hardware, on the device itself, which could not normally run MATLAB. You can even connect the two and do a combination: maybe generate C code for some of your algorithms, like the pre-processing and noise removal, and push those down to run on the device itself before uploading any data, and then, once the data is uploaded to some central place like a server, run the more computationally heavy algorithms, like the machine learning, through MATLAB Compiler on the central server infrastructure. So whether you want to deploy your MATLAB algorithms to something like a server or a cloud, to the equipment itself, or do a combination of the two, there are tools in MATLAB to help you do it.

So where do you go from here? In addition to a lot of free information available on our website, we also have very robust training and consulting offerings. Our training organization does both in-person and online classroom training, on site and at public locations, and we also have self-paced online training. Our training covers all of the core workflow we talked about today: importing, visualizing, and processing data, doing machine learning, deploying MATLAB applications, and many other topics. Our consultants are very experienced in this area, both in the core data analysis and modeling and in integrating MATLAB into other systems.

A few key takeaways for today. Frequent maintenance and unexpected failures are a large cost in many industries, and MATLAB enables engineers and data scientists to quickly create, test, and implement predictive maintenance programs; in fact, we looked at two very different ways of doing that today. And if we can get predictive maintenance right, it can save equipment operators a lot of money, improve the reliability and safety of our equipment, and even create new opportunities for services that equipment manufacturers can provide. So that's about it for me today. Thank you all for tuning in. If you haven't done so already, please enter any questions you may have in the Q&A panel; we're about to go offline for a few moments to compile those, and then we'll come back online and answer as many as we have time for. Thank you.
