Fault Detection

Status
Not open for further replies.

carloabarth

Automotive
Aug 20, 2002
17
I am working on a problem in fault detection. It involves detecting faults in a system (plant) that is made up of subsystems that are copies of each other. These subsystems can be interchanged, and the size of the plant is determined by how many subsystems are connected. The subsystems can vary in age and state of maintenance, and can even vary in their sensing and actuating capability (hence they may not be exactly 'copies' of each other).

My question to the wisdom of the group is: Does anyone know of any research or application work in the area of fault detection on systems made up of dynamically dependent subsystems? I'm looking for ideas here...
 
I'm not sure what you're really asking. A microprocessor consists of millions of subsystems (transistors), and there are plenty of people able to test them. "Built-In Test (BIT)" is the common nomenclature for this.

Seems to me, the question should be "what failures are you trying to detect," and "what mechanism do you have for detecting them?"



TTFN

FAQ731-376
Chinese prisoner wins Nobel Peace Prize
 
The system I am interested in is a dynamically coupled mechanical system powered by actuators under closed-loop control, with various sensors providing input. The sensor and actuator behaviors are all driven by the dynamic state of this system, so in this sense the subsystems are coupled in a way that I don't believe is true of a microprocessor (or is it?). I don't know anything about BIT, but I think it looks at performance trends or limits to detect faults, as opposed to, for example, a technique like analytical redundancy, which looks at the relationship between the dynamic behavior of the system and the individual (and often dissimilar) sensors and actuators.
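
To illustrate what I mean by analytical redundancy, here is a minimal sketch (the model matrices, observer gain, and threshold are all invented for illustration): an observer built from a model of the plant predicts the output, and a fault is flagged when the measurement disagrees with the prediction by more than noise and model uncertainty should allow.

```python
import numpy as np

# Invented discrete-time model: x[k+1] = A x[k] + B u[k],  y[k] = C x[k]
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
L = np.array([[0.5],
              [0.3]])        # observer gain, picked ad hoc for this sketch
THRESHOLD = 0.2              # residual size that noise alone shouldn't produce

def residual_faults(u_seq, y_seq):
    """Flag samples where the measured output disagrees with the model prediction."""
    x_hat = np.zeros((2, 1))
    flags = []
    for u, y in zip(u_seq, y_seq):
        y_hat = (C @ x_hat)[0, 0]            # model-predicted output
        r = y - y_hat                        # the residual
        flags.append(abs(r) > THRESHOLD)
        x_hat = A @ x_hat + B * u + L * r    # observer correction step
    return flags
```

The point is that the residual is driven by the dynamics of the whole system, not by any single sensor's limit.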

One could implement fault detection on each subsystem, looking at the dynamic behavior of only that subsystem, but I think there is a benefit to implementing a system-wide fault detection scheme. The problem is that one must then keep track of the system topology and account for some variation in the sensing and actuating capability of each subsystem.

I'm wondering if the benefit of looking at the 'global' dynamics to make fault detection more accurate is worth solving the problems introduced by subsystem variability. I haven't been able to find any research that addresses this issue directly, so I'm wondering if I'm wading into uncharted territory, or maybe just missing an obvious related area of activity.

So I thought I'd ask here and see if anyone had any insight.
 
While there are obvious differences, there are also obvious similarities. For your specific example, there are BIT tasks that the controller can look at, depending on what the expected failure modes are. Some examples (a rough sketch of the first one follows the list):

> monitor actuator power -- increased power implies possible bearing or seal wear or malfunction
> offline Bode plots -- resonances and frequency performance can reveal much about the physical state of the equipment.
> look at the sensor inputs to determine if "noise" is changed -- can reveal changes in friction, etc.
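
As a rough sketch of the first item (the baseline, limit, and window here are invented, not from any particular system):

```python
from collections import deque

# Illustrative BIT-style trend check: rising actuator power draw can indicate
# bearing or seal wear. All numbers are invented.
BASELINE_WATTS = 40.0
TREND_LIMIT = 1.15          # flag when the rolling average creeps 15% high

class PowerTrendMonitor:
    def __init__(self, window=50):
        self.samples = deque(maxlen=window)

    def update(self, watts):
        """Feed one power sample; returns True when the trend limit is exceeded."""
        self.samples.append(watts)
        avg = sum(self.samples) / len(self.samples)
        return avg > BASELINE_WATTS * TREND_LIMIT
```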

Few people implement all the BIT that they could, since it usually adds a noticeable amount to the overall cost of the system; in particular, the processor would need to be upsized in throughput to manage the additional BIT tasks.

TTFN

FAQ731-376
Chinese prisoner wins Nobel Peace Prize
 
Thanks for those examples. I'll read up on BIT to learn more about it.

However, those examples are all looking at limits or trends in individual component behavior (increased power, changes in friction). What I'm especially interested in is fault detection of individual components based on system/subsystem dynamics.

For instance, a controller in an automobile might identify a fault in the left-front wheel brake based on vehicle dynamic behavior. Or, a factory production controller might identify a fault in one production process based on the dynamic performance of the entire production line. But, extending the factory example, would there be a benefit to having such a fault detection scheme working on the larger supply chain, made up of several such factories? On the face of it there would be an advantage, just as there are advantages to looking at system dynamics vs. sensor performance alone. But I'm having trouble backing that statement up with facts. Any thoughts?
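
To make the brake example slightly more concrete (every signal name and threshold here is invented): during straight-line braking the yaw rate should stay near zero, so a yaw bias that appears only under braking points to an asymmetric brake force at one corner.

```python
# Invented illustration: infer a one-corner brake fault from vehicle-level dynamics.
YAW_BIAS_LIMIT = 0.05   # rad/s of sustained yaw during straight-line braking

def brake_asymmetry_suspected(samples):
    """samples: (brake_pressure_bar, steering_angle_rad, yaw_rate_rad_s) tuples."""
    yaw_under_braking = [y for p, s, y in samples
                         if p > 10.0 and abs(s) < 0.02]   # braking, wheel straight
    if len(yaw_under_braking) < 20:                       # not enough evidence yet
        return False
    bias = sum(yaw_under_braking) / len(yaw_under_braking)
    return abs(bias) > YAW_BIAS_LIMIT
```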
 
They may be too old to find online, but I think NASA published some papers about how the Space Shuttle's five general-purpose computers interact, back each other up, and check one another's behavior, and why there are so many of them.



Mike Halloran
Pembroke Pines, FL, USA
 
I think we're having trouble communicating. The performance anomalies ARE at the system level, because in most such systems, there is nothing to measure at the component level. So, seal friction is detected by the overall control system when it measures the power required by the SYSTEM to overcome the friction from the seal, or bearing. Likewise, the noise characteristics are measured at the SYSTEM level, and it's up to some smart software algorithm to determine where the noise is coming from, within the SYSTEM.
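
If it helps, the flavor of such an algorithm might be something like this (every signature and number is invented for illustration): tabulate the residual "signature" each fault leaves on the system-level measurements, then attribute an observed anomaly to whichever signature it best matches.

```python
import numpy as np

# Hypothetical fault localization: match a system-level residual vector against
# pre-tabulated fault signatures to guess WHERE the fault is. All values invented.
SIGNATURES = {
    "seal":    np.array([0.9, 0.1, 0.4]),
    "bearing": np.array([0.7, 0.7, 0.1]),
    "sensor":  np.array([0.0, 0.2, 1.0]),
}

def localize(residual):
    """Return the fault hypothesis whose signature best aligns with the residual."""
    r_hat = residual / np.linalg.norm(residual)
    scores = {name: abs((sig / np.linalg.norm(sig)) @ r_hat)
              for name, sig in SIGNATURES.items()}
    return max(scores, key=scores.get)

print(localize(np.array([0.8, 0.2, 0.5])))   # -> "seal" for this made-up residual
```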

As for your factory example, I don't think it's necessarily the same paradigm. Unless the individual factories produce products that directly interact with each other, there would be nothing to be gained.

A counterexample of this would be the anecdotal tale of Mazda transmissions built for Ford cars. Ford supposedly wound up rejecting all of the Mazda transmissions because they were built too tight, i.e., they had close to zero variation in dimensions, as was intended by the factory culture of that time. Meanwhile, Ford's engines generally were at one extreme of a dimension or the other, and Ford would match engine to transmission by compensating for the error. With transmissions that had no error, they couldn't do their mix-and-match trick.

But, if the products have no interaction, then there's really nothing to test or diagnose.

TTFN

FAQ731-376
Chinese prisoner wins Nobel Peace Prize
 
@Mike Halloran: Thank you for that citation. I found it, and it's very interesting. It's not exactly what I'm looking for, since the configuration of the space shuttle computers doesn't affect the dynamic behavior of the shuttle, i.e., the shuttle would perform the same whether there were 2 or 5 computers.

@TTFN: Maybe the factory example is not the best analogy. But humor me a moment. What if the supply chain for a product were made up of sets of equivalent sources (2nd and 3rd sources, etc.), and these sources could be interchanged, deleted, etc., such that the final result (product quality, production volume, warranty claims, etc.) was a dynamic response of the system of sources? Couldn't you then ask: Is it most effective to have individual fault detection schemes for each source? Or is it worth developing a fault detection scheme that works on the entire supply chain, because the added information in the combined dynamic response makes fault detection more effective?

Your comment about the noise characteristic measured from the system is interesting: "it's up to some smart software algorithm to determine where the noise is coming from." I'm very interested in what that smart software algorithm looks like, and in whether, in the case where five of these systems produce a combined noise response, it would be better to have one smart algorithm working on the combined response instead of five individual algorithms working in parallel.
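
Here is the toy version of the comparison I have in mind (every number invented): five subsystems each produce a residual, and a small fault shifts all of them. Per-subsystem detectors threshold each channel alone; a "global" detector pools the channels, so averaging shrinks the noise and a shift too small for any single channel becomes visible.

```python
import numpy as np

# Toy comparison: a fault adds SHIFT to every subsystem's unit-variance residual.
# Both tests use a 3-sigma threshold on their own statistic. Numbers invented.
rng = np.random.default_rng(0)
N, SHIFT, TRIALS = 5, 1.0, 100_000

r = rng.normal(SHIFT, 1.0, (TRIALS, N))           # residuals from 5 subsystems

local_hit = (r > 3.0).any(axis=1).mean()          # five independent detectors
global_hit = (r.mean(axis=1) > 3.0 / np.sqrt(N)).mean()   # one pooled detector

print(f"any-local detection rate: {local_hit:.3f}")    # roughly 0.11
print(f"pooled detection rate:    {global_hit:.3f}")   # roughly 0.22
```

Of course this only pays off when a fault actually couples into several subsystems' dynamics, which is exactly the premise in question.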

Sorry for the verbose post.
 
Nowhere have I suggested localized, individual algorithms, because there would be no data for them to operate on. And since the data is the aggregate behavior of the ensemble, there would only be one algorithm.

However, again in the case of your supply chain, the analogy cannot be carried that far. Failures in the field are discrete events, and the only "algorithm" that's needed is to notice whether Supplier B is having more failures than anyone else.
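
Something on the order of this back-of-the-envelope check (all counts invented) is the whole "algorithm":

```python
# Illustrative: "detection" on discrete field failures is just rate comparison.
failures = {"Supplier A": 3, "Supplier B": 19, "Supplier C": 4}
shipped  = {"Supplier A": 1000, "Supplier B": 1000, "Supplier C": 950}

rates = {s: failures[s] / shipped[s] for s in failures}
worst = max(rates, key=rates.get)
fleet = sum(failures.values()) / sum(shipped.values())
print(f"flag {worst}: {rates[worst]:.1%} vs fleet average {fleet:.1%}")
```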

TTFN

FAQ731-376
Chinese prisoner wins Nobel Peace Prize
 