
Provision Spares with Confidence


shg4421 (Electrical, US)
Apr 7, 2018
I've asked this question elsewhere without response.

We sell systems comprising line-replaceable units (LRUs). We know how many operating hours have accumulated on each LRU type (because we know when they went into service, and we ask our customers what their operating tempo is), and we know how many failures have occurred by LRU type (because they send them to us for repair).

The current version of our product is relatively new, and so has acquired few operating hours and (blessedly) few failures.

I can calculate the MTBF confidence interval for each LRU (using NIST's formula for a Constant Repair Rate Model) at a given confidence level, and I know how to use inverse Poisson to calculate the max failures at some confidence if I knew the actual MTBF.
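
As a minimal sketch of those two calculations, assuming the standard chi-square bound for a time-terminated (Type I censored) test and using SciPy; the function names are purely illustrative, not anything from the NIST handbook:

from scipy.stats import chi2, poisson

def mtbf_lower_bound(hours, failures, confidence=0.95, two_sided=True):
    # Lower confidence limit on MTBF from total unit-hours and observed failures,
    # via the chi-square distribution with 2*failures + 2 degrees of freedom.
    p = (1 + confidence) / 2 if two_sided else confidence
    return 2 * hours / chi2.ppf(p, 2 * failures + 2)

def max_failures(mtbf, hours, confidence=0.95):
    # Smallest n such that P(failures <= n) >= confidence (inverse Poisson),
    # given an expected number of failures of hours/mtbf.
    return int(poisson.ppf(confidence, hours / mtbf))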

My question is: given operating hours and failures by LRU, and the number of LRU operating hours required for the remaining lifecycle, how do I calculate, at some specified confidence, the number of lifetime spares a customer should buy when the LRUs come into diminishing supply? I'd have thought this would be a pretty common calculation.

So I'm looking for a function like this:

=NumSpares(HoursSoFar, Failures, HoursToGo, Confidence)

Two examples from among the six LRUs:

LRU type with 32,441 operating hours and no failures, with 90,247 hours to end of lifecycle.

LRU type with 251,775 operating hours and 1 failure, with 270,742 hours to end of lifecycle.

Thanks for reading.
 

Auto manufacturers clearly do that, and have been doing it for a long time.
Basically, all the spares that are predicted to be needed for a given make/model fleet are diverted from regular production and packaged for storage and eventual resale while the vehicle is in production, and are never produced again.
You might be able to social engineer your way to someone who can answer the question, unofficially of course, for a given manufacturer.

I don't know any more about it than that, except that they do get it wrong occasionally; some cars have to be discarded (or hacked) because the entire world lifetime supply of some unique part runs out earlier than predicted.



Mike Halloran
Pembroke Pines, FL, USA
 
@MikeHalloran: I think most car parts (other than electronics) have wearout mechanisms rather than random failures. I'm not saying that's easier to sort out, but it's different.

@IRStuff: The Crow-AMSAA math is a little daunting, but at a glance, it appears to solve for the shape parameter of the Weibull distribution. That results (I reckon) in some kind of confidence interval, which I already have, even if it's not exactly the same.

Suppose I choose the bottom of the 95% confidence interval and call that the MTBF. Then to estimate spares to meet the lifecycle hours, I have to pick a confidence level -- again! If I pick 95%, what's the resulting probability that the customer won't run out of spares -- 95%^(1/2), or 95%, or 95%^2, or something else?

What if I chose the lower bound of the 90% confidence interval, and then calculated max failures at 95% confidence? Or vice versa? Is that just stupid on the face of it?

To pick a specific example, I have an LRU with 261,000 hours and one failure, and 282,000 hours to meet the required lifecycle. If you glance at those numbers, you might say, "I oughtta buy two spares, maybe three to be safe." But the bottom of the 95% confidence interval is only about 47,000 hours ("OK, so six spares"), and sizing spares at 95% confidence against that MTBF says you need 10(!) spares.
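
For reference, those two numbers can be reproduced with a quick check (SciPy assumed, two-sided 95% interval):

from scipy.stats import chi2, poisson

hours, failures, hours_to_go = 261_000, 1, 282_000
mtbf_low = 2 * hours / chi2.ppf(0.975, 2 * failures + 2)  # lower end of two-sided 95% CI
print(round(mtbf_low))                                    # ~46,800 ("about 47,000 hours")
print(poisson.ppf(0.95, hours_to_go / mtbf_low))          # 10.0, the "10(!) spares"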

It just seems like this has surely been done a zillion times, and there should be a nice, clean, commonly-accepted solution. But I must be using the wrong search terms, because I can't find anything close.

I should add that we have this same issue internally; we need to stock lower-level components to meet our 10-year supportability commitment for LRU repair, so we need to buy motherboards, power supplies, graphics cards, ..., plus some wear-out items, like fans, disc drives, .... The customer spares are at LRU level only, for the balance of their planned 15-year lifecycle. But I don't have to solve that problem by Friday when my proposal is due.

Thank you both for responding.
 
I think the car guys have the wearout mechanisms under control, and maybe even random failures. The specific instance that came to mind first was a special plastic elbow used in my '85 Camaro V6's smog system, which was injection molded of a material that degraded too quickly and failed. I ordered a new one and prepaid my Chevy dealer for it. A month later, the schmuck didn't have it, admitted he would never have it, and had not called me to say so, or even written a note on the postcard I had filled out for the purpose of being notified of the part's availability. It was a new part, specific to that engine and vehicle, and there were just no replacements left. I was able to fix the problem with some duct tape and plumbing fittings, but that would not have passed inspection in a Vogon state like New York.

I feel your pain. I used to work on a product that used embedded motherboards. All it needed was a basic 8088. But every six months, the bottom would drop out of the motherboard market, and we couldn't get the ones we had been using, so we had to upgrade the product to a 286, a 386, then a 486, rewriting our code along the way. ... and then they changed the envelope, and then the connector mix on the motherboard, etc. We had a talented electronics engineer doing nothing but testing new and proposed motherboards to make sure they would work with our hardware, or could be adapted. ... full time. I'd have gone nuts.

As for you, it sounds like you need to buy all 10 spares, now, while you can still get them.
The other alternative is industrial motherboards, which change less often, and cost 5x the price of commodity motherboards.



Mike Halloran
Pembroke Pines, FL, USA
 
Jardine and Tsang, chapter 2.11, provide the model I think you are looking for:

Expected number of spares in the interval = preventive replacements in the interval + corrective replacements in the interval, i.e.

EN(T, t) = T/t + (T/t)*H(t)

where T is the planning horizon, t is the replacement interval, and H(t) is the number of failures expected in the interval (0, t). Derivations and how to calculate it are provided earlier in the chapter.
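
A minimal illustration of that formula, under the simplifying assumption of a constant failure rate so that H(t) = t/MTBF (Jardine and Tsang develop more general forms of H(t)); the usage numbers are made up:

def expected_spares(T, t, mtbf):
    # EN(T, t): preventive plus corrective replacements over planning horizon T
    # with replacement interval t, assuming H(t) = t / mtbf.
    preventive = T / t                   # one planned replacement per interval
    corrective = (T / t) * (t / mtbf)    # expected failures per interval, summed
    return preventive + corrective

# e.g. a 15-year horizon of continuous use, yearly replacement, 50,000 h MTBF
# (illustrative numbers only):
# expected_spares(T=15 * 8760, t=8760, mtbf=50_000)   # about 17.6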

But I'm not sure that a model is really suitable. Ultimately, all models for reliability, including spares provisioning, require a failure probability/failure distribution as input. You don't really have anywhere near enough failures to infer the failure rate accurately; this is why your confidence interval is so large. This is the fundamental problem with most reliability data analysis: further analysis requires the construction of a failure distribution, which is very hard to do with few or no failures. Also be aware that by using the Poisson distribution you have assumed a constant failure rate through life, which may not be appropriate for your system.

In practice I've found it more useful to think about spares in terms of lead time. A useful rule of thumb is min = 2 x lead-time demand, max = 1 to 3 months of demand. If the item is highly reliable and the lead time is short, perhaps a stock of zero is more appropriate, with insurance spares held and justified by the criticality/consequence of failure. Hastings can provide some more practical advice on inventory and provisioning.
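
Taken literally, that rule of thumb is just the following (hypothetical inputs, for illustration only):

def stock_levels(demand_per_month, lead_time_months, max_months=3):
    # min stock = 2 x lead-time demand; max stock = 1 to 3 months of demand
    min_stock = 2 * demand_per_month * lead_time_months
    max_stock = max_months * demand_per_month
    return min_stock, max_stock

# stock_levels(demand_per_month=4, lead_time_months=1)   # -> (8, 12)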

Another note: Crow-AMSAA was mentioned above. It isn't the right tool for this job; it is used to assess improvement in a reliability growth program, or to model repairable systems, which have their own special set of assumptions that make normal statistical analysis inapplicable.

Hope this helped a bit.
 
#Mike: >> alternative is industrial motherboards

We buy enterprise everything, which is why we spend $4K for graphics cards instead of $500.

#ramseng >> Where H(t) is the number of failures expected in the interval (0,t)

Without having read the book, knowing the expected number of failures is equivalent to knowing the MTBF if failures are random, no?

>> You don't really have anywhere near enough failures to infer the failure rate accurately; this is why your confidence interval is so large.

Grok.

>> A useful rule of thumb is min=2xlead time demand

These are spare quantities we recommend for purchase at end of life. Lead time after that is the same as Mike's Camaro part.

Thank you all again.
 
I think I've resolved this in a way that makes sense.

I made a Monte Carlo model in Excel. In each of 10,000 rows, I calculate a random (log-normal distributed) MTBF between 5,000 and 500,000 hours, and then a bunch of random (exponentially-distributed) failures for that MTBF. Each failure time is added to the previous, so failure times ascend left to right.

I count the rows that have the same observations as my LRU data (e.g., number of failures in number of operating hours), and for those rows, count how many additional failures occur to the end of the lifecycle. I like the model because it requires no more data than what I have (hours and failures by LRU), and it tells me how many spares I need at some confidence.

In the example I'm looking at now, one LRU has four failures in 207,000 hours; in one run, 579 lines of the table have that same statistic. In the additional 281,000 hours to end of life, 13 to 15 spares cover 90% of those 579 lines, versus my earlier calculation of 17.

In a more compelling example, an LRU has one failure in 34,000 hours; 2023 lines of the table match. In the additional 90,000 hours to end of life, 8 spares covers 90%, versus my earlier calculation of 18(!).
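
For anyone who wants to try the idea outside the workbook, here is a rough sketch of the approach as described, interpreting the bounded "log-normal" draw as uniform in log space; the function name and defaults are illustrative:

import numpy as np

rng = np.random.default_rng(0)

def spares_at_confidence(hours_so_far, failures_so_far, hours_to_go,
                         confidence=0.90, n_trials=10_000,
                         mtbf_lo=5_000.0, mtbf_hi=500_000.0):
    # Simulate failure histories, keep those that match the observed record,
    # and return the spares count covering `confidence` of the kept histories.
    end_of_life = hours_so_far + hours_to_go
    extra_failures = []
    # One MTBF per trial, drawn uniformly in log space between the bounds.
    for mtbf in np.exp(rng.uniform(np.log(mtbf_lo), np.log(mtbf_hi), n_trials)):
        t, before, after = 0.0, 0, 0
        while True:
            t += rng.exponential(mtbf)        # next failure time
            if t > end_of_life:
                break
            if t <= hours_so_far:
                before += 1
            else:
                after += 1
        if before == failures_so_far:         # same observations as the real LRU
            extra_failures.append(after)
    return int(np.ceil(np.percentile(extra_failures, 100 * confidence)))

# e.g. the 34,000 h / 1 failure LRU with 90,000 h to go:
# spares_at_confidence(34_000, 1, 90_000)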

I'm happy to share the workbook if anyone is interested.

Thank you again.
 
I've not given it a thorough examination, but since RAND is a random number between 0 and 1, your usage guarantees that the calculated MTBF will always be less than the specified MTBF, since the equation does not allow for any calculated instance of the time between failures to be greater than the specified MTBF.

TTFN (ta ta for now)
I can do absolutely anything. I'm an expert!
 
I assume you're talking about the -LN(RAND()) * MTBF formula. It looks that way at a glance, but that's not the way it works:

As rand() approaches 1, failure time approaches 0.

As rand() approaches 0, failure time increases without limit.
 
Not intuitive, perhaps. The probability of having 0 events (failures) when the expected number is 2.3 is

=POISSON.DIST(0, 2.3, FALSE)

... which returns 10%.

More generally, it's easy enough to show that if mean = -ln(r), where r is between 0 and 1, then Poisson(0, mean) = r.
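
A two-line check of that identity (Python here for compactness; POISSON.DIST(0, mean, FALSE) in Excel is the same quantity):

import math
for r in (0.05, 0.10, 0.50, 0.95):
    mean = -math.log(r)
    print(r, math.exp(-mean))   # Poisson P(X = 0) with that mean; equals r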
 