Eng-Tips is the largest engineering community on the Internet

Is a given value part of a distribution?

Status
Not open for further replies.

smurali1

Automotive
Apr 21, 2003
40
Hello,

I have approximately 50 data points, which are vibration measurements taken a month ago. Now I have two data points measured recently. How do I statistically prove whether the 2 recent values are also part of the original distribution?

Thanks

SPM

Note: Since these are vibration measurements, the spec is unilateral (one-sided).
 

What you're asking doesn't make sense. They're either within the statistical bounds of the original distribution or they're not. Whether that means they're "part" of the original distribution cannot be answered, since conditions may have changed, and the parts may have changed.

Only when you get another 50 points of data can you realistically even begin to do some comparisons.

TTFN

FAQ731-376
 
I think you are heading into the fraught realm of "outlier detection". If you are serious, look for work by:
» Frank E. Grubbs, for example "Sample Criteria for Testing Outlying Observations", Annals of Math. Stat., vol 21, 1950, pp 27-58.
» W. J. Dixon. I have a photocopy of a paper by him titled "Analysis of Extreme Values", journal unknown, date unknown. He was at Univ of Oregon at the time. (This is not on "extreme values" in the sense that we now tend to use the term.)
» One of the above papers, Grubbs's I think, makes reference to work by a W. R. Thompson.

Most authors in this area caution that you should have a priori reason to suspect an observation to be an outlier before you subject it to outlier testing.
 
PS. The classic text by Sokal and Rohlf contains info on the Dixon test.
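
For anyone who wants to see what a Grubbs-type check looks like in practice, here is a minimal sketch in Python. It only computes the usual statistic G = max|x - mean| / s and the standard two-sided critical value, and it assumes roughly normal data. The function names and the readings are made up for illustration, and the Student-t quantile has to be supplied by you from a table, since the standard library does not provide one.

```python
import math
import statistics

def grubbs_statistic(values):
    """Return (G, index): the largest |x - mean| / s and where it occurs."""
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    index = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[index] - mean) / s, index

def grubbs_critical(n, t_crit):
    """Two-sided critical value for G.

    t_crit is the Student-t quantile at 1 - alpha/(2n) with n - 2 degrees
    of freedom, looked up by the caller (assumption: taken from a table).
    """
    return (n - 1) / math.sqrt(n) * math.sqrt(t_crit**2 / (n - 2 + t_crit**2))

# Hypothetical readings, invented for illustration only.
readings = [4.1, 5.0, 5.2, 4.8, 5.1, 4.9, 5.3, 4.7, 5.0, 9.6]
g, idx = grubbs_statistic(readings)
print(f"G = {g:.3f} for reading #{idx} ({readings[idx]})")
```

You would call the suspect reading an outlier only if G exceeds grubbs_critical(n, t_crit) for your chosen alpha, and only if you had a prior reason to suspect that reading.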
 
A random number is always part of any distribution... Depending on the distribution, the value may be statistically very improbable, but its probability is never identically zero (at least for a distribution with unbounded tails, such as the normal).
 
You can test the hypothesis that your second distribution has the same mean and standard deviation as the first. Your confidence limits will be enormous of course.

A distribution with 2 samples does have a mean and standard deviation, after all. You just don't know what they are!

For instance, if your 50-sample test had a mean of 5 and a standard deviation of 1, and your 2 extra samples had values of 2000 and 2001, without doing the maths you'd be pretty confident they were different.
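
If you do want to do the maths for that example, a rough sketch in Python follows. It treats the 50-point standard deviation as if it were the true sigma (an approximation, so this is a z-score rather than a proper t-test), and the function name is just something I made up.

```python
import math
import statistics

def z_for_new_mean(base_mean, base_sd, n_base, new_values):
    """Standard errors between the mean of the new values and the baseline mean.

    Assumption: base_sd is treated as the known sigma for both sets.
    """
    n_new = len(new_values)
    standard_error = base_sd * math.sqrt(1 / n_new + 1 / n_base)
    return (statistics.fmean(new_values) - base_mean) / standard_error

# Numbers from the example above: mean 5, sd 1, 50 points, then 2000 and 2001.
z = z_for_new_mean(5.0, 1.0, 50, [2000, 2001])
p_two_sided = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
print(f"z = {z:.1f}, two-sided p = {p_two_sided:.3g}")
```

With only 2 new points the standard error is dominated by the 1/2 term, which is why anything short of a gross difference like this one is hard to detect.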



Cheers

Greg Locock

SIG:please see FAQ731-376 for tips on how to make the best use of Eng-Tips.
 
You can construct a prediction interval (slightly wider than a confidence interval) at a given confidence level, say 99%, from the original 50 points. If your new values are within this interval, they are consistent with the original distribution: 99% of new points drawn from it would land inside the interval. If they are not within your prediction interval, either the distribution has changed (such as a calibration change) or you managed to hit the 1% of new data points that are outside your prediction interval but are from the original distribution. You should gather more data, though, and do a proper t- or F-test.

I should've qualified "If your new values are within this interval, they are consistent with the original distribution" by saying: assuming your process and measurements haven't changed. But then you can't really prove anything...
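
A minimal sketch of that prediction interval in Python, assuming roughly normal data. The baseline below is simulated and the two new values are invented. By default it uses a normal quantile as a stand-in for the Student-t quantile with n - 1 degrees of freedom, which is slightly larger (about 2.68 instead of 2.58 for 49 degrees of freedom at 99%), so pass your own value via crit if you have a table to hand.

```python
import math
import random
import statistics

def prediction_interval(baseline, confidence=0.99, crit=None):
    """Two-sided interval expected to contain one future reading."""
    n = len(baseline)
    mean = statistics.fmean(baseline)
    s = statistics.stdev(baseline)  # sample standard deviation
    if crit is None:
        # Assumption: normal quantile used in place of the Student-t
        # quantile with n - 1 degrees of freedom.
        crit = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = crit * s * math.sqrt(1 + 1 / n)
    return mean - half_width, mean + half_width

# Hypothetical baseline: 50 simulated readings, not real vibration data.
rng = random.Random(0)
baseline = [rng.gauss(5.0, 1.0) for _ in range(50)]
low, high = prediction_interval(baseline)

new_values = [5.4, 12.0]  # invented new readings
inside = [low <= x <= high for x in new_values]
print(f"99% prediction interval: ({low:.2f}, {high:.2f}); inside: {inside}")
```

Since the spec here is one-sided, an upper-only limit would use the quantile at the confidence level itself instead of splitting the 1% between two tails.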
 
Bribyk's suggestion was pretty good. Thanks for that. Let me try this.

My colleague calculated mean + 4 sigma for the first set of 50 values and took that as the UCL (upper control limit). When you compare the two new values with the UCL, they are higher than the UCL. So he claims these two new values are not from the same distribution, meaning they are outliers.

But I am not fully convinced by this approach. I don't think it is a proper statistical method. Also, when the distribution is skewed, I think my colleague's method will fail. But I did not have an alternative suggestion. Now that I have Bribyk's way, let me try it.
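
For reference, the colleague's rule can be sketched in a few lines of Python. The data below are invented and the function name is hypothetical. The last line shows what the rule implies if the data really are normal, which is exactly the assumption that a skewed distribution would break.

```python
import statistics

def exceeds_ucl(baseline, new_values, k=4.0):
    """Return (ucl, flags), where ucl = mean + k * sd of the baseline."""
    ucl = statistics.fmean(baseline) + k * statistics.stdev(baseline)
    return ucl, [x > ucl for x in new_values]

# Hypothetical numbers, for illustration only.
baseline = [4.0, 5.0, 6.0, 5.0, 4.5, 5.5]
ucl, flags = exceeds_ucl(baseline, [7.0, 9.0])
print(f"UCL = {ucl:.3f}, above UCL: {flags}")

# Chance that a single reading lands above mean + 4 sigma, IF the data are normal.
tail_if_normal = 1 - statistics.NormalDist().cdf(4.0)
print(f"P(x > mean + 4 sigma | normal) = {tail_if_normal:.2e}")
```

For a right-skewed distribution the true tail probability beyond mean + 4 sigma can be much larger than this normal figure, so the rule would flag legitimate readings more often than the number suggests.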

 
Your 99% prediction interval is going to be about +/- 3 standard deviations (nearer +/- 2.7 with 50 points). Since your new values are above mean + 4 sigma, it doesn't look likely that they are from the original distribution. The prediction interval assumes a normal distribution, so you'll want to check the skewness and kurtosis ( , is the best reference I've found) to see if your 50-point distribution "may" be normal.
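
A quick way to get those two numbers in Python is sketched below. It uses the simple moment estimators (dividing by n, so slightly biased for small samples), and the function name is my own. Values near 0 for both are what you would expect from normal data; this is a rough screen, not a formal normality test.

```python
import statistics

def skewness_and_excess_kurtosis(values):
    """Moment-based sample skewness and excess kurtosis (both about 0 if normal)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical data sets, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 1, 10]
print(skewness_and_excess_kurtosis(symmetric))
print(skewness_and_excess_kurtosis(right_skewed))
```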
 