Frequency resolution and amplitude in FFT 2


GMarsh (Mechanical)
Hi,

I am looking at improving the frequency resolution in FFT through zero padding. Attached is an Excel sheet showing an attempt to study the frequency resolution with zeros padded onto the data. Any comments?

Another basic (and possibly silly) question: what does the amplitude of an FFT spectrum indicate? I have asked many people, read many books, etc., but the doubt remains. The books just say it is "amplitude"; they won't give units. For example, if I have acceleration (m/s2) vs time (sec) data and I take the FFT, the units of the amplitude are (m/s2), right? Yet a time-domain acceleration showing a maximum of around 50 m/s2 gives an FFT with amplitudes of order 1. I am puzzled by this. What is the physical meaning? I tried summing the magnitudes over all the frequencies, which obviously goes well beyond 50 m/s2. So what is the relation of the time-domain amplitude to the FFT amplitude?

Attaching a picture of this data and its FFT in another worksheet of attached excel workbook.

Many thanks
Geoff
 

My two cents fwiw:
Attached is an Excel sheet showing an attempt to study the frequency resolution with zeros padded onto the data. Any comments?
I would argue that your peak frequency is converging to the best estimate of the frequency possible from the set of time data you started with. If the purpose is to get the most accurate estimate of frequency, using a longer actual time record is preferable to zero padding, as Hacksaw mentioned.

Zero padding (pre-FFT) is roughly interchangeable with the variety of methods that can be applied post-FFT to estimate the peak frequency by interpolation.

There are magnitude corrections that can be applied to undo any magnitude changes associated with zero padding. They have to be coordinated with the choice of windowing used. The correction factor would be different if you put the zeros on one end than if you split them between the two ends. I don't have a formula handy, but it should be shown in most DSP textbooks. I think the generally preferred location for the zeros would be split between both ends of a windowed signal.

Another basic (and possibly silly) question: what does the amplitude of an FFT spectrum indicate? I have asked many people, read many books, etc., but the doubt remains. The books just say it is "amplitude"; they won't give units. For example, if I have acceleration (m/s2) vs time (sec) data and I take the FFT, the units of the amplitude are (m/s2), right? Yet a time-domain acceleration showing a maximum of around 50 m/s2 gives an FFT with amplitudes of order 1. I am puzzled by this. What is the physical meaning? I tried summing the magnitudes over all the frequencies, which obviously goes well beyond 50 m/s2. So what is the relation of the time-domain amplitude to the FFT amplitude?
I view the FFT magnitude as the square root of the energy per bandwidth.
To interpret it, we add up the energy over the frequency band of interest (the equivalence of energy in the frequency and time domains is given by Parseval's theorem). For example, if you see several non-zero bins in a clump that you suspect are associated with a single sinusoidal peak, then the magnitude of that peak would be the square root of the sum of the squares of the individual FFT bin magnitudes. It transforms to a single sinusoid in the time domain with that magnitude (under the assumption that the bins are all associated with a single sinusoid).

The small noise floor between peaks tends to complicate this view, but less so when you remember that an SRSS tends to be dominated by the bins with large magnitudes; the contribution of the smaller bins is doubly small (they are squared).
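A minimal numpy sketch of that SRSS idea (the 2000 Hz sampling rate, 77 Hz tone, and 50 m/s2 amplitude are made-up values for illustration, and the 2/N one-sided scaling is just one common convention):

```python
import numpy as np

# Hypothetical example: a 50 m/s^2 sinusoid at 77 Hz (deliberately not
# bin-centered), sampled at 2000 Hz for 512 points. All values made up.
fs, n = 2000.0, 512
t = np.arange(n) / fs
a_true = 50.0                               # m/s^2
x = a_true * np.sin(2 * np.pi * 77.0 * t)

# One-sided amplitude spectrum: scaling |FFT| by 2/N makes a bin-centered
# sinusoid of amplitude A appear as a single bin of height A (same units).
mag = 2.0 / n * np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(n, d=1.0 / fs)

# Leakage smears the tone across a clump of bins. Per Parseval, the SRSS
# over the clump recovers the time-domain amplitude, in m/s^2.
clump = (freqs > 60) & (freqs < 95)
a_est = np.sqrt(np.sum(mag[clump] ** 2))
print(a_est)                                # within a few percent of 50
```

So the FFT amplitude carries the same units as the time signal; the SRSS over a leakage clump is what ties it back to the time-domain peak.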


=====================================
(2B)+(2B)' ?
 
Correction: any windowing is applied before zero padding. Where the zeros are added (beginning, end, both) shouldn't make a big difference to the result and shouldn't affect the correction factor.
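For what it's worth, one way to make the amplitude bookkeeping padding-proof is to normalize by the sum of the window samples rather than by the padded length. A sketch with made-up signal values (my own illustration of the idea, not a formula from a particular text):

```python
import numpy as np

# Made-up example: 50-unit sinusoid at 77 Hz, 512 samples at 2000 Hz,
# zero padded out to 4096 points.
fs, n, nfft = 2000.0, 512, 4096
t = np.arange(n) / fs
x = 50.0 * np.sin(2 * np.pi * 77.0 * t)
w = np.hanning(n)
xw = x * w              # window first, then zero pad (per the correction above)

def peak_amp(sig, length):
    # Normalize by the window's coherent gain sum(w), not by the FFT length.
    # sum(w) is unchanged by padding, so no padding-dependent correction
    # is needed.
    mag = 2.0 * np.abs(np.fft.rfft(sig, length)) / np.sum(w)
    return mag.max()

print(peak_amp(xw, n))     # a bit under 50 (scalloping: 77 Hz is off center)
print(peak_amp(xw, nfft))  # ~50: padding samples the peak more finely
```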

=====================================
(2B)+(2B)' ?
 
electricpete,

Thank you for detailed reply. Very useful.

Zero padding -

Yes, you are right. With padding applied after windowing, it doesn't matter which way we add zeros - left, right or both sides. Also, the peak frequency remains the same whether we pad before windowing or after. However, the peak amplitude changes. I am attaching a graph here which shows the variation of FFT peak amplitude with padding, applied both before and after windowing. Though padding after windowing reduces the amplitude more, it seems to follow a smooth trend (to me it looks exponential!) which can be easily corrected, as you said, using a correction factor.

Amplitude of FFT -

First things first - now I understand at least how the units remain the same. If my time-domain signal has units of m/s2, I can confidently put the same units on the FFT amplitude, since it is equal to a square root of a sum of squares.

The concept of energy per bandwidth is more appealing. I will try to validate this SRSS concept on a known sinusoidal signal and will report the result here. But I also see this concept is very difficult or impossible to validate on a real measured signal, which can only be represented as a summation of many sines and cosines.

Many thanks
Geoff
 
 http://files.engineering.com/getfile.aspx?folder=bfd5b2fe-a962-413a-bf51-8dcea279b872&file=PadB4AftrWindow.png
I'm sorry, you really are playing with fire.

consider the simple signal 1,1,1,1

now zero pad it to give twice the frequency resolution


1,1,1,1,0,0,0,0

If you are trying to tell me that the FFT of a DC signal is identical to that of a step function, then something is very wrong. And a quick check says it ain't. Even if you use two leading zeroes and two after, it still ain't.
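That quick check takes two lines in numpy, for instance:

```python
import numpy as np

x  = np.array([1.0, 1, 1, 1])
xp = np.array([1.0, 1, 1, 1, 0, 0, 0, 0])   # zero padded to twice the length

print(np.abs(np.fft.fft(x)))    # [4. 0. 0. 0.]              <- pure DC
print(np.abs(np.fft.fft(xp)))   # [4.  2.61  0.  1.08  0.  1.08  0.  2.61]
# The even-indexed bins of the padded FFT still read 4, 0, 0, 0; the new
# odd-indexed bins carry the sinc ripple of the implied step.
```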

So not very surprisingly we discover that there is no free lunch. Yes, you will increase your apparent frequency resolution but in doing so you will be altering the displayed spectrum in ways that you may not be able to predict.



Cheers

Greg Locock


New here? Try reading these, they might help FAQ731-376
 
As usual, mo' data is required to get mo' info'.

TTFN
faq731-376
7ofakss
 
These comments - "no such thing as free lunch" and "more data is required to get more info" - you will have to explain the specific context to which you refer. At first glance they sound like disagreement with the statements I made.
I stand by all the statements made here:
thread384-325307
and here
thread384-208992
(particularly as summarized 9 Feb 08 19:02)
All my links disappeared when I left Comcast. Attached is the spreadsheet that I posted 27 Feb 08 22:12.

If there is a point you disagree with, quote it and let's discuss it.

Again, the issue is not a free lunch, but how much of your pb&j sandwich you throw in the trash.
If you estimate your frequency simply by choosing the bin center of the highest-magnitude FFT point, you are throwing away information.
The amount of information thrown away in arriving at an estimate (for a fixed time record) can be reduced by the three techniques mentioned in the post of 9 Feb 08 19:02.

Regarding the step-change example, two thoughts come to mind:
1 – I'll bet that if you worked out the example fully, there is a match at the frequency points present in the original FFT (positive at zero frequency, and the zeros of the sinc function at all other frequencies of the original domain... the additional in-between frequencies would have non-zero content). It should match the original results at the original frequencies; it just adds new frequency points that don't match what you'd expect.

2 – If you had used a window, you wouldn't have had the sharp change in the time domain or the extreme behavior, and the result would be closer to what you expect. My 2008 spreadsheet has an example of zero-padding applied with a window (although I did not adjust the magnitudes; my interest was only frequency determination). It is technically possible to smack your kneecap with a hammer or insert a screwdriver into your eye-socket, but one shouldn't conclude from this that a hammer and a screwdriver are bad tools ;-)


=====================================
(2B)+(2B)' ?
 
Shrug. Well, you've got my answer already: zero padding adds artefacts to the spectrum. It can't add information; that is absolutely set by 1/T.



Cheers

Greg Locock


New here? Try reading these, they might help FAQ731-376
 
On thinking about it, your spreadsheet makes an assumption that is usually reasonable, but is not rigorous.

You assume that there is only one contributor in each 1/T band. Fourier does not.

If one were to restrict oneself to rotating machinery then this is usually fair enough. But then you'd be better off using synchronous sampling in the first place. An example where it would be horribly misleading is where a 3rd-order firing engine is coupled to a Hooke's joint via a 1.47 overdrive gear ratio. BTDT, got the T-shirt.



Cheers

Greg Locock


New here? Try reading these, they might help FAQ731-376
 
... hence the need for speed sweeps if you want to deduce what's going on!

- Steve
 

On thinking about it, your spreadsheet makes an assumption that is usually reasonable, but is not rigorous.

You assume that there is only one contributor in each 1/T band. Fourier does not.
Yes, it is an assumption that I made and stated. I agree it may or may not hold depending on your situation, and like any assumption it needs to be considered. This assumption was clearly stated at the very beginning of my summary posted 9 Feb 08 19:02, which is the specific summary post that I referenced by date/time on 10 Jul 12 22:50.

electricpete 9 Feb 08 19:02 said:
before we leave this topic, here is my attempt at a summary: (open to any comments or suggestions)

The ability to resolve the frequency from the output of FFT is seemingly almost limitless if we have only a single-frequency input. The time windowing does have the effect of shifting the energy to the left and right in frequency, but that shifting is symmetric and does not change the center of the energy from the sinusoid.

When we have multiple frequencies (which includes noise), the time-windowed output of those frequencies may overlap. In this case our accuracy in estimating peak frequencies suffers, and we may be unable to distinguish closely spaced frequencies.

My comments apply to estimating the frequency of a single peak to the highest possible accuracy from existing data. They do not apply to the unsolvable (without more data) task of separating peaks which fall within a single bin of an FFT of the available time record (frequencies separated by less than 1/T, where T is the sample duration), and I have highlighted that my comments about resolution do not apply to the case where the multiple frequencies in the input overlap.


If one were to restrict oneself to rotating machinery then this is usually fair enough.
For the rotating machinery I work with, my interest is not limited to synchronous frequencies. There are bearing defects, twice-line-frequency vibrations of motors, whirls, and other phenomena that make us interested in all frequencies. If I see a peak in my collected data at roughly 3 times running speed, I want to know whether it is 3.000 times running speed or something like 3.014 times running speed, such as a typical outer-race frequency for an 8-ball deep-groove bearing.

Does that mean these tools are useless? Nope. They give us the best estimate of a given peak from the given data. They don't do the impossible (separate the inseparable), but that does not make them useless. For starters, maybe there is just 3.014 and no adjacent 3.000; higher resolution helps see it. So what about the case where 3.000 and 3.014 are in the same bin, or so close that they can't be separated? The combined frequency, when I try to label it, will still differ from 3.000 (maybe 3.007... halfway between). If I label all my harmonic peaks on a log scale and see 1.000, 2.001, 3.007, 4.002, then I have a suspicion about that 3.007. The fact that I have higher resolution on the other peaks (1, 2, 4) makes it easier to see the outlier near 3.

The point is: higher resolution always helps separate peaks and it's always valuable to label your peaks with the highest resolution available for this reason. The higher the resolution, the better we can distinguish. I'd love to have 0.001 Hz resolution on every machine, but then the data collection time would balloon. So I take what I get, make the best of it (using the valuable tools I have highlighted), and don't throw any of it away.


zero padding adds artefacts to the spectrum. It can't add information; that is absolutely set by 1/T
I think you're implying that zero padding and the other tools mentioned are useless, and again I will strenuously disagree.

First, I will mention that I consider all three tools (frequency interpolation, reconstruction, zero-padding) to be of the same nature. There is one exactly precise estimate of the peak (subject to the assumptions discussed), and all three will help us get toward that one precise number to various degrees, depending on how much computational effort we want to put in.

These are well-accepted tools, nothing I invented (ok, I came up with the particular quadratic interpolation myself, but I'll bet someone else has done it too... Entek/E-monitor must use some tool in the frequency-interpolation category, because they discard the time waveform and phase, store only magnitudes, and come up with the estimate from that).

To brush them off in this manner would be quite unwise imo. Artifacts are not created by zero-padding; they are created by windowing, which is present whether you zero-pad or not. Yes, you can come up with one example - pure dc - where there is no windowing effect when performing the FFT of the original signal and the windowing becomes important only when you zero-pad, but it is not an important or representative example. It is not important because those of us in vibration don't generally use an FFT to determine a dc component. It is not representative because any other signal will have a window effect.

If we stick with a rectangular window for simplicity, the length of the time window will be the same regardless of whether we zero pad or not. So when we multiply our original time waveform by a broad rectangular window, we convolve the resulting frequency spectrum with the sinc function (the Fourier transform of the rectangular window). The width of the sinc function in frequency varies inversely with the width of the rectangular window in time. Since the width of the multiplying rectangular window is the same in both cases, we convolve with the same sinc function and get the same spreading. The only difference is that the zero-padded signal provides a higher-resolution sampling in frequency of the resulting continuous function.

So the sinc function with which the original spectrum gets convolved will be the same. The only difference is how finely in frequency we sample the output: the zero-padded FFT provides finer frequency sampling of that result (the FFT points are frequency samples of the continuous DTFT... same DTFT either way, finer frequency samples with zero padding).
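A small numpy sketch of that claim, using an arbitrary made-up record: both the raw FFT bins and the zero-padded bins fall on the same DTFT, evaluated here straight from its definition.

```python
import numpy as np

n = 64
x = np.sin(2 * np.pi * 0.19 * np.arange(n))     # arbitrary made-up record

def dtft(sig, f):
    # DTFT of the finite record at normalized frequency f (cycles/sample),
    # evaluated straight from the definition.
    k = np.arange(len(sig))
    return np.sum(sig * np.exp(-2j * np.pi * f * k))

X1 = np.fft.fft(x)          # raw FFT: DTFT sampled every 1/n
X2 = np.fft.fft(x, 4 * n)   # zero padded 4x: same DTFT sampled every 1/(4n)

m = 5
print(X1[m], dtft(x, m / n))                           # unpadded bin
print(X2[4 * m + 1], dtft(x, (4 * m + 1) / (4 * n)))   # an in-between sample
```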

In summary, the zero padding does nothing to reduce the leakage. We have broadening of the peak created by that leakage, which is not solved or reduced by these techniques (I have said as much in my previously linked thread). Combined with the possible presence of interfering noise, this widening creates problems in estimating the frequency, which we might describe as an uncertainty band. The techniques do nothing to reduce the width of that uncertainty band, but they DO help us center the uncertainty band on the exact frequency where it should be centered.


=====================================
(2B)+(2B)' ?
 
Correction on a poor choice of words:
"The point is: higher resolution always helps separates peaks and it’s always valuable to label your peaks with the highest resolution available for this reason."
should've been:
"The point is: higher resolution always helps identify peaks and it’s always valuable to label your peaks with the highest resolution available for this reason."

=====================================
(2B)+(2B)' ?
 
No, I'm not saying zero padding is useless; I am saying it is dangerous (in an analytical sense). It gives a sense of added resolution which may or may not be real. If you don't believe me, google "zero padding": at least the first two articles I looked at, from Stanford and NI, emphasise that it is not a simple process. As a wise person once said to me, "if you know that much about the signal, then go for it".

I agree, exploiting the shape of the shoulders of the bins is an effective way of helping to increase the apparent resolution for the frequency of a peak, but it relies on assumptions; again, not for the faint-hearted in the general case.

I'm not too sure why we got side tracked into this curve fitting discussion, I thought the topic was zero padding.





Cheers

Greg Locock


New here? Try reading these, they might help FAQ731-376
 
I'm not too sure why we got side tracked into this curve fitting discussion, I thought the topic was zero padding.
I mentioned 3 techniques; they are all very closely related, and I'd like to review how they tie together. I'll quote myself just to save retyping:

Once we have fixed the number of time samples, there are three things discussed above that can increase the resolution in estimating a peak.

1 – Quadratic interpolation. The least computationally intensive and biggest bang for the computational buck. Also well suited to improve the output of one of the other methods.

2 – Zero-padding. More computationally intensive. Gives a higher resolution for all peaks on the whole spectrum. (Note).

3 – Reconstruction of the DTFT. The most computationally intensive (in my implementation, anyway... I don't think this can be broken down into a simple convolution which could be processed by FFT multiplication). Gives unlimited ability to improve the resolution (Note).
Let's start with #3 (reconstruction) because it gives a very good framework to view the previous two.

Reconstruction of the DTFT provides a mapping from the discretely-spaced FFT results to a continuous DTFT function. That mapping can be found, for example, in equations 6.17 and 6.18 here:

It involves a sum of weighted, frequency-shifted continuous sinc functions. Each FFT point factors into one term. The complex value of the FFT is the weighting, the frequency of the FFT point figures into the frequency shift.
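The linked equations did not survive the move, but the interpolation they describe can be sketched as follows (my own rendering of the standard Dirichlet-kernel result, so verify the phase factor against your reference before relying on it):

```python
import numpy as np

def dtft_from_fft(X, f):
    # Rebuild the continuous DTFT at normalized frequency f in [0, 1)
    # (cycles/sample) from the N FFT points via the periodic-sinc
    # (Dirichlet) kernel.
    n = len(X)
    theta = 2 * np.pi * (np.arange(n) / n - f)
    with np.errstate(divide="ignore", invalid="ignore"):
        # sin(n*theta/2)/sin(theta/2) -> n in the limit theta -> 0
        ratio = np.where(np.isclose(np.sin(theta / 2), 0.0),
                         float(n),
                         np.sin(n * theta / 2) / np.sin(theta / 2))
    kernel = ratio * np.exp(1j * theta * (n - 1) / 2)
    return np.sum(X * kernel) / n

# Sanity check against the definition of the DTFT:
x = np.random.default_rng(1).standard_normal(32)
f = 0.137                                   # arbitrary, between bins
direct = np.sum(x * np.exp(-2j * np.pi * f * np.arange(len(x))))
print(direct, dtft_from_fft(np.fft.fft(x), f))   # the two should agree
```

Feeding a fine frequency grid (or an optimizer) into dtft_from_fft and taking the magnitude peak gives the "reconstruction" estimate discussed here.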

The peak of the continuous-frequency DTFT is our best available estimate of the frequency of interest (assuming there are not two peaks in one bin, in which case there is no technique that will separate them... I don't consider the inability to achieve the impossible a disadvantage, and, as I showed above with 1.000, 2.001, 3.007, 4.002, even where one of the peaks in the pattern is merged with another peak, we can usually recognize the pattern and the outliers better if we have the best available estimate of all the frequencies in our spectrum).

So the reconstruction provides this continuous function, and its peak is the most precise estimate we can possibly come up with. That is, in my view, the true best answer with the available data, but it is computationally intensive. So we can try the other two approaches to get part way to the same answer. How does that work?

First, start with zero padding. As we know, the FFT is simply samples of the DTFT at discrete intervals corresponding to the bin width. As we showed in the previous discussion, the underlying DTFT is the same regardless of whether we zero pad or not. All we do with zero padding is get finer samples of that underlying DTFT function. With finer samples we will likely get closer to finding the true peak.

But zero padding is also somewhat computationally intensive. And it has to be done before we do the FFT, so it is not really suitable when we're poking around an existing FFT result. So that leads to the other approach: frequency interpolation (curve fitting, as you called it).

In contrast to the other two techniques, I don't know of any rigorous proof of the frequency-interpolation techniques. I have my own intuitive, fuzzy proof. It is that the sinc functions from which we build the DTFT are relatively smooth. Looking at the zero crossings of the sinc to estimate its frequency content, the highest frequency possible occurs away from the center lobe, where we see a maximum rate of 180 degrees per bin width. (By the way, if we were looking at samples of a time function rather than samples of a frequency function, that would be the Nyquist frequency... the idea that we can reconstruct the entire continuous DTFT from the discrete FFT points is analogous to the idea that we can reconstruct a time waveform from its samples if and only if the relation between the sampling frequency and the original waveform meets the Nyquist limit.) The fact that we build this final DTFT out of band-limited sinusoids imposes a certain kind of smoothness on the resulting complex function and its real magnitude. It cannot vary too erratically between the known points; it is limited to varying at the rate of its components. This characteristic, I believe, is what helps the frequency interpolation work.

The quadratic interpolation approach is to select the highest bin magnitude and one bin on each side. With three data points we can solve for the A, B, C constants in a quadratic form: Y = A + B*f + C*f^2. Obviously dY/df = B + 2*C*f, so the maximum occurs where dY/df = 0 (f = -B/[2*C]). You can work the algebra out ahead of time and end up with a fairly simple result (I can provide the algebra/results if anyone wants). That solution is built into the vba of my spreadsheet along with some other stuff... you can graphically explore some results on the first two tabs on the left of the workbook... fill out the green input blocks with three equally spaced frequencies and three magnitudes (the center one has to be the highest of the three).
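Worked out, that algebra gives the familiar three-point parabola vertex. A sketch in numpy terms (my own wording of the closed form, not necessarily identical to the spreadsheet's vba; the 77 Hz test signal echoes the study below):

```python
import numpy as np

def parabolic_peak(freqs, mag, i):
    # Fit Y = A + B*f + C*f^2 through bins i-1, i, i+1 (i the highest bin)
    # and return the vertex f = -B/(2C), via the equal-spacing closed form.
    y1, y2, y3 = mag[i - 1], mag[i], mag[i + 1]
    delta = 0.5 * (y1 - y3) / (y1 - 2 * y2 + y3)   # vertex offset, in bins
    return freqs[i] + delta * (freqs[1] - freqs[0])

# Made-up test echoing the study below: 512 points at 0.0005 s, 77 Hz input.
fs, n = 2000.0, 512
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 77.0 * t) * np.hanning(n)   # windowed, as recommended
mag = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
print(parabolic_peak(freqs, mag, int(np.argmax(mag))))   # close to 77 Hz
```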

Here is an empirical study of the results of a quadratic method (I'm not positive it's identical to the method I described):

I have done my own empirical study in the spreadsheet. I used a fixed sinusoid of varying frequencies from 74 to 79 Hz, sampled at an interval of 0.0005 sec, as input to a 512-point FFT. The bin width is 1/(512*0.0005) ~ 3.9 Hz. The bin centers in the neighborhood are 74.2, 78.1, and 82.0. Look at the first seven rows of the table in the summary tab, which use nothing other than quadratic interpolation:

Tab        Freq  Estimate  Error  BinWidth  Fraction of BinWidth
Trial74    74    74.05     0.05   3.90625    1.4%
Trial75    75    74.83     0.17   3.90625    4.4%
Trial76    76    75.91     0.09   3.90625    2.4%
Trial77    77    77.20     0.20   3.90625    5.2%
Trial79.1  79.1  78.91     0.19   3.90625    5.0%
Trial76NW  76    75.35     0.65   3.90625   16.6%
Trial77NW  77    77.86     0.86   3.90625   21.9%

The accuracy of the first 5 trials in determining the frequency is within about 5% of the bin width. The error of the last two is up to 22%, but we didn't use a window (those last two don't count in my book; we really should be using a window). It's a pretty darned good improvement in precision (beyond just picking the highest bin) and it's pretty cheap computationally. (If you want to go to the other extreme and number-crunch using the DTFT reconstruction, you can see the 77 Hz input sinusoid was estimated from the FFT output as 77.00001015.)

The Entek E-monitor database must use frequency interpolation (since they work from FFT magnitudes only, they cannot accomplish the other techniques). I don't know exactly which technique they use, but I have reason to suspect it's pretty close to the quadratic method. For the user it is as simple as putting your cursor on the peak and pressing p.

I have been using that program to look at machinery spectra for probably an average of 10 hours per month for the last 12 years. Knowing the frequency is important to my job, and I spend time studying the patterns to estimate how accurate they are (to how many decimal places my harmonics are exact multiples of the fundamental). I can say without hesitation that the Entek peak-label feature is substantially better than just picking the highest bin center (exactly as we expect from my empirical study). I can also state my opinion that anyone assigned to do my job who didn't use that tool, based on vague unfounded fears or objections as expressed in this thread, would be just plain misinformed. I have to say it that way to convey the conviction that I feel about this subject. I definitely am not saying it to reflect on you, Greg (I have learned a lot from you, Greg, and am still light-years away from knowing half the stuff you do about vibration).

How the three approaches tie together is that none of them accepts the false premise that the highest bin center is the best estimate, and all of them move us toward the same ideal best answer given the available info.


=====================================
(2B)+(2B)' ?
 
Adding zeros if you don't have enough data sounds a bit like a political discussion.

You can take the inverse transform of the result and quantify the mean square error relative to the original data easily enough.
 
Adding zeros if you don't have enough data sounds a bit like a political discussion.
The premise on which my discussion is based is that you have a fixed amount of data and you want to make the most of it. If you can gather a longer time record, that is better - there is no disagreement anywhere in sight. It's a good clarification. If that's your point then we all agree, but I'm not sure we all agree on the other aspect... which is that when we have a fixed amount of data, any of the three techniques mentioned will improve your estimate of frequency over simply picking the center of the highest bin (excluding the unsolvable case of two peaks within a bin width, in which case the technique may not help on that particular clump but is still helpful to more clearly identify the rest of the peaks and recognize the pattern, as in the example 1.000, 2.001, 3.007, 4.002).
You can take the inverse transform of the result and quantify the mean square error relative to the original data easily enough.
I'm not sure what you mean. The inverse transform will give back the original data; the mean squared error had better be zero. Note that I have compared the effects of these techniques against known input in my spreadsheet, which perhaps accomplishes what you were referring to... I'm not sure.
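A quick check of that round trip in numpy (my illustration, with an arbitrary made-up record):

```python
import numpy as np

xp = np.concatenate([np.random.default_rng(2).standard_normal(256),
                     np.zeros(256)])        # an arbitrary zero-padded record

# The inverse FFT returns the padded record exactly (to rounding error),
# so the mean squared error against what went in is essentially zero.
err = np.mean((np.fft.ifft(np.fft.fft(xp)).real - xp) ** 2)
print(err)                                  # ~1e-32
```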


=====================================
(2B)+(2B)' ?
 
The inverse transform will give back the original data.
That wasn't 100% accurate. To get back the original waveform would be a two-step process:
1 - inverse FFT. 2 - divide by the window (to undo the original window multiplication).
Perhaps you meant to do only step 1 and compare to the original data. Then you would get a difference associated with the window. It will tell you something about the window, but I'm not sure what it really tells us about the match. As far as validating the techniques I mentioned, comparing against known input as in my spreadsheet makes sense to me. I'm open to comment, since perhaps I have misunderstood where you were heading with this.


=====================================
(2B)+(2B)' ?
 
I forgot to mention the antialiasing filter in the original process, so again my comments are not 100% accurate (#1 and #2 would not recover the original signal). I still don't see what one hopes to gain through the mean-squared-error exercise.

=====================================
(2B)+(2B)' ?
 
It's a good discussion. I'm not trying to be nit-picky here, but I'd like to respond specifically to each point that was raised.
No, I'm not saying zero padding is useless; I am saying it is dangerous (in an analytical sense).
It gives a sense of added resolution which may or may not be real.
I'll agree that if you presented someone a spectrum and didn't tell them it was derived from a zero-padded TWF, that would be misleading. Since this thread is about zero padding (and related tools that accomplish exactly the same thing), I don't think there is a concern that anyone reading this thread would fall into the category of not knowing that the spectrum they zero padded was zero padded. It's worth discussing potential misuses of tools, but imo that's not the primary basis for judging those tools.

If you don't believe me, google "zero padding": at least the first two articles I looked at, from Stanford and NI, emphasise that it is not a simple process.

Here's the first one that pops up for me, the only one in sight from Stanford:
Please tell me specifically where in this article it is emphasized that zero padding is not a simple process.
As far as I can tell, this doesn't say anything resembling anything you have said in this entire thread. I would argue that if you read it carefully it will show why zero padding has nothing to do with a "free lunch" and everything to do with getting the most out of the lunch you have in front of you.

Here's the second one.
Title "Zero Padding does not buy Spectral Resolution"
This article is intended to show the limitations of zero padding, that's fine. Read it and study it if you want. There's nothing to contradict anything I've said.

The basis for that title statement is apparently that zero padding does not improve the ability to separate closely spaced peaks, as discussed in the middle of the article. That is completely true. But separating closely spaced peaks is not the only thing we do with FFTs, and not the only reason we need the best available accuracy on our peaks, as I've explained (1.000, 2.001, 3.007, 4.002).

Maybe there is a terminology aspect to this discussion. Perhaps this author's definition of frequency resolution is the ability to separate closely spaced peaks, and I would not fault him for that definition. I have used the term in a different way and explained what I meant.

As a wise person once said to me, "if you know that much about the signal, then go for it".
How much do you need to know about a machinery signal in order to decide you'd like the best estimate possible from your existing data? I don't study my data to decide whether to apply my peak-label tool (which is equivalent to zero padding in estimating the frequency closer to the ideal). For me the decision is an easy one (a.k.a. a "no-brainer").


=====================================
(2B)+(2B)' ?
 