
Curve fitting pump curves


Patrick02 (Mechanical, ZA), Jun 16, 2005

Dear all

Can anyone tell me which curve fit is the best fit to use when working with head versus flow rate data and head versus efficiency data on centrifugal slurry pumps with a low specific speed (100-700 or so)?

I am a Mechanical Engineer fresh out of university and am currently working for a slurry pump manufacturing company.

One of the projects I have been assigned is to help fix up the pump selection software that the company currently employs in producing sales quotes. In its current form, the software tends to misquote certain efficiency and %QBEP values when working with certain pump curves.

I believe that part of the problem lies with the way the software plots a curve through the pump input data it is given. Currently the software will accept 7 data points per speed setting (since these are slurry pumps, the impellers are not trimmed but rather belts are used to adjust the pump speeds). The data points inputted per speed setting consist of the following:

7 flow rate values
7 corresponding head values
7 corresponding efficiency values

Once these data points have been entered into the program it plots a 4th order polynomial through the data and generates curves of head versus flow rate and efficiency versus flow rate.

I have contested that plotting a 4th order polynomial through the data does not make the generated curves more accurate. I cannot recall seeing 4th order terms in any pumping-related equations in theory. I argue that a 2nd order curve is far more likely to represent the statistically most likely value of either head or efficiency that a customer would obtain for a given flow rate if they tested the pump, even though the actual data points entered into the program deviate from the curve more than they would for a 4th order curve.

I have therefore recommended a 2nd order curve fit for both the head versus flow rate data and the efficiency versus flow rate data, and increasing the number of data points per curve from 7 to 10 so as to get more data values throughout the range of flow rates for a particular curve.
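As a rough illustration of the comparison (a minimal sketch in Python with NumPy; the seven flow/head points below are invented purely for illustration, not real pump data):

import numpy as np

# Seven illustrative head-vs-flow points for one speed setting (made-up values).
q = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0])         # flow, arbitrary units
h = np.array([52.0, 51.0, 49.0, 46.0, 41.5, 35.5, 28.0])  # head, arbitrary units

for order in (2, 4):
    coeffs = np.polyfit(q, h, order)            # least-squares polynomial fit
    resid = np.polyval(coeffs, q) - h           # misfit at the data points
    print(f"order {order}: RMS residual = {np.sqrt(np.mean(resid**2)):.3f}")

# The 4th order fit will always sit closer to the seven input points; the real
# question is how each fit behaves between the points and near the ends of the
# flow range, which is worth plotting before deciding.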

I have been challenged in this regard by the software developer, who is a senior mechanical engineer, and who tells me a 4th order curve will describe the data perfectly. He holds the position that a 4th order curve will fit irregularities in the actual performance of the pump that my university theory cannot account for and has dismissed my arguments on the grounds of inexperience in the field.

I have searched far and wide in literature for some guidance, including curve fitting techniques, correlation coefficients, residual plots etc but have not found anything worthwhile yet. I would appreciate the advice of anyone who has had experience with this type of problem.

Kind Regards,

Patrick
 

The answer to which correlation fit is more accurate is that you could both be right. The physical curve will follow the laws of physics ("Newtonian"), so it would tend to be a 2nd order polynomial, since in the physical world it would only have terms up to flow squared. Mathematically, however, the higher the order of the fit, the more closely the curve will pass through the data points, so the fourth order polynomial will fit the data more accurately. Beware of extrapolating past the first or last point, though: a fourth order fit will often curl away and give you less accuracy.

However I believe you are looking in the wrong area. A fact of life is that the efficiency of a working pump measured in the field will almost never match the one calculated at the bid phase for any of the following reasons and more:

1) Manufacturing tolerances
2) Installation and piping losses
3) Instrumentation calibration
4) Fluid physical properties (especially for a slurry)
5) Location ambient conditions
6) Impeller or driver fouling or loss of efficiency

The goal of the bidding software should be to get as close as possible while stating realistic tolerances. Since you don't know any of these factors before a pump is installed in the field, some level of conservatism, based on experience, is generally needed.
 
I have done this with a slightly different approach. We all know that 2nd order should work, but there are anomalies that keep this from being close enough.
First, select a pump.
When I did this it was with small multistage centrifugals. We built 5 pumps, and tested 5 points on each one 10 times.
This will develop enough data to describe each data point with a statistical distribution.
Now run your 2nd order fit and see what the mismatch is. I usually ended up using a 3rd order fit.
Don't do anything that the statistics will not support.
4th order fits the data nicely, but it doesn't describe the real system.

= = = = = = = = = = = = = = = = = = = =
Rust never sleeps
Neither should your protection
 
Whilst I am not any kind of expert in software modelling, I do test real-world pumps dynamically in my facility. This is done to compare refurbished pump performance against manufacturers' curves. It is absolutely correct that the more measurement points taken, the more accurate the result, particularly where small changes in head produce large changes in flow (or vice versa). I do not use any fewer than 10 measurement points, and sometimes up to 15. I have found, with the modest (free!) curve fitting software that I use, that quadratic fits have compared best with manufacturers' published curves. Surely any software used to determine pump performance at various speeds/trims can only follow the affinity laws, which, while they hold true mathematically, are not always entirely correct in the real world (as monaco8774 says), particularly as impeller vane angles stay the same whatever speed/trim is chosen. After all is said, a curve is only averaging out the measurement points taken.
Regards
John
 
monaco8774 said:
However I believe you are looking in the wrong area. A fact of life is that the efficiency of a working pump measured in the field will almost never match the one calculated at the bid phase for any of the following reasons and more:

I agree.

With respect to curve fitting, I tend to use a 4th order curve fit - as you say, it does a better job fitting the data that you have. As others have also said, the curve fitting is only as good as the data set. Maybe more data points would improve accuracy?

"Do not worry about your problems with mathematics, I assure you mine are far greater."
Albert Einstein
Have you read FAQ731-376 to make the best use of Eng-Tips Forums?
 
I would suggest putting some of your data into Excel (or a similar program) and letting Excel curve fit them. Excel will curve fit a variety of polynomial equations and give you the coefficients and the correlation coefficient. It is dead easy to change the polynomial order and see how the curve changes to match the data.

Personally, I've had poor experience with 4th order polynomials on pump curves in that I see odd inflections at points not supported by the data. Typically, I use a 2nd order polynomial or sometimes a 3rd order and get more than sufficient accuracy for my needs. Now, I'm usually plotting data for 'common' centrifugal pumps, maybe that's the reason.

One caution with Excel: you can easily get round-off errors with the polynomial coefficients shown on the graph if you aren't careful. I always use engineering notation for the coefficients, typically go with 3 significant figures, and then calculate the predicted value(s) against the input value(s) to make sure I don't have any odd discrepancies.
 


Thank you all for your suggestions. I am planning to test one of our newer pumps this week and get as much data as possible (running the pump up and down the selected test curves) and then use that to tell me what type of curve fit seems appropriate. Hopefully, with sufficient test data, I should be able to tell how the pump behaves at low flow rates and thus get a better idea of the curves to be used in general, using free curve fitting software or a program such as MATLAB or Excel. I will also keep in mind the realities of the situation and try to keep my analysis conservative. One of the ironies of the slurry pumping business is that all our pump curves are generated in clear water, with adjustments made later on to accommodate the slurry, based on experience and a few correlation factors worked out empirically.

Anyway thank you again for your advice

Regards,
Patrick
 
Another trick I've used is to curve fit only part of the curve if the operating range you are interested in is a relatively narrow part of the entire pump curve. I used this on a cooling water pump (vertical) recently for which I was not getting a good fit when trying to match the whole curve. But over the range I was interested in, a linear fit did the job.

Another trick is to model the curve in sections and then use Excel's "IF" statement to pick the right equation. Much of the time this is overkill, but it is an option.
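A rough sketch of that piecewise idea in Python (standing in for the spreadsheet IF(); the flow/head numbers and the split point are invented for illustration):

import numpy as np

# Invented flow/head data; a linear fit over the low-flow range and a
# quadratic over the high-flow range, with the split point chosen by eye.
q = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
h = np.array([52.0, 51.0, 49.0, 46.0, 41.5, 35.5, 28.0])
split = 1.5

low = np.polyfit(q[q <= split], h[q <= split], 1)     # linear, low flows
high = np.polyfit(q[q >= split], h[q >= split], 2)    # quadratic, high flows

def head(qi):
    # Equivalent of the spreadsheet IF(): pick the equation by flow range.
    return np.polyval(low, qi) if qi <= split else np.polyval(high, qi)

print(head(1.2), head(2.2))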
 
A great way to do curve fitting using excel:

List your points (xi,yi).
Develop a functional form for your estimate of y, call it yhati = f(xi) with unknown parameters alpha, beta, gamma, delta etc.
For example yhati = alpha + beta*xi + gamma*xi^2 + delta*xi^3, etc.

Guess values for alpha, beta, gamma, delta, etc. and put them in cells.

Type in your equation for yhati in terms of xi and the guessed parameters (with the guessed parameters as absolute references). Copy it all the way down the column.

Compute residual Ri = yhati - yi
Compute square of residuals Ri^2
Sum the square of the residuals over all i.
Your objective is to find the values of unknown constants alpha, beta, gamma, delta etc which will minimize the sum of squares of the residuals. Use the solver function to do this. The solver dialogue box has the general form:
Set target cell ____ to max/min/value by changing cells ____.
The target cell is your sum of squares of residuals...
You want to minimize (vs max or value) it...
By changing cells: alpha, beta, gamma, delta etc.

The advantage is you can pick whatever functional form you want. You don't have to choose between x^2 and x^4... you can have both as well as a cos function or anything else. It will pick coefficients to minimize the sum of squares of the residuals.

I think you will find a small investment in time in learning the solver tool (fairly simple) will pay off in many different types of problems, not just curve fitting.
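The same routine carries over outside Excel as well; here is a minimal sketch in Python, with scipy.optimize.minimize standing in for Solver (the data points and the quadratic functional form are assumptions for illustration):

import numpy as np
from scipy.optimize import minimize

# Hypothetical (xi, yi) data points; substitute your own test data.
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
y = np.array([52.0, 51.0, 49.0, 46.0, 41.5, 35.5, 28.0])

def yhat(params, x):
    # Any functional form you like; here alpha + beta*x + gamma*x^2,
    # i.e. the column formula you would type into the spreadsheet.
    alpha, beta, gamma = params
    return alpha + beta * x + gamma * x**2

def sum_sq_residuals(params):
    # The "target cell": the sum of squared residuals over all points.
    return np.sum((yhat(params, x) - y) ** 2)

# Guessed starting values for alpha, beta, gamma (the "changing cells"),
# then minimize the target, just as Solver does.
result = minimize(sum_sq_residuals, x0=[50.0, -1.0, -1.0], method="Nelder-Mead")
print(result.x, result.fun)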

=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 
Thanks for the link; I am currently reviewing the software. In many ways it encompasses an automated version of the routine written out for me by electricpete. I agree with him that the solver tool in Excel is fairly powerful when utilised correctly. In general (outside the scope of my project), in terms of the coefficients and powers of the equations used to describe data, I quite like the fact that it can include non-integer power values in the curve.

This curve fitting issue has been quite educational for me in terms of figuring out what is accurate given the data, what is realistic in terms of the system, and what is acceptable to my company in terms of standards and time spent on the project.

Thank you all again for your help,

Regards,
Patrick
 
At the risk of saying some unpopular things, I would venture the following comments (please take these in a friendly spirit if you disagree, but do state your objections):

Fitting polynomials to data is one of the oldest procedures in the book. You first need to decide just how accurate your data points are. If the measurements are to be regarded as "perfect", i.e., you blithely ignore the fact that all measurements are subject to statistical error, the next question is:

How best do you find a curve of some kind that has the right mathematical properties, with respect to (a) passing through all the points, (b) having the proper first and second derivatives, and (c) a reasonable ability to extrapolate, however mildly, outside the range of the data?

In all these respects, polynomials (especially of a high order) are generally a very poor choice.

In my opinion, the proper answer is to use cubic splines to fit the data. These have been around for decades, are extremely well researched, and are not difficult to understand or apply, although unfortunately many engineers have not had proper training in the use of this technique. There are several choices in the available software to address item (c) above. A cubic spline with the right "end conditions" specified will always do an excellent job and, in general, will out-perform a polynomial, no matter how high its order.

Another issue arises with respect to the handling of interpolation for compressor speeds different from the ones for which data have been provided. Here, one is much better off using the "fan laws" to fit the original data in transformed coordinates (e.g., Hp/N^2 versus Qs/N for polytropic head, and PolEff versus Qs/N for polytropic efficiency) to ensure that a rational interpolation is performed for any N. Otherwise, you'd be condemned to use another arbitrary order polynomial to interpolate between the curves. Of course, the fan laws can break down if the impeller tip velocity is too close to the sonic velocity, but this is not an issue generally with compressors used in the process industries (hydrogen compressors may be the main exception).

One of the nicest things about splines is that the vectors of spline interpolation coefficients can be saved, after the first time through, for future use. This reduces the computational effort in subsequent interpolations. The coefficients are revised only if the original data set changes.
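For reference, a minimal sketch of such a spline fit, assuming Python with SciPy (the head/flow values are invented for illustration and assumed noise-free):

import numpy as np
from scipy.interpolate import CubicSpline

# Illustrative head-vs-flow points (invented values, assumed noise-free).
q = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
h = np.array([52.0, 51.0, 49.0, 46.0, 41.5, 35.5, 28.0])

# End conditions: here the first derivative at each end is pinned to the
# slope of the neighbouring points; 'natural' and 'clamped' are alternatives.
slope_lo = (h[1] - h[0]) / (q[1] - q[0])
slope_hi = (h[-1] - h[-2]) / (q[-1] - q[-2])
spline = CubicSpline(q, h, bc_type=((1, slope_lo), (1, slope_hi)))

# The piecewise coefficients (spline.c) are computed once and can be kept
# for later interpolations, as described above.
print(spline(1.75))         # interpolated head
print(spline(1.75, 1))      # first derivative at the same point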

A great reference for cubic splines (and a huge variety of other numerical problems that are relevant to engineers) is:

W. H. Press, et al: "Numerical Recipes in C", 2nd Edition (Cambridge Univ. Press, 1992). There is a lot of published software in this book, beautifully explained, and also available at a nominal cost from the publisher. Fortran 77, Fortran 90, and Pascal versions of the book and software are also available.

That said, it is sheer folly for any manufacturer to insist that his measurements are absolutely perfect. Nothing in the universe is known perfectly, no matter how "accurate" your measurement techniques. In a real-world system that is subject to dynamic disturbances, it is especially galling that anyone would adopt such a dogmatic position. If a simple plot of the data shows abnormal bumps in the curvature, you can be pretty sure that the measurements are suspect. Here, I would resort to use of the fan laws at least to eliminate the discrepancy.
 
Those are good points to broaden the discussion.

Each tool has its place, and which tool is best certainly depends on the data, the amount of noise/random error, and your knowledge of any underlying functional form.

Here is the book mentioned above available for free:


=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 
I definitely agree with you, UmeshMathur. There is a lot of valuable information for me in your post. I am trying to persuade my company to take greater account of error when testing pumps, developing pump databases and publishing pump curves (something that they currently do not do). I am hoping to develop a new set of standards for the handling and processing of test data so that we can improve the utility of our performance curves and standardize them (in many ways they are not standardized at the moment).

I have a meeting with the software developer next week, and I plan to look through the relevant sections of "Numerical Recipes" before discussing the technicalities of the program and the way it handles data, so that he and I may agree on methods to process test data in the program used for pump selections. I cannot agree more that no data points are ever known perfectly, and this has been the basis of my argument to re-examine the way in which curves are applied through the data. Since I am relatively new to the field, it has been really interesting to learn in depth the methods and problems associated with curve fitting. For someone like me, who is inherently interested in the correlation between experimental and numerical values, this is a great project for developing my skills.

I have noticed, though, that in general throughout our local slurry pump industry (South Africa and Africa) these issues have taken a back seat and have not been fully investigated. Even if we specify efficiency values with relatively high confidence, in general our customers do not fully check these efficiency values (when the pump is fitted into the actual pipe network in which it is to operate, with the slurry) or require us to pay penalties for discrepancies. There is also little information with regard to the change of efficiency values over time. I would have thought more information would have been supplied locally with regard to these factors, and that independent certification tests would have been done on our pumps by our customers to double-check our results. Currently we do have customer certification tests, but they are conducted in our test facility (a clear water test facility) using our equipment and our data processing techniques. It may be that, for the moment, there is simply not enough of a requirement from the mining and process industries we sell our products to, to second-guess our data.
 
As far as the cubic spline goes, what I understood from the above is that it would be a good solution if we have high confidence that the data are exact, but not as good in the presence of noise. For example, imagine one bad data point lying off the smooth curve. The spline will force the curve directly through that point, while the polynomial fit would generally draw a smooth curve that passes close to most of the points.


One thing to note about the Excel/Solver method I mentioned above: sometimes it works great on the first try, and sometimes Excel has problems selecting which terms to use first when you put in a lot of terms. So you can do it in stages. First fit yhat1 to the form you think works best, maybe a*x^2 or a*x^n. Then accept the solution for a and n and plot the first set of residuals (y - yhat1). Now treat that residual curve as your new objective and guess its functional form (maybe b*exp(-c*x)). Use Solver again to determine b and c to best match the first set of residuals (treating the previously determined a and n as constants). Now you have a second set of residuals, and you can keep repeating the process. At each step you are working with a smaller and smaller set of residuals, so the longer you spend, the closer you will get to matching your data points.
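A sketch of that staged approach in Python with SciPy (the data below are invented: a power-law trend plus a small exponential bump and some noise; the functional forms follow the ones suggested above):

import numpy as np
from scipy.optimize import curve_fit

# Invented data: power-law trend, small exponential bump near the right end,
# plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0.5, 2.0, 30)
y = 2.0 * x**2.5 + 0.05 * np.exp(-3.0 * (2.0 - x)) + rng.normal(0, 0.005, x.size)

# Stage 1: fit the dominant form, here a*x^n.
def stage1(x, a, n):
    return a * x**n
p1, _ = curve_fit(stage1, x, y, p0=[1.0, 2.0])
r1 = y - stage1(x, *p1)                     # first set of residuals

# Stage 2: fit the residual curve with another form, e.g. b*exp(-c*(xmax - x)),
# treating the stage-1 parameters as fixed constants.
def stage2(x, b, c):
    return b * np.exp(-c * (x.max() - x))
p2, _ = curve_fit(stage2, x, r1, p0=[0.05, 3.0])

# Final estimate is the sum of the fitted pieces; repeat with the new
# residuals if a still closer match is needed.
yhat2 = stage1(x, *p1) + stage2(x, *p2)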

One last thing to consider: if you have not just one set of test data but multiple sets, you can make far better estimates and eliminate random/non-repeatable variability (of course, you can't get rid of bias errors that show up the same way each time).

If you have any data sets to post, I would be glad to do a solution with excel and post a link to the file back here.

=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 
Patrick,

You may want to pay closer attention to how accurately the actual pump shaft speed is known during each test state. I have found that seemingly minor variations in the shaft speed can have significant implications if they are not properly included in the mathematical adjustment of the data. If the pump is driven by an induction motor, the variation in the actual shaft speed due to slip at different torque loadings can be troublesome. If the pump is driven by an induction motor through v-belts or flat belts, then belt slip can introduce further speed variation from the nominal speed.

As others have stated above, I have found Excel to be very handy in resolving pump and fan problems. Surprisingly often, I have found that adjusting for the slip of the driving induction motor was highly beneficial in the evaluation of apparent performance problems.
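A small illustration of the kind of speed correction being described, in Python; the slip, pulley ratio and test numbers below are all invented:

# Estimate the actual pump shaft speed from motor slip and the belt drive,
# then refer a measured point back to the nominal curve speed using the
# affinity laws. All numbers are invented for illustration.
sync_speed = 1500.0            # rpm, synchronous speed (4-pole motor, 50 Hz)
slip = 0.025                   # assumed slip at this torque loading
pulley_ratio = 0.8             # assumed drive ratio (pump rpm / motor rpm)

motor_speed = sync_speed * (1.0 - slip)
pump_speed = motor_speed * pulley_ratio        # ignoring any belt slip

n_nominal = 1180.0             # speed the published curve refers to
q_meas, h_meas = 210.0, 43.0   # a measured flow/head point taken at pump_speed
r = n_nominal / pump_speed
q_corr, h_corr = q_meas * r, h_meas * r**2     # Q scales with N, H with N^2
print(pump_speed, q_corr, h_corr)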

 
Patrick02:

My comments earlier mentioned compressors, but apply equally to centrifugal pumps.

SLURRY PUMPS
A major source of contention with slurries, of course, is the characterization of slurry properties. With suspended solids, one must worry about accurate knowledge of the solids bulk density, percentage solids in the pumped fluid, adequate solids dispersion into the bulk, non-Newtonian behavior, and variability in solids content over time. Such measurements are very hard to replicate in a test lab and all guarantees must reflect the lack of certainty in slurry characterization.

I would recommend the following as an excellent book for discussion of fluid flow problems for slurries: R. P. King, "Introduction to Practical Fluid Flow", Chapters 3, 4, 5, and 6 (Butterworth-Heinemann, 2002). King shows the very many intricacies in this kind of system that can waylay the casual user.

Back-calculation of pump or compressor efficiency in the field requires measurement of flows, temperatures, and pressures (all subject to previously mentioned measurement errors) and also user-specified values for fluid density and other transport properties. I would say that field measurement of slurry flows is itself a huge area of uncertainty. For non-Newtonian fluids, several other parameter inputs are required that are themselves back-calculated from field data that are difficult to replicate in the laboratory. The uncertainty in these measured and unmeasured inputs used for such calculations is often the source for a lot of wrangling between vendors and owners. Consequently, I regard those slurry pump vendors who provide guarantees with monetary penalties attached as intrepid warriors.

CURVE FITTING
With respect to electricpete's latest post, I agree that splines may not be appropriate if there is significant evidence of scatter in the underlying data. However, most compressor vendor curves are smoothed before being provided to clients, so such discrepancies are not in evidence.

With pump curves that show much scatter in the data, the procedure I would recommend is first to reduce the data using the fan laws. Then select a simple, non-polynomial functional form that has the proper behavior with respect to curvature and boundary conditions. Unfortunately, a simple parabola rarely suffices. The final fitting of the function's coefficients against the reduced data is performed far more easily with standardized statistical software than is possible in Excel; a lot of it is available on the web free or for a nominal cost, including packages with a good non-linear solver, nice graphics and an easy input language.

Whichever functional form you choose, it is essential to plot the predicted and actual values (Hp/N^2 or polytropic efficiency versus Qs/N). This will show up any systematic error in the predictions owing to an erroneous choice of function. Patrick02 may be pleased to hear that 4th order polynomials are assuredly poor performers in fitting such data in a huge proportion of cases. This is because they exhibit up to two points of inflection, which distort the curve fit over much of the range of the data.

CHALLENGE PROBLEM
To make things interesting, and just to show you how hard the task of finding a proper fitting function can be, I extracted the following set of data for 17 points taken from a real compressor vendor’s curves:

Qs/N: 0.8, 0.833333, 0.866667, 0.9, 0.933333, 0.966667, 1.0, 1.033333, 1.066667, 1.1, 1.133333, 1.166667, 1.2, 1.233333, 1.266667, 1.3, 1.333333
1000*Hp/N^2: 0.297, 0.296, 0.294, 0.293, 0.292, 0.290, 0.288, 0.286, 0.283, 0.280, 0.276, 0.272, 0.267, 0.259, 0.249, 0.234, 0.199
Pol. Eff.: 0.746, 0.758, 0.768, 0.778, 0.785, 0.794, 0.800, 0.806, 0.810, 0.812, 0.812, 0.809, 0.802, 0.794, 0.777, 0.750, 0.676

The tasks are: (a) Fit a function to the Hp/N^2 v/s Qs/N data, and (b) Fit a function to the Pol. Eff. v/s Qs/N data.

This should be an interesting challenge problem for the “polynomial fitters” to try out. The results of such efforts will, I assure you, be extremely revealing and should serve to reinforce my general point about use of polynomials for such work.

Needless to say, using a cubic spline to fit such data is a cinch and requires no guesswork at all about the type of function or polynomial order to use. (As a rule, I specify that the slope at the end-points be equal to that of the neighboring points, so spline extrapolation is reasonable).
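For anyone who wants to try the comparison in code rather than a spreadsheet, here is a sketch in Python with NumPy/SciPy using the head data exactly as quoted above (the efficiency data can be treated the same way); it simply sets up both fits without asserting which wins:

import numpy as np
from scipy.interpolate import CubicSpline

# The 17-point data set quoted above: X = Qs/N, Y = 1000*Hp/N^2.
x = np.array([0.8, 0.833333, 0.866667, 0.9, 0.933333, 0.966667, 1.0,
              1.033333, 1.066667, 1.1, 1.133333, 1.166667, 1.2,
              1.233333, 1.266667, 1.3, 1.333333])
y = np.array([0.297, 0.296, 0.294, 0.293, 0.292, 0.290, 0.288, 0.286,
              0.283, 0.280, 0.276, 0.272, 0.267, 0.259, 0.249, 0.234, 0.199])

# (a) 4th order least-squares polynomial.
poly = np.polyfit(x, y, 4)
print("max abs polynomial residual:", np.abs(np.polyval(poly, x) - y).max())

# (b) Cubic spline with end slopes taken from the neighbouring points.
bc = ((1, (y[1] - y[0]) / (x[1] - x[0])),
      (1, (y[-1] - y[-2]) / (x[-1] - x[-2])))
spline = CubicSpline(x, y, bc_type=bc)

# Compare the two fits between the data points, where the differences show up.
xf = np.linspace(x[0], x[-1], 200)
print("max difference between fits:", np.max(np.abs(np.polyval(poly, xf) - spline(xf))))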
 
For your data with a very fine mesh and very smooth curve, spline is no doubt better. Since you have extracted the data from a curve, the best-fit has already been done for you in preparation of the curve. I would hope no noise would remain in the vendor's curve.

=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 
Actually, there is a little bit of noise in the data, perhaps from the errors involved in reading values off a curve.

I did a fit of 1000*Hp/N^2 (Y) vs Qs/N (X) here


The first estimate was Yhat1 which was a power law fit.

Residuals were R1.

I fit R1 using R1hat, which was a sum of two decaying exponentials.

Result is the final estimate Yhat2 = Yhat1 + R1hat as follows:

Yhat2 =+$D$2+$D$3*(L9-$D$1)^$D$4 +$G$2*EXP(($G$1-L9)*$G$3)+$G$4*EXP(($G$1-L9)*$G$5)

where
column L is the X values
Constants as follows:
D1 = 0.499637258    G1 = 1.251207481
D2 = 0.297529817    G2 = -0.039569578
D3 = -0.176437834   G3 = -17.73002606
D4 = 4.028911694    G4 = 0.053137589
                    G5 = -13.12118014

All of the calcs are done on the tab labeled XYonly.

I had some difficulty fitting the right-most point until I increased its weighting. (The objective function is the sum of all the residuals squared, plus another 3 times that right-most point's residual squared.)

Results agree to within 1% for all the given data points. (see cell H27 of tab XYonly)
You can see the overview of the results in the chart tab labeled “Y_andYhat2”

I think what you get is a smoother curve than if you used a spline for the same purpose, although I think the spline would probably be more accurate. Also, if there are known boundary conditions, such as the curve crossing an axis at a right angle, it was indicated above that these can be incorporated into the spline method. I don't know of any easy way to incorporate that into the Excel fit method I described, so if you can build those boundary conditions into the spline method, the spline method would be better for purposes of extrapolating beyond the data.
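As a cross-check, the spreadsheet model above translates almost directly into code; a sketch in Python, assuming the constants map exactly as listed (X is Qs/N, Y is 1000*Hp/N^2):

import numpy as np

# Transcription of Yhat2 above, with the constants D1..D4 and G1..G5 as listed.
D1, D2, D3, D4 = 0.499637258, 0.297529817, -0.176437834, 4.028911694
G1, G2, G3, G4, G5 = (1.251207481, -0.039569578, -17.73002606,
                      0.053137589, -13.12118014)

def yhat2(x):
    return (D2 + D3 * (x - D1)**D4
            + G2 * np.exp((G1 - x) * G3)
            + G4 * np.exp((G1 - x) * G5))

# Compare against the challenge data points quoted earlier in the thread.
qs_n = np.array([0.8, 0.833333, 0.866667, 0.9, 0.933333, 0.966667, 1.0,
                 1.033333, 1.066667, 1.1, 1.133333, 1.166667, 1.2,
                 1.233333, 1.266667, 1.3, 1.333333])
y = np.array([0.297, 0.296, 0.294, 0.293, 0.292, 0.290, 0.288, 0.286,
              0.283, 0.280, 0.276, 0.272, 0.267, 0.259, 0.249, 0.234, 0.199])
print("max relative error:", np.max(np.abs(yhat2(qs_n) - y) / y))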

=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 
Actually, if either or both of the shutoff head and the zero-dP runout flow were known, these could easily be incorporated into the least-squares fit as additional points, using a weighting approach similar to the one mentioned above: heavier weighting of the points that are far away from the central cluster.

I don't know enough about pumps to know if there are any other relevant boundary conditions. (Do the curves cross the axes at right angles?)

=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 