Eng-Tips is the largest engineering community on the Internet

Intelligent Work Forums for Engineering Professionals

  • Congratulations waross on being selected by the Eng-Tips community for having the most helpful posts in the forums last week. Way to Go!

Multivariable regression in Excell 5

Status
Not open for further replies.

BBzDan

Mechanical
Apr 18, 2008
7
CA
Does anyone have any suggestions how to do a multivariable polynomial (2nd and 3rd order) and power regressions in Excel and calculate the coefficients and some statistical characteristics of the fitted equation? Let's assume for simplicity only 2 independent variables.

The second question is if there is anyway to plot the 3D scattered points and the surface representing the regression equation of those points in Excel.

I found a solution for 2-varaible linear regression, but not for polynomial or power.

Thank you,
Dan
 
Replies continue below

Recommended for you

Excel's LINEST() function includes multivariate regression almost as easily as it covers univariate regression. It also gives you various "statistical characteristics of the fitted equation". Note that you need to use it as an "array function" to get the full panoply of its results.

Polynomial regression can be treated as a special case of multivariate regression. Create data columns containing x^2, x^3, etc. Then treat these extra columns as if they are further independent variables.

(I have never quite rationalised in my own mind the extent to which it is statistically valid to treat x and x^n as independent, but this is the method used, and it seems to work.)
 
Denial, thank you, I'll try it. I used LINEST() for linear multivariable, and single variable linear, polynomial, etc., but it hasn't crossed my mind to build extra columns with x1^2, x1^3, and respectively x2^2, x2^3 to use it for multivariable polynomial regression.

Dan
 
I some cases you don't even need to built extra columns
=LINEST(Y1:Y10,X1:X10^{1,2,3},TRUE,TRUE)
will fit a cubic polynomial
 
Cummings54, thx for your suggestion. However, it applies to single variable polynomial regression, not multivariables. What I need is multivariable polynomial regression (and multivariable power regression), of the type y = f(x1,x2...xn). To simplify, let's say omly 2 indepemdent variables, 3rd order in x1 and 3rd order in x2.

Moe complicated may be a polynom of say 2nd order in x1 and 3rd order in x2.

Here, x1, x2..xn are the independent variables and y is the dependent variable.

Dan
 
With a little variation you can include multivariable with their powers by building an array. Suppose your X1's are in B1:B10 and x2's are in C1:C10. Then
=LINEST(Y1:Y10,B1:B10^{1,2,3,0,0,0}*C1:C10^{0,0,0,1,2,3},TRUE,TRUE)
will fit first, second and third order in each independent variable.

Of course at some point you may decide the LINEST formula is too awkward and opt to have the powers in the worksheet.

When you have all variables in a table in the worksheet you can also use the matrix functions to perform the least squares regression as follows:
=MMULT(MINVERSE(MMULT(TRANSPOSE(A1:C5),A1:C5)),MMULT(TRANSPOSE(A1:C5),Y1:Y5))

which assumes you have the independent variables in a1:c5
 
Cummings,
Thanks for help, it started to work. I tried the LINEST() function as you described it and got the coefficients for x1, x1^2, x1^3, x2, x2^2, x2^3, and the free one, but without any terms for products between the 2 independent variables (e.g.: x1*x2, x1^2*x2, x1^3*x2, etc.

I checkd the values, and it's not bad.

Attached is what I've got.
Dan
 
 http://files.engineering.com/getfile.aspx?folder=3ff4b5a6-d00b-4955-8006-40e6599161c3&file=Poly.xls
To include the products x1*x2, x1^2*x2, x1^3*x2 as well, use another variation as follows:
=LINEST(Y1:Y10,B1:B10^{1,2,3,0,0,0,1,2,3}*C1:C10^{0,0,0,1,2,3,1,1,1},TRUE,TRUE)
 
Cummings,

Thank you very much for your suggestion, it works. I also figured out how to get all the coefficents of all the other terms, following the pattern in your formula.

Now, I have to take care of the power regression, of the type y = a*(x1^b)*(x2^c). Any suggestions for this one?

Dan
 
If you include the power terms you will no longer have a linear system. Techniques for non-linear regression need to be applied. The exception, as jghrist notes, is when all terms are power terms and you can take the logarithm and "linearize" the equation. For non-linear fitting it may be best to use a commercial product. I've had good success with Aspire Software's Table Curve 2D
 
This is a very interesting way to multivariable and multipower regression. I am following cummings54 logic with the powers, I think. For instance
B1:B10^{1,2,3,0,0,0,1,2,3}*C1:C10^{0,0,0,1,2,3,1,1,1}

Call "x" the B column, and "z" the C column. This above form will give you the coefficients for an equation of form:
Y=a1*(x^1)*(z^0)+a2*(x^2)*(z^0)+a3*(x^3)*(z^0)+
a4*(x^0)*(z^1)+a5*(x^0)*(z^2)+a6*(x^0)*(z^3)+
a7*(x^1)*(z^1)+a8*(x^2)*(z^1)+a9*(x^3)*(z^1)+a10
where I am looking for a1, a2,...,a10. To get Excel to give me this coefficients a1, etc. I select cells 1 row by 10 columns (1 for each coefficient I want to output), type in

=LINEST(Y1:Y10,B1:B10^{1,2,3,0,0,0,1,2,3}*C1:C10^{0,0,0,1,2,3,1,1,1},TRUE,TRUE)

in the fx (function) box at the top, then hit CTRL-SHIFT-ENTER to get Excel to do the curve fitting for those coefficients. What I don't understand is the logic Excel uses for the output; in my 1 row by 10 column block of cells, where do the a1, a2, etc. go? Looks like 'a10', the constant term, always goes at the end. But it seems like it reverses the sequence of (remember, column B is x, C is z):
x^{1,2,3,0,0,0,1,2,3}*z^{0,0,0,1,2,3,1,1,1} so that the powers are reversed and the x and z are reversed. I will guess:
column coefficient
1 a9
2 a8
3 a7
4 a6
5 a5
6 a4
7 a3
8 a2
9 a1
10 a10 (the constant)

this doesn't seem to be the correct output sequence. I tried, for instance, just using only one term (say a2*x^2, plus the constant a10) to compute a sequence of 'Y', then curve fitting with =LINEST(Y,x^{1,2,3,0,0,0,1,2,3}*z^{0,0,0,1,2,3,1,1,1},TRUE,TRUE)
and the non zero coefficient kept jumping around in that 1x10 block I've selected in the worksheet.
 
Gents, thx for your suggestions.

Cummings, unfortunatelly, for what I need, I cannot use a commercial package, thus, I'm stuck with Excel even for the non-linear regressions.

Thus, regarding power, [y=a*(X1)^b1*(X2)^b2], I used
LINEST(LN(A1:An),LN(B1:Cn),TRUE,TRUE), where y is in column A, X1 is in column B and X2 is in column C. The result gives ln(a), b1 and b2, thus, I had to take exp(ln(a)) to get the free term a. It seemss to work.

Regardding polynomial, I've got OK the 2nd order with all the 9 terms (y = A + B1*X1 + B2*X1^2 + C1*X2 + C2*X2^2 + D1*X1*X2 + D2*X1^2*X2 + E1*X1*X2^2 + E2*X1^2*X2^2), using:

LINEST(A1:An),B1:Bn^{1,2,0,0,1,2,1,2}*C1:Cn^{0,0,1,2,1,1,2,2},TRUE,TRUE),

but I have difficulties to get the 3rd order with all the 16 termss

(Y = a0 + a1*x1 + a2*x1^2 + a3^x1^3 + b1*x2 + b2*x2^2 + b3*x2^3 + c1*x2*x1 + c2*x2*x1^2 + c3*x2*x1^3 + d2*x2^2*x1 + d3*x2^3*x1 + e2*x1^2*x2^2 + e3*x1^3*x2^2 + f2*x1^2*x2^3 + f3*x1^3*x2^3

using something similar:
LINEST(A1:An,B1:Bn^{1,2,3,0,0,0,1,2,3,1,1,2,3,2,3}*C1:Cn^{0,0,0,1,2,3,1,1,1,2,3,2,2,3,3},TRUE,TRUE).

The results do not seem to be correct, and don't know where I mixed up.

I hope all this discussions help others, too, as I spent a lot of time on the net trying to find info about this, and couldn't find exactly what I needed, except for your suggestions.

Thx,
Dan
 
There are limits to what LINEST can do. Looking at the add trendline feature avalable in Excel charting you can see that only polynomials of order 6 can be fitted. This may be a hint that when regressions having more than 7 coefficients in the approximating equation, there may be computational issues. With 17 variables some users have reported #REF errors which can mean machine underflow and overflow errors. If LINEST starts to give errors or the fit doesn't seem right use the matrix method I have already given. MS claims the MINVERSE can handle a 50x50 matrix ie. which could accomidate 50 variables.
 
I just copy pasted this out of a spreadsheet I made years ago with a chiller energy function. Not sure where I got these, but they look relevant and pretty. Apologies if this is a repeat of the good work above.

These equations assume that your sheet has two named ranges: x and y.

Linear Trendline
Equation: y = m * x + b
m: =SLOPE(y,x)
b: =INTERCEPT(y,x)

Logarithmic Trendline
Equation: y = (c * LN(x)) - b
c: =INDEX(LINEST(y,LN(x)),1)
b: =INDEX(LINEST(y,LN(x)),1,2)

Power Trendline
Equation: y=c*x^b
c: =EXP(INDEX(LINEST(LN(y),LN(x),,),1,2))
b: =INDEX(LINEST(LN(y),LN(x),,),1)

Exponential Trendline
Equation: y = c *e ^(b * x)
c: =EXP(INDEX(LINEST(LN(y),x),1,2))
b: =INDEX(LINEST(LN(y),x),1)

2nd Order Polynomial Trendline
Equation: y = (c2 * x^2) + (c1 * x ^1) + b
c2: =INDEX(LINEST(y,x^{1,2}),1)
C1: =INDEX(LINEST(y,x^{1,2}),1,2)
b = =INDEX(LINEST(y,x^{1,2}),1,3)

3rd Order Polynomial Trendline
Equation: y = (c3 * x^3) + (c2 * x^2) + (c1 * x^1) + b
c3: =INDEX(LINEST(y,x^{1,2,3}),1)
c2: =INDEX(LINEST(y,x^{1,2,3}),1,2)
C1: =INDEX(LINEST(y,x^{1,2,3}),1,3)
b: =INDEX(LINEST(y,x^{1,2,3}),1,4)

Higher Order Polynomial Trendline
Notice the pattern in the two preceding sets of formulas
 
Gents, thx again for reply.

Cummings, the 2nd order polynomial, which has more than 7 coefficients, works, thus, computational power may not be the issue when talking about 16 coefficeints, for a 3rd order, 2 independent varaibles polynomial regression.

Kiwi and Clyde, thx for info, but what I need is multi-variable (let's say 2), not single variable regressions (e.g. I need: y = f(x1, x2). The info you provided is the base for the respective functions, but only for a single independent varable (y = f(x)).

The only one I still have problems for my "little exercis" is the 3rd order polynomial for 2 independent varables, with all the terms included in the equation (see a previous message).

Dan

 
Thanks to you Clyde for providing my reference, and also to the OP for pointing out my lazy post was not particularly useful.

I did originally solve my problem as a 2nd order poly over two variables using a solver built into the 'ideas' process modelling environment, but I found that this was difficult to implement in excel (I had to solve it over and over for different chillers) and I found that the error was acceptable in approximating one of the dimensions as linear and adjusting after the first run.

Worth trying as you move in to the cubic realm.

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top