The Fallacy of Extrapolating with Computer Models

I have vocally and repeatedly proclaimed that computer models cannot prove anything. As I was working on updating my 5-day Engineering course I came across a perfect example of what I'm talking about.

This model started with the best dataset ever assembled. Really. The best ever. The underlying data was all "coincident" to this model. By that I mean that the people collecting the data had a strong motive to make it as accurate and complete as it could possibly be. Those folks got paid, and they paid partners and mineral owners, based on the data (so they had no significant incentive to illegally adulterate it). Oh yeah, the data is also required by law to be complete and accurate, and has been since the 1940's. Further, the data was collected each month on upwards of 400,000 discrete entities operated by nearly 100,000 business entities, all with an explicit license to operate that is not trivial to acquire. In other words, the data entering this model came with financial and legal incentives to be accurate and complete. Of course, the dataset is monthly U.S. gas production by well.

If you start with this high quality data and bring in:
[ul]
[li]Historical wellhead price and consumer price data sets along with an independent forecast of those prices into the future[/li]
[li]A detailed data set containing historical new-well permits that can be compared to the price data over time[/li]
[li]A detailed data set containing new facility permits that can be compared to new well permits and the price data[/li]
[li]A detailed list of issued permits for facilities (that take up to 10 years to build after the permit is issued) with their projected completion dates and projected capacities (see the big uptick in the attachment in the Alaska data in 2019 representing the pipeline coming on line)[/li]
[li]Independent forecasts of inflation[/li]
[li]Historical and (independently) projected steel pipe worldwide manufacturing tonnage and prices[/li]
[li]A team of very talented, very experienced Engineers, Economists, Statisticians, and Computer Modelers[/li]
[li]A project deadline that the team felt was very liberal[/li]
[li]No limits on budget for manpower, computing equipment, or software[/li]
[/ul]

It really doesn't get any better than this. They published the attached forecast in the 2007 Energy Outlook; there were some glitches in the first version, so they updated it and published the attached in the 2008 Energy Outlook. This chart was reprinted hundreds of times over the next few years. I haven't seen it much since 2011. I pulled in the data yesterday and added the actual production between 2006 and 2011 (the last data available that breaks out the various gas types).

So at year 5 you find:
[ul]
[li]Unconventional gas was under-predicted by 92% (using the Unconventional Gas forecast as the denominator; the sign convention is sketched in the snippet after this list)[/li]
[li]Onshore conventional was over-predicted by 41%[/li]
[li]Offshore was over-predicted by 60%[/li]
[li]Alaska was over-predicted by 55%[/li]
[li]Total gas was under-predicted by 19%[/li]
[li]If you remove the Unconventional component, the total would be over-predicted by 55%[/li]
[/ul]
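To be explicit about the arithmetic behind those percentages, here's a rough sketch (Python; the forecast/actual magnitudes below are made-up placeholders chosen only so the output reproduces a few of the figures above - they are not the values from the attached chart):

[code]
# Rough sketch of the sign convention used above (forecast in the denominator).
# The forecast/actual pairs are made-up placeholders, chosen only so the
# arithmetic reproduces a few of the quoted percentages -- they are NOT the
# real values from the attached chart.

def percent_error(forecast, actual):
    """Signed error: positive = over-predicted, negative = under-predicted."""
    return (forecast - actual) / forecast * 100.0

cases = {
    "Unconventional": (10.0, 19.2),        # actual far above forecast
    "Onshore conventional": (14.0, 8.3),
    "Offshore": (5.0, 2.0),
}

for name, (fcst, act) in cases.items():
    err = percent_error(fcst, act)
    tag = "over-predicted" if err > 0 else "under-predicted"
    print(f"{name}: {tag} {abs(err):.0f}%")
[/code]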

This is at year 5 of a 24-year forecast. In early 2008, gas prices were over $10/MSCF, drilling in the Marcellus, Haynesville, and Fayetteville shales had accelerated, and rig counts were approaching all-time highs. All of this data was readily available to the modelers, but they didn't quite believe it and tweaked the model back to a slight increase followed by flattish, with offshore taking up the slack.

I don't mean to ridicule these guys; they did a workmanlike job. I made a similar blunder in 1990 when I failed to include a group of wells (that I had already built pipe to) in a forecast of the value of a company that was on the market. A competitor did include those wells and offered $15 million more than we did--the group of wells I excluded produced that much profit in the first 6 months, and 23 years later they are still on production.

My point is this: if a team with a superb set of clean data, unlimited time and budget, all of the requisite skills, and no incentive to reach a particular conclusion couldn't predict something as "simple" as gas production within 55%, how can anyone put any credence in the climate models that have questionable data, intense time pressure, intense budget pressure, and intense pressure to reach a specific conclusion? Hell, they could even be "right", but I won't be willing to accept that until we can look back at a body of predictions that have the same shape as the actual (raw) data for that period. So far we are not even close.

David Simpson, PE
MuleShoe Engineering

"Belief" is the acceptance of an hypotheses in the absence of data.
"Prejudice" is having an opinion not supported by the preponderance of the data.
"Knowledge" is only found through the accumulation and analysis of data.
The plural of anecdote is not "data"
 
One thing I struggle with often is getting others to realize that there are assumptions, limitations, and uncertainty in every model. We need to always be ready for what we don't know.
 
Anyone with any scientific training knows that a predictive model cannot 'prove' anything; it's a tool to predict something that could happen.

That said, could you do me a favour, and start 324,567,890,000 more threads on here that state you don't agree with the current popular climate predictions and surrounding politics? I don't think we have quite enough of your viewpoint posted on here yet.
 
The most insidious thing is scientists who intentionally calibrate their models against a specific variable with a known correlation, but then use those models to claim causality.



Hydrology, Drainage Analysis, Flood Studies, and Complex Stormwater Litigation for Atlanta and the South East
 
TenPenny,
That sounds like sarcasm. I can't help the direction that the threads I start go. The "consensus science" discussion could have gone to a discussion of historical consensus topics; it isn't my fault it fell into the AGW discussion. This one is the same. I saw an excellent example of a best-in-class model failing to predict a pretty simple system (compared to the climate of the globe). Thought I'd share. I would guess that there is some interest in these discussions, since most of them get over 100 posts in a week. If you don't want to participate, there is a really effective technique that I use all the time--don't open the damn thread.

David Simpson, PE
MuleShoe Engineering

"Belief" is the acceptance of an hypotheses in the absence of data.
"Prejudice" is having an opinion not supported by the preponderance of the data.
"Knowledge" is only found through the accumulation and analysis of data.
The plural of anecdote is not "data"
 
Having a bad day, TenPenny? [curse]

It is better to have enough ideas for some of them to be wrong, than to be always right by having no ideas at all.
 
It really depends on what models you are talking about. For example, models have been used successfully for years to control many different physical processes. If you have flown lately on a commercial airline, then you most likely experienced model predictive control, also known as fly-by-wire.

In the case of the OP example, the model was trying to predict human behavior. I do not know of anyone, or anything that can predict human behavior.
 
djs,
I'm a modeler. I understand the power of models better than most. I also understand the limitations. The big limitation is extrapolating any human or natural system forward more than a step or two. Weekly projections of any system going out decades are just random number generators after the first month.
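As a rough illustration of how fast that degrades, here's a sketch with purely hypothetical numbers: the "true" system just sits flat, while the forecast's month-to-month trend estimate carries a small random error that compounds over a 24-year outlook.

[code]
# Purely illustrative: the "true" system stays flat at 100, but the forecast's
# month-to-month trend estimate carries a small random error (2% per month).
# Compounded over a 24-year outlook, the envelope of outcomes gets very wide.
import random

random.seed(1)

def one_forecast(months=288, sigma=0.02):        # 288 months = 24 years
    level = 100.0
    for _ in range(months):
        level *= 1.0 + random.gauss(0.0, sigma)  # small errors compound
    return level

runs = sorted(one_forecast() for _ in range(2000))
print("true value after 24 years: 100.0")
print(f"median forecast:           {runs[1000]:.1f}")
print(f"5th-95th percentile band:  {runs[100]:.1f} to {runs[1900]:.1f}")
[/code]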

David Simpson, PE
MuleShoe Engineering

"Belief" is the acceptance of an hypotheses in the absence of data.
"Prejudice" is having an opinion not supported by the preponderance of the data.
"Knowledge" is only found through the accumulation and analysis of data.
The plural of anecdote is not "data"
 
"" The big limitation is extrapolating any human or natural system forward more than a step or two""

I don't think so as you might guess.

Thermal conductivity models of homogeneous, stationary media are inherently stable. The PDE solution to a heat flow problem will be stable forever if it converges in the first place.

I think you miss the difference between stability, marginal stability, and instability as types of dynamic systems.

For instance, dissipative systems are stable, and their solutions converge to a point and stay there.

See this nice wiki article.
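A minimal sketch of what I mean, using explicit finite differences on the 1-D heat equation (the material properties and grid are arbitrary): as long as the usual step-size limit is respected, the solution settles to the steady profile and stays there.

[code]
# Minimal sketch: explicit finite differences on the 1-D heat equation
# u_t = alpha * u_xx, with both ends held at zero.  As long as the usual
# limit r = alpha*dt/dx^2 <= 0.5 is respected, the solution decays to the
# steady profile and then just sits there -- stable forever once it converges.

alpha, length, n = 1.0e-5, 1.0, 51       # diffusivity (m^2/s), rod length (m), nodes
dx = length / (n - 1)
r = 0.4                                  # alpha*dt/dx^2, kept below 0.5
dt = r * dx**2 / alpha
u = [100.0] * n                          # hot rod...
u[0] = u[-1] = 0.0                       # ...with cold ends

steps = 20000
for _ in range(steps):
    new = u[:]
    for i in range(1, n - 1):
        new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    u = new

print(f"midpoint temperature after {steps * dt:.0f} s: {u[n // 2]:.4f}")
[/code]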

 
Google
"extrapolating with computer models"

867,000 hits.

Apparently most are unaware of this fallacy.
 
One reason interpolation works for predictions is that you can be wrong and still get reasonably good results. If you approximate a logarithmic or polynomial function with a set of trig functions, the interpolated results can be tolerably good. However, the divergence as one gets outside of the known data can be, well, logarithmic.
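A toy example of that: fit a short cosine series to ln(x) over a known range, then ask it about points outside that range. The interval, sample count, and number of terms are arbitrary choices.

[code]
# Toy example: fit a short cosine series to ln(x) over [1, 10], then look at
# how it behaves inside and outside the fitted range.
import numpy as np

def basis(x, n_terms=6):
    # columns: 1, cos(k*pi*(x-1)/9) for k = 1..n_terms-1
    cols = [np.ones_like(x)]
    cols += [np.cos(k * np.pi * (x - 1.0) / 9.0) for k in range(1, n_terms)]
    return np.column_stack(cols)

x_fit = np.linspace(1.0, 10.0, 200)
coef, *_ = np.linalg.lstsq(basis(x_fit), np.log(x_fit), rcond=None)

for x in (2.0, 5.0, 9.0, 15.0, 20.0):    # first three inside, last two outside
    approx = (basis(np.array([x])) @ coef)[0]
    print(f"x = {x:5.1f}   ln(x) = {np.log(x):6.3f}   trig fit = {approx:6.3f}")
[/code]

Inside [1, 10] the fit tracks ln(x) reasonably well; at x = 15 and x = 20 it is nowhere close, because the periodic basis can only repeat what it saw in the fitted range.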
 
zdas04 has 1 model that doesn't predict very well.
zdas04 extrapolates to conclude that no models will predict well.

One could just as easily point to somebody like Nate Silver who has had great success in political forecasting with his models.

 
Of course, when it comes to certain situations, you have NO choice but to use computers and software to 'model/simulate' systems which CANNOT be tested using prototypes or even first production models.

A recent case in point:


Note that for the people where I work, this topic would be more than a simple intellectual exercise:


John R. Baker, P.E.
Product 'Evangelist'
Product Engineering Software
Siemens PLM Software Inc.
Industry Sector
Cypress, CA
Siemens PLM:
UG/NX Museum:

To an Engineer, the glass is twice as big as it needs to be.
 
Your Smithsonian link reminded me of another link to that site. This one shows how well the Limits to Growth predicted 30 years out.
 
John - loved the Smithsonian video. However, in EACH of the phases of that, do you have any idea how much testing and prototyping was done? I think the engineering teams spent over 1 year modeling, prototyping and testing the parachute. The science behind the aerobraking maneuver has been tested thousands of times with hundreds of thousands of computer model simulations validated by testing. (Disclosure - I was involved in the boundary-layer stability numerical simulations of Martian and Earth re-entry hypersonic flow for blunt and semi-blunt configurations. The validation against experimental work was beyond extensive).

Mars wasn't done with only models, but models that were validated by extensive prototype testing. AND, nothing was extrapolated - the prototype testing was done to create an envelope of possible scenarios so that, if anything, interpolation and not extrapolation was done.

That's the thing about interpolation vs. extrapolation - we usually test our systems beyond their expected performance, to safely predict what they'll do under normal operation. Case in point: pressure-containing systems (pressure vessels and piping). We test those to 1.25-1.5 times the design pressure (and way more, if they normally operate at high temperatures) to validate the integrity of the design and fabrication. Nobody says, "Well, the last 2000 vessels we made haven't blown up, so let's not test this one - the model says it's OK."

These modeling problems can be broken down to boundary-value problems and initial-value problems. Boundary-value problems have (obviously) known boundary conditions and are essentially interpolations - and nobody asks questions about what happens beyond the boundaries in these closed systems. As 2dye4 indicates, they are typically stable or semi-stable and are well-behaved. Compare that to weather. Weather is an initial-value problem; the more you know about the initial conditions (magnitude, derivative, and integral of all initial values), the better your prediction is. However, in these types of problems, random boundaries may pop up - say a huge increase in a variable such as unconventional gas production, that is essentially unpredictable a priori. Same with climate - solar variation (TSI, GCR, etc), vulcanism, ENSO, other not-well-understood cyclic variations, etc may pop up and make a huge difference. Heck, a 30-minute timing difference in tropical cloud formation has more effect (on a W/m² basis) than a doubling of CO2. Furthermore, the resolution needed for those initial conditions is vastly greater than what is currently available.
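As a toy demonstration of that initial-value sensitivity (using the classic Lorenz system as a stand-in - nothing climate-specific): start two runs a millionth apart and watch them part company.

[code]
# Toy demonstration of initial-value sensitivity: integrate the classic Lorenz
# system twice, starting one part in a million apart, with a crude fixed-step
# Euler scheme (good enough to make the point).
def step(state, dt=0.001, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return (x + dt * sigma * (y - x),
            y + dt * (x * (rho - z) - y),
            z + dt * (x * y - beta * z))

a = (1.0, 1.0, 1.0)
b = (1.000001, 1.0, 1.0)                 # differs in the 6th decimal place

for n in range(1, 40001):                # 40 time units at dt = 0.001
    a, b = step(a), step(b)
    if n % 10000 == 0:
        gap = sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
        print(f"t = {n // 1000:3d}   separation between runs = {gap:12.6f}")
[/code]

The runs are indistinguishable for a while, then the separation grows to the size of the whole attractor. Better knowledge of the initial state buys a longer window of useful prediction, not an indefinitely long one.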
 
Yeah, gotta wonder when they'll be including the next 10 years of data...

John R. Baker, P.E.
Product 'Evangelist'
Product Engineering Software
Siemens PLM Software Inc.
Industry Sector
Cypress, CA
Siemens PLM:
UG/NX Museum:

To an Engineer, the glass is twice as big as it needs to be.
 
2dye4 - very interesting wiki article about attractors in chaotic systems. One of the fascinating aspects of so-called strange attractors is that they are exceedingly stable even in the presence of large perturbations - there may be a leap from one stable attractor to another, but they usually don't "blow up".

Here's a thought exercise for you. I don't know if this is right or wrong or otherwise. What if the earth's climate, the chaotic system that it is, has a stable or pseudo-stable attractor that keeps our overall climate reasonably stable in the presence of huge perturbations? What if something like the daily timing of tropical thunderstorms is sufficient to add or shed enough heat to stabilize the system, whether the perturbation is volcanic eruption emissions or increased CO2?

Considering that we've never experienced a runaway before, even with ice ages (and possibly a "snowball earth") and interglacials with almost complete loss of inter-annual ice, is this a possibility?
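Here's a trivial sketch of that "kick it and it settles back onto the same attractor" behaviour, using the chaotic logistic map as a stand-in (obviously not a climate model - whether the real climate has such an attractor is exactly the open question):

[code]
# Trivial sketch of "perturb it and it settles back onto the same attractor",
# using the logistic map in its chaotic regime (r = 3.9) as a stand-in.
# Halfway through, the state gets kicked to a very different value (still
# inside the map's basin of attraction) -- the iterates stay bounded.
r = 3.9
x = 0.2
low, high = 1.0, 0.0                    # track the band the iterates occupy

for n in range(2000):
    if n == 1000:
        x = 0.999                       # the kick
    x = r * x * (1.0 - x)
    if n >= 100:                        # skip the start-up transient
        low, high = min(low, x), max(high, x)

print(f"iterates stayed within [{low:.3f}, {high:.3f}] -- no blow-up")
[/code]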
 
But in the end, it all depended on a lot of computer cycles to pull it all together into a single scenario simulating exactly what was going to happen during those last seven minutes of the delivery phase of the mission. And they apparently got it right [thumbsup2]

But I agree with your comments about "interpolation vs. extrapolation" and the need to determine how accurate your designed-for safety factors are. After all, in much of engineering it's often true that you never really know HOW MUCH until you know what's TOO MUCH. That reminds me of several books I've read by Prof. Henry Petroski, primarily about the failure of structures, usually bridges since that's his area of expertise, but he touches on many other famous, or should we say, infamous examples of engineering failures. I think one of the most profound comments that I remember him making was that we learn an order of magnitude more when something fails than when something does not. Another good treatment of a famous failure was the comments made by Dr. Richard Feynman as he described the events, which included a 'Mr. Wizard'-like experiment before the Congressional committee looking into the Challenger disaster in 1986.

John R. Baker, P.E.
Product 'Evangelist'
Product Engineering Software
Siemens PLM Software Inc.
Industry Sector
Cypress, CA
Siemens PLM:
UG/NX Museum:

To an Engineer, the glass is twice as big as it needs to be.
 
John, was it models and simulations that put Curiosity on the ground on Mars, or was it real-time on-board computing with sufficient flexibility to deal with the "unknown"?

And check out JPL's Mars travel log summary - There's a whole lotta "Failure"s in there. Although the trend is improving, I don't know if I would extrapolate that trend... [smile]
 
[ul]
[li]Historical wellhead price and consumer price data sets along with an independent forecast of those prices into the future[/li] So one of the inputs to the model was the output of another model....
[li]A detailed list of issued permits for facilities (that take up to 10 years to build after the permit is issued) with their projected completion dates and projected capacities (see the big uptick in the attachment in the Alaska data in 2019 representing the pipeline coming on line)[/li] Permits do not equal construction. Construction started does not equal construction finished. Projected capacities are again the output of another model.
[li]Independent forecasts of inflation[/li]The output of yet another model
[/ul]

And they didn't even attempt to model factors such as:
The development of technology to make "unconventional" sources more accessible.
The rule of law by an arbitrary and capricious government.

So the question to ask is not "How could such a great model be wrong?" but "How could anyone really expect that this would have any chance of being accurate?" - which I guess is your point in the first place.

I get "Can't you model it?" questions frequently from clients. My most frequent answer is "The equations to characterize the system are trivial. However the outcome is entirely dependent on initial conditions and external inputs; and we will be guessing about most of those."
 