Boeing 737 Max8 Aircraft Crashes and Investigations [Part 4]

Status
Not open for further replies.

Sparweb

Aerospace
May 21, 2003
5,131
This is the continuation from:

thread815-445840
thread815-450258
thread815-452000

This topic is broken into multiple threads because long threads with many images take a long time to scroll and load for some users and devices. If you are NEW to this discussion, please read the above threads prior to posting, to avoid rehashing old discussions.

Thank you everyone for your interest! I have learned a lot from the discussion, too.

My personal point of view, since this falls close to (but not exactly within) my discipline, is the same as that expressed by many other aviation authorities: that there were flaws in an on-board system that should have been caught. We can describe the process that "should have happened" in great detail, but the reasons the flaws were allowed to persist are unknown. They are probably too complex to reveal by pure reasoning from our position outside of the agencies involved. Rather, an investigation of the process that led to the error inside these agencies will bring new facts to light. That investigation is under way, and its results will be made public in due time. It may even reveal flaws in the design process that "should have" produced a reliable system. Every failure is an opportunity to learn, which is the mandate of the agencies that examine these accidents.

Some key references:

Ethiopian CAA preliminary report

Indonesian National Transportation Safety Committee preliminary report

The Boeing 737 Technical Site


No one believes the theory except the one who developed it. Everyone believes the experiment except the one who ran it.
STF
 

There is a path from the switches to the stab motor that is not interrupted by any processor. The blue path informs a separate module that the trim switches are engaged, but without more detail in this not-an-electrical-schematic chart it's not clear how priority is set.

Obviously they cannot be directly connected or the trim limits and cutout switches could not function.
 
The AP didn't simply stay connected. It did connect and then disconnect again. I believe it gave a warning too.

What should also be questioned is why the pilot attempted to engage the AP with the stick shaker going off.
 
Why the pilots tried it?

More than likely nothing else was working and they knew they were going to die, so they were trying anything. Funnily enough, if they had managed to get it to stay in, it would have killed the MCAS input instead of them. Manual trim not moving, plane going down, turn the electric trim back on, and it doesn't work instantly either because the processor is locked up. I have never been in an about-to-die situation in an aircraft, so I really have no clue what I would do. And I really don't know how you would simulate it for training purposes either. I suspect reading an iPad for 50 minutes followed by a 2-page AD isn't going to do anything to help the situation, though.

I am pretty sure they have just discovered it's always been overloaded in non-normal situations. It's just that they haven't tested it before. As they never bothered testing with an AoA primary instrument sensor failure, I think we can be pretty certain they haven't tested for a secondary effect.

The real issue will be if it turns out that it's overloaded in the 800 as well.

It will be some 1980s single-thread processor, possibly even of 386-equivalent power. There are still thousands of antique laptops with serial ports around maintenance hangars, running an abundance of operating systems that should be dead in any safety-critical environment, that are used to link to these things. And it's not just Boeing. All the OEMs are guilty, and Windows 95 is alive and kicking in aviation.

Normally if you tried to put the AP in with the stick shaker going, it wouldn't go in and you would get INHIBIT or something like that coming up in the position where the normal AP annunciation comes up, which tells you that the autopilot is in. No aural warning charge tone.

I suppose if this processor is overloaded it may accept the signal and show the AP as engaged, then eventually process it, say "nope, you're not having that with the stick shaker" and kill it. Same with keeping it in for a period with the stick shaker going: it just doesn't get round to processing that it should kill it. How they deal with the priorities and errors in those modules I have zero clue. If it's signal in, action, then start the next item, it's pretty easy to see it getting swamped and not doing what it should for seconds between inputs while it completes everything else in the stack, which is always full.

So it might explain the AP staying in with the stick shaker going: it's just processing all the other stuff before it gets to "input: stick shaker active; output: AP disconnect". Then it starts going through the other stuff again. The pilot activates the AP, and eventually it gets round to processing "autopilot active" and sends a kill signal again. By this point the activation loop in the AP has timed out with no negative "don't engage" signal, so it's back in and sends a signal. But this gets processed after the kill signal has been sent. Now everything is confused and all timed responses are being missed, so nothing knows what's going on, including the pilots.
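To put rough numbers on the "swamped queue" idea above, here is a minimal sketch of a strictly first-in, first-out, single-threaded loop. Everything in it is an assumption for illustration only: the 100 ms message spacing, the per-item service times, and the event name `stick_shaker_active` are made up, and nothing here reflects how the actual flight control computer schedules its work.

```python
def event_latency_ms(service_ms, watched="stick_shaker_active"):
    """Return how long the watched input waits in a strict FIFO loop."""
    # Hypothetical traffic: one background message every 100 ms for 5 s.
    arrivals = [(t, "background") for t in range(0, 5000, 100)]
    # The input we care about arrives at t = 2000 ms.
    arrivals.append((2000, watched))
    arrivals.sort(key=lambda item: item[0])  # stable sort: background first on ties

    clock = 0  # time at which the processor next becomes free
    for arrived, name in arrivals:
        start = max(clock, arrived)  # wait if the processor is still busy
        clock = start + service_ms   # every item costs the same time in this sketch
        if name == watched:
            return clock - arrived
    raise ValueError("watched event never arrived")


healthy = event_latency_ms(service_ms=50)      # processor about 50% loaded
overloaded = event_latency_ms(service_ms=120)  # processor about 120% loaded
print(healthy, overloaded)  # prints: 100 640
```

At 50% load the watched input is handled within 100 ms of arriving. At 120% load the backlog has already grown to the point where the same input waits 640 ms, and the delay keeps growing the longer the overload lasts.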

To note, they discovered this on the equipment simulator at Boeing, not the flight sim the pilots use for training.

I really hope they publicly document all that they find wrong with the MAX so that history can't repeat itself.

I don't for one second believe this is the final issue that they will find.




 
The processor problem appears to be on the new software. There is no reason for Boeing or the FAA to characterize the old software performance as that is certainly never to be flown again.

The trim switch worked to produce stab trim change proportional to the duration every time it was used on all three flights.

Summarized from the Ethiopian Preliminary report:

The Ethiopian pilots never fully returned a large excursion in trim and then, after exceeding Vmo, took a couple of stabs at the switch that were too short to move the stab enough before allowing MCAS to run unopposed.

The ET302 pilots attempted to engage autopilot immediately after getting the stick shaker.

The stick shaker activated at 05:38:44 and remained on until G reversal just before impact.
At 05:38:58, the PIC (pilot in command) called to engage the autopilot and got an AP warning.
At 05:39:00, the PIC called for it a second time and one second later got another AP warning.
The data doesn't show if a switch was pressed to engage the autopilot, so the one-second delay may be how long the copilot took to respond.
Just after this, the plane responded to a manual electric trim change.
At 05:39:22, they engaged the AP.
At 05:39:55, the autopilot disengaged.
At 05:40:00, at the five-second design interval, MCAS made the first trim down.
At 05:43:11, about 32 seconds before the end of the recording, at approximately 13,400 ft, two momentary manual electric trim inputs are recorded in the ANU direction. The stabilizer moved in the ANU direction from 2.1 units to 2.3 units.
At 0.1 unit per click they needed only to click that switch 20 times to get back in trim. Instead they left trim engaged and did not oppose the final MCAS trim.
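The intervals and the trim arithmetic in the summary above can be checked with a short script. It uses only the times and stabilizer positions quoted in this post; the 0.1 unit per input figure is inferred from two recorded inputs, so treat the "20 clicks" extrapolation as a rough estimate rather than a measured rate.

```python
from datetime import datetime

FMT = "%H:%M:%S"


def seconds_between(start, end):
    """Whole seconds from one HH:MM:SS timestamp to another on the same day."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


# Times as quoted above from the Ethiopian preliminary report summary.
shaker_to_first_ap_call = seconds_between("05:38:44", "05:38:58")
ap_engaged_duration = seconds_between("05:39:22", "05:39:55")
disengage_to_mcas = seconds_between("05:39:55", "05:40:00")
shaker_to_last_trim = seconds_between("05:38:44", "05:43:11")

# Two momentary ANU inputs moved the stabilizer from 2.1 to 2.3 units.
units_per_input = round((2.3 - 2.1) / 2, 3)
units_from_20_inputs = round(20 * units_per_input, 3)

print(shaker_to_first_ap_call, ap_engaged_duration, disengage_to_mcas)  # 14 33 5
print(shaker_to_last_trim)                      # 267
print(units_per_input, units_from_20_inputs)    # 0.1 2.0
```

So the autopilot stayed engaged for 33 seconds, MCAS acted 5 seconds after it dropped out, and 20 inputs at the observed rate would correspond to about 2.0 units of stabilizer travel.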

All in all, there is no evidence that the CPU had any delays in processing commands on the Ethiopian accident flight.
 
Where does the FDR get its data from?

Is it the cockpit switch position?

Is it taken before the processor, after it, or at the feed to the jackscrew?

Not that it really matters what they did, to be honest. It's not going to change the recertification process. Or, for that matter, stop changes to how future aircraft are certified.
 
LOL, WTF does that rant have to do with the crash? The pilots were attempting to engage the AP before the first MCAS activation and long before they put the plane into serious trouble.
 
In general, it's not uncommon to include a mandatory requirement that the CPU or processor shall not be loaded more than 50%.

(Actually, this requirement is too frequently mangled to read "shall provide 50% growth", which can be (mis)interpreted in various ways.)

Anyway, if they found an explicit CPU functional performance issue during system test (which would imply at least 100% loading), then that perhaps opens up another whole can of worms.
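As a rough illustration of why the two wordings differ, the sketch below sums utilization over a periodic task set and checks it against both readings. The task names, execution times, and periods are entirely hypothetical, and "Reading B" is just one of the possible misinterpretations, not a quote from any real specification.

```python
# Hypothetical task set: (name, worst-case execution time in ms, period in ms).
TASKS = [
    ("sensor_poll", 2, 20),
    ("control_law", 5, 20),
    ("monitoring", 3, 50),
]


def utilization(tasks):
    """Fraction of processor time consumed by a set of periodic tasks."""
    return sum(wcet / period for _, wcet, period in tasks)


def meets_max_load(load, limit=0.50):
    """Reading A: the processor shall not be loaded more than 50%."""
    return load <= limit


def meets_growth(load, growth=0.50):
    """Reading B (a possible misreading): today's load could grow by 50%
    and still fit, i.e. load * 1.5 <= 100%."""
    return load * (1 + growth) <= 1.0


base = utilization(TASKS)                                  # about 0.41
grown = utilization(TASKS + [("added_function", 4, 20)])   # about 0.61

print(meets_max_load(base), meets_growth(base))    # True True
print(meets_max_load(grown), meets_growth(grown))  # False True
```

With one extra function added, the same box fails the "not more than 50% loaded" reading but still passes the "50% growth" reading, since that reading only caps the load at roughly 67%.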

 
They will be doing system tests on old software and new software.

Until the new software gets released to the FAA, I doubt government test pilots will get to look at it. It will only be company TPs until it's a finished product.


Both accidents are still under investigation, and realistically the Ethiopian or Indonesian accident investigation bodies won't have the skill sets to do most of the technical hardware investigation, so in full compliance with international treaties they have requested the FAA, as the certifying authority, to help. And I suspect the FAA is more than willing to help to get to the bottom of this and try to regain some trust from the other authorities. So there will be parallel testing going on of the original accident configuration and the fix.

I suspect the processor was at 50% when it was fitted to the Classic in normal ops. It then increased significantly when they added STS on the NG, and increased again with the addition of MCAS in normal operations. Stick in a failed AoA and non-normal ops and it's utterly overloaded. I don't have a clue how good the thermal dissipation properties of said box are; it could very easily be throttling back due to overheating as well.

"(Actually, this requirement is too frequently mangled to read "shall provide 50% growth", which can be (mis)interpreted in various ways.)" :)

It was at 50% when first certified in the 1990s; the growth since then has been STS being added and then MCAS. So grandfather rights were claimed so as not to have to get a new processor and certify it... And when they go and check it in the 800, they find it's running at 70% in normal ops with just STS running extra.
 
A question:
In the event that Boeing had developed a higher, extendable, landing gear, would that have triggered a need for simulator training?

A correction. I mistakenly understood that Southwest Airlines had ordered 100 Max aircraft.
On checking I see that Southwest Airlines have 200 Max aircraft on order.
If new landing gear required simulator training, it would cost $200,000,000 in rebates to Southwest alone.


Bill
--------------------
"Why not the best?"
Jimmy Carter
 
Don't forget 80k x 200 because they paid for the AoA indicator which is now fitted as standard.

Anyway some more info on the latest processor overload situation.

Apparently it can get overloaded without MCAS triggering. And the plan is to spread the load over more "boxes", which I suspect will open a barrel of worms, never mind a can of them.


"In the event that Boeing had developed a higher, extendable, landing gear, would that have triggered a need for simulator training?"

Depends on whether the procedures had changed or there was anything special about it. But I would be surprised if it would on its own. If there was a different setup for the hydraulics to run it, which meant different QRH procedures, then yes.

I think you can be pretty certain that the FAA will say that ground school and sim training will be required.
 
waross,

I can see you like that solution, but the magnitude is large.

From what I can gather, the 737 would need to be higher by something on the order of 600 to 700 mm.

That's an awful lot of extension.

Picture is the difference for A320 to 737

737_height_yr498a.jpg


Remember - More details = better answers
Also: If you get a response it's polite to respond to it.
 
There's a key sentence - "The deliberately broken microprocessor had become overwhelmed ..."

It appears the FAA is going to chase every single-point failure consequence and probably look to make sure pilots can ignore all the warnings. It's a good thing for really lousy reasons.

You know that ET pilots got neither ground school or sim training about the AD, right? So why would they need more training?
 
I read that "deliberately broken" to mean they intentionally fed a lot of inputs into it to see if it could keep up with the commands; not that the processor was physically broken or impaired?

"test pilots flew a scenario causing a fault in a microprocessor..."
That sounds like too many commands, not intentional hardware failure to see the effects
 
No, they inserted a fault; two paragraphs above that sentence, it says

"...intentionally broke part of the 737 Max's flight control computer."

That's actually pretty sound regression testing. The "had become overwhelmed" is the truly scary part, since it's more likely that the processor fault tolerance mechanism had failed to do its job.

TTFN (ta ta for now)
I can do absolutely anything. I'm an expert! faq731-376 forum1529 Entire Forum list
 
Too bad they don't really clarify. How can "too many commands" be fed into it? It should only happen if they create a defect somewhere else, as the normal process is polling the various inputs, preventing "too many commands" from ever happening. The FCC should be a state machine and not able to ever see too many commands; the worst case is a state input that changes faster than the system can respond, like if someone applied millisecond-duration switch inputs such that the average status was ambiguous for long periods of time.
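A minimal sketch of the polling argument, assuming a hypothetical 20 ms frame and a made-up switch that flips every 7 ms: however fast the input changes, a polled loop only ever looks at it once per frame, so the work per second stays fixed. What suffers is the accuracy of the sampled state, not the processor load.

```python
def switch_state(t_ms, toggle_ms=7):
    """Hypothetical switch that flips state every toggle_ms milliseconds."""
    return (t_ms // toggle_ms) % 2 == 1


def count_edges(duration_ms):
    """Number of actual state changes, checked at 1 ms resolution."""
    return sum(
        switch_state(t) != switch_state(t - 1) for t in range(1, duration_ms)
    )


def poll(duration_ms, frame_ms=20):
    """Sample the switch once per frame, as a polled state machine would."""
    return [switch_state(t) for t in range(0, duration_ms, frame_ms)]


edges = count_edges(1000)   # what the switch really did in one second
samples = poll(1000)        # what the polled loop had to process

print(edges, len(samples))  # prints: 142 50
```

The switch changes state 142 times in the second, but the loop handles exactly 50 samples either way. The sampled values flip between on and off from frame to frame, which is the "ambiguous average status" case rather than an overload.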

Rumor reporting is really not helping and the FAA should be 100% transparent about all the tests they are doing.

I think I'll just discount other sources as entirely unreliable and set pitch and power.
 
"You know that ET pilots got neither ground school or sim training about the AD, right? So why would they need more training? "

They definitely won't need any more training because they are dead. License revoked.

European MAX drivers just got the AD emailed to them by their flight safety officer. And they had done the same iPad training.

Once the aircraft is eventually released for service again, if the FAA sticks with the rest of the world's CAAs there will be differences training encompassing more than just an iPad course. It will involve sim work.

Now that Sully has spoken his mind, they will have to do it anyway, even if it's just one session for the American market and even if the FAA doesn't think it's required. EASA is pretty much certain to require it whatever happens.

Another article which gives a few more differences between the MAX and NG



Just read it all the way through; it's not 100% accurate or fair, to be honest. But there is some good stuff about the additional flight control features that are going on which haven't been mentioned yet in the threads.
 
The article is unclear, and that's intentional on someone's part. The relevant symptom, and the only other "fact" is

"The test pilot initiated the runaway stabilizer trim checklist, according to the people, but found the electric trim switches on the pilot's yoke unresponsive as the stabilizer continued to force the jet's nose down even further." (https://theaircurrent.com/aviation-safety/faa-and-boeing-initially-disagreed-on-severity-of-catastrophic-737-max-software-glitch/)

All the other words in the surrounding paragraphs are suppositions and speculations; until Boeing and the FAA go through the datalogs in detail, we only have two facts: a fault was inserted to induce the stabilizer to pitch the plane down, and the switches to stop that were unresponsive. There are dozens of possibilities.

Note that this was likely a new fault insertion, since they likely chose one that was different than any they might have tested in the past. Fault insertion testing is an extremely slow process, and it's likely no more than a few dozen faults were ever tested in the past by the FAA.

TTFN (ta ta for now)
I can do absolutely anything. I'm an expert! faq731-376 forum1529 Entire Forum list
 
"...someone applied milli-second duration switch inputs..."

If that was an actual issue, then the system designer needs to go back to school to learn about debouncing inputs.

It's standard and essential design practice to debounce switch inputs, because switch contacts do bounce, irrespective of how someone pushes them.

In other words, that's probably not a good example towards your point.
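For anyone unfamiliar with the term, here is a minimal sketch of one common debouncing approach: a counter that only accepts a new state after it has been seen on several consecutive samples. The class name `Debouncer` and the threshold of 3 samples are illustrative choices, not taken from any aircraft system.

```python
class Debouncer:
    """Accept a new switch state only after `threshold` consecutive samples."""

    def __init__(self, threshold=3, initial=False):
        self.threshold = threshold
        self.stable = initial  # last accepted (debounced) state
        self._count = 0        # consecutive samples disagreeing with stable

    def update(self, raw):
        if raw == self.stable:
            self._count = 0    # any agreeing sample resets the counter
        else:
            self._count += 1
            if self._count >= self.threshold:
                self.stable = raw
                self._count = 0
        return self.stable


# Raw samples from a hypothetical bouncing contact being pressed.
raw_samples = [False, True, False, True, True, True, True, False, True, True]

debouncer = Debouncer(threshold=3)
debounced = [debouncer.update(sample) for sample in raw_samples]
print(debounced)
# [False, False, False, False, False, True, True, True, True, True]
```

The early chatter and the single dropout near the end never reach the output; the logic downstream sees one clean transition from off to on.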


 
I thought it was pretty clear from the article that they performed a fault insertion test, something that would result in the flight computer forcing the nose down; the expectation was that the test pilot would be able to disable the effect of the fault and recover the plane.

It's certainly possible that the inserted fault had an unintended consequence, or that the simulator no longer matches the configuration on which a similar fault was previously tested and averted.

TTFN (ta ta for now)
I can do absolutely anything. I'm an expert! faq731-376 forum1529 Entire Forum list
 