Monday, 20 May 2019

Boeing 737 Max 8 Fiasco Broadens

Being an ex-coder (yes, I learned to code and made a living out of it) who has also worked for American companies, I can sympathise somewhat with the problems surrounding the Boeing 737 Max 8 and its MCAS software. I can also see how the issues surrounding the 737 Max 8 coding are pertinent to software systems in other vehicles, including cars.

First off, I pray that no one person messed up coding the MCAS system software and it was all written to specification. God help the person or persons on the MCAS team if they didn't code to spec.

I'm not a flyer, I'm a sailor. But in sailing, everything is duplicated or triplicated in order to add safety. Even when the threat is only getting wet (albeit miles from land) you still make sure that you have at least two means of propulsion. The auxiliary engine may be smaller and get you there slower, but it gets you there, rather than being stuck. You always take bungs with you just in case a hull fitting breaks or leaks. There's always a backup. When the risk is dropping out of the sky at hundreds of miles an hour and there's a greater risk of death, then the redundancy should go up.
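
To put the "duplicated or triplicated" idea into coder's terms, here's a minimal sketch in Python of how three redundant readings might be cross-checked so that one faulty source can be out-voted. Everything in it is my own illustration: the `vote` function, the `tolerance` parameter and the sample numbers are all hypothetical, and it is not a description of how any real aircraft or boat system works.

```python
from statistics import median


def vote(readings, tolerance):
    """Pick a value from redundant readings and flag any that disagree.

    Hypothetical illustration only: with three or more readings, the
    median is unaffected by a single wildly wrong source.
    """
    if len(readings) < 3:
        raise ValueError("need at least three readings to out-vote one fault")
    agreed = median(readings)
    # Indices of readings that sit further than `tolerance` from the median.
    suspects = [i for i, r in enumerate(readings) if abs(r - agreed) > tolerance]
    return agreed, suspects


# Made-up sample data: two sources agree, the third has gone haywire.
value, suspects = vote([4.9, 5.1, 74.5], tolerance=2.0)
print(value, suspects)
```

The point of the design is the same as carrying an auxiliary engine: the system keeps a usable answer when one source fails, and it knows which source to distrust.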

Having been in many technical spec meetings, I can well imagine a scenario where the MCAS system's inputs and outputs were specified without full knowledge of how they would affect the aircraft as a whole. For instance, it's now known that if MCAS increases stab trim to maximum, it becomes virtually impossible to adjust the trim manually when following the "runaway stabiliser trim" procedure in the cockpit, because of air pressure pushing on the stabiliser. That makes it virtually impossible to return the stabiliser trim to a safe position once MCAS has done its thing.

I've had very heated meetings where I've had to assertively point out the implications of a software change for delivery times, for cost and for the other parts of the software system affected by that single, apparently insignificant change (nothing, including time, is free).
It didn't win me friends in management at the time; that only came when the system actually worked as promised, within timescales. But the testament to my skills is that a number of the systems I wrote worked from day one without issues and are still running virtually unchanged decades later. Pretty rare in software circles. But that's where understanding the system as a whole, outlining the issues correctly and coming up with a deliverable plan helps. After all, you can code a basic system to run initially. The bells and whistles can always be added later, as long as you write the code expecting them.

Now, with Boeing, comes the revelation that their 737 Max 8 simulator cannot replicate the cockpit conditions when MCAS is activated. It seems that the manual stab trim wheels in the simulator come nowhere near replicating the forces required to manually move the stabiliser. Yet again, it seems to be an issue with software.

Thankfully, being a simulator issue, this time it's not fatal.

But it does underline the point that software engineers need to appreciate the systems they are working on AS A WHOLE, especially (but not exclusively) when interfacing with and commanding physical systems. You cannot develop a software system in isolation from the rest of the system. Nor can it just be plonked on top of existing systems without reference to what is already there: the software, the hardware and the existing man-machine interface.

The same goes for autonomous vehicles being developed right now. We've already seen issues with the control systems that are supposed to avoid crashes. It becomes a very salient point, given that we are putting our lives in the hands of these systems both in the air and on the ground.
