Which would be quite odd, because "seeing" a static object and then determining its distance should be about the easiest part of the whole system. In fact, AI should barely come into it. 3D positioning of objects and of the vehicle should be the easy part of self-driving cars; that aspect isn't traditionally referred to as AI.
Maybe you should apply to work for Tesla? They've been at it for many years and still can't get it right.
Both Tesla and Uber seem to be working on an object-identification-first principle, not a "let's avoid any object that appears in the car's path" principle. Knowing how to react seems much simpler if you start with a list of object types, each tied to the reaction it needs. However, it breaks down when you fail to identify the object correctly.
The Uber accident report clearly showed that the AI was determining what the "object" was on each scan and then deciding what to do about it on each scan. There seemed to be neither history tracking of objects as the classification changed nor any kind of "avoid any object that will be in the car's path" logic. With the lidar system this sounds very simple: track the fact that there is an object and that it is on a collision course. But they obviously struggled with it.
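To make that concrete, here's a minimal sketch of what I mean by "track it and brake if it's on a collision course, whatever it is." This is not Uber's actual logic; the scan rate, thresholds, and field names are all made up for illustration.

```python
# Rough sketch: keep a persistent track for every lidar return cluster,
# regardless of how the classifier labels it, and brake if any track is on a
# collision course. All constants below are invented for illustration.

from dataclasses import dataclass, field

SCAN_PERIOD_S = 0.1          # assumed lidar scan interval
BRAKE_TTC_S = 2.0            # brake if predicted impact is this close
LANE_HALF_WIDTH_M = 1.5      # assumed half-width of the car's path

@dataclass
class Track:
    x: float                 # longitudinal distance ahead of the car (m)
    y: float                 # lateral offset from the car's centerline (m)
    vx: float = 0.0          # velocity estimated from successive scans
    vy: float = 0.0
    label_history: list = field(default_factory=list)  # classifier output per scan

    def update(self, x, y, label):
        # Velocity from the position change between scans; the classifier's
        # label is recorded but never used for the collision decision.
        self.vx = (x - self.x) / SCAN_PERIOD_S
        self.vy = (y - self.y) / SCAN_PERIOD_S
        self.x, self.y = x, y
        self.label_history.append(label)

    def collision_course(self):
        # If the object is closing on the car, estimate when it reaches us
        # and whether it will be inside our path at that moment.
        if self.vx >= 0:                       # not getting closer
            return False
        ttc = -self.x / self.vx                # time until it reaches x = 0
        y_at_impact = self.y + self.vy * ttc
        return ttc < BRAKE_TTC_S and abs(y_at_impact) < LANE_HALF_WIDTH_M


def should_brake(tracks):
    # "Avoid any object that will be in the car's path," whatever it is.
    return any(t.collision_course() for t in tracks)
```

The point is that the brake decision never looks at the classification, so a pedestrian flickering between "vehicle," "bicycle," and "other" from scan to scan still gets avoided.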
It must be difficult to use multiple cameras and process the images into a 3-D map of the environment, or else Tesla would be doing it. They'd have a constantly updating 3-D map that could be used to keep the car from crashing into anything. Instead, they seem to first detect what the object is and then decide what to do about it. In the case of the truck accident, the processing determined that the truck trailer was an overhead road sign, and overhead road signs were programmed to be of no concern. Tracking the history of the rest of the truck is something a human would do automatically, but it has to be explicitly programmed into a computer. I'm sure they have tracking for some objects in the logic, but obviously only for the cases they have thought about. For example, tracking the expected path of a pedestrian who leaves the sidewalk heading toward the road, to ensure the car doesn't hit them. Once again, though, it can fail when the object detection fails.
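For what it's worth, the geometry of getting 3-D positions from two cameras is textbook stuff. This is just the standard stereo relationship, not anything Tesla has published about their pipeline, and the focal length and baseline numbers are invented:

```python
# Minimal sketch of how two cameras give 3-D positions: for a point matched
# in both images, depth falls out of the disparity between the two views.

import numpy as np

FOCAL_PX = 1000.0       # focal length in pixels (assumed)
BASELINE_M = 0.3        # distance between the two cameras (assumed)

def pixel_to_3d(u, v, disparity_px, cx=640.0, cy=360.0):
    """Back-project one matched pixel into camera coordinates (meters)."""
    z = FOCAL_PX * BASELINE_M / disparity_px   # depth from disparity
    x = (u - cx) * z / FOCAL_PX                # lateral offset
    y = (v - cy) * z / FOCAL_PX                # vertical offset
    return np.array([x, y, z])

# A feature matched 20 pixels apart between the two cameras is about 15 m away:
print(pixel_to_3d(700, 360, disparity_px=20.0))
```

The hard part isn't this formula; it's reliably matching points between views in real time and keeping the resulting map consistent as the car moves.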
Now, if you think about it, you can understand why they aren't acting on every detection of an "object" possibly in the car's path. A false detection of an object of concern leads to unnecessary avoidance, and they are trying to avoid turning or braking unless it is actually necessary. Falsely detecting an overhead sign as a truck in the car's path leads to the car coming to a sudden stop on a high-speed road, where it is highly susceptible to being rear-ended.
Is it possible to know the total system experience? All sensory inputs, all decisions/branches, outputs/commands, exactly as they occurred in the incident? Can we "know" what the car "knew"? Pardon the imprecise terminology.
I think they can determine what the processing decisions were, but they can't determine why the car got the detection of the plane wrong. My understanding is that with these image-learning and classification systems, you can't find out what features the processing keyed on as important when, during model building, it learns that a group of images equals a certain type of object. You also can't find out what parts of the image it used to determine what the object was when detecting.
You can't log the image-processing logic; it's working too fast. I've been playing with a $40 Google Coral processor for security cameras, and it processes images about 10x faster than a new CPU can. These cars use processing units with hundreds of parallel processors, each of which is much more capable than this Google one. The amount of data crunching going on is far beyond any logging capability. Then there's the issue that when you record the camera images and feed them into an image processor again later, you just might get a different result.
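Some back-of-envelope numbers make the logging problem obvious. The camera count, resolution, and frame rate below are guesses, not Tesla's or Uber's actual specs:

```python
# Rough data-rate estimate for why full logging of the vision pipeline is
# impractical. All figures are assumptions for illustration only.

cameras = 8
width, height, channels = 1280, 960, 3   # one byte per channel at 8 bits
fps = 30

raw_bytes_per_s = cameras * width * height * channels * fps
print(f"Raw camera input: {raw_bytes_per_s / 1e9:.1f} GB/s")   # roughly 0.9 GB/s

# The intermediate activations inside a neural network are typically many
# times larger than the input frame, so logging every layer for every frame
# multiplies that figure again.
activation_blowup = 20      # rough multiplier, assumed
print(f"With activations: {raw_bytes_per_s * activation_blowup / 1e9:.0f} GB/s")
```

Even the raw frames alone would fill a terabyte drive in well under half an hour, and that's before you log anything the network actually computed from them.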