Self-Driving Cars: What Could Go Wrong?

A new study exploring how self-driving cars avoid pedestrian collisions draws an unfortunate conclusion: the darker your skin, the more likely you are to be hit.

The study, published in February, suggests an inherent algorithmic bias in the technologies guiding these vehicles. The bias stems from the data sets used to train the systems.

As Sigal Samuel explains at Vox:

[T]he authors of the self-driving car study note that a couple of factors are likely fueling the disparity in their case. First, the object-detection models had mostly been trained on examples of light-skinned pedestrians. Second, the models didn’t place enough weight on learning from the few examples of dark-skinned people that they did have.

More heavily weighting that sample in the training data can help correct the bias, the researchers found. So can including more dark-skinned examples in the first place.
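To make the "weighting" fix concrete, here is a minimal sketch of one common way to do it: inverse-frequency weighting, where each example's loss is scaled so that every group counts equally in total. This is an assumption for illustration, not the study's actual method. The group labels, loss values, and function names below are all hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(groups):
    """Return one weight per example so every group has the same total weight."""
    counts = Counter(groups)
    n_groups = len(counts)
    total = len(groups)
    # A group that is rare in the data gets a proportionally larger weight.
    return [total / (n_groups * counts[g]) for g in groups]


def weighted_mean_loss(losses, weights):
    """Average the per-example losses, scaling each one by its weight."""
    return sum(l * w for l, w in zip(losses, weights)) / sum(weights)


# Hypothetical toy data: 8 light-skinned examples the model handles well,
# 2 dark-skinned examples it handles badly.
groups = ["light"] * 8 + ["dark"] * 2
losses = [0.1] * 8 + [0.9] * 2

weights = inverse_frequency_weights(groups)
unweighted = weighted_mean_loss(losses, [1.0] * len(losses))
reweighted = weighted_mean_loss(losses, weights)
```

With equal weights, the average loss on this toy data is about 0.26, so the poor performance on the two dark-skinned examples is mostly drowned out. With the reweighting, it rises to about 0.5, which gives a training procedure far more reason to fix those errors.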

Basically, algorithmic systems learn from the datasets fed to them and then extrapolate from what they have learned. If the data isn’t diverse, the system won’t learn diversity. See, for example, Google circa 2015, when its Photos app identified black people as “gorillas.”

There’s a caveat to the study, though, one that illustrates another problem and provides a lesson. The study’s authors didn’t have access to the actual object-detection models used by companies making self-driving cars. These are carefully guarded trade secrets, the special sauce of each company’s machine learning.

Instead, the authors used what’s available to researchers studying such issues, in this case, the Berkeley Driving Dataset.

So, problem: companies developing self-driving cars guard their models and data. This is more or less understandable but leads to public health and legal conundrums. When the public doesn’t have access to how the cars make life-and-death decisions (such as whether to hit the cat or the dog, or the old man or the young child, in a split-second incident where a collision is bound to happen), how is it to determine a host of issues ranging from when and where these vehicles can operate to who’s culpable in the event of a tragic accident?

Let’s back up a minute though. Self-driving cars use a few main systems to perform as they do. The Economist explains it like so:

The computer systems that drive cars consist of three modules. The first is the perception module, which takes information from the car’s sensors and identifies relevant objects nearby… Cameras can spot features such as lane markings, road signs and traffic lights. Radar measures the velocity of nearby objects. LIDAR determines the shape of the car’s surroundings in fine detail, even in the dark. The readings from these sensors are combined to build a model of the world, and machine-learning systems then identify nearby cars, bicycles, pedestrians and so on. The second module is the prediction module, which forecasts how each of those objects will behave in the next few seconds. Will that car change lane? Will that pedestrian step into the road? Finally, the third module uses these predictions to determine how the vehicle should respond (the so-called “driving policy”): speed up, slow down, or steer left or right.
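The three-module structure described above can be sketched in a few lines. This is a toy illustration under heavy assumptions, not how any real vehicle works: the `TrackedObject` fields, the constant-velocity forecast, the 3-second horizon, and the 5-metre safe gap are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class TrackedObject:
    kind: str                 # e.g. "pedestrian" or "car"
    distance_m: float         # gap between the vehicle and the object, in metres
    closing_speed_mps: float  # positive means the gap is shrinking


def perceive(sensor_readings):
    """Perception: turn readings into labeled objects (a trivial stand-in here)."""
    return [TrackedObject(**reading) for reading in sensor_readings]


def predict(objects, horizon_s=3.0):
    """Prediction: assume constant speed and forecast each gap a few seconds out."""
    return [(obj, obj.distance_m - obj.closing_speed_mps * horizon_s)
            for obj in objects]


def decide(predictions, safe_gap_m=5.0):
    """Driving policy: slow down if any forecast gap falls below the safe gap."""
    if any(gap < safe_gap_m for _, gap in predictions):
        return "slow down"
    return "maintain speed"


def drive_step(sensor_readings):
    """Run one pass through all three modules."""
    return decide(predict(perceive(sensor_readings)))


# Hypothetical readings: a pedestrian 12 m away with the gap closing at 4 m/s,
# and a car 50 m away with the gap closing at 2 m/s.
readings = [
    {"kind": "pedestrian", "distance_m": 12.0, "closing_speed_mps": 4.0},
    {"kind": "car", "distance_m": 50.0, "closing_speed_mps": 2.0},
]
action = drive_step(readings)
```

Note what happens if the pedestrian never makes it out of `perceive`: the later modules have nothing to forecast or react to, and the sketch carries on at speed. That is why a detection model that misses some pedestrians more often than others matters so much.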

Here’s a helpful illustration from The New York Times:

Now, the lesson. Other industries, such as pharmaceuticals, guard their special sauce, but their products also, ostensibly, go through rigorous trials before being brought to market.

Our software, our AI, needs to go through something similar too. States are implementing rules and regulations around autonomous vehicles, but these are largely mechanical.

Whether the lesson is learned from the drug industry or elsewhere, there needs to be greater transparency that allows for deep investigations into algorithmic bias of all types before self-driving cars appear in the streets.

Because, of course, we have well-intentioned yet face-palming initiatives such as this:

Image: Photo by Gareth Harrison on Unsplash