AI’s Kobayashi Maru

Jonathan Salem Baskin
5 min read · Sep 23, 2024


Imagine a no-win situation in which you must pick the least worst option.

It’s the premise of a training exercise featured in Star Trek II: The Wrath of Khan, in which a would-be captain must decide whether or not to save the crew of a freighter called the Kobayashi Maru.

It’s also a useful example of the challenges facing AI. Imagine this thought experiment:

You and your family are riding in an automated car as it speeds along a highway lined by a park filled with other families enjoying a pretty day. Suddenly, a crane at a nearby construction project flings a huge steel beam, which crashes to the ground a few feet ahead. Hit it and all of you will be harmed and possibly die. Swerve to avoid it and your car will plow into the crowd along the road, also harming or killing people.

What does your car’s AI decide to do?

You could imagine any number of other instances wherein a decision must be made between horrible options. An airplane that is going to crash somewhere. A train approaching a possible wreck. A stressed electrical grid that has to choose which hospitals to juice. Hungry communities that won’t all get food shipments.

What will AI do?

The toffs promoting AI oversight of our lives have two answers:

First, they say that such crises will never happen because there won’t be any surprises anymore.

Nobody will be surprised by the steel beam because sensors will note when the crane starts losing its grip (or even earlier, when the crane starts its lift). Potential arcs of flight and falling will be calculated and communicated to all vehicles in the area, so they’ll automatically adjust their speeds and directions to stay clear of the evolving danger.

Picnickers’ smart devices will similarly warn them. Maybe the crane will be commanded to tighten its grip, or simply stop what it’s doing before anything goes wrong.

Ditto for that airplane, since the potential for whatever issue might cause it to crash would have been identified long ago and adjustments made accordingly. AI will always give us the best outcomes, so we’ll never have to face anything less.

AIs will give us a connected world wherein every exception is noted and tracked. Every possibility considered. Every action optimized for safety and efficiency.

The projected date for the arrival of that nirvana?

Crickets.

The likelihood that any system would work perfectly in any situation every time?

More crickets.

So, in the meantime, a second answer to the crisis question is that AIs would be coded to make the best decisions in those worst situations. They wouldn’t be perfect and not everyone would be happy with the outcomes, but they would maximize the benefits while minimizing the harm.

This has unaffectionately been dubbed “the death algorithm,” and it speaks to a common belief among tech developers that they can answer messy moral questions with code.

And it should scare the hell out of you.

The premise that a roomful of geeks who never took a liberal arts class in college could decide what’s best for others is based on a philosophy called “Effective Altruism,” which claims on its website that it can use “evidence and reason to figure out how to benefit others as much as possible.”

In our steel beam experiment, that would mean calculating the values of each variable — the costs of cleaning up various messes, the damage to future quality of life for commuters and, yes, deciding whose lives represent the greatest potential benefits or costs to society — and then deciding who lives or dies.

Morality as computer code that maximizes benefits while minimizing harm. It’s simple.

Not.
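To see just how much gets papered over, here’s a deliberately naive sketch, in Python, of what a maximize-benefit, minimize-harm calculation might reduce to. Every number and label in it is invented for illustration; no real system publishes values like these.

```python
# A deliberately naive "death algorithm" sketch. All figures are
# invented for illustration; nothing here reflects any real system.

from dataclasses import dataclass


@dataclass
class Outcome:
    name: str
    fatality_risk: float           # assumed probability someone dies
    people_affected: int           # how many people are in harm's way
    assumed_value_per_life: float  # the morally fraught part

    def expected_harm(self) -> float:
        # "Harm" collapses into a single number: risk x people x value.
        return self.fatality_risk * self.people_affected * self.assumed_value_per_life


# The steel-beam scenario, reduced to two options and some guesses.
options = [
    Outcome("hit the beam", fatality_risk=0.4, people_affected=4, assumed_value_per_life=1.0),
    Outcome("swerve into the crowd", fatality_risk=0.3, people_affected=12, assumed_value_per_life=1.0),
]

# The algorithm "decides" by picking whichever option scores the lowest.
decision = min(options, key=lambda o: o.expected_harm())
print(f"Chosen action: {decision.name} (expected harm = {decision.expected_harm():.2f})")
```

Every input in that sketch is a guess dressed up as a measurement, which is exactly the problem.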

How do you calculate the value of a human life? Is the kid who might grow up to be a Nobel Prize winner more valuable than the kid who will likely be an insurance salesman? Would those predictions be influenced by valuations of how much they’d improve the quality of their communities, let alone help make their friends and family members more accomplished (and happier) in their lives?

How far would those calculations look for impact? After all, we’re all already connected — what we choose to do impacts others, whether next door or on the other side of the planet, however indirectly — and sometimes the smallest trigger can have immense implications.

And would the death algorithm’s assessments of present and potential future value be reliable enough to be the basis for life-or-death decisions?

Crickets.

Well, not exactly: Retorts from AI promoters range from “it’ll never come to that,” which is based on the nonsense I noted in Answer #1, to “hey, it can’t be worse than human beings who make those awful and imperfect decisions every day,” which refers back to Answer #2’s presumption that the subjectivity of morality can be deconstructed into a set of objective metrics.

A machine replacing a human being who’s trying to make the best decision they can is not necessarily an improvement, since we can always question its values just as we do one another’s.

It’s just messy analog lived experience masquerading as digital perfection.

The truly scary part is that the death algorithm is already a thing, and more of it is coming soon.

Insurance companies have been using versions of it for years, only they’re called “actuarial tables.” Now, imagine the equation being applied more consistently, perhaps even in real-time, as your driving or eating habits result in changes to your premium payments or limits to your choices (if you want that steak, you’ll have to buy a waiver).

Doctors already use versions of a death algorithm to inform recommendations on medical treatments. Imagine those insights being informed by assessments of future worth — does the risk profile of so-and-so treatment make more cents [typo intended] for that potential Nobel Prize winner — and getting presented not with options but unequivocal decisions (unless you can pay for a waiver).

Applying to college? AI will make the assessment of future students (and their contributions to society) seem more reliable, so you may get denied (unless you pay more). Don’t fit the exact criteria for that job? Sorry, the algorithm will trade your potential as an outlier success for the less promising but reliable candidate (or you could take a lower salary).

Pick your profession or activity and there’ll be ways, sooner rather than later, to use AI to predict our future actions and decide where we can go, what we can access or do, and what we’re charged for the privilege.

In that Star Trek movie, Captain Kirk is the only person who ever passes the Kobayashi Maru test because he hacks the system and changes the rules.

I don’t need an AI to tell me that he’s probably not going to show up to get us out of this experiment.

AI’s Kobayashi Maru is a no-win situation.

[This essay originally appeared at Spiritual Telegraph]
