Robots and Moral Dilemmas: A Short Guide to Machine Ethics

In 1942, more than a decade before the birth of artificial intelligence as a field, Isaac Asimov formulated the Three Laws of Robotics, a set of rules to govern a world in which it is possible to build advanced robots.

Asimov's Laws are quite simple:  

  1. a robot may not injure a human being or, through inaction, allow a human being to come to harm; 
  2. a robot must obey the orders given it by human beings except where such orders would conflict with the First Law;
  3. a robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

Sounds reasonable, right? However, it is not difficult to create a scenario in which these laws conflict with each other (Asimov himself created several). What then? Is it possible to write code that would not crash when such a scenario occurs and would still be able to decide what to do? Moreover, would the result be ethically right?
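To make the question concrete, here is a minimal sketch of one way such code could avoid crashing: treat the three laws as a strict priority ordering and pick the action that violates the least important law. The `Action` fields and the `choose` function are hypothetical names invented for this illustration, and the sketch assumes that someone has already decided which laws each action would violate, which is the genuinely hard part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    # Hypothetical flags: whether taking this action would break each law.
    name: str
    harms_human: bool      # violates the First Law
    disobeys_order: bool   # violates the Second Law
    endangers_robot: bool  # violates the Third Law


def violation_rank(action: Action) -> tuple:
    # Tuples compare left to right, so a First Law violation outweighs
    # any combination of Second and Third Law violations.
    return (action.harms_human, action.disobeys_order, action.endangers_robot)


def choose(actions: list) -> Action:
    """Return the action with the least severe violations."""
    if not actions:
        raise ValueError("no available actions")
    # min() always returns something, even when every option breaks a law.
    return min(actions, key=violation_rank)


options = [
    Action("obey the order and hurt a bystander", True, False, False),
    Action("refuse the order", False, True, False),
]
print(choose(options).name)
```

The program always returns an answer, even when every available action breaks the First Law. Whether that answer is ethically right is a separate question that the ordering itself cannot settle.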

Machines, tools and personal responsibility 

Before we answer this question, it has to be emphasized that we are not going to consider a "strong AI" scenario. Autonomous vehicles and other machines will be seen simply as advanced tools that are not capable of making any decision of their own. (You will find some remarks about why this is important at the end of this post).  

The most important element of a moral decision is personal responsibility: a human agent can be held accountable for his actions. Personal responsibility therefore implies that a human agent is free in his decisions, which means that a choice between given alternatives is not predetermined. If it were (and many philosophical schools say so), personal responsibility would simply be a nonsensical concept. In that case there would be no point in distinguishing human action from any natural event.

That being said, when we consider any result of using a tool from an ethical point of view, we always have to trace it back to some human agent behind it and judge the tool's action simply as an extension of human decisions. Let's consider the following examples:

1) John Smith bought a butter-passing robot to have it in his kitchen. One day he asked the robot to pass the butter and suddenly the robot went out of the house and killed John's neighbour.  

2) John Smith bought a butter-passing robot and one day decided to use it to move the furniture. The robot lifted a bookshelf but failed miserably and killed John's neighbour as a result.   

3) John Smith bought a butter-passing robot and asked it to take the butter to his neighbour’s driveway. The butter melted and the neighbour slipped on it and died.  

In the first scenario, the tool was used properly, and the damage was accidental (the user didn't intend to cause it). The party responsible for this event is the designer or producer, as John didn't know that the robot was capable of such an action. In the second scenario, the tool was used improperly, and its user also didn't intend to cause the damage. John would be responsible for this event as the person who used the tool without following the instructions. Finally, in the third scenario, the robot was used improperly, and the damage was intended by the user, so the user should be held accountable for the result.
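The attribution in these three scenarios depends on only two questions: was the tool used properly, and was the damage intended? A rough sketch of that mapping follows. The function name and the returned labels are made up for this illustration, and the fourth combination (proper use with intended damage) is deliberately left unhandled because the three scenarios do not cover it.

```python
def responsible_party(used_properly: bool, damage_intended: bool) -> str:
    """Map the two questions from the butter-robot scenarios to a responsible party."""
    if used_properly and not damage_intended:
        # Scenario 1: proper use, accidental damage.
        return "designer/producer"
    if not used_properly:
        # Scenarios 2 and 3: improper use, whether or not damage was intended.
        return "user"
    # Proper use with intended damage is not covered by the three scenarios.
    raise ValueError("case not covered by the three scenarios")


print(responsible_party(used_properly=True, damage_intended=False))   # scenario 1
print(responsible_party(used_properly=False, damage_intended=False))  # scenario 2
print(responsible_party(used_properly=False, damage_intended=True))   # scenario 3
```

In every branch the function returns a human party and never the robot itself.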

[Diagram: scenarios of proper and improper robot use]

When it comes to machine ethics, we will be considering the left side of the diagram, which contains two cases of possible damage: one intended by the user of the tool and one not intended by anyone. The right side of the diagram applies only to war machines and requires a separate analysis. 

Pulling the lever 

You may have heard about the "Moral Machine experiment". The context for it, as the authors declare, is "the rapid development of artificial intelligence", which raised questions about whether machines can make moral decisions. However, the most interesting part of this experiment was discovering the axiological preferences of the people taking part in it, for the Moral Machine is a very elaborate version of the well-known "Trolley Problem" thought experiment.

The Trolley Problem, invented by Philippa Foot, depicts the conflict between individual rights and social or economic utility – or the deontological and utilitarian (consequentialist) approaches to morality. Imagine a trolley, disconnected from its train, that is about to hit five people on the tracks. The trolley can't be stopped; however, you can save those people if you pull the steering lever and redirect it to the side track, but in that case the trolley will kill one person.

The classic utilitarian answer to this question is that pulling the lever would be morally right, since the value of five lives (in terms of the total sum of happiness and suffering) trumps the value of one. But what if this one person is a gifted heart surgeon, and those five people are a gang of robbers who recently killed a couple of pensioners? John Stuart Mill would say that we should make decisions based on utilitarian premises only if we are able to calculate the actual consequences. This means that either there are other premises a utilitarian should consider (and therefore utilitarianism is at least inconsistent), or – more likely – the only possible objective criterion in classic utilitarian decision-making is quantity. Otherwise, every utilitarian agent would need perfect knowledge about everything.
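If quantity really is the only criterion, the whole decision procedure shrinks to a comparison of head counts. The sketch below is only an illustration of that point; the function name and the option labels are invented here.

```python
def utilitarian_choice(deaths_by_option: dict[str, int]) -> str:
    """Return the option with the fewest deaths; ties keep the first option listed."""
    if not deaths_by_option:
        raise ValueError("no options to choose from")
    return min(deaths_by_option, key=deaths_by_option.get)


trolley = {"do nothing": 5, "pull the lever": 1}
print(utilitarian_choice(trolley))
```

The function sees only numbers. The heart surgeon and the gang of robbers are invisible to it unless someone first decides how much each life is "worth", and that decision is not part of the calculation.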

An autonomous vehicle, equipped with advanced pattern recognition devices, would be closer to the position of such an ideal utilitarian observer than any human will ever be. And this is what makes the Moral Machine experiment truly interesting: we were able to calculate utility based not only on the number of saved lives, but also on a judgment of which life is "worth more" (or which death would cause more damage to society) than others, with regard to lifestyle, social status, gender, age and some other factors. That’s what we did: you can read a summary of the experiment. This is also where things get controversial.

The social preferences demonstrated during the experiment are quite clear, even if there are slight differences across countries, cultures and regions. If you're an overweight homeless male, people would (statistically) sacrifice your life without hesitation for the "greater good"; you would have the highest chance of being saved as an athletic pregnant female doctor. This is hardly surprising – as "inhuman" as it may appear, these biases are deeply rooted in our biology, psychology and (as a derivative) in our culture. Females are in general more valuable for the preservation of the species than males; the same goes for the pairs "young/old", "healthy/obese", and obviously "human/animal".

However, the fact that this measured social preference is rational from an "evolutionary" point of view does not make it "objective" at any level. We can say that as humans we tend to make such choices; however, we are not determined to do so. If that were the case, ethics would no longer be needed. These criteria can easily change, as they are not objective or universal – and moreover, it is not possible to find any utilitarian premises that can be described as objective. (This objection also applies to Eliezer Yudkowsky's "Coherent Extrapolated Volition".) Ethics cannot be aggregated and simply calculated by an algorithm.

Imagine that someone close to you is killed in such an accident and you are given this explanation: this person died because they were old and in bad shape, so the car was programmed to spare the young and healthy ones. Doesn't bring much closure, does it? How would you feel towards the younger and healthier people who survived the accident?

The natural flow of events 

In deontology (from Greek δέον, which means "obligation, duty") the question of whether or not a certain action was morally right cannot be based on its consequences. A famous extreme example of this comes from Immanuel Kant: if someone being chased by a murderer hides in your house and the murderer asks you if this person is hiding there, it is ethically wrong to lie about it, since lying is itself immoral. So if a Kantian were to decide whether we may pull the lever, he would say that we may not: we didn't cause the trolley to go in this direction, so the death of five people would not be our fault. Pulling the lever would be our decision and our responsibility, and since we know this decision will kill a man, it is morally wrong.

It might be that Kantians are not so great when it comes to saving lives, and if your friend is a Kantian it may be a good idea not to seek asylum in his house if you are pursued by a murderer. However, Kantian ethics is much closer to universality than any consequentialist concept of morality. 

There is a big hole in our everyday perception of morality, which is the need for rationalization. We desperately need to feel good about ourselves and we need to make sure that the decisions we made were right, but at the same time we usually don't like it if they require any sacrifice. "I should have done X and it was too hard for me to do, but look – in the end everything turned out well". Or: "I should do Y, but is it really worth it? If you consider everything, it will be better if I don't do it at all".

Deontology resolves this by denying any attempt to rationalize decisions – before or after making them – and by focusing on the duty. Contrary to the results of convenient rationalization of actions, the duty is truly rational – it does not serve pleasure, fame, recognition or any kind of reward. If someone wanted to summarize deontology in one sentence, it would be a paraphrase of John F. Kennedy's speech: "It is doing things, not because they are easy, but because they are hard". This is the essence of "good will" – we rationally choose a purpose which can be seen as universal good and it requires our personal sacrifice.   

How does it apply to machine ethics? A deontological approach is consistent with our premise that machines are just tools not capable of making decisions. The question of whether or not the result of using a machine is right depends on the relation between the result and the design. Contrary to the utilitarian approach, it does not create an illusion of the autonomous decision of a machine and puts personal responsibility in the center.  

When trying to apply this to the Moral Machine experiment and harm reduction, we find ourselves with very limited options: 

  1. A person who decides to drive an autonomous car is the one responsible for putting pedestrians in danger. It doesn't matter that they didn't know that the brake system can fail – the unfortunate flow of events started with them. In that case, the autonomous car should choose to spare the lives of pedestrians before passengers.
  2. The only exception is when a pedestrian is jaywalking, which also distorts the natural flow of events. In this case we could allow the autonomous vehicle to kill the pedestrian and spare the passenger's life.

This logic of action is also much easier to compute, as it does not require complex utility calculations. It requires a proper logical structure that establishes relations between general rules and avoids contradictions. Although it is not immune to dilemmas, it is a much more reasonable and acceptable option than any utilitarian approach.
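As a rough illustration of how little computation this requires, the two rules above can be written as an ordered list in which the exception is checked before the general rule, so the two never contradict each other. The `Situation` and `Rule` types and their fields are hypothetical names chosen for this sketch, and it assumes the car can reliably tell whether a pedestrian is jaywalking.

```python
from typing import Callable, NamedTuple


class Situation(NamedTuple):
    # Hypothetical input: the only fact the two rules depend on.
    pedestrian_is_jaywalking: bool


class Rule(NamedTuple):
    name: str
    applies: Callable[[Situation], bool]
    spare: str  # who the car should spare when the rule applies


# Order matters: the exception is listed before the general rule.
RULES = [
    Rule("jaywalking exception", lambda s: s.pedestrian_is_jaywalking, "passenger"),
    Rule("general rule", lambda s: True, "pedestrian"),
]


def decide(situation: Situation) -> Rule:
    """Return the first rule that applies to the situation."""
    for rule in RULES:
        if rule.applies(situation):
            return rule
    raise RuntimeError("no rule applies")


print(decide(Situation(pedestrian_is_jaywalking=False)).spare)
print(decide(Situation(pedestrian_is_jaywalking=True)).spare)
```

Nothing here weighs one life against another. The outcome follows from who disturbed the natural flow of events, and the rule that produced it can always be named.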

There is no "black box" in ethics 

Arthur C. Clarke's famous "third law" says that "any sufficiently advanced technology is indistinguishable from magic". This is why, when talking about AI and its capabilities, we're often misled by absurd scenarios of artificial intelligence taking over the world. These scenarios are based on the assumption that AI will become so advanced that it will have its own goals and its own values.

This case is related to the question about the possibility of AI having its own semantics: why would it ever need it if it functions perfectly fine without it?  

But that's not the whole story. Consider Nick Bostrom's "paperclip maximizer" thought experiment. It is quite possible for the hypothetical superintelligent paperclip machine to recognize that "the human is made of atoms useful for paperclip production"; however, if it wasn't programmed to retrieve these atoms from humans (and not only from specific objects, like metal bricks), it would not be able to do so. If I were given the task of producing a lot of paperclips and suddenly ran out of metal bricks, I would be able to recognize that I could use humans to finish my job. However, that would require understanding how humans are different from metal bricks and how to change the production process to make this happen.

[Chart: inputs and outputs of the process of machine understanding]

If a machine starts killing humans to make paperclips from them, that behavior was written in the code in Input 2. On the other hand, if AI becomes so advanced that it can learn how to turn everything into paperclips, why on God's green Earth would it choose to produce paperclips? If AI was programmed to do X (producing paperclips) and not programmed to do Y (turning people into paperclips), but was perfectly capable of learning Y by itself, then the conclusion that it will keep following X while doing Y is based on absolutely nothing. In other words, if AI is determined to do X, it is not free to do Y; if it's free to do Y, it is not determined to do X. And if it's not determined to do X – then discussing how it should be programmed is pointless.

Looking at AI as capable of disconnecting from human goals and values is very dangerous. This most probably will not happen, so any decision and action of a machine has to be seen as an extension of a human act. However, if it actually does happen, everything we are talking about when discussing machine ethics won't matter anyway.