In the following essay I will reflect on how punishment might be meted out to “criminal” autonomous agents.
Punishing the unpunishable
Punishment can have multiple goals. One is to make the subject aware that an error has been made. In this case, the role is educational: the subject learns something previously unknown. On the other hand, there are occasions in which we are already conscious that a behaviour is wrong. In these situations, the goal is to inhibit the desire to do it. Consider, for example, the fine system: in many situations, a car’s speed is kept under the legal limit not because of a higher risk of crashing, but simply to avoid paying a fine. In both of these cases, the aim is to prevent the error from happening again.
The first option, raising awareness of a wrong act, is a kind of correction and learning, and so it could be considered a form of fixing. In the second option, the awareness already exists, and the behaviour changes not as a consequence of learning but out of fear of the consequences of one’s actions, that is, fear of punishment. Just as a child avoids touching fire not because somebody said it is hot, but to avoid suffering, perhaps having already touched it once and received an instant punishment.
In addition, there is another possibility. When a murder takes place, the victim’s family does not want the killer incarcerated mainly to prevent it from happening again; they may want that too, but it is only a collateral reason. The main one is to make the murderer suffer, in a kind of reciprocity. In this case, punishment functions as revenge and has nothing to do with education.
Having said that, punishment should be conceived as something that causes distress. For humans this is a trivial formulation, since our mental and physical selves are vulnerable: deprivation of belongings, necessities or even life has a huge effect on a person’s mental and physical state. Implementing a resembling mechanism on machines, on the other hand, does not have the same effect. If we remove parts of the agent’s mechanical or abstract body, it will never know it; it will simply stop working without realizing that it did. If we delete the whole software, or shut the machine down temporarily or permanently, it will never know, because it has no cognition of time, or even of being turned on.
It is simply impossible to make something suffer that does not suffer, but we can program the agent to act as if it did. That is, we can create a punishment mechanism, even though we will never be able to actually punish it. The idea would be to create a “punishment” variable that increases every time the machine makes a mistake. The autonomous agent should learn to avoid the situations that cause the variable to increase. It will not suffer, but it would act so as to avoid punishment, which can be thought of as a way of being afraid of it.
We then need to create the conditions that allow the machine to learn. It cannot understand that it is at fault, but it can act as if it did, that is, by avoiding the same mistake. It must therefore remember the conditions that caused the punishment, keeping a list of situations to prevent, and this would be impossible if we simply deleted the algorithm: if we erase the program, we cannot be sure that the error will not happen again.
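The mechanism described above can be sketched in a few lines of code. This is only an illustrative sketch: the class and method names (`PunishableAgent`, `record_mistake`, `should_avoid`) are my own assumptions, not something the essay prescribes.

```python
class PunishableAgent:
    """Sketch of an agent with a "punishment" counter and a memory
    of the situations that triggered it."""

    def __init__(self):
        self.punishment = 0               # increases with every mistake
        self.situations_to_avoid = set()  # remembered "painful" situations

    def record_mistake(self, situation):
        """Called whenever the agent's behaviour in `situation` was wrong."""
        self.punishment += 1
        self.situations_to_avoid.add(situation)

    def should_avoid(self, situation):
        """The agent "fears" situations that were punished before."""
        return situation in self.situations_to_avoid


agent = PunishableAgent()
agent.record_mistake("speeding on route A")
print(agent.punishment)                            # 1
print(agent.should_avoid("speeding on route A"))   # True
```

Keeping the memory (`situations_to_avoid`) separate from the algorithm itself reflects the essay’s point: erasing the program would also erase the list, and nothing would guarantee the error does not recur.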
Comparing this with the human situation, we can say that an error is like an illness. A person may have contracted the illness and been healed, generating antibodies, or may have received a large number of vaccinations that prevent the disease from being caught. The key point is that, once the person has faced the sickness, he or she must not suffer from it again. If he or she does, we can say that the person is not an optimal subject. We can likewise say that a person who rarely gets ill is nevertheless a robust subject. In this latter case, though, we cannot be sure that the person will never catch a given illness one or more times in life.
We can hence have two situations: a machine that has made many errors, or a machine that has made only a few. When comparing different pieces of software, we may be tempted to prefer the one with the lower punishment value. But in fact, the more errors an agent has made, the better its chances of avoiding those errors in the future, since its list of situations to avoid is larger and more precise. It is nevertheless fundamental that no error occurs more than once. If it does, it means the software is defective and must be reprogrammed or deleted. In that event the machine is no longer at stake, and as a result a person needs to be punished instead, because the agent was programmed erroneously from the beginning.
One way of testing different pieces of software is to give them the same inputs several times. Depending on the output, the result may be an erroneous situation. In that case, we add it to the error list and increment the punishment variable. If the error was already listed, measures must be taken, for instance reprogramming or deleting the software. In this manner, a ranking of the software can be established.
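This testing procedure might be sketched as follows. The one-strike rule (a repeated error disqualifies the software) comes from the essay; everything else, such as the names `evaluate_software` and `LearningSoftware`, is an illustrative assumption of mine.

```python
def evaluate_software(software, inputs, is_error):
    """Run `software` over a stream of inputs (some repeated), tracking
    a punishment counter and the list of punished inputs.

    Returns (punishment, disqualified): disqualified becomes True as soon
    as the same error happens a second time, meaning the software must be
    reprogrammed or deleted.
    """
    punishment = 0
    errors_seen = set()
    for x in inputs:
        if is_error(x, software(x)):
            if x in errors_seen:          # the error was already listed
                return punishment, True   # measures must be taken
            errors_seen.add(x)            # remember the situation
            punishment += 1               # increment the punishment variable
    return punishment, False


class LearningSoftware:
    """Illustrative stateful agent: errs on a negative input once,
    then avoids repeating that mistake."""
    def __init__(self):
        self.punished = set()
    def __call__(self, x):
        if x < 0 and x not in self.punished:
            self.punished.add(x)
            return "bad"
        return "ok"


def is_error(x, output):
    return output == "bad"


inputs = [-1, 2, -1, 3, -1]   # the same input presented several times
p, dq = evaluate_software(LearningSoftware(), inputs, is_error)
# errs once on -1, then avoids it: punishment 1, not disqualified

careless = lambda x: "bad" if x < 0 else "ok"   # never learns
p2, dq2 = evaluate_software(careless, inputs, is_error)
# repeats the same error on -1, so it is disqualified
```

Ranking the surviving softwares by their punishment values is then a design choice: as argued above, a higher value can actually indicate a richer, more precise list of situations to avoid.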
In this scenario we may eventually identify a piece of software that normally does not fail where others do. This is the analogue of a person who seldom gets ill. In that case we can say the software is a good one, but we cannot be sure that a certain mistake will never occur.
This essay was written for the course “Social and Ethical Issues in Information Technology” in my Digital Humanities Master’s at the University of Pisa.