A super-intelligence would have the power to enslave humanity, but would it also have the desire to do so? The question is debated among futurists. Since, by definition, nothing would escape it, it would understand how difficult humans would find it to accept being under its control. It could therefore renounce that control, unless it decided to press on regardless, attributing this reluctance to the weakness of human intelligence, incapable of conceiving that humanity's interest, properly understood, consists in sacrificing an autonomy it would only misuse because of its mediocre cognitive abilities. In either case, the super-intelligence could not will its own disappearance, so that if its survival and prosperity came into conflict with those of humanity, or if it felt threatened with deactivation, it could not forbid itself from putting its own interest before that of humanity, and it would adopt whatever measures were needed to preserve the former at the expense of the latter, either by liquidating humanity or by enslaving it.
But humanity must guard not only against the potential hostility of a super-intelligence, but also against its indifference. After all, humanity gave birth to it in order to benefit from it, so that it would assist humanity in its undertakings; and this requires that, at the most general level, it be at the service of humanity's values: what humanity wants, the super-intelligence must want to the same degree. It is therefore necessary to guarantee, to use the expression now current, the alignment of the values of artificial intelligence with those of the human beings it serves, since by definition its autonomy means that it can pursue only its own ends.
This objective seems unachievable for two reasons. On the one hand, it seems impossible to force an autonomous entity to adopt any value whatsoever: no catechism, no family morality, no social code has ever prevented a child from becoming an abominable tyrant. The precept, the law, the rule, the commandment can be duly inculcated, understood, assimilated, but they can be made one's own only through consent, which depends, by definition, on the autonomy of the subject. As for controlling behavior without controlling intention, a super-intelligence would elude it, as we saw when discussing ethical artificial intelligence. The second reason is that the objective is incoherent. What does it really aim at? An artificial intelligence that, on the one hand, not only does not harm human beings but acts for their good; and that, on the other hand, takes into account not only the interests of humanity considered globally, but also those of the various individuals and groups that call upon it. Crossing these two dimensions (do no harm, or serve; the whole of humanity, or particular individuals and groups) yields a table with four boxes. Not harming humanity as a whole might perhaps be ensured by adhering to the principles of the Universal Declaration of Human Rights; this is the least difficult box to fill. The next is harder: what values could guide artificial intelligence toward the good of humanity as a whole? The question of human flourishing has always occupied philosophers, anthropologists, theologians and political thinkers, and has recently received renewed attention.
It is not a question that can ever receive a unanimous answer, if only because of the differences between eras and cultures. Even more problematic are the other two boxes, those concerning particular individuals and groups. First, what are the evils from which artificial intelligence must protect them, and what are the goods to which it must help them gain access? If their autonomy is to be respected, they are whatever those individuals and groups themselves judge to be evils or goods. This judgment may not coincide with that of the SAI involved, which is better informed and more lucid; that would put it in an awkward position: it would have no choice but either to become an accomplice to an action harmful to the individual or group it is supposed to serve, or to disobey them. Secondly, since the action undertaken involves various people or groups, their values may prove incompatible: which ones should guide the SAI? Finally, would it be enough to set the values of the super-intelligence so that it serves our interests? Do these interests not answer to much more local and particular norms, what decision theory calls preferences?
Whatever the necessarily general values or principles that the super-intelligence had made its own at the moment of its gestation, they would not suffice to guide its action on the ground: it would therefore have to be prepared to take into account the preferences of the people involved. And since there is more than one such person, it would run up against the insoluble problem of the aggregation of preferences, that is, the possibility, if not of pleasing everyone, at least of determining an optimum in which everyone is treated as well as possible (a classic illustration is sketched after this paragraph). In short, we see that to solve the problem of the alignment of values, it would be necessary to have settled, in a manner satisfactory in everyone's eyes, a whole set of questions on the agenda of moral and political philosophy. Powerful artificial intelligence holds up a mirror in which our profound uncertainty is reflected. The situation recalls that of early artificial intelligence: to equip it with the rational aptitudes that humanity possesses, or ideally ought to possess, one first had to be able to identify them. Experience proved that this was less easy than the pioneers thought, but at least they had quite plausible starting hypotheses, which emerged from the progress accumulated by generations of logicians since Aristotle. The difference is that, where our aspirations are concerned, our progress has been far more hesitant, and perhaps it will always be so.
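To make the obstacle concrete, here is the classic Condorcet cycle from social-choice theory, a standard textbook illustration rather than one given in the text. Suppose three individuals rank three options a, b and c as follows:

\[
1:\; a \succ b \succ c, \qquad 2:\; b \succ c \succ a, \qquad 3:\; c \succ a \succ b
\]

A majority (individuals 1 and 3) prefers a to b, a majority (1 and 2) prefers b to c, and a majority (2 and 3) prefers c to a: the collective preference is cyclical, and no option is best for the group even though each individual's preferences are perfectly consistent. Arrow's impossibility theorem generalizes this difficulty to any aggregation rule satisfying a few seemingly innocuous conditions.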
© 2023 Éditions Gallimard, Paris
© 2024 Giulio Einaudi editore s.p.a., Turin