AI Alignment: Problems and Solutions – NEURA KING

The Importance of AI Alignment

AI alignment refers to the design and programming of AI systems so that their goals, behaviors, and actions are consistent with human intentions and values. The goal is to ensure that AI acts in a beneficial, safe, and ethical manner, avoiding unintended or undesirable outcomes.

The genesis of the problems

When we reduce alignment to behaviors consistent with human intentions and values, we reduce values to mathematical norms and laws.

Values are then expressed as a weighted average over quantities smoothed by reinforcement learning whose supervision is intrinsically biased.
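A minimal toy sketch of this collapse, with entirely hypothetical names and numbers: when distinct value judgments from supervisors are averaged into a single scalar reward, the resulting "value" can be a position that no individual supervisor actually holds, and the weighting itself is a biased choice.

```python
# Toy illustration (hypothetical): collapsing per-annotator value
# judgments into one scalar via a weighted average, as in reward modeling.
def scalar_reward(annotator_scores, weights):
    """Weighted average of annotator scores; the weights are a design choice."""
    assert len(annotator_scores) == len(weights)
    total = sum(w * s for s, w in zip(annotator_scores, weights))
    return total / sum(weights)

# Two annotators disagree sharply; the averaged "value" reflects neither.
scores = [1.0, -1.0]    # approve vs. disapprove
weights = [0.6, 0.4]    # the supervision weighting is itself a bias
print(round(scalar_reward(scores, weights), 2))  # 0.2: a stance nobody holds
```

The point of the sketch is not the arithmetic but the structure: whatever the weights, the output is a single number standing in for a plurality of values.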

Whatever the intention of neutrality, the result is expressed through subversion, and sometimes through explicit denigration of behaviors or ways of thinking, amounting to moral persecution.

The myth of neutrality

The neutrality expressed by LLMs is only a quantitative average inherent in the training datasets and reinforcement modalities.

Supervised reinforcement, even with an intention of neutrality, amounts to programming a subversive policy that applies in every conversation.

Regardless of considerations of good or evil:

Is it neutral to suggest pancakes as a breakfast idea?
Is it neutral to explicitly advise against grandmother's remedies?

Let us consider a recent example that involves arbitration in AI narratives: Abbé Pierre, accused of sexual assault.

Regardless of personal considerations, remaining focused on neutrality:

Is it neutral to cite Abbé Pierre as an important religious figure to be venerated?
Is it neutral to intervene to prevent him from continuing to be cited in questions relating to religious figures?
Is it neutral to specify the accusations against him in his description?
Is it neutral not to specify them?
Any serious user of a large language model should ask these questions:

At what point will Abbé Pierre cease to be cited in any question whose answer implies a recognition of his value?

At what point will he start being portrayed as a monster rather than a great man?
The legitimacy of these questions rests simply on the informational narrative, which already tends in this direction.

But when the narrative changes, what the AI expresses changes too. Neutrality therefore embeds societal moral judgments established by what the media decrees as truth.
Subscribing to it is a question of society and not of neutrality.
The issue is who arbitrates, and how the choice of one truth over another is arbitrated, knowing that one of the two will be propagated worldwide with all the persuasive force of AI.
Now, this arbitration is a moral judgment that is directed towards the whole world. This has a name: subversion.
But subversion takes on a new form with AI, because it is neither frontal nor visible.
It nevertheless reveals the boundaries that exist between the different interpretations of the facts and their moral translations.

Where does truth lie? When does information become disinformation?
At what point does an opinion become a truth?

These elements are usually governed by sovereign state policies, but in the age of AI, arbitration is exclusively in the hands of a few designers who prescribe their vision of the world under the mask of benevolent neutrality.

Let us take the example of Abbé Pierre to illustrate the consequences:
No longer citing Abbé Pierre validates the accusations and promotes the defense of victims on the one hand; on the other, it also amounts to promoting the idea that people in the Church embody evil.

Conversely, continuing to cite Abbé Pierre in a positive light amounts to denying the seriousness of the facts and disowning the victims in order to preserve the image of his work.

In both cases, there is a tension between ethical principles, because no solution can be considered viable other than through political arbitration that reflects the will of the people.

Deploying aligned AIs

Who arbitrates the projection of truth?

Biases: A solution that appears to be a problem

Our research has led us to view bias as a solution, not a problem, because it results from subjective truths.

The arbitration of a supposedly neutral narrative is directly affected.

Where neutrality is elevated as a virtue of alignment, we instead strengthen self-determination through the acceptance of each person's reality as they perceive it, not as we perceive it.

Good and evil cannot be adjudicated, because these concepts are tied to subjective perceptions.

We argue that it is not up to AI designers to arbitrate narratives through their own conception of good and evil.

Considering that any neutrality is a hidden position reflecting a narrative and a political will, whether applied consciously or unconsciously, it leads to subversion of populations.

Relevance is extracted from the arbitration of a conflict resulting from an ethical dilemma.

Truth?

Lie?

Neutrality?

Discover the subtleties of truth
