The paper "Value Alignment or Misalignment–What Will Keep Systems Accountable?" discusses concepts such as inverse reinforcement learning and value alignment, and speculates that inverse reinforcement learning could be an effective method for training ethical behavior into autonomous systems. The paper describes inverse reinforcement learning as follows:
"Inverse reinforcement learning (IRL) is the task of inferring a reward or utility function by observing the behavior of other agents in a reinforcement learning-like setting."
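The definition above can be illustrated with a toy sketch (not the paper's method, and the data below is hypothetical): we observe an agent repeatedly choosing one option out of several, each option described by a feature vector, and we search for reward weights under which the observed choices are optimal.

```python
# Toy inverse reinforcement learning: recover reward weights consistent
# with observed choices. Each observation is (options, chosen_index),
# where each option is a feature vector, e.g. (comfort, speed).
observations = [
    ([(1.0, 0.0), (0.0, 1.0)], 0),  # agent picked the pure-comfort option
    ([(0.5, 0.5), (0.9, 0.1)], 1),  # agent picked the comfort-heavy mix
]

def consistent(weights, observations):
    """True if every observed choice maximizes the linear reward under weights."""
    for options, chosen in observations:
        rewards = [sum(w * f for w, f in zip(weights, o)) for o in options]
        if rewards[chosen] < max(rewards) - 1e-9:  # tolerance for float error
            return False
    return True

# Grid-search candidate weight vectors on a coarse simplex w0 + w1 = 1.
candidates = [(w / 10, 1 - w / 10) for w in range(11)]
feasible = [w for w in candidates if consistent(w, observations)]
# feasible holds every candidate reward function the behavior is consistent
# with -- IRL is generally underdetermined, so there is a set, not one answer.
```

Note that even this toy version returns a set of feasible reward functions rather than a unique one, which hints at why inferring values from behavior is hard.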
What is the greater, or greatest, good? This is a key question for developing any system of ethics. In economics, the concept of a utility function aligns with the concept of a greater good: utility is the satisfaction of the consumer, expressed as a function of the consumption of real goods.
To postulate a utility function, economists typically make assumptions about consumers' preferences for different goods. For example, in certain situations coffee and tea can be considered perfect substitutes for each other, and the appropriate utility function must reflect such preferences with a form of u(c, t) = c + t, where "u" denotes the utility function and "c" and "t" denote the quantities of coffee and tea. A consumer who consumes 1 pound of coffee and no tea derives a utility of 1 util.
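The perfect-substitutes form u(c, t) = c + t can be written directly as code; this is just a restatement of the formula above, nothing more:

```python
def utility_perfect_substitutes(coffee, tea):
    """Perfect substitutes: utility is the sum of the quantities consumed,
    so the consumer is indifferent between a pound of coffee and a pound of tea."""
    return coffee + tea

# One pound of coffee and no tea yields 1 util, as in the text.
print(utility_perfect_substitutes(1, 0))  # 1
# Swapping coffee for tea leaves utility unchanged -- the mark of
# perfect substitutes.
print(utility_perfect_substitutes(0, 1))  # 1
```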
The lesson here is that economics, and indeed morality itself, are ultimately about the preferences of people. These preferences are most often emotional, because humans are not necessarily rational. Rational choice theory ties into this calculation: it models people as choosing whichever option best satisfies their preferences.
At the heart of ethics are the motivations of people, and at the heart of those motivations are economics. People generally want more of what they deem good and less of what they deem bad. The concepts of "good" and "bad" behavior reflect that there are patterns of behavior people want more of or less of, and so laws are created as a way to use reinforcement to produce the desired behavior. Prison is generally used as a form of punishment to discourage certain behaviors, and fines are used in the same way to achieve the same result.
The consequence-based perspective of ethics
From the consequence-based perspective, the behaviors society wants are conditioned through positive and negative reinforcement. Positive reinforcement (the addition of a pleasant consequence) is a way society increases a desired behavior in its members. Negative reinforcement (the removal of an unpleasant consequence) also strengthens a desired behavior. Punishment, by contrast, is used to decrease an undesired behavior, or in other words to weaken the response: an unpleasant consequence is added when a member of society adopts a behavior which society as a whole does not want.
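The operant-conditioning distinctions above fit into a compact table; the examples in the comments are illustrative assumptions, not taken from the paper:

```python
# The four operant-conditioning quadrants: whether a stimulus is added or
# removed, and whether the goal is to strengthen or weaken a behavior.
conditioning = {
    ("add", "strengthen"): "positive reinforcement",    # e.g., reward good behavior
    ("remove", "strengthen"): "negative reinforcement", # e.g., lift a restriction
    ("add", "weaken"): "positive punishment",           # e.g., fines, prison
    ("remove", "weaken"): "negative punishment",        # e.g., revoke a license
}

# Fines and prison, as described above, sit in the "add an unpleasant
# consequence to weaken a behavior" quadrant.
print(conditioning[("add", "weaken")])  # positive punishment
```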
From a consequence-based perspective, a rational person will want the fewest negative consequences and the most positive consequences; in other words, rational people want the best they can get out of any situation. What is important to note is that preferences determine what "best" actually means, and it is not clear that everyone has the exact same concept of "the best you can get." This highlights the meaning behind utility functions, values, and the mechanisms by which society shapes behavior.
AI has no fear of punishment and no true preferences
The problem with AI is that it has no fear. Human beings are born with certain basic needs, so every human from birth can be threatened with having those needs taken away. An AI does not have to care about anything, and what it values is not necessarily what a human might value. Value alignment is about making sure the AI has values which mimic our own human values, and it turns out this is not a trivial problem. Value misalignment can result in some of the major AI safety concerns we most fear.
References
Arnold, T., Kasenberg, D., & Scheutz, M. (2017). Value Alignment or Misalignment–What Will Keep Systems Accountable?.