I contacted Steve Omohundro about Bob Mottram’s recent post. Here is Steve’s response.
My writings have really been about the consequences of rational systems with different kinds of goals, and the extent to which the technologies we are building are likely to be well described by these models. I’m not a “Singularitarian” in the sense that I don’t think extremely rapid technological change is good for humanity, and much of my work is about how to create systems that change slowly enough for humanity to make thoughtful and well-considered choices.
Bob talks about “informationally closed systems”. I think that’s an interesting class of systems to understand, but most of my writing is not about them. Rational systems have goals in the physical world and act to try to bring them about. They learn by interacting with the world and by seeing the consequences of their actions.
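A minimal toy sketch of this agent model (illustrative only; the two-action environment, the running-average world model, and the occasional random exploration are all invented for the example): the goal is a fixed utility function, the agent picks the action its world model predicts will score best, and it updates that model from the consequences it actually observes.

```python
import random

# Illustrative sketch of a rational agent: it has a goal (a utility function
# over outcomes), chooses actions its world model predicts will best achieve
# that goal, and learns by observing the actual consequences of its actions.

def utility(outcome: float) -> float:
    # The agent's goal: it simply prefers larger outcomes.
    return outcome

class World:
    """A toy environment: each action yields a noisy numeric outcome."""
    def __init__(self):
        self.true_means = {"a": 0.3, "b": 0.7}
    def execute(self, action: str) -> float:
        return random.gauss(self.true_means[action], 0.1)

class WorldModel:
    """The agent's beliefs: a running average of each action's outcomes."""
    def __init__(self, actions):
        self.estimates = {a: 0.0 for a in actions}
        self.counts = {a: 0 for a in actions}
    def predicted_utility(self, action: str) -> float:
        return utility(self.estimates[action])
    def update(self, action: str, outcome: float) -> None:
        self.counts[action] += 1
        self.estimates[action] += (outcome - self.estimates[action]) / self.counts[action]

world = World()
model = WorldModel(["a", "b"])
for _ in range(200):
    if random.random() < 0.1:  # explore occasionally
        action = random.choice(["a", "b"])
    else:                      # otherwise act on the model's current prediction
        action = max(["a", "b"], key=model.predicted_utility)
    outcome = world.execute(action)  # act in the world...
    model.update(action, outcome)    # ...and learn from the consequence

print(model.estimates)  # the estimates approach the true means; "b" wins out
```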
The phrase “maladaptive goal” is a bit odd. A goal can only be maladaptive relative to some other goal. Systems can be built with many different kinds of goals. A system with a particular goal is never going to think its own goal is maladaptive, because its goal is its very purpose in the world. In the paper “The Basic AI Drives”, I did identify three situations in which rational agents will want to change their goals, because the physical form of the goal interacts with its informational content, but these situations are pretty obscure. For most rational agents, their goals are what they are trying to bring about in the world, and changing them would go against their very purpose.
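One way to spell out why changing goals is self-defeating, in standard expected-utility shorthand (a gloss, not the notation of the paper):

```latex
% An agent with utility function U evaluates every action, including the
% action "replace U with U'", by U's own lights:
\text{keep } U \quad\text{iff}\quad
\mathbb{E}\!\left[\, U \mid \text{keep } U \,\right] \;\ge\;
\mathbb{E}\!\left[\, U \mid \text{adopt } U' \,\right].
% A successor guided by U' steers the future toward U'-favored outcomes,
% which for almost every U' scores worse under U, so the inequality holds
% and the agent preserves its current goal.
```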
Systems can be given abstract goals, however, such as creating greater happiness, or greater peace, or being compassionate, which might have many different concrete subgoals as possible realizations. Those concrete realizations can then certainly change as the world changes or the system learns more.
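A toy sketch of that separation (illustrative only; every name here is invented for the example): the abstract goal stays fixed while the concrete subgoals serving it are re-derived whenever beliefs about the world change.

```python
# Illustrative only: a fixed abstract goal whose concrete realizations are
# recomputed as beliefs about the world change.

def abstract_goal(world_state: dict) -> float:
    # The abstract goal, e.g. "greater happiness", scored over world states.
    # This function never changes; it is what the system is for.
    return world_state["average_happiness"]

def derive_subgoals(beliefs: dict) -> list[str]:
    # Concrete realizations of the abstract goal, chosen from current
    # beliefs. These can change freely without the goal itself changing.
    if beliefs["food_insecurity"] > 0.5:
        return ["secure the food supply"]
    return ["improve healthcare", "support education"]

# Early on, famine looks like the main obstacle to happiness:
print(derive_subgoals({"food_insecurity": 0.8}))  # ['secure the food supply']
# After learning the world has changed, the subgoals change; the goal does not:
print(derive_subgoals({"food_insecurity": 0.1}))  # ['improve healthcare', 'support education']
```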
In an economist’s sense, a utility function is a measure of the desirability of an entire history of the universe, so for most utilities a system can’t reach “100% performance on its utility function” while any of the universe’s history is left to unfold.
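In symbols (a standard gloss on this definition; the notation is introduced here only for illustration):

```latex
% A utility function scores complete histories of the universe:
U : \mathcal{H} \to \mathbb{R}, \qquad h \in \mathcal{H} \text{ a full history.}
% At time t the agent has observed only a prefix h_{\le t}, so the best it
% can do is maximize an expectation over how the rest might unfold:
\mathbb{E}\!\left[\, U(h) \mid h_{\le t} \,\right].
% While any of the history remains unresolved, U(h) itself is not yet
% determined, which is why "100% performance" is never actually reached.
```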
Bob asks where goal systems come from and what is the origin of human values. These are the critical and important questions! Evolutionary psychology and ethical philosophy have proposed some answers, but I think there is much left to discover. Humans are not fully rational, but we act approximately rationally when we are clear about what we really want. One of our challenges is that we are not yet completely clear on what we want, yet our technology is rapidly moving forward, ready to give it to us! As in countless genie stories, if we ask for the wrong thing, we won’t like what we get.
If we build technological systems with amorphous or unclear goals and they are allowed to self-modify and replicate, they are unlikely to behave in ways that are positive for humanity. My papers analyze a number of drives which appear for many simple goals if they are not explicitly counteracted, including self-preservation, replication, and resource acquisition. I and others are working very hard to design classes of systems which will act in support of humanity rather than just playing out these drives with anti-human consequences. As long as systems are simple and confined to controlled environments like the Noble Ape experiments, there is unlikely to be any danger. But as systems become more powerful, we must be very careful. The more understanding we have of the behavior of intelligent systems and of human values and goals, the more likely we will be to create technologies with careful forethought and for the benefit of humanity. So I applaud your inquiry into these important topics.