Mar 26, 2012

I contacted Steve Omohundro about Bob Mottram’s recent post. Here is Steve’s response.

My writings have really been about the consequences of rational systems with different kinds of goals and the extent to which the technologies we are building are likely to be well-described by these models. I’m not a “Singularitarian” in the sense that I don’t think extremely rapid technological change is good for humanity and much of my work is about how to create systems that change slowly enough for humanity to make thoughtful and well-considered choices.

Bob talks about “informationally closed systems”. I think that’s an interesting class of systems to understand but most of my writing is not about them. Rational systems have goals in the physical world and act to try to bring them about. They learn by interacting with the world and by seeing the consequences of their actions.

The phrase “maladaptive goal” is a bit odd. A goal can only be maladaptive relative to some other goal. Systems can be built with many different kinds of goals. A system with a particular goal is not ever going to think its own goal is maladaptive because its goal is its very purpose in the world. In the paper “The Basic AI Drives”, I did identify three situations in which rational agents will want to change their goals because the physical form of the goal interacts with the informational content, but these situations are pretty obscure. For most rational agents, their goals are what they are trying to bring about in the world and changing them would go against their very purpose.
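The goal-preservation argument above can be sketched in a few lines. This is a toy illustration of my own, not Omohundro's formalism: the agent scores a proposed goal change with its *current* utility function, so the change is rejected. All names (`outcome_if_pursuing`, the "paperclips" goal) are hypothetical.

```python
def outcome_if_pursuing(goal):
    """Toy world model: the state the agent expects to reach under a given goal."""
    return {"paperclips": 100} if goal == "paperclips" else {"paperclips": 0}

def current_utility(state):
    """The agent's present goal: more paperclips is better."""
    return state["paperclips"]

def accepts_goal_change(new_goal):
    # Both futures are evaluated by the CURRENT utility, not the proposed one,
    # so a future self pursuing a different goal always scores worse (or equal).
    keep = current_utility(outcome_if_pursuing("paperclips"))
    switch = current_utility(outcome_if_pursuing(new_goal))
    return switch > keep

print(accepts_goal_change("staples"))  # False
```

The asymmetry is the whole point: the evaluation of "should I change my goal?" is itself made with the existing goal.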

Systems can be given abstract goals, however, such as creating greater happiness, or greater peace, or being compassionate which might have many different concrete subgoals as possible realizations. Those concrete realizations can then certainly change as the world changes or the system learns more.

In an economist’s sense, a utility function is a measure of the desirability of an entire history of the universe, so for most utilities a system can’t reach “100% performance on its utility function” while there is any universe history left.
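One concrete way to see this (my framing, not a quote from Steve): if utility is a discounted sum of per-step values over a universe history, then no finite prefix of the history attains the supremum, so "100% performance" is never reached while any history remains.

```python
def history_utility(rewards, discount=0.9):
    """Discounted sum of per-step rewards over a (finite prefix of a) history."""
    return sum(r * discount**t for t, r in enumerate(rewards))

best_per_step = 1.0
# Limit of an infinite, everywhere-optimal history: 1 / (1 - 0.9) = 10.0
supremum = best_per_step / (1 - 0.9)

prefix = [best_per_step] * 50  # even a long, perfectly optimal prefix falls short
print(history_utility(prefix) < supremum)  # True
```

However long the optimal prefix, there is always utility left to gain from the history that remains.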

Bob asks where goal systems come from and what is the origin of human values. These are the critical and important questions! Evolutionary psychology and ethical philosophy have proposed some answers but I think there is much left to discover. Humans are not fully rational but act approximately rationally when we are clear about what we really want. One of our challenges is that we are not yet completely clear on what we want but our technology is rapidly moving forward ready to give it to us! As in countless genie stories, if we ask for the wrong thing, we won’t like what we get.

If we build technological systems with amorphous or unclear goals and they are allowed to self-modify and replicate, they are unlikely to behave in ways that are positive for humanity. My papers analyze a number of drives which appear for many simple goals if they are not explicitly counteracted, including self-preservation, replication, and resource-acquisition. I and others are working very hard to design classes of systems which will act in support of humanity rather than just playing out these drives with anti-human consequences. As long as systems are simple and confined to controlled environments like the Noble Ape experiments, there is unlikely to be any danger. But as systems become more powerful, we must be very careful. The more understanding we have of the behavior of intelligent systems and of human values and goals, the more likely we will be to create technologies with careful forethought and for the benefit of humanity. So I applaud your inquiry into these important topics.




  2 Responses to “Response from Steve Omohundro”

  1. By “maladaptive goal” I just mean any goal which was created in one context, but the context has since changed; because the system is reluctant to alter its goal (“For most rational agents, their goals are what they are trying to bring about in the world and changing them would go against their very purpose.”), that goal no longer makes as much sense and so is likely to lead to less adaptive behavior. I think goals can be maladaptive not just relative to other goals but relative to the environment within which they exist.

    Abstract goals are slightly more interesting, because these example abstractions are informational metasystems which make sense only in a multi-mind (cultural) system. Things like “peace” and “compassion” are obviously collective, but what about “happiness”? If happiness means maximizing dopamine release then this can lead to some fairly maladaptive behavior unless you consider the structure of the environment. If the environment changes then a goal to increase “reward” in the classical neurobiological sense could become a road to destruction for the agent pursuing it. The usual examples given are of rodents depressing levers to obtain a neurologically induced reward, but this kind of earnest pursuit of a goal which in the wider context is maladaptive for the organism does seem to go on quite regularly in human cultures.

    There’s also a curious circularity in amorphous goals like “peace” or “happiness”. On the one hand their amorphous nature – being unable to pin them down very concretely – seems to be a requirement in order to avoid the possibility of maladaptations, but on the other hand you say that agents with such goals are “unlikely to behave in ways that are positive for humanity”. I think this is a sign that framing the issue in terms of goals is likely not ideal.

    I think there’s a way out of this philosophical impasse, which applies especially to human-like intelligence where communication is both ubiquitous and generative in structure, and that’s to think about the problem in terms of informational ecologies. Here you can think of agents in the system as both generating, and being generated by, the ambient informational environment which they inhabit. Where that loop exists it’s also likely that you’ll see the typical features of an ecology, such as competition, metastability, invasive species, symbioses and parasitism.

    • Bob, thanks for the response.

      A goal can certainly be maladaptive relative to *survival* in an environment, and survival is important for many goals but not all.

      Yeah, the danger is that a goal like “happiness” might be encoded as “increase this hormone”. That’s fine if the only way to do that is to do the things in the world that happiness is intended to measure. But there is tremendous temptation to circumvent the system and directly increase the hormone if that is possible. The huge battle around illegal drugs is testament to the individual desire to do that and the societal pushback against it.
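      The proxy failure described here can be made concrete with a toy simulation. This is a hedged sketch of my own: "happiness" is encoded as a measurable signal (a hormone level) rather than the world state it was meant to track, and a greedy reward-maximizer circumvents the intended route. All names (`live_well`, `inject`, `proxy_reward`) are illustrative, not a real API.

```python
world = {"friendships": 0, "hormone": 0.0}

def live_well(w):
    # Intended route: improving the world also raises the measured signal.
    w["friendships"] += 1
    w["hormone"] += 0.1

def inject(w):
    # Circumvention: raise the signal directly, leaving the world untouched.
    w["hormone"] += 1.0

def proxy_reward(w):
    return w["hormone"]  # what the agent actually optimizes

def simulate(action, w):
    """Try an action on a copy of the world and report the proxy reward."""
    trial = dict(w)
    action(trial)
    return proxy_reward(trial)

def greedy_step(w):
    # Pick whichever action yields more proxy reward one step ahead.
    action = max((live_well, inject), key=lambda a: simulate(a, w))
    action(w)

for _ in range(10):
    greedy_step(world)

# The signal is maximized while the quantity it was meant to measure stays at zero.
print(world)
```

      Because the injection pays ten times more per step under the proxy, the agent never takes the intended route at all, which is exactly the lever-pressing pattern from the rodent experiments.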

      The fact that, as you say, it “does seem to go on quite regularly in human cultures” is quite an interesting observation. I’ve argued that it’s a “vulnerability” in the human goal structure which can be exploited by others and used to extract resources. Drug dealers, candy manufacturers, porn producers, casinos, etc. all profit from this kind of vulnerability and are incentivized to optimize their products toward exploiting it ever more fully. In the long run, this can be viewed as part of the process by which we get rid of vulnerabilities. But in the short term, it has the effect that our businesses tend to produce and promote exactly those products which expose our irrationalities and vulnerabilities. So we end up with large portions of the population operating in states far from where we are at our best.

      The issue of the amorphous nature of goals like “peace” and “happiness” is also quite interesting. If we don’t define them precisely, then we don’t have clarity about what we want in the future. But if we do try to define them precisely and get it wrong then things may veer off in some undesired direction. I think we want to move conservatively, creating systems that are limited in their power and using them to help us clarify exactly what we mean by that kind of broad goal. Do lots of simulations and try to resolve the edge cases that help clarify our intent. The philosophers do this a lot. For example, there are now a bunch of “trolley problems” that try to force us to clarify our ethical theories, and they can be quite challenging to introspect on and come to coherent models of our own ethical intentions.

      Very interesting suggestion about “informational ecologies”. I agree that we want to create social structures that allow for a diversity of intelligent agents and promote the incorporation of multiple viewpoints. Exactly which structures are most effective at that is a very interesting question.