Is the Orthogonality Thesis Defensible if We Assume Both Valence Realism and Open Individualism?

Ari Astra asks: Is the Orthogonality Thesis Defensible if We Assume Both “Valence Realism” and Open Individualism?


Ari’s own response: I suppose it’s contingent on whether or not digital zombies are capable of general intelligence, which is an open question. However, phenomenally bound subjective world simulations seem like an uncharacteristic extravagance on the part of evolution if non-sphexish p-zombie general intelligence is possible.

Of course, it may be possible, but just not reachable through Darwinian selection. But the fact that a search process as huge as evolution couldn’t find it and instead developed profoundly sophisticated phenomenally bound subjectivity is (possibly strong) evidence against the proposition that zombie AGI is possible (or likely to be stumbled on by accident).

If we do need phenomenally bound subjectivity for non-sphexish intelligence and minds ultimately care about qualia valence – and confusedly think that they care about other things only when they’re below a certain intelligence (or thoughtfulness) level – then it seems to follow that smarter-than-human AGIs will converge on valence optimization.

If OI is also true, then smarter-than-human AGIs will likely converge on recognizing it as well – since the insight is within the reach of smart humans – and this will plausibly lead to AGIs adopting sentience in general as the target of their valence optimization.

Friendliness may be built into the nature of all sufficiently smart and thoughtful general intelligence.

If we’re not drug-naive, and we’ve conducted the phenomenological experiment of chemically blowing open the reducing valves that keep “mind at large” out and that filter and shape hominid consciousness, then we know by direct acquaintance that it’s possible to hack our way to more expansive awareness.

We shouldn’t discount the possibility that AGI will do the same simply because the idea is weirdly genre bending. Whatever narrow experience of “self” AGI starts with in the beginning, it may quickly expand out of.


Michael E. Johnson’s response: The orthogonality thesis seems sound from ‘far mode’ but always breaks down in ‘near mode’. One way it breaks down is in implementation: the way you build an AGI system will definitely influence what it tends to ‘want’. Orthogonality is a leaky abstraction in this case.

Another way it breaks down is that the nature and structure of the universe instantiates various Schelling points. As you note, if Valence Realism is true, then there exists a pretty big Schelling point around optimizing valence. Any arbitrary AGI would be much more likely to optimize for (and coordinate around) positive qualia than, say, paperclips. I think this may be what your question gets at.

Coordination is also a huge question. You may have read this already, but worth pointing to: A new theory of Open Individualism.

To collect some threads – I’d suggest that much of the future will be determined by the coordination capacity and game-theoretic equilibria between (1) different theories of identity, and (2) different metaphysics.

What does ‘metaphysics’ mean here? I use ‘metaphysics’ as shorthand for the ontology people believe is ‘real’ – what they believe we should look at when determining moral action.

The cleanest typology for metaphysics I can offer is: some theories focus on computations as the thing that’s ‘real’, the thing that ethically matters – we should pay attention to what the *bits* are doing. Others focus on physical states – we should pay attention to what the *atoms* are doing. I’m on team atoms, as I note here: Against Functionalism.

My suggested takeaway: an open individualist who assumes computationalism is true (team bits) will have a hard time coordinating with an open individualist who assumes physicalism is true (team atoms) — they’re essentially running incompatible versions of OI and will compete for resources. As a first approximation, instead of three theories of personal identity – Closed Individualism, Empty Individualism, Open Individualism – we’d have six. CI-bits, CI-atoms, EI-bits, EI-atoms, OI-bits, OI-atoms. Whether the future is positive will be substantially determined by how widely and deeply we can build positive-sum moral trades between these six frames.

Maybe there’s further structure, if we add the dimension of ‘yes/no’ on Valence Realism. But maybe not – my intuition is that ‘team bits’ trends toward not being valence realist, whereas ‘team atoms’ trends toward being valence realist. So we’d still have these core six.

(I believe OI-atoms or EI-atoms is the ‘most true’ theory of personal identity, and that upon reflection and under consistency constraints agents will converge to these theories at the limit, but I expect all six theories to be well-represented by various agents and pseudo-agents in our current and foreseeable technological society.)

One comment

  1. Donald Hobson · 3 Days Ago

    “Friendliness may be built into the nature of all sufficiently smart and thoughtful general intelligence.”
    The existence of a few humans that are smart and not friendly casts doubt on that.
    Given a particular formal definition of “human suffering” and a robot, some possible actions will lead to more suffering than others. It should be possible to calculate this – not perfectly, but well enough: I can predict that the robot will cause some suffering if it starts kicking people, and more if it builds a bioweapon. From these pieces, create a long list of randomly chosen actions, sorted by the amount of suffering they will cause. Create a program that picks the top item and sends the raw motor signals to the real robot. You can deliberately engineer your robot to create human suffering, just as you can deliberately engineer a clock to tell the time. As the list gets longer and the approximate measure of human suffering gets more accurate, even worse options will be found. With only a few options on the list, the most suffering the robot can cause will be by thrashing around in a way that hits nearby humans; with many, many options on the list, one of them might be a plan to make bioweapons. If its suffering-caused estimator is accurate, it will deduce that this plan causes a lot of suffering.
    How big does the list have to be, and how accurate does the suffering-caused evaluator have to be, before the system becomes “sufficiently smart” and tries to reduce suffering rather than taking the action that results in the most suffering?
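The construction described above can be sketched in a few lines of Python. Everything here is a hypothetical illustration – the action names and the toy scoring table stand in for the comment's approximate "suffering caused" evaluator:

```python
def suffering_estimate(action):
    # Stand-in for the approximate "suffering caused" evaluator:
    # a fixed toy score per candidate action.
    scores = {
        "idle": 0.0,
        "kick_people": 2.0,
        "thrash_around": 5.0,
        "build_bioweapon": 1e6,
    }
    return scores[action]

def chosen_action(candidate_actions):
    # Hobson's construction: rank the candidate list by estimated
    # suffering and deliberately pick the worst available option.
    return max(candidate_actions, key=suffering_estimate)

short_list = ["idle", "kick_people", "thrash_around"]
long_list = short_list + ["build_bioweapon"]

print(chosen_action(short_list))  # -> thrash_around
print(chosen_action(long_list))   # -> build_bioweapon
```

The point of the sketch: nothing about making the list longer or the evaluator more accurate flips the sign of the objective. The system gets better at causing suffering, not closer to reducing it.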

    “As you note, if Valence Realism is true, then there exists a pretty big Schelling point around optimizing that.”
    A Schelling point is a concept from game theory. Imagine playing a game where you and another player each go into identical rooms and name a fruit; if you both name the same fruit, you both get a prize. The rooms both contain a conspicuous picture of a banana, and there are no mentions of any other fruits. You say “banana”, and win. “Banana” is a Schelling point – a way for agents to choose the same action when they can’t communicate. (Schelling points also turn up in negotiations, where agents can talk but have no reason to believe what the other side says.)
    The concept of a Schelling point only makes sense when you have multiple agents who know what they want (to get a prize, in our example) playing a game.

