Everything in a Nutshell

David Pearce at Quora in response to the question: “What are your philosophical positions in one paragraph?”:

“Everyone takes the limits of his own vision for the limits of the world.”

All that matters is the pleasure-pain axis. Pain and pleasure disclose the world’s inbuilt metric of (dis)value. Our overriding ethical obligation is to minimise suffering. After we have reprogrammed the biosphere to wipe out experience below “hedonic zero”, we should build a “triple S” civilisation based on gradients of superhuman bliss. The nature of ultimate reality baffles me. But intelligent moral agents will need to understand the multiverse if we are to grasp the nature and scope of our wider cosmological responsibilities. My working assumption is non-materialist physicalism. Formally, the world is completely described by the equation(s) of physics, presumably a relativistic analogue of the universal Schrödinger equation. Tentatively, I’m a wavefunction monist who believes we are patterns of qualia in a high-dimensional complex Hilbert space. Experience discloses the intrinsic nature of the physical: the “fire” in the equations. The solutions to the equations of QFT or its generalisation yield the values of qualia. What makes biological minds distinctive, in my view, isn’t subjective experience per se, but rather non-psychotic binding. Phenomenal binding is what consciousness is evolutionarily “for”. Without the superposition principle of QM, our minds wouldn’t be able to simulate fitness-relevant patterns in the local environment. When awake, we are quantum minds running subjectively classical world-simulations. I am an inferential realist about perception. Metaphysically, I explore a zero ontology: the total information content of reality must be zero on pain of a miraculous creation of information ex nihilo. Epistemologically, I incline to a radical scepticism that would be sterile to articulate. Alas, the history of philosophy twinned with the principle of mediocrity suggests I burble as much nonsense as everyone else.

Image credit: Joseph Matthias Young

Why I think the Foundational Research Institute should rethink its approach

by Mike Johnson

The following is my considered evaluation of the Foundational Research Institute, circa July 2017. I discuss its goal, where I foresee things going wrong with how it defines suffering, and what it could do to avoid these problems.

TL;DR version: functionalism (“consciousness is the sum-total of the functional properties of our brains”) sounds a lot better than it actually turns out to be in practice. In particular, functionalism makes it impossible to define ethics & suffering in a way that can mediate disagreements.

I. What is the Foundational Research Institute?

The Foundational Research Institute (FRI) is a Berlin-based group that “conducts research on how to best reduce the suffering of sentient beings in the near and far future.” Executive Director Max Daniel introduced them at EA Global Boston as “the only EA organization which at an organizational level has the mission of focusing on reducing s-risk.” S-risks are, according to Daniel, “risks where an adverse outcome would bring about suffering on an astronomical scale, vastly exceeding all suffering that has existed on Earth so far.”

Essentially, FRI wants to become the research arm of suffering-focused ethics, and help prevent artificial general intelligence (AGI) failure-modes which might produce suffering on a cosmic scale.

What I like about FRI:

While I have serious qualms about FRI’s research framework, I think the people behind FRI deserve a lot of credit: they seem to be serious people, working hard to build something good. In particular, I want to give them a shoutout for three things:

  • First, FRI takes suffering seriously, and I think that’s important. When times are good, we tend to forget how tongue-chewingly horrific suffering can be. S-risks seem particularly horrifying.
  • Second, FRI isn’t afraid of being weird. FRI has been working on s-risk research for a few years now, and if people are starting to come around to the idea that s-risks are worth thinking about, much of the credit goes to FRI.
  • Third, I have great personal respect for Brian Tomasik, one of FRI’s co-founders. I’ve found him highly thoughtful, generous in debates, and unfailingly principled. In particular, he’s always willing to bite the bullet and work ideas out to their logical end, even if it involves repugnant conclusions.

What is FRI’s research framework?

FRI believes in analytic functionalism, or what David Chalmers calls “Type-A materialism”. Essentially, this means there’s no ‘theoretical essence’ to consciousness; rather, consciousness is the sum-total of the functional properties of our brains. Since ‘functional properties’ are rather vague, this means consciousness itself is rather vague, in the same way words like “life,” “justice,” and “virtue” are messy and vague.

Brian suggests that this vagueness means there’s an inherently subjective, perhaps arbitrary element to how we define consciousness:

Analytic functionalism looks for functional processes in the brain that roughly capture what we mean by words like “awareness”, “happy”, etc., in a similar way as a biologist may look for precise properties of replicators that roughly capture what we mean by “life”. Just as there can be room for fuzziness about where exactly to draw the boundaries around “life”, different analytic functionalists may have different opinions about where to define the boundaries of “consciousness” and other mental states. This is why consciousness is “up to us to define”. There’s no hard problem of consciousness for the same reason there’s no hard problem of life: consciousness is just a high-level word that we use to refer to lots of detailed processes, and it doesn’t mean anything in addition to those processes.

Finally, Brian argues that the phenomenology of consciousness is identical with the phenomenology of computation:

I know that I’m conscious. I also know, from neuroscience combined with Occam’s razor, that my consciousness consists only of material operations in my brain — probably mostly patterns of neuronal firing that help process inputs, compute intermediate ideas, and produce behavioral outputs. Thus, I can see that consciousness is just the first-person view of certain kinds of computations — as Eliezer Yudkowsky puts it, “How An Algorithm Feels From Inside”. Consciousness is not something separate from or epiphenomenal to these computations. It is these computations, just from their own perspective of trying to think about themselves.


In other words, consciousness is what minds compute. Consciousness is the collection of input operations, intermediate processing, and output behaviors that an entity performs.

And if consciousness is all these things, so too is suffering. Which means suffering is computational, yet also inherently fuzzy, and at least a bit arbitrary; a leaky high-level reification impossible to speak about accurately, since there’s no formal, objective “ground truth”.

II. Why do I worry about FRI’s research framework?

In short, I think FRI has a worthy goal and good people, but its metaphysics actively prevent making progress toward that goal. The following describes why I think that, drawing heavily on Brian’s writings (of FRI’s researchers, Brian seems the most focused on metaphysics):

Note: FRI is not the only EA organization which holds functionalist views on consciousness; much of the following critique would also apply to e.g. MIRI, FHI, and OpenPhil. I focus on FRI because (1) Brian’s writings on consciousness & functionalism have been hugely influential in the community, and are clear enough *to* criticize; (2) the fact that FRI is particularly clear about what it cares about (suffering) allows a particularly clear critique of the problems it will run into with functionalism; (3) I believe FRI is at the forefront of an important cause area which has not crystallized yet, and I think it’s critically important to get these objections bouncing around this subcommunity.

Objection 1: Motte-and-bailey

Brian: “Consciousness is not a thing which exists ‘out there’ or even a separate property of matter; it’s a definitional category into which we classify minds. ‘Is this digital mind really conscious?’ is analogous to ‘Is a rock that people use to eat on really a table?’ [However,] That consciousness is a cluster in thingspace rather than a concrete property of the world does not make reducing suffering less important.”

The FRI model seems to imply that suffering is ineffable enough that we can’t have an objective definition, yet sufficiently effable that we can coherently talk and care about it. This attempt to have it both ways seems contradictory, or at least in deep tension.

Indeed, I’d argue that the degree to which you can care about something is proportional to the degree to which you can define it objectively. E.g., if I say that “gnireffus” is literally the most terrible thing in the cosmos, that we should spread gnireffus-focused ethics, and that minimizing g-risks (far-future scenarios which involve large amounts of gnireffus) is a moral imperative, but also that what is and isn’t gnireffus is rather subjective with no privileged definition, and that it’s impossible to objectively tell whether a physical system exhibits gnireffus, you might raise any number of objections. This is not an exact metaphor for FRI’s position, but I worry that FRI’s work leans on the intuition that suffering is real and we can speak coherently about it, to a degree greater than its metaphysics formally allow.

Max Daniel (personal communication) suggests that we’re comfortable with a degree of ineffability in other contexts; “Brian claims that the concept of suffering shares the allegedly problematic properties with the concept of a table. But it seems a stretch to say that the alleged tension is problematic when talking about tables. So why would it be problematic when talking about suffering?” However, if we take the anti-realist view that suffering is ‘merely’ a node in the network of language, we have to live with the consequences of this: that ‘suffering’ will lose meaning as we take it away from the network in which it’s embedded (Wittgenstein). But FRI wants to do exactly this, to speak about suffering in the context of AGIs, simulated brains, even video game characters.

We can be anti-realists about suffering (suffering-is-a-node-in-the-network-of-language), or we can argue that we can talk coherently about suffering in novel contexts (AGIs, mind crime, aliens, and so on), but it seems inherently troublesome to claim we can do both at the same time.

Objection 2: Intuition duels

Two people can agree on FRI’s position that there is no objective fact of the matter about what suffering is (no privileged definition), but this also means they have no way of coming to any consensus on the object-level question of whether something can suffer. This isn’t just an academic point: Brian has written extensively about how he believes non-human animals can and do suffer extensively, whereas Yudkowsky (who holds computationalist views, like Brian) has written about how he’s confident that animals are not conscious and cannot suffer, due to their lack of higher-order reasoning.

And if functionalism is having trouble adjudicating the easy cases of suffering (whether monkeys can suffer, or whether dogs can), it doesn’t have a sliver of a chance at dealing with the upcoming hard cases of suffering: whether a given AGI is suffering, or engaging in mind crime; whether a whole-brain emulation (WBE), synthetic organism, or emergent intelligence that doesn’t have the capacity to tell us how it feels (or that we don’t have the capacity to understand) is suffering; whether any aliens we meet in the future can suffer; whether changing the internal architecture of our qualia reports means we’re also changing our qualia; and so on.

In short, FRI’s theory of consciousness isn’t actually a theory of consciousness at all, since it doesn’t do the thing we need a theory of consciousness to do: adjudicate disagreements in a principled way. Instead, it gives up any claim on the sorts of objective facts which could in principle adjudicate disagreements.

This is a source of friction in EA today, but it’s mitigated by the sense that

(1) The EA pie is growing, so it’s better to ignore disagreements than pick fights;

(2) Disagreements over the definition of suffering don’t really matter yet, since we haven’t gotten into the business of making morally-relevant synthetic beings (that we know of) that might be unable to vocalize their suffering.

If the perception of one or both of these conditions changes, the lack of a disagreement-adjudicating theory of suffering will matter quite a lot.

Objection 3: Convergence requires common truth

Mike: “[W]hat makes one definition of consciousness better than another? How should we evaluate them?”

Brian: “Consilience among our feelings of empathy, principles of non-discrimination, understandings of cognitive science, etc. It’s similar to the question of what makes one definition of justice or virtue better than another.”

Brian is hoping that affective neuroscience will slowly converge to accurate views on suffering as more and better data about sentience and pain accumulates. But convergence to truth implies something (objective) driving the convergence; in this way, Brian’s framework still seems to require an objective truth of the matter, even though he disclaims most of the benefits of assuming this.

Objection 4: Assuming that consciousness is a reification produces more confusion, not less

Brian: “Consciousness is not a reified thing; it’s not a physical property of the universe that just exists intrinsically. Rather, instances of consciousness are algorithms that are implemented in specific steps. … Consciousness involves specific things that brains do.”

Brian argues that we treat consciousness/phenomenology as more ‘real’ than it is. Traditionally, whenever we’ve discovered something is a leaky reification and shouldn’t be treated as ‘too real’, we’ve been able to break it down into more coherent constituent pieces we can treat as real. Life, for instance, wasn’t due to élan vital but a bundle of self-organizing properties & dynamics which generally co-occur. But carrying out this “de-reification” process on consciousness (enumerating its coherent constituent pieces) has proven difficult, especially if we want to preserve some way to speak cogently about suffering.

Speaking for myself, the more I stared into the depths of functionalism, the less certain everything about moral value became; arguably, I see the same trajectory in Brian’s work and Luke Muehlhauser’s report. Their model uncertainty has seemingly become larger, not smaller, as they’ve looked into techniques for how to “de-reify” consciousness while preserving some flavor of moral value. Brian and Luke seem to interpret this as evidence that moral value is intractably complicated, but it is also consistent with consciousness not being a reification at all, and instead being a real thing. Trying to “de-reify” something that’s not a reification will produce deep confusion, just as surely as trying to treat a reification as ‘more real’ than it actually is will.

Edsger W. Dijkstra famously noted that “The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.” So if our ways of talking about moral value fail to ‘carve reality at the joints’, then by all means let’s build better ones, rather than giving up on precision.

Objection 5: The Hard Problem of Consciousness is a red herring

Brian spends a lot of time discussing Chalmers’ “Hard Problem of Consciousness”, i.e. the question of why we’re subjectively conscious, and seems to base at least part of his conclusion on not finding this question compelling— he suggests “There’s no hard problem of consciousness for the same reason there’s no hard problem of life: consciousness is just a high-level word that we use to refer to lots of detailed processes, and it doesn’t mean anything in addition to those processes.” I.e., no ‘why’ is necessary; when we take consciousness and subtract out the details of the brain, we’re left with an empty set.

But I think the “Hard Problem” isn’t helpful as a contrastive centerpiece, since it’s unclear what the problem is, and whether it’s analytic or empirical, a statement about cognition or about physics. At the Qualia Research Institute (QRI), we don’t talk much about the Hard Problem; instead, we talk about Qualia Formalism, or the idea that any phenomenological state can be crisply and precisely represented by some mathematical object. I suspect this would be a better foil for Brian’s work than the Hard Problem.

Objection 6: Mapping to reality

Brian argues that consciousness should be defined at the functional/computational level: given a Turing machine, or neural network, the right ‘code’ will produce consciousness. But the problem is that this doesn’t lead to a theory which can ‘compile’ to physics. Consider the following:

Imagine you have a bag of popcorn. Now shake it. There will exist a certain ad-hoc interpretation of bag-of-popcorn-as-computational-system where you just simulated someone getting tortured, and other interpretations that don’t imply that. Did you torture anyone? If you’re a computationalist, no clear answer exists: you both did, and did not, torture someone. This sounds like a ridiculous edge-case that would never come up in real life, but in reality it comes up all the time, since there is no principled way to *objectively derive* what computation(s) any physical system is performing.
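The observer-relativity of “what computation is this system performing?” can be made concrete with a toy sketch (the state sequence and both mappings below are hypothetical, chosen purely for illustration): the same physical trajectory supports incompatible computational readings, and nothing in the physics privileges one over the other.

```python
# A toy "physical system": a sequence of states, like snapshots of shaken popcorn.
trajectory = [3, 1, 4, 1, 5, 9, 2, 6]

# Interpretation A: read each state's parity as a bit.
def interpret_a(states):
    return [s % 2 for s in states]

# Interpretation B: read each state against an arbitrary threshold.
def interpret_b(states):
    return [int(s > 4) for s in states]

# Same physics, two different "computations" -- and arbitrarily many more exist.
print(interpret_a(trajectory))  # [1, 1, 0, 1, 1, 1, 0, 0]
print(interpret_b(trajectory))  # [0, 0, 0, 0, 1, 1, 0, 1]
```

Any argument that one mapping is the “real” one has to appeal to something outside the physical trajectory itself, which is exactly the computationalist’s problem.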

I don’t think this is an outlandish view of functionalism; Brian suggests much the same in How to Interpret a Physical System as a Mind: “Physicalist views that directly map from physics to moral value are relatively simple to understand. Functionalism is more complex, because it maps from physics to computations to moral value. Moreover, while physics is real and objective, computations are fictional and ‘observer-relative’ (to use John Searle’s terminology). There’s no objective meaning to ‘the computation that this physical system is implementing’ (unless you’re referring to the specific equations of physics that the system is playing out).”

Gordon McCabe (McCabe 2004) provides a more formal argument to this effect— that precisely mapping between physical processes and (Turing-level) computational processes is inherently impossible— in the context of simulations. First, McCabe notes that:

[T]here is a one-[to-]many correspondence between the logical states [of a computer] and the exact electronic states of computer memory. Although there are bijective mappings between numbers and the logical states of computer memory, there are no bijective mappings between numbers and the exact electronic states of memory.

This lack of an exact bijective mapping means that subjective interpretation necessarily creeps in, and so a computational simulation of a physical system can’t be ‘about’ that system in any rigorous way:

In a computer simulation, the values of the physical quantities possessed by the simulated system are represented by the combined states of multiple bits in computer memory. However, the combined states of multiple bits in computer memory only represent numbers because they are deemed to do so under a numeric interpretation. There are many different interpretations of the combined states of multiple bits in computer memory. If the numbers represented by a digital computer are interpretation-dependent, they cannot be objective physical properties. Hence, there can be no objective relationship between the changing pattern of multiple bit-states in computer memory, and the changing pattern of quantity-values of a simulated physical system.

McCabe concludes that, metaphysically speaking,

A digital computer simulation of a physical system cannot exist as, (does not possess the properties and relationships of), anything else other than a physical process occurring upon the components of a computer. In the contemporary case of an electronic digital computer, a simulation cannot exist as anything else other than an electronic physical process occurring upon the components and circuitry of a computer.
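McCabe’s one-to-many point is easy to sketch: many distinct electronic states realize the same logical bit, so the electronic-to-logical map has no inverse. (The voltage threshold below is hypothetical, purely for illustration.)

```python
def logical_bit(voltage):
    # Hypothetical threshold: a continuum of distinct electronic states
    # collapses onto each of just two logical states.
    return 0 if voltage < 1.5 else 1

# Distinct electronic states, identical logical state:
assert logical_bit(0.2) == logical_bit(0.9) == 0
assert logical_bit(2.1) == logical_bit(4.8) == 1
# The map cannot be inverted: knowing the bit is 0 does not recover the voltage,
# so "the number this memory cell holds" is interpretation-dependent.
```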

Where does this leave ethics? In Flavors of Computation Are Flavors of Consciousness, Brian notes that “In some sense all I’ve proposed here is to think of different flavors of computation as being various flavors of consciousness. But this still leaves the question: Which flavors of computation matter most? Clearly whatever computations happen when a person is in pain are vastly more important than what’s happening in a brain on a lazy afternoon. How can we capture that difference?”

But if Brian grants the former point- that “There’s no objective meaning to ‘the computation that this physical system is implementing’”– then this latter task of figuring out “which flavors of computation matter most” is provably impossible. There will always be multiple computational (and thus ethical) interpretations of a physical system, with no way to figure out what’s “really” happening. No way to figure out if something is suffering or not. No consilience; not now, not ever.

Note: despite apparently granting the point above, Brian also remarks that:

I should add a note on terminology: All computations occur within physics, so any computation is a physical process. Conversely, any physical process proceeds from input conditions to output conditions in a regular manner and so is a computation. Hence, the set of computations equals the set of physical processes, and where I say “computations” in this piece, one could just as well substitute “physical processes” instead.

This seems to be (1) incorrect, for the reasons I give above, or (2) taking substantial poetic license with these terms, or (3) referring to hypercomputation (which might be able to salvage the metaphor, but would invalidate many of FRI’s conclusions dealing with the computability of suffering on conventional hardware).

This objection may seem esoteric or pedantic, but I think it’s important, and that it ripples through FRI’s theoretical framework with disastrous effects.


Objection 7: FRI doesn’t fully bite the bullet on computationalism

Brian suggests that “flavors of computation are flavors of consciousness” and that some computations ‘code’ for suffering. But if we do in fact bite the bullet on this metaphor and place suffering within the realm of computational theory, we need to think in “near mode” and accept all the paradoxes that brings. Scott Aaronson, a noted expert on quantum computing, raises the following objections to functionalism:

I’m guessing that many people in this room side with Dennett, and (not coincidentally, I’d say) also with Everett. I certainly have sympathies in that direction too. In fact, I spent seven or eight years of my life as a Dennett/Everett hardcore believer. But, while I don’t want to talk anyone out of the Dennett/Everett view, I’d like to take you on a tour of what I see as some of the extremely interesting questions that that view leaves unanswered. I’m not talking about “deep questions of meaning,” but about something much more straightforward: what exactly does a computational process have to do to qualify as “conscious”?



There’s this old chestnut, what if each person on earth simulated one neuron of your brain, by passing pieces of paper around. It took them several years just to simulate a single second of your thought processes. Would that bring your subjectivity into being? Would you accept it as a replacement for your current body? If so, then what if your brain were simulated, not neuron-by-neuron, but by a gigantic lookup table? That is, what if there were a huge database, much larger than the observable universe (but let’s not worry about that), that hardwired what your brain’s response was to every sequence of stimuli that your sense-organs could possibly receive. Would that bring about your consciousness? Let’s keep pushing: if it would, would it make a difference if anyone actually consulted the lookup table? Why can’t it bring about your consciousness just by sitting there doing nothing?

To these standard thought experiments, we can add more. Let’s suppose that, purely for error-correction purposes, the computer that’s simulating your brain runs the code three times, and takes the majority vote of the outcomes. Would that bring three “copies” of your consciousness into being? Does it make a difference if the three copies are widely separated in space or time—say, on different planets, or in different centuries? Is it possible that the massive redundancy taking place in your brain right now is bringing multiple copies of you into being?



Maybe my favorite thought experiment along these lines was invented by my former student Andy Drucker.  In the past five years, there’s been a revolution in theoretical cryptography, around something called Fully Homomorphic Encryption (FHE), which was first discovered by Craig Gentry.  What FHE lets you do is to perform arbitrary computations on encrypted data, without ever decrypting the data at any point.  So, to someone with the decryption key, you could be proving theorems, simulating planetary motions, etc.  But to someone without the key, it looks for all the world like you’re just shuffling random strings and producing other random strings as output.


You can probably see where this is going.  What if we homomorphically encrypted a simulation of your brain?  And what if we hid the only copy of the decryption key, let’s say in another galaxy?  Would this computation—which looks to anyone in our galaxy like a reshuffling of gobbledygook—be silently producing your consciousness?


When we consider the possibility of a conscious quantum computer, in some sense we inherit all the previous puzzles about conscious classical computers, but then also add a few new ones.  So, let’s say I run a quantum subroutine that simulates your brain, by applying some unitary transformation U.  But then, of course, I want to “uncompute” to get rid of garbage (and thereby enable interference between different branches), so I apply U⁻¹.  Question: when I apply U⁻¹, does your simulated brain experience the same thoughts and feelings a second time?  Is the second experience “the same as” the first, or does it differ somehow, by virtue of being reversed in time? Or, since U⁻¹U is just a convoluted implementation of the identity function, are there no experiences at all here?


Here’s a better one: many of you have heard of the Vaidman bomb.  This is a famous thought experiment in quantum mechanics where there’s a package, and we’d like to “query” it to find out whether it contains a bomb—but if we query it and there is a bomb, it will explode, killing everyone in the room.  What’s the solution?  Well, suppose we could go into a superposition of querying the bomb and not querying it, with only ε amplitude on querying the bomb, and √(1-ε²) amplitude on not querying it.  And suppose we repeat this over and over—each time, moving ε amplitude onto the “query the bomb” state if there’s no bomb there, but moving ε² probability onto the “query the bomb” state if there is a bomb (since the explosion decoheres the superposition).  Then after 1/ε repetitions, we’ll have order 1 probability of being in the “query the bomb” state if there’s no bomb.  By contrast, if there is a bomb, then the total probability we’ve ever entered that state is (1/ε)×ε² = ε.  So, either way, we learn whether there’s a bomb, and the probability that we set the bomb off can be made arbitrarily small.  (Incidentally, this is extremely closely related to how Grover’s algorithm works.)


OK, now how about the Vaidman brain?  We’ve got a quantum subroutine simulating your brain, and we want to ask it a yes-or-no question.  We do so by querying that subroutine with ε amplitude 1/ε times, in such a way that if your answer is “yes,” then we’ve only ever activated the subroutine with total probability ε.  Yet you still manage to communicate your “yes” answer to the outside world.  So, should we say that you were conscious only in the ε fraction of the wavefunction where the simulation happened, or that the entire system was conscious?  (The answer could matter a lot for anthropic purposes.)
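Aaronson’s bomb arithmetic can be checked numerically. The sketch below uses the standard rotation-by-small-angle model of the Elitzur-Vaidman scheme (the specific parameterization is my own, for illustration): without a bomb, the small rotations accumulate coherently; with a bomb, each step projectively measures the “query” branch.

```python
import math

eps = 0.01
steps = round(math.pi / (2 * eps))  # about 1/eps repetitions, up to a constant

# No bomb: rotations by angle eps add up coherently, so after `steps` rotations
# the amplitude on the "query" state is sin(steps * eps), i.e. probability ~ 1.
p_query_no_bomb = math.sin(steps * eps) ** 2

# Bomb present: each step, the "query" branch (probability sin(eps)^2 ~ eps^2)
# is measured, collapsing the state back. Probability of never exploding:
p_safe = (math.cos(eps) ** 2) ** steps
p_explode = 1 - p_safe  # order eps, matching the quoted (1/eps) * eps^2 estimate

print(p_query_no_bomb, p_explode)
```

Shrinking `eps` drives the explosion probability toward zero while keeping the no-bomb detection probability near 1, which is the “interaction-free measurement” property the Vaidman-brain variant then exploits.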

To sum up: Brian’s notion that consciousness is the same as computation raises more issues than it solves; in particular, if suffering is computable, it may also be possible to uncompute (reverse) it, which would suggest s-risks aren’t as serious as FRI treats them.

Objection 8: Dangerous combination

Three themes which seem to permeate FRI’s research are:

(1) Suffering is the thing that is bad.

(2) It’s critically important to eliminate badness from the universe.

(3) Suffering is impossible to define objectively, and so we each must define what suffering means for ourselves.

Taken individually, each of these seems reasonable. Pick two, and you’re still okay. Pick all three, though, and you get A Fully General Justification For Anything, based on what is ultimately a subjective/aesthetic call.

Much can be said in FRI’s defense here, and it’s unfair to single them out as risky: in my experience they’ve always brought a very thoughtful, measured, cooperative approach to the table. I would just note that ideas are powerful, and I think theme (3) is especially pernicious if incorrect.

III. QRI’s alternative

Analytic functionalism is essentially a negative hypothesis about consciousness: it’s the argument that there’s no order to be found, no rigor to be had. It obscures this with talk of “function”, a red herring which it not only fails to define, but admits is undefinable. It doesn’t make any positive assertion. Functionalism is skepticism: nothing more, nothing less.

But is it right?

Ultimately, I think these a priori arguments are much like people in the middle ages arguing whether one could ever formalize a Proper System of Alchemy. Such arguments may in many cases hold water, but it’s often difficult to tell good arguments apart from arguments where we’re just cleverly fooling ourselves. In retrospect, the best way to *prove* systematized alchemy was possible was to just go out and *do* it, and invent Chemistry. That’s how I see what we’re doing at QRI with Qualia Formalism: we’re assuming it’s possible to build stuff, and we’re working on building the object-level stuff.

What we’ve built with QRI’s framework

Note: this is a brief, surface-level tour of our research; it will probably be confusing for readers who haven’t dug into our stuff before. Consider this a down-payment on a more substantial introduction.

My most notable work is Principia Qualia, in which I lay out my meta-framework for consciousness (a flavor of dual-aspect monism, with a focus on Qualia Formalism) and put forth the Symmetry Theory of Valence (STV). Essentially, the STV is an argument that much of the apparent complexity of emotional valence is evolutionarily contingent, and if we consider a mathematical object isomorphic to a phenomenological experience, the mathematical property which corresponds to how pleasant it is to be that experience is the object’s symmetry. This implies a bunch of testable predictions and reinterpretations of things like what ‘pleasure centers’ do (Section XI; Section XII). Building on this, I offer the Symmetry Theory of Homeostatic Regulation, which suggests understanding the structure of qualia will translate into knowledge about the structure of human intelligence, and I briefly touch on the idea of Neuroacoustics.

Likewise, my colleague Andrés Gómez Emilsson has written about the likely mathematics of phenomenology, including The Hyperbolic Geometry of DMT Experiences, Tyranny of the Intentional Object, and Algorithmic Reduction of Psychedelic States. If I had to suggest one thing to read in all of these links, though, it would be the transcript of his recent talk on Quantifying Bliss, which lays out the world’s first method to objectively measure valence from first principles (via fMRI) using Selen Atasoy’s Connectome Harmonics framework, the Symmetry Theory of Valence, and Andrés’s CDNS model of experience.

These are risky predictions and we don’t yet know if they’re right, but we’re confident that if there is some elegant structure intrinsic to consciousness, as there is in many other parts of the natural world, these are the right kind of risks to take.

I mention all this because I think analytic functionalism- which is to say radical skepticism/eliminativism, the metaphysics of last resort- only looks as good as it does because nobody’s been building out any alternatives.

IV. Closing thoughts

FRI is pursuing a certain research agenda, and QRI is pursuing another, and there’s lots of value in independent explorations of the nature of suffering. I’m glad FRI exists, everybody I’ve interacted with at FRI has been great, I’m happy they’re focusing on s-risks, and I look forward to seeing what they produce in the future.

On the other hand, I worry that nobody’s pushing back on FRI’s metaphysics, which seem to unavoidably lead to the intractable problems I describe above. FRI seems to believe these problems are part of the territory, unavoidable messes that we just have to make philosophical peace with. But I think that functionalism is a bad map, that the metaphysical messes it leads to are much worse than most people realize (fatal to FRI’s mission), and there are other options that avoid these problems (which, to be fair, is not to say they have no problems).

Ultimately, FRI doesn’t owe me a defense of their position. But if they’re open to suggestions on what it would take to convince a skeptic like me that their brand of functionalism is viable, or at least rescuable, I’d offer the following:

Re: Objection 1 (motte-and-bailey), I suggest FRI should be as clear and complete as possible in their basic definition of suffering. In which particular ways is it ineffable/fuzzy, and in which particular ways is it precise? What can we definitely say about suffering, and what can we definitely never determine? Preregistering ontological commitments and methodological possibilities would help guard against FRI’s definition of suffering changing based on context.

Re: Objection 2 (intuition duels), FRI may want to internally “war game” various future scenarios involving AGI, WBE, etc, with one side arguing that a given synthetic (or even extraterrestrial) organism is suffering, and the other side arguing that it isn’t. I’d expect this would help diagnose what sorts of disagreements future theories of suffering will need to adjudicate, and perhaps illuminate implicit ethical intuitions. Sharing the results of these simulated disagreements would also be helpful in making FRI’s reasoning less opaque to outsiders, although making everything transparent could lead to certain strategic disadvantages.

Re: Objection 3 (convergence requires common truth), I’d like FRI to explore exactly what might drive consilience/convergence in theories of suffering, and what precisely makes one theory of suffering better than another, and ideally to evaluate a range of example theories of suffering under these criteria.

Re: Objection 4 (assuming that consciousness is a reification produces more confusion, not less), I would love to see a historical treatment of reification: lists of reifications which were later dissolved (e.g., élan vital), vs scattered phenomena that were later unified (e.g., electromagnetism). What patterns do the former have, vs the latter, and why might consciousness fit one of these buckets better than the other?

Re: Objection 5 (the Hard Problem of Consciousness is a red herring), I’d like to see a more detailed treatment of what kinds of problem people have interpreted the Hard Problem as, and also more analysis on the prospects of Qualia Formalism (which I think is the maximally-empirical, maximally-charitable interpretation of the Hard Problem). It would be helpful for us, in particular, if FRI preregistered their expectations about QRI’s predictions, and their view of the relative evidence strength of each of our predictions.

Re: Objection 6 (mapping to reality), this is perhaps the heart of most of our disagreement. From Brian’s quotes, he seems split on this issue; I’d like clarification about whether he believes we can ever precisely/objectively map specific computations to specific physical systems, and vice-versa. If so, how? If not, this seems to propagate through FRI’s ethical framework in a disastrous way, since anyone can argue that any physical system does, or does not, ‘code’ for massive suffering, and there’s no principled way to derive any ‘ground truth’, or even to pick between interpretations (e.g. my popcorn example). If this isn’t the case, why not?

Brian has suggested that “certain high-level interpretations of physical systems are more ‘natural’ and useful than others” (personal communication); I agree, and would encourage FRI to explore systematizing this.

It would be non-trivial to port FRI’s theories and computational intuitions to the framework of “hypercomputation”– i.e., the understanding that there’s a formal hierarchy of computational systems, and that Turing machines are only one level of many– but it may have benefits too. Namely, it might be the only way they could avoid Objection 6 (which I think is a fatal objection) while still allowing them to speak about computation & consciousness in the same breath. I think FRI should look at this and see if it makes sense to them.

Re: Objection 7 (FRI doesn’t fully bite the bullet on computationalism), I’d like to see responses to Aaronson’s aforementioned thought experiments.

Re: Objection 8 (dangerous combination), I’d like to see a clarification about why my interpretation is unreasonable (as it very well may be!).


In conclusion: I think FRI has a critically important goal- the reduction of suffering & s-risk. However, I also think FRI has painted itself into a corner by explicitly disallowing a clear, disagreement-mediating definition of what these things are. I look forward to further work in this field.


Mike Johnson

Qualia Research Institute

Acknowledgements: thanks to Andrés Gómez Emilsson, Brian Tomasik, and Max Daniel for reviewing earlier drafts of this.


My sources for FRI’s views on consciousness:
- Flavors of Computation are Flavors of Consciousness
- Is There a Hard Problem of Consciousness?
- Consciousness Is a Process, Not a Moment
- How to Interpret a Physical System as a Mind
- Dissolving Confusion about Consciousness
- Debate between Brian & Mike on consciousness
- Max Daniel’s EA Global Boston 2017 talk on s-risks
- Multipolar debate between Eliezer Yudkowsky and various rationalists about animal suffering
- The Internet Encyclopedia of Philosophy on functionalism
- Gordon McCabe on why computation doesn’t map to physics
- Toby Ord on hypercomputation, and how it differs from Turing’s work
- Luke Muehlhauser’s OpenPhil-funded report on consciousness and moral patienthood
- Scott Aaronson’s thought experiments on computationalism
- Selen Atasoy on Connectome Harmonics, a new way to understand brain activity

My work on formalizing phenomenology:
- My meta-framework for consciousness, including the Symmetry Theory of Valence
- My hypothesis of homeostatic regulation, which touches on why we seek out pleasure
- My exploration & parametrization of the ‘neuroacoustics’ metaphor suggested by Atasoy’s work

My colleague Andrés’s work on formalizing phenomenology:
- A model of DMT-trip-as-hyperbolic-experience
- June 2017 talk at Consciousness Hacking, describing a theory and experiment to predict people’s valence from fMRI data
- A parametrization of various psychedelic states as operators in qualia space
- A brief post on valence and the fundamental attribution error
- A summary of some of Selen Atasoy’s current work on Connectome Harmonics

The Forces At Work

        Recreational agents which are legal and socially sanctioned by respectable society aren’t, of course, popularly viewed as drugs at all. The nicotine addict and the alcoholic don’t think of themselves as practising psychopharmacologists; and so alas their incompetence is frequently lethal.

        Is such incompetence curable? If it is, and if the abolitionist project can be carried forward with pharmacotherapy in advance of true genetic medicine, then a number of preconditions must first be in place. A necessary and sufficient set could not possibly be listed here. It is still worth isolating and examining below several distinct yet convergent societal trends of huge potential significance.

  1. First, it must be assumed that we will continue to seek out and use chemical mood-enhancers on a massive, species-wide scale.
  2. Second, a pioneering and pharmacologically (semi-)literate elite will progressively learn to use their agents of choice in a much more effective, safe and rational manner. The whole pharmacopoeia of licensed and unlicensed medicines will be purchasable globally over the Net. As the operation of our 30,000 plus genes is unravelled, the new discipline of pharmacogenomics will allow drugs to be personally tailored to the genetic makeup of each individual. Better still, desirable states of consciousness that can be induced pharmacologically can later be pre-coded genetically.
  3. Third, society will continue to fund and support research into genetic engineering, reproductive medicine and all forms of biotechnology. This will enable the breathtaking array of designer-heavens on offer from third-millennium biomedicine to become a lifestyle choice.
  4. Fourth, the ill-fated governmental War On (some) Drugs will finally collapse under the weight of its own contradictions. Parents are surely right to be anxious about many of today’s illegal intoxicants. Yet their toxicity will no more prove a reason to give up the dream of Better Living Through Chemistry than the casualties of early modern medicine are a reason to abandon contemporary medical science for homeopathy.
  5. Fifth, the medicalisation of everyday life, and of the human predicament itself, will continue apace. All manner of currently ill-defined discontents will be medically diagnosed and classified. Our innumerable woes will be given respectable clinical labels. Mass-medicalisation will enable the big drug companies aggressively to extend their lucrative markets in medically-approved psychotropics to a widening clientele. New and improved mood-modulating alleles, and other innovative gene-therapies for mood- and intellect-enrichment, will be patented. They will be brought to market by biotechnology companies eager to cure the psychopathologies of the afflicted; and to maximise profits.
  6. Sixth, in the next few centuries an explosive proliferation of ever-more sophisticated virtual reality software products will enable millions, and then billions, of people to live out their ideal fantasies. Paradoxically, as will be seen, the triumph of sensation-driven wish-fulfilment in immersive VR will also demonstrate the intellectual bankruptcy of our old Peripheralist nostrums of social reform. Unhappiness will persist. The hedonic treadmill can’t succumb to computer software.
  7. Seventh, secularism and individualism will triumph over resurgent Islamic and Christian fundamentalism. An entitlement to lifelong well-being in this world, rather than the next, will take on the status of a basic human right.

         There are quite a few imponderables here. Futurology is not, and predictably will never become, one of the exact sciences. Conceivably, one can postulate, for instance, the global triumph of an anti-scientific theocracy. This might be in the mould of the American religious right; or even some kind of Islamic fundamentalism. Less conceivably, there might be a global victory of tender-minded humanism over the onward march of biotechnical determinism. It is also possible that non-medically-approved drug use could be curtailed, at least for a time, with intrusive personal surveillance technologies and punishments of increasingly draconian severity. Abetted by the latest convulsion of moral panic over Drugs, for example, a repressive totalitarian super-state could institute a regime of universal compulsory blood-tests for banned substances. Enforced “detoxification” in rehabilitation camps for offenders would follow.

        These scenarios and their variants are almost certainly too alarmist. Given a pervasive ethos of individualism, and the worldwide spread of hedonistic consumer-capitalism, then as soon as people discover that there is no biophysical reason on earth why they can’t be as happy as they choose indefinitely, it will be hard to stop more adventurous spirits from exploring that option. Lifelong ecstasy isn’t nearly as bad as it sounds.

David Pearce in The Hedonistic Imperative (chapter 3)



Desiring that the universe be turned into Hedonium is the straightforward implication of realizing that everything wants to become music.

The problem is… the world-simulations instantiated by our brains are really good at hiding from us the what-it-is-likeness of peak experiences. Like Buddhist enlightenment, language can only serve as a pointer to the real deal. So how do we use it to point to Hedonium? Here is a list of experiences, concepts and dynamics that (personally) give me at least a sort of intuition pump for what Hedonium might be like. Just remember that it is way beyond any of this:

Positive-sum games, rainbow light, a lover’s everlasting promise of loyalty, hyperbolic harmonics, non-epiphenomenal bliss, life as a game, fractals, children’s laughter, dreamless sleep, the enlightenment of emptiness, loving-kindness directed towards all sentient beings of past, present, and future, temperate wind caressing branches and leaves of trees in a rainforest, perfectly round spheres, visions of a giant yin-yang representing the cosmic balance of energies, Ricci flow, transpersonal experiences, hugging a friend on MDMA, believing in a loving God, paraconsistent logic-transcending Nirvana, the silent conspiracy of essences, eating a meal with every flavor and aroma found in the quantum state-space of qualia, Enya (Caribbean Blue, Orinoco Flow), seeing all the grains of sand in the world at once, funny jokes made of jokes made of jokes made of jokes…, LSD on the beach, becoming lighter-than-air and flying like a balloon, topological non-orientable chocolate-filled cookies, invisible vibrations of love, the source of all existence infinitely reflecting itself in the mirror of self-awareness, super-symmetric experiences, Whitney bottles, Jhana bliss, existential wonder, fully grasping a texture, proving Fermat’s Last Theorem, knowing why there is something rather than nothing, having a benevolent social super-intelligence as a friend, a birthday party with all your dead friends, knowing that your family wants the best for you, a vegan Christmas eve, petting your loving dog, the magic you believed in as a kid, being thanked for saving the life of a stranger, Effective Altruism, crying over the beauty and innocence of pandas, letting your parents know that you love them, learning about plant biology, tracing Fibonacci spirals, comprehending cross-validation (the statistical technique that makes statistics worth learning), reading The Hedonistic Imperative by David Pearce, finding someone who can truly understand you, realizing you can give up your addictions, being set free from prison, Time Crystals, figuring out Open Individualism, G/P-spot orgasm, the qualia of existential purpose and meaning, inventing a graph clustering algorithm, rapture, obtaining a new sense, learning to program in Python, empty space without limit extending in all directions, self-aware nothingness, living in the present moment, non-geometric paradoxical universes, impossible colors, the mantra of Avalokiteshvara, clarity of mind, being satisfied with merely being, experiencing vibrating space groups in one’s visual field, toroidal harmonics, Gabriel’s Oboe by Ennio Morricone, having a traditional dinner prepared by your loving grandmother, thinking about existence at its very core: being as apart from essence and presence, interpreting pop songs by replacing the “you” with an Open Individualist eternal self, finding the perfect middle point between female and male energies in a cosmic orgasm of selfless love, and so on.

The Binding Problem

[Our] subjective conscious experience exhibits a unitary and integrated nature that seems fundamentally at odds with the fragmented architecture identified neurophysiologically, an issue which has come to be known as the binding problem. For the objects of perception appear to us not as an assembly of independent features, as might be suggested by a feature based representation, but as an integrated whole, with every component feature appearing in experience in the proper spatial relation to every other feature. This binding occurs across the visual modalities of color, motion, form, and stereoscopic depth, and a similar integration also occurs across the perceptual modalities of vision, hearing, and touch. The question is what kind of neurophysiological explanation could possibly offer a satisfactory account of the phenomenon of binding in perception?
One solution is to propose explicit binding connections, i.e. neurons connected across visual or sensory modalities, whose state of activation encodes the fact that the areas that they connect are currently bound in subjective experience. However this solution merely compounds the problem, for it represents two distinct entities as bound together by adding a third distinct entity. It is a declarative solution, i.e. the binding between elements is supposedly achieved by attaching a label to them that declares that those elements are now bound, instead of actually binding them in some meaningful way.
Von der Malsburg proposes that perceptual binding between cortical neurons is signalled by way of synchronous spiking, the temporal correlation hypothesis (von der Malsburg & Schneider 1986). This concept has found considerable neurophysiological support (Eckhorn et al. 1988, Engel et al. 1990, 1991a, 1991b, Gray et al. 1989, 1990, 1992, Gray & Singer 1989, Stryker 1989). However, although these findings are suggestive of some significant computational function in the brain, the temporal correlation hypothesis, as proposed, is little different from the binding-label solution, the only difference being that the label is defined by a new channel of communication, i.e. by way of synchrony. In information-theoretic terms, this is no different from saying that connected neurons possess two separate channels of communication, one to transmit feature detection, and the other to transmit binding information. The fact that one of these channels uses a synchrony code instead of a rate code sheds no light on the essence of the binding problem. Furthermore, as Shadlen & Movshon (1999) observe, the temporal binding hypothesis is not a theory about how binding is computed, but only how binding is signaled, a solution that leaves the most difficult aspect of the problem unresolved.
I propose that the only meaningful solution to the binding problem must involve a real binding, as implied by the metaphorical name. A glue that is supposed to bind two objects together would be most unsatisfactory if it merely labeled the objects as bound. The significant function of glue is to ensure that a force applied to one of the bound objects will automatically act on the other one also, to ensure that the bound objects move together through the world even when one, or both of them are being acted on by forces. In the context of visual perception, this suggests that the perceptual information represented in cortical maps must be coupled to each other with bi-directional functional connections in such a way that perceptual relations detected in one map due to one visual modality will have an immediate effect on the other maps that encode other visual modalities. The one-directional axonal transmission inherent in the concept of the neuron doctrine appears inconsistent with the immediate bi-directional relation required for perceptual binding. Even the feedback pathways between cortical areas are problematic for this function due to the time delay inherent in the concept of spike train integration across the chemical synapse, which would seem to limit the reciprocal coupling between cortical areas to those within a small number of synaptic connections. The time delays across the chemical synapse would seem to preclude the kind of integration apparent in the binding of perception and consciousness across all sensory modalities, which suggests that the entire cortex is functionally coupled to act as a single integrated unit.
— Section 5 of “Harmonic Resonance Theory: An Alternative to the ‘Neuron Doctrine’ Paradigm of Neurocomputation to Address Gestalt properties of perception” by Steven Lehar

Schrödinger’s Neurons: David Pearce at the “2016 Science of Consciousness” conference in Tucson



Mankind’s most successful story of the world, natural science, leaves the existence of consciousness wholly unexplained. The phenomenal binding problem deepens the mystery. Neither classical nor quantum physics seem to allow the binding of distributively processed neuronal micro-experiences into unitary experiential objects apprehended by a unitary phenomenal self. This paper argues that if physicalism and the ontological unity of science are to be saved, then we will need to revise our notions of both 1) the intrinsic nature of the physical and 2) the quasi-classicality of neurons. In conjunction, these two hypotheses yield a novel, bizarre but experimentally testable prediction of quantum superpositions (“Schrödinger’s cat states”) of neuronal feature-processors in the CNS at sub-femtosecond timescales. An experimental protocol using in vitro neuronal networks is described to confirm or empirically falsify this conjecture via molecular matter-wave interferometry.


For more see: https://www.physicalism.com/


(cf. Qualia Computing in Tucson: The Magic Analogy)


(Trivia: David Chalmers is one of the attendees of the talk and asks a question at 24:03.)

Beyond Turing: A Solution to the Problem of Other Minds Using Mindmelding and Phenomenal Puzzles

Here is my attempt at providing an experimental protocol to determine whether an entity is conscious.

If you are just looking for the stuffed animal music video, skip to 23:28.

Are you the only conscious being in existence? How could we actually test whether other beings have conscious minds?

Turing proposed to test the existence of other minds by measuring their verbal indistinguishability from humans (the famous “Turing Test” asks computers to pretend to be humans and checks if humans buy the impersonations). Others have suggested the solution is as easy as connecting your brain to the brain of the being you want to test.

But these approaches fail for a variety of reasons. Turing tests can be beaten by dream characters and mindmelds might merely work by giving you a “hardware upgrade”. There is no guarantee that the entity tested will be conscious on its own. As pointed out by Brian Tomasik and Eliezer Yudkowsky, even if the information content of your experience increases significantly by mindmelding with another entity, this could still be the result of the entity’s brain working as an exocortex: it is completely unconscious on its own yet capable of enhancing your consciousness.

In order to go beyond these limiting factors, I developed the concept of a “phenomenal puzzle”. These are problems that can only be solved by a conscious being in virtue of requiring inner qualia operations for their solution. For example, a phenomenal puzzle is to arrange qualia values of phenomenal color in a linear map where the metric is based on subjective Just Noticeable Differences.
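To give a concrete (and purely illustrative) feel for the structure of such a puzzle, here is a toy sketch in Python. All numbers and color labels are invented stand-ins: it pretends we already have pairwise JND counts between a few color stimuli that happen to lie on a line, and shows how the linear arrangement could be recovered from those counts alone.

```python
# Toy reconstruction of a "linear map" of color qualia from pairwise
# just-noticeable-difference (JND) counts. All data here are invented
# for illustration; real JNDs would come from psychophysical reports.
from itertools import combinations

colors = ["red", "orange", "yellow", "green", "blue"]

# Hidden ground truth: each stimulus sits at a position on a line, and
# the JND count between two stimuli is the distance between positions.
pos = {"red": 0, "orange": 2, "yellow": 5, "green": 9, "blue": 12}
jnd = {frozenset((a, b)): abs(pos[a] - pos[b])
       for a, b in combinations(colors, 2)}

def dist(a, b):
    """Pairwise JND count between two stimuli (0 for identical ones)."""
    return 0 if a == b else jnd[frozenset((a, b))]

# Recovery using only the JND table: the two most-distant stimuli are
# the endpoints of the line; every other stimulus is placed by its
# distance from one endpoint.
endpoints = max(combinations(colors, 2), key=lambda p: dist(*p))
order = sorted(colors, key=lambda c: dist(endpoints[0], c))
print(order)
```

The point of the puzzle, of course, is that a real solver would have to generate the JND table itself by introspectively comparing color qualia, which is exactly the step this sketch fakes with an invented `pos` dictionary.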

To conduct the experiment you need:

  1. A phenomenal bridge (e.g. a biological neural network that connects your brain to someone else’s brain so that both brains now instantiate a single consciousness).
  2. A qualia calibrator (a device that allows you to cycle through many combinations of qualia values quickly so that you can compare the sensory-qualia mappings in both brains and generate a shared vocabulary for qualia values).
  3. A phenomenal puzzle (as described above).
  4. The right set and setting: the use of a proper protocol.

Here is an example protocol that works for 4) – though there may be other ones that work as well. Assume that you are person A and you are trying to test if B is conscious:

A) Person A learns about the phenomenal puzzle but is not given enough time to solve it.
B) Person A and B mindmeld using the phenomenal bridge, creating a new being AB.
C) AB tells the phenomenal puzzle to itself (by remembering it from A’s narrative).
D) A and B get disconnected and A is sedated (to prevent A from solving the puzzle).
E) B tries to solve the puzzle on its own (the use of computers not connected to the internet is allowed to facilitate self-experimentation).
F) When B claims to have solved it A and B reconnect into AB.
G) AB then tells the solution to itself so that the records of it in B’s narrative get shared with A’s brain memory.
H) Then A and B get disconnected again and if A is able to provide the answer to the phenomenal puzzle, then B must have been conscious!
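The information flow of steps A)–H) can be modeled with a small toy simulation (everything below is invented for illustration, with consciousness crudely reduced to a boolean flag): each party’s knowledge is a set, mindmelding unions the sets, and A ends up holding the solution only if B actually produced one while disconnected.

```python
# Toy model of the mindmelding protocol. Each agent's knowledge is a
# set of items; melding shares knowledge between A and B. Whether B can
# solve the puzzle on its own is modeled as a boolean flag, standing in
# for the claim that only a conscious being can solve it.

def run_protocol(b_is_conscious: bool) -> bool:
    a_knows = {"puzzle"}   # Step A: A learns the puzzle (but can't solve it yet).
    b_knows = set()
    b_knows |= a_knows     # Steps B-C: meld into AB; the puzzle is shared.
    # Steps D-E: disconnect (A sedated); only a conscious B can perform
    # the inner qualia operations the puzzle requires.
    if b_is_conscious and "puzzle" in b_knows:
        b_knows.add("solution")
    a_knows |= b_knows     # Steps F-G: re-meld; B's records reach A.
    # Step H: disconnect again; A reports the solution if it now has one.
    return "solution" in a_knows

print(run_protocol(True))   # conscious B: positive result
print(run_protocol(False))  # unconscious B: no false positive
```

The sketch only captures the bookkeeping, not the philosophy: the load-bearing assumption is the `b_is_conscious` gate, i.e. that solving the phenomenal puzzle genuinely requires inner qualia operations.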

To my knowledge, this is the only test of consciousness for which a positive result is impossible (or perhaps just extremely difficult) to explain unless B is conscious.

Of course B could be conscious but not smart enough to solve the phenomenal puzzle. The test simply guarantees that there will be no false positives. Thus it is not a general test for qualia – but it is a start. At least we can now conceive of a way to know (in principle) whether some entities are conscious (even if we can’t tell that any arbitrary entity is). Still, a positive result would completely negate solipsism, which would undoubtedly be a great philosophical victory.


Vanished are the veils of light and shade,
Lifted the vapors of sorrow,
Sailed away the dawn of fleeting joy,
Gone the mirage of the senses.
Love, hate, health, disease, life and death –
Departed, these false shadows on the screen
of duality.
Waves of laughter, scyllas of sarcasm, whirlpools
of melancholy,
Melting in the vast sea of bliss.
Bestilled is the storm of maya
By the magic wand of intuition deep.
The universe, a forgotten dream, lurks
Ready to invade my newly wakened memory divine.
I exist without the cosmic shadow,
But it could not live bereft of me;
As the sea exists without the waves,
But they breathe not without the sea.
Dreams, wakings, states of deep turiya sleep,
Present, past, future, no more for me,
But the ever-present, all-flowing, I, I everywhere.
Consciously enjoyable,
Beyond the imagination of all expectancy,
Is this, my samadhi state.
Planets, stars, stardust, earth,
Volcanic bursts of doomsday cataclysms,
Creation’s moulding furnace,
Glaciers of silent X-rays,
Burning floods of electrons,
Thoughts of all men, past, present, future,
Every blade of grass, myself and all,
Each particle of creation’s dust,
Anger, greed, good, bad, salvation, lust,
I swallowed up – transmuted them
Into one vast ocean of blood of my own one Being!
Smoldering joy, oft-puffed by unceasing meditation,
Which blinded my tearful eyes,
Burst into eternal flames of bliss,
And consumed my tears, my peace, my frame,
my all.
Thou art I, I am Thou,
Knowing, Knower, Known, as One!
One tranquilled, unbroken thrill of eternal, living, ever-new peace!
Not an unconscious state
Or mental chloroform without wilful return,
Samadhi but extends my realm of consciousness
Beyond the limits of my mortal frame
To the boundaries of eternity,
Where I, the Cosmic Sea,
Watch the little ego floating in Me.
Not a sparrow, nor a grain of sand, falls
without my sight
All space floats like an iceberg in my mental sea.
I am the Colossal Container of all things made!
By deeper, longer, continuous, thirsty,
guru – given meditation,
This celestial samadhi is attained.
All the mobile murmurs of atoms are heard;
The dark earth, mountains, seas are molten liquid!
This flowing sea changes into vapors of nebulae!
Aum blows o’er the vapors; they open their veils,
Revealing a sea of shining electrons,
Till, at the last sound of the cosmic drum,
Grosser light vanishes into eternal rays
Of all-pervading Cosmic Joy.
From Joy we come,
For Joy we live,
In the sacred Joy we melt.
I, the ocean of mind, drink all creation’s waves.
The four veils of solid, liquid, vapor, light,
Lift aright.
Myself, in everything,
Enters the Great Myself.
Gone forever,
The fitful, flickering shadows of a mortal memory.
Spotless is my mental sky,
Below, ahead, and high above.
Eternity and I, one united ray.
I, a tiny bubble of laughter,
Have become the Sea of Mirth Itself.


– Songs of the Soul. Paramahansa Yogananda (source)

(cf. Ontological Qualia: The Future of Personal Identity)

Empathetic Super-Intelligence

I suspect quite a lot of AI researchers would think of empathy as a, kind of, second-rate form of intelligence, and will confuse it with the mere personality dimension of agreeableness. But in fact empathetic understanding is extraordinarily cognitively demanding. It is worth recalling that what it is like to be another subject of experience, and what it is like to be able to apprehend fourth, fifth, and even - in the case of someone like Shakespeare - sixth-order intentionality, is as much a property of the natural world as the rest mass of the electron. So insofar as one wants to actually understand the nature of the natural world, one is going to want to understand other subjects of experience.


Now most of us - and this is true for good evolutionary reasons - are not mirror-touch synesthetes. We don’t feel the pain of others as if it were our own; we don’t experience their perspectives as if they were our own. But… I think with a greater full-spectrum super-intelligence, as one actually comes to understand the perspectives of other subjects of experience, as one starts to obtain this god-like capacity to impartially understand all possible subjects of experience, this will entail expanding our circle of compassion. It is not possible, if you are a mirror-touch synesthete, to act in a hostile way toward another sentient being, and I would see a generalization of mirror-touch synesthesia as part and parcel of being a full-spectrum super-intelligence.


Any supposedly intelligent being that doesn’t understand the nature of consciousness -that doesn’t understand that there are other subjects of experience as real as you or me right now- is in a fundamental sense ignorant. And if we are talking about super-intelligence, as distinct from savant minds or insentient malware systems, one must remember that by definition a super-intelligence is not ignorant.


– David Pearce, in The Mind of David Pearce

David is responding to a question I posed about the relationship between empathy and super-intelligence. One could in principle imagine an Artificial Intelligence system that is capable of designing nanotechnology without ever being aware of other minds. The AI could take over the world from the ground up and never suspect that anyone actually lived in it. The AI could simply model other agents to the extent that is necessary to predict their behavior within the parameters that define the conditions of its success and failure. No need to experience the colors, aromas, feelings and thoughts of humans when you can approximate them well enough with a Bayesian system trained on past observed behaviors.

A lot of people seem to be worried about this sort of scenario. Admittedly, if one thinks that all there is to intelligence is the capacity to optimize a given utility function, then yes, super-intelligences could in principle be completely ignorant of the facts of the matter about the qualia we are experiencing right now. AI safety organizations and researchers mostly care about this sort of intelligence. As far as the safety concerns go, I think this is fair enough. The problem comes when the view that intelligence is nothing more than utility function maximization spills over into one’s full conception of what intelligence is and will always be.

The contention here, I think, is the way we conceptualize intelligence. Depending on one’s background assumptions one will end up with some or other idea of what this concept can or cannot mean:

On the one hand, if one starts by exclusively caring about the way autonomous systems behave from a third-person point of view, and in turn disregards the computational importance of consciousness, then any talk of “a deep understanding of the nature of others’ experiences” will seem completely beside the point. On the other hand, if one starts from an empathetic mindset that acknowledges the reality (and ethical weight) of the vastness of the state-space of consciousness, one may prefer to define intelligence in terms of empathetic understanding, i.e. as the capacity to explore, navigate, contrast, compare and utilize arbitrarily alien state-spaces of experience for computational and aesthetic purposes.

Most likely, an enriched conception of genuine super-intelligence entails a seamless blend of a deep capacity for introspection and empathy with the extraordinary power of formal logico-linguistic reasoning. Only by combining the empathizing and systematizing styles of mental activity, together with a systematic exploration of the state-space of consciousness, can we obtain a full picture of the true potential that lies in front of us: Full-Spectrum Super-Sentient Super-Intelligence.

Wireheading Done Right: Stay Positive Without Going Insane

Wireheads are beings who have changed their reward architecture in order to be happy all the time. Unfortunately, few people are making a serious effort to steelman the case for wireheading. The concept tends to be a conversation stopper, and is frequently used as a reductio ad absurdum of valence utilitarianism. Hedonism is a low-status philosophy at present, but this may be the result of what amounts to dumb reasons (i.e. going against it signals intellectual sophistication). Let’s be meta-contrarian for a moment and think critically about it. What would a good case for wireheading look like? In what follows I will (1) provide an account of what is known about emotional dynamics over time, (2) discuss the known pitfalls of current wireheading methods, (3) propose a system to overcome these pitfalls, and (4) make the case that combining wireheading (done right) with a systematic exploration of the state-space of consciousness might ultimately be our saving grace against the perils of Darwinism at the evolutionary limit.

Let us begin by enriching our understanding of the nature of bliss and its temporal dynamics:

The Cube of Euphoria

A little over a year ago I conducted a study to figure out the main dimensions along which psychotropic drugs influence people. The State-Space of Drug Effects consists of six main dimensions: fast euphoria, slow euphoria, spiritual euphoria, clarity, perception of overall value, and external vs. internal source of interest. The first three dimensions are directly related to pleasure, which makes them relevant for our current discussion.

Fast euphoria is what you get when you take stimulants, exercise or anticipate that something great is about to happen. Slow euphoria is what you experience if you take opioids or depressants, receive massages or hug a loved one. Spiritual/philosophical euphoria changes less frequently relative to the daily comings and goings of the other two. It is a state of consciousness related to the way we represent “the big picture”. Those who seek it try to induce it by methods that include philosophical thinking, spiritual practices and/or psychedelic drug use.

Two of these three dimensions are equivalent to the well-studied emotion classification space of valence and arousal (also called core affect). Valence is how good the experience feels, whereas arousal refers to the intensity of the experience. It turns out that one can obtain the slow-fast projection of the cube of euphoria by changing the basis used to represent the valence-arousal space: simply rotate the valence-arousal axes by 45 degrees:

As we can see, fast euphoria is equivalent to “high-valence high-arousal” while slow euphoria is equivalent to “high-valence low-arousal”. This basis is not uncommon in affective psychology; when it is used, the axes are usually labeled “positive and negative activation”. We will use a yellow-red circle to represent fast euphoria and a blue-green circle to represent slow euphoria. I chose this color-coding by reasoning that warm colors are a better representation of ecstatic states of consciousness, whereas cool colors better illustrate the feelings of cooling off and relaxing. I happen to prefer the fast-slow basis because it highlights the different kinds of euphoria in a way that captures behavioral differences. This will be important when we get to steelmanning wireheading later on.
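The change of basis can be sketched numerically. This is an illustrative calculation only (the axis names, signs, and scaling are my own choices); it shows how a 45-degree rotation maps valence-arousal coordinates onto fast/slow euphoria coordinates:

```python
import numpy as np

# Illustrative sketch: the fast/slow euphoria axes as a 45-degree
# rotation of the valence-arousal basis.
theta = np.pi / 4
R = np.array([[np.cos(theta),  np.sin(theta)],   # fast ~ valence + arousal
              [np.cos(theta), -np.sin(theta)]])  # slow ~ valence - arousal

def to_fast_slow(valence, arousal):
    """Project a (valence, arousal) point onto the fast/slow euphoria axes."""
    return R @ np.array([valence, arousal])

# High-valence high-arousal lands almost entirely on the fast axis...
fast, slow = to_fast_slow(1.0, 1.0)
# ...while high-valence low-arousal lands on the slow axis.
fast2, slow2 = to_fast_slow(1.0, -1.0)
```

Under this rotation, a point with high valence and high arousal projects entirely onto the fast-euphoria axis, while high valence with low arousal projects onto the slow-euphoria axis, matching the correspondence described above.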

Formalizing the Hedonic Treadmill: Negative Feedback Mechanisms

It is well known that in the long run the things that happen to you have a surprisingly small effect on your overall level of happiness. One tends to orbit around one’s hedonic set-point (our mean valence and arousal values). Although our average sense of wellbeing does change from context to context (in response to variables such as stress, novelty, drug regimens, accomplishments, and opportunities for meaningful relationships), the environmental effect is usually washed out over time by one’s internal negative feedback mechanisms. The ability to achieve lasting happiness, it turns out, was not as evolutionarily adaptive in our ancestral environment as the robust re-centering of affective dynamics that ended up governing our patterns of wellbeing. Thankfully, though unfairly, we are not all equally miserable; some people are lucky enough to be born hyperthymic and enjoy life the majority of the time. Genetically determined pain thresholds not only influence how one responds to physical discomfort, but also predict the size of one’s social network (presumably by making social rejection less taxing).

Less well known is that people have different values for their valence-arousal correlation. According to a 2007 study by Peter Kuppens, the conventional wisdom in affective psychology that valence and arousal are uncorrelated is not quite correct. For about 30% of people valence is negatively correlated with arousal, for another 30% the correlation is positive, and for the remaining 40% there is no correlation between the two dimensions.

This means that some people usually experience high valence (i.e. feel good) at the same time as being in an up-beat high energy state, and when they feel bad they tend to also have low levels of energy. On the other extreme there are those who experience bliss by tuning the energy down and relaxing, and primarily experience bad feelings in the form of high-energy states (such as irritation, worry and anger).


The study showed that the correlation between valence and arousal was person-specific (negative for ~30%, positive for ~30%, no correlation for ~40% of people).

What else varies across people? As it turns out, the transition patterns of core affect are related to personality factors. A person’s level of variance in the valence dimension is an important component of neuroticism. Although most neurotics tend to hang out in low-valence states, there are indeed very happy neurotics whose problem is not that they feel bad, but that great feelings are too short-lasting and unpredictable. It is the unpredictability of valence, rather than its absolute value, that results in the coping mechanisms typical of this dimension.

Likewise, higher variability in arousal is a component of extraversion (SEE I AM SCREAMING NOW, for example). Openness to experience can be understood in terms of novelty-triggered increases in valence, so that more open individuals are more likely to experience euphoria of all kinds when learning new information, relative to people who describe themselves as conventional. Conscientious individuals feel very rewarded when they complete a laborious task (but may experience more intense shame if they do not finish it on time). Agreeableness is undeniably connected to a positive perception of other people: if one feels that others are right and deserve to exist, one is more likely to cooperate, and the way to have positive perceptions of others is to increase the hedonic tone of one’s interpersonal representations. In brief, core affect dynamics can be used to capture otherwise hard-to-describe properties of the various personality factors. Each factor has a signature behavior in the valence-arousal space.

In a paper titled A Hierarchical Latent Stochastic Differential Equation Model for Affective Dynamics, Oravecz, Tuerlinckx, and Vandekerckhove applied the Ornstein–Uhlenbeck process to the dynamics of core affect. Their model takes into account many important features that had previously been overlooked for the sake of simplicity. As mentioned in the previous paragraph, these features turn out to be important signatures of personality factors, so having a model that incorporates them may be very useful to understand the differences between people. Their model describes people as having: a latent home base (hedonic set point), variance (for both components), a correlation between valence and arousal, an average speed, and a time-dependent relocation of the home base determined by the hour of the day. The model allows you to estimate person-specific parameters (using as input a sequence of self-reported emotional states). In turn, once you have determined someone’s latent parameters, the model can help you predict their future affect based on their current state.
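For intuition, the core of such a model can be sketched as a discretized two-dimensional Ornstein–Uhlenbeck simulation. This is a minimal toy version, not the authors' hierarchical model (it omits the latent parameter estimation and the time-of-day relocation of the home base), and all parameter values below are made-up illustrations:

```python
import numpy as np

# Toy 2-D Ornstein-Uhlenbeck process for core affect. Parameters are
# illustrative assumptions, not estimates from the paper.
rng = np.random.default_rng(0)

home_base = np.array([0.3, 0.0])  # hedonic set point (valence, arousal)
speed = 0.5                       # rate of reversion toward the set point
rho = -0.3                        # person-specific valence-arousal correlation
sigma = np.array([0.4, 0.3])      # noise scale per dimension
cov = np.array([[sigma[0]**2, rho * sigma[0] * sigma[1]],
                [rho * sigma[0] * sigma[1], sigma[1]**2]])
chol = np.linalg.cholesky(cov)    # used to draw correlated shocks

dt, steps = 0.1, 5000
x = np.zeros((steps, 2))
x[0] = home_base
for t in range(1, steps):
    drift = speed * (home_base - x[t - 1])          # pull toward home base
    shock = chol @ rng.normal(size=2) * np.sqrt(dt)  # correlated noise
    x[t] = x[t - 1] + drift * dt + shock

# The long-run average hovers near the home base, mirroring the
# hedonic-treadmill behavior described above.
print(np.round(x.mean(axis=0), 2))
```

The drift term is the formal counterpart of the negative feedback mechanisms discussed earlier: the further affect wanders from the set point, the harder it is pulled back.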

This model is perhaps as good as it gets if you are restricted by a Markov assumption and given only the valence and arousal dimensions of the participants over time. The state-space of emotion is far more granular, though. Even increasing the number of dimensions by one (e.g. by including the dimension of spiritual euphoria) may go a long way in clarifying the nature of unexpected emotional transitions. What explains the sometimes very large effect of philosophical discoveries, religious conversions, and personal epiphanies?

A Map of Emotion Attractors: Studying 176×176 Transition Probabilities

Between 2012 and 2014 I worked on modeling the dynamics of emotion transitions. I did this first as part of a research project for a company I worked for (thanks, Kanjoya!), and it then became the topic of my master’s thesis. If you are interested, you can find more in a paper I wrote with colleagues on predicting future emotions based on a sequence of previous ones (together with social cues).*

The analysis was based on a sample of hundreds of thousands of users of the now-defunct (but still browsable) Experience Project social network. Participants had the option to record their mood on the landing page: they would select an emotion from a list of 176 words, rate how intense the emotion was at the time (from 1 to 5) and explain why they felt the way they did (open text; optional). I analyzed the transition probability between each ordered pair of emotions for different intervals of time and compressed it into a score that describes the overall flow of people between them. This results in a flow graph that we can analyze with tools from graph theory. I explored many ways of clustering this graph and ultimately settled on the method that produced the best predictive power in a model for forecasting future emotions. This method consisted of grouping the emotions in such a way that each emotion would maximize its mean transition probability to other emotions in the same group (relative to other groups). For the paper I made this graph with all of the emotions (nodes), the transition probabilities between them (edge thickness) and the resulting clusters (colors):


A weighted directed graph with 176 nodes, each representing a distinct state of consciousness. The edges represent the (directed) compressed transition probability between each ordered pair of states. The size of each node approximates the base rate at which the emotion occurs in the sample.
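The grouping criterion, i.e. maximizing each emotion's mean transition probability to the other members of its own group, can be illustrated in miniature. The short sequences and labels below are invented stand-ins for the mood updates; the real analysis used 176 labels, time-interval-dependent compressed scores, and a far larger sample:

```python
from collections import Counter, defaultdict

# Made-up mood-update sequences for illustration only.
sequences = [
    ["sad", "hopeful", "happy", "excited", "happy"],
    ["sad", "lonely", "sad", "hopeful", "happy"],
    ["happy", "excited", "happy", "calm", "sad", "lonely"],
]

# Count observed transitions between consecutive mood reports.
counts = defaultdict(Counter)
for seq in sequences:
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1

def transition_prob(a, b):
    """Estimated probability of reporting emotion b right after a."""
    total = sum(counts[a].values())
    return counts[a][b] / total if total else 0.0

def mean_prob_to_group(emotion, group):
    """The clustering criterion: mean transition probability from an
    emotion to the other members of a candidate group."""
    members = [m for m in group if m != emotion]
    if not members:
        return 0.0
    return sum(transition_prob(emotion, m) for m in members) / len(members)

# "sad" flows more readily into the depressive cluster than the positive one,
# so the criterion assigns it to the depressive group.
depressive, positive = ["sad", "lonely"], ["happy", "excited"]
print(mean_prob_to_group("sad", depressive) >
      mean_prob_to_group("sad", positive))  # prints True
```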

Each color represents a given “emotion attractor”. At a high level, we can say that whenever you are experiencing an emotion that is, e.g., green, you are more likely to transition to other emotions that are also green (relative to what would be expected from choosing an emotion at random). This analysis is ultimately consistent with Oravecz et al.’s model in the sense that both analyses study the dynamic way in which people tend to move in and out of their home base. However, the granularity afforded by the 176 different options also allowed me to examine deviations from this pattern. I investigated the question: “Which emotions take you to places that are inconsistent with the general trend of stochastically moving towards the central hedonic set-point?”

It turns out that some emotions behave in interesting ways. Some are what we called “hubs”: common stopping points that work as a route between any two colors. For example, “calm” and “tired” are hubs, and they do not give you much information about past or future emotions. Other emotions behave like “gateways”, in the sense that they tend to indicate a jump from a particular color to another. For example, “hopeful” and “relieved” are two gateway emotions: they work as stepping stones from blue (depressive) emotions to green (positive) ones.
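One way to make the hub intuition concrete: a hub is uninformative about where you go next, which can be proxied by the entropy of its outgoing transition distribution. The probabilities below are made-up illustrations, not values from the study:

```python
import math

# Made-up outgoing transition distributions for two emotions.
transitions = {
    "calm":    {"happy": 0.25, "sad": 0.25, "tired": 0.25, "excited": 0.25},
    "hopeful": {"happy": 0.7, "relieved": 0.2, "sad": 0.1},
}

def out_entropy(emotion):
    """Shannon entropy (bits) of an emotion's outgoing transitions:
    higher entropy means the emotion tells you less about what comes next."""
    probs = transitions[emotion].values()
    return -sum(p * math.log2(p) for p in probs if p > 0)

# "calm" (a hub) is maximally uninformative; "hopeful" (a gateway) points
# strongly toward positive emotions and so carries more information.
print(out_entropy("calm"), out_entropy("hopeful"))
```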

Some emotions challenge the hedonic treadmill by virtue of predicting unexpectedly long-lasting stays on a given color. For example, the words “blessed”, “blissful” and “loved” were great predictors of long-lasting green emotions. By examining the text of these mood updates we determined that on average the people listing religious and spiritual themes as the cause of their feelings were more likely to stay for longer periods of time in the zone of positive emotions than most other people in the sample. I suppose that people’s spiritual euphoria may hack the pattern of hedonic habituation to some extent in a few lucky ones. I personally do not think that this is a scalable solution for everyone, though, since not everyone is spiritually oriented or has their endogenous opioid system wired up properly for meditation. The outstanding effect sizes we may see in some people who benefit from a particular e.g. meditation technique rarely generalize to everyone else. That said, it is certainly neat to see some evidence of some (spiritual/philosophical) sabotage at the mill.

How can we feel better in the long term?

A few years ago I abandoned hope in the idea that psychological interventions are sufficient to increase our wellbeing (philosophy, spirituality and exposure therapy can only take you so far in making you feel better). So what is next? The trick will be to combine psychological, chemical, electrical and genetic methods in a balanced and healthy way, rather than relying on any single method. Can we be happy all the time? Let us move on to the subject of wireheading more directly. Given what we have discussed about core affect, emotion dynamics and the resilience of the hedonic set-point, is it possible to wirehead oneself in a non-regrettable way? I think that the answer is yes, but we will need to avoid some crucial dangers…

Wireheading Done Wrong I: Forgetting About the Negative Feedback

Fast and slow euphoria can be reliably triggered by sensorial or chemical methods. However, doing so quickly kick-starts two negative feedback mechanisms.


Current hedonic negative feedback dynamics.

The first is that the effect is reduced with each use (shown in the image as the little loops with a minus sign). The second is that withdrawing from these euphoric states kindles circuits that do the opposite of what was intended (as shown by the arrows with a plus sign). Too much of something that calms you will bring about a prolonged withdrawal state of constant low-level anxiety. Too much of anything that makes you up-beat and ecstatic will induce a prolonged withdrawal state of low-level depression.
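These two loops can be sketched as a toy simulation: tolerance shrinks the effect of each successive dose, and an opponent process (withdrawal) pulls net mood below baseline once the euphoria fades. All parameters are qualitative placeholders, not a physiological model:

```python
# Toy model of the two negative feedback loops described above.
def simulate(dose_times, steps=60, tolerance_rate=0.2,
             opponent_gain=0.6, decay=0.7):
    effect, opponent, tolerance = 0.0, 0.0, 1.0
    mood = []
    for t in range(steps):
        if t in dose_times:
            effect += tolerance                # each dose delivers less...
            tolerance *= (1 - tolerance_rate)  # ...as tolerance builds
        opponent += opponent_gain * effect     # withdrawal circuit kindles
        effect *= decay
        opponent *= decay
        mood.append(effect - opponent)         # net hedonic tone
    return mood

mood = simulate(dose_times={0, 10, 20, 30})
# The first dose peaks highest; later doses peak lower, and mood dips
# below baseline (negative values) between doses.
```

Even in this caricature, repeated "button pressing" yields shrinking highs and deepening lows, which is the qualitative point of the diagram.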

Amphetamines, traditional opioids, barbiturates and empathogens can be ruled out as wise tools for positive hedonic recalibration. They are not comprehensive life enrichers precisely because it is not possible (at least as of 2016) to control the negative feedback mechanism that they kick-start. Simply pushing the button of pleasure and hoping it will all be alright is not an intelligent strategy given our physiological implementation. The onset of this negative feedback often triggers addictive behavior and physiological changes that shape the brain to expect the substance.

The case of spiritual/philosophical euphoria is a lot trickier. It is clear that there is a negative feedback that may be described (more or less) as a sort of philosophical boredom. Psychedelics are capable of changing our brain so as to increase the range of possible valence (i.e. they can enable states of extreme pleasure but also extreme suffering) in a way that sidesteps the need to directly interact with our pleasure centers. I think it is extremely important to figure out the mechanism of action of psychedelic bliss. We will in fact address it in another article. For now it will suffice to say that psychedelic pleasure does not seem to induce cravings or withdrawal. We should take a close look at it because it may be the key to understanding how to produce unlimited positive valence with no negative repercussions. Unfortunately, producing philosophical, spiritual and psychedelic bliss nowadays is still more of an art than a science; these methods are unreliable and can backfire tremendously.

In summary, we might say that if one is oblivious to negative feedback, then meth addiction is an attempt at fast euphoria wireheading, whereas opioid dependence is the result of trying (ineffectively) to obtain boundless slow euphoria. Spiritual euphoria wireheading attempts usually involve activities such as philosophy, meditation, prayer and psychedelic drug use. Even though attempting spiritual euphoria wireheading on oneself is a hell of a lot healthier than doing meth or heroin, it is certainly not free of possible psychological side effects (such as acquiring bizarre beliefs, experiencing episodes of spiritual dysphoria that are sometimes profoundly distressing, and unwanted changes in one’s belief system).

Wireheading Done Wrong II: Seduced by a World of Your Own

One simple approach to wireheading effectively is to remove either one or both of the negative feedback mechanisms shown in the image above. Wiring electrodes into one’s pleasure centers does the trick just fine, since it apparently removes both. It turns out that the mechanism for generating physiological tolerance is bypassed by direct electrical (rather than chemical) stimulation of the nucleus accumbens. Bliss obtained this way does not seem to stop pouring, nor to diminish in greatness over time.

Unfortunately this method has profound pitfalls. Most salient of all is that, if given the choice, mice (and some but not all people) will continuously self-stimulate this way as frequently and as intensely as possible, neglecting both physiological needs (like food and sleep) and social demands (like feeding one’s children or paying taxes). In the case of humans, people feel compelled to self-stimulate when suffering, but under normal circumstances (if feeling good already) they can hold off from pressing the button in order to carry out other activities. Admittedly this is an improvement over drugs, which make you feel terrible in the long run and in turn make you seek relief with the same method that brought you there. With electrical rather than chemical stimulation we can at least avoid this pitfall.

That said, people do not like to have objects implanted in their brain, and our infection-prone future will thank us for not developing an addictive technology that requires a constant stream of ineffective antibiotics to keep it plugged in place. Thankfully, future wireheading may be minimally invasive. Attractive alternatives to old-fashioned electrodes include body-powered wireless implants, optogenetics, and genetically encoded magnetic triggers of neural activity.

A much more subtle way to improve one’s hedonic set point is to counteract only the activation of the post-pleasure dysphoria. Anti-depressants of the SSRI variety, and the less well-known fast-acting aminoguanidine agmatine, help prevent the gross kindling of circuits that produce unpleasant sensations. This method may ultimately come down to increasing the amount of noise in the entire system** and thus reducing the survivability of highly-ordered states (such as pain and pleasure) in one’s consciousness. Unfortunately, preventing withdrawal by this method comes at the cost of blunting high-valence states: prolonged SSRI use often makes people anhedonic, feeling like they have lost all zest for life. In contrast, Ibogaine and low-dose opioid antagonists are promising chemical avenues that attack the same problem in a very different way without such side-effects. These compounds work by rebalancing one’s proportion of the various opioid receptor subtypes, and in turn driving one’s hedonic capacity upwards (for some reason I don’t understand).

A whole generation of people will probably be “lost” to what I call single euphoria wireheading: let’s say that you have mastered the ability to experience a high level of fast euphoria in a sustainable way. You can in principle stop at any point and come down without feeling like you are missing out. But whenever you do activate the fast euphoria you are about ten times more motivated to go out, explore the world, work on projects and meet great people who also share your newfound interests and values. You may end up choosing to join a community of other people who value living fast and staying hyper-motivated, just as you do now.

Fast euphoria in particular is extremely tricky to program correctly, since it deals so directly with behavioral reinforcement. Many people get hooked on meth + X rather than on just meth: whether X is music, gaming, sex, gambling, porn and/or alcohol, during a meth binge people often end up doing the exact same repetitive but pleasant task for tens of hours. In other words, fast euphoria not only reinforces itself, but it also reinforces whatever activity you do while you experience it, and this is especially the case if the activity is more enjoyable as a result of the fast euphoria. Stimulant addiction, deep brain stimulation and manic states in bipolar sufferers share a core personality-changing effect driven by an excessive interest in a few rewarding activities at the expense of all other interests and responsibilities. It is extremely tricky to rationally use one’s reinforcement system directly in order to recalibrate one’s hedonic tone. Without (as-of-yet-uninvented) safeguards, doing so tends to increase impulsivity in the long term and mess up one’s preference architecture.

We could in principle block the metabolic pathways that lead to changes in one’s motivational system as a response to fast euphoria. If we did this, then people might be able to master side-effect-free hyper-motivation. Does this mean that a straightforward road to Super-Happiness is short-cutting to perpetual motivation?

The main problem is that motivation and one’s implicit notion of one’s self-in-time interact in unpredictable ways. One of the very mechanisms of action by which something like meth can transform your preference architecture is by forcibly redefining your self-model (cf. Ontological Qualia). Fast euphoria brings one’s attention towards the present moment and present happenings. In high amounts, it brings you face to face with your own presence in the eternal now. From that point of view, it feels as if that very moment is who you are, and one’s normal state of consciousness is re-interpreted as a mere jumping platform at the service of the few and far between moments of real joy. To have an episode of feeling “truly alive” and then return to typical human conditions can unquestionably be felt as a sort of death: not of the biological body, but of the fleeting self-models that inhabited such sharper and subjectively more worthwhile state-spaces of consciousness.

If one’s implicit self-model is not robust against sudden changes in one’s level of fast euphoria, then one will not be good at surviving and being productive in a social economy. Let us say that person A is able to identify herself with her future self in 2059 and save for retirement, but person A on meth has a very hard time thinking of herself in any other terms than “me, right now, for as long as I can stay in this state of mind, 3 to 9 hours, give or take, depending on whether I redose.” The present moment, the immediate future and the pleasure opportunities available in it can be so salient that they eclipse one’s every other interest. If we do not find a way to prevent this shift in perspective it may be impossible to safeguard rationality when showered with streams of high-grade fast euphoria.
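The shift in perspective can be caricatured with a discounting model. Assuming (purely for illustration) hyperbolic discounting and a hypothetical jump in the discount rate during the stimulated state, the far future collapses in present value exactly as described:

```python
# Illustrative only: hyperbolic discounting with a hypothetical discount
# rate k that jumps during the stimulated state. All numbers are made up.
def discounted_value(reward, delay_days, k):
    """Hyperbolic discounting: present value = reward / (1 + k * delay)."""
    return reward / (1 + k * delay_days)

retirement_fund = 1_000_000   # reward 40 years away
delay = 40 * 365              # delay in days
baseline_k, stimulated_k = 0.001, 1.0  # hypothetical discount rates

# At baseline the distant reward retains substantial present value;
# in the steeply-discounting state it is worth less than $100 right now,
# so any small immediate pleasure outbids it.
print(discounted_value(retirement_fund, delay, baseline_k))
print(discounted_value(retirement_fund, delay, stimulated_k))
```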

How about slow euphoria wireheading? I suspect that it is in principle possible to master hyper-relaxation without being incapacitated. In the meantime, trying to slow down too much does seem to reduce one’s productivity by a good margin, so wireheading of the slow euphoria type is not currently advisable. That said, achieving hedonic recalibration by guaranteeing a minimum of slow euphoria is, as I see it, a lot more feasible than doing so through fast euphoria. Slow euphoria does not have the explosive effects on one’s motivational architecture and self-models that fast euphoria does. On the contrary, relaxation can allow us to reconceive of ourselves as beings who inhabit much longer timelines (to really grasp our decades-long lifespan and know how to pace ourselves rather than feeling pressed to identify with our present moment exclusively).

Spiritual euphoria may or may not imply changes in one’s belief structure. Currently, peak spiritual/philosophical states (including high levels of psychedelia) are a rather different kind of subjective wellbeing than the other two that dominate our everyday life. This bliss is often associated with extreme changes in the quality of one’s conceptualization of reality, which limits its effective incorporation into a rational and economically productive life. Unless, of course… one is producing useful information in those states. More about this further below.

In summary: If a device is ever discovered that allows people to enjoy fast, slow or spiritual euphoria without implicitly influencing their worldview and economic capacity, then that device will probably become a staple of life. Issues of authorship and agency aside, single euphoria wireheading without serious engineering to counter its problems is a road to oblivion from the point of view of evolution. Whether controlled single euphoria wireheading can be adaptive is still up for debate.

Wireheading Done Wrong III: Becoming a Pure Replicator (Even If You Love It)

Look, we are all friends here. We are trying to delay for as long as possible the development of a Singleton (i.e. a state of complete control by one system), while we also try to keep at bay the problem of Moloch*** (i.e. complete lack of control). We are trying to find a sustainable solution against both extremes. In our ideal world, all beings should have the freedom to explore the state-space of consciousness however they want (or live in an Archipelago of societies at the very least). We need to work together on designing the future to avoid evolutionary extremes and safeguard freedom of consciousness. Now, who is the enemy?

The Threat of Pure Replicators

I will define a pure replicator, in the context of agents and minds, as an intelligence that is indifferent towards the valence of its conscious states and those of others. A pure replicator invests all of its energy and resources into surviving and reproducing, even at the cost of continuous suffering to itself or others. Its main evolutionary advantage is that it does not need to spend any resources making the world a better place.

If given the choice, please don’t become a pure replicator and throw under the bus all the hard work that people throughout history have put into making the world not an entirely horrible place. Pure replicators may come in many guises. While the term may evoke images of cockroaches and viruses, the truth is that your modafinil-fueled income-maximizing coworker may already be on the path to turning into one. Wait, what did you just read?

Considering that the dimension of spiritual euphoria is the most intense (and subjectively profound) source of conscious value, it would be a shame if our society exclusively optimized for linear logico-linguistic “high clarity” states of consciousness. Of all the drugs available, when balancing side effects and overall effectiveness, it is likely that modafinil-like compounds (e.g. custom nootropics) give you the single largest economic edge within this society. Caffeine is already available to everyone, speed slowly kills you and micro-dosed LSD makes you (believe it or not) too creative for most paying jobs. Is it possible to make the interesting and valuable states of consciousness the ones that are economically rewarded? Are we going to let the economic incentives in our society silently maximize the presence of modafinil-like states of consciousness?


There is no known substance that enhances both “clarity” and “spiritual/philosophical euphoria” at the same time. It would be a shame if all the economy cared about was your level of clarity, for that would mean that modafinil users will rule the world. (Oh, wait…). At the limit, such a world may be impervious to conceptual revolutions or to caring about valence research.

In practice, unless digital AGI pans out or nanotechnology takes over, pure replicators are going to need to interface with human and posthuman markets to gain any power. Although fashionable to think about nowadays, exotic nanotech and/or AI pure replicators may ultimately be far easier to stop than pure replicators that disguise themselves as humans (i.e. people who turn into empty shells of their former selves by embracing hyper-competitive Moloch memes and their associated technologies). As we will see, the nature of future economic selection pressures may be the most important factor in whether or not we are taken over by armies of pure replicators.

Aren’t we all pure replicators already?

Tautologically, natural selection can only produce pure replicators. But this would be to think of the term in an unhelpful way that is not true to the spirit of the idea. This is why we defined pure replicators in terms of indifference towards conscious states. Most animals do indeed care a great deal about the valence of their own consciousness; after all, the motivational power of the pleasure-pain axis is the very reason why evolution recruited conscious valence to begin with. What is more, sexual selection happens to have recruited introspection, aesthetics, benevolence and intelligence as fitness indicators (which explains why we are so keen on advertising these traits). Brian Tomasik calls our times the Age of Spandrels, because we live in a period that is reaping the benefits of surplus production (still being below carrying capacity) while silly non-optimal aesthetics inherited from our evolutionary past still survive. Interpersonal love, sexually selected hedonistic social rituals and ingrained prosocial implicit values may be evolutionary spandrels in the context of our economy, but (surprisingly) they are still part of our society. Hence, we today can enjoy watching movies, making love and thinking about philosophy. Our drive to delight in life is powerful enough to distract us from optimal economic participation, and our emotional wellbeing (which affects our economic participation) is still linked to events dealing with our level of pleasure outside work.

In contrast, the intelligent agents of the future may not be constrained to using the pleasure-pain axis to implement goal-oriented behaviors. One could envision scenarios like Robin Hanson’s Age of Em, in which the most productive (and abundant) minds do 99.9999% of the work, and this work is boring 99.9999% of the time. These minds may work while in near-neutral states of consciousness that have either negligibly positive or even outright negative valence. The employees of this massive workforce are those individuals willing to do whatever they are told in exchange for 0.00001% vacation time and the opportunity to stay alive and multiply (in this case by copying their minds onto digital servers). The employers themselves may not be particularly happy either, because they are competing against other companies that cut costs as much as possible. If smiling does not increase one’s productivity at one’s job but does waste precious calories and units of attention, then smiling will be abolished for purely economic reasons. In this scenario everyone is either employed and miserable (relative to our current standards) or unemployed and dying of starvation. We can thank those entities who were willing to completely sacrifice their own psychological depth (and freedom to explore the state-space of consciousness) for the sake of merely existing. Such a world fails to produce any actual value in the form of meaningful states of consciousness and is over-saturated with modafinil-like consciousness.

Singleton and Moloch end-of-times scenarios tend to look pretty terrible because the worlds they present don’t seem to contain reasons for anyone to care about valence.

But we may now be on the brink of reverse-engineering valence itself. Once we figure out the equation that takes quantum fields as input and outputs the conscious valence present in them, we will be able to quantify exactly how bad our possible futures-at-the-limit will be, depending on the economic selection pressures we put in place today.

A desirable Singleton should at the very least care about states of high-valence and avoid negative valence states as much as possible. In a future article we will discuss some ideas for how to design an economic system based on cooperation that increases our chances of having ecologies of sustainable conscious entities who have the following properties: (1) they are free to explore the state-space of consciousness, (2) are social, and (3) have access to practically unlimited positive valence. But what if we are headed towards a perpetual Moloch (failure of cooperation) scenario?

Surviving in the Sundown of the Age of Spandrels

‘[I]n Time any being that is spontaneous and alive will wither and die like an old joke.’

– (WL 111)

What would be a list of desirable traits that we want to have after acquiring complete control over our individual pleasure-pain axis? David Pearce never tires of pointing out that the future does not belong to anti-natalists: their compassionate genes will be weeded out of the gene pool, and it will be their own compassionate, sentimental fault. Similarly, full-blown single-euphoria wireheading (as discussed above) is destined for oblivion unless it also happens to give you marketable skills.

We want to be able to both feel good and at the same time remain economically competitive (or we are going to be crowded out by pure replicators). Here is a list of traits that would help us have lives worth living without sacrificing our economic value:

  1. Always in a positive valence state (i.e. remaining above hedonic zero).
  2. Faithful/good enough internal simulation of one’s environment (both physical and social).
  3. Free to explore at will the known state-space of consciousness.
  4. Capable of producing socially-useful information.
  5. Free from unconscious bonds and protected against mind control.
  6. Capable of exiting attractor states (affective, cognitive, or social, e.g. thought loops).
  7. Able to make others happier in ways they did not know were possible.

Trans/posthuman negative feedback mechanisms: a virtuous cycle that delivers hearty amounts of euphoria but with no craving as a result. What’s reinforced is the flow between the types of euphoria rather than each kind on its own.

The first trait matters for ethical reasons: one needs to guarantee that the entities one brings into existence will always be happy to be alive. One should never compromise on the wellbeing of the beings one designs and gives birth to. If someone does, then we are off to the races again against pure replicators willing to suffer for a chance to exist.

The second trait is a requirement to survive in a physical world and a social economy (for obvious reasons).

The third trait is motivated by both ethical and practical reasons: as I understand it, having the ability to explore the known state-space of consciousness guarantees that you yourself can benefit from whatever awesome things people have made and discovered already. It guarantees that each individual will be able to experience the most valuable states (as judged by themselves at the time) without a preconceived notion of which states are ideal before experimenting on their own.

Being capable of experiencing any state of consciousness already discovered and understood will hopefully also turn out to be economically desirable. In order to be of relevance in the market of information about the state-space of consciousness you yourself will need to be an explorer and be up to date with what is in vogue. This opens up the possibility of a full-fledged qualia economy: when people have spare resources and are interested in new states of consciousness, anyone good at mining the state-space for precious gems will have an economic advantage.

In principle the whole economy may eventually be entirely based on exploring the state-space of consciousness and trading information about the most valuable contents discovered doing so. Traits 4 through 7 are intended to address the complications that arise from the need for social competence to survive in such an economy.

Society may ultimately converge into a system in which people are constantly in hyper-valuable states and the only way to become powerful is to invent new ways to improve upon the already highly-optimal state-spaces people are free to roam all the time. In this economy, people would also be motivated to help others succeed: Everyone benefits from making discoveries since every discovered state is made accessible to everyone.

How could we implement a conscious mind with these attributes? The task is indeed extremely demanding, and billions of dollars in R&D will have to be invested before we have a silver-bullet genetic intervention that takes us in that direction. In the next centuries we are likely to see hundreds of thousands of researchers experimenting with various cocktails, implants and genetic vectors hacking themselves in order to reliably improve their hedonic tone while also increasing their economic value.

In order to have any chance at living in such a society we need to make sure we won’t be overrun by pure replicators in any of their gazillion different guises. To do so, we need to make sure we do not fall prey to any of the wireheading mistakes outlined above. And we also need to make sure that we can give back to the world more than we take, so that the world is happy to have us around.

Wireheading Done Right: Stay Positive Without Going Insane

To remain economically relevant and subvert the rise of pure replicators, it is essential that one’s capacity to explore the state-space of consciousness be a marketable skill. Your imagination (if we choose to call it that) should therefore work at an acceptable depth and pace from the point of view of the current economy of social relationships.


Be like my friends Leona and Link. They have a balanced routine of many varieties of experiential wellbeing and euphoria. They are hyper-motivated in the morning (fast euphoria), go crazy creative and spiritual in the afternoon (spiritual euphoria), and go to sleep in delightful oceans of cool sensations (slow euphoria). They are genetically engineered to live the good life without getting stuck in bad loops.

The diagram above illustrates the main idea: we want to rewire our reward architecture in such a way that as each kind of euphoria is instantiated a different one becomes more accessible.

For example, we may want to wirehead ourselves in such a way that our ability to experience fast euphoria is gated by slow euphoria: until you have “satiated” your psychological need for rest, you will not be allowed to feel hyper-motivated. Our desires are already state-specific, but the current network of transition probabilities between emotions facilitates the reinforcement of toxic local attractors (also called “death spirals”, like states of depression or generalized anxiety). By re-engineering the network of transition probabilities between emotions and extracting out the dysphoric components, we might be able to guarantee a continuous flow between functionally and phenomenologically distinct modes of wellbeing. Wireheading done right consists of having wonderful experiences all the time, but in such a way that you never feel compelled to stay where you are for too long. In addition, a good wireheading procedure should also allow you to keep learning useful information about the state-space of consciousness: wireheading should not imply the end of learning. In brief, we suggest that we should change our brains so that feeling great in a certain way temporarily reduces the response to that particular kind of euphoria while making it easier to enjoy some other kind. One would thus be incentivized to keep moving, and never to give up or get stuck in loops.
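The re-engineered network of transition probabilities can be sketched as a small Markov chain. The following toy simulation (the state names, cycle structure and probability values are illustrative assumptions, not empirical data) shows the key property we are after: because each kind of euphoria mostly hands off to a different kind, no single state becomes an absorbing attractor, and a long run spreads its time roughly evenly across all states.

```python
import random

# Illustrative transition probabilities between hypothetical euphoria states.
# Each state mostly "hands off" to the next state in the cycle
# (fast -> spiritual -> slow -> fast), so no state is an attractor.
TRANSITIONS = {
    "fast":      {"fast": 0.1, "spiritual": 0.8, "slow": 0.1},
    "spiritual": {"fast": 0.1, "spiritual": 0.1, "slow": 0.8},
    "slow":      {"fast": 0.8, "spiritual": 0.1, "slow": 0.1},
}

def step(state, rng):
    """Sample the next euphoria state from the transition distribution."""
    states, probs = zip(*TRANSITIONS[state].items())
    return rng.choices(states, weights=probs, k=1)[0]

def simulate(start="fast", n=10_000, seed=0):
    """Count how often each state is visited over a long run."""
    rng = random.Random(seed)
    counts = {s: 0 for s in TRANSITIONS}
    state = start
    for _ in range(n):
        counts[state] += 1
        state = step(state, rng)
    return counts

if __name__ == "__main__":
    # Visit counts come out roughly equal: no "death spiral" state.
    print(simulate())
```

A depressive attractor, by contrast, would correspond to a row whose self-transition probability is close to 1; the design goal described above is precisely to keep every diagonal entry small while the dysphoric states are removed altogether.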

Naturally, one may be skeptical that perpetual (but varied) bliss is possible at all. After all, shouldn’t we already be there if such states were actually evolutionarily advantageous? The problem is that the high-valence states we can experience evolved to increase our inclusive fitness in the ancestral environment, not in an economy based on gradients of bliss. Experiences are calorically expensive; in the African savannah it may have cost too many calories to keep people in novelty-producing hyperthymic states (even psychologically balanced ones) relative to the alternative of having our brains work at the minimal acceptable capacity. In today’s environment we have a surplus of calories which can be put to good use, i.e. to explore the state-space of consciousness and have a chance at discovering socially (and hedonically) valuable states. Exploring consciousness may thus not only be aligned with real value (cf. valence realism), but it might also turn out to be a good, rational time investment if you live in an experience-oriented economy. We are not particularly calorie-constrained nowadays; our states of consciousness should be enriched to reflect this fact.

Link and Leona (whom you may recognize from a previous article) are two successful wireheads who are now happier than ever. They chose the following feedback network for their valence: fast makes it easier to feel spiritual, spiritual makes it easier to feel slow, and slow makes it easier to feel fast. Their primary state of consciousness cycles over a period of 24 hours. Here is their routine: they wake up experiencing intense zest for life and work at full capacity, making others happy and having fun. Then they go crazy creative in the afternoon, usually spending that time alone or with friends, exploring (and sharing) strange but always awesome psychedelic-like states of consciousness. Finally, at night they just relax to the max (the healthy and genetically encoded phenomenological equivalent of shooting heroin). They report having more agency than before, since now they feel there is time to do everything they want, and moving from one activity to the next is easy and spontaneous. This kind of wireheading allows them to avoid loops, drops in motivation, and impoverished creativity and introspection. The only thing they had to accept was that “hey, you don’t need to have all of the euphorias at once all the time”. By enjoying them one at a time you can guarantee a healthy mind, a healthy social life and a healthy economic output.

Positive Wireheading at the Evolutionary Limit

One of the main insights of evolutionary game theory is that the strategies with the best shot at being dominant for long periods of time have three properties: (1) they do well on average against other strong strategies, (2) they do well against themselves, and (3) they have an immune system against custom anti-strategies. The last condition may be skipped if we are not going to play for long, but in the long run it is absolutely necessary. Custom anti-strategies may themselves be terrible against other strong strategies and even against themselves, so you may not realize they exist at first. For a while your strategy may dominate the space with no signs of anything changing. But then you may notice a tiny population of contrarians beginning to grow exponentially. In no time you are defeated and the world breaks into chaos (since the contrarians may not be good at holding power). A classic illustration of this phenomenon comes from the evolutionary setup of the Iterated Prisoner’s Dilemma. Here we find strategies that satisfy (1) and (2) but lack a good immune system against free-loading strategies. The strategy always-defect is surprisingly effective in an ecosystem dominated by always-cooperate. Likewise, the Pavlov strategy can exploit the vulnerabilities of tit-for-tat-with-forgiveness (though improved versions can counter it). Over time we see sporadic population booms and busts caused by cycles of cooperative eras collapsing under parasitic breakthroughs.
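The invasion dynamic described above (always-defect overrunning always-cooperate) can be made concrete with a minimal replicator-dynamics sketch. The payoff values are the standard one-shot Prisoner’s Dilemma numbers (T=5, R=3, P=1, S=0); the initial 1% defector share, the step size and the helper names are illustrative assumptions.

```python
# Replicator dynamics for a two-strategy population:
# "C" = always-cooperate, "D" = always-defect.
PAYOFF = {  # PAYOFF[me][opponent], standard Prisoner's Dilemma values
    "C": {"C": 3, "D": 0},
    "D": {"C": 5, "D": 1},
}

def step(x_c, dt=0.1):
    """One discrete replicator step; x_c is the cooperator share."""
    x_d = 1.0 - x_c
    f_c = x_c * PAYOFF["C"]["C"] + x_d * PAYOFF["C"]["D"]  # fitness of C
    f_d = x_c * PAYOFF["D"]["C"] + x_d * PAYOFF["D"]["D"]  # fitness of D
    f_avg = x_c * f_c + x_d * f_d                          # mean fitness
    # Cooperator share grows or shrinks with its fitness advantage.
    return x_c + dt * x_c * (f_c - f_avg)

def share_after(generations=200, x_c=0.99):
    """Cooperator share after iterating the dynamics from a 99% C start."""
    for _ in range(generations):
        x_c = step(x_c)
    return x_c

if __name__ == "__main__":
    # A 1% seed of defectors takes over: the cooperator share collapses.
    print(share_after())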

I posit that an ecology of wirehead minds that dedicate themselves to exploring (“mining”) the state-space of consciousness can be economically powerful if other people are willing to pay for information about how to instantiate the high-valence qualia discovered during these explorations. A large group of cooperators that help each other explore the state-space of consciousness satisfies condition (2) from the previous paragraph. But in order to satisfy condition (1) we need an environment in which knowledge about consciousness is marketable. The Super-Shulgin Academies (i.e. rational psychonautic collectives) of the future may concentrate all the qualia research talent in the world and reap the highest economic benefits while producing the largest amount of value, but they will only be able to do so if the surrounding society values their output. A society of pure replicators has no use for Super-Shulgin Academies, let alone Manhattan Projects of Consciousness (i.e. global concerted efforts to find and recruit benevolence-enhancing states of consciousness). But given the moral and hedonistic pursuits of our fellow humans we may still have a chance: making people happier than they currently are is a trillion-dollar industry nowadays, and Super-Shulgin Academies may capitalize on this demand by selling valence-enhancing technologies to the masses.

With regard to (3), we may happen to be lucky this time: knowing all there is to know about the state-space of consciousness is the best way to prevent oneself from being outsmarted. Super-Shulgin Academies would invest heavily in researching ways to defend themselves against pure replicators. As part of its immune system, a Super-Shulgin Academy should only admit benevolent individuals as researchers. Benevolence, perhaps, is best implemented at the level of ontological qualia: someone who believes that we are all the same consciousness is a lot less dangerous to others than someone who is solipsistic or self-centered. “Technology is destructive only in the hands of people who do not realize they are one and the same process as the universe.” (Alan Watts). Rational agency and super-sentience in the hands of Open Individualists (i.e. people who believe that we are all the same subject of experience) could eventually allow us to bring about a good qualia- and valence-centric Singleton.

But to bootstrap our way there we need to make sure that the organization would not die out even in an economy that isn’t already completely focused on consciousness (i.e. to fulfill condition 1).

We are very lucky to live in the Age of Spandrels. We take for granted the fact that people around us like to watch movies, go to sports events, read novels, get drunk, listen to music, have sex, etc. without realizing they could be investing all of that energy in figuring out how to make clones. We don’t usually realize that people’s atavistic inefficient hedonism is in fact our saving grace. (As an aside, I hope I am not inspiring anyone to go wild into the arts and do silly non-optimal things. My readership is capable of much more than that. Let us do silly non-optimal things in the most optimal way possible, by which I mean, let’s try to ensure that future beings care about valence and consciousness.) We should be thankful that we still have residual sexually-selected preferences for experientially rich lifestyles over pure and efficient dullness.


Wiki-consciousness. The state-space of consciousness library is accessible to everyone free of charge.

As long as we make intelligent use of today’s collective interest in exploring consciousness (in all of its guises, e.g. art, philosophy, drugs) we still have a chance to create a sustainable economy of well-rounded wireheads that is worth living in. The wirehead psychonaut collective would obtain most of its economic power from the revenue generated by the discoveries made during the systematic exploration of the state-space of consciousness. If the public consumes these discoveries, then the strategy may be perfectly self-sustaining. The process itself would be beneficial, as we would discover new ways to make people happy, bring about the radical freedom to inhabit any known state of consciousness, and increase our understanding of the universe.

One can even imagine that if Super-Shulgin Academies become the most powerful economic forces in the world, they may choose to create a massive “wiki-consciousness”: a library of all known states of consciousness, completely accessible to anyone free of charge. Why would they do this, given that their power comes from being able to sell this information? On the one hand, it stabilizes the distribution of power (since the only way to gain power in such an economy is to sell information about the state-space), which incentivizes people to actually find something new and valuable for everyone if they aim to become more powerful. On the other hand, making the information freely available would also increase the quality of prospective members of the psychonaut collective thanks to widespread consciousness literacy.

If we play our cards right we may still have a chance of avoiding the pure replicators, Molochs and Singletons that lurk in our forward light-cone. But to do so we need to stay grounded and avoid the pitfalls discussed here.


If we are to have a chance at surviving with a good quality of life in the sundown of the Age of Spandrels we will need to preemptively outcompete pure replicators. To do so we must avoid wireheading traps and take seriously future economic selection pressures, as they will determine who or what survives at the evolutionary limit. It is imperative that we take advantage of the current collective demand for valence and information about consciousness to fund ambitious consciousness research programs. Such programs will capitalize on this demand and kick-start a valence-centric market. In turn, scientific breakthroughs in this area may increase the percentage of the economy that is dedicated to exploring consciousness, which may reduce the opportunities for pure replicators to participate in the economy.

We need to act fast: if the economic demand for valence technologies disappears (or is low enough), we will find ourselves in a world in which exploring the state-space of consciousness is not profitable and pure replicators win.

* Thanks to Chris Potts for putting the papers online.

** I owe this theoretical framework to Mike Johnson and his magnificent work on the nature of valence: Principia Qualia.

*** Tragedy of the commons (i.e. failure of cooperation).