Predictive Processing and the Nature of Conscious Experience
A Conversation with Andy Clark [6/6/19]
Perception itself is a kind of controlled hallucination. . . . [T]he sensory information here acts as feedback on your expectations. It allows you to often correct them and to refine them. But the heavy lifting seems to be being done by the expectations. Does that mean that perception is a controlled hallucination? I sometimes think it would be good to flip that and just think that hallucination is a kind of uncontrolled perception.
ANDY CLARK is professor of Cognitive Philosophy at the University of Sussex and author of Surfing Uncertainty: Prediction, Action, and the Embodied Mind.
[ED. NOTE:] As a follow-up to the completion of the book Possible Minds: 25 Ways of Looking at AI, we are continuing the conversation as the “Possible Minds Project.” The first meeting was at Winvian Farm in Morris, CT. Over the next few months we are rolling out the fifteen talks—videos, EdgeCasts, transcripts.
From left: W. Daniel Hillis, Neil Gershenfeld, Frank Wilczek, David Chalmers, Robert Axelrod, Tom Griffiths, Caroline Jones, Peter Galison, Alison Gopnik, John Brockman, George Dyson, Freeman Dyson, Seth Lloyd, Rod Brooks, Stephen Wolfram, Ian McEwan. Project participants in absentia: George M. Church, Daniel Kahneman, Alex “Sandy” Pentland, Venki Ramakrishnan, Andy Clark.
PERCEPTION AS CONTROLLED HALLUCINATION: PREDICTIVE PROCESSING AND THE NATURE OF CONSCIOUS EXPERIENCE
The big question that I keep asking myself at the moment is whether it’s possible that predictive processing, the vision of the predictive mind I’ve been working on lately, is as good as it seems to be. It keeps me awake a little bit at night wondering whether anything could touch so many bases as this story seems to. It looks to me as if it provides a way of moving towards a third generation of artificial intelligence. I’ll come back to that in a minute. It also looks to me as if it shows how the stuff that I’ve been interested in for so long, in terms of the extended mind and embodied cognition, can be both true and scientifically tractable, and how we can get something like a quantifiable grip on how neural processing weaves together with bodily processing weaves together with actions out there in the world. It also looks as if this might give us a grip on the nature of conscious experience. And if any theory were able to do all of those things, it would certainly be worth taking seriously. I lie awake wondering whether any theory could be so good as to be doing all these things at once, but that’s what we’ll be talking about.
A place to start that was fun to read and watch was the debate between Dan Dennett and Dave Chalmers about “Possible Minds” (“Is Superintelligence Impossible?” Edge, 4.10.19). That debate was structured around questions about superintelligence, the future of artificial intelligence, whether or not some of our devices or machines are going to outrun human intelligence and perhaps in either good or bad ways become alien intelligences that cohabit the earth with us. That debate hit on all kinds of important aspects of that space, but it seemed to leave out what looks to be the thing that predictive processing is most able to shed light on, which is the role of action in all of these unfoldings.
There’s something rather passive about the kinds of artificial intelligence that Dan and Dave were both talking about. They were talking about intelligences or artificial intelligences that were trained on an objective function. The AI would try to do a particular thing, for which it might be exposed to an awful lot of data in trying to come up with ways to do that thing. But at the same time, these systems didn’t seem to inhabit bodies or inhabit worlds; they were solutions to problems in a disembodied, disworlded space. The nature of intelligence looks very different when we think of it as a rolling process that is embedded in bodies or embedded in worlds. Processes like that give rise to real understandings of a structured world.
Something that I thought was perhaps missing from the debate was a full emphasis on the importance, first of all, of having a general-purpose objective function. Rather than setting out to be a good Go player or a good chess player, you might set out to do something like minimize expected prediction error in your embodied encounters with the world. That’s my favorite general objective function. It turns out that an objective function like that can support perception and action and the kind of epistemic action in which we progressively try to get better training data, better information, to solve problems for the world that we inhabit.
Predictive processing starts off as a story about perception, and it’s worth saying a few words about what it looks like in the perceptual domain before bringing it into the domain of action. In the perceptual domain, the idea, familiar I’m sure to everybody, is that our perceptual world is a construct that emerges at the intersection between sensory information and priors, which here act as top-down predictions about how the sensory information is likely to be. For example, I imagine that most people have experienced phantom phone vibrations, where you suddenly feel your phone is vibrating in your pocket. It turns out that it may not even be in your pocket. Even if it is in your pocket, maybe it’s not vibrating. If you constantly carry the phone, and perhaps you’re in a slightly anxious state, a heightened interoceptive state, then ordinary bodily noise can be interpreted as signifying the presence of a ringing phone.
It would work very much like, say, the hollow mask illusion: When people are shown a hollow face mask lit from behind, they see the concave side of the face as having a nose pointing outwards. Richard Gregory spoke about this many years ago. It’s a standard story in this area. We human beings have very strong expectations about faces. We very much expect, given a certain bit of face information, that the rest of that information will specify a convex, outward-looking face.
The very same story gets to grips with phantom phone vibrations. It explains the White Christmas experiments, which is certainly one of my favorites in this area. People were told that they would hear the faint onset of Bing Crosby singing White Christmas in a sound file that they were going to be played. They would listen to the sound file and a substantial number of participants detected the faint onset of Bing Crosby singing White Christmas, but in fact there was no faint onset of White Christmas. There was no Bing Crosby signal there at all amongst what was simply white noise. In these cases, our expectations are carving out a signal that isn’t there. But in other cases, perhaps someone speaks your name faintly and there’s a noisy cocktail party going on, your expectations about what your name sounds like and the importance of anything that vaguely signals what your name sounds like conspire to up the weighting of the bits of the noisy signal that are there so that you hear your name fairly clearly.
Same thing if you’re in the shower and a familiar song comes on the radio. Under those conditions, a familiar song sounds an awful lot clearer than an unfamiliar one. People might have thought that was a post-perceptual effect, as if you heard something fuzzy and then your memory filled in the details. But if the predictive processing stories are right, then that’s the wrong way to think about it. This is just the same old story where top-down expectation meets incoming sensory signals with a balance that is determined by how confident you are in either the sensory signals or your top-down predictions.
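As an illustration only (this is not from the talk), the confidence-weighted balance described above can be sketched as the standard combination of two Gaussian estimates, where each source is weighted by its precision (inverse variance). The function name `fuse` and the numbers are assumptions chosen for the example, not a model of any real experiment.

```python
def fuse(prior_mean, prior_precision, sensory_mean, sensory_precision):
    """Combine a Gaussian prior with a Gaussian sensory estimate.

    Precision is inverse variance: higher precision means more confidence.
    The result is the precision-weighted average of the two means.
    """
    total_precision = prior_precision + sensory_precision
    posterior_mean = (
        prior_precision * prior_mean + sensory_precision * sensory_mean
    ) / total_precision
    return posterior_mean, total_precision


# Strong expectation, noisy signal: the percept stays close to the prediction.
confident_prior, _ = fuse(prior_mean=1.0, prior_precision=9.0,
                          sensory_mean=0.0, sensory_precision=1.0)

# Weak expectation, clean signal: the percept follows the senses.
confident_senses, _ = fuse(prior_mean=1.0, prior_precision=1.0,
                           sensory_mean=0.0, sensory_precision=9.0)

print(confident_prior, confident_senses)
```

On this toy reading, the White Christmas case corresponds to the first call, where a strong expectation dominates a weak signal, and the familiar song on a bad radio sits somewhere in between.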
The Bayesian brain, predictive processing, hierarchical predictive coding are all, roughly speaking, names for the same picture in which experience is constructed at the shifting borderline between sensory evidence and top-down prediction or expectation. There’s been a big literature out there on the perceptual side of things. It’s a fairly solid literature. What predictive processing did that I found particularly interesting—and this is mostly down to a move that was made by Karl Friston—was apply the same story to action. In action, what we’re doing is making a certain set of predictions about the shape of the sensory information that would result if I were to perform the action. Then you get rid of prediction errors relative to that predicted flow by making the action.
There are two ways to get your predictions to be right in these stories. One is to have the right model of the world and the other is to change how the world is to fit the model that you have. Action is changing how the world is to fit the predictions, and perception is more like finding the predictions that make most sense of how the world is. But it turns out that they’re operating using the same basic neural architecture. The wiring diagram for motor cortex and the wiring diagram for sensory cortex look surprisingly similar, and this story helps explain why. Indeed, the same basic canonical computations would be involved in both.
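A deliberately simplified sketch (an illustration, not the formal active inference account) may help make the two routes concrete. Both functions below reduce the same prediction error; one revises the belief and the other changes the world. The one-dimensional "world", the update rate, and the function names are all assumptions made for the example.

```python
def prediction_error(world, belief):
    """Mismatch between how the world is and how it is predicted to be."""
    return world - belief


def perceive(world, belief, rate=0.5, steps=20):
    """Perception: keep the world fixed, revise the belief to fit it."""
    for _ in range(steps):
        belief += rate * prediction_error(world, belief)
    return world, belief


def act(world, belief, rate=0.5, steps=20):
    """Action: keep the prediction fixed, change the world to fit it."""
    for _ in range(steps):
        world -= rate * prediction_error(world, belief)
    return world, belief


world_p, belief_p = perceive(world=10.0, belief=0.0)
world_a, belief_a = act(world=10.0, belief=0.0)

print(prediction_error(world_p, belief_p))  # close to zero: belief moved
print(prediction_error(world_a, belief_a))  # close to zero: world moved
```

The same error signal and the same update rule appear in both functions; only the variable being adjusted differs, which is the sense in which the two are said to share a computation.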
What’s most interesting about predictive processing is the way it gives you a simultaneous handle on perception and action by showing they obey the same computational principles. It immediately invites you to think about having a model of the world that simultaneously drives how you experience and harvest information from the world. At that point, there’s a standing invitation to stories like embodied cognition and the extended mind.
Once the predictive brain story is extended to the control of action in this very natural way, then there’s a standing invitation to start thinking about how we weave worldly opportunities and bodily opportunities together with what brains are doing in a way that is going to make systematic sense of the extended mind story.
Before I go there, it’s also worth saying a word or two about where the models that drive the predictions get to come from. Perceptual experience is the construct that lives on the border between sensory evidence and top-down prediction or expectation. That’s what you’re seeing in the White Christmas case and in the phantom phone vibration case. Just to see a structured world of objects around me means to know a lot about structured worlds of objects, and to bring those expectations to bear on the sensory signal. These are the stories that bring a structured world into view quite generally.
There are some rather nice cases that you can find online if you haven’t already of so-called sine-wave speech cases, where speech gets stripped of some of its natural dynamics and what’s left is a skeletal version of the speech. When you first hear it, it just sounds like a series of beeps and whistles, then when you hear the actual sound file and play that again, it sounds like a clear sentence being spoken because now you have the right top-down model, the right expectations. It’s like hearing a familiar song when it’s played in the shower on a bad radio receiver. It’s a very striking effect and experience. It gives you a real sense of what is happening when a predictive brain gets to grips with the flow of sensory information.
The real sentence might be something like, “The cat sat on the mat.” So, you first hear beeps and whistles, and then you hear the sentence. Then you hear the beeps and whistles again, but this time through those beeps and whistles most people will clearly hear the sentence. After a while, you can become a native speaker of sine-wave speech so that you could be played a brand new one and you would hear the sentence through the noise. So maybe it will be useful to play some examples. Here we go.
[Audio samples. Begin listening at: 13:00]
I hope you’ve now had the experience of bringing a stream of somewhat unruly sensory information under an active predictive model and hearing how that can bring a structured world of words into view. The very same thing is happening in visual perception. It’s the same effect that we were seeing in the White Christmas story, where your expectations are so strong that they make you think that there’s a signal there when there isn’t. But if predictive processing and stories of this kind are on track, then these are all exercises of the same constructive computational story. This is where human experience lives. As a philosopher, it sometimes interests me to wonder where this leaves the notion of veridical perception.
Perception itself is a kind of controlled hallucination. You experience a structured world because you expect a structured world, and the sensory information here acts as feedback on your expectations. It allows you to often correct them and to refine them. But the heavy lifting seems to be being done by the expectations. Does that mean that perception is a controlled hallucination? I sometimes think it would be good to flip that and just think that hallucination is a kind of uncontrolled perception.
The basic operating principle here is that you have a rich model of the world, a generative model, as it’s known in this literature. What that means is a model that is not a discriminative model which just separates patterns out and says, “This is a cat and this is a dog,” but rather a system that, using what it knows about the world, creates patterns that would be cat-like patterns or dog-like patterns in the sensoria. These systems learn to imagine how the sensory world would be, and in learning to imagine how the sensory world would be, they use that to do the classification and recognition work that otherwise would be done by an ordinary feed-forward discriminator. What that’s doing is making perception and imagination and understanding come very close together. They’re a cognitive package deal here, because if you perceive the world in this way, then you have the resources to create virtual sensory stuff like that from the top down.
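To make the contrast concrete, here is a toy sketch (an illustration, not a description of any system mentioned in the talk). The model holds a top-down template for each label, can "imagine" a noisy sensory pattern from a label, and recognizes an input by asking which label's predicted pattern leaves the least prediction error. The templates, the four-value "sensory" patterns, and the function names are hypothetical.

```python
import random

# Hypothetical "sensory" templates: each label predicts a 4-value pattern.
TEMPLATES = {
    "cat": [0.9, 0.1, 0.8, 0.2],
    "dog": [0.2, 0.8, 0.1, 0.9],
}


def imagine(label, noise=0.05, rng=random):
    """Top-down generation: produce a plausible sensory pattern for a label."""
    return [value + rng.gauss(0.0, noise) for value in TEMPLATES[label]]


def recognize(pattern):
    """Recognition by prediction: pick the label whose predicted pattern
    leaves the smallest squared prediction error."""
    def error(label):
        return sum((p - t) ** 2 for p, t in zip(pattern, TEMPLATES[label]))
    return min(TEMPLATES, key=error)


rng = random.Random(0)  # seeded so the example is repeatable
imagined_cat = imagine("cat", rng=rng)
print(recognize(imagined_cat))
```

The point of the design is that one body of knowledge (the templates) serves both directions: generating a pattern and classifying one. A purely discriminative classifier would only support the second.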
Systems that can perceive the world like this can imagine the world, too, in a certain sense. That grip on the world seems to be very close to understanding the world. If I know how the sensory signal is going to behave at many different levels of abstraction and at many scales of space and time, so I can take the scene as it currently is and project it into the future and know what’s going to happen if you hit the can and so on, that way of perceiving the world seems to me to be a way of understanding the world.
It will be very reasonable to ask where the knowledge comes from that drives the generative model in these cases. One of the cool things is that learning here proceeds in exactly the same way as perception itself. Moment by moment, a multilevel neural architecture is trying to predict the sensory flow. In order to do better at predicting the sensory flow, it needs to pull out regular structures within that flow at different time scales, so-called hidden causes or latent variables. Over time, with a powerful enough system, I might pull out things like tables and chairs and cats and dogs. You can learn to do that just by trying to predict the sensory flow itself.
A nice simple case of that will be something like learning the grammar of a language. If you knew the grammar of a language, that would be helpful in predicting what word is coming next. One way that you can learn the grammar of a language is to try again and again to predict what word is coming next. Pull out the latent variables and structure that is necessary to do that prediction task, and then you’ve acquired the model that you can use to do the prediction task in the future. These stories are a standing invitation to this bootstrapping where the prediction task that underlies perception and action itself installs the models that are used in the prediction task.
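A minimal sketch of the bootstrapping idea, assuming a tiny made-up corpus: the only training signal is which word comes next, and the table of counts that results is then the model used to predict. This is far cruder than pulling out grammar or latent variables, and the function names are illustrative.

```python
from collections import Counter, defaultdict


def learn(sentences):
    """Count which word follows which: a crude model built only from
    the task of predicting the next word."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequently observed continuation, or None."""
    options = follows.get(word.lower())
    return options.most_common(1)[0][0] if options else None


corpus = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "the dog sat on the mat",
]
model = learn(corpus)

print(predict_next(model, "sat"))
print(predict_next(model, "zebra"))
```

Even this word-pair table captures a sliver of structure (for instance, that "sat" is followed by "on" in this corpus); richer predictive models would need to recover deeper regularities to keep their prediction error down.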
There’s a pleasing symmetry there. Once you’ve got action on the table in these stories—the idea is that we bring action about by predicting sensory flows that are non actual and then getting rid of prediction errors relative to those sensory flows by bringing the action about—that means that epistemic action, as it’s sometimes called, is right there on the table. Systems like that cannot just act in the world to fulfill their goals; they can also act in the world so as to get better information to fulfill their goals. And that’s something that active animals do all the time. The chicken, when it bobs its head around, is moving its sensors around to get information that allows it to do depth perception that it can’t do unless it bobs its head around. When you go into a darkened room and you flip the light switch, you’re performing a kind of epistemic action because your goal wasn’t specifically to hit the light switch; it was to do something in the room. But you perform this action that then improves your state of information so you can do the thing you need to do. Epistemic action, and practical action, and perception, and understanding are now all rolled together in this nice package.
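The light-switch case can be given a rough numerical gloss (an illustration under strong assumptions, not a model from the talk). Suppose you are looking for your keys in one of four spots, and flipping the switch would reveal the location with certainty. The value of the epistemic action can then be expressed as the drop in uncertainty, measured here as Shannon entropy in bits.

```python
import math


def entropy(beliefs):
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in beliefs if p > 0)


def success_probability(beliefs):
    """Chance of finding the keys if you reach for the most likely spot."""
    return max(beliefs)


# Hypothetical beliefs over four possible key locations.
dark_beliefs = [0.25, 0.25, 0.25, 0.25]  # before flipping the switch
lit_beliefs = [1.0, 0.0, 0.0, 0.0]       # assumed: the light reveals the spot

information_gain = entropy(dark_beliefs) - entropy(lit_beliefs)

print(information_gain)
print(success_probability(dark_beliefs), success_probability(lit_beliefs))
```

Flipping the switch does nothing to the keys themselves; what it changes is the state of information, which in turn makes the practical action far more likely to succeed.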
It’s interesting then to ask, if your models are playing such a big role in how you perceive and experience the world, what does it mean to perceive and experience the world as it is? Basically, what these stories do is ask you to think again about that question. Take the sine-wave speech example and ask yourself when you heard what was really there. Did you hear what was there when you heard it just as beeps and burps? Or did you hear what was there when you heard the sentence through the beeps and buzzes? I don’t think there’s a good answer to that question. If predictive processing is on track though, one thing we can say is that even to hear it as beeps and buzzes is to bring some kind of model to bear, just one that didn’t reach as deeply into the external causal structure as the one that actually does have words in it.
An upshot here is that there’s no experience without the application of some model to try to sift what is worthwhile for a creature like you in the signal and what isn’t worthwhile for a creature like you. And because that’s what we’re doing all the time, it’s no wonder that certain things like placebo effects, medically unexplained symptoms, phantom phone vibrations, all begin to fall into place as expressions of the fundamental way that we’re working when we construct perceptual experience. In the case of medically unexplained symptoms, for example, where people might have blindness or paralysis with no medically known cause, or more than that, very often the symptoms here will have a shape that in principle can’t have a simple physiological cause.
A nice example: you might get someone with a blind spot in their field of vision. If you ask them what the width of that blind spot is when it is mapped close to the eye and when it’s mapped far from the eye, some people will have what’s called a tubular visual field defect, which means they say it’s the same wherever it’s mapped. This is optically, physiologically impossible. It’s pretty clear in cases like that that what’s doing the work is something like belief, expectation, prediction. It’s their model of what it would be like to have a visual field defect that is doing the work.
This is a broad sense of belief; it doesn’t mean beliefs that you necessarily hold as a person, but somehow they got in there. These multilevel systems harbor all kinds of predictions and beliefs which the agent themselves might even disavow. Honest placebos do work. For example, if someone is told that a pill is an inert substance, they can nonetheless get symptomatic relief from it as long as it’s presented by people in white coats with the right packaging; mid-level expectations are engaged regardless of what you, the person sitting at the top, think. In the case of medically unexplained symptoms, it looks like they’re the physiological version of the White Christmas effect. There are bodily signals there, and if your expectations about the shape of those signals are strong enough, then you can bring about the experiences that those expectations describe, just like White Christmas only done here in this somatosensory domain.
There’s interesting work emerging not just on medically unexplained symptoms, but even medically explained symptoms. If people live with a medically explained problem for long enough, they can build up all kinds of expectations about the shape of their own symptomology, which share a lot in common with the medically unexplained cases. The same person with a chronic condition on different days and in different contexts will have different experiences even if the physiological state, the bedrock state, seems to be exactly the same.
There’s a nice paper that came out recently by Van den Bergh and colleagues which was arguing that in the case of chronic effects, chronic pain, for example, an awful lot of ordinary symptomology has very much the character of the symptomology in the medically unexplained cases. So, it puts neuro-typical and less typical cases on a continuum and on par, which is quite interesting.
Acute cases are somewhat different because there you haven’t built up those regimes of expectation, and there’s a very straight signal being dealt with. Although, even there it seems as if your long-term model of the world makes a big difference as to how that signal plays out. There’s a large area here where work on placebo effects, medically unexplained symptoms, autism, the effects of psychedelics, schizophrenia, all of these things are being thought about under this general framework. Maybe this’ll be one of the test cases for whether we make progress using these tools with understanding the nature of human consciousness.
We had a visit from Robin Carhart-Harris, who works on psychedelics and is now working on predictive coding. There are some very interesting ideas coming out there, I thought. In particular, the idea that what serotonergic psychedelics do is relax the influence of top-down beliefs and top-down expectations so that sensory information can find new channels. If we think about this in the context of people with depression, maybe part of what goes on there is that we hold this structured world in view, in part by our expectations—and they’re not just about the world, they’re also about ourselves—and if you can relax some of those expectations and experience a way of encountering the world where you don’t model yourself as a depressive person, for example, even a brief experience like that can apparently have long-term, lasting effects.
Some of the Bayesian brain and predictive processing folks are doing some pretty cool things, looking at the action of psychedelics and the effects of sensory deprivation. For any of these things, you can ask how would those different balances—held in place by this prediction meets sensory information construct—play out under different regimes of neurotransmitters, for example, or under different environmental regimes where you might have a stroboscopic light being flashed at you very rapidly. The University of Sussex has one of these, and it creates surprisingly intense sensations. If you were to sit in it for a couple of hours, you might get full dissociation. Even for a few minutes, you get experiences of colors of an intensity that I’ve never experienced before.
If you begin to ask what these stories have to say, if anything, about the nature of human consciousness, there are several things to say. The first is that the basic construction of experience is already illuminated just by thinking in terms of this mixture of top-down expectations and bottom-up sensory evidence and the way that mixture gets varied in different contexts and by different interventions. At the same time, there’s a strong intuition some people have that consciousness is special and that whatever tools I was using to make progress with the White Christmas experiments and phantom phone vibrations are not getting to grips yet with what matters most about consciousness, which is how it feels, the redness of the sunset, the taste of the Tequila, and so on.
There’s quite a lot to say about how that should pan out. In some ways, my view is an illusionist view. A large part of this debate over consciousness is misguided because there’s nothing there. There’s a multidimensional matrix of real things, and among those real things, there’s a tendency to think there’s another thing and that other thing isn’t real. That’s one way of thinking about it.
Among the real dimensions are the perceptual dimension that we’ve spoken about, the dimension of acting to engage our world. There’s a lot of super interesting work on the role of interoceptive signals in all of this. Apart from the exteroceptive signals that we take in from vision, sound, and so on, and apart from the proprioceptive signals from our body that are what we predict in order to move our body around, there’s also all of the interoceptive signals that are coming from the heart and from the viscera, et cetera.
One of the effects of the general predictive processing story is that all of this is just sensory evidence thrown in a big pot. How I perceive the external world to be can be constantly inflected by how I’m perceiving my internal world to be. You see this, for example, in experiments where people are given false cardiac feedback. They’re made to think that their hearts are beating faster than they are. And under conditions like that, if they’re exposed to a neutral face, they’re more likely to judge that the face is anxious or fearful or angry. It looks as if what’s going on is that our brains are taking our constant in-touchness with signals from our own body as just more information about how things are.
In that sense, there’s a Jamesian flavor to some of the work on experience that comes out of predictive processing where the idea is that emotion, for example, is very much tied up with the role that interoception plays in giving us a grip on how things are in the world. William James famously said that the fear we feel when we see the bear has a lot to do with the experience of our own heart beating and our preparations to flee, all of that bodily stuff. If you took all that away, perhaps the feeling of fear would be bereft of its real substance.
There is something genuine in there: being subtly inflected by interoceptive information is part of what makes our conscious experience of the world the kind of experience that it is. So, artificial systems without interoception could perceive their world in an exteroceptive way, they could act in their world, but they would be lacking what seems to me to be one important dimension of what it is to be a conscious human being in the world.
We’ve got a number of real dimensions to consciousness. One of them is bringing a structured world into view in perception in part by structured expectations. The other one is an inflection of all of that by interoception. You can then ask questions about the temporal depth of the model that you’re bringing to bear, and that seems like an important dimension, too. If your model has enough depth and temporal depth, then you can turn up in your own model of the world. Technically here I can reduce prediction error by projecting myself into the future and asking which of the things a creature like me (the way I can see myself to be) might do would serve to reduce prediction error in the future. In that way, I turn up as a latent variable in my own model of the world. That seems important in human consciousness, at least. That’s part of what makes us distinguishable selves with goals and projects that we can reflect on. That matrix is real. The thing that I don’t think is real is qualia.
To understand that, we need to take a more illusionist stance. To do that would be to ask some version of what Dave Chalmers has lately called the meta hard puzzle or the meta hard question. That would be, what is it about systems like us that explains why we think that there are hard puzzles of consciousness, why we think that the conscious mind might be something very distinct from the rest of the physical order, why we think there are genuine questions to be asked about zombies. What Chalmers thinks is that any solution to the meta hard question, the question of why we think there’s a hard question, why we say and do the things that express apparent puzzlement of this kind—those are easy questions in Dave’s sense.
You can say something about how you would build a robot that might get puzzled or appear to be puzzled about its own experience in those ways.
You might think, well there’s something very solid about all this perceptual stuff. I can be highly confident of it, and yet how the world really is could be very varied. If you’re the sort of robot that can start to do those acrobatics, you’re the sort of robot that might invent a hard problem, and might begin to think that there’s more than a grain of truth in dualism.
One thing that we might like to do is try to take an illusionist stance to just that particular bit of the hard problem while being realist about all the other stuff, thinking that there’s something to say about the role of the body, something to say about what it takes to bring a structured world into view. Do all of that stuff and then also solve the meta hard puzzle, and you’ve solved all there is to solve. Whereas Dave Chalmers, I’m sure, will say, at that point, you’re showing us how to build a robot that will fool us into thinking that it’s conscious, in a certain sense it might even fool itself into thinking that it’s conscious, but it wouldn’t really be, because maybe it wouldn’t have any experiences at all when it’s doing all that stuff.
Dan Dennett’s take on consciousness is a perfect fit with a predictive processing take on consciousness. For many years, Dan has argued that there’s something illusory here, some self-spun narrative illusion. Predictive processing perhaps gives us a little bit more of the mechanism that might support the emergence of an illusion like that. Dan himself has written some interesting stuff on the way that predicting our own embodied responses to things might lead us down the track of thinking that qualia are fundamental special goings on inside us. I might predict some of my own oohing and aahing responses to the cute baby, and when I find myself in the presence of the cute baby, I make those responses and I think that cuteness is a real genuine property of some things in the world.
What Dan has argued there is that maybe we get puzzled because we’re fooled by our own Bayesianism here. This model of how things are gets to grips with how we’re going to respond, and we then reify something within that nexus as these intervening qualia. But you don’t need the weird intervening qualia; you just have responses that come about in certain circumstances. There’s a rather natural fit between Dan’s approach and these approaches, and they’re both a kind of illusionism where we’re both saying whatever consciousness really is, it can’t be what Dave Chalmers thinks it is.