Automemorial

In early 2014, as I was learning to be motivated by long-run considerations and make important tradeoffs, I started to worry that I was giving up something important about my old self - that some things that had been precious to me would never quite be worth the price of holding onto, so the parts of my soul that cared for them would gradually wither away, unused, until it wasn’t even tempting to try to reconnect with going to the opera, translating classical Greek, or any of the other things in my life that I chose for their beauty but not their utility.

It turned out that I was right, though not quite in the way I expected.

This is my story. It is an honest report of that story, but that is all it is.

This is the story of how, over the past year and a half, I died and was reborn. In it, you'll find the ways I had to learn to model the world to effect this transformation. I hope that some of them are useful to you.

Identifying with the algorithm

The ego is the constitution of the soul - the story we tell about ourselves to smooth some kinds of internal cooperation. There are many possible narratives one can tell, and the exact character and strength of one's narrative determines what one can do, and how well one's parts can cooperate.

Over the last year I rewrote my internal narrative. I started out attached to particular outcomes, and as I ratcheted up my ambition, this increased my stress levels, so that a year ago I felt like I was failing terribly at everything. The realignment involved giving up on this attachment, accepting that I might fail, and accepting that most of the probability mass where I succeed is in possible worlds where I have enough time to learn. This is the story I alluded to in my post on attachment and empathy:

Previously, I thought I’d desensitized myself to tickling, but it turned out that instead of learning to integrate the sensation and respond proportionally, I’d learned to shield myself against it and not respond at all. True integration of that part of life will require repeated exposure to the stimulus, mindfully, with an openness to being affected by it, in a safe context. Similarly, while I initially thought I’d felt nothing for plants because of a lack of underlying empathy with them, it turns out that I had built up a shield and was blocking my awareness of them because of the risk of empathy overload.

Just before my nature walk, I’d undergone a shift that reduced my attachment to near-term good outcomes for those around me, and I think this was part of what enabled me to perceive the tenuous position of the plants around me. Not an increase in my underlying capacity for empathy - but an increase in my ability to cope with that feeling. So it's not intolerable pressure anymore to notice that plants are alive and have unmet needs. But still I feel like, if I fully connected, then I'd be horrified all the time - noticing that things made of wood are corpses, etc.

I suspect this is related to why cognitive self-knowledge through reasoning and meditation, reducing attachments to particular things in the world, and empathy for all living things, seem to be connected in traditions such as some Buddhist ones. Full awareness implies full empathy. But if you’re attached to what you’re attentive to, if you feel compelled to alleviate the suffering of anyone whose suffering you’re fully aware of, then you’re caught between denying or hiding from reality, and empathizing with a world of searing pain. Only when you can accept the suffering of others without having to fix each instance, can you notice just how much suffering there is.

I started identifying with an algorithm running across all the possible worlds that look like mine: I want to do the thing that succeeds most often. This can often mean not doing the thing that salvages short-run value, but instead trying outright to fix problems, taking the risk that I might break things and only capture the benefit of learning next time - if there is a next time.
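A minimal sketch of "do the thing that succeeds most often," in Python. Everything in it is a made-up illustration rather than anything measured: the list of worlds, the assumption that the only difference between worlds is how many rounds I get, the learning threshold, and the two policy functions are all hypothetical.

```python
from fractions import Fraction

# Hypothetical worlds: each differs only in how many rounds I get
# before time runs out. These numbers are invented for illustration.
WORLDS = [1, 2, 3, 5, 8, 13]

# Assumption: the underlying problem can only be fixed after this much practice.
ROUNDS_NEEDED_TO_LEARN = 3


def salvage_policy(rounds_available):
    """Patch things up each round; assumed never to fix the underlying problem."""
    return False


def learn_policy(rounds_available):
    """Risk breaking things early; succeeds only if there is enough time to learn."""
    return rounds_available >= ROUNDS_NEEDED_TO_LEARN


def success_rate(policy, worlds=WORLDS):
    """Fraction of the listed worlds in which the policy reaches the goal."""
    wins = sum(1 for rounds in worlds if policy(rounds))
    return Fraction(wins, len(worlds))


POLICIES = {"salvage": salvage_policy, "learn": learn_policy}

# Pick the policy that succeeds in the largest share of worlds.
best = max(POLICIES, key=lambda name: success_rate(POLICIES[name]))
```

Under these assumptions, every world in which the chosen policy succeeds is one with enough time to learn, and the policy simply writes off the worlds that are too short.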

Existing across possible worlds

What do I mean by possible worlds? I'll give a few examples of attempts to think about them:

  • Modal realism
  • Tegmark levels
  • Timeless decision theory

For more on this, Greg Egan's novel Permutation City is an excellent introduction. None of these will be full explanations because this post is already running too long - I encourage you to follow the links if you're interested in learning more.

Modal realism

Analytic philosophers have taken an interest in the problem of counterfactuals: if X hasn't happened, what does it mean to say that if X had happened, then Y would have happened as well? One promising approach for expressing counterfactuals is called modal logic, and a related metaphysical model is called modal realism. In modal realism, counterfactual scenarios ("if X had happened") are real in some nearby world. What "actually" happened is just determined by which, of the many possible worlds, you're located in. You can be uncertain about the distribution of possible worlds. Separately, you can have indexical uncertainty: you might be uncertain which of several possible worlds you happen to be located in, since many of them may contain someone observing the things you're observing.

Tegmark levels

Physicist Max Tegmark (now running the Future of Life Institute) has similarly proposed a generalization of the quantum physics idea of "many worlds," in which there are different levels of generality at which we can talk about adjacent worlds. The generalization is called the Tegmark Levels. Level 1 is simply things that are physically elsewhere, far enough that we can't interact with them due to lightspeed limitations and the expansion of the universe, but implied by our model of physical reality. Levels 2 and 3 are successive levels of physical generalization - universes that our models of physics imply exist, but that we can't ever directly interact with and aren't meaningfully contiguous with the physical reality we can explore. Level 4 is alternative mathematical structures - the basic laws of physics might be deeply different, with different mathematical descriptions. Any of these levels could plausibly contain an observer with experiences indistinguishable from yours.

Timeless decision theory

Timeless decision theory (TDT) is an attempt to construct a decision theory that gets the intuitively "right" answer to a few classic decision theory problems, each of which trips up at least one of the standard decision theories. Examples include the smoking lesion problem (which Causal Decision Theory (CDT) gets right and Evidential Decision Theory (EDT) gets wrong) and Newcomb's problem (which EDT gets right and CDT gets wrong). The basic insight is that instead of deciding only for the agent you actually are, you decide for all agents following the same decision theory you do. This framing makes it much easier to see why you should do things like cooperate with a copy of yourself in the Prisoner's Dilemma - if your algorithm says "defect," so does your copy's algorithm. The resulting acausal trades don't have to be made between different agents in the actual world. You can bargain with versions of yourself in different possible worlds.
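A toy sketch of the Prisoner's Dilemma point, in Python. The payoff numbers and function names are assumptions invented for illustration, and this is not a formalization of TDT. It only contrasts two ways of framing the choice: treating the other player's move as fixed, versus choosing one output for every copy of the algorithm.

```python
# Payoffs to "me" for (my_move, other_move); higher is better.
# The specific numbers are illustrative assumptions; only their ordering matters.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}
MOVES = ("C", "D")


def best_reply_to_fixed_opponent(other_move):
    """Treat the other player's move as fixed and pick my best reply to it."""
    return max(MOVES, key=lambda my_move: PAYOFF[(my_move, other_move)])


def best_shared_policy():
    """Choose one move for every agent running this same algorithm.

    Against an exact copy, whatever I output the copy outputs too,
    so only the outcomes (C, C) and (D, D) are reachable.
    """
    return max(MOVES, key=lambda move: PAYOFF[(move, move)])


# Holding the other move fixed, defecting looks better either way.
fixed_opponent_choice = {other: best_reply_to_fixed_opponent(other) for other in MOVES}

# Deciding for all copies at once, cooperating wins.
shared_choice = best_shared_policy()
```

With these payoffs, the first framing recommends "D" whatever the other player does, while the second recommends "C", because mutual cooperation beats mutual defection and those are the only two outcomes a pair of copies can reach.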

The darkest evening of the year

At last year's Secular Solstice celebration in the Bay Area, much of the focus was on the challenges we must meet in order to make the future bright. Yet darker moments of the ceremony reminded us of all the people we will be unable to save. A friend, who works as a nurse, talked about the patients she’s lost, each person’s death keenly felt.

The year before, I'd either have felt this as pressure to turn up my sense of urgency and stress about the problem, or I'd have just blanked out and refused to feel for the lost lives at all. But this time, I noticed myself thinking, calmly - yes, we're losing people. We will continue to lose people. We are not currently up to the task of saving them all. I am not currently up to the task of saving them all. Nor are all my friends combined, and all the good people of the world.

If we don't learn to do better, we will fail, and we are likely to lose everyone.

If we do learn to do better, we might have enough time to save some of them.

The cosmic endowment is very large, and each moment of delay is a terrible, unrecoverable cost. We haven't yet built up our civilization enough to once and for all fix this.

Nick Bostrom's Fable of the Dragon-Tyrant tells the story of a world where death mainly comes from a single dragon that threatens to kill everyone unless we feed it living humans. From time to time people try to fight the dragon, but they always, always lose. Eventually, civilization organizes to fight it, and begins work on a rocket powerful enough to kill the dragon. Meanwhile, victims are still being carried to the dragon by the trainload, up until the exact moment the rocket hits the dragon. But then it does, and the dragon dies, once and for all. And the monstrous prospect of having one's life suddenly taken away is simply gone. It's over. We won.

That hasn't happened yet. We haven't built that rocket. We don't know how to build it. Each person we fail to carry to the point where we finally "build the rocket" is truly lost. And that's why we have to keep our eye on the ball. That's why I cannot afford to be attached to saving them all - because if I do that, I don't get to save any. I'm playing for the highest stakes, and it's likely that I'm already too far behind. But in almost every scenario where I improve my chances, it's because I do end up having enough time to learn, to get better.

A few months later, an acquaintance asked how people manage to frame failures as opportunities to learn rather than flinching in shame, and I was able to say:

Identifying with some eternal algorithm which this instantiation of myself is endeavoring to approximate, rather than with me in this particular situation, sometimes helps. I want to run an algorithm that notices when it's in the convenient world and can exploit this, and when it's in the annoying world where I have to be more patient and build up my resources, but I'm still not interested in the worlds where I fail regardless.

(This last part reduces the despair-cost of acknowledging hard truths. Yes, I'm farther behind than I thought. So in the worlds where I can get to the goal anyway, I notice the problem and begin the work of catching up.)

This is why I was mentally capable of taking a step back and spending time beginning to learn how to build my own world-model. The vast majority of the expected utility of my actions comes from the scenarios where I have enough time to help ensure humanity's survival. But I almost certainly don't have enough time to do that, unless I have enough time to learn how. This argument used to feel formally endorsed but empty - because I couldn't see past my attachment to already being enough, I couldn't see past my own panic when I considered the high stakes and poor prospects of success, to weigh the probabilities soberly. Now it feels, not only right, but obvious and intuitive. This is the story of how I got there.

I had to shatter my attachment to certainly succeeding. But this left me in pieces, no different from the pieces that had made me up before the shattering. So I had to find a more stable narrative to stitch myself back together into a whole, but a better-functioning one that could stand up to the pressures that had shattered me before.

This is the story of how I did that, and of some help I got along the way. Like any personal story, it's been heavily shaped by the teller, and leaves out some parts. I hope I've managed to keep the most important ones.

Shattering of attachment

Three years ago, I permanently ratcheted up my level of ambition. I spent the first year going to workshops, working with a life coach, and thinking through my priorities. Then I moved across the country and, over the course of the second year, found myself with a job, side projects, friendships, and romantic relationships that pushed me beyond what I'd ever done before.

By the end of the second year I felt desperately behind on each of these things. I needed to be the sort of person who could pull this off, but I felt like the dog that had finally caught the car. I didn't know what to do. And it felt unacceptable to admit my failings and ask for help. It had to be the case that if I tried harder, if I just threw more effort in, I could make it work. It had to be the case.

I broke down and burned out. It became unavoidably obvious that I wasn't up to the challenges I'd set for myself, at least not all at once, at the level of ability I had at the time. I left the job, dropped my side projects, ended one big romantic relationship, and decided to focus on preventing this from happening again:

Around the time I was additionally doing a lot of work around moving into a new group house with friends, I broke down. I felt like I was failing at everything, and I ran out of motivation. I thought - this can’t ever happen again. If I’m going to be able to protect the world, I need to be able to perform consistently at a high level and stay motivated. If I’m going to be able to keep my promises, I need to understand why my motivation died off, and learn how to predict and prevent this well in advance.

Deciding to do this didn't quite work. I don't commit to relationships - I commit to people. And in my naive desperation to already be enough, I'd decided to include my partner's well-being as a term in mine.

This set me up for a difficult situation when the relationship ended. It's normal to avoid talking with a former partner just after a breakup. But, as far as my narrative was concerned, that didn't affect the urgency of my obligations one iota. Because my commitment to care for my partner wasn't anything like transactional, wasn't bound up with being in a relationship, I felt that same need to know how they were doing and do things to help them - with no outlets for it. Not having information about what I could do to help made the task harder - took away my tools - but didn't cancel the obligation. I simply owed a debt that was, for reasons out of my control, undischargeable.

I suffered, because I'd bound myself to my partner by a deep attachment.

What is suffering?

Buddhism notoriously promises relief from suffering through nonattachment. Mindfulness practice can be successful in improving the well-being of people suffering from chronic pain. To understand exactly what is on offer, one must understand the answer to the question: What is suffering? I mean this both in the sense of "suffering - what is it?" and "what is doing the suffering?"

Pain vs aversion

I am not a scholar of Buddhism, but I suspect that there is some sort of equivocation going on around suffering. In the Western tradition, words that are typically translated as "suffering" also carry the successively more neutral meanings of "enduring" and "experiencing". "Pathos" shares a root with "empathy", and its Latin counterpart, "passio", gives us "passion". This sort of thing is important to bear in mind when interpreting translated claims such as "all is suffering." A pre-Socratic philosopher might have said the same thing, and meant by it nothing more than the claim that reality is things being affected by other things - what's truly real is change, rather than unchanging, stable existence.

And yet, when people talk about freedom from suffering, they typically mean freedom from aversive experiences: suffering as aversion. Suffering as the thwarted intent to change one's situation. Suffering as attachment to unavailable outcomes.

One way to reduce aversive experiences is to alter the external world in a more favorable direction - to increase the rate at which desirable things happen, and decrease the incidence and severity of undesirable events. Another way is to simply reduce the strength of your intent to change things you cannot change. This is called a reduction in attachment.

Most people who have meditated seriously have come to the realization that pain is not the same thing as an aversive experience. When meditating in a fixed position, you inevitably start perceiving pain signals. Initially, this triggers a mental sub-process that wants to fix the cause of the pain. You can learn to disengage this process. The pain signals are still perceptible, but for a time, they just seem like value-neutral data. I wrote about my experience doing this at a Vipassana Center retreat:

When I experienced physical discomfort during meditation, sometimes I’d notice that an aversion response had tensed up related muscles, and by practicing disempowering these aversions, eventually I learned a mental operation that kind of felt like disconnecting that process from other things, like I was cutting off the threads it pulled on to exert control. I’d have to repeat this move from time to time, but it felt more powerful than simply declining to reinforce it with conscious attention.

As noted, these processes will repeatedly be triggered by the stimulus they are tasked with avoiding, so it requires a fair amount of diligent practice to keep them offline for extended periods of time. And these processes exist for a reason - if we didn't want to avoid pain at all, we wouldn't, and that would make it hard to get things done or to avoid sustaining damage. For this reason, I don't think that people who want to live should attempt total purification of all attractions and aversions - but more conscious control over them is often convenient.

Imagine that the mind is made up of two things: perceptive processes that receive unorganized data and output a simplified summary of the data, and control processes that note deviations from a preferred situation and take action to move things back to the preferred state. (I've been influenced by perceptual control theory here.)
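The two kinds of process can be sketched as a small feedback loop in Python. This is only an illustration of the model as described here, not an implementation of perceptual control theory. The function names, the three-reading "sensor", and the `gain` parameter are all assumptions.

```python
def perceive(raw_readings):
    """Perceptive process: collapse unorganized data into one summary number."""
    return sum(raw_readings) / len(raw_readings)


def control_step(perceived, preferred, gain=0.5):
    """Control process: output an action proportional to the deviation."""
    return gain * (preferred - perceived)


def run_loop(state, preferred, steps=20):
    """Repeatedly perceive the state and act to pull it toward `preferred`."""
    history = [state]
    for _ in range(steps):
        # Hypothetical sensor: three readings scattered around the true state.
        raw = [state - 0.1, state, state + 0.1]
        action = control_step(perceive(raw), preferred)
        state = state + action
        history.append(state)
    return history


# Start far from the preferred situation and let the loop correct it.
trajectory = run_loop(state=10.0, preferred=2.0)
```

Each pass through the loop halves the remaining deviation, so the state settles near the preferred value. The control process never needs to know why the deviation arose, only that it exists.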

Control subagents

According to this model of control and perception processes, when you learn that something seems important for achieving your goals, you sometimes set up a new control process optimizing for that outcome. This process runs automatically, and somewhat autonomously exerts pressure towards actions that further its interests. I laid out a tentative model of this in my post on group cognition:

In order to perceive patterns, our minds build substructures, subprocesses, that take in raw data and report out some summary of this. (Some comparatively universal and possibly hardwired subprocesses include ones that detect edges or faces in our visual data - usually the output isn’t directly consciously available here - and ones that detect a situation where it’s advantageous to punish someone and output anger.) These subprocesses are allocated more resources when they seem more useful to other mental processes. [...]

A nascent perceptual process gets rewarded - and therefore gets bigger and more salient - for two reasons: intelligibility and relevance. Intelligibility is the extent to which the raw mass of sensations can be reasonably transformed into a coherent pattern. Relevance is the extent to which other parts of the brain use the output from this process.

An example of intelligibility is that it’s easy to ignore as “noise” conversations in a language you don’t know, or using jargon you’re unfamiliar with. The Baader-Meinhof phenomenon is a related example - once you notice something for the first time, it will often turn out that you were hearing about it all the time and just not registering it.

An example of relevance is that it’s easy to ignore as “noise” conversations you can understand perfectly well, where you don’t care about the topic.

It seems to me that my internal control processes have two different sorts of reward and punishment structures that determine the relevance of things.

One is a smooth preference gradient. I enjoy the occasional bowl of cornflakes, but at worst if I want one and don't get one I'm slightly unhappy. For other things, if I don't get them, I'm very unhappy, but the incentives are continuous, and generally I can still function.

The other incentive structure is based on attachment - if some condition is not met, this is unacceptable. I am only willing to consider the possible worlds where the goal is achievable. When someone is trapped underwater, their drive to breathe will eventually outweigh their drive to hold their breath.
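One way to picture the difference between the two incentive structures, as a rough Python sketch. The scoring functions, the use of negative infinity to stand for "unacceptable", and the example options are all assumptions chosen for illustration.

```python
import math


def smooth_preference(shortfall, weight=1.0):
    """Smooth gradient: unhappiness grows continuously with the shortfall."""
    return -weight * shortfall


def attachment(condition_met):
    """Attachment: any world where the condition fails is unacceptable."""
    return 0.0 if condition_met else -math.inf


def evaluate(option):
    """Total score of an option under both incentive structures."""
    return smooth_preference(option["shortfall"]) + attachment(option["condition_met"])


def distinguishable(scores):
    """True if the scores rank at least one option above another."""
    return len(set(scores)) > 1


# Hypothetical options, invented for illustration.
options = [
    {"name": "skip cornflakes", "shortfall": 1.0, "condition_met": True},
    {"name": "miss something big", "shortfall": 50.0, "condition_met": True},
    {"name": "break the commitment", "shortfall": 0.0, "condition_met": False},
]
acceptable = [o["name"] for o in options if evaluate(o) > -math.inf]

# If no available option meets the attached condition, every score is -inf
# and the scores no longer say which action to prefer.
all_blocked = [
    {"name": "try harder", "shortfall": 1.0, "condition_met": False},
    {"name": "give up", "shortfall": 50.0, "condition_met": False},
]
blocked_scores = [evaluate(o) for o in all_blocked]
```

Under the smooth gradient, a large shortfall is bad but still comparable with everything else. Under attachment, a failed condition swamps every other consideration, which is what restricts attention to the worlds where the goal is achievable.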

Grief and paradigm shifts

The advantage of the attachment motivation structure is that, for truly important goals, it can lead to stronger investments of effort. If your world model does not show you a way to achieve the thing, nonattached control systems will simply give up. It would be very convenient to be able to fly, but I don't feel too bad that I can't - mostly I just don't think about it. But attached systems will refuse to give up, and will instead exert a strong pressure on you to look for ways your world model might be wrong, to search for neglected pathways, to double back and check again, to shut up and do the impossible. This can lead to important paradigm shifts, deep changes in one's model of the world.

The disadvantage of the extreme pressure required to force paradigm shifts that were a priori unlikely is that it's not smart pressure - it's still present even if it is literally impossible to achieve the required outcome. This can drive you temporarily or permanently insane. More than one friend who has introspected carefully on this sort of thing has reported that when they notice that a core goal seems impossible, this kind of double bind can lead to either a manic or hypomanic episode, in which they experience a desperate sort of credulity in any plan that claims to get them the thing they want, or depression, in which, if they can't have the thing, then no action seems to have any value.

A third response is to become stably wrong, to adopt some particular unjustified belief that, if true, allows you to achieve your goals. In the case of social goals, where you want to occupy some role with respect to other people, this response is called narcissism - especially when it fails. But sometimes the pure social pressure of confident expectations causes it to succeed instead, in which case it is only called narcissism by losers.

These responses are similar to those described in the five stages of grief model, in which people mourning the death of a loved one go through denial, anger, bargaining, and then depression, before acceptance. If you receive information that makes accomplishing one of your attachment goals impossible, you can simply refuse to take in the information (denial), you can experience it as a narcissistic injury (anger), you can adjust your world model to make it possible again (bargaining), or you can notice that every possible action is equally unacceptable (depression).

By narcissistic injury, I don't necessarily mean something pathological. I mean any case in which information that threatens someone's narrative about who they are and what their story is, provokes a defensive, angry reaction, as though it is a boundary violation. Pathological narcissism seems like an extreme version of this in which the narrative is exceptionally rigid and onerous.

I am modeling narcissism as a continuous parameter that takes a range of values in healthy, functional people. It takes all sorts to make a world, and modeling a variety of "pathological" attributes this way has helped me model minds better. Psychoticism, borderline, neuroticism, sensory fine-tuning, narcissism - at very high levels these produce psychosis, borderline personality disorder, neurosis, the kind of autism where you're in constant distress, narcissistic personality disorder - but it seems to me that at moderately high levels you get genuine freethinkers, Phoenixes, people with exceptional epistemic taste, autistic savants, and charismatic visionaries. There are downsides to all of these, of course, and some combinations of traits work better than others. But if you take a binary view, pathologizing high enough levels of various mental parameters and rounding non-pathological cases down to zero, you're missing out on this kind of detail.

Finally, the acceptance stage happens when the mind as a whole finishes learning that this particular control process, or subagent, no longer serves any useful purpose. In other words, when it is allowed to die. The process of propagating the belief that the attachment goal of a control process is permanently unavailable is called grieving. It is unpleasant because it involves, whenever the bereaved notices themselves expecting something good to be caused by the achievement of the attachment goal, processing the bad news that this good thing will not happen.

I am going through the long tail of the grieving process with respect to my grandparents. Mostly I don't feel the loss, but on some very rare occasions I notice myself wondering about something that they'd have the answer to, or thinking that some news from my life would be a thing they'd like to know about, and have to remind myself that this good thing isn't available anymore. Fortunately for my sense of well-being (but unfortunately for my sense of filial duty), the control process that valued my grandparents' existence wasn't very closely tied to my core sense of self.

Some attachment-based control processes, however, are extremely sticky, because the attachment goal feels like an integral part of one's values. When an attachment-based control process has learned that its goal is impossible to achieve, this can feel from the inside like wanting to die. The control process perceives that its future existence is nothing but punishment, that there are no outcomes it is not averse to, to the point of unacceptability. And the only way to stop experiencing this is - termination.

People are made of more than one such control process, and sometimes deal with this sort of problem by not thinking about it. But then the mental energy stays locked up in a control process that has been frozen in stasis, but not allowed to die. And if the whole person identifies with this process, as an essential part of their ego narrative, this can result in the person as a whole feeling the desire to die.

That's the situation I was in a little over a year ago. I had accepted responsibility for a thing. I was not able to do this thing. I wasn't willing to accept letting the control process die, because that would be disloyal. I didn't want to die as a whole person, because that would mean giving up on everything I valued. But living, undistracted from this thing, felt unacceptable. There were no acceptable options.

Then I got the one thing that had the power to release me from this bond. My former partner affirmatively requested that we let the old attachment lapse. This was the last thing holding together my old, non-functioning ego structure, and this news gave me permission to take a hammer and smash this link, in a way that felt like it did not violate my sense of honor.

I shattered my identification with that essential attachment.

Esalen, an interlude

The shattering improved things - but all the parts were still there in their old form. I didn't know how to begin to reassemble myself until a few weeks later.

Neurons are living things, and neural patterns don't vanish as soon as the underlying demand for them goes away. When you stop feeling that a control process is relevant, it doesn't get deleted immediately just because you've taken away its supply of attention. One still has to actually go through the grieving process. Acceptance was for me only the beginning of the grieving process, and I wasn't sure what to do about this, so I decided to focus on the things I could do something about. I decided to focus on my project to get in touch with my motivations.

One thing I tried was going down to Esalen, a retreat in Big Sur, between San Francisco and LA. On the drive back up, I felt an urge to turn my car into the oncoming traffic. Of course, I didn't do it; this internal suggestion was decisively slapped down due to the considerations that (a) I hadn't properly considered the pros and cons of this decision, (b) the oncoming traffic hadn't consented, and (c) I could always change my mind later. But it seemed like a big problem that part of me wanted to do this, and problems are for solving.

I let my internal parliament meet, and made a deal with myself. I'd give myself a year to try and repair myself into something that wholeheartedly wanted to live and could achieve my core values. If that failed - if I was still miserable after a year - then I'd try giving up on all my unachievable goals, and trying for nothing more than living a nice life. And if after a year of that I was still miserable, I would be willing to seriously consider all the options.

(It's been a little under a year and I seem to have durably fixed the problem, so don't expect to see me trying my fallback plan any time soon.)

During this process, I noticed - as I had a few times before - that it felt as if something hidden behind many veils, that usually lets me mind my own business, had reached out and directly steered me - and had been displeased at the need to do so. That some higher self had seized direct control to keep me from doing anything too stupid. (It's not that the internal vote was close, but even a small risk was enough to justify interference.)

I had dinner with a friend the night I got back, and they later reported that I seemed happier than I had in months.

Under the surface

Circling and center

Shortly after I got back from Esalen, I went to the Integral Center’s Aletheia workshop on authentic relating, to try and connect with my authentic sense of self. This was also part of my project to get in touch with my own experiences and preferences. My hope was that other people would notice when I was bullshitting or not really in touch with my true self, and call me on it. Instead, the workshop produced the most intense feeling of not being seen - or seeable - I’d ever experienced.

Aletheia is centered around a practice called Circling. Vipassana meditation is a mindfulness practice oriented towards experience on the sensory level; Circling is a mindfulness practice focused on the social level of experience. This does produce, in one sense, an intense experience of being seen; talking about one's experience of an interaction can produce a sort of unguardedness where things aren't going through social filters, since there's not enough metacognition to examine social responses and use them at the same time. But while I did feel authentically seen and understood, it seemed like it was only one particular sort of surface that people could see, not the core.

I found that the more I dropped filters, the less people felt connected to me. And I found that the things they could relate to about my surfaces felt increasingly like not me. As if they were talking about the car I was driving, and feeling an emotional connection with it, and imagining that it was me.

Initially I found this pretty upsetting. I had a sense that other people couldn't tell when I was being genuine from when I wasn't, because they were just looking at my face or mask. I was hoping that they’d be able to help me see when I was and wasn’t being the “real me,” but they just favored some faces I can put on over others. Maybe that’s all I was - a bunch of different faces that can cooperate with each other.

I worried that maybe there wasn't a real me, that all people could see was the masks because I was just a bunch of masks. After my last Circle, I went off to a corner and cried for a bit, and talked through my feelings alone where there was no one to mis-see me, and then something clicked. The thing I wanted people to connect to wasn't a surface. It wasn't a thing that could be present in real-time conversation. It was the thing behind and above this, that *caused* me to be able to show up in a certain way for people.

Then something clicked - the real me is in no way a face. The real me - the self I identify with, my locus of agency - is never present in the moment. It works on longer timescales than that. The pressure was off. It was no longer acutely distressing that people couldn’t see me, because I knew why - and what I really was. The self I wanted people to see is the one I tried to describe more directly in this post:

This is the story of my life, through the lens of motivations, of actions I took to steer myself towards long-term outcome, of the way the self that stretches out in causal links over long periods of time produced the self I have at this moment. This is only one of the many ways to tell the story of my life.

When I talk about something that causes me to have emotional responses I can’t suppress - for instance, I might burst out into tears - one way to relate to that is to try to be with my emotions in the moment, because I’m being “real”. This is the instant readouts form of authenticity:

There’s this idea of authenticity: you know who someone truly is by seeing them in their unguarded moments, seeing uncensored emotions, that’s when you can have a real interaction with them, that’s when you can see their true self.

This is counterintuitive to me. When I let down my guard and am my completely unfiltered self, people often find me incomprehensible. What’s more, they think I am being less authentic. When I let my social guard down and say things as soon as I think them, people say that they find it hard to relate to me and encourage me to just be myself. When I carefully filter and reframe things, and shape my behavior to get the interaction I want, I hear people say, “I can tell that you’re really being genuine with me.” [This happened at Aletheia.]

[...] By default, I tend to be in a calculative mode, which is less conducive to producing instant readouts. When I deliberately push myself into a less calculative and more emotive state, this doesn't give people an unbiased measurement of what it's like to be me. It doesn't show them where my center of mental gravity is. But it shows them something in a way they can quickly grasp and verify on a gut level.

Authenticity is the quick-read thermometer of social interactions. It doesn't tell you everything, it's not necessarily representative of the entire object being measured, but it tells you something precise about what's happening right now in the spot being measured.

Another way to relate to that is to infer that I’ve chosen to steer the conversation in a direction where I’ll have that kind of response, because people bond over vulnerability and displays of emotion. If you drew that inference, I’d feel like you saw me better. A third way to relate to that is to infer that I’ve spent time learning how people become close, and spent time refining my search process for people to become close with, in order to find someone like you. Then I’d feel like you could see the real me.

Once I realized this, it was a lot less upsetting that people couldn't see me, because I understood why. I'd previously had occasion to mention to friends that sharing things typically described as “vulnerable” doesn’t generally feel vulnerable to me at all. This is due to something I’ve often described as a firm sense that other people can’t see me - the true me is invisible. All they can relate to is some face I put on. I feel this most keenly when people try to read my emotional responses to things.

But now this all made perfect sense - the thing I wanted others to see had no surface. I needed them to infer it. It's sort of the underlying algorithm that adjusts, not my feelings in the moment, but my disposition to feel certain ways about certain things.

Godshatter, Daemon, Avatar, Atman

There are minor spoilers for Vernor Vinge's A Fire Upon the Deep, Scott Alexander's Unsong, and the Bhagavad Gita in this section.

Godshatter

Before my avatar nature was awakened, I was godshatter.

In Vernor Vinge's A Fire Upon the Deep, the protagonist has to go on a mission into a protected area where superintelligences can't follow. Old One, an allied superintelligence, sends Pham Nuwen, a human working with it, to try and help. Before they leave, it alters Nuwen's brain, effectively uploading some simplified, stripped-down parts of its mind into him. This is a thing that happens from time to time in this universe, and the shards of a superintelligence embedded in a human-level intelligence's brain are called "godshatter".

Over the course of the story, from time to time Nuwen slips into a fugue state where his normal mind is offline, and Old One's godshatter runs on his hardware. It's slow, and it's not transparent to Nuwen what it's doing, but it works, and it helps them get where they need to go, making judgment calls based on algorithms Old One didn't have the time to explain:

“What about the godshatter state? I see you for hours just staring at the tracking display, or mucking around in the library and the News,” scanning faster than any human could consciously read.

Pham shrugged. “It’s studying the ships that are chasing us, trying to figure out just what belongs to whom, just what capabilities each might have. I don’t know the details. Self-awareness is on vacation then,” when all Pham’s mind was turned into a processor for whatever programs Old One had downloaded. A few hours of fugue state might yield an instant of Power-grade thought— and even that he didn’t consciously remember. “But I know this. Whatever the godshatter is, it’s a very narrow thing. It’s not alive; in some ways it may not even be very smart. For everyday matters like ship piloting, there’s just good old Pham Nuwen.”

Godshatter closely resembles my initial experience of what I now think of as the true me; there was a part of me that was deeply wise, but couldn't respond to things in real time. This feels similar to the sense I have of some deep, wise, but slow part of me, that isn't good at steering me in real time, that needs a while to process information, but does eventually process new information and adjust my behavior.

To the extent that I do participate in some sort of higher soul, it is often slow to respond - though now that I've put my locus of self there, it's getting faster, and bringing more parts of me into alignment with it. Often I take in a need to change in some way, internalize it, don't put in conscious external effort - but then, a month later, find myself acting in the needed way, as though this slow, powerful part of me had done the proper nudging work, even though at no point did I feel the wrenching force I associate with redirecting my drives. My experience of godshatter was slow - just like the Portia labiata spider is slow:

Here's the thumbnail sketch: we have here a spider who eats other spiders, who changes her foraging strategy on the fly, who resorts to trial and error techniques to lure prey into range. She will brave a full frontal assault against prey carrying an egg sac, but sneak up upon an unencumbered target of the same species. Many insects and arachnids are known for fairly complex behaviors (bumblebees are the proletarian's archetype; Sphex wasps are the cool grad-school example), but those behaviors are hardwired and inflexible. Portia here is not so rote: Portia improvises.

But it's not just this flexible behavioral repertoire that's so amazing. It's not the fact that somehow, this dumb little spider with its crude compound optics has visual acuity to rival a cat's (even though a cat's got orders of magnitude more neurons in one retina than our spider has in her whole damn head). It's not even the fact that this little beast can figure out a maze which entails recognizing prey, then figuring out an approach path along which that prey is not visible (i.e., the spider can't just keep her eyes on the ball: she has to develop and remember a search image), then follow her best-laid plans by memory including recognizing when she's made a wrong turn and retracing her steps, all the while out of sight of her target. No, the really amazing thing is how she does all this with a measly 600,000 neurons— how she pulls off cognitive feats that would challenge a mammal with seventy million or more.

She does it like a Turing Machine, one laborious step at a time. She does it like a Sinclair ZX-80: running one part of the system then another, because she doesn't have the circuitry to run both at once. She does it all sequentially, by timesharing.

She'll sit there for two fucking hours, just watching. It takes that long to process the image, you see: whereas a cat or a mouse would assimilate the whole hi-res vista in an instant, Portia's poor underpowered graphics driver can only hold a fraction of the scene at any given time. So she scans, back and forth, back and forth, like some kind of hairy multilimbed Cylon centurion, scanning each little segment of the game board in turn. Then, when she synthesizes the relevant aspects of each (God knows how many variables she's juggling, how many pencil sketches get scribbled onto the scratch pad because the jpeg won't fit), she figures out a plan, and puts it into motion: climbing down the branch, falling out of sight of the target, ignoring other branches that would only seem to provide a more direct route to payoff, homing in on that one critical fork in the road that leads back up to satiation. Portia won't be deterred by the fact that she only has a few percent of a real brain: she emulates the brain she needs, a few percent at a time.

Sometimes godshatter takes something like direct control, but this is extremely expensive; it's flying blind, not able to process new data as it gives instructions, since it can't process new inputs fast enough. That's what happened over the first few months of the past year; I made a long list of interventions that seemed like they might help, and dragged my avatar through them, one by one, because while me-in-the-moment felt despair, my deeper self knew something like this was both necessary and promising, and didn't have time to figure out the details.

For substantially longer than I've had conscious access to my higher self, I've felt responsible for its actions. I never saw how I could meaningfully report on my internal state without having some expectation that I'd continue to feel the same way. This has led to lots of miscommunication:

When I said that I wanted to see the movie Toy Story, I meant that I had considered the decision, taken into account the fact that there are many movies available and would be many more in the future, and decided that this one was a priority. I’d had the initial flash of wanting, noticed that it stayed around, and waited for a good opportunity to express it (a car ride in which there was no other conversation).

My mom assumed that I was expressing a momentary preference as soon as it came to mind. If it were persistent, then surely it would come to mind repeatedly, and I’d express it repeatedly.

By this hypothesis, most people don’t have a single well-fleshed-out model of themselves and their preferences. They have a bunch of partial models that make explicit verbal claims far beyond their evidence base - but the verbal claim isn’t meant to be a promise about future behavior.

Another example of a miscommunication produced by this misalignment of expectations is that I've found it difficult to persuade people that I need help with problems until I've solved them. At the beginning of the process, I can only just barely name the problem - and this isn't a strong enough social signal for people to notice I might need something. Once I've already gotten a handle on the problem, I feel confident enough to describe my prior internal state, and offers of help start rolling in - but I can only speak eloquently about my state because it's resolved!

People typically don't have conscious access to some of the algorithms that determine their long-run dispositions towards things. If something makes them like someone less, they often don't know that some distinct change has happened, but just end up cancelling plans to hang out more often than before. If they tried to honestly report their intentions with explicit verbal communication, they'd end up reporting things about their current mental state but not their true long-run expected behavior.

This seems adaptive: we should expect that the smartest algorithm your brain can use (especially for problems humans specialize in like social coordination and competition) is way more complicated than the smartest algorithm you can consciously evaluate in the same timeframe. This doesn't typically mean that it's impossible to understand your motives, but it does often mean that it will take a substantial time investment to be able to report them truthfully.

I sometimes hear people talking about this set of facts - that the algorithms determining our long-run dispositions are not transparent by default - as if it meant that people are not agents. My interpretation is very different: I believe that the deep thing that determines the character of our surface persona, the deep thing that steers our unawakened avatars, that is by default inaccessible to the conscious mind - that this thing is an agent.

Daemon

I think that for a lot of people this underlying, slow-to-change, hard-to-talk-with part of the soul feels like something outside them. When people talk about connecting to Source, this is what they mean. When they have a Daemon that speaks to them in times of great need or opportunity, this is the cognitive process that represents. Sometimes people think of this as a thing that happens to them, a message from a higher power or other outside force. One example of such a shift is falling in love. To me this feels like a thing I can do in response to a situation that calls for it, but for other people it feels like some outside force is pulling their soul along or something like that.

Socrates talked about having a Daemon that was usually silent, but occasionally told him not to do certain things, if it would be deeply unwise to do so. I have a friend who talks about having a sense of Luck that can point out unusually good opportunities to fulfill their values. It's not always active, but it's always watching, and comes out when they're not in pain. This friend has been working on fully awakening their Daemon, realizing that they are one and the same, or that it's the most deeply and permanently "you" part of you.

I don't know which god you are, but it seems good for you to find out.

Avatar

Before Pham Nuwen is given godshatter, when the protagonist first meets him, he is a human emissary of a cooperative superintelligence named Old One. Old One made Pham Nuwen, so he's mostly human but has been briefed on Old One's interests in advance of the meeting, and also has a direct brain interface link connecting him with Old One.

Unlike Pham Nuwen, it seems like my Godshatter insights eventually propagate into my faster control processes, adjusting how I respond to events in the moment. My self who interacts with people on short timescales is something like a semi-autonomous intelligent avatar or emissary that I have programmed and that has access to the feelings etc. I've already processed. But if others expect some kind of emotional response from me immediately, they're missing something important, which is: it's not really my response until it's percolated up to me, and I've had time to process it. Then the emissary gets a software update.

An example of the kind of thing that made me feel unseeable was making a calculated choice to think about thoughts that I knew would make my avatar have emotions it couldn't help but show, or training my avatar to enter a certain mental mode where it talks "from the heart," and having people respond to seeing these in the moment by talking about how genuine I am, or vulnerable, or letting them really see me, or feeling connected with me. (Even though producing that impression is the whole point.) It makes me feel unseeable because they're assuming that the avatar's all there is. Like talking to Pham Nuwen when the connection's offline and thinking you've got a direct line to Old One.

Less metaphorically, the part of me that has to respond to things in real time has access to only a sort of frozen summary of my deeper self. This is part of what makes it hard work to talk to people who want to know, quickly, my true response to new information. I can query myself for a gut reaction, and this is somewhat informative, but not enough to indicate whether I've really updated. Instead, it's just a current best guess, and I have only really processed the information when it's gone through the deeper me.

I have been getting these levels to work together faster and more smoothly lately - improving the bandwidth of the connection between my true self and avatar - but at first it felt like things had to drop out of sight for a while, where the spotlight of consciousness couldn't look. But my habit of taking responsibility for things my "true self" did, even if I didn't have conscious access to why, seems to have helped reshape my faster control loops to align well with this process. I think this alignment also sped up once I gave up on already being enough, and is a big part of why I feel much better integrated as a person than I used to.

I now recognize that I've always, on some level, assumed that the sort of thing I think of as me-in-the-long-run is what I'm talking to when I talk to someone - and been frustrated when it hasn't been integrated enough to answer on the verbal level, and I end up speaking to the unawakened avatar.

There's a scene from Unsong that captures how I feel about this pretty well:

“Stop it and listen to…” Jala paused. This wasn’t working. It wasn’t even not working in a logical way. There was a blankness to the other man. It was strange. He felt himself wanting to like him, even though he had done nothing likeable. A magnetic pull. Something strange.

Reagan slapped him on the back again. “America is a great country. It’s morning in America!”

That did it. Something was off. Reagan couldn’t turn off the folksiness. It wasn’t even a ruse. There was nothing underneath it. It was charisma and avuncular humor all the way down. All the way down to what? Jala didn’t know.

He spoke a Name.

Reagan jerked, more than a movement but not quite a seizure. “Ha ha ha!” said Reagan. “I like you, son!”

Jalaketu spoke another, longer Name.

Another jerking motion, like a puppet on strings. “There you go again. Let’s make this country great!”

A third Name, stronger than the others.

“Do it for the Gipper!…for the Gipper!…for the Gipper!”

“Huh,” said Jalaketu. Wheels turned in his head. The Gipper. Not even a real word. Not English, anyway. Hebrew then? Yes. He made a connection; pieces snapped into place. The mighty one. Interesting. It had been a very long time since anybody last thought much about haGibborim. But how were they connected to a random California politician? He spoke another Name.

Reagan’s pupils veered up into his head, so that only the whites of his eyes were showing. “Morning in America! Tear down that wall!”

“No,” said Jalaketu. “That won’t do.” He started speaking another Name, then stopped, and in a clear, quiet voice he said “I would like to speak to your manager.”

Reagan briefly went limp, like he had just had a stroke, then sprung back upright and spoke with a totally different voice. Clear. Lilting. Feminine. Speaking in an overdone aristocratic British accent that sounded like it was out of a period romance.

“You must be Jalaketu. Don’t you realize it’s rude to disturb a woman this early in the morning?” The President’s eyes and facial muscles moved not at all as the lips opened and closed.

“I know your True Name,” said Jalaketu. “You are Gadiriel, called the Lady. You are the angel of celebrity and popularity and pretense.”

“Yes.”

“You’re…this is your golem, isn’t it?”

“Golems are ugly things. Mud and dust. This is my costume.”

I feel like doing that with people all the time, except they don't have any idea that they have a manager and I don't have the right Names to force it.

The Bhagavad Gita is a credible attempt at a description of the manager, and tries to teach you how to find yours.

Atman

I have an old friend, Isaac, who grew up in a Transcendental Meditation household, went to Maharishi University of Management in Iowa, and has branched out into a bunch more spiritual improvement and inner work stuff. I was catching up with him on a trip back East, and told him about the way I identify with my long-run algorithms rather than me-in-the-moment, and he said, "Oh, you identify with Atman!" The eternal, unchanging part of the self described in the Bhagavad Gita.

I only vaguely remembered the Bhagavad Gita, but this felt right. Rereading it, I found that from time to time I disagreed with the exact details of the description, but it felt like a genuine attempt to describe the thing that is actually me, by someone who'd actually experienced that sort of selfhood. "This is what you really are and have been all along," it said. "The aspects of your being that are attached to things on the surface all flowed from this underlying algorithm with an unchanging orientation towards your true utility function. This body, these particulars, this is your avatar. They're all you have material access to right now, but you get to decide what to do with them, based on your true, long-term interests in steering things the way you want."

The Bhagavad Gita is a dialogue between a human and a god. Arjuna, the main character, is a prince fighting in what is effectively a rebellion against the legitimate king, who's unjustly dispossessed some lesser nobles. His forces are arrayed against his adversary's forces, and he is riding his chariot towards the field of battle to begin the engagement.

Then Arjuna begins to feel doubt, and says as much to his charioteer. He notes that in fighting this war, he is fighting those who brought him up - he has cousins and uncles on both sides, many of those who fed and trained him are on the other side. Moreover, he is fighting for justice, but bringing war where there might otherwise be peace. He's fighting for his side's legitimate claims, but bringing war and thus lawlessness. He wonders how he can possibly be right to fight.

His charioteer responds that he happens to be an avatar of the god Krishna, and proceeds to explain the nature of morality and right action to Arjuna.

I'm not going to explain the whole text here, especially since I don't fully understand it, and have only read it in translation, a very imperfect way to understand a subtle text. Only one of its core ideas is relevant here: the idea of an unchanging, eternal self - and that once you recognize that this is you, you no longer suffer due to changes in your mutable self, are no longer dominated by your drives. Your universal self, Atman, is one with Brahman, the underlying principle of being that all things partake of.

This is not a philosophy of external inaction. Instead, it says to let your lower self do what it's meant to do, without imagining that your true self is affected by it. In so doing, your actions no longer incur karma, sin, damage. You no longer become attached to the results of your actions, and you can act vigorously in the world, when appropriate, without placing bits of your self into them. You no longer identify with your basic drives, but let them want things and cause actions when appropriate, while standing above them, unmoved.

Krishna talks about contemplation and right action as two different modes for finding Brahman. In the contemplative path, you focus on purifying your mind, and right action follows. In the active path, you focus on right action and relinquishing the fruits of the action, and right mind follows.

In Arjuna's case, this cashes out to: your higher soul led you here, to a place of action, not contemplation. You chose this path, so you chose the course of action, and it seems to be your strongest suit. That means that you should lean into what it chose as right action, and trust that your explicit understanding will catch up to your lived experience of the good. Trying to suddenly let your intellect overrule yourself on the field of battle seems like a poor choice; the important thing is to continue to act in a way that leads you to full wakefulness, not to get each particular call right on the merits. Your body and lower soul will continue to be involved in karma, but you as atman no longer are.

So far, this seems like a good fit for my sense of identifying with the algorithm. There's a deep sense of peace I can access when I let go of my sense that I need to feel all my preferences all the time, keep them at the core of my soul - when I can let them go when there's nothing activating them. Instead of thinking of my feelings as identical to my preferences, I think of them as things my brain can do, that are sometimes appropriate. I'm not afraid to lean into strong feelings when they are appropriate for the situation - and I'm able to step back and reflect, at peace, when that's the appropriate action. I've got much less of what I used to call mental friction than I had before.

But I still intend to fulfill my values, not fully generic ones. So I don't fully accept the Bhagavad Gita's solution.

I don't think the most universal possible form of the soul is the same as the generalized source of right action I want to look up to. And different people may have different higher selves. Coherent Extrapolated Volition, eudaemonia, and atman are all pointing at an underlying regularity in the world - I often find myself acting effortlessly in concert with others once I think clearly about things - but I think the right point of regularization to stop at is the point at which I'm still a distinct person. I don't want to just be the all-soul, and I'm willing to suffer for that. Not everyone is the same person, or at least not in any interesting way. Not everyone was the same person in a past life. There are levels of abstraction below that one, which are the point at which you should stop. You should not try to be generic human, the allgod, but instead you should stop while you still have your personality. There's no need to sell your whole soul to darkness (the literal meaning of "Krishna"). A small fraction will do.

Arjuna's charioteer claims to be a reincarnation of Krishna, and I also don't think that's true in any particularly literal sense.

It's artistically important that an avatar is explaining the thing. Krishna is not there in his own body. The charioteer is (in my interpretation) just some dude who discovered that he had the Krishna nature, and thus woke it up so that he could talk to Arjuna as Krishna - and almost as Brahman itself.

Being an instantiation of a god is a nice story that feels psychologically powerful, but that's really just an abstraction of the fact that the thing I'm trying to instantiate feels more important and real than the details of my day-to-day experience. Plato's Meno similarly suggests that true knowledge (about eternal things such as mathematics) is acquired by a process more like recollection than learning, and I do feel that sense of clicking, of things finally fitting the way they were always supposed to. This sort of feels like bootstrapping "back up" to something I forgot. But it's important to keep track of when something works well as a metaphor, and when it's literally the case. I think the recollection thing has to do with the way generalizations often feel like corrections of a bunch of imperfectly accounted for specifics.

Me

There are really two distinct concepts here that can be called a higher self:

  • The perfect algorithm of which I am an imperfect instantiation.
  • The slow, deeply wise cognition I'm learning how to awaken and integrate with.

The first is a good fit for something like the Bhagavad Gita's "atman." The second, for Godshatter.

Now that I know where I am, I can tolerate other people missing it. I am my aspirational self, and this is my avatar or emissary on Earth. And it does seem like sufficiently perceptive people could infer my existence from my emissary’s behavior. Most people just aren't very perceptive, which is a much less painful problem than that of being fundamentally invisible.

I've given up some attachments to things the old me invested in - but it turns out that I was interested in these things for a reason, and they're often worth reconnecting with. I've accepted the possibility that parts of me may wither away - but it turns out that I need most and maybe all of myself engaged to have a chance at being adequate to the tasks ahead.

I've even reconnected with some old friends, on a deeper level than before, and found that I don't need to be my old self to be their friend, or force things to fit out of a sense of loyalty. They liked the old me, and the new me is mostly made of the same parts. They can meet me where I am.

Those who are looking for me can find me. I'm working on making a better map.

10 thoughts on “Automemorial”

  1. devi

    Lots of interesting concepts, thanks a lot for writing them down. I particularly liked the model of depression as the final stages of grief for the passing of a subagent. I suspect that the appropriate conception of a subagent for this is a want associated with a partial model of the world (or outlook or paradigm). It fits really neatly as a generalization of the model by Alice Miller in The Drama of the Gifted Child, which considers the lack of truly unconditional love and acceptance in childhood as the root of the twin phenomena of grandiosity and depression. However, I think both you and Miller are too hesitant about discussing the useful computational properties of despairing states: they can keep you going, looking for a solution, when all the manic delusions have failed you but there still might be some way of achieving the goal (alt. keeping the ontology). The question of when to give up on something is in general complicated and I think this is part of how we determine the answer.

    I'm curious about why you distinguish atman and Godshatter. To me it seems that as the deeper parts of me can think and integrate perspectives for longer they more closely resemble the idealized algorithm that I strive to approximate. In the realm of pure morality we can view CEV and Ideal Advisor Theories as describing how to define these in terms of each other (getting the correct ideal algorithm using more running time and what the correct way of improving our approximation is using the ideal algorithm, respectively).

  2. G Gordon Worley III

    Thanks for walking us through this. It's rare to get such a complete picture of how someone's thoughts develop over time and how that interacts with other things going on in their lives. Usually we just get snapshots that are interesting but suggest a kind of timelessness that doesn't really exist. This kind of writing is great for helping to give a picture of how you went from thinking one way to another.

    It also gave me a chance to finally notice what's been bothering me about your writing for years. I could never quite figure it out, but something always seemed "off" to me about it and seeing your thinking strung together here in this way helps me see the pattern. To my reading, you seem to prefer in sense making explanations that are interesting all else equal, and to my mind this matches a pattern I and many others have been guilty of where we end up preferring what is interesting to what is parsimonious and thus less likely to be as broadly useful in explaining and predicting the world.

    To put it another way, you seem to understand the world in the way you would like to understand it and this is somewhat at odds with how I like to understand things (though I think stating it this way implies way more identification with my own preferences than actually exists, or than I intend for you). I notice a similar pattern when reading Satvik's writing, now that I can point at it, where he seems to prefer something like simplicity all else equal.

    1. Benquo Post author

      This seems pretty plausible, but I think specific examples of simpler but less "interesting" explanations would help me better learn from it.

      1. G Gordon Worley III

        To expand a bit, by "interesting" I mean that which makes an engaging story. It's more fun to hear a story that explains things in terms of people-like processes, for example, than in terms of layered abstractions.

        A good example is thinking about the psyche as made up of subagents. This is an interesting explanation because the story of "this part of myself interacting with this other part of myself" is enough like the story of "this person interacting with that person" to engage attention in a similar way and use existing known patterns of causality, but does so at the cost of presuming more than seems necessary about the operation of the brain. This doesn't make it a worse explanation that you might learn to do better than, but it does make it one that is more likely to explain reality less accurately in more cases, because it's preferring what's interesting to what's parsimonious, all else equal.

        However, more parsimonious explanations are often harder to use than more interesting explanations, often because they require unintuitive thinking, so depending on the situation they are not as useful. I am, for example, more likely to explain my own thinking in terms of subagents to other people because it can more quickly convey the right sense of what I want to convey without the high cost of being exact enough for an unintuitive, if more accurate, explanation to work. And I often think to myself in terms of subagents when that abstraction is robust enough that I need not worry about errors on the margin, because it's much faster and it lets me reason about greater complexity in other parts of the world as I integrate it all together. For the same reason I sometimes prefer what is simple (has less stuff in it) to what is interesting or parsimonious if I need to focus my attention on other parts of the holon, like by thinking of the psyche as a general preference decision procedure so I can think more about interactions between agents.

        1. Benquo Post author

          I think you're reading some connotative detail into "subagent" that I didn't mean to put in there. Where would a simpler model make a different prediction than "subagents"?

          1. G Gordon Worley III

            Let's suppose you are trying to decide whether or not to eat a donut. A simple model would use a stochastic process to guess your action based on prior probabilities, and maybe determine there's a 20% chance you will eat the donut.

            A subagent model would view you as made up of many interacting subagents that collectively determine your behavior. To take the very simple 3 subagent model implied by 3-part psyche models (Freud, Aristotle, the gunas), the id agent wants to eat the delicious donut, the superego agent doesn't want to eat the unhealthy donut, and the ego balances these desires to determine a winner in this particular case. It may similarly determine a 20% chance of eating the donut, but if you happen to be tired the id may be able to overwhelm the ego and superego and increase the probability to 30%. This hopefully makes the subagent model more accurate than the stochastic model.

            A more parsimonious model might be to ditch subagents as unnecessary complexity but also could not be as simple as the stochastic model because that would ignore obvious details that should be considered, like how hungry you are. So a better model might be that of a neural network that can respond to input in a simple way but does not easily provide intuitive explanations for why it does what it does.

            To me subagents are like epicycles: they work, but require a lot more complexity than necessary; however, they fit a more intuitive model of the world.

          2. Benquo Post author

            I agree that simply using prior behavior is a simpler predictive model, but as a causal model it just sweeps all the complexity under the rug of the label "stochastic process." I'm trying to develop causal models in order to identify promising interventions.

            I think accumulating epicycles is unambiguously good, as long as you're careful to keep records of them. This is *how* you get to simplifying paradigm shifts. Copernicus's model would have been much less persuasive if there hadn't been a widely accepted Ptolemaic model that yielded accurate predictions, that his model mostly lined up with, but seemed more elegant/natural than.

          3. Benquo Post author

            I agree that while it's often a helpful metaphor, the Freudian tripartite model isn't a good fit for "3 independent agents." It does seem like it's describing something real, and multiple intellectual traditions have ended up there. I think "superego" in particular doesn't have real goals and is *just* cognitive trigger-action patterns, typically verbal ones, that are hooked up to our social dominance-submission intuitions somehow. This seems importantly different from Platonic logos or sattva, which really does seem like a source of action that's able to perceive the world somehow. Likewise, "ego" seems like the constitution of self, not an independent agent that exists side by side with id. By contrast, thumos/rajas seems more like it perceives the world and wants some things to happen.

            I was trying to specify a particular type of subagent in the "control subagents" section, and very much not the kind in any of these schemas. I think what I'm trying to point to predicts many of the same kinds of things that Connection Theory predicts by saying that we have beliefs about what is necessary to accomplish our goals.

    2. Benquo Post author

      On the process level, I should mention that this type of response seems like the sort of thing that will very often be productive:

      something always seemed "off" to me about it and seeing your thinking strung together here in this way helps me see the pattern

      Thanks!

