Exploitation as a Turing test

A friend recently told me that the ghosts that chase Pac-Man in the eponymous arcade game don't vary their behavior based on Pac-Man's position. At first, this surprised me. If, playing Pac-Man, I'm running away from one of the ghosts chasing me, and eat one of the special “energizer” pellets that let Pac-Man eat the ghosts instead of vice-versa, then the ghost turns and runs away.

My friend responded that the ghosts don't start running away per se when Pac-Man becomes dangerous to them. Instead, they change direction. Pac-Man's own incentives mean that most of the time, while the ghosts are dangerous to Pac-Man, Pac-Man will be running away from them, so that if a ghost is near, it's probably because it's moving towards Pac-Man.

Of course, I had never tried the opposite – eating an energizer pellet near a ghost running away, and seeing whether it changed direction to head towards me. Because it had never occurred to me that the ghosts might not be optimizing at all.

I'd have seen through this immediately if I'd tried to make my beliefs pay rent. If I'd tried to use my belief in the ghosts' intelligence to score more points, I'd have tried to hang out around them until they started chasing me, collect them all, and lead them to an energizer pellet, so that I could eat it and then turn around and eat them. If I'd tried to do this, I'd have noticed very quickly whether the ghosts' movement was affected at all by Pac-Man's position on the map.

(As it happens, the ghosts really do chase Pac-Man – I was right after all, and my friend had been thinking of adversaries in the game Q-Bert – but the point is that I wouldn’t have really known either way.)

This is how to test whether something's intelligent. Try to make use of the hypothesis that it is intelligent, by extracting some advantage from this fact.


In early 2014, as I was learning to be motivated by long-run considerations and make important tradeoffs, I started to worry that I was giving up something important about my old self - that some things that had been precious to me would never quite be worth the price of holding onto, so the parts of my soul that cared for them would gradually wither away, unused, until it wasn’t even tempting to try to reconnect to going to the opera, translating classical Greek, or any of the other things in my life that I chose for their beauty but not their utility.

It turned out that I was right, though not quite in the way I expected.

This is my story. It is an honest report of that story, but that is all it is.

This is the story of how, over the past year and a half, I died and was reborn. In it, you'll find the ways I had to learn to model the world to effect this transformation. I hope that some of them are useful to you.

The engineer and the diplomat

I used to think that I had poor social skills. So I worked hard to improve, and learned a lot of specific skills for interacting with people more effectively. My life is a lot better for it. I have deeper friendships, and conversations go interesting places fast. I'm frequently told that I'm an excellent listener and people seek me out for emotional support, and even insight into social conflict. But I'm told that I have poor social skills more often than before.

Not everyone means the same thing by social skills. It's important to distinguish between the social skills that are valued for their own sake – the social skills people identify themselves with – and the social skills that are a means subordinated to some other specific ends.

Be secretly wrong

"I feel like I'm not the sort of person who's allowed to have opinions about the important issues like AI risk."
"What's the bad thing that might happen if you expressed your opinion?"
"It would be wrong in some way I hadn't foreseen, and people would think less of me."
"Do you think less of other people who have wrong opinions?"
"Not if they change their minds when confronted with the evidence."
"Would you do that?"
"Do you think other people think less of those who do that?"
"Well, if it's alright for other people to make mistakes, what makes YOU so special?"

A lot of my otherwise very smart and thoughtful friends seem to have a mental block around thinking on certain topics, because they're the sort of topics Important People have Important Opinions around. There seem to be two very different reasons for this sort of block:

  1. Being wrong feels bad.
  2. They might lose the respect of others.

Six principles of a truth-friendly discourse

Plato’s Gorgias explores the question of whether rhetoric is a “true art,” that when practiced properly leads to true opinions, or whether it is a mere “knack” for persuading people to assent to any arbitrary proposition. Socrates advances the claim that there exists or ought to exist some true art of persuasion that is specifically about teaching people true things, and doesn’t work on arbitrary claims.

(Interestingly, the phrase I found appealing to use in the title of this post, "truth-friendly," is pretty similar to the literal meaning of the Greek word philosophy, "friendliness towards wisdom.")

Six principles of the knack of rhetoric

Robert Cialdini’s Influence is about the science of the “knack” of rhetoric - empirically validated methods of persuading people to agree to arbitrary things, independent of whether or not they are true beliefs or genuinely advantageous actions. He outlines six principles of persuasion:

  1. Reciprocity - People tend to want to return favors. An example of this with respect to actions is the practice of Hare Krishna devotees giving people “gifts” like a book or a flower, and then asking for a donation. A special case of this is “reciprocal concessions” - if I make a request and you turn it down, and then I make a smaller request, you’re likely to feel some desire to meet me halfway and agree to the small request.
  2. Commitment and consistency - People use past behavior and commitments as a guide to present behavior. If you persuade someone that they’re already seen as having some attribute, they’re more likely to want to “live up to” it. If you get people to argue for a point, even without any commitment to believing their argument, they’re more likely to say they believe it in the future. If you get someone to agree in principle to do a thing, they’re more likely to agree to specific requests to do the thing.
  3. Social proof - People use others’ behavior as a proxy for what’s reasonable. Advertisements exploit this by showing people using a product.
  4. Authority - People tend to accept the judgment of people who seem respectable and high-status whether or not they are experts in the field in question.
  5. Liking - People are more likely to buy things from people they like.
  6. Scarcity - People are more eager to buy things that appear scarce. “Limited time offers” exploit this.

Six principles of the art of rhetoric

Making it easier for people to avoid these traps seems like a desirable attribute of a discourse, if we want to move more efficiently towards truth. Therefore, a rational rhetoric will have the following six principles, each one countering one of Cialdini's principles of the knack of influence:

Puppy love and cattachment theory.

Secure attachment and the limbic system

A couple of friends recently asked me for my take on this article by Nora Samaran on secure attachment and autonomy. The article focuses on the sense of security that comes from someone consistently responding positively to requests for comfort. The key point is that it's not just a quantitative thing, where you accumulate enough units of comfort and feel good. It's about really believing on a gut level that someone is willing to be there for you, and wants to do so:

Group cognition

I don’t even see groups

I want to foreground part of the subtext in my recent post on community and my problems with it. One underlying problem appears to me to be that I simply don’t perceive groups. Slate Star Codex writes about how important group membership is for making friends:

If I had written this essay five years ago, it would be titled “Why Tribalism Is Stupid And Needs To Be Destroyed”. Since then, I’ve changed my mind. I’ve found that I enjoy being in tribes as much as anyone else.

Part of this was resolving a major social fallacy I’d had throughout high school and college, which was that the correct way to make friends was to pick the five most interesting people I knew and try to befriend them. This almost never worked and I thought it meant I had terrible social skills. Then I looked at what everyone else was doing, and I found that instead of isolated surgical strikes of friendship, they were forming groups. The band people. The mock trial people. The football team people. The Three Popular Girls Who Went Everywhere Together. Once I tried “falling in with” a group, friendship became much easier and self-sustaining precisely because of all of the tribal development that happens when a group of similar people all know each other and have a shared interest. Since then I’ve had good luck finding tribes I like and that accept me – the rationalists being the most obvious example, but even interacting with my coworkers on the same hospital unit at work is better than trying to find and cultivate random people.

Scott's original social strategy is exactly how I go about making friends. Where it didn't work, I just kept upgrading my social skills for one-on-one interactions. This I think is part of why some people think I have unusually poor social skills, and others say I have unusually good ones. I developed them unevenly relative to the norm.

Backwards and in feels

Somaticization is the tendency to experience mental distress as physical distress. For instance, some people with depression don’t report low mood, but instead things like nausea or pain.

A 3-stage model of emotions mediated by somatic responses would explain this fairly well. In most people, the cognitive processes that generate emotions do not directly generate qualia, but only affect our physical and mental behavior. These in turn are read by other mental processes that summarize them into feelings. Somaticizers, by this model, are people who are acutely consciously aware of the intermediate somatic stage of their feelings, but for whom the summarizing processes are either suppressed or weak to begin with.

In talking with friends about their experiences, I've noticed that this process can run in reverse as well - physical ailments with non-mental origins can get picked up by processes that are looking for somatic symptoms of emotions:

Solve your problems by fantasizing

The problem with most goal-driven plans is that most goals are fake, and so are most plans. One way to fix this is to fantasize.

Emotional qualia are mediated by somatic responses

Unembodied emotions

I used to be confused when people talked about feeling their emotions in their bodies. My emotions didn’t feel like physical sensations - they just felt like emotions. Doesn’t sadness or happiness just feel like sadness or happiness? I had trouble with a lot of advice for how to better manage or get in touch with emotions for this reason.

I sometimes felt my emotions saliently, but I experienced nothing like the variety of qualia other people reported. I basically had a four-quadrant model of emotion: