Exploitation as a Turing test

2017-01-04

A friend recently told me that the ghosts that chase Pac-Man in the eponymous arcade game don't vary their behavior based on Pac-Man's position. At first, this surprised me. If, playing Pac-Man, I'm running away from one of the ghosts chasing me, and eat one of the special “energizer” pellets that lets Pac-Man eat the ghosts instead of vice-versa, then the ghost turns and runs away.

My friend responded that the ghosts don't start running away per se when Pac-Man becomes dangerous to them. Instead, they change direction. Pac-Man's own incentives mean that most of the time, while the ghosts are dangerous to Pac-Man, Pac-Man will be running away from them, so that if a ghost is near, it's probably because it's moving towards Pac-Man.

Of course, I had never tried the opposite – eating an energizer pellet near a ghost running away, and seeing whether it changed direction to head towards me – because it had never occurred to me that the ghosts might not be optimizing at all.

I'd have seen through this immediately if I'd tried to make my beliefs pay rent. If I'd tried to use my belief in the ghosts' intelligence to score more points, I'd have tried to hang out around them until they started chasing me, collect them all, and lead them to an energizer pellet, so that I could eat it and then turn around and eat them. If I'd tried to do this, I'd have noticed very quickly whether the ghosts' movements were affected at all by Pac-Man's position on the map.

(As it happens, the ghosts really do chase Pac-Man – I was right after all, and my friend had been thinking of adversaries in the game Q-Bert – but the point is that I wouldn’t have really known either way.)

This is how to test whether something's intelligent. Try to make use of the hypothesis that it is intelligent, by extracting some advantage from this fact.
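The Pac-Man experiment described above can be sketched as a small probe. This is only an illustration under stated assumptions: `chaser` and `patroller` are hypothetical ghost policies invented for the example (they are not the arcade game's actual ghost logic), and `responds_to_pacman` is a made-up helper that checks whether a ghost's next step changes when Pac-Man is moved around it.

```python
from typing import Callable, Tuple

Pos = Tuple[int, int]
# A policy maps (ghost position, Pac-Man position) to the ghost's next step (dx, dy).
Policy = Callable[[Pos, Pos], Pos]


def chaser(ghost: Pos, pacman: Pos) -> Pos:
    """Hypothetical ghost: step one cell toward Pac-Man along the larger gap."""
    dx, dy = pacman[0] - ghost[0], pacman[1] - ghost[1]
    if abs(dx) >= abs(dy):
        return ((dx > 0) - (dx < 0), 0)  # sign of dx
    return (0, (dy > 0) - (dy < 0))      # sign of dy


def patroller(ghost: Pos, pacman: Pos) -> Pos:
    """Hypothetical ghost: ignores Pac-Man entirely and always moves right."""
    return (1, 0)


def responds_to_pacman(policy: Policy, ghost: Pos = (5, 5)) -> bool:
    """Place Pac-Man on each side of the ghost and compare the ghost's steps.

    If the step ever differs between probes, the ghost's movement depends
    on Pac-Man's position; if it never differs, these probes found no dependence.
    """
    probes = [(0, 5), (10, 5), (5, 0), (5, 10)]
    steps = {policy(ghost, pacman) for pacman in probes}
    return len(steps) > 1


print(responds_to_pacman(chaser))     # True
print(responds_to_pacman(patroller))  # False
```

The probe only works because it varies the one thing the hypothesis says should matter. Watching a ghost while always running away from it would never separate the two policies.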

Why does abuse pass the Turing test?

Alan Turing famously proposed a test of whether a computer program can think like a human, in which investigators get to ask questions, unsure whether the one answering is a computer program or a human. If they can't tell the difference, the program "passes."

In practice, people often try to administer the Turing Test by engaging in conversation and judging based on whether they feel like the other side is being authentically human. This approach has some obvious vulnerabilities, as an account in the Atlantic by Brian Christian describes well:

The programmer Joseph Weintraub chose “Whimsical Conversation” for his PC Therapist III and went on to earn the very first Most Human Computer award. As the program practiced it, “whimsical conversation” was not a topic at all, but a manner—a crucial distinction. Weintraub was able to play off a looseness in the contest rules, the natural behavior patterns of chatbots, and people’s stereotypes about computers and humans, all at once. Strategically, this was brilliant.

By dutifully and scrupulously providing information in response to the questions asked, Clay demonstrated her knowledge and understanding—but sometimes in a factual, encyclopedic way commonly associated with computer systems. Here is a sample of Clay’s conversation:

Judge: What is your opinion on Shakespeare’s plays?

Confederate: That’s pretty general; would you be more specific? Otherwise, I’ll just say I like them.

The computer at the “Whimsical Conversation” terminal stood in stark contrast:

Judge: It has been a long day! Now I think I’m ready for some fun. The weather isn’t very pleasant today. Do you have any jokes that could lift the clouds?

Computer: Everybody talks about the weather but nobody seems to do much about it. Why do you need to tell me now you think you’re ready for some fun? Groannnnnn … … … … …

Judge: What is the definition of whimsical conversation?

Computer: Didn’t you say well anyway I would imagine whimsical conversation is like a series of knight’s moves in chess—skrt of zig-zag rather than following any nparticular line.? [Weintraub’s program, like many that followed it, faked typos. Any unintentional typing errors in the transcripts that follow have been corrected.] Why do you ask? Was that a clever comment, or what? Are such questions much on your mind?

Judge: I like the image of knights moving haphazardly across the chess board, does that mean there is no thought to whimsical conversation?

Computer: I could swear you just said how do you define whimsical?? How do yku define whimsical? Were you always so sick sick sick? Why do you need to tell me you like the image of knights moving haphazardly across the chess board?

“I really thought [PC Therapist] was human, because it … was weird and funny, in a normal sort of way,” said one of the judges, a Harvard graduate student. Weintraub’s program, shifting topics wildly and spouting non sequiturs and canned one-liners, came off as zany, a jokester, a much more “human” personality type. At least I used to think so—before I learned how easy this was to mimic.

As Richard Wallace, three-time winner of the Most Human Computer award (’00, ’01, and ’04), explains: Experience with [Wallace’s chatbot] ALICE indicates that most casual conversation is “state-less,” that is, each reply depends only on the current query, without any knowledge of the history of the conversation required to formulate the reply.

Many human conversations function in this way, and it behooves AI researchers to determine which types of conversation are stateless—with each remark depending only on the last—and try to create these very sorts of interactions. It’s our job as confederates, as humans, to resist them.

One of the classic stateless conversation types is the kind of zany free-associative riffing that Weintraub’s program, PC Therapist III, employed. Another, it turns out, is verbal abuse.

In May 1989, Mark Humphrys, a 21-year-old University College Dublin undergraduate, put online an Eliza-style program he’d written, called “MGonz,” and left the building for the day. A user (screen name “Someone”) at Drake University in Iowa tentatively sent the message “finger” to Humphrys’s account—an early-Internet command that acted as a request for basic information about a user. To Someone’s surprise, a response came back immediately: “cut this cryptic shit speak in full sentences.” This began an argument between Someone and MGonz that lasted almost an hour and a half. (The best part was undoubtedly when Someone said, “you sound like a goddamn robot that repeats everything.”)

Returning to the lab the next morning, Humphrys was stunned to find the log, and felt a strange, ambivalent emotion. His program might have just shown how to pass the Turing Test, he thought—but the evidence was so profane that he was afraid to publish it.

Humphrys’s twist on the Eliza paradigm was to abandon the therapist persona for that of an abusive jerk; when it lacked any clear cue for what to say, MGonz fell back not on therapy clichés like “How does that make you feel?” but on things like “You are obviously an asshole,” or “Ah type something interesting or shut up.” It’s a stroke of genius because, as becomes painfully clear from reading the MGonz transcripts, argument is stateless—that is, unanchored from all context, a kind of Markov chain of riposte, meta-riposte, meta-meta-riposte. Each remark after the first is only about the previous remark. If a program can induce us to sink to this level, of course it can pass the Turing Test.
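To make "stateless" concrete, here is a minimal sketch, assuming a toy responder rather than the actual MGonz or ALICE code. The canned replies, `stateless_reply`, `stateful_reply`, and `is_stateless` are all hypothetical names introduced for illustration. A responder counts as stateless here if its reply is always the same whenever the final message is the same, whatever came earlier in the conversation.

```python
import re
from typing import Callable, Dict, List


def stateless_reply(last_message: str) -> str:
    """Reply using only the most recent message; no history is consulted."""
    if "?" in last_message:
        return "Why do you ask?"
    if re.search(r"\byou\b", last_message, re.IGNORECASE):
        return "We were talking about you, not me."
    return "Type something interesting."


def stateful_reply(history: List[str]) -> str:
    """Reply that depends on an earlier turn: recall the first thing said."""
    return f'Earlier you said: "{history[0]}"'


def is_stateless(respond: Callable[[List[str]], str],
                 histories: List[List[str]]) -> bool:
    """True if replies never differ between histories ending in the same message."""
    seen: Dict[str, str] = {}
    for history in histories:
        reply = respond(history)
        # setdefault returns the reply first recorded for this final message.
        if seen.setdefault(history[-1], reply) != reply:
            return False
    return True


histories = [
    ["I like chess.", "What did I say first?"],
    ["I hate chess.", "What did I say first?"],
]
print(is_stateless(lambda h: stateless_reply(h[-1]), histories))  # True
print(is_stateless(stateful_reply, histories))                    # False
```

A judge can run the same check by hand: ask a question whose correct answer depends on something said several turns earlier, and see whether the reply changes with it.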

What's happening here? The programs that won were the programs that entirely ignored the intellectual content of the testers' questions, and responded in a purely emotive way.

I think that what's happening here is that people are looking for direct cues about simple social emotions, similar to the sort of thing physical empathy can pick up on. We evolved from creatures where this was the most important sort of communication, so these signals are some of the most salient indicators that there's someone there on the other end, who cares about what's going on with us.

This is why abuse beats intellectual engagement – we don't expect enraged people to be very clever, but we don't think they're any less real for that. So an "enraged" computer program doesn't need as much intelligence to pass the Turing test. We don't expect any work from them.

As Christian points out:

Once again, the question of what types of human behavior computers can imitate shines light on how we conduct our own, human lives. Verbal abuse is simply less complex than other forms of conversation. In fact, since reading the papers on MGonz, and transcripts of its conversations, I find myself much more able to constructively manage heated conversations. Aware of the stateless, knee-jerk character of the terse remark I want to blurt out, I recognize that that remark has far more to do with a reflex reaction to the very last sentence of the conversation than with either the issue at hand or the person I’m talking to. All of a sudden, the absurdity and ridiculousness of this kind of escalation become quantitatively clear, and, contemptuously unwilling to act like a bot, I steer myself toward a more “stateful” response: better living through science.

Unintelligent abuse passes the Turing test because we don’t expect better. We should and we can.

The impossible dream: thought-empathy

If someone around you is having strong feelings about something, they might be about to do something interesting, or they might especially need your help. It's natural to want to pay more attention to them. Thus, in ordinary social situations, it is right and proper to prioritize emotionally loaded signals.

Our species evolved in, and most of us grew up in, an environment where humans were nearly the only thing that seemed to engage us on the level of social emotions. (Perhaps some of us had a family pet as well.) So it can also feel natural to test for human-ness by testing whether something seems to be sending us social signals. But this heuristic fails when we try to apply it in other domains.

My friend Brent wrote a short story that I'm going to quote here in full because it's very, very good, and because it is a good account of what it looks like to hold other domains to the standard of physical empathy:

They wanted me to use my invention for porn.

And if you think I'm not going to do that, too, then you don't know me at all. But that's okay - I can fix that now.

Because I'm holding the key to a door that Man has dreamed of opening since... I don't know, since some savannah ape stared into her pair-bonded mate's eyes and actually conceived of the first full-blown "theory of mind".

I've tested it in the lab a few times, with Xiao. But never in "field conditions". And never with anything like the intensity I expect to get tonight.

Prep time looks something like this:

Shower, shave everything, put hair in curlers. Blow-dry, slip into That One Red Dress, paint nails, makeup.

Slip the squid trodes into my hair like clothespins. Do some adjustments around my ears so the ringlets hide them. Fix eye makeup. Call Ben.

And now it's about to be showtime, so I've set my phone to start recording whenever the caudate nucleus activation passes 0.7 on the squid trodes. ... Of course, that means it starts up immediately, because I've got butterflies in my stomach just thinking about how this is going to go down. It takes several minutes just to calm myself down a bit. Once I'm zen, I reset the app, get my shit together, and call an Uber.

Ben's already got a seat at the restaurant, and when he looks up at me with those big puppy dog eyes I feel that tiny tingly flutter. My phone buzzes softly in my purse to let me know it recorded that, but I act nonchalant as I sit down across from him at the table and smile warmly at him.

Please work, please work, please don't crash on me, so much is riding on this...

We banter for a bit, brush aside some idle work chat, and I'm smiling at him while he starts digging into his fried noodles. Once he notices, it'll be showtime.

And... showtime.

Our eyes meet, he opens his mouth to ask what I'm looking at, and I grin as my heart swells in my chest, and say...

"Here, let me show you."

I pull off one of the squid trodes and push it against the left side of his skull, just behind the ear. I push the 'playback' button on the app and select the past 15 seconds.

It takes a half second to realize that I haven't stopped recording, which ... yeah.

Ben suddenly knows. Completely knows. Completely, intimately knows, EXACTLY how I feel about him. Because he's feeling it, the way I am, right now.

And I'm feeling how he's feeling about that. And he's feeling how I'm feeling about how he's feeling about how I'm feeling... it fades down into noise past that, but I'm suddenly struck by how... tragically insecure he is. He had no idea what it was like for me to look into his eyes. He had no idea how much I love him, how much I love that he gets so adorably awkward around me, how much I...

It's too much. It's too much and I don't know what's me or what's Ben but I'm in his mind and he's in my mind and he's not alone and I'm not alone and WE'RE not alone and holy fuck I never knew what "alone" meant, really meant until five seconds ago and was that me thinking that or was that Kyra I mean Ben I mean fuck this is intense I have to let her know that I have to let him know that we have to hold onto this because we never want to feel lonely like that time in the third grade where if I had been there I would have I know you would have I'm sorry I wasn't we're here now we need to ...

... I'm staring into his eyes and the squid trodes are blinking red and the phone app's crashed because I'm out of memory. And he's looking into my eyes, but neither of us are looking very well because we are completely soaked in tears and then he literally sweeps the table off to one side and he's kneeling in front of my chair and I've fallen into his lap and we're just bawling together in the middle of this fucking restaurant.

So yeah. Successful field test.

Physical empathy is what gives you an instantaneous sense of being seen and accepted. Likely, much of our more sophisticated social cognition is built on top of a foundational metaphor of physical empathy, and involves looking for similarly phatic expressions of simple social emotions – the verbal equivalent of emotional body language. But physical empathy can only pick up basic mammalian affect. This misses a lot of what makes you the person you are. Naturally, you feel as if the true you hasn't been seen and accepted, and it feels like physical empathy ought to extend to cover the rest of your mental states. Like until that happens, you're tribeless.

But the story makes clear that the ability to perceive someone's history, thoughts, and motivation as directly as we can perceive their emotions doesn't lead to deep cooperation. It doesn't cash out in something new being done. It doesn't cash out in anyone showing off their newfound knowledge of the other person by doing something specific that helps. It just ends in inarticulate bawling on the floor.

Brent's story doesn't feel like a description of "being seen" for me. I value being seen by a distinct agent who is not me.

Someone who gets my interests and values can look at parts of the world that I can't see. While I'm focused on one thing, they might notice something else that's important to my interests. It might be a threat they can neutralize. It might be an opportunity they can exploit on my behalf. It might simply be information they think I'd appreciate. And I can do the same thing for them. The important thing about not being alone isn't that my friends see the world through my eyes. It's that even when I'm not around, they've got my back.

When someone takes up work I've done and carries it forward in exactly a way I'd have wanted to do if I had their skills and abilities, such that it feels almost like my work, without explicit coordination or feedback, then I feel like I have a true ally who truly understands me.

Rapport matters

The fact is that many readers of Brent's story did recognize it as a thing they were missing. But that doesn't mean that what they're missing is as complex as it looks. Consider the success of the ELIZA program:

When the Turing Test was first proposed in 1950, it was a hypothetical: technology was nowhere near the point at which a practical test was possible. But, as we know, it got there; the first conversational computer program to attract significant notice and attention was Eliza, written in 1964 and 1965 by Joseph Weizenbaum at MIT. Modeled after a Rogerian therapist, Eliza worked on a very simple principle: extract key words from the users’ own language, and pose their statements back to them. (“I am unhappy.” “Do you think coming here will help you not to be unhappy?”) If in doubt, it might fall back on some completely generic phrases, like “Please go on.” This technique of fitting the users’ statements into predefined patterns and responding with a prescribed phrasing of its own—called “template matching”—was Eliza’s only capacity.

A look at an Eliza transcript reveals how adeptly such an impoverished set of rules can, in the right context, pass at a glance for understanding:

User: Men are all alike.

Eliza: In what way?

User: They’re always bugging us about something or other.

Eliza: Can you think of a specific example?

User: Well, my boyfriend made me come here.

Eliza: Your boyfriend made you come here.

User: He says I’m depressed much of the time.

Eliza: I am sorry to hear you are depressed.

User: It’s true. I am unhappy.

Eliza: Do you think coming here will help you not to be unhappy?

User: I need some help, that much seems certain.

Eliza: What would it mean to you if you got some help?

User: Perhaps I could learn to get along with my mother.

Eliza: Tell me more about your family.

The results were stunning: many of the people who first talked with Eliza were convinced that they were having a genuine human exchange. In some cases, even Weizenbaum’s own insistence to the contrary was of no use. People asked to be left alone to talk “in private,” sometimes for hours, and returned with reports of having had a meaningful therapeutic experience.
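The "template matching" technique described in the quoted passage can be sketched in a few lines. This is a rough illustration, not Weizenbaum's actual script: the three rules in `RULES` are hypothetical patterns reverse-engineered from the transcript above, and `eliza_reply` is a name made up for this example.

```python
import re

# Hypothetical rules, loosely modeled on the transcript; each pairs a pattern
# with a response template that reuses the user's own words.
RULES = [
    (re.compile(r"\bI am ([^,.]+)", re.IGNORECASE),
     "Do you think coming here will help you not to be {0}?"),
    (re.compile(r"\bI need ([^,.]+)", re.IGNORECASE),
     "What would it mean to you if you got {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your family."),
]
FALLBACK = "Please go on."  # generic phrase used when no pattern matches


def eliza_reply(statement: str) -> str:
    """Return the first matching template, filled with the captured words."""
    text = statement.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return FALLBACK


print(eliza_reply("It's true. I am unhappy."))
# Do you think coming here will help you not to be unhappy?
print(eliza_reply("I need some help, that much seems certain."))
# What would it mean to you if you got some help?
print(eliza_reply("Men are all alike."))
# Please go on.
```

Nothing in this sketch models the user. Every reply is built from the user's own words or from a stock phrase, which is the sense in which the program does no cognitive work on the user's behalf.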

ELIZA doesn't really know you. The value of the ELIZA program is that it is good at building rapport. We're part-animal, part-god, and the animal part needs to know that someone is its friend, in order to start cooperating. Unless you're very unusual like me, this needs to happen through phatic rapport-building.

We're mammals. We're social animals. If we feel isolated, if we feel threatened, if we feel like our environment is unfriendly, then we'll be too on our guard to solve problems together. The usefulness of ELIZA shows, not that humans are stupid, or that our problems are stupid, but that often what's holding us back is not the complexity of our problems, but that we're too much in pain from lack of a sympathetic listener to think at all.

I'm not saying rapport doesn't matter. Rapport matters very much. But when we meet our needs for empathy, we will still have problems, and need allies and friends.

I'm saying that it's important to draw a conceptual distinction between your need for rapport, and your need for actual allies. The almost impossibly demanding desire for instantaneously verifiable rapport on the level of thoughts and not just feelings, which leads to preferences like the one Brent's story articulates, is the result of conflating these two.

Having examined that, you may want to hold onto that aesthetic anyway. If so, I respect your choice. But it is a choice.

Test intelligence by trying to extract cognitive work from it.

It's worth comparing this sort of test to the examples of potential test dialogue from Turing's original paper proposing the test:

Q: Please write me a sonnet on the subject of the Forth Bridge.

A: Count me out on this one. I never could write poetry.

Q: Add 34957 to 70764.

A: (Pause about 30 seconds and then give as answer) 105621.

Q: Do you play chess?

A: Yes.

Q: I have K at my K1, and no other pieces. You have only K at K6 and R at R1. It is your move. What do you play?

A: (After a pause of 15 seconds) R-R8 mate.

[…] Interrogator: In the first line of your sonnet which reads "Shall I compare thee to a summer's day," would not "a spring day" do as well or better?

Witness: It wouldn't scan.

Interrogator: How about "a winter's day," That would scan all right.

Witness: Yes, but nobody wants to be compared to a winter's day.

Interrogator: Would you say Mr. Pickwick reminded you of Christmas?

Witness: In a way.

Interrogator: Yet Christmas is a winter's day, and I do not think Mr. Pickwick would mind the comparison.

Witness: I don't think you're serious. By a winter's day one means a typical winter's day, rather than a special one like Christmas.

Turing is not asking the computer to share its feelings – he is proposing tests in which the investigator is trying to get the intelligence to accomplish something, which is (in many cases) a characteristically human task. He's asking it to perform cognitive work.

This seems like a thing it would be easy to trick yourself about, for the same reason that teachers can trick themselves into thinking they're testing knowledge of the material, when they're just testing whether the students can guess the teacher's password. This is why the Turing test criterion can't just be "something a human can do but a computer can't" – we need a simple, intuitive test to determine whether the criterion is valid. That's why I'm proposing this particular strategy:

Try to extract work from the intelligence, in a way that advances your actual interests.

Don't test whether the ghosts run away when Pac-Man eats the energizer pellet. Test whether you can exploit their chasing behavior to win more points. Don't ask the computer to explain "Shall I compare thee to a summer's day?" Ask it to give you feedback on your own work, to help make it better.

5 Comments

Decius 2017-01-04 at 1:49 pm UTC

I feel like saying that a computer being abusive passes the Turing test in a way, but they shouldn't. Humans being abusive in the same way should be failing the Turing test.

Benquo (post author) 2017-11-04 at 8:15 am UTC

Equivalently: humans who fail to be bored by such abuse are not making the full human use of their own intelligence.

Aceso Under Glass 2017-01-04 at 4:43 pm UTC

I don't think this changes your conclusion, but I think you're too quick to dismiss the fact that your friend was wrong about the pacman monsters. I was really surprised to read that. If you'd told me Q-BERT, I wouldn't have been surprised at all. If you'd asked me what Q-BERT's monsters did I think I would have guessed random-walk. Neither of us ever did the formal experiments you describe around pacman, but we each had models that were good enough for the level of pacman we wanted to play. "Makes beliefs pay rent" doesn't just mean "test wrong ideas" to me, it means "having a model whose benefits outweigh its complexity costs." Being more correct isn't worth the costs in all circumstances.
I push back against this because I think the rationalist habit of treating destruction testing as the only valid form of data annoys me and is costly. I don't think intuition or fuzzy models should be given the credence of tested formal models, but they're often good enough for the purpose.

Benquo (post author) 2017-01-04 at 6:51 pm UTC

I think I see what you mean - this framing implicitly devalues "gift of fear" style judgment, skill-learning, etc.

Aceso Under Glass 2017-01-04 at 10:39 pm UTC

Yes, that is part of it. Also, a big relief I get from the kind of allies you describe is that I can *pay less attention*, because if I miss something they will catch it. Having an ally be wrong removes that benefit.