Bad intent is a disposition, not a feeling

2017-04-30

It’s common to think that someone else is arguing in bad faith. In a recent blog post, Nate Soares claims that this intuition is both wrong and harmful:

I believe that the ability to expect that conversation partners are well-intentioned by default is a public good. An extremely valuable public good. When criticism turns to attacking the intentions of others, I perceive that to be burning the commons. Communities often have to deal with actors that in fact have ill intentions, and in that case it's often worth the damage to prevent an even greater exploitation by malicious actors. But damage is damage in either case, and I suspect that young communities are prone to destroying this particular commons based on false premises.

To be clear, I am not claiming that well-intentioned actions tend to have good consequences. The road to hell is paved with good intentions. Whether or not someone's actions have good consequences is an entirely separate issue. I am only claiming that, in the particular case of small high-trust communities, I believe almost everyone is almost always attempting to do good by their own lights. I believe that propagating doubt about that fact is nearly always a bad idea.

It would be surprising, if bad intent were so rare in the relevant sense, that people would be so quick to jump to the conclusion that it is present. Why would that be adaptive?

What reason do we have to believe that we’re systematically overestimating this? If we’re systematically overestimating it, why should we believe that it’s adaptive to suppress this?

There are plenty of reasons why we might make systematic errors on things that are too infrequent or too inconsequential to yield a lot of relevant-feeling training data or matter much for reproductive fitness, but *social intuitions are a central case of the sort of things I would expect humans to get right* by default. I think the burden of proof is on the side disagreeing with the intuitions behind this extremely common defensive response, to explain what bad actors are, why we are on such a hair-trigger against them, and why we should relax this.

Nate continues:

My models of human psychology allow for people to possess good intentions while executing adaptations that increase their status, influence, or popularity. My models also don’t deem people poor allies merely on account of their having instinctual motivations to achieve status, power, or prestige, any more than I deem people poor allies if they care about things like money, art, or good food. […]

One more clarification: some of my friends have insinuated (but not said outright as far as I know) that the execution of actions with bad consequences is just as bad as having ill intentions, and we should treat the two similarly. I think this is very wrong: eroding trust in the judgement or discernment of an individual is very different from eroding trust in whether or not they are pursuing the common good.

Nate's argument is almost entirely about mens rea - about subjective intent to make something bad happen. But mens rea is not really a thing. He contrasts this with actions that have bad consequences, which are common. But there’s something in the middle: following an incentive gradient that rewards distortions. For instance, if you rigorously A/B test your marketing until it generates the presentation that attracts the most customers, and don’t bother to inspect why they respond positively to the result, then you’re simply saying whatever words get you the most customers, regardless of whether they’re true. In such cases, whether or not you ever formed a conscious intent to mislead, your strategy is to tell whichever lie is most convenient; there was nothing in your optimization target that forced your words to be true ones, and most possible claims are false, so you ended up making false claims.
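To make the mechanism concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Claim` type, the candidate messages, and the conversion rates are made up for illustration and are not data from any real campaign. The only point is that the selection rule never reads the `is_true` field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str
    is_true: bool            # visible to us as observers; never read by the optimizer
    conversion_rate: float   # hypothetical sign-up rate measured in an A/B test


def pick_winner(candidates: list[Claim]) -> Claim:
    """Return the candidate with the highest measured conversion rate.

    Truth is not part of the objective, so it cannot influence the result.
    """
    return max(candidates, key=lambda c: c.conversion_rate)


# Made-up candidates: most possible claims are false, so most candidates are too.
candidates = [
    Claim("Our product helps some users somewhat", True, 0.021),
    Claim("Our product works for everyone", False, 0.034),
    Claim("Experts unanimously recommend us", False, 0.047),
    Claim("Results guaranteed in one week", False, 0.052),
]

winner = pick_winner(candidates)
print(winner.text, "| true?", winner.is_true)
```

Nobody in this loop decided to lie. Restricting the candidates to true claims before taking the maximum would be a separate, deliberate step, and it is exactly the step that is missing.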

More generally, if you try to control others’ actions, and don’t limit yourself to doing that by honestly informing them, then you’ll end up with a strategy that distorts the truth, whether or not you meant to. The default state for any given constraint is that it has not been applied to someone's behavior. To say that someone has the honest intent to inform is a positive claim about their intent. It's clear to me that we should expect this to sometimes be the case - sometimes people perceive a convergent incentive to inform one another, rather than a divergent incentive to grab control. But, if you do not defend yourself and your community against divergent strategies unless there is unambiguous evidence, then you make yourself vulnerable to those strategies, and should expect to get more of them.

I’ve been criticizing EA organizations a lot for deceptive or otherwise distortionary practices (see here and here), and one response I often get is, in effect, “How can you say that? After all, I've personally assured you that my organization never had a secret meeting in which we overtly resolved to lie to people!”

Aside from the obvious problems with assuring someone that you're telling the truth, this is generally something of a non sequitur. Your public communication strategy can be publicly observed. If it tends to create distortions, then I can reasonably infer that you’re following some sort of incentive gradient that rewards some kinds of distortions. I don’t need to know about your subjective experiences to draw this conclusion. I don’t need to know your inner narrative. I can just look, as a member of the public, and report what I see.

Acting in bad faith doesn’t make you intrinsically a bad person, because there’s no such thing. And besides, it wouldn't be so common if it required an exceptionally bad character, anyway. But it has to be OK to point out when people are not just mistaken, but following patterns of behavior that are systematically distorting the discourse - and to point this out publicly so that we can learn to do better, together.

(Cross-posted on LessWrong.)

[EDITED 1 May 2017 - changed wording of title from "behavior" to "disposition"]

18 Comments

Aceso Under Glass 2017-05-01 at 8:02 am UTC

An example from another angle: lots of straight couples start off wanting to split childcare equally after breastfeeding is done. They also want to minimize total work. You can't do both; the baby is so much more used to the mother that she will always have comparative advantage in comforting them, and that advantage will be sustained unless you deliberately let the dad do an inferior job for a while. Which means listening to your kid scream while you're both trying to sleep and have work the next day.

Benquo (post author) 2017-05-01 at 5:38 pm UTC

Can you say a bit more about how that applies to this? I'm missing the connection somehow.

Aceso Under Glass 2017-05-01 at 10:30 pm UTC

In both cases, if you make the naive best choice without considering the principles at play, you will end up abandoning the principle.

Benquo (post author) 2017-05-02 at 12:17 pm UTC

Thanks, I think I see now - the common thread is that for the most part rules/heuristics are the right level on which to make decisions, not individual actions. So the question should not be "who is best for tending the baby now?" / "what is the locally optimal behavior around bad faith accusations?", but, in both cases, what decision rules lead to good long-run outcomes.

Justis Mills 2017-05-01 at 11:13 am UTC

It could be adaptive to have a low threshold for detecting bad speaker intent if false negatives are way more dangerous than false positives. Getting swindled can be ruinous. Being suspicious of a well-intentioned person might be less so. I wouldn't be surprised if the adaptive level for bad-actor-suspicion is triple the actual prevalence of bad actors.

G Gordon Worley III 2017-05-01 at 3:27 pm UTC

Agreed. Just to comment on this specific point and not the piece as a whole, humans seem to generally prefer false positives to false negatives. This seems likely to be a result of false positives being less likely to kill you than false negatives, so animals on the whole take this approach. From what I can tell, much of human maladaptation to the modern environment is also a result of this same preference: the environment is now so safe that favoring false positives can end up being a worse strategy than favoring false negatives. This seems borne out by the fact that most advice aimed at Industrial and post-Industrial folks says to take more risks, not fewer.
This is not to say humans are not well calibrated to social situations, only that we should expect more false positives than false negatives for most behaviors that ancestrally could have resulted in (genetic) death, since for any fixed level of accuracy all you can do is shift between making type I and type II errors.
I don't think this actually breaks the overall argument, though: it's just not evidence supporting the supposition of where the burden of proof lies.

Benquo (post author) 2017-05-01 at 5:41 pm UTC

That would suggest that a 2:1 ratio of false to true positives is not evidence that the accusation rate is maladaptively high. Nate's not just saying that there are many false positives - he's saying that we're burning the commons and we should stop.
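The arithmetic behind that 2:1 figure can be sketched as follows, under two simplifying assumptions that are not stated in the thread: every actual bad actor gets flagged, and the suspicion rate is exactly three times the prevalence. The 5% prevalence is an arbitrary illustrative number.

```python
from fractions import Fraction


def false_to_true_positive_ratio(prevalence: Fraction, suspicion_rate: Fraction) -> Fraction:
    """Ratio of false positives to true positives, assuming every bad actor is flagged."""
    true_positives = prevalence                     # all bad actors are suspected
    false_positives = suspicion_rate - prevalence   # remaining suspicions land on good-faith people
    return false_positives / true_positives


prevalence = Fraction(5, 100)     # hypothetical share of bad actors
suspicion_rate = 3 * prevalence   # "triple the actual prevalence"
ratio = false_to_true_positive_ratio(prevalence, suspicion_rate)
print(ratio)
```

Under these assumptions the ratio does not depend on the prevalence chosen, since (3p - p) / p = 2 for any nonzero p.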

Wes 2017-05-02 at 5:56 am UTC

I'm confused by the phrase "mens rea is not really a thing." I've practiced criminal law, and mens rea is most definitely a thing we use to determine the culpability of bad actors. I don't think I understand how you're using it in this context.

Benquo (post author) 2017-05-02 at 10:42 am UTC

I mean it's philosophically incoherent. Likely the law has lots of tests in particular cases that give answers, but there's a sense in which it just shouldn't be possible to do something while fully comprehending how it's wrong.

Wes 2017-05-02 at 12:41 pm UTC

Oh, gotcha. I think the law, at least the way it's done in the model penal code, has it right. It determines the culpability of the actor by examining their understanding of the likelihood of their actions causing harm (or whatever action is being discouraged). There are four different mental states specified in the code:
- Purpose: a person is considered to have acted purposefully when the harm done was the conscious goal of the actor. When a person takes an action with the explicit goal of causing harm, their mental state is considered purposeful.
- Knowledge: a person is considered to have acted knowingly when they were aware that their actions would almost certainly cause harm, but such harm was not their conscious goal. When a person takes an action that they know will cause harm, their mental state is considered knowledgeable.
- Recklessness: a person is considered to have acted recklessly when they acted in conscious and unreasonable disregard of a known risk. A person is reckless where they are aware that their actions have a substantial risk of causing harm, and such risk is unreasonable under the circumstances.
- Negligence: a person is considered to have acted negligently when they took an unreasonable risk that they should have known about, but were not consciously aware of. Where a person is unaware that their actions pose an unreasonable risk of causing harm, but they should have known the risk, their mental state is negligent.
I like this formulation. It's not about comprehending the wrongness of one's act, but it is about comprehending the likelihood of harm. A purposeful state of mind is the strongest mental state, and the one people usually think of when they hear "bad intent," but it's only one of four culpable mental states.
I think examining someone's mens rea is a critically important step in conflict resolution. There's a big difference between someone who didn't realize their actions would cause harm and a person who was well aware that their actions would likely cause harm, but proceeded anyway. And in terms of what Soares is talking about, there's a big difference between talking to someone who *wants* to harm me, someone who harms me through recklessness, and someone who didn't know their words would cause harm.

Aceso Under Glass 2017-05-02 at 10:00 pm UTC

Another example, as described in http://www.econtalk.org/archives/2016/10/cathy_oneil_on_1.html: ML algorithms for sentencing prisoners are not allowed to use race as a factor, but are allowed to use a lot of proxies for race. Worse, they don't (can't) distinguish between associational and causal factors.

Alexander Gordon-Brown 2017-05-06 at 6:08 am UTC

(This was originally a set of comments on Facebook. Reposting it here as per Ben's preference. At time of writing the title said 'behaviour' not 'disposition'):
I got stuck at the title here. Bad intent is a feeling not a behaviour by definition, and bad faith is typically defined to include bad intent.
Are you saying that actual bad feeling is so rare that we needn't have language to describe it (you say mens rea is 'not really a thing') so can repurpose the language to describe your middle ground scenario?
[After Ben said he should have used 'disposition' not 'behaviour' in the title]
That sounds more defensible, but still not very defensible. There's a well-understood meaning of bad intent (or at least, I thought there was?), and the example you have given doesn't meet that definition.
In any case even if you do expand the scope I want to have language to describe actual deliberate intent to deceive; the partner who cheats and then chooses to cover it up, the insurance salesman who sells you something they know you don't need, politicians who say they will do x knowing full well x is impossible. These examples aren't rare, they are extremely common. Mens rea is very much a thing.
I want to clarify why I feel strongly about this. Expanding definitions of words with strong positive or negative connotations is a pretty basic example of rhetorical trickery; you associate the connotations with whatever it is you intend to then praise or criticise. I don't think Ben intended a trick, but in a post ostensibly about honest communication that's not great.

Benquo (post author) 2017-05-06 at 2:00 pm UTC

Thanks for putting this in public comments!
I somewhat regret accepting Nate's framing of a single axis from good to bad faith, and intend to write about how to talk about this more granularly.
I've gotten a fair amount of pushback on the "mens rea" thing, and it's helped me clarify my position. Mens Rea, in law, refers to a variety of tests applied to behavior, in order to selectively punish behavior that one ought to have been able to know to avoid, since such behavior is more deterrable. Setting up these standards necessarily involves some amount of modeling the possible mental states of people doing various kinds of harms, but the law generally doesn't directly try to measure someone's mental state, and there's no such one thing as having a guilty mind, that can be assessed directly in all such cases.
Likewise, in discourse, we can try to model what someone ought to have been able to figure out and acknowledge, and use this to define standards of conduct. But the actual standards we apply publicly have to be standards of conduct, not standards of subjective states, especially since self-deception is common.
I think it's more natural to talk about different degrees and kinds of bad faith than to make a stark distinction between bad and good faith. Most of us aren't perfect, but we can try to do better, and create public standards that help.

Wes 2017-05-06 at 3:02 pm UTC

the law generally doesn't directly try to measure someone's mental state, and there's no such one thing as having a guilty mind, that can be assessed directly.

The word "directly" is doing a lot of work there. You're correct that the law usually doesn't directly measure someone's mental state absent a confession, but the law is very concerned with indirectly measuring a person's mental state. The black-letter law is that a person is presumed to intend the natural and probable consequences of their actions. So if you point a gun at someone and pull the trigger, the law presumes that you intended to shoot them. This is just a presumption, though, and can be overcome by showing evidence that, e.g., you thought the gun was fake or unloaded.
Likewise, in discourse, we can presume that people intend to cause the natural and probable results of their statements. We can not only model what they ought to have known, but also what they actually did know and expect. We can't do this with certainty, but we can estimate the degree to which our conversation partner was (a) genuinely trying to reach a common understanding vs. (b) focused on winning an argument. It's an important distinction, and not one I feel we can gloss over by focusing only on conduct. The degree to which a discussion is going to be productive is directly related to the intentions of the people having the discussion. Intentions can be inferred from conduct, but at the root, it's the intentions, not the conduct, that are going to determine whether the discussion is going to be productive.
For instance, there's a big difference between someone committing the sunk cost fallacy because they don't realize that's what they're doing vs. someone intentionally making an emotional appeal to sunk costs despite knowledge that it's a fallacy.
This is, I think, why there's so much focus on "bad faith." Rationalists tend to like argument and debate, if it's done with a person who is genuinely trying to reach an understanding. Even if they're not arguing fairly, as long as they are trying, we tend to forgive easily. But that goes out the window if people are arguing unfairly on purpose. I think that's why this is even an issue.

Benquo (post author) 2017-05-06 at 3:13 pm UTC

This is just a presumption, though, and can be overcome by showing evidence that, e.g. you thought the gun was fake or unloaded.

My guess is that such evidence would usually have to be able to cause a reasonable person to make the same error, not just a self-report. Is that right? If so, then here's what I'd say is happening:
We have a model of what a reasonable person acting in good faith would do in response to various circumstances. This involves modeling their hypothetical state of mind, as they genuinely try to understand the consequences of their actions, etc. We then use this as the standard against which we measure the behavior of actual people (at least if they're not deeply incapacitated). If we imagine the good-faith hypothetical person might have made the same mistake in the same situation, then the behavior is excused. If not, then we can say that there's mens rea, in cases where that's relevant. This is a lot easier to show than specific claims about the mental state of the person accused.
I think this is roughly what should happen in discourse as well, though of course there will be lots of difficult to decide cases. This implies that in cases where norms have not yet been worked out, it's helpful to have a neutral arbiter, and refocus the conversation on what amount of interpretive labor we would expect a generically reasonable person in the same situation to do and why, with an eye towards setting good precedent, rather than the mental state of either party.

Wes 2017-05-06 at 3:36 pm UTC

My guess is that such evidence would usually have to be able to cause a reasonable person to make the same error, not just a self-report. Is that right?

In civil law, that is right. In criminal law, it is not. Every criminal law has a mens rea specified as part of the crime - otherwise it's unconstitutional. Murder is the easiest example. A knowing state of mind gets you first-degree murder. A reckless state of mind gets you 2nd or 3rd degree murder. A negligent state of mind (i.e. a reasonable person would have known, but you didn't) gets you manslaughter. The more severe the punishment for a crime, generally, the higher degree of culpable mental state is required, and the prosecution must prove that mental state beyond a reasonable doubt.
I don't think I'm quite understanding your proposed discourse norm. IIRC, this discussion started with people telling Nate Soares "I think the people I've been talking with aren't arguing in good faith" or something to that effect. Are you suggesting that, if I'm in such a situation, I should judge against an objective standard? Or I should appeal to a neutral arbiter? I think it would be more effective just to say "I'm starting to feel like you are not arguing in good faith. Here is why. I'm willing to give you the benefit of the doubt, but please explain why you're doing [suspicious behavior]."
I think my issue is that I just don't believe there's an objective standard that will effectively separate the well-intentioned from the willfully obtuse. I think focusing on someone's actual intentions is the only way to determine whether it's worthwhile to keep the discussion going.
I'm continuing this discussion because it seems to me that we are both pursuing a shared goal of figuring out the best discourse norms to have in our communities. You're clearly responding in the spirit of collaboration and understanding, and I hope you feel the same way. At the same time, I almost never comment on most blogs because that sort of thing doesn't happen. You just get people defending their point, and not interested in actual engagement. I infer the intentions from behavior, but if someone shows me that they are interested in a genuine exchange of ideas (even if all they do is say so), I am usually willing to engage.

Benquo (post author) 2017-05-07 at 11:18 pm UTC

I don't think I'm quite understanding your proposed discourse norm. IIRC, this discussion started with people telling Nate Soares "I think the people I've been talking with aren't arguing in good faith" or something to that effect. Are you suggesting that, if I'm in such a situation, I should judge against an objective standard? Or I should appeal to a neutral arbiter? I think it would be more effective just to say "I'm starting to feel like you are not arguing in good faith. Here is why. I'm willing to give you the benefit of the doubt, but please explain why you're doing [suspicious behavior]."

Rather than presume accusations of bad faith are false, I think the right thing to do there is to taboo "bad faith," and focus on the actual behavior pattern. What sort of interpretive labor do you think the other party should have done, that they didn't? What's the generalization of this principle? How do you know you shouldn't be the one trying harder?
When you ask questions like this, you start to notice that there are degrees of bad faith. Often it's finite and can be overcome if you're willing to do extra work, but this willingness is exploitable and you might at some point want to refuse, and just say "it's not my job to educate you." But sometimes people really do have a mental block that amounts to infinite bad faith, in which case, you're making concessions to Suicide Rock, which is nearly always a bad idea.

Benquo (post author) 2017-05-07 at 11:29 pm UTC

To give a specific example, in a prior post I criticized a proposed discourse norm on the grounds that it would effectively privilege existing institutions asking for resources, and asymmetrically discourage criticism of such requests.
I didn't use a phrase like "bad faith," in part because I worried it would put too much focus on the individual who said the thing, while my sense was that they were just saying what lots of people were implicitly thinking already. But it's important to be able to talk about behavior patterns that promote dishonesty, and if we have strong norms against accusations of bad faith, that means that the people trying to point out such problems have to watch their words carefully, while as far as I can tell, no corresponding burden is applied to people executing such patterns.
It would be much, much better if we'd just figure out how to chill out about accusations of bad faith. None of us are perfect! Most of us are dishonest at least a little bit, because we live in a culture saturated with dishonesty at all levels. When people argue (as Alexander did above) that something I'm doing seems dishonest, and I think they're pointing to something that's plausibly real, I'm thankful for the help noticing, which gives me the opportunity to try to do better next time.
