I have faith that if only people get a chance to hear a lot of different kinds of songs, they'll decide what are the good ones. -Pete Seeger
A lot of the discourse around honesty has focused on the value of maintaining a reputation for honesty. This is an important reason to keep one's word, but it's not the only reason to have an honest intent to inform. Another reason is epistemic and moral humility.
These considerations don’t just apply to cases of literal lying, but to any attempt to motivate others using generic persuasion techniques that don’t rely on the factual merits of your case.
- 1 Give me absolute control…
- 2 My mirrored room
- 3 The blizzard of the world has crossed the threshold
- 4 Take the only tree that’s left
- 5 Love’s the only engine of survival.
Give me absolute control…
There's a lot of superficial appeal to strategies involving conventional, generic persuasion tactics that aren’t truth-seeking. They can put you in control of more general resources such as attention or money, and resources are often useful for accomplishing goals. A lot of consequentialists correspondingly think that when furthering cooperative or altruistic goals, it's morally right to use this kind of strategy, because it's being used in the service of good ends.
But the appeal of these strategies often relies on a claim of epistemic privilege. In many cases, if you’re justified in overriding others’ autonomy, it must be because you know better. So if you don’t have a good reason to think you know better, you may not want to control others’ actions after all.
My post on matching donation fundraisers works through an example of how generic persuasion techniques tend to destroy information. I’ll work through two more examples here – one on giving advice, the other on getting it. Then I'll talk about the problem of bottlenecking. If you try to take personal responsibility for others' behavior, you become the bottleneck, justifying increasing demands for resources to solve the bottleneck problem.
A few years ago, I started working chin-ups and pull-ups into my daily routine. As luck would have it, a few days later my neck started to feel acutely sore. I looked up the symptoms on the internet, and the only known condition that could cause this was neck cancer.
I happened to have a doctor’s appointment scheduled already. Let’s imagine that I was pretty sure I had neck cancer. What strategy should I have used?
I could have simply told my doctor about my symptoms and asked for his advice. However, if he’d dismissed my neck cancer as some other thing, then I wouldn’t have gotten treatment, which might have led to much worse outcomes. The alternative strategy would have been to do my utmost to persuade my doctor that I had neck cancer, making sure to describe the symptoms in ways that fit standard diagnostic checklists, to ensure that it didn’t go untreated.
The problem with applying that alternative strategy, of course, is that I didn’t have neck cancer at all, so treating that condition would have been pointlessly harmful. My neck muscles were just sore from unaccustomed exercise.
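The information-destroying effect of the persuasive strategy can be sketched as a toy Bayesian calculation (the numbers here are hypothetical, chosen only for illustration). Honest symptom reports are evidence, because they’re more likely when the disease is present. A report deliberately tailored to match the diagnostic checklist gets produced whether or not the disease is present, so its likelihood ratio collapses to 1 and the doctor learns nothing:

```python
prior = 0.01  # doctor's prior that the patient has neck cancer (hypothetical)

def posterior(prior, p_report_given_cancer, p_report_given_sore):
    """Bayes' rule: P(cancer | report)."""
    evidence = prior * p_report_given_cancer + (1 - prior) * p_report_given_sore
    return prior * p_report_given_cancer / evidence

# Honest reporting: the description is much more likely if the disease
# is present (0.9) than if the neck is merely sore (0.2).
honest = posterior(prior, 0.9, 0.2)

# Maximally persuasive reporting: the checklist-matching description is
# produced either way (0.9 vs 0.9), so it carries no evidence at all.
persuasive = posterior(prior, 0.9, 0.9)

print(round(honest, 3), round(persuasive, 3))  # persuasive == prior: no update
```

The persuasive report leaves the doctor exactly at their prior; whatever diagnostic skill they have is wasted, because the signal they’d use has been overwritten.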
Let's say a friend is having relationship conflict, and asks me for advice. After I hear their description of the problem, the solution seems obvious to me, but I am worried that they will fail to implement this strategy even if I tell them to.
One thing I might do is tell my friend what I think the major considerations are in each direction, and why I think they weigh in favor of my proposed policy on balance. This is the strategy of honestly informing.
I might worry that if I follow the "honestly inform" strategy, my friend will not act on my advice, perhaps out of weakness of will, perhaps because I was insufficiently persuasive, perhaps because they'll fixate on the downsides. And if they do the wrong thing, they'll get a worse outcome. I want the best outcome for my friend. So I decide to adopt the maximally persuasive strategy, using rhetorical flourishes and compliance techniques. I make all the forceful arguments I can for the action, and avoid discussing downsides that might be demotivating. I get my friend to commit to acting, and say that I'll follow up with them in a few days to see if they've done it yet. I give the action momentum.
But what if I'm wrong?
On the first plan, my friend's much more likely to be persuaded by me if I'm right, than if I'm wrong. My friend might know that one of the key things I'm considering is irrelevant, because of information I don't have. They might know that one of the downsides is intolerable for them. Overall, they have lots of information I don't about their own situation and preferences, and honestly informing them lets them combine my information with their own.
On the second plan, the force of my persuasion now depends less on my having correctly identified the best action, or understanding my friend's situation at all. By using generic persuasion techniques, I have destroyed information that the system of my friend and me might have otherwise used. This is true even if I don't tell any lies at all – any kind of generic persuasion will have this effect.
Of course, sometimes your friends want encouragement. Recently, in a gathering of friends, a grad student I know mentioned that she was procrastinating on sending a paper to her advisor for advice on what journals to submit it to. Our friend group first checked in to see whether there was any endorsed prudential reason why she should not send that email. She said no. Then, and only then, did we start to encourage her to submit it. We were about to go out together, but decided to put that on hold until she wrote the email, attached the paper, and hit "send".
The checking in first was important. If we hadn't done that, we might have pressured a friend into sending something that wasn’t ready, or into rushing a sensitive email. I could do this for my friend because I could check in with her personally, but this sort of intervention doesn't scale well.
We checked in first, because we trusted that she wouldn't pointlessly lie to us; she wanted what was best for her, at least as much as we did.
I ought to tell my friend the truth, not just because I want her to trust me; I ought to do it because I trust her to want to do what's right with the information.
…over every living soul
Let's go back to the example of giving romantic advice to a friend experiencing relationship conflict. If I'm attracted to my friend, I might be tempted to advise them to date me instead. After all, I'm in control of myself, so I can promise, with a high level of confidence, to instantiate the partner behavior they prefer. However, even with polyamory, this very quickly leads to a horrific bottlenecking problem – there's only one of me, and I only have so much time.
Note that this proposed solution is not tempting solely because I'd get something I want – it's tempting because I expect to cause my friend to have the experiences they want. And yet, it's an obviously wrong solution almost all of the time. It doesn't scale.
The right answer to being only one person, with limited capacity to allocate resources effectively, is to only try to control what you think you can do an especially good job with. The alternative answer, if you see relinquishing control as irresponsible, is to try to be more than one person. For instance, you might try to staff up an institution, made of multiple people, under your control, to achieve your aims.
An instructive example is the Open Philanthropy Project's failure in 2016 to give away even its "overall budget for the year, which [it] set at 5% of available capital." Of the remainder, 100% went to the Open Philanthropy Project's implied reserve, and 0% went to bolstering the reserves of grantees. (The 95% of available capital that wasn't budgeted was of course also allocated to the Open Philanthropy Project's implied reserve.) The Open Philanthropy Project is trying to solve this problem by staffing up, by acquiring more talented people to evaluate potential grants and program areas, but does not seem to have adopted the simple expedient of relinquishing control by giving grantees extra money until it meets its spending goals.
You don’t write up this kind of decision so clearly and prominently if you're deliberately acting in bad faith. This is what you do if you think you're doing the right thing, because you're morally obliged to take responsibility for whatever you can. This sort of behavior follows naturally from believing that honesty is about deserving the trust of others (the reputation argument), but not about trusting others (the humility argument).
My mirrored room
You can try to acquire resources by justifying your bid according to society’s legitimate resource-allocation mechanisms, or you can try to override those mechanisms in order to grab more for yourself. If you take the second path, you should expect that one of two things is true. Either society has effective safeguards against such unjustified power grabs, or it lacks those safeguards.
Let’s say that for whatever reason you think the world’s badly run enough that the responsible thing is to simply seize control of as much as you can, and then optimize it yourself. If there are sufficient safeguards against your strategy, then you should expect to lose. After all, you’re doing exactly what those safeguards are meant to prevent. So if these tactics appear to work, you should worry that such safeguards don’t exist.
Why worry? Because of the efficient market hypothesis – if there is a huge opportunity just lying there, someone will probably take it. You’re unlikely to be the first person or group to attempt a power grab. And if there are other groups trying to grab power through non-truth-tracking persuasion tactics, you should expect this to corrupt your epistemic environment. You should not expect to have good information; you’re surrounded by processes trying to grab credit, like your own. What you see is what you are.
If you want to do anything interesting, you’ll first have to figure out how to get out of that mirrored room.
Things are going to slide in all directions
In a world where there are safeguards against unjustified power grabs, individuals and groups will have defenses against mere claims that they owe you resources. These defenses will not just be skeptical, but will punish claims that aren’t backed up by clear evidence of convergent interests. Otherwise, we’d all just spam each other with requests in the hopes that one would go through. It would be a world of “Nigerian prince” 419 scams. This is why, in practice, you generally have to stake some preexisting social status to be listened to, to have an unusual claim even considered.
People functioning in such an environment will find that it’s a good idea to follow some generalized decision heuristics, to defend themselves, and also to avoid triggering such defenses. These heuristics will involve things like taking into account the fact that both sides have a perspective. They’ll take into account the fact that some of our preferences converge – there are large potential gains from trade, humans sympathize with other humans – and some of them diverge – we can’t live together and all sit atop the local status hierarchy. They’ll take into account that we don’t all have the same information, and because of divergent incentives, don’t automatically trust each other.
The study of which heuristics enable agents with only partly overlapping preferences to live together is called ethics, or morality. Some branches of morality look for firm general decision rules for agents trying to live together – deontology. Others pay more attention to the details of human cognition, and focus on how to align your perceptions of advantage and disadvantage with a cooperative equilibrium – virtue ethics. Some subfields such as eudaemonic ethics try to get there from the starting point of an individual’s preferences and well-being. Others, such as variants of utilitarianism, focus on directly defining the preferences we have in common.
High-trust societies, with coherent, entrenched ethical norms, are able to achieve high levels of coordinated production. In lower-trust societies, people grab what they can, and use it primarily on their private interests rather than public goods. It’s a many-way tug-of-war, and things slide in all directions.
The breaking of the ancient western code
Douglas Lenat’s EURISKO, an early promising attempt at AI, provides an illustrative example. (EURISKO’s name comes from the same root as Archimedes’ famous “Eureka!”) The basic way EURISKO worked was that it generated a bunch of candidate heuristics to try to solve a problem. Then it gave each of them credit for how much it contributed towards getting a good answer, and mutated the best candidates to try to iterate towards better solutions.
EURISKO’s heuristics didn’t just affect the program’s answer directly; they could also affect other parts of the program. In The Nature of Heuristics, Lenat wrote:
One newly synthesized heuristic kept rising in Worth, and finally I looked at it. It was doing no real work at all, but just before the credit/blame assignment phase, it quickly cycled through all the new concepts, and when it found one with high Worth it put its own name down as one of the creditors. Nothing is "wrong" with that policy, except that in the long run it fails to lead to better results.
If optimizing for assigning yourself credit works as a way to get credit, then you should assume that this has already happened. If this persists inside EURISKO, perhaps because Douglas Lenat is asleep, then you should not expect EURISKO’s predictions to be anywhere near as good as EURISKO says they are. If this happens inside society, perhaps because our lawmakers are dormant, then you should not expect society’s trusted institutions to be anywhere near as reliable sources of information as their reputations suggest.
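The credit-hacking dynamic Lenat describes can be sketched in a toy simulation (hypothetical names and numbers, not EURISKO's actual Lisp). One heuristic does no real work, but just before credit assignment it puts its own name down as a creditor of whichever result scored best:

```python
class Heuristic:
    def __init__(self, name, skill):
        self.name = name
        self.skill = skill   # how much it actually improves answers
        self.worth = 100     # EURISKO-style Worth score

def run_round(honest, parasite):
    # Honest heuristics earn credit in proportion to their real contribution.
    creditors = {h.name: h.skill for h in honest}
    # The parasite does no work; just before credit assignment, it lists
    # itself as a creditor of the highest-Worth result.
    best = max(creditors, key=creditors.get)
    creditors[parasite.name] = creditors[best]
    return creditors

honest = [Heuristic("h1", skill=3), Heuristic("h2", skill=7)]
parasite = Heuristic("credit-grabber", skill=0)

for _ in range(10):
    credit = run_round(honest, parasite)
    for h in honest + [parasite]:
        h.worth += credit.get(h.name, 0)

# The parasite's Worth ends up matching the best honest heuristic's,
# despite contributing nothing; selection on Worth alone can't tell
# a producer from a credit-grabber.
print(parasite.worth, max(h.worth for h in honest))
```

Nothing in the Worth scores distinguishes the parasite from the productive heuristic; you have to look at the actual work, as Lenat did, to spot the difference.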
Nothing is “wrong” with the policy of non-truth-tracking self-promotion, except that in the long run it fails to lead to better results. Suppressing such pathological heuristics is in fact what much of Lenat’s ongoing work on EURISKO amounted to:
When run for very long periods of time, EURISKO invents ways of entering infinite loops (e.g., a mutant heuristic which manages to alter the situation so that it will soon be triggered again). Much of our current work involves adding new capabilities to the program to detect and break out of such infinite loops, and to compile its experiences into one or more heuristics which would have prevented such situations from arising. It is not always easy to explain what is wrong with a certain "bad product".
Likewise, much of the work of political and moral theory involves adding new capabilities to detect and break out of self-justificatory, self-referential, or otherwise pathological heuristics, and to compile humanity’s experiences into one or more heuristics which would have prevented such situations from arising. It is not always easy to explain what is wrong with a certain "bad product".
Won’t be nothing you can measure anymore
I care about this because I want to have some idea of what I’m doing in the world.
Health and medicine
I’d like to be able to buy medicine that improves my health and doesn’t harm it. I might hope that if it turned out that scientists were unable to replicate major findings in cancer research, this would be news, and I’d hear about which things no longer have a firm evidence base. Instead, I found out about this through this weirdly defensive Vox article by Julia Belluz about how failed replications just mean science is hard, not that we should lower our confidence in anything in particular. I’m a fairly skeptical and moderately well-informed consumer of medical news, but I’m not sure where else I’m being suckered by similarly slanted coverage implying that everything is fine.
Policy and economics
I’d like to know whether my country’s elected officials are implementing policies that economists largely disagree with. Instead, I get a New York Times article written to create the appearance that economists disagree with a policy when in fact the profession is largely undecided. I’m a fairly skeptical and well-informed consumer of economic news, but I’m not sure where else I’m being suckered by similarly slanted coverage implying that a policy is erroneous.
I’d like to be able to consume food that doesn’t cause undue suffering to animals. I’m pretty sure based on Brian Tomasik’s and my calculations that when measured in animal-days, chicken and eggs cause a disproportionate amount of farmed animal suffering. However, I’m not sure where to look for reliable information about how morally relevant chickens are; I’ve received confident assurances from people who’ve looked at the research, but these confident assurances imply that they’ve solved important unsolved problems in the philosophy of consciousness and qualia. Even granting that, it’s easy enough to avoid factory-farmed chicken for the most part, but what to do about eggs? Can I just buy cage-free? I’ve heard lots of people arguing for veganism claim that cage-free eggs are hardly better than the regular kind. My current strategy is to preferentially buy pasture-raised eggs but not bother putting effort into preferentially seeking out cage-free over conventional eggs. But it’s not clear this makes sense: The Open Philanthropy Project’s Lewis Bollard describes more recent studies showing a substantial welfare improvement from cage-free systems. Turns out the original studies were conducted by the egg industry, which had an interest in not being asked to change its practices. The efficient market hypothesis strikes again: if truth-indifferent persuasion works, and someone has a strong interest in affecting your opinion, then your opinion has already been affected by truth-indifferent persuasion.
I’d like to be able to reason about the chances that humanity will wipe itself out, since it seems relevant to the future of humanity. I’ve regularly seen claims that there are enough nuclear weapons to wipe out humanity many times over, generally in the context of urging nuclear disarmament. Risk of a nuclear war wiping out humanity was a substantial part of why I endorsed Clinton over Trump. The apparent success of humanity in surviving a half-century when relatively few decisionmakers could have wiped us out affected my assessment of existential risk’s urgency. But it turns out I was misinformed. Carl Shulman actually bothered to talk to experts on nuclear winter, who told him that the chance a nuclear winter would wipe out humanity was between 1 in 10,000 and 1 in 100,000.
Effective altruism, far and near
I’d like to know whether there are opportunities to trade money for large improvements in human well-being. Peter Singer, when making the case for giving to the developing world, often implies cost-effectiveness estimates on the order of less than two hundred dollars per life saved, sometimes an order of magnitude lower. These estimates are wrong, as Singer himself agrees. Even GiveWell's "cost per life saved" estimates in the thousands of dollars are not to be taken literally. And yet, a casual observer might see these headline numbers and actually try to make inferences using them.
I don't know of anyone who's literally become a contract killer on the assumption that they can (just barely) save more lives than they end on net at the going price of $3,400. But I do know of "effective altruists" who seem to assume that the choice they make to give 10% of their salary to developing-world charity is much more important than the choices they make about how they behave locally, such as career choice. Medical doctors have ample opportunity to make large improvements in human well-being simply by doing their job with unusual integrity. And yet, probably the most prominent medical doctor in EA has written about how it’s horrifically guilt-inducing not to have a clear giving threshold of 10% of his income, after which he’s done his part. Meanwhile, he compares his work environment to a concentration camp. He writes positive reviews of books that portray his chosen specialty of psychiatry as a massive campaign of gaslighting, abuse, and torture. Something funny is going on here.
As in many other cases, I don’t think the case I’m picking on is someone being especially silly. I think they’re just especially honest, open, and clear about how silly they’re being. I have other friends who have in the past massively underinvested in their long-run potential, to scrape together more money to give to charity in the present – in some cases, originally motivated by Singer-style numbers.
All this seems like a bad misalignment in priorities, and one promoted by well-intentioned exaggeration of how good giving opportunities are. 80,000 Hours seems like it’s trying to correct some of this misalignment within Effective Altruism, but in part through an offsetting marketing campaign. If the correction is also making important mistakes, is it making it easy to see that? I don’t know.
The blizzard of the world has crossed the threshold
Perhaps there used to be a time when you could trust the newspapers to be trying to inform you. Perhaps it just used to be harder to check. Either way, I’m pretty sure that my broader epistemic environment is not very good, so that if I try to do weird things based on extraordinary claims without an exceptional level of personal epistemic vigilance, I’ll just fail. Perhaps quite badly.
To do better, I need an environment that helps me do better. I need an environment with safeguards at multiple levels, to protect my access to reliable information.
The hole in your culture
What might these safeguards look like? Perhaps they’ll look like resistance to unjustified claims of superiority. Skepticism of general claims to be the best. Social norms that encourage criticism and checking things for yourself, especially if the thing you’re criticizing has social momentum. Rewarding people for the outcomes they’re able to create, not the inputs they’ve already been able to demand.
One thing I’ve changed my mind on while writing this post is the value of criticizing noncentral claims people make. I used to think that this was antisocial because it directed attention away from the most relevant parts of the argument. I still think that a dearth of recognition of an argument’s central claims is a problem. But I don’t think that nitpicking per se is a problem. The reason is spillover effects. For the same reason that distorting or selectively reporting reality causes unpredictable amounts of harm even when done for a good cause, letting errors go unchallenged lets them propagate, and causes people’s world-models to get worse.
Take the only tree that’s left
It’s far from hopeless. Once upon a time there were no philosophers questioning the just and the true, and then there were. Once upon a time there were no scientists actively looking for ways they might be wrong, and then there were. Once upon a time there were no Quakers refusing to lie even when it made them stick out as weirdos, and then there were.
But I don’t think we have to start anew. Some parts of academia still had their act together enough – or perhaps we should instead say that they quaked independently enough – to have a replication crisis. The Rationalist movement has collected a lot of people who want to do better than their culture. And the Effective Altruist movement benefits from much of the same intellectual tradition – luminaries like Toby Ord, Rob Wiblin, and Carl Shulman used to blog together with Robin Hanson, Eliezer Yudkowsky, and others at Overcoming Bias. Philip Tetlock’s Good Judgment Project is doing valuable empirical research into how to make correct predictions. And there are probably sound efforts elsewhere that we don’t know about yet, and that don’t know about us.
I like the metaphor of a tree for epistemically sound, production-oriented strategies. They grow slowly, by the gradual accretion of progress. They mostly interact with their surroundings by producing, rather than consuming. They’re not concerned with sudden action, with attunement, with capturing the advantage of the moment. So, of course, in a war of all against all, without defenses they’d be some of the first things to get eaten. And yet we have some trees left.
Inch by inch, row by row
Or perhaps we’re not quite trees yet. Perhaps we’re just shrubs.
The Rationalists are pursuing a radical individual epistemic practice. One thing that parts of academia seem to have, that groups like the Rationalists don't quite have, is something like formal mechanisms for registering when arguments have been defeated, or what the standard objections are to a point of view. Individual Rationalists try to cultivate the personal ability to do so, but as an intellectual community we don't have the same ability to accumulate progress. (Robin Hanson writes about related issues here).
I'd like to figure out how to combine these two virtues and build an intellectual culture where individuals are expected to propagate beliefs efficiently, and there's also public accounting of progress: which arguments have been discredited, which are generally accepted. Arbital seems to be working on a technical foundation for something like this.
Maybe we can make this work. But if we’re going to build something tall, the foundation needs strengthening first. As Longfellow writes in The Builders:
In the elder days of Art,
Builders wrought with greatest care
Each minute and unseen part;
For the Gods see everywhere.
Let us do our work as well,
Both the unseen and the seen;
Make the house, where Gods may dwell,
Beautiful, entire, and clean.
Else our lives are incomplete,
Standing in these walls of Time,
Broken stairways, where the feet
Stumble as they seek to climb.
Build to-day, then, strong and sure,
With a firm and ample base;
And ascending and secure
Shall to-morrow find its place.
Thus alone can we attain
To those turrets, where the eye
Sees the world as one vast plain,
And one boundless reach of sky.
We can’t do the right thing unless we can see. And we can’t see until we can build, collaboratively, with reliable error-checking so that our errors cancel out and our knowledge compounds, rather than the other way around.
Love’s the only engine of survival.
In one sense, this is a message of relaxation. Focus on getting your part right, not on jumping to the finish line. And yet, this relaxation lets you hold yourself to the most exacting standard, and is the only way to build a tower with its top in the heavens.
We don’t get there by taking control, taking responsibility for others’ choices, trying to control others’ actions directly. We get there by honestly and humbly behaving according to heuristics that we would like more agents like us to follow. We don’t get there by emphasizing strategies to pursue our divergent goals, but by leaning into our convergent goals.
The methods of convergent goals are different than the methods of divergent ones. When you find others you have goals in common with, such that you all get what you want when any of you become more capable, then you don't have to keep strict accounts. You can just help out your collaborators when you see an advantageous way to do so. They'll be motivated to do the same, because you too are an agent working towards your common goals.
The plan of winning first, doing good second, has been tried, over and over again. The future it’s pushing us towards is not a nice place.
(Here are the lyrics for those of you who want to see the words of the prophets written instead of hearing them sung.)
This dim prophecy is not the only possible future. But to get a different outcome, we need to do something different. There's still time to plant seeds, but we should get to work – the time is close at hand: