If there's anything we can do now about the risks of superintelligent AI, then OpenAI makes humanity less safe.
Once upon a time, some good people were worried about the possibility that humanity would figure out how to create a superintelligent AI before figuring out how to tell it what we wanted it to do. If this happened, it could lead to literally destroying humanity and nearly everything we care about. This would be very bad. So they tried to warn people about the problem, and to organize efforts to solve it.
Specifically, they called for work on aligning an AI’s goals with ours - sometimes called the value alignment problem, AI control, friendly AI, or simply AI safety - before rushing ahead to increase the power of AI.
Some other good people listened. They knew they had no relevant technical expertise, but what they did have was a lot of money. So they did the one thing they could do - throw money at the problem, giving it to trusted parties to try to solve the problem. Unfortunately, the money was used to make the problem worse. This is the story of OpenAI.
I am surrounded by well-meaning people trying to take responsibility for the future of the universe. I think that this attitude – prominent among Effective Altruists – is causing great harm. I noticed this as part of a broader change in outlook, which I've been trying to describe on this blog in manageable pieces (and sometimes failing at the "manageable" part).
I'm going to try to contextualize this by outlining the structure of my overall argument.
Why I am worried
Effective Altruists often say they're motivated by utilitarianism. At its best, this leads to things like Katja Grace's excellent analysis of when to be a vegetarian. We need more of this kind of principled reasoning about tradeoffs.
At its worst, this leads to some people angsting over whether it's ethical to spend money on a cup of coffee when they might have saved a life, and others using the greater good as license to say things that are not quite true, socially pressure others into bearing inappropriate burdens, and make ever-increasing claims on resources without a correspondingly strong verified track record of improving people's lives. I claim that these actions are not in fact morally correct, and that people keep winding up endorsing those conclusions because they are using the wrong cognitive approximations to reason about morality.
Summary of the argument
When people take responsibility for something, they try to control it. So, universal responsibility implies an attempt at universal control.
Maximizing control has destructive effects:
An adversarial stance towards other agents.
These failures are not accidental, but baked into the structure of control-seeking. We need a practical moral philosophy to describe strategies that generalize better, and that benefit from the existence of other benevolent agents rather than treating them primarily as threats.
I've read a few business books and articles that contrast national styles of contract negotiation. Some countries, such as the US, have a style in which a contract is meant to be fully binding: if one of the parties can predict that they will likely break the contract in the future, accepting that version of the contract is seen as substantively and surprisingly dishonest. In other countries this is not seen as terribly unusual - a contract is just an initial guideline, to be renegotiated whenever incentives slip too far out of whack.
More generally, some people reward me for thinking carefully before agreeing to do costly things for them or making potentially big promises, and wording them carefully to not overcommit, because it raises their level of trust in me. Others seem to want to punish me for this because it makes them think I don't really want to do the thing or don't really like them.
I saw a beggar leaning on his wooden crutch.
He said to me, "You must not ask for so much."
And a pretty woman leaning in her darkened door.
She cried to me, "Hey, why not ask for more?"
-Leonard Cohen, Bird on the Wire
In my series on GiveWell, I mentioned that my mother's friend Charlie, who runs a soup kitchen, gives away surplus donations to other charities, mostly ones he knows well. I used this as an example of the kind of behavior you might hope to see in a cooperative situation where people have convergent goals.
I recently had a chance to speak with Charlie, and he mentioned something else I found surprising: his soup kitchen made a decision not to accept donations online. They only took paper checks. They get enough money that way, and they don't want to accumulate more than they know how to use.
When I asked why, Charlie told me that it would be bad for the donors to support a charity if they haven't shown up in person to have a sense of what it does.
Effective Altruists talk about looking for neglected causes. This makes a great deal of intuitive sense. If you are trying to distribute food, and one person is hungry, and another has enough food, it does more direct good to give the food to the hungry person.
Likewise, if you are trying to decide on a research project, discovering penicillin might be a poor choice. We know that penicillin is an excellent thing to know about and has probably already saved many lives, but it’s already been discovered and put to common use. You’d do better discovering something that hasn’t been discovered yet.
My critique of GiveWell sometimes runs contrary to this principle. In particular, I argue that donors should think of crowding out effects as a benefit, not a cost, and that they should often be happy to give more than their “fair share” to the best giving opportunities. I ought to explain.
At the end of 2015, GiveWell wrote up its reasons for recommending that Good Ventures partially but not fully fund the GiveWell top charities. This reasoning seemed incomplete to me, and when I talked about it with others in the EA community, their explanations tended to switch between what seemed to me to be incomplete and mutually exclusive models of what was going on. This bothered me, because the relevant principles are close to the core of what EA is.
A foundation that plans to move around ten billion dollars and is relying on advice from GiveWell isn’t enough to get the top charities fully funded. That’s weird and surprising. The mysterious tendency to accumulate big piles of money and then not do anything with most of it seemed like a pretty important problem, and I wanted to understand it before trying to add more money to this particular pile.
So I decided to write up, as best I could, a clear, disjunctive treatment of the main arguments I’d seen for the behavior of GiveWell, the Open Philanthropy Project, and Good Ventures. Unfortunately, my writeup ended up being very long. I’ve since been encouraged to write a shorter summary with more specific recommendations. This is that summary.
I have faith that if only people get a chance to hear a lot of different kinds of songs, they'll decide what are the good ones. -Pete Seeger
A lot of the discourse around honesty has focused on the value of maintaining a reputation for honesty. This is an important reason to keep one's word, but it's not the only reason to have an honest intent to inform. Another reason is epistemic and moral humility.
Perhaps much of what appears to be disagreement on how much dishonesty is permissible is in fact disagreement on how much words have meanings. I'll begin with a brief treatment of the reputation considerations for keeping one's word, and then complicate it.
I've promoted Effective Altruism in the past. I will probably continue to promote some EA-related projects. Many individual EAs are well-intentioned, talented, and doing extremely important, valuable work. Many EA organizations have good people working for them, and are doing good work on important problems.
That's why I think Sarah Constantin’s recent writing on Effective Altruism’s integrity problem is so important. If we are going to get anything done, in the long run, we have to have reliable sources of information. This doesn't work unless we call out misrepresentations and systematic failures of honesty, and these concerns get taken seriously.
Sarah's post is titled “EA Has A Lying Problem.” Some people think this is overstated. This is an important topic to be precise on - the whole point of raising these issues is to make public discourse more reliable. For this reason, we want to avoid accusing people of things that aren’t actually true. It’s also important that we align incentives correctly. If dishonesty is not punished, but admitting a policy of dishonesty is, this might just make our discourse worse, not better.
To identify the problem precisely, we need language that can distinguish making specific assertions that are not factually accurate, from other conduct that contributes to dishonesty in discourse. I'm going to lay out a framework for thinking about this and when it's appropriate to hold someone to a high standard of honesty, and then show how it applies to the cases Sarah brings up.
Sometimes, new technical developments in the discourse around effective altruism can be difficult to understand if you're not already aware of the underlying principles involved. I'm going to try to explain the connection between one such new development and an important underlying claim. In particular, I'm going to explain the connection between donor lotteries (as recently implemented by Carl Shulman in cooperation with Paul Christiano)[1] and returns to scale. (This year I’m making a $100 contribution to this donor lottery, largely for symbolic purposes to support the concept.)
[1] This phrasing was suggested by Paul. Here's how Carl describes their roles: "I came up with the idea and basic method, then asked Paul if he would provide a donor lottery facility. He did so, and has been taking in entrants and solving logistical issues as they come up."