Recently, a friend looking to support high-quality news sources by subscribing asked for recommendations. I noted that New York Magazine had been doing some surprisingly good journalism.
I'd sneered at that sort of magazine in the past – the sort that people mainly buy to see who's on the annual top doctors list or top restaurants list. But my sneering was inconsistent. I'd assumed that such an obviously gameable metric must already be corrupt – but when I lived in DC, Washingtonian Magazine's restaurant picks were actually pretty good, and my girlfriend found a really good doctor on the Top Doctors list. Nor was he an expensive concierge doctor – he took her fairly ordinary health insurance. I'd assumed there would be paid placement, but there wasn't. The methodology of such lists is actually fairly clever: they survey doctors, asking, for each specialty: if you needed to see a doctor other than yourself in this specialty, whom would you go to? Now I live in Berkeley, and the last time I needed to see an ear doctor, I found one on the list just a few blocks from my house – and he was excellent.
But even after correcting for my prejudices, New York Magazine is special. They recently published some of the best science reporting I've seen – it's nominally about the Implicit Association Test, but it's really about the sorts of bad science that contributed to the replication crisis. Here are some excerpts I thought were especially clear:
What constitutes an acceptable level of test-retest reliability? It depends a lot on context, but, generally speaking, researchers are comfortable if a given instrument hits r = .8 or so. The IAT’s architects have reported that overall, when you lump together the IAT’s many different varieties, from race to disability to gender, it has a test-retest reliability of about r = .55. By the normal standards of psychology, this puts these IATs well below the threshold of being useful in most practical, real-world settings.
In a 2007 chapter on the IAT, for example, Kristin Lane, Banaji, Nosek, and Greenwald included a table (Table 3.2) running down the test-retest reliabilities for the race IAT that had been published to that point: r = .32 in a study consisting of four race IAT sessions conducted with two weeks between each; r = .65 in a study in which two tests were conducted 24 hours apart; and r = .39 in a study in which the two tests were conducted during the same session (but in which one used names and the other used pictures). In 2014, using a large sample, Yoav Bar-Anan and Nosek reported a race IAT test-retest reliability of r = .4 (Table 2). Calvin Lai, a postdoctoral fellow at Harvard who is the director of research at Project Implicit, ran the numbers from some of his own data, and came up with similar results. “If I had to estimate for immediate test-retest now, it would be r ~= .35,” he wrote in an email. “If it was over longer time periods, I would revise my estimate downward although I’m uncertain about how much.” (In emails, Greenwald argued that Lai’s figures should be adjusted upward using the so-called Spearman-Brown formula to account for the fact that they stemmed from IATs that weren’t full-length, but Blanton strongly pushed back on that claim. I emailed a few statisticians asking them to arbitrate the dispute and basically got a hung jury.) (Update: Lai emailed me after this article went up and said that in light of research published since he provided me with the original estimate, he’d now estimate the true value to be in the neighborhood of r = .42.)
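To make the Spearman-Brown dispute in that excerpt concrete, here is a minimal sketch of what the adjustment does. The formula itself is standard: it predicts the reliability of a test lengthened by a factor k from the reliability r of the shorter version. The lengthening factors below are my own hypothetical values; the article doesn't say how much shorter Lai's IATs were than full-length ones.

```python
def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability when a test with reliability r is lengthened by a factor k."""
    return k * r / (1 + (k - 1) * r)


r_short = 0.35  # Lai's immediate test-retest estimate, as quoted above

# Hypothetical lengthening factors (assumptions, not figures from the article).
for k in (1.0, 1.5, 2.0):
    print(f"k = {k}: adjusted r = {spearman_brown(r_short, k):.2f}")
```

Even if you grant a doubling of test length, the adjusted figure stays well short of the r = .8 or so that the article describes as the usual comfort zone.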
One is that the most IAT-friendly numbers, published in a 2009 meta-analysis lead-authored by Greenwald, which found fairly unimpressive correlations (race IAT scores accounted for about 5.5 percent of the variation in discriminatory behavior in lab settings, and other intergroup IAT scores accounted for about 4 percent of the variance in discriminatory behavior in lab settings), were based on some fairly questionable methodological decisions on the part of the authors. The Oswald team, in a meta-analysis of their own published in 2013, argued convincingly that Greenwald and his colleagues had overestimated the correlations between IAT scores and discriminatory behavior by including studies that didn’t actually measure discriminatory behavior, such as those which found a link between high IAT scores and certain brain patterns (these studies, in fact, found some of the highest correlations). The Oswald group also claimed — again, convincingly — that the Greenwald team took a questionable approach to handling so-called ironic IAT effects, or published findings in which high IAT scores correlated with better behavior toward out-group than in-group members, the theory being the implicitly biased individuals were overcompensating. Greenwald and his team counted both ironic and standard effects as evidence of a meaningful IAT–behavior correlation, which, in effect, allowed the IAT to double-dip at the validity bowl: Unless the story being told is extremely pretzel-like, it can’t be true that high IAT scores predict both better and worse behavior toward members of minority groups. If one study finds a correlation between IAT scores and discriminatory behavior against out-group members, and another, similarly-sized study finds a similarly sized correlation between IAT scores and discriminatory behavior against the in-group members, for meta-analytic purposes those two studies should average out to a correlation of about zero. 
That isn’t what the Greenwald team did — instead, they in effect added the two correlations as though they were pointing in the same direction.
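The double-dipping point is easy to check with toy numbers. Below is a rough sketch, assuming a simple fixed-effect pooling of correlations via Fisher's z transform; this is one common textbook approach, not necessarily what either the Greenwald or the Oswald team used. The two studies and their sample sizes are invented for illustration.

```python
import math


def pooled_r(rs, ns):
    """Pool correlations with Fisher's z transform, weighting each study by n - 3."""
    zs = [math.atanh(r) for r in rs]
    weights = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(z_bar)


# Invented example: one "standard" effect and one equally sized "ironic" effect.
rs = [0.30, -0.30]
ns = [100, 100]

signed = pooled_r(rs, ns)                       # signs kept: the effects cancel
unsigned = pooled_r([abs(r) for r in rs], ns)   # signs dropped: both count as support

print(f"signs kept:    {signed:.2f}")
print(f"signs dropped: {unsigned:.2f}")

# For scale: 5.5 percent of variance explained corresponds to a correlation of about 0.23.
print(f"r for 5.5% of variance: {math.sqrt(0.055):.2f}")
```

With the signs kept, the pooled correlation is zero; with the signs dropped, the same two studies look like consistent evidence for a correlation of 0.30.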
And this is the middlebrow city magazine for New York. The other magazine is the New Yorker. Which happens, incidentally, to be the other source of high-quality reporting I'd recommend. What's so good about New York? I wonder whether it's the theater.
A friend was recently considering moving from the San Francisco Bay Area back to New York. At first, they thought that their social life there couldn't plausibly measure up to the level of intellectual engagement and high-quality friendship they'd found here. At first, I agreed.
But then I thought, what if we don't have to be stupid about this? What if, instead of trying to find the exact same kinds of friends one left behind in the Bay, one looked for where New Yorkers were trying to make something real and interesting?
My friend would probably end up in a finance job if they moved back to New York. There's a certain sort of honesty in finance, where you're not expected to pretend you're not motivated by money. In fact, people look at you funny if you talk about how you love your job – which I suspect is more conducive to actually loving your job than the opposite norm. But, ultimately, I don't expect that to lead to full authentic engagement with reality.
Then I thought about the other major New York industry – one where the creative class starts new entrepreneurial ventures on a small scale in the hopes of making it big. I'm speaking, of course, of off-off-Broadway theater. And, more generally, the startup arts scene.
Most new businesses in the developed world aren't about providing basic necessities anymore, so there's no point in old prejudices in favor of "industrialists" against impractical artsy types. A theatrical performance is a kind of organized production that requires both good taste and operational competence. Hamilton and its ilk might be the Apple Computer of the East Coast.
The arts engage with what it is to be human, they grapple with philosophical and political questions, they are an especially human use of human beings, and they benefit from a very, very long tradition. And, there's an important sense in which theater is more honest than large portions of the new app-driven tech economy – they're both pretending, but theater is overtly pretending. They both say they're going to change the world, but theatrical artists actually do change the way people talk about their lives, the narratives people have available.
So perhaps it's no coincidence that the magazine smart enough to retain Atul Gawande (of The Checklist Manifesto fame) as a staff writer is one that focuses on the New York arts scene. And that the other New York magazine, with extensive theater listings, even unto the off-off-Broadway shows, gets science reporting right.
So I argued that my friend should try to make friends with theater people if they moved to New York. That these are the New York equivalent of the Berkeley Rationalists and Bay Area startup founders – or better.