Improve comments by tagging claims

I used to think that comments didn’t matter. I was wrong. This matters because communities of discourse are an important source of knowledge. I’ll explain why I changed my mind, and then propose a simple mechanism for improving comments that can be implemented on any platform that allows threaded comments.

Should there be comments?

Comments are optional.

Organizing discourse as a forum with comments introduces substantial structural rigidity. If you don’t like one of the regular posters’ content, too bad, it’s still cluttering up your feed. If you want to start contributing to the discourse, this comes at the cost of pulling everyone else’s attention, willing or no. In forums with high barriers to entry, this means that the move equivalent to “start your own blog” is not available to everyone. In forums with low barriers to entry, this can mean dilution by low-value posts, leading to the departure of discriminating readers. If standards are ambiguous, this can lead to an adverse selection process in which the contributors who are most conscientious about respecting community standards err on the side of not posting, and the least conscientious contributors flood the forum with low-value content, accelerating the process of pushing away the most discriminating participants.

My prior model of how written discourse should work was that people should publish in whatever venue they thought appropriate - often their own blog. If they find someone else’s writing interesting and want to comment on it, they can write their own post, and link to the thing they want to talk about. This has a few benefits. A link-based attention economy means that in order to get the attention of the readers of some content, you generally have to get the attention of the person generating the content. This means that higher-quality writing gets attention, and limits the effectiveness of low-quality trolling. Another benefit is that it allows for partly overlapping clusters. If a blog you like keeps linking to another blog, you can decide whether you like that one, and if you do, you can start reading it directly. Thus, you’ll be directed to blogs that are “nearby” in your local network, but not everyone reading your blog has to read the same set of nearby blogs.

In the early 2000s, a natural experiment happened. Americans on the right and left were both galvanized by the World Trade Center attacks and the consequent wars in Afghanistan and Iraq, and started writing. Two parallel blogospheres were formed - one on the left, and one on the right. And it so happened that some of the first prominent right-wing bloggers didn’t allow comments on their blogs. The result? The left-blogosphere developed parallel thriving communities on prominent blogs. The right-blogosphere had a “get your own blog” ethos, and were generous with their links and hat-tips. For more on this, see Adamic and Glance 2005 (h/t Kevin Drum) and Benkler and Shaw 2010.

My first exposure to the blogosphere was mostly blogs on the right during the relevant period. When I heard tell of problems due to trollish commenters, and difficulties establishing clear standards for posting in forums, my initial response was, “Who needs a forum? If you have something to say, you should start your own blog.” I can no longer wholeheartedly insist that this is the whole solution. But I still think that the comments-links axis is an important one in discourse design, and it’s not obvious that the end of the spectrum “comments” points towards is always better.

Comments matter.

A friend recently expressed disappointment that an article they published in a forum got low-quality comments that they felt compelled to respond to. When I suggested posting on their own blog, and if necessary closing the comment section, they pointed out something I hadn’t properly considered:

Comments are a high-quality, high-sensitivity measure of engagement.

It’s great when someone links to your work - and perhaps linking would be more common without comment sections. It’s a strong signal that you’ve been heard, and someone thinks your message is relevant - but it happens rarely, and unless you’re already extremely popular, it won’t happen on most of your posts. It’s a bad feedback loop taken on its own; it measures the right thing, but gives extremely coarse-grained feedback. A more sensitive metric is website traffic - how many people did you get to look at your post? But this doesn’t tell you whether anyone was moved to do something on their own based on your post, just how many people felt moved to click through, and maybe share it on social media. It favors feel-good posts and outrage porn over true insight and clear criticism. Judging by web traffic alone, my all-time “best” blog post is one about the query language of the mind. This probably happened because it was shared by some prominent figures in my community. But judging by engagement, one of my first posts pointing out some specific problems with Effective Altruism seemed to be much better at starting productive conversations, based on the comment section. (It turns out to be my second most popular post of all time, but I couldn't have known that as quickly as I knew that I was getting good engagement.) As a result, I’ve written more posts like the latter, and gotten more in-person feedback that those posts were persuasive, than for posts on any other topic I’ve written on.

A second reason comments are important is that starting your own blog - or writing a whole blog post - is a pretty big deal when all you want to do is signal-boost an article you think is worth reading (upvoting or “liking” is a way to do this and only this, without interposing yourself between the reader and the content), or reply to a specific point that’s only relevant in the context of a particular article (in which case commenting is the natural solution). Oddly, the more hierarchically structured forums and blogs with comments welcome more interaction than the flatter structure of a purely posts-and-links blogosphere.

Improving comments

Selection vs treatment

There’s already been some discussion about how to make comments work better. For instance, Paul Christiano has proposed a machine learning solution in line with his approach to AI safety. Much of the discussion has been about efficiently promoting good comments and detecting and removing bad actors. In short, it’s about improving the quality of observed comments through selection.

I want to talk about how to create, not selection effects, but treatment effects. I want to focus on making comments better - doing things that directly cause people who are trying in good faith to participate in the discourse to post better comments.

Problem: nitpicking

One problem that’s come up repeatedly in conversations about the quality of blog post comments is that they don’t respond to important points, and instead nitpick about minutiae. I think this happens for a few reasons, one of which is that it’s often easier to evaluate minor fact-claims than major claims. Many readers aren’t comparing an article’s underlying model to theirs - they may not have a model - and so look for details that are easy to say yes or no to. The obvious solution is to make it easier to identify and think about a post’s substantive claims.

Solution: tag claims

The Arbital team has recently implemented the feature of tagging claims made in an article. For instance, in Alexei Andreev’s recent post about waiting to donate until the end of a fundraiser, there are links in the article to specific claims it makes or addresses, such as:

When linking to my blog post on GiveWell and "crowding out" considerations, he adds a link to a related claim page:

I’m pretty excited about this infrastructure. It gives the post author a way to foreground the considerations they think are most relevant, and gives commenters a set of default topics to argue about.

It also accomplishes the secondary goal of making comments more of a lasting, accessible record. If the comments about a claim are scattered over several related articles relevant to the claim, and also mixed together with comments on other topics, it can be hard to know whether you’ve seen the important discussions on a given topic. If, on the other hand, all those articles are tagged with the same claim, then you need only click through to the claim page, and you’ll find a record of comments by readers of all of the relevant posts - and only the comments relevant to this claim.
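To make the mechanism concrete, here is a minimal Python sketch of a claim page as a grouping of comments by claim tag. It is an illustration only, not Arbital's actual implementation; the `Comment` fields, the `build_claim_pages` function, and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Comment:
    post: str             # hypothetical identifier of the post the comment was left on
    claim: Optional[str]  # hypothetical claim tag; None means the comment is untagged
    author: str
    text: str


def build_claim_pages(comments):
    """Group tagged comments by claim, across every post that carries the tag."""
    pages = defaultdict(list)
    for comment in comments:
        if comment.claim is not None:
            pages[comment.claim].append(comment)
    return dict(pages)


# Made-up sample data: two posts share one claim tag, plus one untagged comment.
sample_comments = [
    Comment("post-a", "crowding-out", "alice", "This depends on the donor's size."),
    Comment("post-b", "crowding-out", "bob", "I disagree, for small donors."),
    Comment("post-b", None, "carol", "Typo in paragraph three."),
]

claim_pages = build_claim_pages(sample_comments)
```

A reader who opens the page for one claim then sees the comments from every post tagged with it, and none of the unrelated ones.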

However, Arbital isn’t a universal platform, and many people want to maintain their personal blogs or post to a public forum allowing comments. What can we do to improve comments there?

Threaded comments enable tagged claims.

In my post advocating publishing private opinions on secret blogs, I posted three comments, one for each claim I wanted to make. Each comment ended with: “If you want to discuss this claim, I encourage you to do it as a reply to this comment.” On LessWrong, someone asked me to put the claim comments in boldface so they’d be easier to find. For another example, see my comments on this post.

The post didn’t get a huge number of comments, but they felt perhaps slightly more on-topic than usual. My claim-comments didn’t prevent people from introducing new threads, but they might have generated relevant responses that wouldn’t have been written otherwise, and they mostly got comments on the same topic grouped together. If you’re writing your own posts, whether on a forum or other group publication, a personal blog or website, or just ordinary social media, I encourage you to try this. Let me know how it goes.
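If you want to try this on several posts, the claim comments are easy to generate from a list of claims. The following is a small Python sketch under my own assumptions: `claim_comments` is a hypothetical helper, and the invitation text is the one quoted above. It only builds the comment bodies; posting them is left to whatever platform you use.

```python
def claim_comments(claims):
    """Return one comment body per claim, numbered from 1."""
    invitation = (
        "If you want to discuss this claim, "
        "I encourage you to do it as a reply to this comment."
    )
    return [
        f"Claim {number}: {claim}\n\n{invitation}"
        for number, claim in enumerate(claims, start=1)
    ]


# Example with two claims taken from this post's own claim comments.
bodies = claim_comments([
    "Location on the comments-links continuum is an important aspect of discourse design.",
    "Claim-tagging is worth trying more broadly.",
])
```

Each body can then be posted as its own top-level comment, so that replies to one claim stay in one thread.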

(Cross-posted at LessWrong and Arbital.)


13 thoughts on “Improve comments by tagging claims”

  1. Benquo Post author

    Claim 1: Location on the comments-links continuum is an important aspect of discourse design.

    If you want to discuss this claim, I encourage you to do it as a reply to this comment.

  2. Benquo Post author

    Claim 2: Comments are a high-quality, high-sensitivity measure of engagement with little in the way of viable substitutes.

    If you want to discuss this claim, I encourage you to do it as a reply to this comment.

  3. Benquo Post author

    Claim 3: Irrelevant nitpicks are an important problem in comment sections on sites such as LessWrong.

    If you want to discuss this claim, I encourage you to do it as a reply to this comment.

  4. Benquo Post author

    Claim 4: Explicitly tagging the core claims of a post will make people substantially more likely to respond to these claims.

    If you want to discuss this claim, I encourage you to do it as a reply to this comment.

  5. Benquo Post author

    Claim 5: Claim-tagging is worth trying more broadly (because of claims 3,4).

    If you want to discuss this claim, I encourage you to do it as a reply to this comment.

  6. John Salvatier

    I think you may be misunderstanding why people focus on selection mechanisms. Selection mechanisms can have big effects on both the private status returns to quality in comments (~5x) and the social returns to quality (~1000x). Similar effects are much less plausible with treatment effects.

    Claim: selection mechanisms are much more powerful than treatment effects.

    I think people are using the heuristic: If you want big changes in behavior, focus on incentives.

    Selection mechanisms can make relatively big changes in the private status returns to making high quality comments by making high quality comments much more recognized and visible. That makes the authors higher status, which gives them good reason to invest more in making the comments. If you get 1000x the audience when you make high quality comments, you're going to feel substantially higher status.

    Selection mechanisms can make the social returns to quality much larger by focusing people's attention on high quality comments (whereas before, many people might have had difficulty identifying high quality even after reading it).

    1. Benquo Post author

      "More powerful" seems like it's implicitly using categories that don't cut at the joints. I think Aceso Under Glass's post on Tostan makes an important distinction between capacity-building and capacity-using interventions:

      This is more speculative, but I feel like the most legible interventions are using something up. Charity Science: Health is producing very promising results with SMS vaccine reminders in India, but that’s because the system already had some built in capacity to use that intervention (a ~working telephone infrastructure, a populace with phones, government health infrastructure, medical research that identified a vaccine, vaccine manufacture infrastructure… are you noticing a theme here?). [...] Having that capacity and not using it was killing people. But I don’t think that CS’s intervention style will create much new capacity. For that you need inefficient, messy, special snowflake organizations.

      I'd guess that treatment effects seem less powerful than selection effects of equal importance because treatment effects are typically more capacity-building loaded.

      1. John Salvatier

        I think your overall model of capacity using vs. capacity building makes sense (taking care to distinguish from 'capacity consuming'). Some interventions increase underlying capabilities and some capacities let us use those underlying capabilities much more effectively.

        The effects are multiplicative, so it doesn't make that much sense to talk about the effect size?

        Is that what you meant?

        If so, I disagree that this implies that getting good selection effects is not more important *at this time in history*. If you choose to improve the underlying capabilities instead of improving selection effects, you will still end up with a much less effective overall process.

        If you're multiplying effects together, it's still more important to use the bigger numbers first.

        It does make sense to think about effect sizes, but you need to make sure you're estimating their multiplicative effect.

        But on reflection this argument seems somewhat obvious, so maybe I'm misunderstanding you.

        To make this more concrete: lots of underlying capability gets wasted when there aren't good selection effects to make use of it. For example, people will repeatedly do useful cognitive work in blog comments, but because there's no good mechanism for bringing it strongly to the attention of people that should care, that work is usually thrown away and will be repeated.

        1. Benquo Post author

          I think you make a good case that we should search for selection effects first. However, once we've done the obvious things (e.g. karma scores), if a problem persists, it makes sense to think about treatment effects. (Separately, it's important to keep the distinction in mind in situations where what we really care about is treatment, and selection effects make a proxy metric less useful. Medicine is a good example of this.)

          I wonder whether this wording might have been misleading:

          I want to talk about how to create, not selection effects, but treatment effects. I want to focus on making comments better - doing things that directly cause people who are trying in good faith to participate in the discourse to post better comments.

          I meant that in this particular blog post, I'm talking about treatment effects, because there's already been plenty of discussion about selection effects in the relevant domain. I did not mean that we should generally favor treatment effects over selection effects.

          1. John Salvatier

            We are in agreement!

            I agree that if you try to make better selection effects a bunch, and you still have more problems, then you might want to look at treatment effects.

            And that it's important to keep the two distinct.

            I had the general impression from your post that people were substantially overvaluing selection effects, which I don't think is the case. Also, the fact that you were spending time trying to think of treatment effects suggested to me that you thought this was a priority.

            It sounds like you don't think those things, so I think we now agree.

            (As an aside, it was incredibly satisfying to read the words "I think you make a good case that we should search for selection effects first." I thank you from the bottom of my heart for directly acknowledging my point.)

  7. Pingback: The humility argument for honesty | Compass Rose

  8. Chris

    I really wish there was a thread here for people's results when they tried this. I like the idea, but it could generalise to discussion points.

