To block or not to block? Online tools for taking on trolls

A US project uses AI-assisted human moderation to weed out toxic commenters

“To block, or not to block, that is the question:

Whether ’tis nobler on the web to suffer

The slings and arrows of outrageous comments,

Or to take arms against a sea of trolls

And by algorithm end them.”

Internet wisdom advises us not to feed the trolls, thereby starving them of the oxygen of attention. But what happens when we discover that we’re fighting a system inadvertently designed to bring out the troll in all of us? In an attention economy where polarised viewpoints, toxic speech and moral outrage are exactly the kinds of content that generate revenue for social media platforms, how do we make nice on the web?

Andrew Losowsky, project lead at the Coral Project at Mozilla, believes that how online platforms are designed – and moderated – can have a profound influence on how users engage with each other. The Coral Project is an open-source project creating free online tools for newsrooms to engage with their audience.

It was founded as a collaboration between the New York Times and the Washington Post, and its Talk commenting platform is used by 28 publications in 11 countries, including the Wall Street Journal, New York magazine and the Intercept.

Losowsky says that uncivil engagement in the comments section is most certainly a problem but it is a symptom, not the disease. If media websites treat their reader engagement as an afterthought, it’s not going to spontaneously self-organise into polite, considered and relevant conversation.

“There’s an analogy I like to use: imagine you are collecting canned food for a local food bank. You could take 100 cargo boxes and just randomly throw them around on the pavements of Dublin. If you go back a couple of days later to check whether you’ve collected anything, you’ll find that people have just thrown rubbish in them. It’s just an empty box on a pavement, who knows what it’s for?

“You need to ask: what are we trying to do here? We’re trying to collect food. Where are we more likely to find food? Outside supermarkets. So, we go to the five biggest supermarkets in Dublin and put a box outside each one with writing on the box to say what it’s for along with the name of the food bank. With these five boxes, it’s now also manageable; you can go between these boxes, removing rubbish if any finds its way in.”


Losowsky extends this analogy further to include signposting: take out the kind of food that is most ideal and label it to give people examples of what is best to donate. Other people walking past now know what you want and as they’re shopping in the supermarket/reading an article, they are primed on how best to engage when they emerge.

“What [online publishers] have done with the comments so far on news sites is to put an empty box at the bottom of every single article, walked away, come back and wondered why it is full of rubbish,” adds Losowsky. “We shouldn’t be that surprised that’s what we are getting.”

The Coral Project’s Talk platform uses, among others, Google’s Perspective API, which was designed for online publishers in order to detect toxic speech. The platform is not, however, fully automated; it purposely uses AI-assisted human moderation so that a human always makes the final decision rather than letting an algorithm take over.

“If you write a comment in our system and the AI thinks it might be toxic, it will send a message back to the commenter asking if they are sure they want to post this particular comment because the system thinks it might break our rules. You can post it anyway or you can change it. This way we’re giving the user one chance because maybe they’re having a bad day; it gives them a chance to rethink their actions. Also, the algorithm could be mistaken; it might not necessarily be toxic speech,” explains Losowsky.
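The flow Losowsky describes can be sketched in a few lines of Python. This is only an illustration, not Talk's actual code: the scoring function is a toy keyword check standing in for a real classifier such as Perspective, the threshold value and all names are hypothetical, and the decision to route a comment that is posted anyway to a human moderator is an assumption based on the principle that a person makes the final call.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    PUBLISHED = "published"
    ASK_TO_RECONSIDER = "ask_to_reconsider"
    HELD_FOR_MODERATOR = "held_for_moderator"


# Hypothetical cut-off; a real deployment would tune this value.
TOXICITY_THRESHOLD = 0.8


def toy_toxicity_score(text: str) -> float:
    """Toy stand-in for a real toxicity classifier: 1.0 if any flagged word appears."""
    flagged = {"idiot", "stupid", "moron"}
    words = [word.strip(".,!?").lower() for word in text.split()]
    return 1.0 if any(word in flagged for word in words) else 0.0


def submit_comment(
    text: str,
    already_warned: bool,
    score: Callable[[str], float] = toy_toxicity_score,
) -> Outcome:
    """Decide what happens to a comment; the algorithm never rejects it outright."""
    if score(text) < TOXICITY_THRESHOLD:
        return Outcome.PUBLISHED
    if not already_warned:
        # First attempt: give the commenter one chance to rethink or reword.
        return Outcome.ASK_TO_RECONSIDER
    # Posted anyway after the warning (assumed behaviour): a human decides.
    return Outcome.HELD_FOR_MODERATOR


if __name__ == "__main__":
    print(submit_comment("You are an idiot!", already_warned=False))
    print(submit_comment("You are an idiot!", already_warned=True))
```

Note that the sketch has no branch in which the software deletes a comment by itself. The score only triggers a prompt or a handover to a person, which also covers the case where the algorithm is mistaken.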

Clear labelling

Talk is also careful with its wording: it points out the language in the individual comment rather than accusing the person, thereby depersonalising the situation. Research has found that these kinds of design features can shape commenter behaviour. Another simple feature is clearly labelling journalist or staff member contributions, which University of Texas research has found leads to an improvement in the civility and quality of the conversation. People even cite more evidence in their online arguments if they can see the journalist is in the comments interacting with them.

This is about intentional design that gives people a chance to improve their behaviour, explains Losowsky.

So is bad design the only reason why online conversations can escalate from a couple of snarky replies to outright war on someone who has expressed an unpopular opinion? Or is something else happening? When we look at the design features of dominant social media platforms such as Facebook and Twitter, the almighty algorithm plays a massive role in detecting what is popular and serving up more of the same, even if it provokes toxic behaviour.

“Content that we find emotionally engaging tends to capture the most attention, and because of this, we’re served more of it. Social media companies have created incredibly powerful AI to optimise our feeds for this engagement, which is built to keep our attention,” explains Tobias Rose-Stockwell, technologist and researcher who is currently writing a book on this area.

“The result is a system that was created to probe and extract emotional responses from its two billion users at scale. By design, they are tools for influence, and their dominance has created a renaissance for propaganda and divisive content.”

More worryingly, says Rose-Stockwell, news organisations have learned that “covering social media trends is a guaranteed way to sell advertisements and subscriptions, which means that even if you’re getting your news offline, the information you consume is being influenced by these platforms”.

“A great example of this is Donald Trump, who was able to get massive organic reach by saying more and more outlandish things during the 2016 election. The analytics firm Mediaquant estimated that he received nearly $5 billion in free media coverage during the presidential campaign, and 142 per cent more media value from Twitter than Clinton – specifically due to the types of tweets he posted,” he adds.

It’s not all about hate speech. Another way to fix discourse is to provide not just the right platform but the right reasons for online engagement, explains Losowsky: “We need to be able to create more usefulness for everyone in this space so that people have a better reason to come [to a comment section] rather than just yell and scream at the news, which is a kind of utility because it gives you some catharsis but I think [the news industry] can do a lot more!”

Let’s say the article is about recycling. Inevitably, some people will come into the comments claiming that climate change isn’t real. That’s not what the article is about, but there’s a feeling among journalists and others in the newsroom that, because they believe in free speech, they shouldn’t delete such comments when they are not abusive or offensive, says Losowsky.

“But if you were running a town-hall conversation with your community and someone started shouting off topic, you would say: that’s an interesting point but we’re not here to talk about that and if you keep disrupting our topic we’re going to have to ask you to leave. If that’s how we would run it in person, why wouldn’t we run it the same way in the comment space?” he asks.

Maybe now, says Losowsky, “with so many spaces for online communities to gather, it is not just an opportunity but a necessity for platform providers to make conversations more productive by saying: our space is for this and if you want to say something else, there are other spaces to do that.”

Freedom of expression

However, one objection to “fixing” online spaces is the argument for freedom of expression. Back in the early 1990s, utopian cyberpunks saw the internet as the last bastion of free speech, no matter how objectionable that speech was.

“You don’t have a legal right to free speech on a private company’s website,” says Losowsky, adding that “people who shout ‘free speech’ tend to silence others”. Research bears this out, showing the chilling effect of trolling and toxic speech: online bullies, through verbal harassment and doxing (revealing a person’s address or other identifying information), censor the voices of others.

“I suspect a small number of vocal users would object to these kinds of prompts for more civil engagement. There will always be a small percentage of users that voice inordinately loud and toxic opinions. Normally they are not given a platform but on social media they are the stars of the show. Reducing the prominence of this type of toxic content is critical to fixing our discourse,” argues Rose-Stockwell.

If these platforms are doing what they are designed to do – keep you coming back for more – won’t social media companies be reluctant to redesign for more pleasant and less toxic interactions?

“There will no doubt be calls that these kinds of prompts are ‘engineering’ user behaviour, but the sad truth is that these platforms are already engineering our behaviour in order to keep us online and sell ads. With the state of things, it’s difficult to argue against the responsibility of platforms to improve the quality of our online conversation,” explains Rose-Stockwell.

And if attention-grabbing content has the side effect of stirring up outrage and other extreme emotions, there is also the issue of transparency in how our timelines are populated. The opacity of these algorithms is a problem: a recent Pew Internet study showed that a significant number of social media users felt they had little or no control over what they saw on these platforms.

“This is arguably the real free-speech issue of our time. If a trending global hashtag is primarily determined by the way an algorithm is written, what does that mean for democracy? If the prominence of #metoo or #brexit was influenced by code, what kind of control are we ceding to these companies? Having these decisions obscured from us is the equivalent of giving up our public square to people that have zero accountability to the public,” he adds.

“Providing users with the ability to control the type of content they consume in their feeds is a hugely important piece of solving these problems. I have written about different concepts, like outrage-limited and politics-free switches on our profiles. More immediately, however, people should consider seriously limiting their time on social media, and be aware that it can be a harmful place, even when it seems benign and well aligned with their goals.”
