
WECOUNTERHATE

POSSIBLE, Seattle / LIFE AFTER HATE / 2018



OVERVIEW

Campaign Description

With the advent of social media, technology gave hate a way to spread like never before. In fact, in 2017 there were more tweets containing hate speech than tweets about Game of Thrones, Major League Baseball, the Super Bowl or the Grammys.

We decided to use the same technology that empowered hate, to stop it.

That was the genesis of #WeCounterHate, a people-powered, machine-learning platform created to stop the spread of hate speech on Twitter, one retweet at a time.

First, AI helps identify tweets containing hate speech. Once identified, they’re tagged with a reply. This permanent marker lets those looking to spread hate know that retweeting will trigger a donation to a nonprofit that fights for inclusion, diversity and equality.

Potential retweeters are presented with a decision: Don’t retweet the hateful ideology, or retweet it and financially benefit an organization they’re opposed to. Either way, love wins.

Execution

How do you teach a machine to understand hate speech? In addition to being obviously unpleasant, hate speech is a technical challenge to unpack. The sheer volume of content online is an issue (one hateful tweet every two seconds), and the language itself is riddled with subtleties. The hate speech our system sees comes from nine different countries, covers six different targeted groups, and uses three times as many words as normal language. Plus, hate speakers often use humor, sarcasm and coded language (Juice = Jews) to avoid detection.

#WeCounterHate created a series of filters designed to identify and classify hate speech for response. The first filter uses a bag-of-words approach to identify thousands of potential candidates for hate speech on Twitter. This subset is filtered through a second layer of supervised machine learning, which rests on our proprietary hate-speech classifiers that were trained in part by former white supremacists.
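The first-pass filter described above can be sketched as follows. This is a minimal illustration only: the lexicon here is a hypothetical stand-in, whereas the real system's term list was built in part with input from former white supremacists and is far larger.

```python
# Hypothetical seed lexicon for illustration, including one coded term
# ("juice" used for "Jews", as mentioned above). Not the real word list.
HATE_LEXICON = {"juice", "subhuman", "invaders"}

def is_candidate(tweet_text: str) -> bool:
    """Flag a tweet as a candidate for deeper classification if it
    contains any lexicon term. Bag-of-words: word order is ignored."""
    tokens = {tok.strip(".,!?#@").lower() for tok in tweet_text.split()}
    return not tokens.isdisjoint(HATE_LEXICON)
```

Note that a bag-of-words pass like this is deliberately high-recall: a tweet about orange juice would also be flagged. That is by design; the later supervised-learning and toxicity layers exist to discard such false positives.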

These classifiers are layered on top of IBM Watson’s NLP platform, and the results are passed through Google Perspective’s toxicity API. That final filter sends the resulting stream of the most toxic hate speech to a human moderator for response. The technology has been in production since February 2018 and has proven 91% successful at identifying hate.
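The layered triage described above can be sketched as a simple funnel. The scoring functions here are stand-ins, not the real client signatures: in production these would wrap the bag-of-words filter, the trained hate-speech classifiers, and the Perspective toxicity API.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Tweet:
    text: str

def triage(tweets: List[Tweet],
           candidate_filter: Callable[[str], bool],
           hate_score: Callable[[str], float],
           toxicity: Callable[[str], float],
           hate_threshold: float = 0.5,
           toxicity_threshold: float = 0.8) -> List[Tweet]:
    """Return tweets that survive all three layers and should be
    queued for a human moderator to respond to."""
    queue = []
    for tweet in tweets:
        if not candidate_filter(tweet.text):           # layer 1: bag of words
            continue
        if hate_score(tweet.text) < hate_threshold:    # layer 2: trained classifiers
            continue
        if toxicity(tweet.text) < toxicity_threshold:  # layer 3: toxicity API
            continue
        queue.append(tweet)
    return queue
```

Keeping the human moderator as the last step matters here: the automated layers only narrow the stream, and a person makes the final call before any reply is posted.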

Outcome

The platform has radically outperformed expectations, identifying hate speech with 91% accuracy relative to a human moderator, and we are continuing to improve the model.

When #WeCounterHate responds to a hate tweet, it reduces the spread of that hate by an average of 54%, and 19% of the "hatefluencers" delete the tweet outright. It all equates to more than 4MM fewer people being exposed to hate speech (at the time of this writing), essentially making for a hugely successful anti-media plan.

Our hope is to continue to counter hate speech online, while collecting insightful data about how hate speech online propagates. This data will allow experts in the field to address the hate speech problem at a more systemic level.

Relevancy

For #WeCounterHate, we developed innovative machine-learning and natural-language-processing technology to identify and classify hate speech on social media, one of the ugliest and most unstructured data sets imaginable, so that we can respond to it in near real time and reduce its spread.

Solution

September 2017 - Beta binary classifier built (hate vs. no hate)

November 2017 - Beta classifiers formalized based on five intensities

January 2018 - Machine training updated with feedback from former white supremacists

February 2018 - #WeCounterHate launches
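The move from a binary classifier to five intensities could look something like the sketch below. The entry does not name the actual bands, so these labels and cut-offs are purely illustrative assumptions.

```python
# Hypothetical mapping from a classifier confidence score in [0, 1]
# to five intensity labels. Band names and thresholds are invented
# for illustration; the real system's bands are not public.
INTENSITY_BANDS = [
    (0.2, "1-minimal"),
    (0.4, "2-low"),
    (0.6, "3-moderate"),
    (0.8, "4-high"),
    (1.0, "5-severe"),
]

def intensity(score: float) -> str:
    """Map a score in [0, 1] to one of five intensity labels."""
    for upper, label in INTENSITY_BANDS:
        if score <= upper:
            return label
    return INTENSITY_BANDS[-1][1]
```

Graded intensities, as opposed to a hate/no-hate flag, let moderators prioritize the most severe content first.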

Synopsis

Our responses to tweets containing hate speech give the hate speakers an impossible choice: do nothing, and let the hateful message die, or share the tweet and drive donations to a non-profit that fights for inclusion and diversity. The technology succeeds by either driving funds to an anti-extremist nonprofit, or by reducing the actual hate speech that the nonprofit is fighting against. A win-win.

Our identification of, and responses to, hate speech have reduced its spread on Twitter by 54%. It all equates to more than 4MM fewer people (at the time of this writing) being exposed to hate speech on Twitter because of #WeCounterHate. Learn more here: https://showmethe.work/wecounterhate.
