Harmful Content Dashboard

Individual users can review hateful and harassing replies, comments, and messages that target them via a dashboard, and address this harmful content through blocking, reporting, and other actions. (See Offensive Comment Filter.)

How does this mitigate hate?

Platforms can reduce exposure to online hate and harassment by enabling users to proactively filter out and quarantine targeted replies, comments, and messages that are automatically flagged as harmful. However, platforms should not simply hide this content; they should give users the ability to review and address it.


When to use it?

When a user is facing online hate and harassment and wants to reduce exposure while still being able to monitor for threats or risks.

How does it work?

Users should be able to turn on filters that automatically flag targeted replies, comments, and messages as potentially hateful or abusive and quarantine this harmful content in a dashboard.
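As a rough illustration only, the Python sketch below shows one way such a filter could route incoming replies, comments, and messages either to the user's feed or to a quarantine queue. The score_toxicity() scorer, the 0.8 threshold, and all other names are hypothetical, not drawn from any platform's implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    VISIBLE = "visible"          # delivered to the user as usual
    QUARANTINED = "quarantined"  # held in the dashboard for review


@dataclass
class Message:
    sender: str
    text: str
    score: float = 0.0
    status: Status = Status.VISIBLE


def score_toxicity(text: str) -> float:
    """Placeholder scorer returning an abuse likelihood in [0, 1].
    A real system would call a trained classifier or moderation API."""
    lexicon = {"exampleslur", "examplethreat"}  # stand-in for a maintained lexicon
    words = text.lower().split()
    return min(1.0, sum(w in lexicon for w in words) / max(len(words), 1) * 5)


def route_message(msg: Message, threshold: float = 0.8) -> Message:
    """Quarantine a message when its score crosses the user's threshold;
    otherwise deliver it normally."""
    msg.score = score_toxicity(msg.text)
    msg.status = Status.QUARANTINED if msg.score >= threshold else Status.VISIBLE
    return msg
```

Making the threshold user-adjustable, as sketched here, lets each person trade off exposure against the risk of missing genuine threats.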

From within the dashboard, users should be able to review the content (to monitor for threats or doxing, for example) and to address it via blocking, reporting, documentation, etc.

Users should be able to manually add abusive content to the dashboard that was missed by the automated filter and manually release content from the dashboard that was mistakenly automatically flagged as abusive or that the user does not perceive as abusive.
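Continuing the sketch above, a dashboard object could support both the review actions and the manual corrections described in the previous two paragraphs. All names remain purely illustrative and do not reflect any platform's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Dashboard:
    """Quarantine queue the user can review, act on, and correct."""
    quarantined: list[Message] = field(default_factory=list)
    blocked_senders: set[str] = field(default_factory=set)
    reports: list[Message] = field(default_factory=list)  # documented record

    def review(self) -> list[Message]:
        """Let the user inspect held content, e.g. to monitor for threats or doxing."""
        return list(self.quarantined)

    def block(self, msg: Message) -> None:
        """Block the sender so their future messages never reach the user."""
        self.blocked_senders.add(msg.sender)

    def report(self, msg: Message) -> None:
        """Escalate the message to the platform and keep a documented copy."""
        self.reports.append(msg)

    def add(self, msg: Message) -> None:
        """Manually quarantine content the automated filter missed."""
        msg.status = Status.QUARANTINED
        self.quarantined.append(msg)

    def release(self, msg: Message) -> None:
        """Restore a false positive, or content the user does not consider abusive."""
        msg.status = Status.VISIBLE
        self.quarantined.remove(msg)


# Example flow: a flagged message lands in the dashboard instead of the feed.
dash = Dashboard()
incoming = route_message(Message(sender="anon123", text="exampleslur"))
if incoming.status is Status.QUARANTINED:
    dash.add(incoming)
```

Keeping blocking, reporting, and releasing on the same surface as review matters: the user can act the moment they see the content rather than hunting through separate settings.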

Advantages

A harmful content filter and dashboard focused on an individual user's experience of targeted replies, comments, and messages can provide an alternative to overzealous proactive content moderation, which can severely undermine free expression for all users.

Disadvantages

Automated filtering of harmful content is an imperfect science: it produces false positives, struggles to keep pace with rapidly evolving and coded forms of abuse and hate, and has difficulty analyzing symbols and images.

Some detection algorithms have also been shown to exhibit racist or sexist biases.

Platforms should work more closely with one another, with companies that build third-party tools, and with civil society to create and maintain a shared taxonomy of abusive tactics, terms, symbols, etc., and to create publicly available data sets and heuristics for independent review.

Examples

TikTok offers a dashboard to review filtered comments. (screenshot date unknown)

YouTube offers a dashboard that automatically holds potentially inappropriate comments for review. (screenshot taken May 2022)

References

Germain, Thomas. “How to Filter Hate Speech, Hoaxes, and Violent Clips out of Your Social Feeds.” Consumer Reports, August 13, 2020. https://www.consumerreports.org/social-media/combat-hate-speech-and-misinformation-on-social-media/.

Madison, Quinn. “Tuning out Toxic Comments, with the Help of AI.” Google Design, February 11, 2020. https://medium.com/google-design/tuning-out-toxic-comments-with-the-help-of-ai-85d0f92414db.

Systrom, Kevin. “Protecting Users with Bullying Filters on Instagram.” Instagram Blog, May 1, 2018. https://about.instagram.com/blog/announcements/bully-filter-and-kindness-prom-to-protect-our-community/.

Vilk, Viktorya, Elodie Vialle, and Matt Bailey. “No Excuse for Abuse: What Social Media Companies Can Do Now to Combat Online Harassment and Empower Users.” Edited by Summer Lopez and Suzanne Nossel. PEN America, March 31, 2021. https://pen.org/report/no-excuse-for-abuse/.

Attribution

Written by PEN America