You Need to Monitor for Toxic Content on Your Website. AI Can Help
Toxic content is pervasive across the internet, and businesses that host user-generated content run the risk of playing host to scams, bigotry, and misinformation. Platforming this toxic content can be damaging to your brand’s image and can harm consumer sentiment if not taken care of promptly and at scale.
Hiring moderators to sift through every livestream, podcast, post, and gif, however, could put your company out of business. There’s simply too much content for humans to clean it all up. Another problem is that sifting through the worst of the internet can have adverse effects on your employees’ health. In 2021, a judge awarded a group of more than 10,000 former Facebook moderators an $85 million settlement after the moderators developed PTSD on the job.
Enter the content moderation solutions market, which, by utilizing AI or pairing it with humans, is helping to turn the tide in the war against toxic content on the internet.
Kevin Guo, co-founder and CEO of AI-powered content moderation company Hive, first saw a potential business in automated content moderation when he and his co-founder Dmitriy Karpman were students at Stanford University. Guo and Karpman had created Kiwi, a video chat app that would randomly pair users with strangers from around the world.
Guo quickly found himself dealing with what he dubbed the “hot dog problem,” which would later be parodied in the HBO comedy Silicon Valley. Put simply: men were using the app to expose themselves on camera to an unwilling public.
After determining that a solution to his problem did not exist, Guo decided to build his own machine learning model that could identify and flag “hot dogs.” “I hand-labeled that set of images myself,” he says, “and it could really only do one thing, which was telling if something was a ‘hot dog’ or not.”
Guo began selling his “hot dog” model on the side, but quickly realized that a machine learning model that could identify and label objects in images and video had more applications than just nudity detection. In 2017, he and Karpman shut down their apps to focus entirely on enterprise business.
Now, Hive offers automated content moderation services of all kinds, with models that can be trained to detect toxic content within text and audio in addition to images. These models are used by companies including Reddit, Giphy, and Vevo to detect and shut down violence, hate speech, spam, bullying, self-harm, and other behaviors you’d rather not see on your website or app.
One of Guo’s earliest successes in content moderation came when live video chat services Omegle and Chatroulette approached Hive to assist in cleaning up their content. Both companies became infamous in the early 2010s for their inability to deal with problems similar to the “hot dog” situation, so when they heard that Guo had cracked the code, they were intrigued.
“Now,” Guo says, “those platforms are 100 percent clean. We sample every video chat, and we can flag it the moment something comes up.” According to a case study, Hive closes about 1.5 million Chatroulette streams per month.
Guo says his models are designed to be used without any human assistance or input, a particularly attractive aspect for big businesses that need highly scalable solutions.
In October 2021, Microsoft announced that it had acquired Two Hat, a content moderation provider focused on the online gaming industry. Like Hive, most of Two Hat’s content moderation services work without human interaction. In a blog post announcing the acquisition, Xbox Product Services corporate vice president Dave McCarthy said that Two Hat’s tech has helped to make global communities in Xbox, Minecraft and MSN safer for users via a highly configurable approach that allows the user to decide what they are and aren’t comfortable with.
Other content moderation professionals, however, feel that the true solution lies in combining what AI does well with human decision-making. Twitch, the global livestreaming service for gaming, music, and entertainment, is creating internal programs that use machine learning partnered with humans to flag suspicious and harmful content. While some content is banned platform-wide, such as nudity and violence, Twitch also allows streamers to customize content moderation specifically for their own channel.
A primary example of this customization, according to Twitch community health product director Alison Huffman, comes in the form of a recently released tool called Suspicious User Detection. The tool uses machine learning to identify users who have created a new account in order to get around being banned from a specific channel. The tool flags potential ban evaders and then lets creators make their own decision on how to proceed.
“We’re trying to combine the best of machine learning, which is scalable and efficient but imperfect at detecting nuance, with human review, which is less efficient but more nuanced and personal,” says Huffman.
“This way, we’re using machine learning to give creators information that helps them make better, faster safety decisions–while still leaving that final decision in their hands.”
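For readers who want to see what this kind of hybrid setup might look like in practice, here is a minimal Python sketch. It is not Twitch's or Hive's actual system: the function name, the threshold values, and the assumption that a model returns a toxicity score between 0 and 1 are all hypothetical, chosen only to illustrate the idea of letting software handle the clear-cut cases and sending the ambiguous ones to a person.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    item_id: str
    score: float
    action: str  # "remove", "review", or "allow"


def route(item_id: str, score: float,
          auto_remove_at: float = 0.95,
          human_review_at: float = 0.60) -> Decision:
    """Route one piece of content based on a model's toxicity score.

    The score is assumed to lie in [0, 1]; both thresholds are
    hypothetical defaults, not values used by any real platform.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= auto_remove_at:
        action = "remove"   # model is confident: act automatically
    elif score >= human_review_at:
        action = "review"   # ambiguous: queue for a human moderator
    else:
        action = "allow"    # model sees nothing worth flagging
    return Decision(item_id, score, action)


if __name__ == "__main__":
    for item_id, score in [("post-1", 0.99), ("post-2", 0.70), ("post-3", 0.10)]:
        print(route(item_id, score))
```

A platform that wants a human to make every final call, as Huffman describes, could set the automatic-removal threshold above 1 so that nothing is ever removed without review.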