The Secret Rules of the Internet

The untold history of online content moderation, and how it’s shaping the future of free speech.

Soraya Chemaly & Catherine Buni · The Verge · April 13, 2016

Image: ERIC PETERSON/THE VERGE

Julie Mora-Blanco remembers the day, in the summer of 2006, when the reality of her new job sank in. A recent grad of California State University, Chico, Mora-Blanco had majored in art, minored in women’s studies, and spent much of her free time making sculptures from found objects and blown glass. Struggling to make rent and working a post-production job at Current TV, she’d jumped at the chance to work at an internet startup called YouTube. Maybe, she figured, she could pull in enough money to pursue her lifelong dream: to become a hair stylist.

It was a warm, sunny morning, and she was sitting at her desk in the company’s office, located above a pizza shop in San Mateo, an idyllic and affluent suburb of San Francisco. Mora-Blanco was one of 60-odd twenty-somethings who’d come to work at the still-unprofitable website.

Mora-Blanco’s team — 10 people in total — was dubbed The SQUAD (Safety, Quality, and User Advocacy Department). They worked in teams of four to six, some doing day shifts and some night, reviewing videos around the clock. Their job? To protect YouTube’s fledgling brand by scrubbing the site of offensive or malicious content that had been flagged by users, or, as Mora-Blanco puts it, “to keep us from becoming a shock site.” The founders wanted YouTube to be something new, something better — “a place for everyone” — and not another eBaum’s World, which had already become a repository for explicit pornography and gratuitous violence.

Mora-Blanco sat next to Misty Ewing-Davis, who, having been on the job a few months, counted as an old hand. On the table before them was a single piece of paper, folded in half to show a bullet-point list of instructions: Remove videos of animal abuse. Remove videos showing blood. Remove visible nudity. Remove pornography. Mora-Blanco recalls her teammates were a “mish-mash” of men and women; gay and straight; slightly tipped toward white, but also Indian, African-American, and Filipino. Most of them were friends, friends of friends, or family. They talked and made jokes, trying to make sense of the rules. “You have to find humor,” she remembers. “Otherwise it’s just painful.”

Videos arrived on their screens in a never-ending queue. After watching a couple seconds apiece, SQUAD members clicked one of four buttons that appeared in the upper right hand corner of their screens: “Approve” — let the video stand; “Racy” — mark video as 18-plus; “Reject” — remove video without penalty; “Strike” — remove video with a penalty to the account. Click, click, click. But that day Mora-Blanco came across something that stopped her in her tracks.

“Oh, God,” she said.

Mora-Blanco won’t describe what she saw that morning. For everyone’s sake, she says, she won’t conjure the staggeringly violent images which, she recalls, involved a toddler and a dimly lit hotel room.

Ewing-Davis calmly walked Mora-Blanco through her next steps: hit “Strike,” suspend the user, and forward the person’s account details and the video to the SQUAD team’s supervisor. From there, the information would travel to the CyberTipline, a reporting system launched by the National Center for Missing and Exploited Children (NCMEC) in 1998. Footage of child exploitation was the only black-and-white zone of the job, with protocols outlined and explicitly enforced by law since the late 1990s.

The video disappeared from Mora-Blanco’s screen. The next one appeared.

Ewing-Davis said, “Let’s go for a walk.”

Okay. This is what you’re doing, Mora-Blanco remembers thinking as they paced up and down the street. You’re going to be seeing bad stuff.

Almost a decade later, the video and the child in it still haunt her. “In the back of my head, of all the images, I still see that one,” she said when we spoke recently. “I really didn’t have a job description to review or a full understanding of what I’d be doing. I was a young 25-year-old and just excited to be getting paid more money. I got to bring a computer home!” Mora-Blanco’s voice caught as she paused to collect herself. “I haven’t talked about this in a long time.”

Mora-Blanco is one of more than a dozen current and former employees and contractors of major internet platforms from YouTube to Facebook who spoke to us candidly about the dawn of content moderation. Many of these individuals are going public with their experiences for the first time. Their stories reveal how the boundaries of free speech were drawn during a period of explosive growth for a high-stakes public domain, one that did not exist for most of human history. As law professor Jeffrey Rosen first said many years ago of Facebook, these platforms have “more power in determining who can speak and who can be heard around the globe than any Supreme Court justice, any king or any president.”

Launched in 2005, YouTube was the brainchild of Chad Hurley, Steve Chen, and Jawed Karim, three men in their 20s who were frustrated because technically there was no easy way for them to share two particularly compelling videos: clips of the 2004 tsunami that had devastated southeast Asia, and Janet Jackson’s Super Bowl “wardrobe malfunction.” In April of 2005, they tested their first upload. By October, they had posted their first one million-view hit: Brazilian soccer phenom Ronaldinho trying out a pair of gold cleats. A year later, Google paid an unprecedented $1.65 billion to buy the site. Mora-Blanco got a title: content policy strategist, or in her words, “middle man.” Sitting between the front lines and content policy, she handled all escalations from the front-line moderators, coordinating with YouTube’s policy analyst. By mid-2006, YouTube viewers were watching more than 100 million videos a day.

In its earliest days, YouTube attracted a small group of people who mostly shared videos of family and friends. But as volume on the site exploded, so did the range of content: clips of commercial films and music videos were being uploaded, as well as huge volumes of amateur and professional pornography. (Even today, the latter eclipses every other category of violating content.) Videos of child abuse, beatings, and animal cruelty followed. By late 2007, YouTube had codified its commitment to respecting copyright law through the creation of a Content Verification Program. But screening malicious content would prove to be far more complex, and required intensive human labor.

Sometimes exhausted, sometimes elated, and always under intense pressure, the SQUAD reviewed all of YouTube’s flagged content, developing standards as they went. They followed a guiding-light question: “Can I share this video with my family?” For the most part, they worked independently, debating and arguing among themselves; on particularly controversial issues, strategists like Mora-Blanco conferred with YouTube’s founders. In the process, they drew up some of the earliest outlines for what was fast becoming a new field of work, an industry that had never before been systematized or scaled: professional moderation.

By fall 2006, working with data and video illustrations from the SQUAD, YouTube’s lawyer, head of policy, and head of support created the company’s first booklet of rules for the team, which, Mora-Blanco recalls, was only about six pages long. Like the one-pager that preceded it, copies of the booklet sat on the table and were constantly marked up, then updated with new bullet points every few weeks or so. No booklet could ever be complete, no policy definitive. This small team of improvisers had yet to grasp that they were helping to develop new global standards for free speech.

In 2007, the SQUAD helped create YouTube’s first clearly articulated rules for users. They barred depictions of pornography, criminal acts, gratuitous violence, threats, spam, and hate speech. But significant gaps in the guidelines remained — gaps that would challenge users as well as the moderators. The Google press office, which now handles YouTube communications, did not agree to an interview after multiple requests.

As YouTube grew up, so did the videos uploaded to it: the platform became an increasingly important host for newsworthy video. For members of the SQUAD, none of whom had significant journalism experience, this sparked a series of new decisions.

In the summer of 2009, Iranian protesters poured into the streets, disputing the presidential victory of Mahmoud Ahmadinejad. Dubbed the Green Movement, it was one of the most significant political events in the country’s post-Revolutionary history. Mora-Blanco, soon to become a senior content specialist, and her team — now dubbed Policy and more than two-dozen strong — monitored the many protest clips being uploaded to YouTube.

On June 20th, the team was confronted with a video depicting the death of a young woman named Neda Agha-Soltan. The 26-year-old had been struck by a single bullet to the chest during demonstrations against pro-government forces and a shaky cell-phone video captured her horrific last moments: in it, blood pours from her eyes, pooling beneath her.

Within hours of the video’s upload, it became a focal point for Mora-Blanco and her team. As she recalls, the guidelines they’d developed offered no clear directives regarding what constituted newsworthiness or what, in essence, constituted ethical journalism involving graphic content and the depiction of death. But she knew the video had political significance and was aware that their decision would contribute to its relevance.

Mora-Blanco and her colleagues ultimately agreed to keep the video up. It was fueling important conversations about free speech and human rights on a global scale and was quickly turning into a viral symbol of the movement. It had tremendous political power. They had tremendous political power. And the clip was already available elsewhere, driving massive traffic to competing platforms.

The Policy team worked quickly with the legal department to relax its gratuitous violence policy, on the fly creating a newsworthiness exemption. An engineer swiftly designed a button warning that the content contained graphic violence — a content violation under normal circumstances — and her team made the video available behind it, where it still sits today. Hundreds of thousands of individuals, in Iran and around the world, could witness the brutal death of a pro-democracy protester at the hands of government. The maneuvers that allowed the content to stand took less than a day.

Today, YouTube’s billion-plus users upload 400 hours of video every minute. Every hour, Instagram users generate 146 million “likes” and Twitter users send 21 million tweets. Last August, Mark Zuckerberg posted on Facebook that the site had passed “an important milestone: For the first time ever, one billion people used Facebook in a single day.”

The moderators of these platforms — perched uneasily at the intersection of corporate profits, social responsibility, and human rights — have a powerful impact on free speech, government dissent, the shaping of social norms, user safety, and the meaning of privacy. What flagged content should be removed? Who decides what stays and why? What constitutes newsworthiness? Threat? Harm? When should law enforcement be involved?

While public debates rage about government censorship and free speech on college campuses, customer content management constitutes the quiet transnational transfer of free-speech decisions to the private, corporately managed corners of the internet where people weigh competing values in hidden and proprietary ways. Moderation, explains Microsoft researcher Kate Crawford, is “a profoundly human decision-making process about what constitutes appropriate speech in the public domain.”

During a panel at this year’s South by Southwest, Monika Bickert, Facebook’s head of global product policy, shared that Facebook users flag more than one million items of content for review every day. The stakes of moderation can be immense. As of last summer, social media platforms — predominantly Facebook — accounted for 43 percent of all traffic to major news sites. Nearly two-thirds of Facebook and Twitter users access their news through their feeds. Unchecked social media is routinely implicated in sectarian brutality, intimate partner violence, violent extremist recruitment, and episodes of mass bullying linked to suicides.

Content flagged as violent — a beating or beheading — may be newsworthy. Content flagged as “pornographic” might be political in nature, or as innocent as breastfeeding or sunbathing. Content posted as comedy might get flagged for overt racism, anti-Semitism, misogyny, homophobia, or transphobia. Meanwhile content that may not explicitly violate rules is sometimes posted by users to perpetrate abuse or vendettas, terrorize political opponents, or out sex workers or trans people. Trolls and criminals exploit anonymity to dox, swat, extort, exploit rape, and, on some occasions, broadcast murder. Abusive men threaten spouses. Parents blackmail children. In Pakistan, the group Bytes for All — an organization that previously sued the Pakistani government for censoring YouTube videos — released three case studies showing that social media and mobile tech cause real harm to women in the country by enabling rapists to blackmail victims (who may face imprisonment after being raped), and stoke sectarian violence.

A prevailing narrative, as one story in The Atlantic put it, is that the current system of content moderation is “broken.” For users who’ve been harmed by online content, it is difficult to argue that “broken” isn’t exactly the right word. But something must be whole before it can fall apart. Interviews with dozens of industry experts and insiders over 18 months revealed that moderation practices with global ramifications have been marginalized within major firms, undercapitalized, or even ignored. To an alarming degree, the early seat-of-the-pants approach to moderation policy persists today, hidden by an industry that largely refuses to participate in substantive public conversations or respond in detail to media inquiries.

In an October 2014 Wired story, Adrian Chen documented the work of front line moderators operating in modern-day sweatshops. In Manila, Chen witnessed a secret “army of workers employed to soak up the worst of humanity in order to protect the rest of us.” Media coverage and researchers have compared their work to garbage collection, but the work they perform is critical to preserving any sense of decency and safety online, and literally saves lives — often those of children. For front-line moderators, these jobs can be crippling. Beth Medina, who runs a program called SHIFT (Supporting Heroes in Mental Health Foundational Training), which has provided resilience training to Internet Crimes Against Children teams since 2009, details the severe health costs of sustained exposure to toxic images: isolation, relational difficulties, burnout, depression, substance abuse, and anxiety. “There are inherent difficulties doing this kind of work,” Chen said, “because the material is so traumatic.”

But as hidden as that army is, the orders it follows are often even more opaque, crafted by an amalgam of venture capitalists, CEOs, policy, community, privacy and trust and safety managers, lawyers, and engineers working thousands of miles away. Sarah T. Roberts is an assistant professor of Information and Media Studies at Western University and author of the forthcoming Behind the Screen: Digitally Laboring in Social Media’s Shadow World. She says “commercial content moderation” (a term she coined to denote the kind of professional, organized moderation featured in this article) is not a cohesive system, but a wild range of evolving practices spun up as needed, subject to different laws in different countries, and often woefully inadequate for the task at hand. These practices routinely collapse under the weight and complexity of new challenges, as the decisions moderators make engage ever more profound matters of legal and human rights, with outcomes that affect users, workers, and our digital public commons. As seen with Black Lives Matter or the Arab Spring, whether online content stays or goes has the power to shape movements and revolutions, as well as the sweeping policy reforms and cultural shifts they spawn.

Yet even basic facts about the industry remain a mystery. Last month, in a piece titled “Moderating Facebook: The Dark Side of Social Networking,” Who Is Hosting This? suggested that one third of “Facebook’s entire workforce” is made up of moderators, a number Facebook disputes as an overestimate. Content moderation is fragmented into in-house departments, boutique firms, call centers, and micro-labor sites, all complemented by untold numbers of algorithmic and automated products. Hemanshu Nigam, founder of SSP Blue, which advises companies in online safety, security, and privacy, estimates that the number of people working in moderation is “well over 100,000.” Others speculate that the number is many times that.

At industry leaders such as Facebook, Pinterest, and YouTube, the moderation process is improving drastically; at other platforms it might as well be 2006. There, systems are tied to the hands-off approach of an earlier era, and continue to reflect distinct user and founder sensibilities. At 4chan, for example, users are instructed against violating US law but are also free to post virtually any type of content, as long as they do so on clearly defined boards. According to the site’s owner Hiroyuki Nishimura, 4chan still relies heavily on a volunteer system of user-nominated “janitors.” These janitors, Nishimura said in a recent email exchange, play a critical role, “tasked with keeping the imageboards free of rule-breaking content.” As a janitor application page on the website lays out, “Janitors are able to view the reports queue, delete posts, and submit ban and warn requests” for their assigned board. 4chan janitors, Nishimura said, use “chat channels,” to discuss content questions with supervising moderators, some paid, some unpaid. “If they can’t decide,” he wrote, “they ask me, so that I’m the last one in 4chan. And, in case I couldn’t judge. I asked with lawyers.” Even after more than a decade, 4chan remains a site frequently populated by harassment and threats. Content has included everything from widespread distribution of nonconsensual porn, to “Niggerwalk” memes, to racist mobs evoking Hitler and threatening individual users. “People who try to do bad things use YouTube, Facebook, Twitter, and 4chan also,” Nishimura told us. “As long as such people live in the world, it happens. Right now, I don’t know how to stop them and I really want to know. If there is a way to stop it, we definitely follow the way.”

The details of moderation practices are routinely hidden from public view, siloed within companies and treated as trade secrets when it comes to users and the public. Despite persistent calls from civil society advocates for transparency, social media companies do not publish details of their internal content moderation guidelines; no major platform has made such guidelines public. Very little is known about how platforms set their policies — current and former employees like Mora-Blanco and others we spoke to are constrained by nondisclosure agreements. Facebook officials Monika Bickert and Ellen Silver, head of Facebook’s Community Support Team, responded to questions regarding their current moderation practices, and Pinterest made safety manager Charlotte Willner available for an interview. However, Facebook and Pinterest, along with Twitter, Reddit, and Google, all declined to provide copies of their past or current internal moderation policy guidelines. Twitter, Reddit, and Google also declined multiple interview requests before deadline. When asked to discuss Twitter’s Trust and Safety teams’ operations, for example, a spokesperson wrote only:

“Our rules are designed to allow our users to create and share a wide variety of content in an environment that is safe and secure for our users. When content is reported to us that violates our rules, which include a ban on violent threats and targeted abuse, we suspend those accounts. We evaluate and refine our policies based on input from users, while working with outside safety organizations to ensure that we have industry best practices in place.”

Several motives drive secrecy, according to Crawford, Willner, and others. On the one hand, executives want to guard proprietary tech property and gain cover from liability. On the other, they want the flexibility to respond to nuanced, fast-moving situations, and they want to both protect employees who feel vulnerable to public scrutiny and protect the platform from users eager to game a policy made public. The obvious costs of keeping such a significant, quasi-governmental function under wraps rarely rank as a corporate concern. “How,” asks Roberts, the content moderation researcher, “do you or I effect change on moderation practices if they’re treated as industrial secrets?”

Dave Willner was at Facebook between 2008 and 2013, most of that time as head of content policy, and is now in charge of community policy at Airbnb. Last spring, we met with him in San Francisco. He wore a rumpled red henley and jeans, and he talked and walked fast as we made our way across the Mission.

Members of the public, “as much as ‘the public’ exists,” he said, hold one of three assumptions about moderation: moderation is conducted entirely by robots; moderation is mainly in the hands of law enforcement; or, for those who are actually aware of content managers, they imagine content is assessed in a classroom-type setting by engaged professionals thoughtfully discussing every post. All three assumptions, he said, were wrong. And they’re wrong, in great part, because they all miss the vital role that users themselves play in these systems.

By and large, users think of themselves as customers, or consumers. But platforms rely on users in three profound ways that alter that linear relationship: One, users produce content — our stories, photos, videos, comments, and interactions are core assets for the platforms we use; two, user behaviors — amassed, analyzed, packed, and repackaged — are the source of advertising revenue; and three, users play a critical role in moderation, since almost every content moderation system depends on users flagging content and filing complaints, shaping the norms that support a platform’s brand. In other words, users are not so much customers as uncompensated digital laborers who play dynamic and indispensable functions (despite being largely uninformed about the ways in which their labor is being used and capitalized).

Anne Collier, the founder of iCanHelpline, a social media tool for schools, suggests that users have not yet recognized their collective power to fix the harms users have themselves created in social media. “They’re called ‘users’ for a reason,” she said, “and collectively still think and behave as passive consumers.” By obfuscating their role, she argues, the industry delays users’ recognition of their agency and power.

Some of the larger companies — notably Facebook and Google — engage civil society through the creation of expert task forces and targeted subject matter working sessions dedicated to the problem of online harassment and crime. Organizations such as the Global Network Initiative, the Anti-Cyberhate Working Group, Facebook’s Safety Advisory Board, and Twitter’s new Trust and Safety Council are all examples of such multidisciplinary gatherings that bring together subject matter experts. However, these debates (unlike say, congressional hearings), are shielded from public view, as both corporate and civil society participants remain nearly silent about the deliberations. Without greater transparency, users, consumers — the public at large — are ill-equipped to understand exactly how platforms work and how their own speech is being regulated and why. This means that the most basic tools of accountability and governance — public and legal pressure — simply don’t exist.

In the earliest “information wants to be free” days of the internet, objectives were lofty. Online access was supposed to unleash positive and creative human potential, not provide a venue for sadists, child molesters, rapists, or racial supremacists. Yet this radically free internet quickly became a terrifying home to heinous content and the users who posted and consumed it.

This early phase, from its earliest inceptions in the 1960s until 2000, is what J.G. Palfrey, the former executive director of the Berkman Center for Internet and Society at Harvard University, calls the Open Internet. It was in great part the result of Section 230(c) of the 1996 Communications Decency Act, known as the “Good Samaritan” provision, which absolved companies of liability for content shared on their services.

Section 230 is widely cherished as the “most important law on the Internet,” credited with making possible a “trillion or so dollars of value” according to David Post, legal scholar and fellow at the Center for Democracy and Technology. He calls Section 230 a “rather remarkable provision.” It reads: “No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider.” These 26 words put free speech decisions into private hands, effectively immunizing platforms from legal liability for all content that does not violate federal law, such as child pornography. All the checks and balances that govern traditional media would not apply; with no libel risk there were, in effect, no rules.

Moderation’s initially haphazard, laissez-faire culture has its roots here. Before companies understood how a lack of moderation could impede growth and degrade brands and community, moderators were volunteers, unpaid and virtually invisible. At AOL, moderation was managed by a Community Leader program composed of users who had previously moderated chat rooms and reported “offensive” content. They were tasked with building “communities” in exchange for having their subscription fees waived.

By 2000, companies had begun to take a more proactive approach. CompuServe, for instance, developed one of the earliest “acceptable use” policies barring racist speech, after a user with a Holocaust revisionist stance started filling a popular forum with antisemitic commentary. In 2001, eBay banned Nazi and Ku Klux Klan memorabilia, as well as other symbols of racial, religious, and ethnic hatred. Democratized countries joined forces to take down child pornography. Palfrey calls this phase Access Denied, characterized by a concerted effort across the industry and the government to ban unappetizing content.

Over the next decade, companies and governments honed these first-generation moderation tools, refining policies and collapsing the myth that cyberspace existed on a separate plane from real life, free from the realities of regulation, law, and policy.

This was the era in which Mora-Blanco began her career at YouTube. Trying to bring order to a digital Wild West one video at a time was grueling. To safeguard other employees from seeing the disturbing images in the reported content they were charged with reviewing, her team was sequestered in corner offices; their rooms were kept dark and their computers were equipped with the largest screen protectors on the market.

Members of the team quickly showed signs of stress — anxiety, drinking, trouble sleeping — and eventually managers brought in a therapist. As moderators described the images they saw each day, the therapist fell silent. The therapist, Mora-Blanco says, was “quite literally scared.”

Around 2008, she recalls, YouTube expanded moderation to offices in Dublin, Ireland and Hyderabad, India. Suicide and child abuse don’t follow a schedule, and employing moderators across different time zones enabled the company to provide around-the-clock support. But it also exposed a fundamental challenge of moderating content across cultures.

Soon after the expansion, Mora-Blanco found herself debating a flagged video that appeared to show a group of students standing in a circle with two boys brawling in the middle. Moderation guidelines prohibited content showing minors engaged in fighting for entertainment and were written to be as globally applicable and translatable as possible. But interpretation of those guidelines, she discovered, could be surprisingly fluid. Moderators in India let the flagged video remain live because to them, the people in the video were not children, but adults. When the video was flagged again, it escalated to the Silicon Valley team, as most escalations reportedly still do. To Mora-Blanco, the video clearly violated YouTube’s policy. “I didn’t know how to more plainly describe what I was seeing,” she said. The video came down.

Cultural perspective is a constant and pervasive issue, despite attempts to make “objective” rules. One former screener from a major video-sharing platform, who participated in Roberts’ research and spoke with us on the condition of anonymity, recounted watching videos of what he characterized as extreme violence — murder and beatings — coming from Mexico.

The screener was instructed to take down videos depicting drug-related violence in Mexico, while those of political violence in Syria and Russia were to remain live. This distinction angered him. Regardless of the country, people were being murdered in what were, in effect, all civil wars. “[B]asically,” he said, “our policies are meant to protect certain groups.” Before he left, he quietly began removing blackface videos he encountered. At the time, YouTube considered blackface non-malicious.

When Dave Willner arrived at Facebook in 2008, the team there was working on its own “one-pager” of cursory, gut-check guidelines. “Child abuse, animal abuse, Hitler,” Willner recalls. “We were told to take down anything that makes you feel bad, that makes you feel bad in your stomach.” Willner had just moved to Silicon Valley to join his girlfriend, then Charlotte Carnevale, now Charlotte Willner, who had become head of Facebook’s International Support Team. Over the next six years, as Facebook grew from fewer than 100 million users to well over a billion, the two worked side by side, developing and implementing the company’s first formal moderation guidelines.

“We were called The Ninjas,” he said, “mapping the rabbit hole.” Like Mora-Blanco, Willner described how he, Charlotte, and their colleagues sometimes laughed about their work, so that they wouldn’t cry. “To outsiders, that sounds demented,” he said.

Just like at YouTube, the subjectivity of Facebook’s moderation policy was glaring. “Yes, deleting Hitler feels awesome,” Willner recalls thinking. “But, why do we delete Hitler? If Facebook is here to make the world more open,” he asked himself, “why would you delete anything?” The job, he says, was “to figure out Facebook’s central why.”

For people like Dave and Charlotte Willner, the questions are as complex now as they were a decade ago. How do we understand the context of a picture? How do we assign language meaning? Breaking the code for context — nailing down the ineffable question of why one piece of content is acceptable but a slight variation breaks policy — remains the holy grail of moderation.

In the absence of a perfectly automated system, Willner said, there are two kinds of human moderation. One set of decisions relies on observable qualities that involve minimal interpretation. For example, a moderator can see if a picture contains imagery of a naked toddler in a dimly lit hotel room — clearly a violation. In these cases, trained moderators can easily lean on detailed guidance manuals. The other kind of decision, however, is more complex, and interpretation is central. Recognizing bullying, for example, depends on understanding relationships and context, and moderators are not privy to either.

“For instance, let’s say that I wear a green dress to work one day and everyone makes fun of me for it,” explained Facebook’s Monika Bickert when we talked. “Then when I get home, people have posted on my Facebook profile pictures of green frogs or posts saying, ‘I love your dress!’ If I report those posts and photos to Facebook, it won’t necessarily be clear to the content reviewers exactly what is going on. We try to keep that in mind when we write our policies, and we also try to make sure our content reviewers consider all relevant content when making a decision.”

Created in 2009, Facebook’s first “abuse standards” draft ran 15,000 words and was, Willner said, “an attempt to be all-encompassing.”

While Willner is bound by an NDA to not discuss the document, a leaked Facebook cheat sheet used by freelance moderators made news in 2012. (Willner says the sheet included some minor misinterpretations and errors.) “Humor and cartoon humor is an exception for hate speech unless slur words are being used or humor is not evident,” read one rule. “Users may not describe sexual activity in writing, except when an attempt at humor or insult,” read another. Moderators were given explicit examples regarding the types of messages and photos described and told to review only the reported content, not unreported adjacent content. As in US law, content referring to “ordinary people” was treated differently than “public figures.” “Poaching of animals should be confirmed. Poaching of endangered animals should be escalated.” Things like “urine, feces, vomit, semen, pus and earwax,” were too graphic, but cartoon representations of feces and urine were allowed. Internal organs? No. Excessive blood? OK. “Blatant (obvious) depiction of camel toes and moose knuckles?” Prohibited.

The “Sex and Nudity” section perhaps best illustrates the regulations’ subjectivity and cultural biases. The cheat sheet barred naked children, women’s nipples, and “butt cracks.” But such images are obviously not considered inappropriate in all settings and the rules remain subject to cultural contexts.

In 2012, for instance, when headlines were lauding social media for its role in catalyzing the Arab Spring, a Syrian protester named Dana Bakdounes posted a picture of herself with a sign advocating for women’s equal rights. In the image Bakdounes is unveiled, wearing a tank top. A Facebook moderator removed the photo and blocked the administrators of an organization she supported, Uprising of Women in the Arab World. Her picture had been reported by conservatives who believed that images of women, heads uncovered and shoulders bare, constituted obscenity. Following public protest, Facebook quickly issued an apology and “worked to rectify the mistake.”

The issue of female nudity and culturally bound definitions of obscenity remains thorny. Last spring, Facebook blocked a 1909 photograph of an indigenous woman with her breasts exposed, a violation of the company’s ever-evolving rules about female toplessness. In response, the Brazilian Ministry of Culture announced its intention to sue the company. Several weeks later, protesters in the United States, part of the #SayHerName movement, confronted Facebook and Instagram over the removal of photographs in which they had used nudity to highlight the plight of black women victimized by the police.

In early March, at a packed panel at South by Southwest called “How Far Should We Go To Protect Hate Speech Online?” Jeffrey Rosen, now president of the National Constitution Center, was joined by Juniper Downs, head of public policy at Google / YouTube, and Facebook’s Monika Bickert, among others. At one point, midway through the panel, Rosen turned to Downs and Bickert, describing them as “the two most powerful women in the world when it comes to free speech.” The two demurred. Entire organizations, they suggested, make content decisions.

Not exactly. Content management is rarely dealt with as a prioritized organizational concern — centrally bringing together legal, customer service, security, privacy, safety, marketing, branding, and personnel to create a unified approach. Rather, it is still usually shoehorned into structures never built for a task so complex.

The majority of industry insiders and experts we interviewed described moderation as siloed off from the rest of the organization. Few senior-level decision-makers, they said — whether PR staff, lawyers, privacy and security experts, or brand and product managers — experience the material in question first-hand. One content moderator, on condition of anonymity, said her colleagues and supervisors never saw violent imagery because her job was to remove the most heinous items before they could. Instead, she was asked to describe it. “I watched people’s faces turn green.”

Joi Podgorny is a former vice president at ModSquad, which provides content moderation to a range of marquee clients, from the State Department to the NFL. Now a digital media consultant, she says founders and developers not only resist seeing the toxic content, they resist even understanding the practice of moderation. Typically cast off as “customer-service,” moderation and related work remains a relatively low-wage, low-status sector, often managed and staffed by women, which stands apart from the higher-status, higher-paid, more powerful sectors of engineering and finance, which are overwhelmingly male. “I need you to look at what my people are looking at on a regular basis,” she said. “I want you to go through my training and see this stuff [and] you’re not going to think it’s free speech. You’re going to think it’s damaging to culture, not only for our brand, but in general.”

Brian Pontarelli, CEO of the moderation software company Inversoft, echoes the observation. Many companies, he told us, will not engage in robust moderation until it will cost them not to. “They sort of look at that as like, that’s hard, and it’s going to cost me a lot of money, and it’s going to require a lot of work, and I don’t really care unless it causes me to lose money,” he said. “Until that point, they can say to themselves that it’s not hurting their revenue, people are still spending money with us, so why should we be doing it?”

When senior executives do get involved, they tend to parachute in during moments of crisis. In the wake of last December’s San Bernardino shootings, Eric E. Schmidt, executive chairman at Google, called on industry to build tools to reduce hate, harm, and friction in social media, “sort of like spell-checkers, but for hate and harassment.”

Likewise the words of former Twitter CEO Dick Costolo in an internal memo, published by The Verge in February 2015. “We lose core user after core user by not addressing simple trolling issues that they face every day,” he wrote, concluding, “We’re going to start kicking these people off right and left and making sure that when they issue their ridiculous attacks, nobody hears them.” As if it were so simple.

Mora-Blanco worked for five years at Twitter, where she said, “there was a really strong cultural appreciation for the Trust & Safety team,” responsible for moderation related to harassment and abuse. She joined the company in 2010 and soon became manager of the User Safety Policy Team, where she developed policies concerning abuse, harassment, suicide, child sexual exploitation, and hate speech. By the end of her tenure in early 2015, she had moved to the Public Policy team, where she helped to launch several initiatives to prevent hate speech, harassment, and child sexual exploitation, and to promote free speech.

“It was embedded within the company, that the Trust & Safety team were doing important work,” she said. Even so, she found, her team didn’t have the tools to be effective. “The Trust & Safety teams had lots of ideas on how to implement change,” she said, “but not the engineering support.” Prior to 2014, according to Mora-Blanco and former Twitter engineer Jacob Hoffman-Andrews, there was not a single engineer dedicated to addressing harassment at Twitter. During the past two years, the company has taken steps to reconcile the need to stem abuse with its commitment to the broadest interpretation of unmoderated free speech, publicly announcing the formation of an advisory council and introducing training sessions with law enforcement.

According to a source close to the moderation process at Reddit, the climate there is far worse. Despite the site’s size and influence — attracting some 4 to 5 million page views a day — Reddit has a full-time staff of only around 75 people, leaving Redditors to largely police themselves, following a “reddiquette” post that outlines what constitutes acceptable behavior. Leaving users almost entirely to their own devices has translated into years of high-profile catastrophes involving virtually every form of objectionable content — including entire toxic subreddits such as /r/jailbait, /r/creepshots, /r/teen_girls, /r/fatpeoplehate, /r/coontown, /r/niggerjailbait, /r/picsofdeadjailbait, and a whole category for anti-black Reddits called the “Chimpire,” which flourished on the platform.

In the wake of public outrage over CelebGate — the posting on Reddit of hacked private photos of more than 100 women celebrities — a survey of more than 16,000 Redditors found that 50 percent of those who wouldn’t recommend Reddit cited “hateful or offensive content or community” as the reason why. After the survey was published in March 2015, the company announced, “we are seeing our open policies stifling free expression; people avoid participating for fear of their personal and family safety.” Alexis Ohanian, a Reddit co-founder, and other members of the Reddit team, described the company’s slow response to CelebGate as “a missed chance to be a leader” on the issue of moderating nonconsensual pornography. Two months later, Reddit published one of its first corporate anti-harassment moderation policies, which prohibited revenge porn and encouraged users to email moderators with concerns. Reddit includes a report feature that is routed anonymously to volunteer moderators whose ability to act on posts is described in detail on the site.

But the survey also laid bare the philosophical clash between the site’s commitment to open expression, which fueled early growth, and the desire for limits among the users who may fuel future growth: 35 percent of complaints from “extremely dissatisfied users” were due to “heavy handed moderation and censorship.” The company continues to grapple with the paradox that to expand, Reddit (and other platforms) will likely have to regulate speech in ways that alienate a substantive percentage of their core customer base.

When asked in the summer of 2015 about racist subreddits that remained in place despite the company’s new policies, CEO Steve Huffman said the content is “offensive to many, but does not violate our current rules for banning,” and clarified that the changes were not “an official update to our policy.” By then, as Slate tech columnist David Auerbach wrote, Reddit was widely seen as “a cesspool of hate in dire need of repair.” Within weeks, Reddit announced the removal of a list of racist and other “communities that exist solely to annoy other Redditors, [and] prevent us from improving Reddit, and generally make Reddit worse for everyone else.”

The sharp contrast between Facebook, with its robust and long-standing Safety Advisory Board, and Reddit, with its skeletal staff and dark pools of offensive content, offers up a vivid illustration of how content moderation has evolved in isolated ways within individual corporate enclaves. The fragmentation means that content banned on one platform can simply pop up on another, and that trolling can be coordinated so that harassment and abuse that appear minor on a single platform are amplified by appearing simultaneously on multiple platforms.

A writer who goes by Erica Munnings, and who asked that we not use her real name out of fear of retaliation, found herself on the receiving end of one such attack, which she describes as a “high-consequence game of whack-a-mole across multiple social media platforms for days and weeks.” After she wrote a feminist article that elicited conservative backlash, a five-day “Twitter-flogging” ensued. From there, the attacks moved to Facebook, YouTube, Reddit, and 4chan. Self-appointed task forces of Reddit and 4chan users published her address and flooded her professional organization with emails, demanding that her professional license be rescinded. She shut down comments on her YouTube videos. She logged off Twitter. On Facebook, the harassment was debilitating. To separate her personal and professional lives, she had set up a separate Facebook page for her business. However, user controls on such pages are thin, and her attackers found their way in. “I couldn’t get one-star reviews removed or make the choice as a small business not to have ‘Reviews’ on my page at all,” she said. “Policies like this open the floodgates of internet hate and tied my hands behind my back. There was no way I could report each and every attack across multiple social media platforms because they came at me so fast and in such high volume. But also, it became clear to me that when I did report, no one responded, so there really was no incentive to keep reporting. That became yet another costly time-sink on top of deleting comments, blocking people, and screen-grabbing everything for my own protection. Because no one would help me, I felt I had no choice but to wait it out, which cost me business, and income.”

Several content moderation experts point to Pinterest as an industry leader. Microsoft’s Tarleton Gillespie, author of the forthcoming Free Speech in the Age of Platform, says the company is likely doing the most of any social media company to bridge the divide between platform and user, private company and the public. The platform’s moderation staff is well-funded and supported, and Pinterest is reportedly breaking ground in making its processes transparent to users. For example, Pinterest posts visual examples to illustrate the site’s “acceptable use policy” in an effort to help users better understand the platform’s content guidelines and the decisions moderators make to uphold them.

When we met with Charlotte Willner, now Pinterest’s Safety Manager, the film Fifty Shades of Grey had just been released. She and her team were hustling to develop new BDSM standards. “We realized,” she later explained by email, “that we were going to need to figure out standards for rape and kidnapping fantasy content, which we hadn’t seen a lot of but we began to see in connection with the general BDSM influx.”

The calls were not easy, but it was clear that her team was making decisions on a remarkably granular level. One user was posting fetish comments about cooking Barbie-size women in a stew pot. Should this be allowed? Why not? The team had to decipher whether it was an actual threat. A full-size woman can’t fit into a stew pot, the team figured, so the content was unlikely to cause real-world harm. Ultimately, they let the posts stand — that is, according to Willner, until the “stew pot guy” began uploading more explicit content that clearly violated Pinterest’s terms and her team removed his account.

On that same South by Southwest panel, Rosen expressed concerns about both corporate regulation of free speech and newly stringent European Union regulations such as the Right to be Forgotten. “Censorship rules that have a lower standard than the First Amendment are too easily abused by governments,” he said. Yet he offered some warm words of praise, saying that companies such as Google, Facebook, and Twitter are “trying to thread an incredibly delicate and difficult line” and “the balance that they’re striking is a sensible one” given the pressures they face. He added, “Judges and regulators and even really smart, wonderful Google lawyers” are doing “about as good of a job with this unwelcome task as could be imagined.” It was a striking remark, given Google’s confirmation, only days earlier, that it had hired controversial 4chan founder Christopher Poole. Prominent industry critic Shanley Kane penned an outraged post describing Poole as responsible “for a decade + of inculcating one of the most vile, violent and harmful ‘communities’ on the Internet.” For marginalized groups, she wrote, the decision “sends not only a ‘bad message,’ but a giant ‘fuck you.’”

Several sources told us that industry insiders, frustrated by their isolation, have begun moving independently of their employers. Dave Willner, and others who spoke on the condition of anonymity, told us that 20 or 30 people working in moderation have started meeting occasionally for dinner in San Francisco to talk informally about their work.

One front-line expert, Jennifer Puckett, has worked in moderation for more than 15 years and now serves as social reputation manager heading up the Digital Safety Team at Emoderation. She believes that moderation, as an industry, is maturing. On the one hand, the human expertise is growing, making that tableful of young college grads at YouTube seem quaint. “People are forming their college studies around these types of jobs,” Mora-Blanco says. “The industry has PhD candidates in internship roles.” On the other hand, efforts to automate moderation have also advanced.

Growing numbers of researchers are developing technology tooled to understand user-generated content, with companies hawking unique and proprietary analytics and algorithms that attempt to measure meaning and predict behavior. Adaptive Listening technologies, which are increasingly capable of providing sophisticated analytics of online conversations, are being developed to assess user context and intent.

In May 2014, Dave Willner and Dan Kelmenson, a software engineer at Facebook, patented a 3D-modeling technology for content moderation designed around a system that resembles an industrial assembly line. “It’s tearing the problem [of huge volume] into pieces to make chunks more comprehensible,” Willner says. First, the model identifies a set of malicious groups — say neo-Nazis, child pornographers, or rape promoters. The model then identifies users who associate with those groups through their online interactions. Next, the model searches for other groups associated with those users and analyzes those groups “based on occurrences of keywords associated with the type of malicious activity and manual verification by experts.” This way, companies can identify additional or emerging malicious online activity. “If the moderation system is a factory, this approach moves what is essentially piecework toward assembly,” he said. “And you can measure how good the system is.”
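The patent itself is not reproduced here, but the three steps Willner describes can be sketched roughly in code. What follows is a minimal illustration, not Facebook's system: the function name, the data layout (a dictionary of users to the groups they belong to), the sample data, and the `min_hits` threshold are all assumptions made for the example.

```python
from collections import Counter


def expand_malicious_groups(seed_groups, memberships, group_text, keywords, min_hits=2):
    """Hypothetical sketch of the group-expansion steps described above.

    seed_groups: set of group names already identified as malicious
    memberships: dict mapping user -> set of group names
    group_text:  dict mapping group name -> sample of its text
    keywords:    set of lowercase keywords tied to the malicious activity
    """
    # Step 1: find users who associate with any known malicious group.
    linked_users = {u for u, groups in memberships.items() if groups & seed_groups}

    # Step 2: collect the other groups those users belong to.
    candidates = set()
    for user in linked_users:
        candidates |= memberships[user] - seed_groups

    # Step 3: count keyword occurrences in each candidate group's text.
    flagged = {}
    for group in sorted(candidates):
        words = Counter(group_text.get(group, "").lower().split())
        hits = sum(words[k] for k in keywords)
        if hits >= min_hits:
            flagged[group] = hits

    # The result is only a queue for manual verification by experts.
    return flagged


# Made-up data for illustration only.
memberships = {
    "alice": {"seed_a", "gardening"},
    "bob": {"seed_a", "hidden_b"},
    "carol": {"cooking"},
}
group_text = {
    "gardening": "tomato soil compost",
    "hidden_b": "badword badword meetup",
    "cooking": "badword badword recipes",
}
flagged = expand_malicious_groups({"seed_a"}, memberships, group_text, {"badword"})
print(flagged)  # prints {'hidden_b': 2}
```

In this toy run, the "cooking" group is never examined even though its text contains the keyword, because none of its members is linked to a seed group. That narrowing is the point of the assembly-line idea: people review a short, ranked list instead of everything.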

Working with Microsoft, Hany Farid, the chair of Computer Science at Dartmouth College, developed something called PhotoDNA, which allows tech companies to automatically detect, remove, and report the presence of child exploitation images on their networks.

When we spoke, Farid described the initial resistance to PhotoDNA, how industry insiders said the problem of child exploitation was too hard to solve. Yes, it’s horrible, he recalled everyone — executives, attorneys, engineers — saying, but there’s so much content. Civil liberties groups, he recalls, also balked at an automated moderation system.

Talk to those people today, he said, and they’ll tell you how successful PhotoDNA has been. PhotoDNA processes an image every two milliseconds and is highly accurate. First, he explains, known child exploitation images are identified by NCMEC. Then PhotoDNA extracts from each image a numeric signature that is unique to that image, “like your human DNA is to you.” Whenever an image is uploaded, whether to Facebook or Tumblr or Twitter, and so on, he says, “its photoDNA is extracted and compared to our known CE images. Matches are automatically detected by a computer and reported to NCMEC for a follow-up investigation.” He describes PhotoDNA as “agnostic,” saying, “There is nothing specific to child exploitation in the technology.” He could just as easily be looking for pictures of cats. The tricky part of content moderation is not identifying content, he says, but the “long, hard conversations” — conducted by humans, not machines — necessary for reaching the tough decisions about what constitutes personal or political speech. Farid is now working with tech companies and nonprofit groups to develop similar technology that will identify extremism and terrorist threats online — whether expressed in speech, image, video, or audio. He expects the program to launch within months, not years.
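PhotoDNA's actual algorithm is proprietary and is not described in this piece, so the sketch below swaps in a much simpler, well-known technique, an "average hash," purely to illustrate the extract-a-signature-and-compare workflow Farid outlines. The function names, the tiny 4-by-4 grayscale grids, and the `max_distance` tolerance are assumptions made for the example, and nothing in the code is specific to any kind of image.

```python
def average_hash(pixels):
    """Toy signature: one bit per pixel, set when the pixel is brighter
    than the image's mean. This is NOT PhotoDNA, only a stand-in."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def find_match(upload, known_signatures, max_distance=1):
    """Return the label of the first known signature within max_distance
    bits of the upload's signature, or None when nothing matches."""
    signature = average_hash(upload)
    for label, known in known_signatures.items():
        if hamming(signature, known) <= max_distance:
            return label
    return None


# Made-up 4x4 grayscale grids standing in for real images.
original = [[10, 200, 10, 200], [200, 10, 200, 10],
            [10, 200, 10, 200], [200, 10, 200, 10]]
recompressed = [[12, 198, 9, 203], [197, 11, 204, 8],
                [13, 201, 10, 199], [202, 9, 198, 12]]
unrelated = [[200, 200, 10, 10], [200, 200, 10, 10],
             [200, 200, 10, 10], [200, 200, 10, 10]]

known = {"known-1": average_hash(original)}
print(find_match(recompressed, known))  # prints known-1
print(find_match(unrelated, known))     # prints None
```

The slightly altered copy still matches because its signature is identical to the original's, while the unrelated grid differs in eight of sixteen bits. As Farid notes, matching of this kind is the easy part; deciding which images belong on the known list is left to people.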

While tech solutions are rapidly emerging, the cultural ones are slower in coming. Emily Laidlaw, assistant professor of law at the University of Calgary and author of Regulating Speech in Cyberspace, calls for “a clarification of the applicability of existing laws.” For starters, she says, Section 230 of the 1996 Communications Decency Act needs immediate overhaul. Companies, she argues, should no longer be entirely absolved of liability for the content they host.

For more than five years, Harvard’s Berkman Center for Internet and Society has pushed for industry-wide best practices. Their recommendations include corporate transparency, consistency, clarity, and a mechanism for customer recourse. Other civil society advocates call for corporate grievance mechanisms that are accessible and transparent in accordance with international human rights law, or call on corporations to engage in public dialogue with such active stakeholders as the Anti-Defamation League, the Digital Rights Foundation, and the National Network to End Domestic Violence. “What we do is informed by external conversations that we have,” explained Facebook’s Bickert in early March. “Every day, we are in conversations with groups around the world… So, while we are responsible for overseeing these policies and managing them, it is really a global conversation.”

Some large established companies like YouTube, Pinterest, Emoderation, Facebook, and Twitter are beginning to make headway in improving moderation practices, using both tech and human solutions.

In the eight months since CEO Dick Costolo’s departure, Twitter has reached out to users and advocates in an effort to be more responsive. In early February, Patricia Cartes, Twitter’s head of global policy outreach, announced the formation of a Trust & Safety Council, a multidisciplinary advisory board composed of 40 organizations and experts. Still, members of bodies such as this one, and Facebook’s Safety Advisory Board, established in 2009, operate under NDAs, meaning the conversations taking place in these places remain behind-the-scenes. In 2010, Google founded Google Ideas (now Jigsaw), a multi-disciplinary think tank to tackle challenges associated with defending against global security threats and protecting vulnerable populations. Meanwhile, a virtual cottage industry of internet helplines the world over has cropped up, such as Zoë Quinn’s Crash Override Network, an online abuse helpline for schools and “revenge porn” hotlines in the United States and the United Kingdom.

As these policy debates move forward in private, platforms are also taking steps to protect their front-line moderation workers. Medina told us that representatives from Google, Microsoft, Yahoo, Facebook, and Kik have recently attended SHIFT’s resilience trainings. Facebook’s hundreds of moderators — located in Menlo Park, Austin, Dublin, and Hyderabad — says Silver, head of Facebook’s Community Support Team, now receive regular training, detailed manuals, counseling and other support. The company brings on experts or trains specialists in specific areas such as suicide, human trafficking, child exploitation, violence against women, and terrorism. Moderator decisions on reported pieces of content are regularly audited, she says, to ensure that moderation is consistent and accurately reflects guidelines. “We have one set of content standards for the entire world,” explains Facebook’s Bickert. “This helps our community be truly global, because it allows people to share content across borders. At the same time, maintaining one set of standards is challenging because people around the world may have different ideas about what is appropriate to share on the Internet.”

Many US-based companies, however, continue to consign their moderators to the margins, shipping their platforms’ digital waste to “special economic zones” in the Global South. As Roberts recounts in her paper “Digital Refuse,” these toxic images trace the same routes used to export the industrial world’s physical waste — hospital hazardous refuse, dirty adult diapers, and old model computers. Without visible consequences here and largely unseen, companies dump child abuse and pornography, crush porn, animal cruelty, acts of terror, and executions — images so extreme those paid to view them won’t even describe them in words to their loved ones — onto people desperate for work. And there they sit in crowded rooms at call centers, or alone, working off-site behind their screens and facing cyber-reality, as it is being created. Meanwhile, each new startup begins the process, essentially, all over again.

And even industry leaders continue to rely on their users to report and flag unacceptable content. This reliance, says Nicole Dewandre, an advisor to the European Commission on Information and Communication Technology policy, is “an exhausting laboring activity that does not deliver on accountability.” The principle of counter speech, by which users are expected to actively contradict hateful or harmful messages, firmly puts the burdens and risks of action on users, not on the money-making platforms who depend on their content. Perhaps for that reason, counter speech has become an industry buzzword. Susan Benesch, founder of the Dangerous Speech Project, which tracks inflammatory speech and its effects, notes that while counter speech is an important tool, “it cannot be the sole solution. We don’t yet understand enough about when it’s effective, and can often put the counter-speaker at risk of attack, online or offline.”

Sarah T. Roberts, the researcher, cautions that “we can’t lose sight of the baseline.” The platforms, she notes, “are soliciting content. It’s their solicitation that invites people to upload content. They create the outlet and the impetus.” If moderators are, in Dave Willner’s estimation, platforms’ emotional laborers, users are, in the words of labor researcher Kylie Jarrett, their “digital housewives” — volunteering their time and adding value to the system while remaining unpaid and invisible, compensated only through affective benefits. The question, now, is how can the public leverage the power inherent in this role? Astra Taylor, author of The People’s Platform, says, “I’m struck by the fact that we use these civic-minded metaphors, calling Google Books a ‘library’ or Twitter a ‘town square’ — or even calling social media ‘social’ — but real public options are off the table, at least in the United States.” Though users are responsible for providing and policing vast quantities of digital content, she points out, we then “hand the digital commons over to private corporations at our own peril.”

On January 19th, 2015, Mora-Blanco packed up her desk and left Twitter.

From her last position in Public Policy, she no longer had to screen violent images day after day and, after months of laying groundwork, she told us, “things had started to shift and move along lines I’d always wanted.” She and her coworkers — from Legal, Safety, Support, Public Policy, Content, “all the pieces of the puzzle all across the platform” — had started meeting weekly to tackle online harassment and threat. “We got to sit in the room together,” she said, “and everyone felt really connected and that we could move forward together.”

But she’d finally earned enough money to take a year to train as a hair stylist. Just as important, she felt she could finally leave without betraying her colleagues, especially those working on the front lines.

Today, Mora-Blanco is studying cosmetology at the Cinta Aveda Institute in downtown San Francisco. She loves her new work, she told us, because it allows her the freedom to innovate. “I love to think big, but I am really passionate about being creative.”

“What are you doing here?” her instructor sometimes asks. “You could go out and make so much money.”

Occasionally she takes an online safety consulting gig — she hasn’t completely ruled out jumping back in. “But things would need to change,” she said. “I’d like to be a part of that change, but I need to be convinced that the attitude toward this advocacy work is going to be treated seriously, that the industry is going to be different.”

Correction: A previous draft of this piece suggested Google acquired YouTube in October of 2005. In fact, that acquisition took place in October of 2006.

This story was reported in partnership with The Investigative Fund at The Nation Institute, now known as Type Investigations.