Peering into the Black Box: Investigating the efficacy of a popular social media monitoring tool using AI

By: Ari Sen

Introduction

Across the country millions of students are returning to campuses or setting foot on them for the very first time. This new period in students’ lives brings lots of emotions — excitement and anxiety, fear and joy, connection and loneliness.

For the past decade many have turned to social media accounts to express these feelings to their circle of friends, family and peers. But this trusted group may not be the only ones watching — across the country dozens of colleges have purchased a technology called Social Sentinel, in what they say is an attempt to keep the worst from happening.


Background

Social Sentinel is a service sold to schools that scans social media messages for threats of violence or self-harm. The company has said in the past that it scans more than a billion social media posts every day against the more than 450,000 words and phrases in its “Language of Harm.”

The service started its life as Campus Sentinel, an app which tracked crime statistics for campuses across the country. The app was created by Gary Margolis and Steven Healy, two former campus police chiefs who formed a security consulting company together in 2008. Sometime in late 2014, the two men shifted the app's purpose away from crime stats and towards social media monitoring, rebranding it “Social Sentinel.”

As far as I can tell from the company's patents, documents I've obtained and news articles, Social Sentinel’s system consists of two main pieces:

  1. An AI system that detects potentially threatening messages posted to social media.
  2. A method which associates a flagged tweet with a client school.

My reporting suggests that Social Sentinel has been used by at least 37 colleges in the past six years. Some of the country’s largest and most well-known colleges have used Social Sentinel, including UNC-Chapel Hill, the University of Virginia, Michigan State University, MIT and Arizona State University.

Compared with the number of K-12 schools using the service, the number of colleges may seem insignificant, which perhaps explains why so little attention has been paid to the technology's use on college campuses. But colleges present unique dynamics that make the use of this technology potentially more concerning.

Unlike at the K-12 level, where alerts are often sent to mental health counselors or school administrators, at the college level Social Sentinel’s alerts are typically sent to campus police officers. This is interesting for two reasons.

The first is that, unlike municipal police, who answer to a chief who in turn answers to a mayor or city council, campus police are essentially unaccountable to the population they serve. A college student usually gets no vote on who the chancellor or president of their university will be, nor can they usually contest a police action before it takes place.

The second is that campus police were, quite literally, created to suppress student activism. Although the first campus police force was established at Yale in 1894, most other universities didn’t follow suit until the late 1960s and early 1970s. According to historians, these departments were largely formed to quash student protests against the ongoing war in Vietnam.

Reporting Hypotheses

My hypotheses are as follows:

  1. Social Sentinel is used by campus police not just for its stated purpose of preventing suicides and shootings, but also for suppressing protests and activism.
  2. Social Sentinel is not an effective tool for preventing suicides and shootings on campuses, or at the very least is not as effective as the company claims.

In this project I will mainly be focusing on the second hypothesis.

Data

My data is a collection of 1,236 tweets, nearly 400 of which were flagged by Social Sentinel as potential threats. The flagged tweets were gathered by Peter Aldhous and Lam Vo for their 2019 story “Your Dumb Tweets Are Getting Flagged To People Trying To Stop School Shootings.” The unflagged tweets were scraped from Twitter using the Twint library in Python. These tweets came from the same users over the same time period as the flagged tweets, plus or minus a week for users with only one flagged tweet.
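A minimal sketch of that kind of Twint query, with a hypothetical username and date window standing in for a real flagged user:

```python
# Sketch: collect one user's tweets within a date window using Twint.
import twint

c = twint.Config()
c.Username = "example_user"   # hypothetical; in practice, a user with a flagged tweet
c.Since = "2019-02-01"        # start of the window around the flagged tweet
c.Until = "2019-03-01"        # end of the window
c.Store_csv = True            # write results to disk rather than just printing
c.Output = "unflagged_tweets.csv"

twint.run.Search(c)
```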

Method

To test my reporting hypotheses, I generated embeddings for every tweet using BERTweet and then clustered them using k-means, with a k of 2.

BERT is a machine learning language model developed by Google, which generates an "embedding," or a long list of numbers, for each unit of text by considering the words surrounding it. BERTweet is a version of this model tuned specifically for tweets.

K-means clustering is a common statistical method for discovering hidden patterns in data. In this case I was trying to see whether the tweets Social Sentinel flagged would cluster together.
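A condensed sketch of this pipeline is below. It assumes the publicly available vinai/bertweet-base checkpoint on Hugging Face, and the two placeholder tweets stand in for the real corpus:

```python
# Sketch: embed each tweet with BERTweet, then cluster the embeddings
# with k-means (k = 2).
import numpy as np
import torch
from sklearn.cluster import KMeans
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base", normalization=True)
model = AutoModel.from_pretrained("vinai/bertweet-base")
model.eval()

tweets = ["so excited for move-in day!", "i'm going to lose it tomorrow"]  # placeholders

vectors = []
with torch.no_grad():
    for text in tweets:
        inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
        hidden = model(**inputs).last_hidden_state           # one vector per token
        vectors.append(hidden.mean(dim=1).squeeze().numpy()) # mean-pool to one vector
embeddings = np.vstack(vectors)

# Split the embeddings into two clusters and inspect the assignments.
kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(embeddings)
print(kmeans.labels_)
```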

To assess this visually, I took each tweet embedding (that long list of numbers) and performed dimensionality reduction using a method called t-SNE. This essentially means I had the computer do some math to reduce the long list of numbers to just two, allowing me to plot each tweet as a point on an X,Y coordinate plane.
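A rough version of that step, reusing embeddings and kmeans from the sketch above:

```python
# Sketch: reduce each embedding to two numbers with t-SNE and plot the points.
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Perplexity must be smaller than the number of samples; with the real
# corpus of over a thousand tweets, the default of 30 would be fine.
coords = TSNE(
    n_components=2,
    perplexity=min(30, len(tweets) - 1),
    random_state=0,
).fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1], c=kmeans.labels_, s=10)
plt.xlabel("t-SNE dimension 1")
plt.ylabel("t-SNE dimension 2")
plt.show()
```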

I also went through every tweet and hand-labeled whether I thought it was threatening, then compared Social Sentinel's labels to mine. I used these hand labels as a baseline to assess the performance of my clustering.

I colored the following plots:

  1. Based on the cluster labels
  2. Based on my human annotated labels
  3. Based on Social Sentinel's labels

The plots are reproduced below:



[t-SNE plot colored by cluster labels]

[t-SNE plot colored by my labels]

[t-SNE plot colored by Social Sentinel's labels]

I also used topic modeling to investigate the salience of groups of words in the corpus.
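A bare-bones sketch of this kind of topic model, here using scikit-learn's LDA implementation with an arbitrary five topics and placeholder text (not necessarily the exact setup I used):

```python
# Sketch: LDA topic model over the tweet corpus; the topic count is arbitrary.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

corpus = ["placeholder tweet about finals week", "placeholder tweet about a party"]

vectorizer = CountVectorizer(stop_words="english", max_features=5000)
doc_term = vectorizer.fit_transform(corpus)

lda = LatentDirichletAllocation(n_components=5, random_state=0).fit(doc_term)

# Show the ten most salient words for each topic.
words = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [words[j] for j in topic.argsort()[::-1][:10]]
    print(f"Topic {i}: {', '.join(top)}")
```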

Analysis

The results from the clustering suggest that neither my system nor Social Sentinel's method achieves very high accuracy (the share of guesses that match my hand labels). The company's model did perform better on this metric, though, scoring 0.696 to my 0.574 (a perfect score is 1.0).

However, my method has far fewer false positives, and thus much higher precision: 0.310 for my system versus 0.065 for Social Sentinel.
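These metrics are straightforward to compute; a sketch using scikit-learn, with short placeholder 0/1 arrays in place of the real labels:

```python
# Sketch: score a system's flags against my hand labels.
from sklearn.metrics import accuracy_score, precision_score

hand_labels   = [1, 0, 0, 1, 0, 0]  # 1 = I judged the tweet threatening
system_labels = [1, 0, 1, 0, 0, 0]  # 1 = the system flagged the tweet

print("accuracy: ", accuracy_score(hand_labels, system_labels))   # correct / total
print("precision:", precision_score(hand_labels, system_labels))  # true positives / flagged
```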

This should be a concerning finding for the company and for any school that uses this technology, given that I built my system in only a few hours and had far less data to work with. I didn't even attempt to build a true classifier, given the small number of tweets I had (small by machine learning standards, at least).

Why it matters

In emails to current and potential client schools, Social Sentinel has repeatedly claimed that it has significantly reduced its false positives.

My analysis suggests those claims are at best dubious and at worst outright fabrications. The system, as evaluated here, is thus likely a waste of both money and precious policing and mental health resources.

This reporting also raises the question: if it isn't threats of suicide and shootings the system is surfacing, what is it catching?