A study published Monday in the Proceedings of the National Academy of Sciences suggests that analyzing language from Facebook posts can predict whether a user is depressed three months before the person receives a medical diagnosis.
The work is still in very early stages, the researchers from the University of Pennsylvania and Stony Brook University cautioned. The study was based on a group of fewer than 700 users and the predictive model is only moderately accurate. But this approach could hold promise for the future, they said.
"Depression is a really debilitating disease and we have treatments that can help people," said Raina Merchant, one of the study authors and director of the Penn Medicine Center for Digital Health. "We want to think of new ways to get people resources and identification for depression earlier."
Researchers recruited participants for the study from a hospital emergency department, asking for permission to access their electronic medical records and Facebook history. For every participant who had a diagnosis of depression in the medical records, researchers found five people who did not — creating a sample that mirrored rates of depression in the national population.
Examining more than 500,000 Facebook posts from both groups, researchers determined which words, post lengths, posting frequencies, and post timings were most associated with a depression diagnosis. They found that people with depression used the words "I," "my," and "me," as well as words such as "hurt," "tired," and "hospital," more often than others in the months preceding their diagnosis. Using indicators such as these, they built a computer model that could predict which people would receive a depression diagnosis with accuracy comparable to that of commonly used clinical surveys.
The model worked best when using Facebook data from the three months right before a participant received a depression diagnosis. When longer periods of Facebook data were included, the model became less precise.
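To make the idea of language indicators concrete, here is a minimal Python sketch of how word-based measures like those described above might be computed from one user's posts. It is not the researchers' model: the two word lists contain only the example words quoted in this article, and the function names (`tokenize`, `language_features`) are hypothetical. A real system would use many more signals, such as post length and timing, and would feed them to a trained statistical model.

```python
import re
from collections import Counter

# Illustrative word lists built only from the examples quoted in the article.
# The study's actual feature set is an unknown here; these are assumptions.
FIRST_PERSON = {"i", "my", "me"}
DISTRESS = {"hurt", "tired", "hospital"}


def tokenize(text):
    """Lowercase a post and split it into word tokens.

    Contractions such as "i'm" stay as one token, so they are not
    counted as first-person pronouns in this simple sketch.
    """
    return re.findall(r"[a-z']+", text.lower())


def language_features(posts):
    """Return the share of a user's words that fall in each word list."""
    counts = Counter()
    total = 0
    for post in posts:
        for word in tokenize(post):
            total += 1
            if word in FIRST_PERSON:
                counts["first_person"] += 1
            if word in DISTRESS:
                counts["distress"] += 1
    if total == 0:
        # No words at all: report zero rates rather than dividing by zero.
        return {"first_person": 0.0, "distress": 0.0, "words": 0}
    return {
        "first_person": counts["first_person"] / total,
        "distress": counts["distress"] / total,
        "words": total,
    }


if __name__ == "__main__":
    sample_posts = ["I hurt my back", "So tired today"]
    print(language_features(sample_posts))
```

Rates are used instead of raw counts so that people who post a lot are not automatically scored higher than people who post rarely.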
"We're at the very beginning of trying to understand how this data is sometimes people just saying hi to each other, but sometimes it can give us insight into the health of individuals and communities," Merchant said.
Depression symptoms manifest differently by race, gender, and age, and can be affected by other diseases, making the condition difficult to diagnose. Most screening tools rely on people accurately reporting their own symptoms and answering survey questions, which can be interpreted differently based on a person's cultural background and language skills.
Primary-care doctors can screen for depression, but their visits with patients are often short and months apart, leaving the discussion focused on crises and immediate concerns.
"With social media and other data, you can start to fill in those gaps," said Munmun De Choudhury, an assistant professor in Georgia Tech's School of Interactive Computing, who was not involved in the study. Her previous research has shown that Twitter data can be used to predict which users will develop symptoms of depression.
In the future, if patients shared social media data with their doctors, it could allow for more personalized care, De Choudhury said. "How is their social life? Are they getting enough sleep? A lot of these attributes you can measure using social media," she said.
Social media data could be used for public health, too, De Choudhury said. For example, the Centers for Disease Control and Prevention could figure out which communities are most at risk for suicide by examining their online posts, and then target specific prevention measures to them.
Facebook and Google have started taking steps in this direction. Facebook uses artificial intelligence to flag posts that indicate a risk of self-harm or suicide. From there, an employee can direct people to national suicide prevention resources. Google prompts users who search for depression-related terms to take a screening questionnaire.
It's encouraging to see these companies take social responsibility, De Choudhury said, but this can be only one aspect of mental health care. Predictive models built on social media are not highly accurate yet. They're also built on small samples, which means they may not work the same way in a large, diverse population.
"You shouldn't be using such an algorithm by itself at any point in time," she said. It needs to be combined with traditional screening surveys for depression and clinical expertise.
Another reason to be cautious about using social media for health care is privacy, Merchant said. "We should see this data the way we do any health data," she said. "It is the data of the patients." But it's a tricky premise, given recent high-profile data breaches, including one that compromised the accounts of millions of Facebook users.
There are also concerns that social media do more than reflect one's mental health. Some studies have shown that those with greater social media use are more likely to be depressed or have eating disorders. But other studies show social media can be helpful in connecting people to resources and peer support.