Nosier than a Nosy Barista? An Ethical Test for Digital Phenotyping Algorithms

By Robert Thorstad

More than half of US adults use social media at least once a day (Perrin & Anderson, 2019), which means we record a staggering amount of our lives online. Before reading on, please pause and consider one implication of this social media use: what could someone learn about you by reading your social media posts? And what if that reader were not a person but an algorithm?

We’re starting to learn that algorithms can infer many things about you from your social media posts. In particular, I and others recently showed that machine learning models can use your social media posts to predict, with moderate accuracy, whether you have a mental illness like depression or anxiety (Thorstad & Wolff, in press; Eichstaedt et al., 2018; De Choudhury et al., 2013). To show this, Dr. Phillip Wolff and I downloaded five years of posts to clinical psychological discussion groups on the social media website Reddit (r/anxiety, r/depression, r/bipolar, r/adhd). For every user who wrote on one of these groups, we also downloaded every other post that individual wrote on Reddit, sometimes hundreds of posts or more. To make this a fair test, we removed every explicit reference to mental illness from those posts. We found that, using only these everyday posts lacking overt references to mental illness, a machine learning model could predict which specific mental health forum an individual wrote on, such as r/anxiety or r/depression. Our findings join those of others who have found that your tweets (De Choudhury et al., 2013) and Facebook posts (Eichstaedt et al., 2018) can predict whether you have a mental illness. Some of these other results are also the subject of a recent Neuroethics blog post.
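To make the pipeline concrete, here is a minimal sketch of what such a model could look like. This is not the exact method from our paper: the toy data, the keyword filter, and the choice of TF-IDF features with a logistic regression classifier (via scikit-learn) are all illustrative assumptions.

```python
# Minimal sketch (not the paper's exact pipeline) of predicting which
# mental health subreddit a user posts on from their everyday language.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical filter: strip overt references to mental illness so the
# model only sees "everyday" language, as described above.
EXPLICIT = re.compile(r"\b(depress\w*|anxi\w*|bipolar|adhd)\b", re.IGNORECASE)

def scrub(text):
    return EXPLICIT.sub("", text)

# Toy placeholder data: (a user's everyday posts, the forum they wrote on).
# The real study used thousands of users with hundreds of posts each.
users = [
    ("could not focus at work again, kept switching tabs all day", "r/adhd"),
    ("started three projects this week and finished none of them", "r/adhd"),
    ("reread my email five times before sending, sure it sounded wrong", "r/anxiety"),
    ("my heart was racing before the meeting even started", "r/anxiety"),
    ("slept twelve hours and still could not get out of bed", "r/depression"),
    ("nothing at the movies looked worth seeing, stayed home again", "r/depression"),
    ("spent the whole night rearranging my apartment, felt unstoppable", "r/bipolar"),
    ("bought concert tickets for five cities on a whim last night", "r/bipolar"),
]

texts = [scrub(text) for text, _ in users]
labels = [label for _, label in users]

# Bag-of-words features plus a linear classifier, scored on held-out users.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, texts, labels, cv=2)
print("cross-validated accuracy:", scores.mean())
```

On real data one would also inspect which words carry the signal; by construction, any predictive information here lies in ordinary, non-clinical vocabulary, since the overt clinical terms have been scrubbed.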

Predicting Mental Illness from Social Media. In a series of three studies, we found that people’s Reddit posts were predictive of whether they also posted on a mental illness subreddit (Thorstad & Wolff, in press).

What if the reader is a person, not an algorithm? We know less about what human readers can learn from your social media than we do about the capacity of algorithms. We do know that people can form snap judgments of you from a few-second video clip of your behavior, called a thin slice (Ambady & Rosenthal, 1992; Todorov et al., 2015; Todorov, Pakrashi, & Oosterhof, 2009). Since a social media post is, much like a thin slice, a short and relatively random sample of your behavior, it’s reasonable to expect that people could make many of the same inferences from social media as from video clips, even though social media posts usually consist of text rather than video. We also know that people can use thin slices to predict some aspects of personality, like whether you are extraverted or introverted (Carney, Colvin, & Hall, 2007); certain mental health outcomes, such as whether you are depressed, anxious, or have a personality disorder (Slepian, Bogart, & Ambady, 2014); and certain future outcomes, such as a professor’s end-of-term teaching evaluations (Ambady & Rosenthal, 1993; see also Todorov et al., 2005). It follows that a person could probably make many of these same judgments from reading your social media posts, although we have more to learn about how accurate thin-slicing is, what kinds of traits it can reveal, and how well people can make these judgments using social media.

Thin Slicing. A thin slice is a short random sample of a person’s behavior, such as a short video clip. People can use these thin slices to make many, largely accurate, inferences about a person’s psychology.

Here is the ethical point. If algorithms can infer private traits from your social media, should there be limits on how we use these algorithms? Intuitively, the issue is that while people intend to reveal certain things on social media, algorithms go beyond what is explicitly written to infer private psychological traits. You may have intended, for example, to share that you went to a movie, but you did not intend to share that you are extraverted or depressed.

I want to suggest an ethical test. Consider a human in the position of a digital phenotyping algorithm, that is, an algorithm that tries to use your online behavior to make inferences about your psychology. To make this concrete, picture a very nosy barista at a coffee shop you visit every day. Like an algorithm, the nosy barista sees hundreds of snapshots of your behavior. Like an algorithm, the nosy barista can also observe other people in the coffee shop. Like an algorithm, the nosy barista cannot directly ask you questions about your psychology, but they spend a lot of time thinking about how these snapshots of behavior can predict private things about people’s psychology. Intuitively, while the nosy barista strikes us as creepy, there seems to be nothing about the nosy barista’s people-watching that ought to be prohibited. On the other hand, to the extent that an algorithm’s inferences are even nosier than this nosy barista’s, we may conclude that the algorithm is too powerful. The ethical question is then whether a digital phenotyping algorithm is fundamentally like the nosy barista. Can an algorithm learn something from your social media that the nosy barista could not?

The Nosy Barista Test. Imagine a nosy barista at your regular coffee shop who observes you many times, thus having access to many thin slices of your behavior. The nosy barista test says that an algorithm deserves extra ethical scrutiny if it can infer something about you that the nosy barista could not.

We don’t yet know whether digital phenotyping algorithms pass the nosy barista test. To find out, we need to know two things. First, we need to know more about what people can learn from your social media: if a person, like an algorithm, were shown all of your social media posts (which can number in the thousands!), how much could that person learn about you? Second, we need to know more about the limits of digital phenotyping algorithms. The most common inferences currently drawn with digital phenotyping concern mental illness and personality (Youyou, Kosinski, & Stillwell, 2015), the same inferences we know people can make from a thin slice of your behavior. The open question is whether, beyond personality and mental illness, there are traits an algorithm can infer that the nosy barista could not.
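To make the test concrete, here is one way such a comparison might be run: give the algorithm and a panel of human judges the same held-out users, then ask whether the algorithm’s accuracy reliably exceeds the humans’. The sketch below uses a simple bootstrap over users; all data and variable names are hypothetical placeholders, not results from any actual study.

```python
# Sketch of an empirical "nosy barista test": does an algorithm out-infer
# human judges shown the same users' posts? (Hypothetical data throughout.)
import random

def accuracy(preds, truth):
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)

def bootstrap_gap(algo_preds, human_preds, truth, n_boot=10_000, seed=0):
    """95% bootstrap confidence interval for (algorithm - human) accuracy."""
    rng = random.Random(seed)
    n = len(truth)
    gaps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample users
        a = accuracy([algo_preds[i] for i in idx], [truth[i] for i in idx])
        h = accuracy([human_preds[i] for i in idx], [truth[i] for i in idx])
        gaps.append(a - h)
    gaps.sort()
    return gaps[int(0.025 * n_boot)], gaps[int(0.975 * n_boot)]

# Hypothetical labels for ten held-out users, with made-up predictions.
truth       = ["anx", "dep", "anx", "dep", "anx", "dep", "anx", "dep", "anx", "dep"]
algo_preds  = ["anx", "dep", "anx", "dep", "anx", "anx", "anx", "dep", "anx", "dep"]
human_preds = ["anx", "dep", "dep", "dep", "anx", "anx", "dep", "dep", "anx", "anx"]

lo, hi = bootstrap_gap(algo_preds, human_preds, truth)
print(f"95% CI for accuracy gap (algorithm - human): [{lo:.2f}, {hi:.2f}]")
# If the whole interval sits above zero, the algorithm is inferring
# something the human judges (our nosy barista) cannot.
```

A real version of this test would need many more users, judges given ample time to read each user’s full posting history, and traits beyond the mental illness and personality judgments we already know people can make.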

You might argue that people assume they have more privacy online than in a coffee shop, in which case algorithms should perhaps be held to a higher standard than the nosy barista. Existing data, however, suggest that people do not assume more privacy online than in a coffee shop. First, people understand that social media posts are used by third parties, and they generally accept this use as long as those posts were already publicly visible (Madden et al., 2013; Fiesler & Proferes, 2018). Second, people are generally in favor of using social media posts for digital phenotyping of mental illness, provided the information is aggregated and read only by computers (Mikal, Hurst, & Conway, 2016). We have more to learn about people’s expectations of privacy online, and people certainly vary in their beliefs (Mikal, Hurst, & Conway, 2016), but the analogy of public social media posts to coffee shops seems reasonable.

There are many other ethical issues around digital phenotyping that we can’t discuss in detail here. Chief among them are the potential for de-anonymizing users or abusing algorithms to nudge consumers’ or voters’ behavior in certain ways (Narayanan & Shmatikov, 2009; Cadwalladr & Graham-Harrison, 2018). That digital phenotyping can be abused is an important issue, but perhaps the most fundamental ethical question is whether, even in its intended use, there ought to be limits on digital phenotyping. My suggestion is that decisions concerning these limits revolve, in part, around the empirical question of whether a digital phenotyping algorithm can learn something about you that a nosy barista could not.

________________


Robert Thorstad is a 5th-year Psychology PhD student at Emory University. His research asks whether people’s everyday behavior online can be used to make inferences about their psychology, especially decision-making, mental health, and future thinking. A description of his research is at www.robertthorstad.com, and he frequently tweets about psychology and digital phenotyping @robert_thorstad. He can be contacted at rthorst (at) emory (dot) edu.


References
  1. Ambady, N. & Rosenthal, R. (1992). Thin slices of expressive behavior as predictors of interpersonal consequences: a meta-analysis. Psychological Bulletin, 111(2), 256-274.
  2. Cadwalladr, C. & Graham-Harrison, E. (2018). The Cambridge Analytica files. The Guardian, Retrieved from https://www.theguardian.com/news/series/cambridge-analytica-files
  3. Carney, D., Colvin, C., & Hall, J. (2007). A thin slice perspective on the accuracy of first impressions. Journal of Research in Personality, 41(5), 1054-1072.
  4. De Choudhury, M., Gamon, M., Counts, S., & Horvitz, E. (2013). Predicting depression via social media. In Proceedings of ICWSM 2013.
  5. Eichstaedt, J., Smith, R., et al. (2018). Facebook language predicts depression in medical records. Proceedings of the National Academy of Sciences, 115(44), 11203-11208.
  6. Fiesler, C. & Proferes, N. (2018). Participant perceptions of Twitter research ethics. Social Media + Society, 4(1).
  7. Madden, M., Lenhart, A., Cortesi, S., Gasser, U., Duggan, M., Smith, A., & Beaton, M. (2013). Teens, social media, and privacy. Pew Research Center, 21, 2-86.
  8. Mikal, J., Hurst, S., & Conway, M. (2016). Ethical issues in using Twitter for population-level depression monitoring: a qualitative study. BMC Medical Ethics, 17(1), 22.
  9. Narayanan, A. & Shmatikov, V. (2009). De-anonymizing social networks. In Proceedings of the 2009 IEEE Symposium on Security and Privacy.
  10. Slepian, M., Bogart, K., & Ambady, N. (2014). Thin-slice judgments in the clinical context. Annual Review of Clinical Psychology, 10, 131-153.
  11. Thorstad, R. & Wolff, P. (in press). Predicting future mental illness from social media: a big data approach. Behavior Research Methods. [preprint: https://psyarxiv.com/arf4t/].
  12. Todorov, A., Mandisodza, A., Goren, A., & Hall, C. (2005). Inferences of competence from faces predict election outcomes. Science, 308(5728), 1623-1626.
  13. Todorov, A., Olivola, C., Dotsch, R., & Mende-Siedlecki, P. (2015). Social attributions from faces: determinants, consequences, and functional significance. Annual Review of Psychology, 66, 519-545.
  14. Todorov, A., Pakrashi, M., & Oosterhof, N. (2009). Evaluating faces on trustworthiness after minimal exposure time. Social Cognition, 27(6), 813-833.
  15. Youyou, W., Kosinski, M., & Stillwell, D. (2015). Computer-based personality judgments are more accurate than those made by humans. Proceedings of the National Academy of Sciences, 112(4), 1036-1040.


Want to cite this post?

Thorstad, R. (2019). Nosier than a Nosy Barista? An Ethical Test for Digital Phenotyping Algorithms. The Neuroethics Blog. Retrieved from http://www.theneuroethicsblog.com/2019/05/nosier-than-nosy-barista-ethical-test.html
