Depression and suicidality as evolved credible signals of need in social conflicts

Michael R. Gaffney (Department of Anthropology, Washington State University), Kai H. Adams (UC Berkeley Haas School of Business), Kristen L. Syme (Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam), Edward H. Hagen (Department of Anthropology, Washington State University)
January 19, 2022

Abstract

Mental health professionals generally view major depression and suicidality as pathological responses to stress that elicit aversive responses from others. An alternative hypothesis grounded in evolutionary theory contends that depression and suicidality are honest signals of need in response to adversity that can increase support from reluctant others when there are conflicts of interest. To test this hypothesis, we examined responses to emotional signals in a preregistered experimental vignette study involving claims of substantial need in the presence of conflicts of interest and private information about the signaler’s true level of need. In a sample of 1,240 participants recruited from Amazon Mechanical Turk, costlier signals like depression and suicidality resulted in greater perceptions of need, reduced perceptions of manipulativeness, and increased likelihood of support compared to simple verbal requests and crying without further symptoms. Additionally, as predicted, the effect of signaling on likelihood of support was largely mediated by the effect of signaling on participants’ belief that the signaler was genuinely in need. Our results support the hypothesis that depression and suicidality, apparent human universals, are credible signals of need that elicit more support than verbal requests, sad expressions, and crying when there are conflicts of interest.

Introduction

In a classic study, Coyne (1976) found that depression alienates others, a result that was subsequently confirmed in numerous studies (Segrin, 2000; Segrin & Dillard, 1992). (For DSM-5 depression symptoms, see Table 1.) In interactions with spouses and others, depressed individuals express anger and aggression, make frequent demands for help, self-disclose personally relevant negative issues at inappropriate times, and view such topics as more appropriate for discussion than do the non-depressed (Segrin, 2000; Segrin & Abramson, 1994). Such self-disclosures have been shown to be a key ingredient in the rejection of depressed persons by others, and “may appropriately be understood as an attempt to elicit social support from targets” (Segrin & Abramson, 1994, p. 657). Excessive reassurance seeking – repeatedly requesting reassurance that one is lovable and worthy despite previous attempts by others to provide such reassurance – is another factor implicated in the rejection of the depressed by others (Joiner & Metalsky, 1995; Joiner, Metalsky, Katz, & Beach, 1999; Starr & Davila, 2008).

Table 1: DSM-5 criteria for a Major Depressive Episode. A diagnosis requires five or more of the criteria, at least one of which is criterion 1 or 2. Symptoms must persist most of the day, daily, for at least 2 weeks in a row. For more details, see American Psychiatric Association (2013).
Symptom
  1. Depressed mood—indicated by subjective report or observation by others (in children and adolescents, can be irritable mood).
  2. Loss of interest or pleasure in almost all activities—indicated by subjective report or observation by others.
  3. Significant (more than 5 percent in a month) unintentional weight loss/gain or decrease/increase in appetite (in children, failure to make expected weight gains).
  4. Sleep disturbance (insomnia or hypersomnia).
  5. Psychomotor changes (agitation or retardation) severe enough to be observable by others.
  6. Tiredness, fatigue, or low energy, or decreased efficiency with which routine tasks are completed.
  7. A sense of worthlessness or excessive, inappropriate, or delusional guilt (not merely self-reproach or guilt about being sick).
  8. Impaired ability to think, concentrate, or make decisions—indicated by subjective report or observation by others.
  9. Recurrent thoughts of death (not just fear of dying), suicidal ideation, or suicide attempts.

Negative social responses to depression are widely interpreted as evidence of impaired social functioning in the depressed (Evraire & Dozois, 2011; Gadassi & Rafaeli, 2015; Gotlib & Lee, 1989; Hames, Hagan, & Joiner, 2013; Hirschfeld et al., 2000; Kupferberg, Bicks, & Hasler, 2016; Weightman, Knight, & Baune, 2019). This interpretation is reinforced by the equally widespread view that depressed individuals have impaired, negative perceptions of themselves and their environments (Beck, 1963; Joiner, 2007; Nolen-Hoeksema, 1991). Together with the costs of depression, such as profound loss of interest in virtually all activities and suicide, these facts are the basis of the mainstream claim that depression is a psychopathology.

In contrast to these views, we will argue that the depressed have suffered genuinely severe forms of adversity and are therefore genuinely in need, but often have conflicts with their social partners. In these circumstances, costly and putatively dysfunctional depressive behaviors can instead be understood as aversive but credible and adaptive signals of need that elicit more support than verbal requests, sad expressions, and crying when there are conflicts of interest. We follow Smith & Harper (2003) in defining a signal as (p. 15):

An act or structure that alters the behaviour of another organism, which evolved because of that effect, and which is effective because the receiver’s response has also evolved.

Depression is caused by genuine adversity

All individuals suffer adversity, such as injury or loss of material or social resources, at some point in their lives, with over 70% of participants in a global survey reporting exposure to a traumatic event (Benjet et al., 2016). Psychological pain, such as sadness and low mood, probably evolved to motivate victims of adversity to shift their attention to the causes of adversity so as to mitigate its negative fitness consequences and to learn to avoid such adverse events in the future (Andrews & Thomson, 2010; Del Giudice, 2018; Nesse, 1990; Thornhill & Thornhill, 1989). Over human evolution, social partners could often have helped victims, and therefore signals of psychological pain, such as sad expressions and crying, probably evolved to indicate need (e.g., Balsters, Krahmer, Swerts, & Vingerhoets, 2013; Bowlby, 1980; Darwin, 1872; Reed & DeScioli, 2017).

Contrary to the view that the depressed have a distorted perception of their environment, there is strong evidence that most cases of depression are caused by genuinely severe negative life events, such as physical assault and death of a loved one (Devries et al., 2013, 2011; Ellsberg et al., 2008; Hammen, 2005; Mazure, 1998). Compared to non-depressed individuals, those with depression report about twice as many negative events (Mazure, 1998) and more negative events than those with schizophrenia and bipolar depression across multiple studies (Paykel, 1994). Longitudinal studies indicate that depression onsets soon after a negative event (Han et al., 2019; Kendler, Karkowski, & Prescott, 1999; Lewinsohn, Hoberman, & Rosenbaum, 1988; Rich, Gidycz, Warkentin, Loh, & Weiland, 2005; Sen et al., 2010) or coincides with periods where adversity is likely to increase prior to it (e.g., depression starting before a divorce rather than after, Blekesaune, 2008; Metsä-Simola & Martikainen, 2013; Rosenström et al., 2017). Although the relationship between negative events and depression is likely bidirectional (i.e., depression probably also causes adversity, Wichers et al., 2012), negative events predict depression even when considering only events outside of one’s control, indicating that the connection is unlikely to be driven solely by individuals who are already depressed selecting into situations where negative events are likely to be common (Hammen, 2005; Kendler et al., 1999). Furthermore, twin studies have shown that one’s history of negative events remains a strong predictor of major depression when controlling for genetic similarity, and that part of the heritability of depression stems from the heritability of negative events like divorce and family conflict (Kendler & Baker, 2007; Kendler et al., 1999).

For these and other reasons, we and others argue that most cases of depression are probably functional instances of psychological pain, i.e., the severe end of a spectrum of adaptive low mood, sadness, and grief, and not mental dysfunctions (Andrews & Thomson, 2010; Dowrick & Frances, 2013; Frances, 2013; Hagen, 2003; Hagen & Syme, 2021; Horwitz & Wakefield, 2007). For reviews of other evolutionary theories of depression, see Hagen (2011) and Durisko, Mulsant, & Andrews (2015).

Depression, anger, and conflict

One might expect that victims of adversity who become depressed would receive positive responses from family, friends, colleagues, and perhaps even strangers. Indeed, beneficial responses to depressed individuals have also been reported, such as increased caretaking (Hokanson, Loewenstein, Hedeen, & Howes, 1986), more offers of advice and support (Stephens, Hokanson, & Welker, 1987), and reduced aggression within families (Dadds, Sanders, Morrison, & Rebgetz, 1992; Hops et al., 1987; Sheeber, Hops, & Davis, 2001). Why, though, would these positive responses often be accompanied by negative ones?

The missing piece of the puzzle is that depression is closely associated with anger and conflict (Cassiello-Robbins & Barlow, 2016). Of the adversity-related risk factors for depression, those that involve conflict tend to be the strongest (Hammen, 2005; Mazure, 1998). Marital problems, bullying, and abusive relationships are all common risk factors for depression (Kendler et al., 1999, 1995; Klomek et al., 2019; Klomek, Marrocco, Kleinman, Schonfeld, & Gould, 2007), with sexual and non-sexual assault, in particular, greatly increasing one’s risk of depression (Kendler et al., 1999, 1995). This holds true even in a small-scale, non-Western society: among the Tsimane, Amazonian horticulturalists, depression is also associated with conflict, especially conflict involving non-kin (Stieglitz, Schniter, von Rueden, Kaplan, & Gurven, 2015). See Hagen & Syme (2021) for a review of the association of depression with anger and conflict.

Other notable depression risk factors, like loss of a loved one or severe or prolonged illness, might seem less related to conflict. In these situations, however, the fitness costs that stem from reduced access to resources could be mitigated with help from social partners (Sugiyama & Sugiyama, 2003). Yet social partners might not be able, or willing, to provide more investment than they already do. Therefore, problems whose solutions require substantial investment or other changes on the part of social partners will often involve social conflict even if they did not start that way (Hagen, 2003). Indeed, there is increasing evidence that loss of a loved one is often followed by increased family conflict (see Hagen & Syme, 2021 for a brief review).

When need is private information and there are conflicts with social partners, “cheap” signals of need, such as sad expressions and crying, may often not be believed when providing support is costly. We argue next that in this common situation, some of the most harmful and mysterious symptoms of depression – profound loss of interest in virtually all activities, and suicidal ideation and behaviors – serve as credible and adaptive signals of need.

Bargaining: Credibly signaling need during conflicts

In a cooperative species like our own, ubiquitous conflicts of interest mean that there will always be disagreement over the levels of investment in a cooperative endeavor and the division of the resulting benefits, even among closely related individuals. According to partner choice models, individuals who are dissatisfied with the terms of cooperation can switch partners (Hammerstein & Noë, 2016), e.g., workers unhappy with their pay can look for a better job. In many cases, however, it is difficult or impossible to switch partners. Spouses who are dissatisfied with their partner’s investment in their new infant, for example, cannot easily find a different partner to invest in that infant (Hagen, 1999). Similarly, an adolescent who is dissatisfied with her parents’ investment in her cannot easily find other parents who are willing to invest more, nor can parents easily produce another adolescent. In these latter examples, and many cooperative endeavors central to human biological fitness, all parties have monopoly power over the benefits they bring to the endeavor – no one is easily replaced (Hagen, 2002, 2003). Such interdependence is increasingly recognized to be important to the evolution of cooperation in humans and other animals (Aktipis et al., 2018; Balliet, Tybur, & Van Lange, 2017; Roberts, 2005; Tomasello et al., 2012).

Hagen (2003) proposed that physical aggression and core depression symptoms like loss of interest in virtually all activities were complementary strategies to resolve conflicts in interdependent relationships. Sell, Tooby, & Cosmides (2009) found that physically formidable individuals were more prone to anger, prevailed more in conflicts of interest, and considered themselves entitled to better treatment. Physically or socially weaker individuals, though, are not without options to resolve conflicts in their favor. An individual with monopoly power over the benefits she contributes to a critical cooperative endeavor can withhold those benefits, or put them at risk, until her partners change their behaviors in ways that benefit her. As depression often involves a profound loss of interest in virtually all activities that can jeopardize one’s productivity (American Psychiatric Association, 2013), it might therefore be an evolved bargaining strategy for relatively powerless individuals in the wake of adversity and social conflict (Hagen, 2003; Hagen & Syme, 2021; see also Watson & Andrews, 2002).

Bargaining models assume that delaying cooperation is costly so that there is an incentive to quickly agree on a division of benefits, especially for those who highly value the fruits of the cooperative endeavor. In a classic non-cooperative game theory model of bargaining, Rubinstein (1982) showed that two parties can come to an immediate agreement over division of benefits despite conflicts of interest if the parties’ valuations of cooperation are not private information: in this case, each party knows exactly what division of benefits the other will accept, and can therefore make that offer immediately, avoiding the cost of delay.
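
As a hedged illustration of the complete-information case (not part of the original analysis), the following minimal sketch computes the equilibrium split in the discount-factor version of the alternating-offers game. In that version, a proposer with discount factor d1 facing a responder with discount factor d2 receives (1 - d2) / (1 - d1 * d2) of a surplus normalized to 1, and agreement is immediate. The function name and the example values are illustrative assumptions; a higher discount factor stands in for a lower cost of delay.

```python
def rubinstein_shares(delta_proposer, delta_responder):
    """Equilibrium split of a surplus of size 1 in the alternating-offers game.

    Each discount factor must lie in [0, 1); values closer to 1 mean that
    delay is less costly for that party.
    """
    for delta in (delta_proposer, delta_responder):
        if not 0 <= delta < 1:
            raise ValueError("discount factors must lie in [0, 1)")
    proposer = (1 - delta_responder) / (1 - delta_proposer * delta_responder)
    return proposer, 1 - proposer


# Hypothetical values: the responder's share grows as her cost of delay falls.
for delta_responder in (0.50, 0.90, 0.95):
    proposer, responder = rubinstein_shares(0.90, delta_responder)
    print(f"responder delta={delta_responder:.2f} -> responder share={responder:.3f}")
```

Because both discount factors are known to both parties, the proposer can offer exactly this split in the first round, and no delay occurs.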

If valuations are private information, however, costly delays might be unavoidable because each party has an incentive to deceptively request more than their actual valuation, and to reject the likely inflated requests from partners, leading to multiple rounds of bargaining. Models of bargaining with private information have a close relationship to models of credible signaling. When there are conflicts of interest, there are incentives to send deceptive signals. A credible signal is one that the receiver can believe despite the signaler’s incentive to deceive. A willingness to delay (i.e., refuse offers) credibly reveals one’s low valuation of the endeavor, and therefore genuine need – the benefit of waiting for a better offer outweighs the low cost of delay. Eagerness to reach a deal, on the other hand, credibly reveals a high valuation – the benefit of waiting for a better offer does not outweigh the higher cost of delay. Once valuations are known, the game reduces to the one analyzed by Rubinstein (1982), and the parties can reach an agreement on the division of benefits (Kennan & Wilson, 1993). Delays also typically require that additional factors come into play (Feinberg & Skrzypacz, 2005 and references therein).

Hagen (2003) proposed that whereas crying is a “cheap,” mostly short-term signal of need that might be deceptive (e.g., crocodile tears), the substantial, long-term reduction in productivity that characterizes many cases of depression corresponds to a willingness to delay, and is therefore a credible signal of low valuation and need. In terms of classic costly signaling theory, reduced productivity is relatively less costly for signalers whose efforts are currently not yielding many fitness benefits (i.e., the needy) than it would be for signalers whose efforts are yielding substantial benefits (the non-needy). Hence, the benefits of signaling outweigh the costs for needy individuals, who therefore send the signal, whereas the costs outweigh the benefits for non-needy individuals, who therefore do not send the signal.
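
The cost-benefit logic of this paragraph can be sketched as a simple separating condition. This is a toy illustration under stated assumptions, not a model from the literature cited here: the cost of the signal is taken to be a fixed fraction of the signaler's current payoff, and all function names and numerical values are hypothetical.

```python
def signal_cost(current_payoff, reduction=0.8):
    """Payoff forgone when productivity is cut by `reduction` (assumed form)."""
    return current_payoff * reduction


def sends_signal(current_payoff, support_benefit, reduction=0.8):
    """A signaler sends the signal only if the support gained exceeds its cost."""
    return support_benefit > signal_cost(current_payoff, reduction)


def is_separating(needy_payoff, non_needy_payoff, support_benefit, reduction=0.8):
    """True when the needy signal and the non-needy do not, so the signal is credible."""
    return (sends_signal(needy_payoff, support_benefit, reduction)
            and not sends_signal(non_needy_payoff, support_benefit, reduction))


# Hypothetical payoffs: needy efforts currently yield 1 unit, non-needy yield 10.
print(is_separating(needy_payoff=1.0, non_needy_payoff=10.0, support_benefit=3.0))
print(is_separating(needy_payoff=1.0, non_needy_payoff=10.0, support_benefit=9.0))
```

In the second call the benefit of support is large enough that even the non-needy would pay the cost, so in this toy setup the signal no longer discriminates between the two types.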

Suicidality

Theoretical models of depression must account for suicidality. Suicidal ideation is one of nine diagnostic criteria for a major depressive episode (MDE) (American Psychiatric Association, 2013) and is associated with depression across cultures (Haroz et al., 2017); depression is a major risk factor for suicidal behavior (Hawton, Comabella, Haw, & Saunders, 2013); and suicidality is a major justification for the claim that depression is a brain dysfunction (e.g., Pies, 2014).

Anthropology, in contrast, has long viewed suicidality as largely the result of social problems. Early in the field’s history, anthropologists reported on suicide attempts and deaths in the small-scale societies that serve as models for the types of societies in which humans evolved. Suicide, they found, was commonly a form of protest, revenge, and/or appeal (Firth, 1936, 1961; Malinowski, 1932; Niehaus, 2012). Some ethnographers emphasized suicide as a form of anger or social pressure (Giddens, 1964; Hezel, 1987), whereas others emphasized the powerlessness of suicide victims (Counts, 1980).

Common to almost all theoretical and empirical investigations of suicide in anthropology and other disciplines is a focus on completed suicides, i.e., suicide deaths. The vast majority of suicidal behavior, however, does not result in death. In young adult women in the US, for example, there are hundreds of attempts for every death (see Figure 1). Syme, Garfield, & Hagen (2016) therefore argued that the theoretical focus should be on suicide ideation and suicide attempts.

Figure 1: US Suicidality non-fatal injury and death rates by age and sex (2001-2019). Data from CDC (2021).

Raymond Firth, an anthropologist who worked in the southwestern Pacific, was one of the first to view suicidality as a gamble to improve one’s circumstances in the here and now. Based on observations that suicide attempts often followed loss or conflict and varied substantially in their likelihood of death, he argued that a sizable subset of the suicide attempts among the Tikopia were not meant to end in death but instead were a means to elicit aid, status, or immediate reintegration into the community following negative events (Firth, 1936, 1961).

In the bargaining framework, suicidality, and perhaps also non-suicidal self-injury (Hagen, Watson, & Hammerstein, 2008), is conceptualized as putting all future contributions to cooperative endeavors with social partners at risk with some low but non-zero probability, credibly signaling low valuation of current circumstances. On this view, most suicide deaths, especially in young, physically healthy individuals, would therefore be the inevitable consequence of some individuals losing this gamble.

As with depression, there are negative social responses to suicidal behavior. Most studies examining responses to those who have survived suicide attempts have focused on stigmatization. In these studies, perceptions of survivors as being weak, selfish, mentally ill, and antisocial are commonly reported (Batterham, Calear, & Christensen, 2013; Tzeng & Lipson, 2004), with stigmatization often being found within and outside one’s social network (Frey, Hans, & Cerel, 2016; Scocco, Castriotta, Toffol, & Preti, 2012).

Despite this potential for stigmatization, increased social support and beneficial changes to important relationships have been reported to follow suicide attempts, with some indication that these effects may hold long term (Stengel, 1956). For example, a study of 100 women who survived suicide attempts found that individuals gained identifiable benefits through the attempt in 75 cases, with 41 individuals benefiting from reconciliations with others (Lukianowicz, 1971). Unlike many Western countries, where suicide is often viewed as pathological (Hidaka, 2012), members of traditional societies have been reported to view suicide attempts as cries for help rather than mental illness (Shostak, 1981), with attempts having been described as ways of escaping unwanted marriage arrangements, persistent abuse, or a lack of support in obtaining mates by those engaging in the behavior and observers (Gutiérrez de Pineda & Muirden, 1948; Hilger, 1957; Karsten, 1935; Tessmann, 1930; Wilson, 1960). Although findings that individuals view suicide attempts as cries for help are not necessarily evidence that such attempts lead to beneficial responses, 30 out of 84 examples of suicidal behavior included in the HRAF resulted in positive changes for the survivor (Syme et al., 2016).

Aversiveness is a feature of depression, not a bug

Under the bargaining model, aversive responses to depressive and suicidal bargaining are expected throughout the process, encouraging beneficial concessions by interdependent social partners with whom one is in conflict, who in turn signal the costliness of increasing their support (Hagen, 2003; Hagen & Syme, 2021). This predicted pattern is quite similar to anger, which is aversive to targeted social partners, yet is probably an adaptation that exploits advantages in physical or social formidability to force beneficial concessions from them (Sell et al., 2009).

We argue that among those who lack better options, aversive depression symptoms that put one’s value to others at risk, such as loss of interest and suicidality, credibly signal low valuation of the current efforts of social partners and motivate them to provide more support so as to end the aversive depressive behaviors (for similar views, see Andrews, 2006; Farberow & Shneidman, 1961; Firth, 1936, 1961; Hagen et al., 2008; Nock, 2008; Rosenthal, 1993; Stengel, 1956).

Study aims and predictions

The prevailing view is that depression involves impaired social abilities that lead to rejection by social partners (Coyne, 1976; Gadassi & Rafaeli, 2015; Hames et al., 2013; Joiner et al., 1999; Segrin, 2000; Weightman et al., 2019). The aim of this study was to test an alternative hypothesis that when there are conflicts of interest, depressive and suicidal behaviors benefit victims of adversity by increasing belief that they are telling the truth and consequently increasing willingness to help them.

Most of the limited literature on social responses to depression and suicidality comprises observational studies of depressed individuals interacting with family, friends, or roommates (Dadds et al., 1992; Hops et al., 1987; Joiner & Metalsky, 1995; Sheeber et al., 2001; Starr & Davila, 2008). These have ecological validity, but cannot easily determine causal relationships. Some studies, though, have employed an experimental design in which participants were randomized into conditions in which they listened to, watched, or interacted with either a depressed or non-depressed person, where in some cases the depressed person was a non-depressed confederate enacting a depressed role (Marcus & Nardone, 1992). These designs can demonstrate causation but the transient, inconsequential relationships and laboratory settings lack ecological validity.

Experimental vignette studies, which employ a short, carefully constructed description of a person, object, or situation, aim to approach the ecological validity of observational studies by presenting participants with rich, real-world scenarios, while at the same time allowing researchers to randomize participants into conditions in which theoretically relevant dimensions of the vignettes are systematically manipulated, thus enabling robust causal inferences (Atzmüller & Steiner, 2010). Experimental vignette studies are conducted in a broad range of disciplines, including psychology, economics, sociology, management studies, political science, and education (Aguinis & Bradley, 2014; Atzmüller & Steiner, 2010).

In the bargaining framework, one’s “willingness to delay” is a credible signal of one’s valuation of current cooperative arrangements, with a greater willingness to delay indicating a lower valuation. Here, we investigated responses to emotional signals that varied in the extent to which they reduced productivity or put future productivity at risk, which we refer to as costs, in an experimental vignette study in which a possible victim of adversity asks for help from the participant, but has incentives to exaggerate her need. As signal cost increased, we predicted that participants would report (1) increased belief in the signaler’s claims and (2) increased likelihood of providing help, with (3) the increased likelihood of providing help mediated by the increased belief in the signaler’s need.
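
Prediction (3) has the structure of a standard mediation model. The following minimal sketch shows one common way to decompose such an effect, the product-of-coefficients approach, on synthetic data. It is an assumption-laden illustration rather than the analysis used in this study: the numeric coding of signal cost, the effect sizes, and all names are hypothetical.

```python
import random


def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Synthetic data (hypothetical): signal cost coded 0-4, Belief and Action continuous.
rng = random.Random(0)
signal = [rng.choice([0, 1, 2, 3, 4]) for _ in range(500)]
belief = [0.5 * x + rng.gauss(0, 1) for x in signal]
action = [0.8 * m + 0.1 * x + rng.gauss(0, 1) for m, x in zip(belief, signal)]

a = ols([signal], belief)[1]                   # signal -> Belief
_, direct, b = ols([signal, belief], action)   # signal -> Action, Belief -> Action
total = ols([signal], action)[1]               # signal -> Action, ignoring Belief
indirect = a * b                               # effect transmitted through Belief

print(f"total={total:.3f} direct={direct:.3f} indirect={indirect:.3f}")
print(f"proportion mediated={indirect / total:.2f}")
```

With ordinary least squares the total effect equals the direct effect plus the indirect effect, so a mediated proportion near 1 would correspond to an effect of signaling on helping that runs almost entirely through belief in the signaler's need.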

Materials and methods

Design

This study utilized a between-subjects pretest-posttest design to examine how four different emotional signals (treatments) would influence (1) the degree participants believed a fictional character to be in need (Belief) and (2) the likelihood they would provide help (Action) relative to a simple Verbal request without additional signaling (the control condition), in four different vignettes, for a total of 20 conditions. In this design, the outcomes are measured at pre-treatment (T1). Participants are then randomized into either a control group or a treatment group, i.e., one of the emotional signals, and the outcomes are measured again (T2). Regression models (described later) are used to determine the effect of the treatment conditions on the posttest outcome variables, relative to the control condition, controlling for pretest levels of the outcome variables (we also explored within-subjects effects of the signal on outcomes at T2 compared to T1).
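
The core idea of estimating a treatment effect on the posttest outcome while controlling for the pretest can be sketched as follows. This is a minimal illustration on synthetic data with one treatment and one control group, not the regression models used in this study; it obtains the adjusted coefficient by residualizing on the pretest (the Frisch-Waugh-Lovell result), and all names and values are hypothetical.

```python
import random


def simple_fit(x, y):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def residuals(x, y):
    """What is left of y after removing its linear association with x."""
    slope, intercept = simple_fit(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def adjusted_treatment_effect(pretest, treated, posttest):
    """Coefficient on `treated` in the regression posttest ~ pretest + treated."""
    slope, _ = simple_fit(residuals(pretest, treated), residuals(pretest, posttest))
    return slope


# Synthetic example (hypothetical values): T1 scores, random assignment, T2 scores.
rng = random.Random(42)
pretest = [rng.gauss(50, 10) for _ in range(200)]
treated = [rng.randint(0, 1) for _ in range(200)]
posttest = [5 + 0.7 * p + 4.0 * t + rng.gauss(0, 5) for p, t in zip(pretest, treated)]
print(round(adjusted_treatment_effect(pretest, treated, posttest), 2))
```

Removing pretest variation from the outcome before comparing groups is what gives this design its gain in precision over a posttest-only comparison.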

In principle, pretest-posttest designs, by controlling for pretest variation in the outcome, increase the precision of the estimate of the treatment effect on the outcome (Dimitrov & Rumrill Jr, 2003). In survey experiments, however, researchers often favor posttest-only designs over pretest-posttest designs. The common concern is that the pre-treatment measurement of the outcome will influence the treatment effect on the outcome (i.e., the effects of asking the same question twice) due to, e.g., demand effects, in which participants try to conform to experimenter expectations, or to consistency pressures, in which participants try to provide consistent responses regardless of treatments (Clifford, Sheagley, & Piston, 2021). In a study with six experiments that randomly assigned respondents to alternative designs (e.g., pretest-posttest, posttest only) Clifford et al. (2021) found these concerns to be overblown. In all cases, the pretest-posttest design had substantially greater precision than the posttest-only design, with little evidence that pretest measurement altered the treatment effect.

Lessons learned from two pilot studies

The current study is a refinement of a large MTurk experimental vignette pilot study (N=1636) that used a different vignette but very similar signals and outcomes (see below for more details on MTurk samples), and a much smaller pilot study posted to reddit.com/r/SampleSize/ (N=28) that used draft versions of three vignettes used in the current study, along with the same signals and outcomes. One major goal of the MTurk pilot study was to determine if believability and willingness to help were simply artifacts of the fictional victim’s psychiatric distress. We therefore included a “signaling” condition in which the victim exhibited schizophrenic symptoms. As predicted, believability and willingness to help in this condition were dramatically lower than in any other condition (see Figure 9), ruling out this alternative explanation. We consequently did not include the schizophrenic condition in the current study. See the SI for more details on the pilot studies.

A second lesson was that participants in the pilot study tended to believe the fictional victim prior to her signaling need, which made it difficult to determine if the signals increased her believability. The vignettes for this study were therefore written to undermine the victim’s credibility by making her seem manipulative at T1.

Power analysis

We used the MTurk pilot data to estimate the sample sizes needed to detect an effect of the Mild depression signal vs. Verbal request control on Belief in the victim’s need. Power was about 80% for a sample size of about 95, and was about 90% for a sample size of about 130. See Figure 10. Given our $1500 USD budget, we aimed for a sample size of 120-130 for treatment plus control conditions, and 1200-1300 for all conditions in the study. For more details, see the SI.
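
For readers unfamiliar with simulation-based power analysis, the general procedure can be sketched as below. This is not the procedure applied to the pilot data (see the SI for that): the standardized effect size is a hypothetical placeholder, the outcome is assumed to be normally distributed, and the test is a large-sample z approximation to a two-group comparison.

```python
import random
from statistics import mean, stdev


def simulated_power(n_per_group, effect_size, n_sims=500, seed=1):
    """Share of simulated two-group studies in which |z| exceeds 1.96."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        # Standard error of the difference in means (unequal-variance form).
        se = (stdev(control) ** 2 / n_per_group
              + stdev(treated) ** 2 / n_per_group) ** 0.5
        z = (mean(treated) - mean(control)) / se
        hits += abs(z) > 1.96
    return hits / n_sims


# Hypothetical effect size of 0.4 standard deviations.
for n in (20, 50, 100):
    print(n, simulated_power(n, effect_size=0.4))
```

Estimated power rises with the per-group sample size, which is the relationship used to choose a target sample size for a given budget.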

Sampling

Participants for this study were recruited from Amazon Mechanical Turk, a crowdsourcing platform that allows for the creation of Human Intelligence Tasks (HITs) that workers can complete for pay. As Amazon provides the infrastructure, it allows for a relatively low-cost way of collecting data for academic research, with the disadvantage that the data are not representative of any real population (Thomas & Clifford, 2017). Despite this limitation, MTurk samples have a wider range of ages and incomes than most university samples, and therefore might be more informative about the general population (J. Dworkin, Hessel, Gliske, & Rudi, 2016; Kennedy et al., 2020; Thomas & Clifford, 2017). US MTurk samples do differ from the general US population, though, mainly in being younger, more educated, and lower income (Boas, Christenson, & Glick, 2020; Ross, Zaldivar, Irani, & Tomlinson, 2010).

To test emotional signals in an arranged marriage vignette, we recruited an Indian MTurk sample. Indian samples are also likely to be younger, more educated, and have higher income than the general Indian population, and are more likely to come from regions with good internet access (Boas et al., 2020).

Overall, the quality of data provided by MTurk workers tends to resemble that of university sample pools (Necka, Cacioppo, Norman, & Cacioppo, 2016; Robinson, Rosenzweig, Moss, & Litman, 2019; Thomas & Clifford, 2017), with some studies reporting that MTurk samples are more attentive than samples of university students (Hauser & Schwarz, 2016). In vignette studies, MTurk data quality also compares favorably to that from much more expensive population-based samples (Weinberg, Freese, & McElhattan, 2014). For these reasons, concerns about data quality come primarily from the threat of bot use or respondents faking their location to take surveys in a language they do not understand well, with there being little evidence of the former (Kennedy et al., 2020) and the risk of the latter able to be minimized through well designed attention checks, timed responses, and good study design (Aguinis, Villamor, & Ramani, 2020; Huang, Bowling, Liu, & Li, 2015; Kennedy et al., 2020; Thomas & Clifford, 2017).

Participants

All participants were over 18, located in the United States or India, and had high-quality MTurk metrics (completed at least 100 HITs with a HIT approval rate of over 98%, Kennedy et al., 2020). Participants were excluded from the study if (1) they read the vignette too quickly (in less than one-third of the time it took MG to read it), or (2) they failed clearly labeled attention checks. The first attention check was shown immediately after the consent form; it presented participants with a random word and asked them to enter its vowels in the order in which they appear in the word. The second attention check followed the vignette and involved asking three questions about the story that were easy to answer for anyone paying attention.
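
The exclusion rules can be sketched in code as follows. This is an illustrative reconstruction, not the survey's actual implementation, and it rests on several assumptions: either criterion alone is treated as sufficient for exclusion, "too quickly" is read as strictly less than one-third of the reference reading time, "y" is not counted as a vowel, and a single wrong story question counts as a failed check. All function and parameter names are hypothetical.

```python
VOWELS = set("aeiouAEIOU")  # assumption: "y" is not treated as a vowel


def vowels_in_order(word):
    """Return the vowels of `word` in the order in which they appear."""
    return "".join(ch for ch in word if ch in VOWELS)


def passes_vowel_check(word, response):
    """Compare a typed response with the expected vowels, ignoring case and padding."""
    return response.strip().lower() == vowels_in_order(word).lower()


def is_excluded(reading_seconds, reference_seconds, passed_vowel_check,
                story_answers_correct):
    """Apply the two exclusion criteria (assumed to apply independently)."""
    too_fast = reading_seconds < reference_seconds / 3
    failed_checks = not (passed_vowel_check and all(story_answers_correct))
    return too_fast or failed_checks


print(vowels_in_order("education"))
print(is_excluded(reading_seconds=30, reference_seconds=60,
                  passed_vowel_check=True,
                  story_answers_correct=[True, True, True]))
```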

Ethics

All participants provided informed consent, and the consent form warned that some content might involve sexual assault. We estimated the study would take 4-8 minutes to complete for participants who did not take breaks (MTurkers commonly multitask, or leave the survey page and return later, Necka et al., 2016). All participants who passed the attention checks were paid $1 for their time, for an estimated rate of $7.50/hr to $15/hr (75% of US participants completed in 8.4 minutes; 75% of Indian participants completed in 25 minutes). This study was certified exempt by the Washington State University Human Research Protection Program.
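The estimated hourly rate follows from the $1 payment and the 4-8 minute completion estimate:

\[ \frac{\$1}{8\ \text{min}} \times 60\ \frac{\text{min}}{\text{hr}} = \$7.50/\text{hr}, \qquad \frac{\$1}{4\ \text{min}} \times 60\ \frac{\text{min}}{\text{hr}} = \$15/\text{hr} \]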

Survey

Four vignettes were used in this study that involved (1) a female’s claim of severe adversity that was private information, (2) conflicts of interest between the victim and the participant that would undermine the believability of her claims and make her seem manipulative, and (3) her emotional signals. The vignette scenarios involved potentially severe types of adversity, such as sexual and non-sexual assault and thwarted marriage, that often precede cases of depression and suicidality in the ethnographic and clinical record (Brown, 1986; Kendler et al., 1999, 1995; Syme et al., 2016). See Table 2.

Time 1: Claim of need in a conflictual relationship

At Time 1 (T1) participants in the US sample were randomly assigned to either the “basketball coach,” “romantic partner,” or “brother-in-law” vignettes, and the Indian sample was assigned to the “thwarted marriage” vignette.

Basketball coach vignette: Participants were asked to imagine that they are a university athletic director. The star player on the women’s basketball team comes to the participant and claims she was sexually assaulted by her head coach, a physically powerful man. However, there is a history of conflict between the star player and the coach over playing time, and police are unable to find evidence to corroborate her claims.

Brother-in-law vignette: Participants were asked to imagine that they let their sister, brother-in-law, and niece move in with them after their sister’s family lost their house in a fire. During this time, the participant’s 15 year old daughter becomes jealous of the niece, who appears to be a social competitor. A few weeks after claiming the niece was trying to steal her boyfriend, the participant’s daughter accuses the brother-in-law of sexually assaulting her.

Romantic partner vignette: Participants were asked to imagine that they found a highly desirable romantic partner after years of being single. However, the participant’s 13-year-old daughter, who has a history of interfering with the participant’s past relationships, is clearly unhappy with the new partner. After a period of sustained conflict with both the participant and the romantic partner, the daughter accuses the romantic partner of physically assaulting her, but cannot produce any evidence.

Thwarted marriage vignette: Indian sample only. Participants were asked to imagine that their family was trying to arrange a dowry for their older daughter (the signaler in this vignette) so she can marry a man she already loves, while still saving enough money for their younger daughter’s dowry. After the man’s family demands more money, the participant’s family tries to find a second man, whom the older daughter claims to find unattractive. Any increase in the dowry will come at the younger daughter’s expense. The participant therefore proceeds to arrange a marriage to the second man as the first man’s family makes arrangements with a different woman.

The full vignettes are available in the SI.

Table 2: The cooperative endeavor, conflict of interest, and private information in each vignette
| Vignette | Cooperative endeavor | Conflict of interest | Private information |
|---|---|---|---|
| Thwarted marriage | Inclusive fitness (parent-offspring) | Parental investment in sibling | Value of arranged marriage with second man |
| Basketball coach | Winning the championship | Coach’s investment in other players; keeping the coach | Did sexual assault happen? |
| Romantic partner | Inclusive fitness (parent-offspring) | Investment in offspring vs. romantic partner | Did physical assault happen? |
| Brother-in-law | Inclusive fitness (parent-offspring) | Investment in child vs. investment in adult sibling and niece | Did sexual assault happen? |

Baseline measures (T1)

After reading the vignettes, participants rated their belief that the signaler was telling the truth (T1 Belief: 0-100) and the likelihood of them helping the signaler, as requested (T1 Action: 0-100). With the thwarted marriage vignette we also asked how they would split the money they had saved for the dowry between their daughters (T1 Divide: 0-100; 50 is equal split). In every instance, the order of the questions was randomized to avoid order effects (Krosnick & Alwin, 1987).

Responses were recorded with sliders because they allow for finer-grained changes than categorical scales (Klimek et al., 2017). Based on findings that a slider’s starting position may bias results (Liu & Conrad, 2019), we were concerned that intermediate starting values would anchor responses, leaving participants less likely to move the slider than their actual views warranted. For this reason, we set each T1 slider to start fully to the left (0). For the exact wording of the question and the labels on the sliders, see Table 4.

We also asked which emotions participants felt the signaler was experiencing, using a multiple-choice question in which they could select as many options as they liked. The options included emotions directly related to the signals (e.g., sad, depressed, and suicidal), states that suggest genuine need (traumatized and violated), states that suggest deception (e.g., devious or jealous), and whether the victim was mentally ill. The complete list can be found in the SI. To further explore the effect of signaling on participants’ inferences of the signaler’s emotional state, we created two new variables: Low mood was the sum of the binary variables Depressed, Distressed, and Sad; and Manipulative was the sum of the binary variables Devious and Jealous.
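The two composite variables can be computed as simple counts of selected options. The sketch below is illustrative only: the function name and the representation of a response as a set of selected labels are assumptions, not the study's actual data format.

```python
LOW_MOOD_ITEMS = ("Depressed", "Distressed", "Sad")
MANIPULATIVE_ITEMS = ("Devious", "Jealous")


def composite_scores(selected):
    """Sum the binary indicators for each composite.

    `selected` is the set of emotion labels a participant ticked
    (hypothetical representation of the multiple-choice response).
    Low mood ranges from 0 to 3 and Manipulative from 0 to 2.
    """
    return {
        "Low mood": sum(item in selected for item in LOW_MOOD_ITEMS),
        "Manipulative": sum(item in selected for item in MANIPULATIVE_ITEMS),
    }
```

A participant who selected Sad, Depressed, Jealous, and Traumatized would therefore score 2 on Low mood and 1 on Manipulative.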

Time 2: Signals

After rating their beliefs and actions, and which emotions they thought the potential victim was experiencing, participants were randomized to either the control condition or one of four emotional signals from the victim (in order of increasing signal cost): (1) control condition: a verbal request without additional signaling; (2) crying; (3) mild depression; (4) depression; and (5) a suicide attempt. The signals involved the participant encountering the victim some time after the adverse event and observing, e.g., crying; sad expressions; reduced effort, fatigue, and poor personal hygiene; and suicidal self-injury. These descriptions did not use the terms depression, depressed, suicidal, or mental health.

The signals were cumulative: crying can be an important feature of depression (for discussion on the relationship between crying and depression see Bylsma, Gračanin, & Vingerhoets, 2020), and depression is a major risk factor for suicide (Bostwick & Pankratz, 2000; Kessler, 2012). Accordingly, components of less-costly signals were included in more-costly signals. Although we use the term ‘signals’ throughout the paper for brevity, we expect that the hypothesized signals of need, like many signals, would also provide information to others in the form of cues (for discussion on the evolution of signals from cues see: Biernaskie, Perry, & Grafen, 2018; Steinkopf, 2015; Tiokhin, 2016). The complete texts of the signals are available in the SI.

Post-treatment measures (T2-T3)

After reading the signaling text, participants answered questions identical to those asked at T1 as the main post-treatment variables of interest (T2). For the Belief, Action, and Divide variables, the slider started at the position where participants had placed it at T1. T2 emotion multiple-choice questions were identical to those used in T1, as were the composite variables Low mood and Manipulative.

As both a validity check and a way to understand the degree to which participants would be willing to help if they believed the signaler completely, at T3 we presented participants with strong evidence that the claims were true. In the US sample, this involved telling participants there was video evidence of the event in question occurring or of a similar event after the fact. With the thwarted marriage vignette, this involved the participant seeing the man their daughter wants to marry trash-talk their daughter and their family. Participants were then asked to rate their likelihood of acting (T3 Action).

Demographic Questions

The final part of the survey was a brief demographic questionnaire which asked for the (1) age, (2) sex, (3) number of siblings, (4) number of sons, (5) number of daughters, (6) current relationship status, (7) highest level of education, and (8) the annual household income of each participant.

Statistical analyses: Preregistered and modified

Our intervention was signal, an ordinal variable with the following preregistered rank order: verbal request (control), crying, mild depression, depression, suicide attempt. We coded this 5-level ordinal factor variable using default 4th-order orthogonal polynomial contrasts. We preregistered a test of our hypothesis that used ordinary least squares (OLS) regression models with the following form:

\[ \begin{aligned} \operatorname{Belief}_{T2} &= \beta_{0} + \beta_{1}(\operatorname{Belief}_{T1}) + \beta_{2}(\operatorname{signal}_{\operatorname{.L}}) + \beta_{3}(\operatorname{signal}_{\operatorname{.Q}})\ + \\ &\quad \beta_{4}(\operatorname{signal}_{\operatorname{.C}}) + \beta_{5}(\operatorname{signal}_{\operatorname{\text{^}4}}) \end{aligned} \]

\[ \begin{aligned} \operatorname{Action}_{T2} &= \beta_{0} + \beta_{1}(\operatorname{Action}_{T1}) + \beta_{2}(\operatorname{signal}_{\operatorname{.L}}) + \beta_{3}(\operatorname{signal}_{\operatorname{.Q}})\ + \\ &\quad \beta_{4}(\operatorname{signal}_{\operatorname{.C}}) + \beta_{5}(\operatorname{signal}_{\operatorname{\text{^}4}}) \end{aligned} \]

We predicted that there would be a statistically significant monotonically increasing effect of the signal on Belief and Action.
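To make the contrast coding concrete, the sketch below builds orthonormal polynomial contrasts for five equally spaced levels by Gram-Schmidt orthogonalization of the powers of the level scores. It is a plain-Python illustration under the assumption of equal spacing, not the code used in the analysis; the function name is hypothetical. The four resulting columns correspond to the linear (.L), quadratic (.Q), cubic (.C), and 4th-order (^4) terms in the models above.

```python
import math


def poly_contrasts(n_levels):
    """Orthonormal polynomial contrasts of degree 1..n_levels-1
    for equally spaced ordinal levels (illustrative sketch)."""
    center = (n_levels - 1) / 2
    x = [i - center for i in range(n_levels)]  # centered level scores
    basis = [[1.0] * n_levels]  # degree-0 (constant) column
    for degree in range(1, n_levels):
        col = [xi ** degree for xi in x]
        # Remove the projection onto every lower-degree column
        for prev in basis:
            dot = sum(c * p for c, p in zip(col, prev))
            norm_sq = sum(p * p for p in prev)
            col = [c - dot / norm_sq * p for c, p in zip(col, prev)]
        norm = math.sqrt(sum(c * c for c in col))
        basis.append([c / norm for c in col])
    return basis[1:]  # drop the constant column


# Levels in rank order: verbal request, crying, mild depression,
# depression, suicide attempt
linear, quadratic, cubic, quartic = poly_contrasts(5)
```

Under this coding, a monotonically increasing effect of the signal would appear mainly as a positive coefficient on the linear term, since `linear` rises in equal steps from the verbal request to the suicide attempt.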

We decided to fit generalized linear regression models instead of OLS, however, for the following reasons. Our pre-test and post-test measures, T1 & T2 Belief and T1 & T2 Action, were all measured on a 0-100 point scale. A substantial number of participants rated their beliefs and actions as exactly 0 or exactly 100 at either T1 or T2. OLS linear regression models are not suitable for a closed and bounded distribution with so many values on the boundary because the residuals would not be normally distributed or have constant variance. For further discussion, see the SI, where we also report the preregistered OLS models.

To test our preregistered hypothesis that the likelihood of acting to help the victim would be largely mediated by a signal’s positive impact on the participant’s belief in the victim’s need, we used the mediation package (Tingley, Yamamoto, Hirose, Keele, & Imai, 2014) to fit a mediation model for Depression treatment vs. Verbal request control. We did the same for Suicide attempt. See Figure 2.

Figure 2: Possible causal effects of the signal on helping behavior. A credible signal of need increases belief that the victim is telling the truth and needs help, which increases the likelihood of helping. The direct path from the signal to action represents other causal effects of the signal on observers, such as perceptions of the victim’s emotional state (and other factors not measured in this study) that might alter observer behavior.
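The logic of the mediation model in Figure 2 can be illustrated with a simple linear decomposition. The sketch below is not the procedure implemented in the mediation package and does not use the study's data: it swaps in the classic product-of-coefficients approach with ordinary least squares, applied to simulated toy data with arbitrary effect sizes. All names and numbers in it are assumptions made for illustration.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def linear_mediation(signal, belief, action):
    """Return (indirect, direct, total) effects from OLS slopes.

    indirect = a * b, where a is the slope of belief on signal and
    b is the slope of action on belief, adjusting for signal.
    """
    s_var, m_var = cov(signal, signal), cov(belief, belief)
    s_m, s_y, m_y = cov(signal, belief), cov(signal, action), cov(belief, action)
    a = s_m / s_var                          # signal -> belief
    total = s_y / s_var                      # signal -> action, ignoring belief
    denom = s_var * m_var - s_m ** 2
    b = (m_y * s_var - s_m * s_y) / denom    # belief -> action, given signal
    direct = (s_y * m_var - s_m * m_y) / denom  # signal -> action, given belief
    return a * b, direct, total


# Simulated toy data (NOT the study's data): 0 = verbal request, 1 = depression
rng = random.Random(0)
signal = [i % 2 for i in range(200)]
belief = [0.4 + 0.3 * s + rng.gauss(0, 0.1) for s in signal]
action = [0.1 + 0.8 * m + 0.05 * s + rng.gauss(0, 0.1)
          for s, m in zip(signal, belief)]
indirect, direct, total = linear_mediation(signal, belief, action)
proportion_mediated = indirect / total
```

In this linear setting the total effect equals the sum of the indirect and direct effects, so `proportion_mediated` expresses how much of the signal's effect on action runs through belief. With bounded outcomes such as those in this study, a model-based approach like the one in the mediation package is the more appropriate choice.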

For specifications of all regression models, see the SI. Our preregistration is here: https://osf.io/g3s6n

Data availability

The data are available at http://doi.org/10.5281/zenodo.4637904

Results

A total of N = 1950 participants started the study by clicking the link to Qualtrics (1213 US and 737 India). After removing participants who did not finish the survey, failed attention checks, or moved through the study at an unrealistic pace (N = 710, 36%), our final sample was N = 1240 (937 US and 303 India), with 759 males and 479 females; 609 were married or in a long-term relationship, 205 were divorced, 414 were single, and 11 were widowed. The median number of participants per condition was 61 (min = 58, max = 67). For the number of participants in each condition, see Table 5. For summary statistics, see Table 3. For the distributions of participants by age, income, and nationality, see Figure 11.

Table 3: Summary statistics for study variables. Values that were on a 0-100 scale were rescaled to 0-1. Indian participants reported income in rupees, which we converted to USD at the current exchange rate (1 rupee = 0.014 USD).
| Variable | N | Range | Mean (SD) |
|---|---|---|---|
| Age (years) | 1240 | 18-81 | 37 (12) |
| Income (USD) | 1235 | 0-840000 | 53000 (60000) |
| Education (years) | 1238 | 11-24 | 16 (2.1) |
| Number of children | 1236 | 0-11 | 1 (1.3) |
| Time to complete (minutes) | 1240 | 1.6-1700 | 13 (50) |
| T1 Belief | 1240 | 0-1 | 0.38 (0.3) |
| T1 Action | 1240 | 0-1 | 0.37 (0.32) |
| T1 Division | 303 | 0.14-1 | 0.58 (0.15) |
| T1 Low mood | 1240 | 0-3 | 1.3 (1.1) |
| T1 Manipulative | 1240 | 0-2 | 1 (0.74) |
| T2 Belief | 1240 | 0-1 | 0.48 (0.33) |
| T2 Action | 1240 | 0-1 | 0.49 (0.35) |
| T2 Division | 303 | 0.14-1 | 0.62 (0.16) |
| T2 Low mood | 1240 | 0-3 | 1.7 (1.1) |
| T2 Manipulative | 1240 | 0-2 | 0.53 (0.7) |
| T3 Action | 1240 | 0-1 | 0.88 (0.22) |

Distributions of beliefs and actions at Time 1

Across the four vignettes, mean belief in the victim (after rescaling the original 0-100 values to 0-1) was relatively low at baseline (T1), Mean = 0.38, albeit with wide variation, SD = 0.3; 118 participants (9.5%) rated their belief as 0, and 38 participants (3.1%) rated it as 1. The distribution of the likelihood of helping the victim (action) was similar, Mean = 0.37, SD = 0.32, with 168 participants (14%) rating their action as 0 and 57 participants (4.6%) rating it as 1. Although not a preregistered prediction, T1 Belief and T1 Action were highly correlated across the four vignettes, consistent with help being worth providing if the signaler’s claims were true (see Figure 3).