A Conceptual Review of Lab-Based Aggression Paradigms

Aggression is often defined as a behavior that is done with the intent to harm an individual who is believed to want to avoid being harmed (e.g., Baron & Richardson, 1994). Accordingly, social scientists have developed several tasks to study aggression in laboratory settings, tasks that we refer to as “lab-based aggression paradigms.” However, because of the legal, ethical, and practical issues inherent in provoking aggression within the confines of a laboratory setting, it is feasible to study only very mildly harmful aggression. The current conceptual review examines the criteria that are necessary to study aggression in a laboratory setting, discusses the strengths and weaknesses of several new and/or commonly used lab-based aggression paradigms, and offers recommendations for the future of lab-based aggression research. Collectively, we hope the current discussion helps researchers to describe the contributions and limitations of lab-based aggression research and, ultimately, helps to improve the informativeness of lab-based aggression research.

Aggression is a common feature of social interactions and, thus, has been a topic of study for social scientists for decades. One valuable approach to isolate and understand the theorized causes of aggression is laboratory-based research, which requires usable and valid methods for measuring aggression in otherwise highly artificial laboratory settings. However, behaviors that are clearly aggressive, such as one person forcefully striking another person with a weapon, are fraught with legal, ethical, and safety considerations for both participants and researchers and would be logistically difficult to allow to occur in a laboratory setting. Intentionally provoking such extreme aggression is, therefore, not a viable option within lab-based research. For these reasons, aggression researchers have developed a repertoire of tasks that purportedly measure aggression, are believed to be safe for participants and researchers, and are ethically and legally permissible within the confines of a laboratory setting. We collectively refer to these tasks as "lab-based aggression paradigms." The overarching goals of the current manuscript are threefold: (1) to delineate the criteria that would be necessary for the behaviors within these lab-based paradigms to be considered aggressive, (2) to discuss the strengths and weaknesses of several extant lab-based aggression paradigms, and (3) to offer recommendations for improving research using lab-based aggression paradigms. We have organized the current manuscript into three major sections that correspond to these goals. The first major section discusses conceptualizations of aggression and their implications for measuring aggression in laboratory settings. The second major section discusses several extant lab-based aggression paradigms. The final major section lays out a series of recommendations that we believe would maximize the contribution of lab-based research to a cumulative and progressive science of aggressive behaviors.

Criteria for Valid Lab-Based Measures of Aggression
The current manuscript is primarily concerned with a conceptual assessment of the construct validity of lab-based aggression paradigms. Construct validity is the degree to which a test measures what it purports to be measuring (Kline, 2009). Essentially, aggression researchers want to know whether participants' behaviors that are enacted within lab-based aggression paradigms meet the criteria for being considered "aggressive." Thus, to appropriately assess the construct validity of lab-based aggression paradigms, researchers need to determine (a) what characterizes an aggressive behavior and (b) whether the behaviors enacted within these paradigms are accurately described as aggressive.

What is an aggressive behavior?
A common definition of aggression, and the definition we use in the current manuscript, is "a behavior done with the intent to harm an individual who is motivated to avoid receiving that behavior" (Baron & Richardson, 1994, p. 7; see also Huesmann, 2003 and Parrott & Giancola, 2007). If one adheres to this definition, any behavior is considered aggressive when it is done (a) with intent to harm the target and (b) with the belief the target wanted to avoid receiving the behavior. A strength of this definition is the intuitive demarcation between intentionally harmful behaviors that are not aggressive (e.g., a dentist who causes pain in the process of pulling a patient's tooth, partners who inflict consensual pain for sexual pleasure) and intentionally harmful behaviors that are aggressive (e.g., punching, yelling).
It also is notable that the extent to which a behavior actually causes harm is not relevant to whether that behavior is considered aggressive. That is, aggression, by Baron and Richardson's definition, is defined based on whether the behavior was intended to cause harm and not whether harm actually occurs as a consequence of the behavior. For example, an individual who punches at another person is behaving aggressively even if that person avoids the punch and avoids receiving any harm. Thus, aggression may or may not cause harm. And behaviors that cause harm may or may not be considered aggressive.
In an attempt to classify the wide range of behaviors to which Baron and Richardson's (1994) definition of aggression might apply, Parrott and Giancola (2007) proposed a taxonomy of how such aggressive behaviors may manifest. Within their taxonomy, aggressive behaviors vary along the orthogonal dimensions of direct versus indirect expressions and active versus passive expressions. For example, a physical fight would be considered a direct and active form of physical aggression whereas not correcting knowingly false gossip would be considered an indirect and passive form of verbal aggression (to the extent the individual believes their inaction will indirectly result in a harmful consequence for the target individual). Because Parrott and Giancola adhere to the definition of aggression proposed by Baron and Richardson, each of these manifestations of aggression is still required to meet the criteria described above. Notably though, Parrott and Giancola's taxonomy subtly modifies Baron and Richardson's definition to include an intentional absence of behavior that is intended, as a consequence of the inaction, to cause harm to a target individual (harm that the target is believed to be motivated to avoid).
Another dimension of aggression is the extremity of the intended harm caused by the behavior. Although not incorporated into Parrott and Giancola's (2007) taxonomy, aggressive behaviors vary in the extent to which the behaviors, if successfully executed to completion, would cause harm to the recipient. Such harm could be the intensity and duration of physical pain inflicted, the extent to which a behavior caused an injury or detrimental outcome, etc. Within their manuscript, Parrott and Giancola provide examples of behaviors that would be located at the same conceptual space in their direct-indirect, passive-active taxonomy and differ greatly in the extremity of the harmfulness. For example, direct-active aggression can range from behaviors that cause relatively severe harm, such as a physical injury (e.g., punching, striking another person with a weapon, etc.), to behaviors that cause extremely mild harm, such as a mildly negative psychological experience (e.g., making a threatening face, adopting an aggressive posture, etc.). Likewise, two aggressive behaviors can be similarly harmful and be located in different conceptual space in their taxonomy. For example, shooting another person with a gun is an extremely harmful direct-active aggressive behavior whereas surreptitiously poisoning another person is an extremely harmful indirect-active aggressive behavior.
Although Baron and Richardson's (1994) definition requires aggressive behaviors to occur with intent and with the belief the recipient wants to avoid the behavior, it is not necessarily implied that "causing harm" is the ulterior motive of the behavior rather than being merely instrumental to achieving some other ends (e.g., Buss, 1961). A person who behaves aggressively does so with the intent to harm the recipient by definition (according to Baron and Richardson), but other motives may have caused the person to behave aggressively in the first place (see also Bushman & Anderson, 2001 for similar arguments). In short, aggression is sometimes an effective strategy to achieve one's goals.
For this reason, Ferguson and Beaver (2009) eschewed Baron and Richardson's (1994) definition and proposed defining aggression as a behavior "to increase one's own position in a dominance hierarchy at the expense of another" (p. 287). Similarly, Tedeschi and Felson's (1994; see also Felson & Tedeschi, 1993) social interactionist perspective conceptualizes aggression as an inherently instrumental behavior that individuals sometimes use to achieve their social motives. As in Ferguson and Beaver's account, in the social interactionist approach individuals may have a proximate goal or intention of causing harm to another individual, but these harmful behaviors must always be considered as a strategy to achieve a more distal social motive. Whereas Ferguson and Beaver solely focus on the motive of ascending a dominance hierarchy from an evolutionary perspective, the motives to behave aggressively described in Tedeschi and Felson's approach are inherently social in nature. For example, the social interactionist perspective argues that aggression can be a strategy to acquire resources, to deter others from acquiring one's resources, to restore one's reputation, to defend oneself or others, etc. Because the social interactionist perspective does not conceptualize the intent to harm another individual as the terminal goal of a behavior, understanding the distal social motives that individuals are trying to achieve provides critical context for accurately understanding why people behave aggressively.
Rather than superseding Baron and Richardson's (1994) definition of aggression, the social interactionist perspective merely emphasizes a different aspect of aggressive behaviors. Specifically, whereas Baron and Richardson provide a definition of what qualities are necessary for a behavior to be classified as aggressive, the social interactionist perspective focuses on what social motives those aggressive behaviors can achieve without delineating the necessary criteria for the behavior to be considered aggressive in the first place. Thus, for the purposes of the current manuscript, behaviors must meet Baron and Richardson's criteria to be considered aggressive. Further, we use the two dimensions of aggressive behaviors outlined by Parrott and Giancola (2007; i.e., the direct-indirect and the active-passive dimension) and discuss a third dimension of the "extremity" of the harmfulness of the aggressive behavior. Finally, consistent with the social interactionist perspectives, we conceptualize aggression as a class of behaviors (out of an entire repertoire of possible behaviors) that individuals use to navigate their social environments and achieve various social motives.
Are the behaviors within lab-based aggression paradigms aggressive?
What do the above conceptualizations of aggression mean for studying aggression in laboratory settings? First, study participants must be able to execute a behavior they believe has the potential to cause harm to another person. Second, participants must behave with the intent to cause harm to another person and believe the recipient wants to avoid receiving the consequences of the behavior. Third, ideally these behaviors collectively should represent the multidimensional nature of aggression. Finally, as per the social interactionist perspective, researchers should consider which motives participants' behaviors may be trying to accomplish.
Are the behaviors within lab-based aggression paradigms harmful?
People's everyday conceptualizations of aggression probably include injurious physical behaviors such as punching, kicking, shooting, stabbing, etc., and harsh verbal behaviors such as berating, scolding, etc. Although these behaviors are clearly harmful, they also are clearly not permissible in laboratory settings. In comparison, examples of "harmful behaviors" that are permissible within laboratory settings include behaviors such as exposing another participant to an unpleasant noise blast and selecting how long another participant must submerge their hand in ice water. Although these latter behaviors may be mildly noxious and unpleasant to experience, they are clearly less extreme than many everyday conceptualizations of aggression.
Notably, we consider extremity of the harm to be a continuous quality of an aggressive behavior. The more harm may be potentially caused, the more aggressive the behavior towards the target. However, it is not a criterion when making the binary decision of whether aggression has occurred: either the behavior has the potential to cause some harm, which makes it eligible for possibly being considered aggressive, or not. If a behavior is on the harmfulness continuum, regardless of the extremity of the harmfulness, the behavior has the potential to be aggressive if the other defining criteria are met. It also is notable that there is a threshold for a behavior to merely be considered "harmful" and a threshold for behaviors that are considered harmful enough to be deemed socially important. A behavior can, in theory, meet the threshold for being considered harmful (and, thus, potentially aggressive) and fail to meet the threshold for being socially relevant.
Unfortunately, exactly what is meant by "harm" is ill-defined in aggression research, which has led to an imprecise boundary between which behaviors are "harmful" and which behaviors are not. For example, if harm is defined to require there to be tissue damage or long-term negative consequences, then the behaviors within lab-based paradigms are not harmful (and by that extension not aggressive either), and likely no behaviors permissible in laboratory settings would be allowed to reach the threshold for being considered harmful. We do not wish to imply that a narrow definition of harm is necessarily flawed. It may merely reflect a researcher's interest in a particular class of behaviors arguably more significant to individuals and society overall. It also comes at the price of not being able to study those behaviors in the safety of a university laboratory (and again, we do not wish to suggest that everything needs to be studied in laboratories). For better or for worse, many psychologists seemingly accept more lenient criteria for determining harm. For example, Parrott and Giancola (2007) provide the example of "stepping on someone's foot by mistake" (p. 282) as an example of a "harmful" behavior that does not meet the criteria for aggression. The implication seems to be that the harm involved in stepping on someone's foot is sufficient to be considered aggressive (if the behavior was done on purpose). Further, several researchers accept the behaviors within lab-based aggression paradigms as instances of aggression; thus, to some, harm seemingly only requires a mildly unpleasant experience or a mildly detrimental outcome without long-term negative consequences. In other words, by sheer virtue of considering, for example, sound blasts to be an instance of aggression, several researchers must assume that the unpleasantness caused by the sound blasts is sufficiently harmful to potentially be aggressive. Consequently, though, the research conducted with those laboratory paradigms may not generalize to those more narrowly defined types of harm discussed above.
The naively obvious solution of requiring more extreme behavior in laboratory settings creates a Catch-22. In laboratory settings, researchers could measure behaviors that are extremely harmful, which would allow for strong inferences that aggression has occurred. However, the lower bound of harmfulness at which behaviors become unambiguously aggressive is likely the upper bound of harmfulness that is permissible within laboratory settings. Aggression researchers therefore strive to have participants exchange the minimum amount of harm necessary to test their hypotheses, which is in direct conflict with the motivation to obtain clear measures of aggression. Alternatively, aggression researchers also strive to have participants exchange the maximum amount of harm that is allowable within the confines of a laboratory (so as to obtain a clear measure of aggression), which is in direct conflict with the goal of not unnecessarily placing participants in harm's way during the course of the study. Thus, lab-based aggression research becomes a difficult balancing act between two conflicting goals: Researchers must minimize the levels of harm that participants exchange while ensuring the harmfulness does not become sanitized from the behaviors altogether.
In summary, we believe harm is a continuous quality of the consequences of behaviors. Some behaviors exhibited in laboratory settings are mildly noxious, and, thus, may occupy the extreme low end of the continuum of possible harmfulness. Critically though, by virtue of merely being on the harmfulness continuum, these mildly harmful noxious behaviors have the potential to meet the criteria for being aggressive so long as the other criteria are met.
Are the behaviors within lab-based aggression paradigms caused by an intent to harm and the belief the recipient wanted to avoid the behavior?
As described above, whether actual harm occurs as a result of the behavior is not relevant to whether that behavior is considered aggressive. Rather, the critical criterion is whether the behavior was done with an intention to cause harm. Tedeschi and Quigley (1999) state that "an intention refers to the proximate goal of an act" (p. 128). Thus, in the context of displayed aggression, an intent to harm simply means that participants' behaviors are purposeful and participants believe their behavior, if executed to completion, will successfully inflict harm on a recipient. In other words, as stated by Anderson and Bushman (1997), "[i]n the laboratory domain, one must be sure that participants understand the dependent variable in the way intended by the experimenter. If delivery of electric shock (or any noxious stimulus) is supposed to measure aggression and only aggression, then the conditions must be set up so that participants believe that the shocks they deliver will harm the victim." (p. 36).
Not surprisingly, the disagreements about the extremity of the actual harmfulness of the behaviors within laboratory research also extend to participants' beliefs about the harmfulness of their behaviors. For example, when discussing the behaviors exhibited within a paradigm where participants (ostensibly) deliver unpleasant sound blasts to one another, Ferguson and Rueda (2009) note that the sound blasts "are obviously (to the participant) not harmful, and so the participant has no real expectation of causing actual harm to another individual, no matter how loud the blasts are set" (p. 133). Here, the authors imply that "harm" only refers to behaviors resulting in a level of extremity that exceeds what is allowable within laboratory settings; therefore, the behavior has no chance of being considered aggressive. Indeed, participants who perceive a stimulus as only slightly unpleasant to themselves might have little reason to believe they could use the same stimulus to, for example, cause tissue damage or excruciating pain. However, as with the actual harm of the behaviors, participants' beliefs about the harmfulness of a behavior also are on a continuum. Participants may believe the sound blasts will cause another person to have a mildly unpleasant experience. Thus, these behaviors seemingly occupy the space where the behavior can result in mild harm, but also not exceed the threshold which would make the study ethically unallowable. Regrettably, participants' beliefs about the harm potential of stimuli they encounter in laboratory paradigms (which one might consider equivalent to a successful manipulation check) are not routinely assessed.
Similarly, some have questioned whether participants believe the recipient of a harmful behavior is really motivated to avoid the behavior (e.g., Ferguson & Rueda, 2009; Tedeschi & Quigley, 1996). After all, if participants believe the harm they are delivering is only mildly noxious, it seems reasonable for them to also believe the recipient would only be mildly motivated to avoid it. Further, the recipient of the harmful behavior is typically another participant who presumably consented to participate in the study and may (from a participant's perspective) be at risk of losing an incentive for prematurely terminating a study. Nevertheless, it seems reasonable to assume that humans are motivated to minimize the unpleasantness of their experience; thus, it is reasonable to assume that participants believe the recipient of the harmful behavior is motivated to avoid the consequences of that behavior, even when it is only mildly noxious.
In summary, merely demonstrating harmful behaviors have occurred is insufficient to claim that a person behaved aggressively. For aggression to occur, behaviors must be assumed to have been caused by a cognitive process that involved an intent to harm the recipient and a belief the recipient wanted to avoid the consequences of the behavior. Within lab-based aggression paradigms, it seems reasonable that participants may believe their behaviors would cause slight discomfort to the recipient and the recipient would be motivated to minimize the amount of discomfort they experience. Thus, the behaviors within lab-based measures of aggression have the potential to be classified as aggressive. Naturally, even though laboratory paradigms may be used to assess aggressive behaviors, their generalizability is limited, among other things, by the level of (potential) harm operationalized.

Do behaviors within lab-based aggression paradigms under-represent aggression?
The first dimension of Parrott and Giancola's (2007) taxonomy is the direct versus indirect nature of the aggressive behavior. In describing the distinction between direct and indirect aggression, Parrott and Giancola state that direct aggression involves "face-to-face interactions in which the perpetrator is easily identifiable by the victim. In contrast, indirect aggression is delivered more circuitously, and the perpetrator is able to remain unidentified and thereby avoid accusation, direct confrontation, and/or counterattack from the target" (p. 287). As we discuss below, most lab-based aggression paradigms lack features of direct aggression. Many of these paradigms involve contrived interactions between participants and a generic "other participant" and the harmful behaviors are typically not face-to-face. Further, participants' behaviors exhibited within lab-based aggression paradigms are often not "directly" transmitted to the recipient of those behaviors, but are asynchronous with the (ostensible) delivery of harm to the recipient. The consequences of participants' behaviors are ostensibly transmitted to the recipient via the features of the study in which they are participating. Therefore, in addition to the aforementioned definitional characteristics of aggression, participants must believe the experimenter will actually execute the harmful behavior at a later point in time. Collectively, the behaviors within lab-based aggression paradigms are rather indirect according to these definitions.
The second dimension of Parrott and Giancola's (2007) taxonomy is the active versus passive nature of the behavior. Active aggression involves an individual who engages in a behavior that results in harm to the recipient. In contrast, passive aggression is characterized by a lack of action that the individual believes will result in a harmful consequence for the recipient. The lab-based aggression paradigms we discuss below all involve behaviors that are considered active.
In summary, within lab-based aggression paradigms, the harmfulness of the behaviors is on the extreme low end of the range of possible harmfulness, participants may believe their behaviors will only cause mild amounts of harm, participants may believe the recipient may only be mildly motivated to avoid the behaviors, and the form of participants' behaviors may only cover a limited amount of the conceptual space of possible forms of aggression. Collectively, the behaviors exhibited in lab-based aggression paradigms seem to be limited and unrepresentative of the multi-dimensional nature of aggression.
What are participants' motives within lab-based aggression paradigms?
Another point about the behaviors exhibited within lab-based aggression paradigms is which motives participants may be trying to achieve. These motives are important to consider because they may not match the researchers' beliefs about participants' motives. This mismatch may lead to researchers' erroneous interpretations of participants' behaviors.
Within lab-based aggression paradigms, participants' behaviors are constrained to a limited set of responses (e.g., Tedeschi & Quigley, 1996, 1999). It is not uncommon for participants to "interact" with another participant and only be given the option to deliver some (mild) degree of harm (sometimes including no harm). This creates an impoverished and limited representation of social interactions that occur outside of the lab. Within lab-based aggression paradigms, participants are often not provided with opportunities to, for example, de-escalate a situation, compromise with their interaction partner, or warn their partner about their impending experience of a harmful stimulus; they are only allowed to exhibit either no harm or some degree of harm. However, these non-aggressive behaviors are all tactics that may occur in natural "real-world" interactions. Even in paradigms where some kind of interaction ostensibly occurs, participants usually do not interact with another participant, but actually with a nonresponsive computer program. In this way, participants' behaviors may be artificially forced onto the continuum of harmful behaviors, which may make it ambiguous as to whether participants truly intended their behaviors to be harmful or whether their behaviors were simply caused by a lack of alternative response options. However, researchers may still interpret the behavior, occurring in an "aggression study" using a "lab-based aggression paradigm," as "aggressive." For example, suppose that a participant receives a seemingly unprovoked insult from an interaction partner and wants to assert that such unprovoked insults are not acceptable. Outside of the lab, the participant may engage the interaction partner in a conversation. Inside of the lab, if the only possible channel of communication with their interaction partner is, for example, via sending a series of noxious sound blasts back-and-forth, the participant may try to communicate their disapproval of the insult by sending their interaction partner a noxious sound blast. The participant may not intend to harm their interaction partner and may even have preferred an alternative, yet unavailable, means of communication. Even attempts to control the situation (e.g., one very loud blast as a deterrent) or de-escalate it (e.g., a series of very mild blasts) will only be met by an indifferent pre-programmed pattern or randomization function. Nevertheless, the researcher may erroneously interpret any instance of participants' sound blast selection as an instance of aggression merely because it was an observed behavior that occurred within a "lab-based aggression paradigm." This is essentially taking a complex social interaction, severely restricting participants' options on how they can behave, and then (mis)interpreting the observed behaviors within a narrow conceptualization of participants' motives.
Similarly, participants may have a motive to conform to (what they intuit to be) the study's hypotheses. For example, a participant who receives a seemingly unprovoked insult immediately before being given an opportunity to harm another person may intuit they are in an aggression study (particularly if the experimenter is known to study human aggression). This hypothetical participant may reason the easiest route to getting their compensation and leaving the lab is to "act aggressively" and not admit any suspicion. Here, the participant has a motive to fulfill the study requirements and be compensated. Their behavior on the lab-based measure of aggression is a means to achieving that motive, but the behavior would not be aggressive.
Thus, to properly understand participants' behaviors, it is important to consider the requirements of the study that are both explicitly and tacitly communicated to participants, participants' experiences within the study, as well as their social motives. These factors will dictate why participants choose their behaviors. Sometimes participants will behave aggressively to achieve a social motive within the parameters of the lab-based aggression paradigm. In this case, the lab-based aggression paradigm is accurately measuring participants' aggression. However, participants also may behave in the same manner, but the behaviors would not meet the criteria for being considered aggressive. For this reason, it is important to consider the possible response options that participants are provided. If participants' response options are limited, they may try to use those limited response options to achieve a wider range of motives than the researcher considers when interpreting their observations. In this case, the lab-based aggression paradigm would not be accurately measuring participants' aggression.

Extant Lab-Based Aggression Paradigms
With the criteria for a behavior to be considered aggressive described above, the following sections discuss several lab-based aggression paradigms. These paradigms were selected because we believe (a) they are the most commonly used paradigms in contemporary lab-based aggression research or (b) they are notable examples of lab-based aggression paradigms that have been used in prior research (see Table 1 for a summary).
The selected paradigms have been used in a wide range of fields and research domains. Many are part of the social psychologist's toolbox and, as such, are often found in studies of the interaction between persons and social or situational cues, such as the effects of violent video games (Anderson & Dill, 2000; Saleem, Anderson, & Gentile, 2012) or responses to ostracism (DeWall, Twenge, Gitter, & Baumeister, 2009; Warburton, Williams, & Cairns, 2006) and provocation (Finkel, DeWall, Slotter, Oaten, & Foshee, 2009). But they are also used in clinical research, for example, to study the effects of the consumption of alcohol (Pedersen, Vasquez, Bartholow, Grosvenor, & Truong, 2014) or pharmaceuticals (Weisman, Berman, & Taylor, 1998), or the social and cerebral responses in criminal psychopaths (Veit, Lotze, Sewing, Missenhardt, Gaber, & Birbaumer, 2010). Thus, arguably, the question of the extent to which they meet the definitional criteria of aggression by Baron and Richardson (1994) is relevant for a large body of literature.

Competitive Reaction Time Task
What is it?
One of the most commonly used lab-based measures of aggression is the Competitive Reaction Time Task, which is a modified version of the Taylor Aggression Paradigm (e.g., Taylor, 1967). Within this task, participants are ostensibly competing with another participant to react quickly to stimuli that are shown on a screen in a multi-round game. Typically, the other participant does not exist, but it is necessary for participants to believe they are playing against another person. Prior to each round, participants select a noise blast intensity (i.e., loudness and sometimes duration) that will possibly be delivered to their competitor. The participant who is fastest to react to a stimulus ostensibly "wins" that round. Participants who "win" a round send the noise blast (with the intensity settings previously selected) to their competitor. Participants who "lose" a round are exposed to the noise blast that was chosen by their competitor. Typically, the researcher sets in advance, for each round, the ostensible "winner" and the intensity of the sound blast sent by the competitor.
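The round structure described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the software used in any cited study: the names `Round`, `RoundResult`, and `run_session` are hypothetical, and intensities are treated as arbitrary integer settings. The point it illustrates is that the "winner" and the competitor's blast are fixed in advance, so the participant's reaction time never enters the outcome.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Round:
    """One pre-programmed round (hypothetical structure)."""
    participant_wins: bool   # fixed in advance by the researcher
    opponent_intensity: int  # blast ostensibly chosen by the fictitious competitor


@dataclass
class RoundResult:
    selected_intensity: int  # setting the participant chose before the round
    blast_sent: int          # 0 when the participant "loses"
    blast_received: int      # 0 when the participant "wins"


def run_session(schedule: List[Round], selections: List[int]) -> List[RoundResult]:
    """Apply the pre-set schedule to the participant's per-round selections."""
    if len(schedule) != len(selections):
        raise ValueError("need exactly one selection per scheduled round")
    results = []
    for rnd, chosen in zip(schedule, selections):
        if rnd.participant_wins:
            # Participant "wins": the selected blast is ostensibly delivered.
            results.append(RoundResult(chosen, blast_sent=chosen, blast_received=0))
        else:
            # Participant "loses": they hear the pre-programmed blast instead.
            results.append(
                RoundResult(chosen, blast_sent=0, blast_received=rnd.opponent_intensity)
            )
    return results


# Example: a two-round schedule in which the participant wins, then loses.
example = run_session([Round(True, 5), Round(False, 7)], selections=[3, 8])
```

Note that the selected intensity is recorded on every round, whether or not it is ostensibly delivered, because the selection itself is the behavior of interest.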

What is the harmful behavior?
The harmful behavior is quantified as the intensity (loudness and sometimes the duration) of the sound blasts selected during the task. It is generally the case that louder and longer-duration sound blasts are considered to be more harmful behaviors. However, a large number of quantification strategies for computing an aggression score from Competitive Reaction Time Task data are found across studies (e.g., Elson, 2016; Elson et al., 2014; Ferguson, 2013). As such, there is no standardized procedure to analyze data recorded from the task, and there is little evidence that any one of the multiple variants is a better operationalization of aggression than the others.
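To make the consequence of unstandardized scoring concrete, the following minimal Python sketch computes three aggression scores from the same trial data. Everything in it is an assumption made for illustration: the two participants' settings are invented, the 0-10 scales are hypothetical, and the three scoring rules are generic examples rather than the strategies of any particular study.

```python
from statistics import mean

# Invented data: one (loudness, duration) pair per round, each on a
# hypothetical 0-10 scale. These are not values from any real study.
participant_a = [(8, 2), (3, 3), (4, 4), (5, 5)]
participant_b = [(4, 8), (6, 6), (6, 4), (6, 2)]

def mean_loudness(trials):
    """Average loudness setting across all rounds."""
    return mean(loud for loud, _ in trials)

def first_round_loudness(trials):
    """Loudness setting chosen on the first round only."""
    return trials[0][0]

def mean_loudness_times_duration(trials):
    """Average of loudness multiplied by duration across all rounds."""
    return mean(loud * dur for loud, dur in trials)

# Illustrative scoring rules (assumed, not exhaustive).
strategies = {
    "mean loudness": mean_loudness,
    "first-round loudness": first_round_loudness,
    "mean loudness x duration": mean_loudness_times_duration,
}

for name, score in strategies.items():
    a, b = score(participant_a), score(participant_b)
    more_aggressive = "A" if a > b else "B" if b > a else "tie"
    print(f"{name}: A={a}, B={b}, higher score: {more_aggressive}")
```

In this made-up example, which participant counts as "more aggressive" depends on the scoring rule that is applied, even though the underlying responses are identical.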
Although the interaction is not face-to-face, the behaviors and the delivery of the harm occur at approximately the same time (there is a short delay between the sound blast selection and the completion of each round, which is when the sound blast is ostensibly delivered). Further, participants deliver the sound blast to their interaction partner via the ostensible connection between computers. For these reasons, we consider the behavior to be "active" and fairly "direct" (Parrott & Giancola, 2007).

What features of the task lead researchers to infer that participants' aggressive cognitions caused the behavior?
Prior to each round, participants intentionally choose the loudness and, if applicable, the duration of the sound blasts. To the extent the cover story is successful, participants believe their sound blast selections are delivered to their competitor at the conclusion of each round. One of the ways researchers attempt to strengthen the validity of their inferences regarding the participants' motives during the Competitive Reaction Time Task is by having the participants report, after task completion, their motives for selecting the sound blasts. For example, participants may report whether they selected sound blasts with the intent to aggress towards their competitor (e.g., Anderson et al., 2004). To minimize the extent to which participants intuit the hypotheses of the study, these questions are embedded within a questionnaire with several other motives about sound blast selection (i.e., participants are not only asked about their aggressive motives). One consideration when using retrospective self-reported motives is that their validity rests on the assumption that participants are able and willing to accurately report the motives they had at an earlier point in time.

Cold Pressor Task
What is it?
In the Cold Pressor Task, participants believe they are completing a study about how distracting or unpleasant stimuli affect performance on a cognitive task. This cover story merely provides a rationale for the responses participants will give later in the study. At some point in the study, participants believe they will select how long (typically between 0 and 80 seconds) another participant will hold their hand in ice-cold water.
What is the harmful behavior?
The harmful behavior in the Cold Pressor Task is the duration the participant chooses for the other person to hold their hand in ice water. Because the longer time selected ostensibly corresponds to the other person's exposure duration to the unpleasant stimulus, longer exposure duration selections are interpreted as more aggression. However, depending on the specific wording of the cover story, it may be implied that the behavior is merely distracting, and not harmful. The harmful behavior is typically not face-to-face, the behavior is asynchronous with the ostensible harmful experience, and the harm is ostensibly delivered to the recipient via the experimenter. For these reasons, we consider the behavior to be "indirect" and "active" (Parrott & Giancola, 2007).

What features of the task lead researchers to infer that participants' aggressive cognitions caused the behavior?
Behaviors during the Cold Pressor Task are intentionally selected by participants and participants are aware of the contingency between their responses and the other person's presumed experience. The aggressive cognitions in the Cold Pressor Task are inferred when participants believe that holding one's hand in the ice water would be an unpleasant experience and they believe the other participant must hold their hand in the water for the assigned amount of time. Therefore, participants must believe their responses correspond to the extent to which the other participant will have an unpleasant experience.
Researchers can enhance their inferences about these aggressive cognitions by having participants hold their hand in ice water prior to making their decision (e.g., Pederson et al., 2014). Having participants feel the ice water serves two purposes. First, it helps ensure participants believe the cover story that another participant will hold their hand in ice water. By actually having participants hold their hand in ice water, participants no longer have to doubt certain aspects of the cover story such as whether the researchers actually have ice water and whether participants will be holding their hand in ice water as part of the study. Second, this methodological feature helps ensure participants believe that holding one's hand in ice water is an unpleasant experience (to the extent this is an unpleasant experience for the participant).

Tangram Help/Hurt Task
What is it?
Tangrams are puzzles that consist of geometric shapes that can be arranged to form a target shape. In the Tangram Help/Hurt Task, participants determine which 11 of 30 possible Tangrams another participant must complete and, if successful in completing the assigned Tangrams, the "other participant" ostensibly has an opportunity to win a prize. Participants are informed that the Tangrams have been pretested based on difficulty levels so that each Tangram is either easy, moderate, or difficult to complete. Because participants must select 11 out of 30 possible Tangrams, participants cannot assign only easy Tangrams or only difficult Tangrams. As with the other lab-based measures of aggression, the other participant typically does not exist, but it is necessary for participants to believe the other person exists.

What is the harmful behavior?
The harmful behavior is the number of difficult Tangrams a participant assigns to the other participant. Assigning more difficult Tangrams ostensibly reduces the likelihood the other participant will obtain a desired goal. Therefore, selecting more difficult Tangrams is interpreted as a greater impediment and, hence, a more aggressive behavior.
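As a minimal sketch of this scoring rule, the Python snippet below counts the difficult Tangrams in one participant's assignment. The even split of the 30 Tangrams into 10 easy, 10 moderate, and 10 difficult puzzles is an assumption made for illustration, as are the function name and the difficulty labels.

```python
from collections import Counter

REQUIRED = 11   # number of Tangrams each participant must assign
PER_LEVEL = 10  # ASSUMPTION: 30 Tangrams split evenly across three levels
LEVELS = ("easy", "moderate", "difficult")

def hurt_score(selection):
    """Return the number of difficult Tangrams in a list of difficulty labels."""
    counts = Counter(selection)
    if len(selection) != REQUIRED:
        raise ValueError("exactly 11 Tangrams must be assigned")
    if any(level not in LEVELS for level in counts):
        raise ValueError("unknown difficulty label")
    if any(counts[level] > PER_LEVEL for level in LEVELS):
        raise ValueError("more Tangrams assigned than exist at one level")
    return counts["difficult"]

# Hypothetical assignment: every difficult Tangram plus one moderate Tangram.
selection = ["difficult"] * 10 + ["moderate"]
print(hurt_score(selection))
```

Under the assumed even split, the score can range from 0 to 10, and an assignment of 11 difficult Tangrams is rejected because only 10 exist.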
The harmful behavior is not face-to-face, the behavior is asynchronous with the ostensible harmful experience, the harm is ostensibly delivered to the recipient via the experimenter, and the harm is based on the probability that the difficult Tangrams will impede the recipient from attaining a desirable outcome. For these reasons, we consider the behavior to be "indirect" and "active" (e.g., Parrott & Giancola, 2007).
What features of the task lead researchers to infer that participants' aggressive cognitions caused the behavior?
The inference of aggressive cognitions in the Tangram Help/Hurt Task rests on several assumptions. First, participants must believe another participant exists. Second, participants must believe the other participant desires the outcome that can be obtained by successfully completing the Tangrams. Third, participants must believe their selection of difficult Tangrams effectively reduces the other participant's likelihood of obtaining the desired outcome. In addition, establishing that participants do not have other motives for selecting difficult Tangrams is paramount. For example, participants may believe the prize ought to be "earned," and, thus, they may select difficult Tangrams to ensure that nobody undeservedly wins a prize. Evidence for participants' motives in their Tangram selections can be solicited by asking participants to report the extent to which they selected Tangrams with the goal to make it difficult for the other participant to obtain the desired outcome (e.g., Saleem et al., 2015).

Hot Sauce Paradigm
What is it?
In the Hot Sauce Paradigm participants believe they are in a study about food preferences. At some point during the study, participants are told to prepare food for another participant. The experimenter informs participants that a necessity of the study is that the experimenter is blind to certain aspects of the food preparation and the participant will select how much hot sauce another participant will consume. Participants pour hot sauce into a cup and believe that the other participant will be required to consume the entire contents of the cup. Typically, this other participant does not exist.

What is the harmful behavior?
The harmful behavior is the amount of hot sauce (and sometimes the level of hotness when there are several sauces to pick from) that participants dole out for the other participant to consume. More hot sauce (or hotter sauce) is interpreted as a more aggressive behavior.
The harmful behavior is not face-to-face, the behavior is asynchronous with the ostensible harmful experience, and the harm is ostensibly delivered to the recipient via the experimenter. For these reasons, we consider the behavior to be "indirect" and "active" (e.g., Parrott & Giancola, 2007).

What features of the task lead researchers to infer that participants' aggressive cognitions caused the behavior?
Participants must believe that consuming hot sauce is an unpleasant experience for the other participant. This is typically enhanced with a cover story in which the participant learns the other participant does not like spicy foods. With this cover story, participants are knowingly giving food the other participant does not prefer.
Participants must further believe the other participant will (have to) eat the food they prepared regardless of the amount of hot sauce and potentially against their food preference instead of simply refusing to consume it after trying a first bite (which someone might normally do when food given to them is unpalatable and the situation is such that one cannot be forced to consume food). In some studies, participants are told the other person will "eat every drop of the given sauce," but it is unclear whether this sufficiently convinces them the other participant is unable to successfully avoid their aggressive behavior.

Negative Evaluation Task
What is it?
In the Negative Evaluation Task, participants are given an opportunity to evaluate the researcher on their performance during the study. Presumably, the participants' feedback is requested to help determine whether the researcher will obtain a desired position such as a competitive research assistantship (e.g., DeWall, Twenge, Gitter, & Baumeister, 2009). Effectively, participants are provided an opportunity to impede the researcher's likelihood of obtaining a desired goal.
What is the harmful behavior?
Participants' negative evaluations would negatively affect the likelihood the researcher would obtain the desired position. More negative evaluations are interpreted as more of an impediment to a desired goal and, thus, would be considered as a more aggressive behavior. If the evaluation is not made on a numerical scale, but as a written performance review, the valence of the evaluation is coded by one or multiple raters.
The harmful behavior is not face-to-face (although the participant meets the target during the study), the behavior is asynchronous with the ostensible harmful experience, and the harm is based on the probability that the evaluations will impede the recipient from attaining a desirable outcome. For these reasons, we consider the behavior to be "indirect" and "active" (e.g., Parrott & Giancola, 2007).

What features of the task lead researchers to infer that participants' aggressive cognitions caused the behavior?
Participants must believe that the researcher wants to obtain the position, and they must believe their evaluations will adversely affect the likelihood the researcher will obtain the desired goal. Whereas the former belief is easily communicated, the latter might not be credible to every participant, depending, for example, on their knowledge of university hiring policies or labor law.

Uncomfortable Pose Task
What is it?
To our knowledge, the Uncomfortable Pose Task has been used only once in a published study. Finkel, DeWall, Slotter, Oakten, and Foshee (2009) had undergraduates who were members of a romantic couple take part in a study wherein they selected how long their partner would have to hold several uncomfortable yoga poses. This was accomplished by having participants assign a length of time between 5 seconds and 120 seconds for each physically uncomfortable position that their partner would ostensibly have to hold.

What is the harmful behavior?
The harmful behavior is the length of time participants select for their partner to hold the physically uncomfortable yoga poses. A longer selected time is presumably associated with more discomfort and, thus, is interpreted as more aggression.
Although the harmful behavior is not face-to-face, the one instance of this paradigm being used (i.e., Finkel et al., 2009) involved romantic couples and, thus, was not anonymous. Nevertheless, the behavior is asynchronous with the ostensible harmful experience and the harm is ostensibly delivered to the recipient via the experimenter. For these reasons, we consider the behavior to be "indirect" and "active" (e.g., Parrott & Giancola, 2007).

What features of the task lead researchers to infer that participants' aggressive cognitions caused the behavior?
The Uncomfortable Pose Task rests on the assumption that participants believe they are actually selecting how long another participant has to maintain a physically uncomfortable position. Because the task is framed as "yoga" poses, it is necessary to ensure that participants' selections are motivated by a desire to make the recipient uncomfortable. Finkel et al. (2009) asked participants how interested they believed their partner was in yoga and the extent to which their partner had been involved in yoga-related activities in the past. Responses to these items were then statistically accounted for when analyzing the length of time participants selected. Finally, the participant must believe that assuming the yoga poses does more harm than good; even positions that feel uncomfortable to a novice could be beneficial or healthy (e.g., by resulting in greater physical fitness). 1

Voodoo Doll Task
What is it?
Although the Voodoo Doll Task does not meet the criteria for an aggressive behavior (as will be described below), this task has been used in several recent aggression studies (e.g., DeWall et al., 2013; Slotter et al., 2012). During a study, participants are presented with either an actual Voodoo Doll or a visual representation of a Voodoo Doll and are told to imagine the doll represents another person. Participants are then given an opportunity to "stick pins" into the doll, either by physically sticking pins into the doll (if a real doll is used) or by reporting how many pins they would like to stick into the doll (if the visual representation of a Voodoo Doll is used). The scoring of the task is straightforward: The number of pins is counted, and a higher count of pins used is interpreted as more intent to inflict harm.
What is the harmful behavior?
The "harmful" behavior in the Voodoo Doll Task is the use of pins that symbolically harms the person who is imagined while completing the task. Because most participants will not believe that the execution of their behavior will cause real harm to another individual, the behaviors performed during the Voodoo Doll Task do not meet the criteria for being aggressive. For this reason, studies using the Voodoo Doll Task often refer to the underlying measured construct as "aggressive inclinations."

What features of the task lead researchers to infer that participants' aggressive cognitions caused the behavior?
The Voodoo Doll Task rests on the assumption that participants can easily project characteristics onto the doll. Thus, if participants imagine the doll represents a specific individual and participants choose to symbolically harm that individual, this is presumably psychologically similar to participants' actually inflicting harm onto the imagined individual. However, it is currently unclear to what extent researchers may infer participants' behavioral aggression from the inclinations supposedly exhibited in the Voodoo Doll Task.

Moving Forward
Temper claims of generalizability
Some, but not all, of the extant lab-based aggression paradigms may create conditions in which participants' behaviors can be aggressive (albeit aggression that only has the potential to cause extremely mild harm). Commonly, the successful implementation of these conditions is not substantiated with empirical data (e.g., manipulation check type queries), and it remains debatable whether some laboratory paradigms supposedly measuring aggression are instead operationalizing, for example, competitiveness. We emphasize that meeting the definitional criteria proposed by Baron and Richardson (1994) and Parrott and Giancola (2007) would be necessary, but not sufficient, for laboratory paradigms to successfully measure aggressive behavior that is relevant to more extreme aggressive behaviors. Proper validation studies are necessary once those procedures are established.
Further, as stated by Tedeschi and Quigley (1996), "[l]aboratory research on aggression has, at best, studied only a small portion of the multifarious phenomena of human aggression" (p. 174). We wholeheartedly concur. As argued above, the behaviors exhibited within such paradigms collectively under-represent the multi-dimensional nature of aggressive behaviors. The harmfulness of the behaviors that are permissible in laboratory settings is on the low end of the range of possible harmfulness, participants likely believe their behaviors are only mildly harmful, and likely believe that recipients are only mildly motivated to avoid the behaviors. Thus, the behaviors exhibited within lab-based measures of aggression are about as representative of all aggression as college students are representative of all people.
Further, the harmful behaviors exhibited within lab-based aggression paradigms are often contrived and are nothing like typical social behaviors in participants' day-to-day lives. Even staunch proponents of the validity of lab-based measures of aggression readily admit that "real-world" aggression (e.g., punching another person) shares few surface features with laboratory aggression measures (e.g., delivery of a sound blast; Anderson & Bushman, 1997). A great deal of work is needed to ensure the behaviors exhibited in lab-based measures of aggression are not only mutually agreed upon as measures of aggression by researchers, but are actually informative about real-world aggression.
One implication is that researchers should acknowledge that the behaviors exhibited in lab-based aggression paradigms under-represent the broader class of aggressive behaviors and be cautious in their generalizations to aggression outside of the lab (for a different view, see Bushman & Anderson, 1997). For example, if researchers use a lab-based aggression paradigm where participants exhibit direct-verbal aggression, these results may, if anything, only be directly informative about mild direct-verbal aggression in the "real world." Similarly, if researchers use a lab-based aggression paradigm where participants exhibit mild indirect-physical aggression, these results may only, at best, be directly informative about mild indirect-physical aggression in the "real world." Only when there is converging and replicable evidence from several different lab-based aggression paradigms may researchers tentatively make claims about "aggression" as a general concept. However, because the extant lab-based aggression paradigms do not exhaustively cover the entire possible range of conceptual space that aggressive behaviors can occupy, it seems that such global claims are not currently possible, although such unwarranted generalizations are frequently observed in the scientific literature.

Consider participants' motives
Our conclusions also concur with Tedeschi and Quigley's in that the motives for participants' aggression are often not emphasized when interpreting participants' behaviors. At best, this lack of emphasis on motives may lead to an under-development of theories of aggression. Thus, we echo the suggestion for aggression researchers to increase the measurement of participants' intentions during interactions and of the motives those intentions serve (e.g., Giancola & Chermack, 1998; Tedeschi & Quigley, 1996). For example, it is well-established that provocation increases aggressive behaviors (e.g., Pederson et al., 2014). But demonstrating that provocation increases aggression does not provide any information about why provocations increase aggression. Are participants trying to restore their reputations? Are participants trying to be assertive in hopes of stopping the situation from escalating further? Are participants trying to enforce a social norm of how strangers should behave towards one another?
Embrace "open science" practices
Finally, we offer several recommendations for those who use lab-based aggression paradigms. First, the quantification strategies for the data from several lab-based aggression paradigms are currently unstandardized. When coupled with the flexibility that may be involved with other aspects of studies that use those paradigms (e.g., operationalizations of variables, data collection stopping decisions, omission of participants, etc.), the proliferation of "researcher degrees of freedom" is staggering and is a serious impediment to scientific progress. A lack of standardization makes it ambiguous as to why any specific quantification strategy was selected, does not provide information about whether the theoretical conclusions would change if other quantification strategies were chosen, and significantly hinders the ability to accumulate evidence across different studies (even if those studies are identical in every other way). We hope that aggression researchers take this lack of standardization seriously and adopt standard uses for each lab-based aggression paradigm. Additionally, and complementary to standardization, we strongly advocate for researchers to pre-register their hypotheses and analysis plans. Pre-registration communicates that the hypotheses and analytic strategy were determined independently of the obtained results. Even if the validity of a particular measure is debatable, pre-registration at least ensures that researchers are debating data that were generated under known circumstances, which helps to focus continuing areas of disagreement onto other features of the data.
Second, we encourage aggression researchers to share their data and stimuli for other researchers to (re)use and scrutinize. Although several of these paradigms require a great deal of experimenter skill in, for example, successfully selling the cover story to participants, the sharing of stimuli helps to standardize some portion of these lab-based aggression paradigms. Similarly, the sharing of data allows other researchers to validate published results (i.e., testing for analytic reproducibility) and test the robustness of researchers' claims (e.g., testing whether the conclusions are robust to alternative and justifiable analytic decisions). Shared data also allow researchers to explore relationships between variables and to use this information in the planning of their own research. We believe that such transparency in the research process will enhance a cumulative science of aggressive behaviors.
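One way to picture such a robustness test on shared data is the minimal Python sketch below, which recomputes a condition difference under every combination of two scoring rules and two exclusion rules. The data set, the condition labels, the "suspicious" flag, and the specific rules are all invented for illustration and do not describe any published analysis.

```python
from itertools import product
from statistics import mean

# Invented shared data: condition, per-round loudness choices (hypothetical
# 0-10 scale), and whether the participant voiced suspicion about the cover story.
data = [
    {"cond": "provoked", "loud": [9, 4, 4, 4], "suspicious": False},
    {"cond": "provoked", "loud": [8, 5, 4, 4], "suspicious": True},
    {"cond": "control", "loud": [3, 6, 6, 7], "suspicious": False},
    {"cond": "control", "loud": [4, 6, 6, 6], "suspicious": False},
]

# Two assumed scoring rules and two assumed exclusion rules.
scorings = {
    "mean of all rounds": lambda loud: mean(loud),
    "first round only": lambda loud: loud[0],
}
exclusions = {
    "keep everyone": lambda p: True,
    "drop suspicious": lambda p: not p["suspicious"],
}

def condition_difference(score, keep):
    """Mean score of provoked minus control participants for one specification."""
    kept = [p for p in data if keep(p)]
    by_cond = {
        cond: mean(score(p["loud"]) for p in kept if p["cond"] == cond)
        for cond in ("provoked", "control")
    }
    return by_cond["provoked"] - by_cond["control"]

# Evaluate every combination of scoring rule and exclusion rule.
results = {
    (s_name, e_name): condition_difference(score, keep)
    for (s_name, score), (e_name, keep)
    in product(scorings.items(), exclusions.items())
}

for spec, diff in results.items():
    print(spec, round(diff, 2))
```

In this invented data set the sign of the condition difference depends on the scoring rule, which is the kind of fragility that only becomes visible when data are available for reanalysis.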

Conclusion
We acknowledge the difficulties in measuring complex behaviors in the lab, and we strongly advocate for lab-based research being a critical component of a multifaceted and robust understanding of aggression. However, we also want to ensure the future of aggression research is progressive and that lab-based research significantly contributes to this progression. This does not mean producing more publications that include lab-based aggression paradigms; this means ensuring our lab-based studies are actually informative about incredibly complex behaviors that occur in the "real world." We believe this progress cannot occur without a frank and open discussion of the characteristics and limitations of the tools at our disposal.
Note 1 Or, in the authors' experience, particularly those.

Author Note
This manuscript was inspired by an ongoing Twitter conversation between the authors. Although spirited at times, there were no instances of interpersonal aggression within the process of writing the manuscript.

Funding Information
The authors declare that they received no funding for the completion of this manuscript.

Competing Interests
The authors have no competing interests to declare.

Author Contributions
• Contributed to the conception: RJM, ME
• Drafted and/or revised the article: RJM, ME
• Approved the submitted version for publication: RJM, ME