Action accounts of police-civilian interactions: Using video elicitation to explore police officers' how-to knowledge

Police work relies fundamentally on non-declarative how-to knowledge, such as embodied skills (Lizardo, 2017). While there is a longstanding tradition of research on forms of police culture, knowledge, and narratives, insights from cultural sociology have only recently been introduced in this tradition. Forms of police culture are predominantly studied through ethnographies and interviews to arrive at the (re)creation of meanings and experiences in situ, although recently, other researchers have introduced video-analysis to understand the situational dynamics of police work. Whereas the former methodology does not allow for showing sequences of bodily action, the latter does not focus on social meanings and officers’ lived experiences. This article addresses this problem by combining narrative and visual techniques through video elicitation to explore police officers’ bodily knowledge of how and when to act. In 24 video interviews I watched, discussed, and examined video footage with Dutch police officers who participated in violent situations recorded on camera. This method reveals how officers read bodies to generate incentives for taking action. Theoretically, this article draws attention to bodily action knowledge which has received scarce attention in cultural sociology and policing studies alike. It contributes to cultural sociology more generally by demonstrating that collective bodily know-how is learned and plays an important role in collective co-creation of situations. I conclude by discussing how analyzing violent situations through examining videos with those recorded in these events allows us to make explicit embodied understanding and knowing, thus furthering our understanding of situated, in this case police, action.


Introduction
Police work relies fundamentally on non-declarative how-to knowledge, such as embodied skills (Lizardo, 2017). While there is a longstanding tradition of research on forms of police culture, knowledge, and narratives, studies of policing have only recently begun to include sociological concepts, such as repertoires, to study meaning-making processes and officers' situational acts (Cockcroft, 2013; Noppe, 2015). However, policing scholars generally neglect a form of culture that is crucial in police-civilian encounters: embodied how-to knowledge. My key argument is that police officers share bodily knowledge of how and when to engage in action collectively. Yet conventional observational and narrative research methods often obscure embodied mastery as police common sense, even though it is vital for understanding the physical aspects of police work.
Police officers are expected to know what to do, how to do it, and when to do it. Importantly, knowledge not only exists in discursive resources or talk, it also resides in the body (Ignatow, 2007). Drawing on Bourdieu's concept of "habitus" (1990), in which socially ingrained cultural knowledge and dispositions are embodied, scholars emphasize that what people know is manifest in what they do, in their bodily practices and habits through skills, wording, and gesturing (Yakhlef, 2010, p. 423), and in embodiments like seeing, emotional arousal, and touching (Küpers, 2005). The problem is not that embodied forms of knowledge are hidden or "tacit" (Polanyi, 1966) and therefore difficult to grasp for the researcher. Instead, the issue is that police officers largely draw upon narratives and attitudinal cultural statements, i.e. police discourse, about what it means to be and act as an officer. This, in turn, inhibits them from explicating their how-to knowledge (see Schütz, 1946). I use the term "how-to knowledge" to escape the prevailing body-mind dualism in social research, and to reiterate that knowledge is not contingent upon a carrier such as the mind or the body (Shilling, 2012). More accurately, bodily knowledge is in the action itself. Unfortunately, research methods for studying action remain limited to interviews and observation. We thus need a methodological tool that encourages officers to explicate bodily action, that is, to verbalize their shared bodily know-how, which allows us to understand why and how officers act situationally. This gives rise to a new question: how should police officers' how-to knowledge be studied?
In this article, I propose that watching, discussing, and examining video footage with police officers who participated in violent situations recorded on camera provides a fruitful approach to elicit police embodied knowledge on how and when to do things. The aim of this article is twofold. First, to illustrate that analyzing violent situations together with officers who participated in these situations creates a specific "communicative situation" in which they, as protagonists, verbalize their actions in ways that get us close to the know-how of bodily action. Second, to show how these verbalizations of embodied knowledge are part of police culture. In what follows, I demonstrate how data from 24 video interviews with Dutch police officers reveal the ways in which officers read bodies to generate incentives for taking action. This article not only contributes to contemporary policing studies by showing how bodily action plays a key role in policing, but is also relevant for sociological studies that aim to advance the ongoing debate on how to study embodied, co-created, situated action, as it demonstrates how people use culture to sync their actions into collective bodily conduct. More specifically, video elicitation contributes to our understanding of both in situ bodily action, through observation, and officers' reflections and accounts.

From culture as value to culture as how-to knowledge
Studies of policing often depart from the premise that police culture embodies shared values and attitudes, which prescribe and explain how officers act in the course of their daily work (Fielding, 1994; Reiner, 2016; Waddington, 1999). Such studies are grounded in the notion of culture as value system. Other scholars critique the homogenous and monolithic model of police culture, and highlight its complexities and variations (Herbert, 1998; Loftus, 2010; Paoline III & Gau, 2017). Those that question the utility of a cultural focus argue that the use of culture as an explanation has led to "individualistic and reductionist" analyses (Manning, 2007; Sklansky, 2007) through repeatedly ascribing supposedly shared beliefs as a guide to how officers will behave (Turner & Rowe, 2017). Inspired by cultural sociologists, Campeau (2015) and Shearing and Ericson (1991) thus shift their attention to viewing police culture as a resourceful tool, arguing that officers refer to a repertoire of skills, or toolkit, to make sense of situations. However, approaching culture and knowledge as a resource located in the minds of people (see DiMaggio, 1997) does not help us understand how culture creates action (Chan, 1996). Other scholars therefore argue that practical understandings are corporeal (Csordas, 1990). In fact, to say "now I know how to go on" (Mondada, 2011; Wittgenstein, 1953, p. 154) requires embodied cultural knowledge. Embodiment of culture is relevant to the sociology of policing because the collectively shared know-how of bodily action has been neglected in studies of policing. In addition, sociology's tendency to focus on individual uses of culture ignores that situated action is a process of mutual "sense making" which involves bodily know-how. For instance, Crossley (2007) advocates for the concept of "body techniques" to explore practical understandings and meanings, but he overlooks the shared nature of embodiment, especially in trained professions such as policing.
Hence, we need a methodological tool that encourages officers to verbalize their shared bodily know-how which allows us to understand why and how officers act situationally.
With examples from 24 video interviews with Dutch police officers who were recorded on camera during violent situations, I demonstrate how analyzing these situations together with officers who participated in them makes embodied police knowledge explicit. Violent interactions have scarcely been analyzed through a bodily action lens. In addition, policing occurs within a broader context of ideas about the proportionality of applied force and of accountability issues, making an understanding of officers' how-to knowledge a pressing matter. Cultural sociology would benefit from video elicitation to make bodily know-how explicit and to help scholars understand how such collective knowledge plays a role in the co-creation of situations. In the coming sections, I briefly note the pitfalls of mere narrative and observational methods, and discuss the benefits of video elicitation. Next, I illustrate how this method yields important insights into police action incentives.

Pitfalls of narrative and observational methods
Police work is an inherently storied activity, which means it can be understood through a narrative lens (Smith, Pedersen & Burnett, 2014). Scholars thus often research storytelling through ethnographies and interviews to understand police actions (Schaefer & Tewksbury, 2018; van Hulst, 2019). Narrative criminologists also focus on how stories shape the morally significant things people do (Presser, 2016), but rely heavily on discourse analysis. While some narrative ethnographies pay attention to situational factors, such as the locations in which stories are performed (Tutenges, 2019), narrative studies generally neglect the situated character of social action, and are unclear about the relationship of stories to action. In addition, the emphasis on individual sense making in narratives neglects meaning-making processes as a collective achievement.
Other scholars have recently turned to video analysis to understand people's on-the-spot behavior and step-by-step actions. Inspired by microanalysis of interactions (Collins, 2008), they analyze videos of violent situations through coding schemes, systematic behavioral analysis (Levine, Taylor & Best, 2011), and logistic regression analysis (Friis, Liebst, Philpot & Lindegaard, 2020). However, studies based on CCTV footage, which mostly lacks sound, focus on quantifying action sequences and provide only limited insights into social meanings (Nassauer & Legewie, 2018). In police research, the use of videos is still rare. To my knowledge, a few scholars use body-worn camera footage to understand how incident characteristics affect the use of force (Willits & Makin, 2018) or to study racial disparities in officers' use of language (Voigt et al., 2017), and others use public videos available online to assess the effectiveness of crowd policing (Nassauer, 2015; Stott & Reicher, 1998). Unfortunately, such studies generally lack systematic examination of body acts. Moreover, cultural concepts appear to be absent here. In general, observational studies reduce and oversimplify real-life police action to (in)dependent variables, and isolate happenings from the interactions from which they emerged. Although police ethnographers directly observe behavior and ethnographies highlight the participation of the body in social action, many details will not be captured due to the fast pace of action, or will be liable to observer bias (Spano, 2005). Thus, observational methods do not reveal situational incentives for taking action, officers' embodied knowing, or meaning-making processes. That means there is a substantial gap in scholarly knowledge, which necessitates the development of an adequate research method.

Video elicitation in interviews
Video elicitation prompts interviewees to discuss subjects in greater detail and has been used as a research tool in the social sciences (Heath, Hindmarsh & Luff, 2010), for example, in studies of doctor-patient interactions (Henry & Fetters, 2012) and in sports research (Brümmer, 2019). This study differs from the existing body of work on video elicitation in two ways: first, it analyzes violent situations through examining video footage together with the protagonists, and second, it focuses explicitly on embodied understanding and action. In police research, this is the first known study that includes the police officers actually involved in recorded events. Discussing videos with officers who were not participants would yield evaluations of situations based upon a generalized police "professional vision" (Goodwin, 1994), whereas doing so with participants allows us to examine how officers collectively make sense of how and when to engage in action.
First and foremost, watching videos delimits attention to actual events. It focuses officers on situations with which they dealt, preventing them from transcending the situation as a topic of discussion, and spiraling into general comments. Secondly, video elicitation creates a flashback: officers remember particular events, such as crucial points of (de)escalation or when control was situationally established. In this way, they have a more accurate recall of acts, thoughts and emotions experienced during the interaction. Videos make bodily acts visible, which helps to elicit reflections on habitual modes of acting and the subtleties of swift police work, upon which officers would normally not reflect. Seeing themselves in action on the screen, and explaining what they are doing while seeing themselves doing things, makes them better able to designate and give meaning to feelings and particular lines of action.
Thirdly, video interviews transcend the generalized and police-accepted justifications so prevalent in standard interviews. Officers use specific vocabulary and narratives that prohibit them from explicating embodied experiences, and scholars from accessing the details of experiences in situ. Discussing videos is an excellent technique to break away from self-evident "that's just how the situation went" responses because it shifts the focus of discussion from discourse and subjective "lived experiences" to the physical, bodily actions. Video elicitation acknowledges that culture has many different forms. It simultaneously shows 1) how officers act; 2) how officers interpret situations or behavior; and 3) how officers reflect. Although impression management and social desirability are ever-present factors in interviews, videos allow officers to account for their bodily actions in a detailed manner, fostering analysis of how officers act in situ, i.e. observation, and of their explanations thereof, i.e. reflections. In this way, their communicative accounts become loaded with how-to articulations that get us closer to bodily action. After the data section, I demonstrate how watching videos yields important insights into police action incentives.

Data collection and sampling
The data used in the analysis is taken from 24 video interviews with 27, mainly patrol, police officers, who were recorded participating in a violent situation. Access to watching videos was established during over 2 years of ethnographic fieldwork. I accompanied officers from several Dutch police forces during regular shifts, ride-alongs, and training sessions. Research sites included two police stations in two large cities, and several stations in smaller towns. While I was participating in daily routines, officers told me about violent incidents recorded on CCTV and body-worn camera videos. Some police stations had videos of these incidents available on site on computers. Due to my immersion in the police teams, I was able to develop enough rapport that the officers were open to discussing these videos with me. I interviewed 3 female and 24 male officers, with an average age of 32, and 9 years of employment.
Of the 24 video interviews, six were joint interviews. In one case, I watched the same video separately with both officers involved, and during another, a trainer commented on other officers' behavior. Because police officers usually encounter situations in pairs (that is how they conduct patrols), it is helpful to match interview settings with the social relationships through which they experience events. Joint interviews (see Polak & Green, 2016) are especially enlightening because they enable officers to reflect upon the thoughts and actions of their interaction partner, of which they may be unaware, as well as shared practices. Joint interviews thus circumvent the issue of individualistic interpretations common in studies of situated action where people are studied in isolation. The interviews I conducted lasted about an hour and 45 minutes, were all voice-recorded, and then transcribed. To ensure anonymity, I used pseudonyms for officers and locations.
Videos came from various data sources, including body-worn cameras, local media coverage, bystander videos on YouTube, and CCTV. The types of footage available varied depending upon the police station, as there is no standard practice for archiving video recordings of violent incidents. Principally, the objective was to trigger memories, stir responses, and encourage officers to explain and reflect. Discussing different types of footage was helpful because this allowed for observation and explanations from, quite literally, different angles and perspectives. That is, CCTV or YouTube footage captures natural settings, allowing officers to discuss the bigger picture of situations, whereas body-worn camera footage foregrounds the movements and experiences of the individual officer. My criterion for selection was that each video concerned only the officers being interviewed. Thus, in all but one interview, officers watched footage of themselves in situations they experienced. They always saw themselves acting. The incidents ranged from tense situations where violence was averted, to using force when a suspect resisted arrest, to shootings.

Analytical procedures
During the video interviews, I meticulously asked officers to describe and explain their actions in detail. For example, I noticed that an officer took a couple of steps sideways, so I asked, "Why are you changing your position here?" The officer replied, "Because I knew my colleague is blocking that side with his body, and now I will block this side so the suspect cannot escape." Such comments made clear how bodily positioning matters for maintaining and gaining control. I asked officers why and how they initiated certain acts, in what ways they co-operated with colleagues, and how their thought processes evolved. My common questions included: "What are you doing here?" and "What is going through your mind here?" Officers were also encouraged to comment spontaneously. Indeed, they frequently explained their actions without explicitly being asked to do so. Videos were frequently paused and rewound, by both the officers and me, to review segments and elaborate upon or discuss something specific. For instance, in one video, I noticed an officer looking in a particular direction, so I asked, "What are you looking at here?" The officer rewound the video, watched herself, and explained, "I was looking for a quiet place to which to move the suspect." Such explanations revealed officers' usually taken-for-granted sense of surroundings.
In the analysis, I searched for ways in which officers understood and articulated when, where, how, and why they have to engage in action. I specifically focused on how officers work together, their bodily acts, as well as their perception of suspects and surroundings. This resulted in an observation of recurring topics. For example, I created initial topics such as "turning point of (de)escalation," and then specified these topics into subthemes, such as "awareness of bodies." This thematic approach (Braun & Clarke, 2018) enabled me to make sense of officers' shared meanings and experiences. This is relevant because collective meaning-making processes are imperative to policing. Next, I used interpretive coding to identify relevant segments, constructed a coding dictionary of overlapping themes, and applied these codes to the transcripts in Atlas.ti. The codes applied include: aligning and understanding one another; assessing the suspect; and signals and cues. In the video interviews it became especially apparent that how officers hold a bodily gaze toward suspects and colleagues generates action, the argument to which I turn now.

Action incentives through reading civilians' bodies
In standard interviews, officers say they visually orient toward civilians, using their gazes to gather information about how that person is behaving and to signal their intentions. They state that they focus on "body language," e.g., monitor hands and facial expressions, and notice tightening of muscles. Discussing videos elicits explanations of how these gazes function, that is, it reveals how officers assess whether someone is going to resist arrest, fight, or flee the scene. Officers point at hands shaking before throwing a punch, civilians' legs or torso slowly turning away from officers' bodies, looking in a direction toward which to run, or a slight acceleration in stride. Other studies also find that, for example, turning away in frontal orientation and folding one's arms indicate intentions (Harrigan, 2008), or that officers perceive impending violence through interpersonal social cues or "concerning behaviors": tensing muscles, placing hands in pockets, and blinking rapidly (Johnson, 2016, Table 3). Officers Alex and Perry explain their bodily gaze while watching a bystander-recorded video of their attempt to arrest a man. Before we start the video, they comment in a generic manner: "He's physically pumping himself up" and "Tension is filling the suspect's body." Then, seeing the footage, they expand upon their comments in detail: "Here, he's getting restless, starts to kind of tiptoe on his feet. His shoulders and arms move forward a bit." This tiptoeing on his feet, like a ballerina, indicates to them that the suspect is "getting ready" for an attack or an attempt to flee. Alex thus immediately grabs the suspect by his shoulder to prevent this from happening. Officers Neil and Lewis also unpack how their bodily gaze informs them as to what subsequent action to take while watching CCTV footage of their attempt to arrest a man.
After explaining that the suspect "is on a warpath," that is, walking around with clenched fists, chest forward, and peeking in the rearview mirror of the police car, one bodily act is a clear incentive to change their own bodily positioning. In the video, we see that the man starts what looks like a search for something in his waistband, putting his hand in his pocket, while Neil and Lewis stand in front of him:
N: Now he's annoyingly touching his waistband and putting his hand in his pocket.
L: Step back, did you see that? And then Neil takes a step back as well [italics indicate emphasized wording].
In the video we see that they take a step back the second the suspect puts his hand in his pocket. Lewis specifically calls my attention to their collective move, indicating that they are both noticing this movement, and have a similar sense of what hands-in-pocket means: potential danger. Neil thus calls it "annoying" because he foresees violence, or at least turmoil. Following a narrative approach to policing (e.g., Turner & Rowe, 2017), it could be argued that Neil and Lewis assume that placing a hand in a pocket is the suspect's motivation for action, and that they then construct a narrative that legitimizes intervention. However, what the officers' explanations, elicited through videos, reveal is that behaviors carry certain meanings and those meanings immediately generate action incentives. More specifically, the bodily "cue" is an action, which is a resource and stimulus for another action by an officer. In Garfinkel's (1967) words, each participant takes the action of the other as a resource, or sign, to develop another line of action. Officers' comments on videos, then, not only show how they use bodily gazes to interpret behavior. They also help to access the sequence, or in ethnomethodological terms, the accomplishment of an interaction sequence. Through discussing videos, officers reveal that reading bodies generates action incentives. Thus, while watching Emre's body-worn camera video of his and his colleague Jesse's encounter with a man in a psychotic state who refuses to leave a closed movie theater, Emre explicitly points out why he decides to call for backup: "Look here, we thought the situation was calm." Emre explains that he views the man's tight movements as tense and aggressive. He uses these bodily acts, interpreted as emotional arousal, as a resource for his own action: the second the man screams, Emre thinks he is not going to comply and calls for backup.
Gallagher (2005) also indicates that emotional states have expressive bodily accompaniments, which means that people can grasp emotional states by observing movements, faces, and tones of voice. In the video, we also see that Emre and Jesse align their movements with the man slowly: when he moves backwards, they take a step forward; when he moves forward, they take a step backward. Emre explains that they are waiting for an opportunity to handcuff him: they take a step forward when the man starts to turn around, indicating he is going to cooperate. But when the man walks away, the opportunity passes. Watching the video uncovers the negotiatory and timely process of projecting a line of action: the man's position at first signals that the arrest is about to happen; then, the man walking away changes this line of action. In fact, this reveals an action sequence. Emre then explains his embodied knowledge about when to act:
E: Here, I know we're going to cuff him. You just know that, you're trained in this. This was really the moment: now we're going to cuff him, at least try to.
Me: How do you know this?
E: Yeah, I just felt it. Maybe because someone reached for their cuffs, but that would be too early. I don't know. It was just because he was standing like that [turned facing a wall]. There was the opportunity to cuff him. [He turns to the video] Yes, here he [colleague] grabs his cuffs.
At first, Emre says he "just knows" that handcuffing is about to happen. However, by watching the video, it becomes clear that the suspect's position, turned around with his hands up against the wall, indicates the projected line of action, the beginning of the cuffing procedure. Once a suspect assumes that position, officers know to follow the appropriate steps: stretch the suspect's arms, bend them behind his or her back, and so on. His colleague's reach for his handcuffs already signifies this next step. Video-elicited explanations help to understand that reading bodies is cultural knowledge which sets into motion the next bodily action to take. That is, to be able to know how and when to act requires knowing how to use other people's actions or projected lines of action as a resource and input for your own actions or projected lines of action. Finally, Jesse elucidates his how-to knowledge when he explains that taking slow steps toward the man allows him to defend himself: "Because the minute you go hard and he suddenly turns around and swings at you, then you're in a forward motion so you can't do anything anymore." Such elucidation of how to go about an arrest with your body, taking into account the other body, is difficult to elicit without video footage.
Using the suspect's body actions as a resource for action is also what Officers Ollie and Saul describe. While watching a cellphone recording made by a bystander of an attempted arrest, Ollie explains when and why he grabs hold of the suspect:
O: In the beginning, when I said "You're under arrest," I grab hold of him, but just loosely. That's where I think, like, anyway, I have hold of him. So if he starts running or acts difficult, then you feel that tension [in the suspect's body], and then you already have the upper hand. Then he shouts like, "Ah, let me go," and pulls his arm away.
S: To me it was annoying, but not really threatening. But it was a reason to…see here, I put my notebook in my pocket, grab my cuffs, and put one cuff around his wrist. [emphasis added]
Ollie's explanation of his behavior while watching the video shows that grabbing hold of the suspect is a way to feel the other body. If and when the suspect tenses his muscles in an attempt to free himself, Ollie will feel it in his own body. In this way, Ollie is feeding himself with input for taking action: the suspect's body is used as a resource to project a line of action. Ollie is thus one step ahead because he can project the suspect's next move. Saul, on the other hand, notes that the pulling away motion signals when to grab his handcuffs. From there, they collectively move into a mutual line of action: the arrest.
Interestingly, the question of when to act comes to the fore especially when a failure occurs. For example, Officers Wallace and Boaz and I are discussing a CCTV recording of them escorting a handcuffed suspect toward the police station. A bystander starts following and harassing them. I ask them at what moment they intervene, to which they reply that it is the man's third approach that indicates the moment to use force. After a push and kick, the man slinks away. Wallace and Boaz interpret their intervention as having an effect. But seconds later, the man punches Boaz and renders him unconscious. Watching the video helps them realize why, and at which moments, they misjudged the man's intentions:
Me: So he approaches you guys. You turn around again.
W: Yeah.
Me: And then when you, Boaz, are turned around, then he runs away again.
B: Yeah, that's why. I just misjudged the danger.
W: This is just a madman, he's just a madman!
B: I interpreted his threat differently, I misjudged it because he was really scared of us. [Wallace laughs] Because every time we turned around, he ran away like a scared cat. I never expected that he, a scared cat, would do that, but ok, scared people do crazy things. That's a lesson for me. [Both laugh]
Me: But why did you think he was scared?
B: Well, you saw that every time we made a move toward him or turned around, that he ran away really fast.
W: He sprinted off. [emphases added]
Whereas Wallace first simplifies the situation by calling the suspect a "madman," the video helps them both to reflect on and explain the behaviors that caused them to misinterpret the man's intentions. Because the man ran off every time they turned toward him, they did not expect that he would act violently. Although Wallace had verbally signaled to Boaz that "If he comes again, he's going with us," projecting a line of action, this action was canceled because the man ran away. So, their when-to knowledge of initiating action (grabbing the man) falls short and their collective accomplishment fails. Discussing emotionally charged "failures," as opposed to fluid taken-for-granted action, enables officers to explicate their "I-knew-it-was-going-wrong" knowledge into a detailed interactional sequence about how, when, and where things went wrong.
Officer Alex's account also shows how discussing a bystander's phone-recorded video elicits reflection on when and why his actions failed. Alex is embarrassed to discuss the situation because he feels like he has failed at his job by letting the suspect run off after he had a hold on him. While watching the video, he continuously repeats "Oh, this is so bad," and "I can't watch this," turning his face away and placing his hands over his eyes and mouth. His colleague, Perry, consoles him, and they playfully laugh together several times while watching.
Despite his embarrassment, Alex is able to pinpoint in the video when and where he thinks the suspect has room to escape:
This is the moment he's standing with his hands against the wall and what I'm trying to do is grab his [right] hand, and push with my elbow so his arm bends into the right position. Then, the other side [left arm] comes loose, so I do the one side and then the other side is free. Well, that to him is the moment he thinks like "oh I'm free" because I wanted to grab my cuffs to put them on, and to him that is the moment, "hey I can get away."
Alex explains that he thinks the suspect's position indicates cooperation and readiness to be handcuffed. The point is, every action signifies the next one, and all are working toward a certain projected outcome. In this way people move into a "signifying chain" (I elaborate upon this term later). However, the projected outcome here, the arrest, was based on a misinterpretation of bodily cues. In terms of how-to knowledge, Alex takes the "wrong" sequential path because he fails to interpret the suspect's behavior as the intention to flee. Then, by lacking control over the suspect's body, he creates the opportunity to flee. In sum, examining videos with police officers who participated in violent situations results in them explicating how they read suspects' bodies and use this as a resource to set their own bodily actions into motion. Aligning with colleagues is another way of initiating action.

Action incentives through reading colleagues' bodies
From interviews alone, it would seem as if officers initiate action solely through verbal means. Officers describe how they quickly consult one another about what to do and give verbal cues like "Now" or "Are we going in?" Most frequently, the sentence "You're under arrest" signifies the kickoff because it projects a line of action: at that moment officers know the goal of the situation and what action is expected of them. But although officers know what acts need to be undertaken, how and when do they initiate and execute these acts when there is no verbal instruction?
During the video interviews, it becomes clear that officers also hold a bodily gaze toward colleagues to assess what is going on, and when (collective) actions need to be initiated. The videos show officers establishing eye contact, nodding, and using hand gestures such as snapping their fingers to signal their colleagues. These nonverbal "action behaviors" not only indicate interactional sequences but also influence the behaviors of another person, whether intended or not (Ekman & Friesen, 1969). This is crucial here: to officers, certain body acts, positions, and gestures carry meanings that help them determine "the next move." Those acts are signified as the ones to perform, even when they are not articulated (see Schatzki, 1996 on practical and general understandings). This signifying channels the flow of action: certain actions make sense to, and thus cue, officers to perform a next one. Officers then act immediately because the actions are automatic and part of their "bodily repertoire." Thus, getting a suspect into a certain position, as described above, signifies that the officers are going to handcuff, which signifies specific proceedings: grabbing the suspect's hands accordingly, reaching for their handcuffs, putting the handcuffs on, and so on. In this way, officers enter into a signifying chain of action.
For ethnomethodologists, making an arrest is thus an accomplishment during which people build upon each other's actions. Officer Rufus, repeatedly pointing at the screen, articulates this working towards a next move, i.e., getting into a chain of action as "I am constantly busy, considering what is happening and what is the next step. What if the suspect does this? Then what?" This suggests that police officers even process situations in terms of next moves. Pointing at the screen while commenting also shows that watching videos is not just a cognitive or discursive endeavor, but involves performative bodily actions. Officers often reenacted gestures and my body was frequently used to demonstrate arresting positions or chokeholds. Knowledge is thus also bodily formulated and produced. Unfortunately, the ethnomethodologists in Garfinkel's day did not yet have the technical means available to demonstrate that interactional resources are also embodied. Officer William explains this signifying process and his bodily gaze, when he and his colleague, Bob, are confronted with a man urinating in public, recorded on CCTV: B: Here, I push him against the car and I already had the feeling that he was acting weird. W: I saw it in his body language, just that gut feeling starts to act up. That's why I step in [in the video we see that William, first standing two feet away, steps in and grabs the right arm of the suspect]. How do I say this…I notice from how Bob positions himself and how he grabs hold of him that he's going to zero in on him because he places one leg in front of the other. Normally, Bob grabs a man, says "just act normal" and lets him go. But when I see that he is holding this guy for a longer period of time, and I see him grab his arms and push him harder against the car, then I know ok he's going with us. 
I see the suspect is getting tense and I see Bob tightly pressing him against that car and that to me is the moment I know this guy is not going to cooperate. I see that in how Bob stands, how he puts pressure on him, and then I step in to grab the other arm so Bob can't get hit. [emphasis added] While William first says he has a "gut feeling" and has difficulty explaining how he knows when to step in, it becomes clear that seeing Bob position himself with one leg back, pressing the suspect against the car, and hearing him say "He's tensing his muscles" functions as input. Through these actions, William understands Bob's projected line of action of an arrest, setting his own body into motion. Bob's comment "Just act normal" can be seen as part of a larger police narrative of normalizing civilian behavior, but indicates here that the suspect is acting the opposite of normal, signaling potential resistance. However, it is Bob's bodily action that completes the cue for William to act himself. Similarly, while watching himself on CCTV video, officer Tom indicates that he knows the situation is under control because he sees that his colleague Louis is processing the suspect's data on his phone while maintaining a fixed posture. He then argues that if Louis were to position his hands in front of his chest "so he can act fast," this would indicate to Tom that Louis is getting ready to act and the situation is about to change.
Officer Lee also explains how he understands his colleague's projected line of action through bodily behaviors. While discussing CCTV footage of his attempt to arrest a man, I ask Lee how he knows his partner would approach. After his generic expression, "I just know," he then replies, "If out of the corner of my eye I see that you're looking at the same thing, then I know that we're in the same stage. We're moving toward the same thing, without verbal agreement." Here is the mutual projection of a line of action. In fact, Lee claims that, at that moment, his own body becomes action-ready as well. Then, in the video, we see him grab the suspect's legs. He argues this act is "automatic", that he was not asked to do it. He does it because he sees "chances are high" they are going to place the suspect on the ground. Actually, Lee does not need verbal instructions because he "knows," that is, he reads his colleagues' and the suspect's "body language," which both indicate impending resistance to arrest. He thus recognizes and understands that the arrest is going to happen on the ground. Foreseeing this, that is, signaling the projected line of action, Lee grabs the suspect's legs, so he can easily pull them out from underneath him. Watching and discussing videos thus reveals that through reading their colleagues' bodily action, police officers comprehend mutual projected lines of action and align their behaviors so as to act collectively. Sensing a projected line of action by reading bodily action is also what Officer Emre explicates when he directs my attention to his backup colleague's arm movement in his video: Look, here I got scared. Look [pauses the video and points at the screen]. That officer moves and at one point he touches his firearm, and then I thought like, "Eh please let's not do that." 
I don't know if he opens the safety handle but he does touch it, plays with it kind of, moves it like this [Now he lifts his own gun out of the holster and puts it back]. And then I thought, like, let's not do that. I could see from that movement; he's not taking something out of his pocket. He was really reaching for his firearm.
Emre clearly indicates the moment he notices his backup colleague reaching for his firearm. Importantly, I would not have noticed whether the officer's hand was moving toward his pocket or his firearm, but Emre knows exactly where a firearm is located on the body because he also carries one. He elucidates his embodied awareness, also known as body mapping (de Jager, Tewson, Ludlow & Boydell, 2016). Emre says, "please let's not do that," because he knows that this colleague is projecting a line of action of possibly having to use the firearm to complete the arrest. In a standard interview, this moment would not have become so apparent, but in seeing it Emre reflects on it. I also ask him how he knows Jesse grabs his pepper spray because we do not see him do this in the video. He replies that he hears the safety pin open up, which again indicates a line of action: "When he did that, I knew, like ok, maybe we're going in now because if he sprays, then of course action will follow." Reading bodies and recognizing lines of action, activating how-to and when-to knowledge, thus requires a phenomenological awareness of the body.
Finally, by watching videos officers experience revelations about failures in their collective action. Officer Craig, for example, argues that because he and his colleague are both clearly yelling the same command: "Get on your belly", he thinks that they are both trying to attain the same goal. That is, he thinks that they are in a mutual line of action of getting the man to turn over. However, in the process of explaining their actions while watching the video, Craig reflects he may be impeding his colleague: I'm actually trying to [laughs] grab his ankle with the baton. I had that baton in my hands and then use sideways force so that he turns around, and if I see it like this I am wondering if I am impeding my colleague. I don't know what he is doing, I don't know if he is also pushing sideways. I did say "Get on your belly" but I'm not getting any wiser as to whether he [colleague] is doing the same thing or if he is just on top of him and we're working against each other, I really don't know.
Originally, Craig interprets the situation as the two of them doing the same thing, but the video shows him otherwise. This fragment shows that "Get on your belly" was a discursive goal that seemed to work in the situation, but was not aligned with (collective) bodily how-to knowledge. Watching the video elicits this realization. Officer Saul experiences a similar discovery when he argues that his and his colleague Ollie's actions on screen look aligned, but says their acts are actually not. Interestingly, he uses a metaphor of children playing with Legos: "Parents say, 'oh look they're playing so cute together.' But if you look closely you see they're not playing together at all because one of 'em is busy with Legos and the other with something else. So it looks synchronized but Ollie just did his thing, and I did mine." Videos thus not only bring about awareness of failures in reading each other's bodies as a barrier to moving into a coordinated chain of action, but also produce accounts that get us closer to bodily action. In fact, police officers use more bodily how-to knowledge than they know.

Discussion/Conclusion
This article has demonstrated that watching, discussing, and examining videos with police officers who themselves have participated in violent situations yields substantial insights into situated action, how-to knowledge, and meaning-making processes. More specifically, I have shown that, with the help of videos, officers reveal that they use bodily actions of both civilians and their colleagues as resources to project lines of action. Reading bodies, that is, signaling cues such as bodily positioning, generates incentives for action. There are thus physical interactional resources for developing a line of action: through the recognition of certain bodily acts, positions, and sounds, police officers recognize "the next move." Seeing and hearing other bodies provides stimuli for action. In fact, officers continuously create and look for action incentives. These results also illustrate how officers make their acts intelligible to others. They not only understand colleagues' bodily behavior as signals as a result of ingrained how-to knowledge, but also because they learn to think of their conduct as a team. My interviewees frequently argued that they notice "out of the corner of their eyes" when another officer reaches for handcuffs. The reach is projecting a line of action, which signifies the next step and gives an incentive to act yourself. 'Knowing' thus exists in the action sequence, in responding sensibly to what others are doing: through reading and acting upon another person's body, I know that you know that together we are entering a new future chain of action. Video interviews thus allow officers to explicate the "indexicality" of experienced situations (Knoblauch & Schnettler, 2012), or more accurately, how reading of signals is an "indexical operation" (Goodwin, 2018).
Before discussing the drawbacks of video interviews, I will discuss several benefits. First, watching videos yields a substantial level of detail through playback and slow-motion functions: people are better able to clarify the significance of gazes, postures, ways of touching, and unintended acts, both their own and those of others. Such detailed descriptions of bodily awareness and nonverbal behaviors give insights into microlevel teamwork, which would remain obscure during standard interviews. In this way, our understanding of embodied practices and knowledge moves beyond taken-for-granted "we-both-just-know-what-to-do" justifications because people can now pinpoint when, where, and how they interpreted behaviors as projecting a line of action. Without this method, the physical aspects of policing would not have become so apparent. Second, officers watching themselves lowered the threshold to talk about emotional-corporeal experiences, which are notoriously difficult for members of this profession to discuss. More generally, video elicitation helps to understand how people sync their actions into collective conduct. Therefore, the method can be fruitfully applied to illuminate the bodily action component for professions that deal with forms of collective social control or regulation of others, e.g. security guards, emergency and medical workers. Another merit is that through video elicitation scholars can integrate and triangulate observational data on practices with data on associated thoughts and emotions. In Schubert's words (2008, pp. 199, 201) it offers validation, i.e. cross-checking researchers' interpretations with those of respondents, and exploration, i.e. learning more about the meaning of practices, gaining feedback on cognitive, social, and material practices and knowledge resources. Thus, video elicitation is a well-suited tool for studying how-to knowledge that is situationally contingent, and is useful for any discipline studying social behavior.
One drawback of video interviews is that they may raise ethical issues when officers are explicitly asked to evaluate one another. I tried to avoid these issues by introducing these sessions in terms of understanding policing during violence, by only watching videos with those who felt comfortable doing so, and by only discussing situations in which they themselves participated. It is imperative that the researcher is attentive to officers' and supervisors' attitudes towards more serious actions captured on film, e.g. the misuse of power, prejudice toward civilians, or excessive use of force, and their willingness to discuss them in a non-normative way. Such field-specific knowledge is only accessible through a long-term immersion in police teams. A preliminary stage of ethnographic fieldwork is thus indispensable to prepare for the application of video elicitation, both for gaining access to videos and for building rapport to address possible transgressions in a non-threatening manner. "Being there" during ride-alongs, and thus entering officers' experiential worlds, enabled me to dig deeper during video interviews because I had learned police jargon and officers' sensitivities regarding the use of force, had gained their trust, and was able to ask pertinent questions due to my familiarity with their bodily actions. As such, video interviews offer ample opportunity to discuss broader issues such as the regulation of arms or the enactment of the state via policing, and to relate them to actual situations. Another drawback is that police officers tend to focus more on civilian behavior than their own, e.g. scrutinize antagonists' errors instead of personal ones, and may account for their actions, emotions and thoughts based on the video. To minimize this, I urge scholars to consider how explanations relate to actual recorded occurrences, design clear interview instructions, and ask "how" questions to redirect focus to police behaviors, for example, "How are you using your body to do that?" 
Moreover, video elicitation may fail to explain social action when the material is of poor visual quality, thus hampering discussion of sequences of action, bodily or spatial orientations, gestures, and facial expressions. In such cases, scholars could use other data sources such as case files. In this study, I used officers' self-written reports that offer sequential descriptions of occurrences to fill in gaps in video recordings and to elicit reflections. Lastly, narrowing focus to actual situations may limit broadening analysis beyond specific experiences.
In future studies, I would recommend that researchers embed video elicitation in both police research and cultural sociology because it helps make explicit situated embodied how-to knowledge and co-creation of action. It would be fruitful to set up heterogeneous focus groups to discuss videos together, for instance with officers and suspects, or protagonists and supervisors. In this way, researchers gain insight into 1) differing incentives for action, 2) experiences or explanations of involved parties, and 3) how expectations about the process or outcome of situations contradict or verify one another. Protagonists may come to understand their actions better by learning from other people's observations. Videos are particularly useful in this regard because they provide a relatively objective perspective on real-life behavior. Researchers could also compare data sessions focused on explaining with sessions focused on reconstructing actions step by step to better understand the dynamics of police-citizen interactions. In my study, I noticed that CCTV was more efficacious in allowing officers to reflect upon the actions of multiple actors and mutually aligning efforts due to an elevated angle on natural settings. Body-worn camera footage, on the other hand, records verbal communication, which includes cues about meaning-making. This creates potential to reflect on (intonation of) speech, fostering analysis of interpretations of behavior. Interestingly, past and contemporary police films and television programs are drenched with exciting (car) chases and thrilling arrests. Officers watching their own actions on video is like watching a police film with themselves as actors. This momentarily provides a break from the mundane character of policing. It is no wonder then that officers frequently laughed when they saw themselves entering a scene, pointing at the screen and saying, "Look, there I am!" 
It could be argued that this "funny" aspect may have had an impact on the seriousness of explanations, but I think it is more likely that laughter served as a way to distance themselves from what they were seeing, allowing them to speak in depth about what they do. For example, when Officer Alex felt ashamed watching himself on video, laughing was first a way to highlight the absurdity of seeing himself, but then allowed him to let go of his policing role expectations and reflect upon his actions.
Although the disparity between talk and action is a central methodological problem in the social sciences, scholars continue to use methods that abstract data from situations through interview methods or lack social meaning in video analysis. Prior work on violent interactions in policing and cultural sociology has not only failed to apply tools of embodiment and ethnomethodological (sequential) analysis, but also relied heavily on a divide between talk and action. Here, I have demonstrated that watching, discussing, and examining videos with police officers who participated in recorded events elicits verbalizations of collectively shared bodily know-how. Violent police-civilian encounters are thus embodied, that is, bodily understanding and action project prospective action and work in interactional sequences. The broader conclusion is that violent interactions are not devoid of culture, but that shared embodied dynamics of violence are part of interactive collective accomplishments. Collective bodily know-how can be considered 'culture' because it is learned behavior and used to co-create situations. If sociology's concern is to explain social action and engage with people's understanding, there should thus be a greater appreciation of bodily foundations of culture, cognition, and knowledge. Video interviews are a well-suited method for this purpose: they help to understand (police) action as an embodied practice, elicit how-to knowledge, and acknowledge that human conduct and social reality are ongoing accomplishments.

Funding
The work presented in this paper is funded by the European Research Council, Consolidator Grant number 683133, awarded to Don Weenink.

Declaration of Competing Interest
None.