Gender-based violence: Statistical data for four Colombian municipalities

In this article, we describe the dataset on the conditions for gender-based violence (GBV) for women in four municipalities of Colombia: Cali, Buenaventura, Jamundí, and Yumbo. The database was developed by the Observatory for Women's Equity (OEM), an entity resulting from an alliance between Universidad Icesi and Fundación WWB Colombia. The OEM's purpose is to construct measurements that make it possible to account for GBV suffered by women. The following types of violence were classified: psychological violence, physical violence, sexual violence, workplace violence, and economic violence. In addition to the module on GBV, the survey has other modules with which to establish a socioeconomic characterization of women and households, through which to identify how these conditions can be linked to GBV. The sample size was 1,593 women in the four mentioned municipalities.


Value of the Data
• The information collected for this survey incorporates an intersectional and gender perspective. The survey inquires about the individual characteristics of the women including ethnicity, sexual orientation, occupational category, educational level, income level, etc. In addition to investigating the incidence of each of these types of violence, the survey explores the perpetrator of the violence, whether it has been suffered several times or only once, and how it has occurred over time, taking into consideration the last week, the last month, the last year, whether it was more than a year ago or over 10 years ago.
• Violence against women or gender-based violence is a violation of human rights and a major public health problem because of the implications it has on the physical and mental wellbeing of women and its social costs [1]. This makes it a problem that urgently needs to be addressed through better public policies. In turn, these better policies require, among other things, better sources of information, and despite the seriousness of the problem, there is a gap in the collection of information associated with gender-based violence [2,3].
• For this reason, the data collected in this survey are of interest to GBV scholars considering that they can be explored taking into consideration intersectional factors that may exacerbate the incidence of GBV such as ethnicity, sexual orientation, education level, occupational category, etc. Demographic and Health Surveys (DHS) provide important information on these forms of violence and are available for the vast majority of countries but have limitations because they only consider women up to 49 years of age. This limitation has already been pointed out by entities such as the WHO [4], which has highlighted the need to disaggregate by age and include women over 50 years of age in the survey.
• Although this survey does not allow national-level comparisons, because most of the national information on GBV comes from administrative records such as complaints to the prosecutor's office, the forensic medicine institute, or data collected from telephone lines, it does offer the possibility of comparing four municipalities in one of Colombia's regions, a region that accounts for a significant share of the national total of cases of different types of violence against women. Finally, it also allows quantitative approximations to identify the variables or factors that can help explain GBV.

Table 1. Sample by socioeconomic stratum and municipality.

Municipality    Stratum 1   Stratum 2   Stratum 3   Stratum 4   Stratum 5   Stratum 6   Total
Cali                   96         130         154          49          42          19     490
Buenaventura          189          68          58           6           -           -     321
Jamundí                32         215          90          39          14           2     392
Yumbo                  91         247          49           2           1           -     390
Total                 408         660         351          96          57          21    1593

(a) Expanded sample. Since this sample corresponds to a probabilistic sampling that seeks to make an inference about the total population of women in the four municipalities, a final expansion factor is calculated using the unbiased Horvitz-Thompson estimator:

$$\hat{t}_{y,\pi} = \sum_{k \in s} \frac{y_k}{\pi_k}$$

where $\pi_k$ is the probability of inclusion for the k-th final sampling unit and "s" indicates that the sum is performed over the random sample. So when we refer to an expanded sample, we refer to the results of the inference for the entire population and not just for the sample.
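The Horvitz-Thompson estimator mentioned in the Table 1 footnote can be illustrated with a minimal Python sketch. The function name, the toy responses, and the inclusion probabilities below are hypothetical and are not taken from the dataset; the sketch only shows how each observation is weighted by the inverse of its inclusion probability (its expansion factor).

```python
from typing import Sequence


def horvitz_thompson_total(y: Sequence[float], pi: Sequence[float]) -> float:
    """Estimate a population total as the sum of y_k / pi_k over the sample."""
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(not 0 < p <= 1 for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(y_k / p_k for y_k, p_k in zip(y, pi))


# Hypothetical toy sample: 1 = reported the event, 0 = did not report it.
y = [1, 0, 1, 1]
# Hypothetical inclusion probabilities for the four sampled women.
pi = [0.5, 0.25, 0.5, 0.25]

# The expansion factor of each respondent is the inverse of her inclusion probability.
expansion_factors = [1 / p for p in pi]

# Estimated number of women in the population who experienced the event.
estimated_total = horvitz_thompson_total(y, pi)
```

In this toy case, the respondents with inclusion probability 0.25 each stand for four women in the population, and those with 0.5 stand for two.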

Data Description
The data presented in this survey was collected from August 26 to October 26, 2020. The questionnaire was designed by the observatory's measurement team and reviewed by internal and external peers from Universidad Icesi who have expertise in gender issues and in statistical operations. The women answered questions associated with socioeconomic characterization, gender-based violence, and economic and financial autonomy.
The study's unit of analysis is made up of women of legal age (18 years, in Colombia). The sample was structured in four municipalities of Valle del Cauca: Santiago de Cali, Buenaventura, Yumbo, and Jamundí. A multistage stratified probability sampling design was implemented, with units selected by simple random sampling. Socioeconomic strata were used as the main stratification variable for Buenaventura, Yumbo, and Jamundí, whereas for Cali, we used zones, defined as follows: (1) Extended center and peri-center zone (communes 3, 4, 8, 9, 10, 11, and 12); (2) Ladera zone (communes 1, 18, and 20); (3) North-south corridor urban zone (communes 2, 5, 17, 19, and 22); and (4) East urban zone (communes 6, 7, 13, 14, 15, 16, and 21). In the first stage, the households were selected via the telephone sampling frame, and in the second stage, a person within the household was selected to answer the survey (only one woman per household was selected). Table 1 presents the sample by socioeconomic stratum (or socioeconomic level) in absolute terms and the expanded sample for the four municipalities.
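As an illustration, the Cali zone definitions above can be written as a lookup table. This is a minimal Python sketch: the commune lists are transcribed from the text, while the names `CALI_ZONES`, `COMMUNE_TO_ZONE`, and `cali_zone` are hypothetical and do not correspond to variables in the published dataset.

```python
# Commune lists per zone, transcribed from the zone definitions in the text.
CALI_ZONES = {
    "Extended center and peri-center zone": [3, 4, 8, 9, 10, 11, 12],
    "Ladera zone": [1, 18, 20],
    "North-south corridor urban zone": [2, 5, 17, 19, 22],
    "East urban zone": [6, 7, 13, 14, 15, 16, 21],
}

# Invert the mapping so that each commune points to its stratification zone.
COMMUNE_TO_ZONE = {
    commune: zone
    for zone, communes in CALI_ZONES.items()
    for commune in communes
}


def cali_zone(commune: int) -> str:
    """Return the stratification zone for a Cali commune number."""
    try:
        return COMMUNE_TO_ZONE[commune]
    except KeyError:
        raise ValueError(f"commune {commune} is not assigned to any zone") from None
```

Under these definitions, every commune from 1 to 22 falls in exactly one zone, so the four zones partition the communes listed for Cali.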
The municipalities chosen by the OEM reflect its focus on the subnational level. The survey has been focused on Valle del Cauca since 2018, specifically on the municipalities of Cali (the capital of Valle del Cauca), Yumbo, Jamundí, and Buenaventura. These municipalities were selected strategically. Cali, Jamundí, and Yumbo are close to each other and share economies, businesses, and mobility. They make up the metropolitan area of Santiago de Cali, so having data from the three municipalities helps us understand the overall dynamics of the largest and most important city in Valle del Cauca. Buenaventura, on the other hand, is the point of contrast. Located on the Pacific coast, it is the second most important port in Colombia, with high poverty, inequality, and violence indicators. It was also the city in Valle del Cauca worst hit by the armed conflict and is a national reference point for public policy advocacy in the Pacific. This city was selected by the survey's funders as the second city prioritized for work (the first being the capital, Cali).
The questionnaire contains 256 questions distributed as follows: sociodemographic characteristics, household characteristics, gender-based violence, and economic and financial autonomy. Data collection took 28 min on average. Raw data and supplementary material are attached. Table 2 presents the general characteristics of the women surveyed. These results, like those in the rest of the tables, are for the unexpanded sample.
Finally, it is important to note that the response rate for this survey was 72% (a total of 2213 calls were made and a total of 1593 surveys were completed).
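The reported response rate can be checked directly from the two counts given above. The variable names in this short Python sketch are illustrative only.

```python
# Counts reported in the text.
calls_made = 2213
surveys_completed = 1593

# Response rate = completed surveys / calls made.
response_rate = surveys_completed / calls_made

# Rounded to the nearest whole percentage point, as reported in the text.
response_rate_percent = round(response_rate * 100)
```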

Experimental Design, Materials and Methods
To design the survey, we first searched for national and international surveys that address gender-based violence, considering not only their instruments but also the feminist methodologies that make it possible to maintain an ethical and safe approach for the respondents. From this search, the following measurement experiences were specifically considered:
• National survey of violence against women 2006, Mexico [5]: this survey is carried out on women users of the Mexican health system. Its modules on sexual and physical violence by partners or ex-partners were essential for the construction of our survey.
• Health and demographic survey 2015, Colombia [6]: although this is a broad survey on health and sexuality, it has a very good module on gender-based violence that also presents the definitions addressed in our study; this module was therefore used to construct the OEM survey.
• Sexist violence survey 2016, Cataluña [7]: this survey represents a significant advance for the measurement of gender-based violence since it includes the advances that have been made in other regions on this issue, both in operational and epistemological terms. In our case, this survey helped us to structure violence in the workplace and to measure violence by people who are not a partner. This last part focused on the recognition of the actors who exercise violence outside the couple.
• National prevalence survey on gender-based violence and generations 2020, Uruguay [8]: this survey differentiates between what happens in the public and private spheres. Two modules were particularly useful for the construction of our survey: violence in the work environment, which is part of the public sphere, and violence in the context of the couple, which is part of the private one.
• National survey on violence against women in France 2003, France [9]: this is the first survey on violence against women that is representative of a country. In our case, it was especially useful in two areas, sexual and economic violence. Additionally, this survey shows us a methodological path to measure the prevalence of GBV.
Based on the above, five modules were designed to address the types of violence: psychological, physical, sexual, economic, and patrimonial, and violence at work. The questions were designed according to each type of violence and are adaptations or adjustments to the country context. In the first four modules, the question structure investigates the victimizing events in four dimensions: i) occurrence: whether the event happened or not, ii) victimizer: a person who perpetrated the event, iii) incidence: number of times that the event has occurred, iv) prevalence: when was the last time the victim was subject to the incident. In the last module on violence at work, the questions refer only to the occurrence of violent events in the workplace.
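The four-dimension question structure described above can be pictured as a small record type. The following Python sketch is only an illustration under stated assumptions: the class name, field names, and module labels are hypothetical and do not correspond to variable names in the published dataset, and it simplifies the design by treating all of the first four modules alike.

```python
from dataclasses import dataclass
from typing import Optional

# Modules whose questions have follow-up dimensions (the first four modules);
# the module on violence at work records occurrence only.
MODULES_WITH_FOLLOW_UP = {"psychological", "physical", "sexual", "economic"}


@dataclass
class ViolenceItemResponse:
    """One answer to a violence-recognition question (hypothetical layout)."""

    module: str                       # e.g. "psychological" or "work"
    occurred: Optional[bool]          # i) occurrence; None if not answered
    victimizer: Optional[str] = None  # ii) person who perpetrated the event
    incidence: Optional[str] = None   # iii) number of times the event occurred
    prevalence: Optional[str] = None  # iv) when the event last occurred

    def needs_follow_up(self) -> bool:
        """Follow-up dimensions apply only after a 'yes' in the first four modules."""
        return self.occurred is True and self.module in MODULES_WITH_FOLLOW_UP


# Hypothetical example: a 'yes' on a psychological violence item.
item = ViolenceItemResponse(module="psychological", occurred=True)
work_item = ViolenceItemResponse(module="work", occurred=True)
```

Here `item.needs_follow_up()` is true, whereas `work_item.needs_follow_up()` is false because the work module asks about occurrence only.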
These modules on gender-based violence are accompanied by three other modules. The first two are on the socioeconomic characterization of the respondents and the characterization of the households. The last module of the survey is on economic autonomy. The dimensions addressed made it possible to identify the link between the respondents' social and economic conditions and GBV. The survey comprises eight modules, organized as follows:
• Sociodemographic information: In this section, we explore basic information about women associated with their sociodemographic position.
• Household: This section seeks to collect information on kinship relationships within households.
• Gender-based violence: This section is made up of 4 modules whose conceptualization follows the definitions given by Profamilia [10] for each of these types of violence:
• Psychological violence: "This includes any action or omission intended to degrade or control the actions, behaviors, beliefs, and decisions of other people through intimidation, manipulation, threat, humiliation, isolation, or any conduct that involves damage to psychological health. This type of violence is one of the most common and naturalized in society, so we must learn to recognize and report it." For the construction of this section, the following 6 questions were used (Table 3).
[Table 3, surviving row: "Has anyone threatened you and/or taken you away from your loved ones or pets (include if your loved ones or pets have been hurt)?" 6.72; 0.13]
Asking questions about GBV requires special care to prevent women from being re-victimized. For this reason, all the modules that directly address this topic include "does not want to answer" among the response options, and Tables 3-8 include an NA column showing the percentage of women who did not want to answer these questions. Additionally, for each question answered with a Yes, the derived questions were asked (Table 4).
• Economic violence: "This occurs when money is used as a factor to dominate or establish damaging power relationships. This type of violence can manifest itself when money is taken away from a person, who is prevented from spending it for his or her own benefit and that of his or her family, or denied money to control his or her independence." In this section, 15 violence recognition questions were used, divided into three parts (Table 5). The first part asks about economic violence in general, which could involve various actors (8 questions). The second part asks about vicarious violence, that is, violence exerted on the couple's children with the intention of harming the partner (4 questions). The third part focuses only on violence within the couple (3 questions).
In this case, the questions about actors, prevalence, and incidence are asked only for the items on general economic violence. For vicarious violence, questions are asked about temporality and incidence, but not about the actors. For violence in the context of the couple, no additional questions are asked.
• Sexual violence: "This includes all sexual or verbal relationships or acts, not desired or accepted by the other person. Men or women can fall victim to sexual violence when force, physical or psychological coercion, or any other mechanism that nullifies or limits personal will is used against them." As with psychological violence, there are 6 questions on sexual violence (see Table 6) and the structure of derived questions is the same.
• Physical violence: "This includes all attacks to a person's body, whether through blows, throwing objects, confinement, shaking, or squeezing, among other behaviors that may cause physical damage." In the case of physical violence, two questions are asked and the structure of the derived questions is the same as in the case of sexual and psychological violence (see Table 7).
• Violence at work: The aim here was to characterize acts that can be configured as violence at work based on the definition of the International Labor Organization (ILO) [11], which defines it as "Any action, incident or behavior that deviates from what is reasonable through which a person is attacked, threatened, humiliated or injured by another in the exercise of his professional activity or as a direct consequence of it." Violence at work is measured with 9 general questions and an additional question to measure the level of reporting of this violence (see Table 8).
• Economic and financial autonomy: This autonomy is conceptualized from the perspective of ECLAC, which means that financial and economic autonomy is understood as a broad notion that refers to the capacity and material conditions that women need in order to have effective control over their own lives. This general sense of financial and economic autonomy combines two analytical concepts and three thematic categories. Being an autonomous woman in economic-financial terms means having a certain level of financial independence that makes it possible to make decisions one has reason to value. It also involves having some participation in or control over the decisions involving the allocation, distribution, and enjoyment of resources. Financial independence means having the means to support oneself economically while maintaining a good quality of life, and being able to freely decide the destination of one's income without requiring the authorization of a third party.

Telephone interviews
The survey was conducted by telephone due to the mandatory confinement imposed by the Covid-19-related restrictions in Colombia at the time of application. Telephone surveys have some limitations compared to face-to-face surveys, including high rates of abandonment and rejection, in part because in Colombia telephone calls are a mechanism used by criminals for theft. To compensate for this situation, the questionnaire was shortened so that data collection did not exceed 30 min, and an ethical and biosafety protocol was designed so that the women would feel they were in a more secure environment. However, telephone surveys also offer benefits: in the particular case of this survey, which inquired about GBV, the telephone affords some women a sense of anonymity when answering questions that may be very sensitive.
For the fieldwork of this statistical operation, the National Consulting Center (CNC), a company dedicated to carrying out social, business, and market studies that has extensive experience in the development of this type of survey, was contracted. The sampling frame available at the CNC is a telephone directory with about 6.9 million landline numbers and 10 million cell/mobile numbers that are constantly updated.
The final database of this study contains 55% cell/mobile records and 45% landline numbers. In Colombia, received calls are not charged, so the people who answered the survey did not incur any cost. As for the cost assumed by the CNC in 2020, the cost per minute to a cell phone number was 50 COP (0.01 USD), while the cost per minute to a landline was 30 COP (0.008 USD). The selection of telephone records is random; it works through a record download algorithm used by the CNC call center.

Informed consent
Since the survey process was carried out by telephone, informed consent was included at the beginning of the interview. Although the complete questionnaire can be consulted in the Mendeley repository in which the data and other materials of this survey were deposited, the relevant section is reproduced here:
"Good morning. My name is ______ of the Centro Nacional de Consultoría, a private company dedicated to market, social and public opinion research, and we work for the Observatorio para la Equidad de las Mujeres, of the ICESI university and WWB Foundation Colombia. We are conducting a citywide survey of women over the age of 18 on issues associated with family life, the economy, public participation and gender-based violence. This information will be used only for statistical and academic purposes to influence policies in favor of women's equality. This is in accordance with Law 1581 of 2012 on the protection of personal data. All information that you provide us will be kept strictly confidential and will not be disclosed to others. Your participation in this interview is voluntary and if any question arises that you do not want to answer, let me know and I will continue with the following questions.
We hope to count on you, since your participation is very important for this process. Do you agree? With these details, I ask for your authorization to take your data and do the survey, which will take us approximately 20 min.
Authorize: Yes______ No______"
Additionally, this survey follows an ethics protocol that is explained below.

Ethics protocol
The literature review and previous experiences conducted as part of the research showed that this type of measurement process, in which GBV is investigated, can put women at risk if they happen to be responding in spaces that they share with their partner or aggressor. Hence the importance of training and protocols designed to reduce these negative impacts [12]. In addition to these risks, conducting the survey in times of pandemic implied biosecurity measures, so the survey was conducted by telephone. At the time of application, the country's confinement measures were based on population groups, which meant that it was difficult for some women to find time alone so as to be at ease when responding.
Given the above, it was necessary to design an ethics protocol that would protect respondents and interviewers. The interviewers received training that, in addition to covering everything related to the questionnaire and the ethics protocol, included conceptual aspects of gender and GBV. They were also trained to practice emotional restraint. Considering that the respondent could be close to her possible aggressor, it was agreed that the woman could name a fruit whenever she felt at risk, and with this signal, the interviewer would understand that she might be in danger. Once this happened, the interviewer would stop asking questions and wait for an indication from the respondent that they could continue the interview. If the situation involved greater risk for the respondent, she would say the name of a fruit and hang up. The interviewer would then wait 5 min to call back and, if the call was not answered, would proceed to call the local authorities in charge. All this implied that the OEM, as the entity in charge of the survey, activated GBV attention routes with the women's or gender equality offices in the municipalities where the surveys were being administered. A virtual channel was also established and constantly reviewed by the OEM staff, to activate the route in a timely manner whenever required.
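The decision logic of the protocol described above can be summarized in a short Python sketch. It is a simplified illustration, not the procedure actually implemented by the interviewers: the names `Signal`, `Action`, `respond_to_signal`, and `after_call_back` are hypothetical, and the behavior after an answered call back is an assumption, since the text does not specify it.

```python
from enum import Enum, auto


class Signal(Enum):
    NONE = auto()                     # no sign of risk
    FRUIT_NAMED = auto()              # respondent names a fruit
    FRUIT_NAMED_AND_HUNG_UP = auto()  # respondent names a fruit and hangs up


class Action(Enum):
    CONTINUE_INTERVIEW = auto()
    PAUSE_AND_WAIT = auto()          # stop asking until the respondent says to go on
    WAIT_THEN_CALL_BACK = auto()     # wait CALLBACK_WAIT_MINUTES, then call again
    CALL_LOCAL_AUTHORITIES = auto()


# Waiting time before calling back, as stated in the protocol text.
CALLBACK_WAIT_MINUTES = 5


def respond_to_signal(signal: Signal) -> Action:
    """Map the respondent's signal to the interviewer's next step."""
    if signal is Signal.FRUIT_NAMED_AND_HUNG_UP:
        return Action.WAIT_THEN_CALL_BACK
    if signal is Signal.FRUIT_NAMED:
        return Action.PAUSE_AND_WAIT
    return Action.CONTINUE_INTERVIEW


def after_call_back(answered: bool) -> Action:
    """Decide what to do once the call back has been attempted."""
    if not answered:
        return Action.CALL_LOCAL_AUTHORITIES
    # Assumption: if the call is answered, the interviewer waits for the
    # respondent's indication before resuming; the text does not state this.
    return Action.PAUSE_AND_WAIT
```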