Is the Brief Child Abuse Potential Inventory (BCAPI) a valid measure of child abuse potential among mothers and fathers of young children in Germany?

Background: In order to prevent child abuse, instruments measuring child abuse potential (CAP) need to be appropriate, reliable and valid. Objective: This study aimed to confirm the 6-factor structure of the Brief Child Abuse Potential Inventory (BCAPI) in a German sample of mothers and fathers, and to examine longitudinal predictors of CAP. Participants and setting: Two waves of data were collected from 197 mothers and 191 fathers of children aged 10–21 months for the “Kinder in Deutschland – KiD 0–3” in-depth study. Families were stratified based on prior self-report data for screening purposes. Methods: 138 fathers and 147 mothers were included in the analysis (invalid: 25% mothers, 30% fathers). First, validity of reporting was examined. Second, confirmatory factor analysis (CFA) was employed to assess factor structure. Third, internal reliability and criterion validity were examined. Finally, multivariate Poisson regressions investigated longitudinal predictors of CAP in mothers. Results: A previously established six-factor structure was confirmed for mothers but not fathers. CFA failed for fathers due to large numbers of variables with zero variance.


Use of the CAPI with mothers and fathers
The CAPI is used for the detection of child abuse potential in mothers and fathers. Research on differences in the psychometric properties of the CAPI between mothers and fathers is so far limited. Some studies have investigated differences in mean CAPI scores in mothers and fathers with conflicting results: in Portugal, fathers had lower child abuse potential than mothers (Romero-Martínez et al., 2014), in Malaysia, fathers had higher CAPI mean scores (Chua & Abdul, 2001), and in the U.S., no differences were observed (Brewster, Nelson, McCanne, Lucas, & Milner, 1998). While predictors of child abuse potential were similar among fathers and mothers in Portugal (Romero-Martínez et al., 2014), a study from the U.S. found higher risk of distress, unhappiness and problems from people outside the family among mothers while fathers reported more parenting rigidity (Pittman & Buckley, 2006). It is not clear whether differences in scores between mothers and fathers reflect actual differences in child abuse potential or whether they are the result of reporting biases (e.g. questions may be phrased in a way that reflects mothers' child abuse potential better than fathers'). Differences relating to reporting bias can affect the factor structure of the instrument. Much of the published research also focuses on samples consisting predominantly of mothers. No studies thus far have investigated differences in factor structure and in prospective validity of the CAPI. Given these conflicting results, there is insufficient evidence about differences between mothers and fathers in measurement functioning of the CAPI.

The Brief Child Abuse Potential Inventory (BCAPI)
The BCAPI is a short form of the CAPI developed by Ondersma, Chaffin, Mullins, and LeBreton (2005). It consists of 33 items, 24 of which are summarized in a child physical abuse scale. The rest of the items form validity scales: a three-item Random Response Scale and a six-item Lie Scale. Ondersma et al. developed the BCAPI using two independent development samples recruited from families in the U.S. involved with child protective services. Each item of the CAPI Abuse Risk Scale was ranked by its importance in predicting the full CAPI score and future child protective reports. Item-total correlations were then checked for each of the sub-groups (White, Native American, African American and Hispanic women and men). In total, 39 items were identified. Items were then examined for content. Those with complex phrasing were removed, resulting in 33 items which were subjected to principal axis factoring (PAF) using oblimin rotation. This resulted in a seven-factor solution with 24 items. Cross-validation was then performed on two additional independent samples (Ondersma et al., 2005).
The BCAPI has been psychometrically evaluated in three studies to date. Ondersma et al. suggested a stable 7-factor solution in their U.S. samples which explained 66.2% of the variance in BCAPI abuse scores. Their two cross-validation samples (n = 713) consisted of parents (74% and 64% mothers) of diverse ethnic backgrounds (Caucasian, Native American, African American and Hispanic) who were referred for prevention and treatment of child abuse. They established good internal consistency (KR20 = .89) and high correlation of the BCAPI abuse score with the CAPI abuse score (r = .96). Both CAPI and BCAPI scores correlated highly with Beck Depression Inventory scores (r = .67), suggesting overlap between the two in measuring depression. The BCAPI scores predicted reports of neglect more strongly than reports of physical abuse (Ondersma et al., 2005). Walker and Davies (2012) examined the validity of the BCAPI in a convenience sample recruited through schools in the U.K. (n = 358). This sample, consisting of 88% mothers and 92% White participants, was less ethnically diverse than the U.S. samples. Principal axis factoring with oblique rotation resulted in a six-factor solution. Internal consistency was good (α = .82) but criterion validity was not examined. The authors found that one item on the random response scale increased potential invalid protocols from 30.3% to 94%, possibly suggesting a need for adaptation for the U.K. population. Dawe, Taplin, and Mattick (2017) examined the psychometric properties of the BCAPI in a sample of Australian mothers on opioid substitution therapy. PAF with oblimin rotation resulted in a six-factor solution which accounted for 53% of the variance in BCAPI scores. Strong correlations were found between the BCAPI sub-scales and depression and anxiety. Further, strong correlations between all sub-scales except for rigidity were shown, suggesting that rigidity items may be tapping into an independent construct outside of other BCAPI sub-scales. Internal consistency of the BCAPI Abuse Risk Scale was high (KR20 = .90) and tests for criterion validity showed weak correlations with other theoretically related constructs, such as substance use or childhood history of abuse, but strong correlations with psychological distress (Dawe et al., 2017).
The above studies are subject to a number of limitations. First, the vast majority of participants in the four samples were mothers and only those studies with higher percentages of mothers were able to replicate the six-factor structure. Only one of the studies investigated some psychometric properties specifically for fathers. Second, data analyses were conducted in SPSS for three studies, but no mention is made about the distribution of the data and the treatment of binary indicator variables which may be problematic when using that specific software without add-ons. Third, sampling biases for all four samples cannot be excluded. In particular, Walker & Davies achieved a participation rate of only 24.7%, with an additional 10% excluded due to missing information (Walker & Davies, 2012). Fourth, three studies recorded high numbers (> 30%) of invalid protocols, thus excluding a large part of the sample from further analysis, and one study did not report validity indices. To interpret the findings, these limitations must be taken into consideration.

Gender differences in BCAPI child abuse measures
No published study to date has examined differences in factor structure of the BCAPI among mothers as compared to fathers, nor has there been a systematic assessment of the factor structure solely in fathers. This type of research is important because differences in psychometric properties between men and women may indicate either actual differences in child abuse potential or measurement problems. Most research has focused on differences in mean BCAPI abuse scores and internal consistency. Mean scores have been found to be either higher among fathers (Kelley et al., 2015; Tucker, 2014) or not different between parents (Rodriguez, Baker, Pu, & Tucker, 2017).
Thus, the evidence base of the BCAPI is limited and of varying quality. Very little research has examined differences in BCAPI Abuse Scores and psychometric properties of the instrument in mothers and fathers. No studies have tested for measurement invariance between mothers and fathers or have investigated whether the BCAPI is a suitable measure for both parents. Furthermore, no evidence is available on the use of the BCAPI outside predominantly English-speaking countries. To respond to these gaps, the current study had three aims: i) To investigate whether the 6-factor structure found in previous studies could be confirmed for mothers and fathers in this German sample respectively; ii) To examine psychometric properties of the BCAPI for mothers and fathers respectively; and iii) To investigate longitudinal predictors of child abuse potential for mothers and fathers in a non-English-speaking sample.

Sample
Data were collected from September 2014 to February 2015 (baseline) and seven months later (follow-up) as part of the Kinder in Deutschland aged zero to three years (KiD 0-3) in-depth study. The KiD 0-3 is a nationally representative study examining the epidemiology of psychosocial burden and risk for child abuse and neglect in Germany (Eickhorst et al., 2016). A pilot study with a sample of N = 6000 caregivers of toddlers aged 0-3 was carried out in two large German cities to test the risk inventory and two different types of sampling (Eickhorst et al., 2015). Families were recruited via phone and mail (city 1) or at child development reviews (well-child visits) in pediatric practices (city 2). Ethical approval for the pilot study was granted by the General Medical Council in the North-Rhine region (No. 2013247). The in-depth study utilizes a sub-sample of the pilot study population of primary (n = 197) and secondary caregivers (n = 197) of children aged 10-14 months or 17-21 months (Zimmermann et al., 2016). In-depth study participants were recruited according to the number of distal and proximal risk factors for child maltreatment they disclosed in the pilot study. Distal risk factors included family receipt of welfare benefit, overcrowding, more than two children, maternal unemployment, low maternal education, maternal history of child abuse or neglect, and mental disorder or substance abuse of any caregiver. Proximal risk factors included parental conflict, domestic violence, maternal depression, and negative attitudes during pregnancy. High-risk families had a mean of 5.4 risk factors, medium-risk families a mean of 2.3 risk factors, and low-risk families a mean of 0.5 risk factors. Equal numbers of parent dyads from all three risk groups were recruited for the in-depth study but the high-risk group was slightly under-represented (Table 1).

Procedure
Primary and secondary caregivers filled in a posted standardized self-report questionnaire. Primary caregivers also completed an interviewer-assisted questionnaire during a home visit. All completed questionnaires were retrieved during the home visit. In 97% of cases, the primary caregiver was the mother, and in 94% of cases, the secondary caregiver was the father.
All caregivers gave written informed consent prior to participation in the study. Participants received small financial incentives (€40 per family) and were told that they could withdraw at any time without declaration of reasons. Confidentiality was maintained throughout the study. Completed questionnaires were placed in sealed envelopes by participants and only linked via participant ID number. Researchers in contact with participants were not aware of families' risk group categorization. Where researchers had concerns about a child's risk of harm, procedures were put in place for self-referral to existing services with follow-up support.

Measures
Data from all measures were collected at both baseline and follow-up except where explicitly stated. Child abuse potential was measured using the Brief Child Abuse Potential Inventory (BCAPI) (Ondersma et al., 2005). The BCAPI includes two validity scales, the lie scale and the random response scale. The actual child abuse risk scale includes 24 items measuring child abuse risk factors in an agree/disagree format, which are summed to a total score.
Socio-demographic characteristics on child age and gender, parental age and gender, parental relationship, training, and employment were collected with single-item questions from both fathers and mothers. Household information, education and immigration history of both parents were collected with single-item questions only from the primary caregivers. Immigration history was based on §6 of the German MighEV act: being a foreign national, born outside Germany or having any parent that immigrated to Germany. Overcrowding followed a definition used by Eurostat, under which a household is overcrowded if it has fewer rooms than the sum of: one room per family, one room for both parents, one room per two children under 12 years, one room per child aged 12 years and older, and one room per other adult.
Psychiatric symptoms were measured using the Patient Health Questionnaire (PHQ-D) (Löwe, Spitzer, Zipfel, & Herzog, 2002). These included the PHQ-9 for depression, the GAD-7 for anxiety, and a six-item questionnaire on alcohol consumption. The PHQ-D has previously been used in German samples (Löwe, Kroenke, Herzog, & Graefe, 2004, 2008). Items of the depression and anxiety sub-scales, which were measured on 4-point Likert scales, were summed into a total scale score and then dichotomized based on validated clinical cut-offs (both > 9). Items on the alcohol consumption sub-scale were summed into a total score and dichotomized (cut-off > 1). Parental childhood adverse experiences were measured using the ten-item Adverse Childhood Experiences (ACE) measure, which has been psychometrically tested in Germany and showed satisfactory internal consistency and good construct validity (Wingenfeld et al., 2011). This measured physical, emotional and sexual child abuse victimization; neglect; parental separation; mental health problems; drug abuse; domestic violence and imprisonment. A total sum score of exposures to childhood adverse experiences (dichotomously scaled as yes/no) was created (max. 10), and participants were dichotomized into those with multiple exposures (> 3 ACEs) and those with no or fewer exposures based on existing studies on health outcomes (Hughes et al., 2017). This measure was only employed at follow-up.
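The sum-then-dichotomize scoring described above can be illustrated with a minimal sketch. The function name `dichotomize_sum`, the 0-3 item coding and the example responses are assumptions made for illustration; they are not taken from the study's scoring syntax.

```python
def dichotomize_sum(item_scores, cutoff):
    """Sum the item scores and return 1 if the total exceeds the cut-off, else 0."""
    total = sum(item_scores)
    return 1 if total > cutoff else 0


# Hypothetical respondent: nine PHQ-9 items, assumed to be coded 0-3 (total = 11).
phq9_items = [1, 2, 1, 0, 2, 1, 1, 2, 1]
depression_flag = dichotomize_sum(phq9_items, cutoff=9)

# Hypothetical respondent: ten yes/no ACE items coded 1/0 (total = 3).
ace_items = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
multiple_ace_flag = dichotomize_sum(ace_items, cutoff=3)

print(depression_flag, multiple_ace_flag)
```

Because the cut-offs are strict inequalities (> 9, > 3), a respondent with exactly three adverse experiences falls into the "no or fewer exposures" group.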
Family violence was measured using six items from the national prevalence study by the National Society for the Prevention of Cruelty to Children in the United Kingdom (NSPCC; Radford, Corral, Bradley, & Fisher, 2013), which are based on the Juvenile Victimization Questionnaire (Finkelhor, Hamby, Ormrod, & Turner, 2005). Items were translated and adapted to parent report on abusive behaviors experienced by the child. All items had a yes/no answering format. Three items measured child maltreatment (hitting, shaking or neglecting the child), and three items assessed domestic violence (destruction caused by argument, threat or violence) since birth of the child. A dichotomous variable was created for no abusive incidents versus any abusive incidents.
Perceived stress was measured using the four-item Perceived Stress Scale (PSS-4) (Cohen, Kamarck, & Mermelstein, 1983). This assesses the general state of perceived stress in a person's life (e.g., "In the past month, how often have you felt that you were unable to control the important things in your life?"). For the purposes of this study, the five-option response format was extended by an additional option that signified "always." The PSS was created for use in the general public and has been used successfully in Germany (Chu, Jahn, Khan, & Kraemer, 2016). A psychometric evaluation of the German ten-item PSS showed good internal consistency and construct validity (Klein et al., 2016). Two items were reverse coded. Items were summed with higher scores demonstrating higher levels of perceived stress. Internal consistency in the KiD 0-3 in-depth study was α = .80 for mothers and α = .77 for fathers (Liel, 2018).
Satisfaction in the parental relationship was measured using the four-item short form of the Dyadic Adjustment Scale (DAS) (Spanier, 1976). The DAS-4 showed good reliability and measured the couple satisfaction and relationship quality (Sabourin, Valois, & Lussier, 2005). Items were summed to a total score with higher scores showing lower levels of happiness in the relationship.
Division of parental tasks was measured using an adapted version of the Who Does What (WDW) (Cowan & Cowan, 1988) sub-scale on Child Related Tasks (6/18 months). 15 items assessed the distribution of parental tasks between both parents. One item "bringing the child to bed" and two items on childcare on weekdays and weekends were added to the original 12 parental tasks. A mean score of items on nine-point Likert scales was created where higher scores signified higher involvement by the father (1 = mother does it all, 5 = both parents do it equally, 9 = father does it all). One additional item measured satisfaction with division of parenting tasks on a five-point Likert scale. A higher score signified lower satisfaction. The WDW was only employed at baseline with an internal consistency of α = .84 for mothers and α = .81 for fathers (Liel, 2018).
Co-parenting was measured using five items on a six-point Likert-scale adapted from the Family Panel studies "Growing up in Germany" (AiD:A II; German Youth Institute, 2013) and "Panel Analysis of Intimate Relationships and Family Dynamics" (pairfam; Wilhelm et al., 2016). The items measured three dimensions of co-parenting: caregivers' collaboration in relation to parenting issues, problem solving and triangulation with the child. Items were summed with higher scores demonstrating lower levels of co-parenting. The internal consistency of total score in this study was α = .80 for both parents (Liel, 2018).
Parental self-efficacy was measured using a German version of the 16-item post-natal Self Efficacy and Nurturing Role Questionnaire (SENR) (Pedersen, Suwalsky, Cain, & Zaslow, 1987), which is adapted from the Parenting Sense of Competence Scale (Gibaud-Wallston & Wandersman, 1978). It measures expectations in relation to parental competences, such as caring tasks for the child or understanding the child's needs, and has been utilized with mothers and fathers (Solmeyer & Feinberg, 2011). A total sum score was created with higher scores reflecting higher self-efficacy.
Missing data (< 10%) were handled by mean score replacement in CAPI, ACE, WDW and SENR. Given small numbers of items in the other measures, missing data have not been imputed there. Internal consistency of risk scales in this study using mother and father self-report ranged from α = .75 to α = .89 with one outlier: DAS (father-report) α = .64 (Liel, 2018).

Data analysis
Data analysis followed nine steps and used baseline data unless otherwise stated. Analyses were carried out separately for mothers and fathers. First, following Little's approach for testing whether data are missing completely at random, a missing values analysis was carried out in SPSS. Missing values for all BCAPI items were below 5%, which suggests that missing data are likely not a concern in terms of results bias. Second, characteristics of the sample were examined. Prevalence rates for binary variables were compared using a χ² test. Mean values for continuous variables were tested for significant differences between mothers and fathers using an independent samples t-test or Mann-Whitney U test (non-normal distribution). Third, following Walker and Davies (2012), a validity filter was used for all BCAPI items, composed of the Lie Scale (six items) and Random Response Scale (three items). Fourth, differences between frequencies/means of valid and invalid protocols were examined using a χ² test, t-test or Mann-Whitney U test. Fifth, correlations between items were examined, tests for multicollinearity were conducted and distribution of the data was examined. Sixth, Confirmatory Factor Analysis (CFA) in Mplus 8 using the Weighted Least Square Means and Variance Adjusted (WLSMV) estimator for categorical data was conducted to examine whether the six-factor structure identified in previous research (Walker & Davies, 2012) could be confirmed for fathers and mothers using valid protocols only. Where factor structure could not be confirmed, inadequate data or unsuitability of the BCAPI was assumed and no further analyses carried out. As missing data were less than 5%, a decision was made to use maximum likelihood estimation with pairwise deletion for the factor analysis component.
Seventh, internal consistency of the BCAPI Abuse Risk Scale was tested using the Kuder-Richardson Formula 20 (KR20) for dichotomous data, and correlations using Spearman's rho were conducted with measures thought to be theoretically and empirically associated with child abuse potential to establish concurrent validity. Finally, predictors of child abuse potential were tested using two separate longitudinal regression analyses (full BCAPI Abuse Risk Scale and CFA-reduced BCAPI Abuse Risk Scale²) with BCAPI Abuse Score at T2 as outcome and BCAPI Abuse Score at T1 as control variable using the valid protocols only. All risk factors were measured at T1 or reported retrospectively (e.g., childhood experience of abuse). Following Hosmer and Lemeshow (2000), all independent variables were first examined in univariate regressions and retained for multivariate regressions if p < 0.25 (Step 1). In multivariate regressions, all independent variables with p > 0.1 were subsequently removed (Step 2), and then all independent variables with p > 0.05 were removed until all remaining independent variables were significant at p < 0.05 (Step 3). These models were estimated using Poisson regression to accommodate the use of count variables. Zero-inflation, assessed with Vuong's test, was used to adjust for the skewed distribution of some of the BCAPI scores where necessary (Fig. 1) (Li et al., 1999).
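The stepwise retention rule attributed to Hosmer and Lemeshow above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: `fit_multivariate` is a hypothetical callable standing in for whatever regression routine returns a p-value per predictor (with the baseline BCAPI score handled inside it as a control), and all predictor names and p-values in the example are invented.

```python
from typing import Callable, Dict, List


def purposeful_selection(
    candidates: List[str],
    univariate_p: Dict[str, float],
    fit_multivariate: Callable[[List[str]], Dict[str, float]],
) -> List[str]:
    """Return the predictors that survive the three retention steps."""
    # Step 1: keep predictors with p < 0.25 in univariate regressions.
    kept = [v for v in candidates if univariate_p[v] < 0.25]

    # Step 2: fit the multivariate model once and drop predictors with p > 0.10.
    if kept:
        pvals = fit_multivariate(kept)
        kept = [v for v in kept if pvals[v] <= 0.10]

    # Step 3: refit and drop predictors with p > 0.05 until none remain to drop.
    while kept:
        pvals = fit_multivariate(kept)
        remaining = [v for v in kept if pvals[v] <= 0.05]
        if len(remaining) == len(kept):
            break
        kept = remaining
    return kept


# Invented p-values purely to exercise the function.
uni_p = {"stress": 0.01, "alcohol": 0.20, "age": 0.60, "education": 0.03}


def fake_fit(predictors: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a (zero-inflated) Poisson regression fit."""
    table = {"stress": 0.08, "alcohol": 0.04, "education": 0.01}
    return {v: table[v] for v in predictors}


selected = purposeful_selection(
    ["stress", "alcohol", "age", "education"], uni_p, fake_fit
)
print(selected)
```

In this toy run, `age` is dropped at Step 1, all three remaining predictors pass Step 2, and `stress` is dropped at Step 3 because its multivariate p-value exceeds 0.05.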

Characteristics of the sample
Characteristics of the sample are presented in Tables 1 and 2. Differences between mothers and fathers were observed with regard to migration history, employment, caregiving arrangements for children, mental health, adverse experiences in childhood, and BCAPI Abuse Scores (Table 2).

Validity indices
Validity indices were calculated for both mothers and fathers. Those with a score > 3 on the Lie Scale and a score > 2 on the Random Response Scale were excluded (Walker & Davies, 2012). Participants who scored > 3 on the Lie Scale and > 12 on the BCAPI Abuse Risk Scale were retained (Milner, 1986). 25% (n = 50) of mothers and 30% (n = 59) of fathers had invalid protocols. This resulted in a sample of 147 mothers and 138 fathers with valid protocols that was included in the analyses. The difference in invalid protocols between mothers and fathers was not significant. Using the Random Response Cut-Off > 0 from Ondersma et al. (2005) would have produced much higher rates of invalid protocols in mothers (41%) and fathers (43%).
These results suggest that fathers at higher risk for abusing their child may be more likely to have invalid protocols. According to testing by psychosocial risk groups, fathers in the group at high risk were more likely to have invalid protocols than those in the medium- or low-risk groups (44.6% versus 33.8% versus 21.5%, χ²(2, N = 189) = 7.65, p < .05). There was no statistically significant difference for mothers. All subsequent analyses presented data using valid protocols only.
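The validity filter described in this section can be expressed as a small decision rule. The sketch below rests on one explicit assumption about the wording: the Lie Scale and Random Response Scale cut-offs are read as two independent exclusion criteria, with the Milner (1986) retention rule overriding only the Lie Scale exclusion. The function name and argument names are hypothetical.

```python
def protocol_is_valid(lie, random_response, abuse,
                      lie_cutoff=3, rr_cutoff=2, abuse_override=12):
    """Classify one BCAPI protocol as valid (True) or invalid (False).

    Assumed reading of the rule: a Random Response score above its cut-off
    always invalidates the protocol; a Lie Scale score above its cut-off
    invalidates it unless the Abuse Risk score exceeds the override value.
    """
    if random_response > rr_cutoff:
        return False
    if lie > lie_cutoff and abuse <= abuse_override:
        return False
    return True


# Hypothetical respondents: (lie, random_response, abuse)
examples = [(0, 0, 5), (4, 0, 5), (4, 0, 13), (0, 3, 5)]
flags = [protocol_is_valid(*e) for e in examples]
print(flags)
```

Setting `rr_cutoff=0` would correspond to the stricter Random Response cut-off from Ondersma et al. (2005) mentioned above.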

Characteristics of the data
Prior to factor analysis testing, data were assessed for multicollinearity using the variance inflation factor (VIF). For fathers and mothers, no variables met the criterion for multicollinearity (VIF > 10). However, 11 variables for fathers and two variables for mothers correlated very highly with each other (r > .8), and thus these were kept under observation in the analyses (Field, 2009). Distribution of the data across the BCAPI Abuse Risk Scale was assessed. The data were non-normally distributed with a high number of zero scores for both mothers and fathers, particularly at baseline.

Confirming the six-factor solution
Walker and Davies (2012) established a six-factor structure in a sample of British parents (92.3% female) that was confirmed in a sample of Australian drug-using mothers (Dawe et al., 2017). Thus, Confirmatory Factor Analysis in Mplus using the WLSMV robust estimator was employed to confirm the factor structure in this current sample. Analyses were run separately for fathers and mothers.
² The reduced BCAPI score is a sum score of all the items confirmed in the CFA.
For mothers, two items were removed due to empty cells related to selection of the response option (all participants selected "disagree"), and one item was removed due to low factor loadings (< .4) (Bowen & Guo, 2012). After removal of these items, the six-factor structure suggested by Walker and Davies could be confirmed. Model fit for the final model was χ²/df = 1.17 (χ² = 203.586, df = 174, p = .062); RMSEA = .034 (p = .914), WRMR = .727, CFI = .992, TLI = .991 (Table 3). High correlations between all factors except rigidity could be observed (Table 3).
For fathers, 13 items had to be removed due to empty cells and low variance (all participants selected "disagree"). This eliminated two factors and resulted in a non-positive definite covariance matrix. The six-factor structure identified by Walker and Davies (2012) could not be confirmed among fathers in this sample and thus no further analyses were conducted on the paternal data.
With regard to psychometric properties, these analyses confirmed the six-factor structure of the BCAPI for mothers but not for fathers. This demonstrates potential differences in suitability of the measure among parents. Taking into account that fathers reported lower abuse scores than mothers, this could indicate that BCAPI items are not gender-sensitive for fathers. It is possible that the wording leads fathers to inwardly reject the items. While the measure has been successfully used in mothers in other populations and factor structure could be replicated, use in fathers is questionable based on these data and requires further research.
To investigate concurrent criterion validity of the BCAPI Abuse Risk Scale, correlations between the BCAPI reduced scale, the BCAPI sub-scales, and empirically and theoretically associated measures were conducted for mothers (Table 4). For mothers, moderate to strong correlations were observed between BCAPI scores and all hypothesized factors except for family violence and adverse childhood experience, both of which showed small correlations.

Predictors of child abuse potential in mothers
As the BCAPI Abuse scale is often used in clinical practice, we present two sets of regression results here: those using the full BCAPI Abuse Scale and those using the reduced BCAPI Abuse Score resulting from the CFA. Results for Hosmer & Lemeshow's Step 2, Step 3 and Step 4 (where applicable) of the zero-inflated poisson regression are presented for mothers (Table 5).
Step 2 of the analysis includes sub-sets of significant risk factors for mothers identified through correlations.
For mothers, baseline BCAPI Abuse Score indicated that perceived stress and alcohol abuse increased child abuse potential at follow-up. An additional risk factor for increased child abuse potential using the full BCAPI score was dissatisfaction with parental roles. Higher education status at baseline decreased child abuse potential at follow-up (Table 5).
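For readers less familiar with the zero-inflated Poisson model named above, the following minimal sketch shows its probability mass function: a point mass at zero mixed with an ordinary Poisson count. The parameter names `lam` (Poisson mean) and `pi` (probability of a structural zero) and the example values are illustrative assumptions, not estimates from this study.

```python
import math


def zip_pmf(k, lam, pi):
    """Probability of observing count k under a zero-inflated Poisson model."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise either structurally (pi) or from the Poisson part.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson


lam, pi = 2.0, 0.3  # illustrative values only
probs = [zip_pmf(k, lam, pi) for k in range(50)]
total_mass = sum(probs)
mean_count = sum(k * p for k, p in enumerate(probs))
print(round(total_mass, 6), round(mean_count, 6))
```

The expected count under this model is (1 - pi) * lam, so excess zeros pull the mean below the Poisson mean, which is why a plain Poisson regression can fit poorly when many respondents score zero.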

Discussion
This study described the first psychometric evaluation of the BCAPI in a sample of mothers and fathers that was not predominantly English speaking. It found significantly higher BCAPI mean scores among mothers than fathers, and a higher number of invalid protocols among fathers than mothers. The hypothesized 6-factor structure could be confirmed for mothers but not fathers. Internal reliability and concurrent criterion validity were adequate for mothers. Longitudinal analyses examining risk factors for child abuse potential in mothers confirmed some of the hypothesized associations. Previously, only one study had examined differences in BCAPI mean scores between fathers and mothers and reported no differences (Rodriguez et al., 2017). Previous research on the CAPI found that mothers who were married or in relationships had higher CAPI abuse mean scores than fathers, whereas single fathers had higher CAPI scores than single mothers (Miragoli et al., 2015). Further research is needed with samples containing both caregivers and different population groups to investigate under which circumstances fathers or mothers may have a higher risk for child abuse potential.
C. Liel et al. Child Abuse & Neglect 88 (2019) 432-444
This study found lower percentages of invalid study protocols compared to Walker and Davies (2012), but the number of invalid protocols was still high (25% for mothers; 30% for fathers). Among mothers, those with invalid protocols were more likely to have a history of migration. Among fathers, those with invalid protocols were found to have higher risk for child maltreatment. These fathers had higher BCAPI scores, elevated stress and depression rates, and were more dissatisfied with their relationship with the mother, including co-parenting and role distribution. This is concerning because the BCAPI is often used by clinicians and may therefore fail to identify fathers at the highest risk of maltreating their children unless the validity indices are taken into consideration during screening. As no other studies have used the BCAPI with a population of fathers, future research is needed to examine whether these findings are replicable.
This study was able to confirm the six-factor structure of the BCAPI among German mothers previously identified in the literature (Dawe et al., 2017; Walker & Davies, 2012). The six-factor structure could not be confirmed for fathers, mostly because several items had to be removed owing to insufficient variance. In the worst cases, insufficient variance led to empty cells, which means that for some variable combinations, certain dimensions have zero observations. These issues can make the use of robust methods, such as factor analysis for extracting the dimensions of child abuse potential, impossible. While this problem might be specific to this dataset, it might also point to an insufficient sensitivity of the questionnaire items in detecting underlying risk factors that contribute to child abuse potential, particularly among fathers. Future research should investigate whether insufficient variance is a specific problem among fathers in other cultural contexts and languages as well, and whether the measure requires adaptation to make it suitable for clinical use with fathers.
Table 5
Risk and Protective Factors of BCAPI Abuse Score at follow-up in mothers using multivariate (zero-inflated) Poisson regressions.
Columns: Predictors; Step 2; Step 3; Step 4; Step 2; Step 3.
Notes: + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001.

Internal consistency of the BCAPI and its sub-scales among mothers was good except for the impact of others and rigidity sub-scales. Internal consistency for the validity indices in this study was unacceptable. Previous research on the BCAPI and the CAPI has identified low levels of internal consistency on the validity scales. This is not surprising given the small number of items and the phrasing of the questions, which do not suggest associations between items (Haz & Ramírez, 2002; Walker & Davies, 2012). Poor internal consistency on some of the other sub-scales may be related directly to low numbers of items within the sub-scales despite moderate inter-item correlations (Tavakol & Dennick, 2011).
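The internal consistency coefficient used in this study for dichotomous items, KR-20, can be computed as in the following minimal sketch. The function name and the toy response matrix are illustrative assumptions; the sketch uses the population variance of total scores, whereas some software uses the sample variance, which gives slightly different values in small samples.

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 for a list of respondents' 0/1 item scores."""
    n = len(responses)       # number of respondents
    k = len(responses[0])    # number of items

    # Proportion of respondents endorsing each item, and the sum of p * (1 - p).
    p = [sum(row[j] for row in responses) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)

    # Population variance of the total scores.
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n

    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Toy data: four hypothetical respondents answering three agree/disagree items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(toy))
```

With few items, the sum of item variances makes up a larger share of the total-score variance, which is one reason short sub-scales tend to yield lower coefficients even when inter-item correlations are moderate.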
Concurrent validity was also assessed by examining associations between child abuse potential and constructs commonly correlated with it. Concurrent validity was found to be adequate for mothers. Notably, the association of BCAPI with family violence was small among mothers, suggesting either social desirability bias on the family violence section of the questionnaire or a weakness of the BCAPI in relation to accurate identification of abusive parents. Future research could usefully include instruments that capture actual incidents of abusive behavior rather than underlying risk factors.
4.1. Risk factors for child abuse potential using sum scores
This paper identified significant and substantive differences with regards to risk factors for child abuse potential using the traditionally calculated BCAP abuse scale sum score versus the reduced sum score based on CFA results. Both scores were examined in this study as clinicians in practice use the full sum score of the whole 24-item measure.
For mothers, we find clear associations between BCAPI Abuse Scores at baseline and BCAPI Abuse Scores at T2 (for both full BCAPI and reduced BCAPI), suggesting that these scores tend to remain more or less stable over time without intervention (here, 7 months). For mothers, dissatisfaction with parental role distribution (risk), alcohol abuse (risk) and higher education level (protective) were predictive of full BCAPI score at follow-up over and above baseline BCAPI score. Baseline BCAPI score, alcohol abuse and higher education level were predictive of the reduced BCAPI score at follow-up.
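The models summarized in Table 5 are described as (zero-inflated) Poisson regressions. As a rough illustration of what the zero-inflation component does, the sketch below implements the zero-inflated Poisson probability mass function, in which a share of respondents contributes structural zeros in addition to the zeros expected from a Poisson count. The parameter values are assumptions chosen for illustration and are not estimates from this study.

```python
import math


def zip_pmf(k, lam, pi_zero):
    """Zero-inflated Poisson pmf.

    lam     -- Poisson rate for the count process (assumed value)
    pi_zero -- probability of a structural zero (assumed value)
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi_zero + (1 - pi_zero) * poisson
    return (1 - pi_zero) * poisson


lam, pi_zero = 2.0, 0.3  # hypothetical parameters
p_zero_plain = math.exp(-lam)          # share of zeros under a plain Poisson
p_zero_zip = zip_pmf(0, lam, pi_zero)  # share of zeros with zero inflation
expected_score = (1 - pi_zero) * lam   # mean of the zero-inflated distribution

print(round(p_zero_plain, 3), round(p_zero_zip, 3), expected_score)
```

Under these assumed values the zero-inflated model expects roughly three times as many zero scores as a plain Poisson model with the same rate, which is the situation such models are designed for when many respondents endorse no abuse-scale items.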
Interestingly, the presence of more than three adverse childhood experiences did not predict any BCAPI scores at follow-up. A large body of previous research has assumed linkages between childhood victimization and perpetration of abuse in parenthood (Deater-Deckard, Lansford, Dodge, Pettit, & Bates, 2003; McCloskey & Bailey, 2000). This view has become more nuanced in recent years, however, with research demonstrating surveillance or detection bias for families with prior involvement in child protective services as well as the moderating effects of safe, stable and nurturing relationships in breaking the intergenerational transmission of child abuse (Schofield & Lee, 2013; Widom et al., 2015). Adverse childhood experiences have been found to be associated in particular with poor mental health outcomes (Hughes et al., 2017), which in turn have been identified as a risk factor for child maltreatment (Meinck, Cluver, Boyes, & Mhlongo, 2015).
Alcohol abuse at T1 was identified as a specific risk factor for mothers in the present sample, replicating a large corpus of existing research (e.g., Walsh, MacMillan, & Jamieson, 2002; Widom & Hiller-Sturmhofel, 2001). A further predictor specific to maternal child abuse potential was higher education status, which was inversely related to BCAPI Abuse Scores: mothers with a higher education level had lower scores. Thus far, little research has been conducted on the impacts of education on child abuse potential, but research has consistently shown the protective effects of secondary education on harsh parenting in adulthood (World Health Organization, 2002). A recent study from Italy found that fathers with a university degree had markedly lower CAPI Abuse Scores than those who had completed high school and those with only some schooling (Miragoli et al., 2015). Further research is needed to examine the role of parental education for child abuse prevention.

Limitations
This study is subject to a number of limitations. First, this study is based on a single dataset. Due to the small sample size, it was impossible to split the sample into separate datasets for EFA and CFA. As CFA is considered superior and more robust when testing a hypothesized factor structure, a CFA approach was used (Byrne, 2005). Further, this study had a sample size which exceeded that of previous studies (Dawe et al., 2017; Tucker, 2014). Second, the psychometric testing on a reduced dataset could not take the dyadic nature of the data into account. The number of invalid protocols for this study was so high that analyses were conducted on valid protocols for fathers and mothers independent of each other. Third, zero variance was particularly common among paternal BCAPI reports, effectively making the data unsuitable for further analyses. Whether this is a general problem with paternal reporting on the BCAPI or specific to this dataset needs to be examined in further research. Fourth, no other validated measures of child abuse potential were part of the study; thus, no criterion validity testing could be carried out against another child abuse potential measure. As this study was part of a larger study on children at risk, participant burden had to be taken into account during data collection, limiting the number of measures included. Fifth, data collection utilized parent self-report, which may be subject to desirability bias. However, the BCAPI was specifically developed to measure less stigmatizing indicators of child abuse potential rather than actual child abuse occurrence.
Despite these limitations, this is the first study investigating the properties of the BCAPI in a German sample. While the investigation showed adequate psychometric properties for mothers, a number of measurement problems could be identified in relation to fathers which necessitates future research.
Taking into account the results of the present study, clinical use of the BCAPI with fathers is not recommended as it might produce data that are hard to interpret. As the BCAPI is a relatively new instrument, research thus far has focused on predominantly female samples and has not investigated gender differences with regards to internal consistency or construct validity. Future research is needed to establish whether these poor results for fathers are due to differences in measurement functioning for fathers in general or just for this specific population.
Author contributions: CL, KL and AE were involved in the overall design and management of the study and data collection. CL and FM had responsibility for conceptualizing and writing the paper. FM led the analyses. FM and JS conducted the analyses. CL, HK and AE contributed to the analyses and interpretation of findings. FM and JS wrote the manuscript with input from CL, HK, KL and AE. All authors reviewed and approved the final version.