Supervised Classes, Unsupervised Mixing Proportions: Detection of Bots in a Likert-Type Questionnaire

Administering Likert-type questionnaires to online samples risks contamination of the data by malicious computer-generated random responses, also known as bots. Although nonresponsivity indices (NRIs) such as person-total correlations or Mahalanobis distance have shown great promise to detect bots, universal cutoff values are elusive. An initial calibration sample constructed via stratified sampling of bots and humans—real or simulated under a measurement model—has been used to empirically choose cutoffs with a high nominal specificity. However, a high-specificity cutoff is less accurate when the target sample has a high contamination rate. In the present article, we propose the supervised classes, unsupervised mixing proportions (SCUMP) algorithm that chooses a cutoff to maximize accuracy. SCUMP uses a Gaussian mixture model to estimate, unsupervised, the contamination rate in the sample of interest. A simulation study found that, in the absence of model misspecification on the bots, our cutoffs maintained accuracy across varying contamination rates.


Factor space representation in functional method theory
In functional method theory (FMT), items are represented as unit-length vectors in the factor space (Dupuis et al., 2015). We consider a matrix of item characteristics, having d items as rows and k factors as columns. This matrix is defined as satisfying two properties: orthogonal columns and unit-normalized rows.
Orthogonal columns. The k factors are uncorrelated.
Unit-normalized rows. The d items each have Euclidean length 1.
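In matrix notation, writing K for the matrix of item characteristics, the two properties can be sketched as follows. This sketch reads "orthogonal" as a zero inner product between distinct columns, which coincides with zero correlation when the columns are mean-centered; the row and column subscripts are notation introduced here for illustration.

```latex
K \in \mathbb{R}^{d \times k}, \qquad
K_{\cdot j}^{\top} K_{\cdot l} = 0 \quad (j \neq l), \qquad
\lVert K_{i \cdot} \rVert_2 = 1 \quad (i = 1, \dots, d).
```

Under this reading, $K^{\top} K$ is diagonal, and its trace equals d because every row has squared length 1.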
Given a Likert-type response data matrix having n examinees as rows and d items as columns, we want to get the matrix of item characteristics K. To do so, we use the iterative algorithm shown in Figure 7 (Dupuis et al., 2020).

Nonresponsivity indices in factor space
In FMT, examinees are represented in the same factor space as items (Dupuis et al., 2015). We consider an examinee's response strategy vector, which is simply a k-dimensional vector collecting the raw response vector's correlations with the factors. Based on response strategy vectors, FMT offers two nonresponsivity indices (Dupuis et al., 2019, 2020): response coherence and response reliability.
Response coherence. The examinee's response coherence is the Euclidean length of her response strategy vector.

Response reliability. The examinee's response reliability is the inner product between her two unit-normed split-half response strategy vectors, after Spearman-Brown correction.
For response reliability, the split-half procedure is as follows (Dupuis et al., 2020).
1. Construct the matrix of item characteristics for the entire test.
2. Partition the matrix of item characteristics into two subtests (subsets of rows).
3. Calculate the examinee's two response strategy vectors, one per subset.
4. Unit-norm the two response strategy vectors, take their inner product, then apply Spearman-Brown correction.
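The two indices can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the odd/even split of items, and the toy matrix K (eight items placed evenly on the unit circle, two factors) are all hypothetical. The response strategy is computed as the Pearson correlation between the raw responses and each column of K, which is our reading of the definition above.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # Raises ZeroDivisionError for a constant sequence (e.g., straight-lining).
    return sxy / math.sqrt(sxx * syy)

def response_strategy(responses, K):
    """Correlate the raw responses with each factor (column of K)."""
    k = len(K[0])
    return [pearson(responses, [row[j] for row in K]) for j in range(k)]

def response_coherence(responses, K):
    """Euclidean length of the response strategy vector."""
    return math.sqrt(sum(s * s for s in response_strategy(responses, K)))

def response_reliability(responses, K):
    """Spearman-Brown-corrected inner product of unit-normed split-half strategies."""
    # Assumption: items are split by alternating index; the source does not say how.
    halves = (range(0, len(K), 2), range(1, len(K), 2))
    units = []
    for idx in halves:
        s = response_strategy([responses[i] for i in idx], [K[i] for i in idx])
        norm = math.sqrt(sum(v * v for v in s))
        units.append([v / norm for v in s])
    r = sum(a * b for a, b in zip(*units))
    return 2 * r / (1 + r)  # Spearman-Brown correction; undefined at r = -1

# Hypothetical toy matrix: 8 items evenly spaced on the unit circle, 2 factors.
K = [[math.cos(i * math.pi / 4), math.sin(i * math.pi / 4)] for i in range(8)]
consistent = [3 + 2 * row[0] for row in K]  # toy responses that track factor 1 exactly
print(response_coherence(consistent, K))
print(response_reliability(consistent, K))
```

For this toy examinee, whose responses are a linear function of the first factor, both indices come out close to 1. The toy responses are not integer Likert categories; they are chosen only to make the arithmetic easy to follow.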

Additional Figures and Tables
This section contains one Figure and several Tables not present in the main manuscript.
Supplemental Figure 1 is a dot chart of mean-aggregate area under the curve (AUC) in the simulation study (described in the main manuscript). Each dot represents the mean of the AUC aggregated over cells with the same combination of bot distribution, human calibration sample size, and classifier. For example, the topmost dot is the mean AUC over the cells with uniform bots, $n_0^{\mathrm{tr}} = 400$, and the Supervised Classes, Unsupervised Mixing Proportions (SCUMP) Bayes classifier. This dot aggregates over different values of the contamination rate $n_1 / (n_1 + n_0)$, as AUC is invariant to base rates.

Supplemental Tables 1-4 report the results of the simulation study (described in the main manuscript) for the n = 1000 conditions. Supplemental Table 1 is for the 99% specificity classifier on uniform bots, corresponding to Figure 6a in the main manuscript; Supplemental Table 2 is for the SCUMP Bayes classifier on uniform bots, corresponding to Figure 6b in the main manuscript; Supplemental Table 3 is for the 99% specificity classifier on middle-responding bots, corresponding to Figure 6c in the main manuscript; and Supplemental Table 4 is for the SCUMP Bayes classifier on middle-responding bots, corresponding to Figure 6d in the main manuscript. Unlike Figure 6 in the main manuscript, these Tables also report sensitivity, flag rate, and AUC.

Supplemental Tables 5-8 are analogous to Supplemental Tables 1-4 but for the n = 500 conditions. Unlike Tables 1-4, there are no corresponding Figures in the main manuscript.
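The base-rate invariance of AUC mentioned above for Supplemental Figure 1 can be checked with a small sketch. The scores below are hypothetical, and the sketch assumes that higher scores are more bot-like. AUC is computed as the proportion of (bot, human) pairs in which the bot scores higher, with ties counted as one half.

```python
def auc(human_scores, bot_scores):
    """Probability that a random bot outscores a random human (ties count half)."""
    wins = 0.0
    for b in bot_scores:
        for h in human_scores:
            wins += 1.0 if b > h else 0.5 if b == h else 0.0
    return wins / (len(bot_scores) * len(human_scores))

humans = [0.1, 0.2, 0.35, 0.4, 0.6]  # hypothetical scores for 5 humans
bots = [0.3, 0.5, 0.7, 0.9]          # hypothetical scores for 4 bots

# Replicating one class changes the contamination rate but not the score distributions.
low_contamination = auc(humans * 10, bots)   # 4 bots among 54 respondents
high_contamination = auc(humans, bots * 10)  # 40 bots among 45 respondents
print(low_contamination, high_contamination)
```

The two values agree because AUC depends only on the within-class score distributions, not on how many respondents fall in each class. Accuracy at a fixed cutoff, by contrast, does change with the contamination rate.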

1. With the n × d data matrix as input, calculate the correlation matrix, then do principal component analysis (PCA). Output the d × k matrix whose columns are the first k eigenvectors.
2. Iteratively satisfy the two properties of the factor space representation until convergence.
   (a) Orthogonalization. In the first iteration, use the output of Step 1 as input; otherwise, use the output of Step 2b from the previous iteration. Calculate a correlation matrix, then do another PCA. Output the d × k matrix of principal component scores. Note that the output then has uncorrelated columns, but the rows may be unnormalized.
   (b) Normalization. With the output of Step 2a, unit-normalize each row. Note that the output then has normalized rows, but the columns may be non-orthogonal. If columns are orthogonal, exit loop.

Figure 7. Iterative algorithm for computing the functional method theory (FMT) matrix of item characteristics.
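The steps in Figure 7 can be sketched in Python with NumPy as follows. This is an illustration under several assumptions, not the reference implementation of Dupuis et al. (2020): the function names, the tolerance `tol`, and the iteration cap `max_iter` are hypothetical; PCA on a correlation matrix is implemented by standardizing the columns and projecting onto the eigenvectors; and the exit test checks that the column correlations are numerically zero. The sketch requires k ≥ 2 and is not guaranteed to converge within the iteration cap.

```python
import numpy as np

def top_eigvecs(R, k):
    """Eigenvectors of a symmetric matrix R, ordered by decreasing eigenvalue."""
    vals, vecs = np.linalg.eigh(R)
    return vecs[:, np.argsort(vals)[::-1][:k]]

def fmt_item_matrix(data, k, max_iter=500, tol=1e-8):
    """Sketch of the Figure 7 iteration; requires k >= 2."""
    # Step 1: PCA of the d x d item correlation matrix; keep the first k eigenvectors.
    K = top_eigvecs(np.corrcoef(data, rowvar=False), k)
    for _ in range(max_iter):
        # Step 2a: PCA on the k x k column correlation matrix; output the PC scores.
        Z = (K - K.mean(axis=0)) / K.std(axis=0, ddof=1)
        scores = Z @ top_eigvecs(np.corrcoef(K, rowvar=False), k)
        # Step 2b: rescale every row (item) to Euclidean length 1.
        K = scores / np.linalg.norm(scores, axis=1, keepdims=True)
        # Exit once the columns are numerically uncorrelated again (assumed criterion).
        C = np.corrcoef(K, rowvar=False)
        if np.max(np.abs(C - np.eye(k))) < tol:
            break
    return K

rng = np.random.default_rng(0)
data = rng.integers(1, 6, size=(200, 8)).astype(float)  # simulated 5-point responses
K = fmt_item_matrix(data, k=3)
print(K.shape, np.linalg.norm(K, axis=1))
```

Whatever the number of iterations, the returned matrix has unit-length rows because the loop always ends on the normalization step; whether the columns are also uncorrelated depends on whether the exit test was met before `max_iter`.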

Figure 8. Dot chart of mean-aggregate AUCs from the simulation study. Each point aggregates cells under a combination of classifier, bot distribution, and number of calibration humans $n_0^{\mathrm{tr}}$. bayes = Supervised Classes, Unsupervised Mixing Proportions Bayes classifier; spec99 = nominal 99% specificity-calibrated classifier; unif = uniform response style; mrs = middle response distribution; AUC = area under the curve.
Simulation study results for n = 1000, uniform bots, and 99% specificity classifier. Outcome measures are accuracy, specificity, sensitivity, flag rate, and area under the curve (AUC).

Simulation study results for n = 500, uniform bots, and 99% specificity classifier. Outcome measures are accuracy, specificity, sensitivity, flag rate, and area under the curve (AUC).

Simulation study results for n = 500, non-uniform bots, and Supervised Classes, Unsupervised Mixing Proportions (SCUMP) Bayes classifier. Outcome measures are accuracy, specificity, sensitivity, flag rate, and area under the curve (AUC).

n = target sample size, n; contam = contamination rate; n0tr = number of humans in calibration sample, $n_0^{\mathrm{tr}}$; botdist = bot Likert response distribution; acc = accuracy; spec = specificity; sens = sensitivity; flag = flag rate; auc = area under the receiver operating characteristic curve; unif = uniform distribution; mrs = middle responding; scumpbayes = Bayes classifier with Supervised Classes, Unsupervised Mixing Proportions.