Can Incarceration Really Strip People of Racial Privilege?

We replicate and reexamine Saperstein and Penner’s prominent 2010 study which asks whether incarceration changes the probability that an individual will be seen as black or white (regardless of the individual’s phenotype). Our reexamination shows that only a small part of their empirical analysis is suitable for addressing this question (the fixed-effects estimates), and that these results are extremely fragile. Using data from the National Longitudinal Survey of Youth, we find that being interviewed in jail/prison does not increase the survey respondent’s likelihood of being classified as black, and avoiding incarceration during the survey period does not increase a person’s chances of being seen as white. We conclude that the empirical component of Saperstein and Penner’s work needs to be reconsidered and new methods for testing their thesis should be investigated. The data are provided for other researchers to explore. This version (SSRN) also contains updated supplemental analyses.

Saperstein and Penner's 2010 paper "The Race of a Criminal Record: How Incarceration Colors Racial Perceptions" has been widely discussed by sociologists (e.g., in Contexts, Sociological Images, and Racism Review) and has also drawn attention from the general media (e.g., The Colbert Report 1). Saperstein and Penner used their analysis of National Longitudinal Survey of Youth (NLSY79) data to argue that race is a fluid social construction in the United States, not just at the macro level, but also for individuals as they experience the highly racialized event of going to jail or prison. They claim that "once incarcerated, individuals who were not seen as black before are more likely to be seen as such, and inmates who previously identified as white may strip themselves, or can be stripped by others, of their racial privilege" (Saperstein and Penner 2010:109). We reanalyze these data and conclude that Saperstein and Penner have significantly underestimated the robustness of racial privilege and overgeneralized their results.
A key part of Saperstein and Penner's provocative argument is that racial fluidity is not simply racial ambiguity or measurement error; all individuals, regardless of phenotype, are significantly less likely to be seen as white and more likely to be seen as black during (and even after) imprisonment. That is, when a survey interviewer goes into a prison, the setting provides more than just an additional clue about respondent racial identity in uncertain cases. It significantly alters what would have been an easy classification in a different context.
Saperstein and Penner present a variety of descriptive statistics and multivariate analyses intended to support their causal thesis about how respondent incarceration colors interviewer perceptions. However, we argue that most of these results simply tell us what we already know: African Americans, Latinos/as, and Native Americans have a higher incarceration rate than whites. In our opinion, Saperstein and Penner's fixed-effects models are the only ones with potential to meaningfully address the hypothesis that a change in incarceration status produces a change in racial self-identification and classification by others. As Saperstein and Penner (2010) themselves note:

The addition of person fixed effects is necessary because previous research shows that characteristics such as skin tone, facial features, and names, which are not typically recorded in surveys, are linked to racial self-identification and classification (Bertrand and Mullainathan 2004; Maddox 2004). While we cannot control for these characteristics explicitly, to the degree that they do not vary over time, we can account for them using fixed effects regression. These models also account for any unmeasured, time-invariant, race-related selection bias that affects who is most likely to go to prison. Because they effectively compare observations within individuals, these models are useful in identifying whether incarceration causes changes in either racial self-identification or racial classification. (P. 97)

We replicate the fixed-effects analyses and find that the results hinge on decisions that were not specified in the article. For parsimony, we focus our attention on what we believe to be Saperstein and Penner's strongest results for supporting their causal claims: the racial classification models using 17 years of panel data in which the race question was asked the same way throughout. 
However, we note that many of the problems that we discuss regarding the racial classification findings are also relevant for their racial self-identification models, which use just two years of panel data in which the race question was asked in very different ways 2 .
An especially important problem that is present in all of Saperstein and Penner's fixed-effects models concerns their treatment of missing data. While they state that "some respondents are missing data on their type of residence at the time of the survey; we remove these cases from our analyses" (2010:99), this was not done for the fixed-effects specifications. Relatedly, while Saperstein and Penner (2010) suggest that they selectively focus on their ever-incarcerated measure because a currently incarcerated/non-incarcerated variable produces "similar effects" in their descriptive analyses (p. 106), we find that effects are not similar for the fixed-effects models. They also argue that their ever-incarcerated results demonstrate a "lasting impact" of incarceration that persists somewhat after a person has been released (p. 103), but we find no evidence of even an initial impact of incarceration.
Our reanalysis proceeds in four steps. First, we explain why the overwhelming majority of Saperstein and Penner's (2010) results (those that do not account for person fixed-effects) are unsuitable for supporting the claim that "in statistical modeling terms, the relationship between race and incarceration is not unidirectional; it's recursive" (p. 110). Second, we provide new descriptive analyses focused on the ever-incarcerated population that we argue are better matched to the task of uncovering the proposed micro-level causal impact of incarceration on race. Third, we estimate new fixed-effects models that make use of all available non-missing data to estimate the relationship between incarceration and racial classification. We conclude that while a case can be made for other approaches to testing Saperstein and Penner's thesis (see Kramer, DeFina and Hannon, forthcoming), future research would do well to consider the use of rigorously conducted fixed-effects analyses.

Differentiating Within- and Between-Person Variation to Help Distinguish Causal Direction
While almost all earlier studies concerned with the incarceration-race relationship have modeled incarceration as the dependent variable and race as an independent variable, Saperstein and Penner (2010) aim to uncover how "the effects of race and incarceration may operate in both directions" (p. 110). Because their interest is in documenting how incarceration affects race, not how race affects incarceration, they attempt to remove the part of the association between race and imprisonment that reflects the tendency for certain types of individuals to be targeted by the criminal justice system (what they refer to as "race-related selection bias", 2010:97). With the exception of their fixed-effects analysis, Saperstein and Penner incorporated a control for previous racial classification/identification in order to reveal reverse causality. However, we show that the strategy of controlling for racial designation in the last time period does not effectively isolate the within-person change in race associated with the incarceration experience from the between-person selection effect which influences the likelihood of incarceration.
To illustrate how Saperstein and Penner's approach does not provide reliable causal evidence, we replicate and critically examine some of their strongest descriptive results 3 . Panel A of Table 1 provides our replication of their cross-tabulation of the percentage of person-years in each racial classification by whether or not the person is incarcerated. As in most of their analyses, Saperstein and Penner attempt to separate the causal impact of race on incarceration from the impact of incarceration on race by controlling for racial classification in the previous year. They find that, for the incarcerated, 89.6% of classifications were continuously white, while among the non-incarcerated 95.9% of classifications were white in both time periods. That is, more specifically, the incarcerated population had a higher (+6.3) percentage of classifications that were white in one year and other in the next.
Saperstein and Penner take this as evidence that the experience of incarceration changes how people are racially perceived. We view it only as evidence that (1) interviewers will arbitrarily switch between white and other when forced to fit certain types of respondents into a Black-White-Other coding scheme (Smith 1997) and (2) people who are unambiguously white are less likely to be subjected to incarceration. In line with our interpretation, when we limit the sample to person-years for respondents without a Latino/a, Native American, or Hawaiian Pacific Islander self-identity in 1979 (three groups often seen as racially ambiguous that also have a higher than average incarceration rate), we find that the percentage of continuous classifications as white is identical for the incarcerated and the non-incarcerated (99.0%) 4 .
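The selection argument above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not a reanalysis of the NLSY79: every probability in it (the share of "ambiguous" respondents, their classification noise, and the incarceration rates) is hypothetical, and the function names are ours. Incarceration is given no effect on classification by construction, yet the share of continuously white classifications still differs by incarceration status, and the gap disappears once the ambiguous group is excluded.

```python
import random

def simulate_person_years(n_people=5000, n_years=10, seed=0):
    """Toy panel in which incarceration has NO effect on classification."""
    rng = random.Random(seed)
    rows = []  # (person, year, ambiguous, incarcerated, classified_white)
    for pid in range(n_people):
        # Hypothetical values: 20% of respondents are hard to classify.
        ambiguous = rng.random() < 0.2
        p_white = 0.6 if ambiguous else 1.0   # noise only for the ambiguous group
        p_jail = 0.10 if ambiguous else 0.02  # selection: higher incarceration risk
        for year in range(n_years):
            incarcerated = rng.random() < p_jail
            white = rng.random() < p_white    # drawn independently of incarceration
            rows.append((pid, year, ambiguous, incarcerated, white))
    return rows

def continuous_white_share(rows, unambiguous_only=False):
    """Among person-years classified white in the previous year, return the
    share classified white again, keyed by current incarceration status."""
    counts = {True: [0, 0], False: [0, 0]}  # status -> [white again, total]
    for prev, cur in zip(rows, rows[1:]):
        if prev[0] != cur[0] or not prev[4]:
            continue  # different person, or not white in the previous year
        if unambiguous_only and cur[2]:
            continue
        tally = counts[cur[3]]
        tally[0] += cur[4]
        tally[1] += 1
    return {status: white / total for status, (white, total) in counts.items()}

rows = simulate_person_years()
print(continuous_white_share(rows))                         # gap by incarceration status
print(continuous_white_share(rows, unambiguous_only=True))  # gap vanishes
```

Because the classification draw never looks at incarceration status, any gap in the first printout comes entirely from who is incarcerated, which is the pattern described for Panel A of Table 1.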
Panel B of Table 1 presents classification histories for selected respondents who experienced incarceration at some point during the 17 available survey years 5 . The histories demonstrate that, for those four selected respondents, racial classification as white was more common before incarceration, and racial classification as black was more common during and afterwards. Saperstein and Penner (2010) assert that these cases "exemplify the pattern of results in both our descriptive findings and the multivariate analyses that follow" (p. 106). However, as we have just shown, the descriptive findings in Panel A of Table 1 are driven by between-individual variation in who goes to prison for how long (race-related selection bias), not differences in how a person is classified after the incarceration event. Unlike the results in Panel A, the classification histories presented in Panel B are directly relevant for Saperstein and Penner's theorizing about within-individual variation and the universally transformative power of incarceration. They are also the only descriptive results in the article that are consistent with the underlying logic of the fixed-effects models.

Notes: The sample includes 9,154 person-years with non-missing racial data for respondents who were interviewed in prison sometime between 1979 and 1998 (following Saperstein and Penner's coding, 1987 is omitted). Differences are statistically insignificant at the 0.10 level.
Indeed, outside of the use of fixed-effects regression, one simple way to differentiate the between-person variation in those subjected to incarceration from the within-person variation associated with the causal impact of imprisonment is to focus the analyses on the ever-incarcerated population (as is done in the case histories presented in Panel B of Table 1). The race-related selection effect in terms of who is most likely to ever go to prison is eliminated when everyone in the sample has been in prison 6 . By focusing on the life experiences of the ever-incarcerated population, the key question implicitly moves from (1) how is the racial composition of the incarcerated different from that of the not incarcerated? to (2) how does the incarceration event affect a person's racial classification? The answer to the first question is well known, while the answer to the second question has not yet been settled and is fundamental to Saperstein and Penner's novel theorizing regarding causal direction.

The Racial Classification Histories of the Ever Incarcerated
Saperstein and Penner (2010) indicate that the four case histories they present are not representative, but they do not specify the ways and degree to which those particular observations are unique. Table 2 provides the mean pre- and post-incarceration racial classifications of ever-incarcerated respondents in order to demonstrate how unrepresentative those four cases are (with pre-/post-person-year classification differences ranging from 17% to 58%). As can be seen, the average difference in racial classification is less than 1%. Our descriptive results indicate that any tendency for respondents to be more likely to be classified as black rather than other is matched by an equally minuscule tendency for respondents to be more likely to be classified as white during and after incarceration than before. In sum, a closer look at the NLSY79 data through descriptive analysis provides no support for Saperstein and Penner's (2010) hypothesis that "incarceration affects the probability that an individual will be classified one way or the other" (p. 106).
For the sake of comparison, the above descriptive analysis followed the pre- versus post-incarceration distinction made by Saperstein and Penner. However, it is possible that this division biases the results toward non-significance. It is reasonable to expect that actually going into a prison to interview a respondent would have the most powerful impact on the interviewer's perception of respondent race. While Saperstein and Penner argue that interviewers who never go into a prison can still possibly learn of a respondent's earlier incarceration experience indirectly through questions about previous employment or child custody, it seems sensible to assume that these indirect and uncertain linkages will be weaker than the impact of sitting directly across from an inmate 7 . If so, averaging the cases where the respondent and interviewer are physically in the prison environment together with cases where the interviewer might or might not have indirectly heard about a respondent's earlier incarceration would mute the effect.

Notes: The sample includes 352 ever-incarcerated individuals with non-missing racial and residence data for the survey year immediately before and during the first known year of imprisonment (following Saperstein and Penner's coding, 1987 is omitted). Differences are statistically insignificant at the 0.10 level.
To explore this issue further, Table 3 examines the racial classification of ever-incarcerated respondents for the year immediately before and during the first year of imprisonment. Similar to Saperstein and Penner's use of the lagged racial classification variables, we use the first lag of the ever-incarcerated indicator and limit the sample to only those respondents with non-missing data in two consecutive time periods. The results provided in Table 3 indicate that there is no meaningful difference in the racial classification of respondents immediately before and during their first year of incarceration. Even if one were to disregard conventions regarding statistical significance, the direction of the effects displayed is contrary to Saperstein and Penner's hypothesis; respondents are slightly less likely (-0.6%) to be seen as black and slightly more likely (+0.6%) to be seen as white during imprisonment than immediately before they experienced incarceration.
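The before-versus-during comparison described above could be computed along the following lines. This is a minimal sketch rather than the Stata code used for Table 3: the data layout (one ordered list of survey waves per respondent, with `None` marking missing values), the function name, and the toy histories are all assumptions made for illustration.

```python
def first_spell_shares(histories):
    """Compare racial classification in the wave immediately before, and the
    wave of, each respondent's first known incarceration.

    histories maps a respondent id to a list of (incarcerated, race) tuples in
    survey order. incarcerated is True/False, or None when residence data are
    missing; race is "white", "black", "other", or None when missing.
    """
    categories = ("white", "black", "other")
    before = dict.fromkeys(categories, 0)
    during = dict.fromkeys(categories, 0)
    n = 0
    for waves in histories.values():
        first = next((i for i, (jailed, _) in enumerate(waves) if jailed), None)
        if first is None or first == 0:
            continue  # never incarcerated, or no earlier wave to compare with
        (jailed_prev, race_prev), (_, race_now) = waves[first - 1], waves[first]
        if jailed_prev is None or race_prev is None or race_now is None:
            continue  # require non-missing data in both consecutive waves
        before[race_prev] += 1
        during[race_now] += 1
        n += 1
    if n == 0:
        return {}, {}, 0
    return ({k: v / n for k, v in before.items()},
            {k: v / n for k, v in during.items()}, n)

# Hypothetical toy histories, not NLSY79 records.
toy = {
    1: [(False, "white"), (True, "white"), (False, "white")],
    2: [(False, "other"), (False, "other"), (True, "black")],
    3: [(True, "black"), (True, "black")],                   # no prior wave
    4: [(False, "white"), (False, None), (True, "white")],   # race missing before
    5: [(False, "black"), (False, "black")],                 # never incarcerated
}
before, during, n = first_spell_shares(toy)
print(n, before, during)
```

Restricting the comparison to respondents who are eventually incarcerated keeps the contrast within persons, so differences in who is incarcerated cannot produce a gap between the two distributions.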
The descriptive results in Tables 2 and 3 present something of a puzzle in that they are inconsistent with Saperstein and Penner's reported fixed-effects findings (a set of analyses which, as noted earlier, is directed at isolating the within- from between-individual variation). The results in Table 3 are similarly incongruent with the logical expectation that incarceration's effect on racial classification would be more pronounced closer to the incarceration event than several years after. Below we conduct our own fixed-effects analyses to further investigate these inconsistencies.

The "Lasting Impact" of Incarceration
After presenting a series of descriptive analyses that include both a measure of current incarceration status and an indicator of whether a respondent has ever been incarcerated, Saperstein and Penner switch to an exclusive focus on their ever-incarcerated variable for their multivariate regression analyses. They (2010) reason that the "descriptive results suggest that current and previous incarceration have similar effects on racial classification (Table 4), so we combine these into one measure for whether a respondent has ever been incarcerated in our regression models" (p. 106). Saperstein and Penner (2010) further argue that their ever-incarcerated results demonstrate a "lasting impact" of incarceration that "persists even after the respondents have been released" (pp. 103, 105).

Table 4 provides additional analyses (not presented in Saperstein and Penner's article) that return to the issue of whether current incarceration affects racial classification. More specifically, Table 4 gives linear probability fixed-effects regression estimates (with respondent and year fixed effects) for the impact of current incarceration on classification as white or black. Panel A utilizes the same coding of current incarceration and treatment of missing data employed by Saperstein and Penner (2010:105; see our Table 1). As can be seen, current incarceration has no statistically significant impact on racial classification after accounting for year and respondent fixed effects. The signs of the coefficients are also contrary to Saperstein and Penner's theorizing: a negative coefficient for incarceration's effect on black classification and a positive sign for white classification. Therefore, these results offer no support for the notion that being interviewed in prison or jail significantly changes how a person is racially classified (relative to earlier classifications when the respondent is interviewed at home or elsewhere). 
The results also challenge the interpretation of the previous incarceration estimates as reflecting a lasting or persistent impact, since there is no effect in the present to continue into the future 8 .
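To make the logic of a linear probability model with respondent and year fixed effects concrete, the sketch below estimates one on simulated data. It is an illustration under stated assumptions, not the specification behind Table 4: the panel is balanced (the NLSY79 panel is not), there is a single regressor, the within transformation is done by hand rather than with a statistics package, and all simulated probabilities are hypothetical. A time-invariant trait drives both incarceration risk and classification as black, while incarceration itself has no effect.

```python
import random

def two_way_demean(panel):
    """Subtract person means and year means (adding back the grand mean).
    For a balanced panel this is the two-way within transformation."""
    n, t_len = len(panel), len(panel[0])
    person_means = [sum(row) / t_len for row in panel]
    year_means = [sum(row[t] for row in panel) / n for t in range(t_len)]
    grand_mean = sum(person_means) / n
    return [[panel[i][t] - person_means[i] - year_means[t] + grand_mean
             for t in range(t_len)] for i in range(n)]

def pooled_slope(x_panel, y_panel):
    """OLS slope of y on x (with intercept), ignoring the panel structure."""
    xs = [v for row in x_panel for v in row]
    ys = [v for row in y_panel for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return cov / sum((a - mx) ** 2 for a in xs)

def fixed_effects_slope(x_panel, y_panel):
    """Slope from regressing doubly demeaned y on doubly demeaned x."""
    xd = [v for row in two_way_demean(x_panel) for v in row]
    yd = [v for row in two_way_demean(y_panel) for v in row]
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

def simulate(n_people=3000, n_years=8, seed=1):
    """Hypothetical data: a fixed trait u raises both incarceration risk and
    the chance of being classified as black; incarceration has zero effect."""
    rng = random.Random(seed)
    jail, black = [], []
    for _ in range(n_people):
        u = rng.random()
        jail.append([1.0 if rng.random() < 0.02 + 0.2 * u else 0.0
                     for _ in range(n_years)])
        black.append([1.0 if rng.random() < u else 0.0 for _ in range(n_years)])
    return jail, black

jail, black = simulate()
print("pooled:", round(pooled_slope(jail, black), 3))
print("two-way fixed effects:", round(fixed_effects_slope(jail, black), 3))
```

In this setup the pooled slope is clearly positive even though the true effect is zero, while the fixed-effects slope stays near zero because it uses only within-person changes over time.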
Panel B of Table 4 uses a somewhat different coding of current incarceration status, but ultimately comes to the same conclusion regarding its statistically insignificant impact. Saperstein and Penner's coding of incarceration status (used in Panel A) treats any respondent with a negative value for the residence variable as missing 9 . However, this treatment of missing data ignores the interviewer skip pattern where youth that were already known to be living at home with their legal guardian were assigned a value of -4 in the data 10 . Panel B of Table 4 corrects this omission, and it appears that this coding error has little substantive impact on the results for current incarceration (that is, both sets of results are unsupportive of Saperstein and Penner's theorizing). However, this does not mean that the error is irrelevant for the calculation of the ever-incarcerated variable.
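The difference between the two codings of current incarceration can be sketched as follows. This is a hedged illustration only: the numeric code used for jail/prison residence (`JAIL_CODE`) is a placeholder assumption that should be checked against the NLSY79 codebook, and the function names are ours. The only fact taken from the text is that the first coding treats every negative residence value as missing, while the second treats -4 as a valid skip for youth living with a parent or guardian.

```python
JAIL_CODE = 5    # ASSUMPTION: placeholder for the jail/prison residence code
VALID_SKIP = -4  # skip-pattern value for youth known to live with a guardian

def incarcerated_all_negatives_missing(residence_code):
    """Coding in which any negative residence value is treated as missing."""
    if residence_code < 0:
        return None
    return residence_code == JAIL_CODE

def incarcerated_with_valid_skip(residence_code):
    """Coding in which -4 counts as a valid, non-incarcerated observation."""
    if residence_code == VALID_SKIP:
        return False
    if residence_code < 0:
        return None  # other negative values remain missing
    return residence_code == JAIL_CODE

for code in (JAIL_CODE, 11, -4, -5):
    print(code, incarcerated_all_negatives_missing(code),
          incarcerated_with_valid_skip(code))
```

Under the second coding, person-years with a -4 re-enter the estimation sample as non-incarcerated observations instead of being dropped.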
Use of the ever-incarcerated measure introduces additional complications in coding that were not specified in Saperstein and Penner's article. While it is relatively simple to identify the individuals who were interviewed in prison/jail at some point between 1979 and 1998 and to use this information to create a sample filter, it is much more challenging to create a variable for regression analyses that effectively demarcates the never-incarcerated population. One specific complication that arises concerns how to deal with missing data. For example, if in the entire

sociological science | www.sociologicalscience.com 196 March 2016 | Volume 3

Table 4, p. 105. The sample in Panel B includes 177,785 person-years with non-missing racial and residence data (following the NLSY recommended coding). All models include respondent and year fixed effects. All estimates are statistically insignificant (p>0.10) regardless of the choice of conventional versus robust standard errors and the inclusion or exclusion of sample weights.
However, we believe that there is no need to wrestle with the sample attrition and coding complications associated with an "ever" measure of incarceration. If the logic behind using the ever-incarcerated variable in the regression is to reveal "whether or not incarceration has a lasting impact on racial perceptions" (Saperstein and Penner 2010:105), the first order of business is to establish an initial effect that can then continue into the future. As shown, there is no initial effect. The lack of a contemporaneous effect of incarceration is critical in that one would expect that actually going into a prison to interview a respondent would have the most meaningful impact on the interviewer's perception of respondent race. While Saperstein and Penner (2010:105) contend that interviewers that never hear a respondent even mention prison can still detect an incarceration record via that respondent's "presentation of self," it is reasonable to surmise that such an indirect and untested mechanism would be less reliable than talking directly with someone living behind bars. Additionally, in the absence of any evidence (or specific theory) on the issue, it seems sensible to assume that the power of an event to alter perceptions will be greater closer to the event than it would be further away in time.
Moreover, Saperstein and Penner (2010) specifically ask, "if primed to think of criminals-for example, by interviewing someone in a prison-are people more likely to classify someone as black?" (p. 96). The estimates for current incarceration provide the most direct answer to that specific question: the answer is no.

Conclusions
We reanalyzed the NLSY79 data used in Saperstein and Penner's (2010) popular study which claimed that a prison record significantly increases the chances that a non-racially ambiguous survey respondent will be seen as black. Our reanalysis indicated that this claim is unsupported by the data 13 . While there was indeed individual fluctuation in racial categorization across the survey years, we found that the racial distributions of individuals before they were incarcerated were virtually identical to their racial distributions during and after incarceration. Likewise, using samples with well over 100,000 person-years, our fixed-effects analyses of the within-individual (temporal) relationship between incarceration status and racial classification failed to uncover any evidence in line with Saperstein and Penner's thesis.
We believe that our reanalysis reaffirms the importance of factors such as phenotype and the time and place where one was raised in racial identity and classification in the United States. In our opinion, race is primarily socially constructed at the macro level and for the overwhelming majority of people in the United States is a rigidly ascribed (not achieved) master status. Consistent with this view, the data indicate that, on average, an individual does not affect the likelihood of changing his or her position on the racial hierarchy by avoiding incarceration. Saperstein and Penner (2010) explicitly claim that their "findings are not limited to a specific or select group of individuals; they say something more general about the nature of racial divisions and the content of racial categories in the United States" (p. 109). While our reanalysis reveals that Saperstein and Penner's findings do not apply in an overall sense, it is possible that their hypothesis could be supported for certain subgroups that are phenotypically or categorically ambiguous 14 . In general, it is possible that more robust evidence supporting Saperstein and Penner's theorizing could be found using other data and estimation techniques. Indeed, in another study Saperstein and Penner used the Add Health data to examine the relationship between different types of criminal justice contact and racial classification by interviewers (Saperstein, Penner, and Kizer 2014). Unfortunately, in that paper they did not present any respondent fixed-effects analyses and instead exclusively relied on the approach of simply controlling for classification in the previous time period (along with other measurable factors). As pointed out earlier, this modeling strategy does not remove the race-related selection bias regarding who is targeted for control by the criminal justice system and thus cannot be used to reliably reveal reverse causality 15 .
As Allison (2005) notes, one of the benefits of fixed-effects regression for sociologists is that it facilitates controlling for the unobserved, time-invariant characteristics of individuals that can significantly complicate the interpretation of findings from non-experimental designs. As we have done in the present paper, future sociological research on the causal impact of incarceration on race should rely on results from fixed-effects models, not analyses conflating within- and between-individual variation 16 . Otherwise, new research in this area may end up rediscovering the established finding that certain racial and ethnic populations are more likely to be subjected to imprisonment, relabeling the finding, and over-theorizing the causal mechanism driving the relationship.

Notes
1 Colbert satirically notes that Saperstein and Penner's findings offer a solution to "America's inescapable racism" in that "minorities simply have to behave in ways that change our perception of their race." http://www.cc.com/video-clips/p8fj8f/the-colbertreport-black-history-month---stereotypes---racial-identity
2 In particular, it is important to remember that the word "white" was only on the 2002 survey item. For 1979, white/European racial background had to be inferred from a very limited list of two-dozen ethnic and national categories that excluded options like "Pakistani," "Thai," "Dutch," and "Swedish," but included choices like "American" and "other." Saperstein and Penner (2010:101) did not specify their coding procedure, but in order to reach the 56% self-identified as European listed in their summary statistics we had to code all those who said they were "American" and "other" as European. In our opinion, it is telling that Saperstein and Penner's own fixed effects self-identification models revealed an insignificant impact of incarceration for the black racial category, a category that required the least reinterpretation/recoding since it was at least listed on both surveys.
3 We note that while Saperstein and Penner (2010) describe these results as being about the "percentage of people" (p.105), they are actually about the percentage of person-years. The distinction can be important because non-whites are both more likely to experience incarceration and more likely to experience longer spells of incarceration. Alternatively stated, the percentage of person-years reflects both the tendency for whites to be less likely to be incarcerated and the tendency for whites to receive shorter sentences.
effect where the influence of incarceration will only be seen when certain variables are held constant in the model. In fact, they explicitly argue the opposite; the effect of incarceration on racial classification will be reduced through control variables such as income and education.
9 We follow Saperstein and Penner's coding of missing data in our Tables 1-3 to simplify comparison.
10 Representatives from the NLSY confirmed that the following HH1_1y codes were used for respondents that were residing in a parental household at the time of the survey (and not interviewed in jail/prison): 17, 19, and -4.
11 The result of Saperstein and Penner's coding is to include respondents in the sample if they had valid residence data for any (as opposed to all) of the years. For example, if an individual lived in her/his own dwelling in 1998, but had no valid data on the residence question before that (following Saperstein and Penner's definition of missing data), that case was apparently included in the sample and designated as never incarcerated in each year from 1979 to 1998. Furthermore, in order to exactly match the 177,633 and 160,387 person-year samples Saperstein and Penner used for their regression models, we had to go outside the range of the racial classification data (beyond 1998) and include observations that had a single instance of non-missing residence data up to 2004.
12 The knowledge that cases with a -4 for the residence question were not in fact missing on incarceration status allows one to estimate the fixed-effects models with a very robust sample of over 110,000 person-years (for respondents with continuous residence data across all years). Even with this large sample size and no other time-varying control variables, the effects of ever versus never being incarcerated were statistically insignificant (p>.3) and half of the coefficients had the wrong sign. In sum, when the sample is limited to respondents without missing data regarding "ever" versus "never," the fixed-effects results for the ever-incarcerated indicator become consistent with the results for current incarceration as well as our descriptive findings. Specific results are available on request.
13 The NLSY data and Stata code used to produce all of the results in Tables 1-4 are available here: http://www88.homepage.villanova.edu/lance.hannon/SupplementalMaterial.html. We note that any researcher interested in pursuing this topic further can download additional variables (and link them to our dataset using "CASEID") with the free and publicly available NLS Investigator: https://www.nlsinfo.org/investigator/pages/login.jsp.
14 The NLSY is limited in its potential to clarify this issue. However, we examined one such possibility by focusing on individuals self-identifying as Latino/a. While Latinos/as represented less than 20% of survey respondents, they were involved in the vast majority of classification inconsistencies. Replicating the fixed-effects procedures underlying Table 4 we found no evidence that Latinos/as were more likely to be classified as black and less likely to be classified as white when incarcerated (n=29,717). Specific results are available on request.
15 In comparing the NLSY and Add Health studies, Saperstein, Penner, and Kizer (2014) note that "Perhaps most intriguing about the Add Health findings is that the changes in racial classification we observe, which are clearly patterned by criminal justice contact, do not appear to result from a direct priming of racialized crime stereotypes. In the NLSY study, we examined the effect on racial classification of being interviewed while incarcerated. By contrast, the Add Health respondents entered their answers to sensitive questions, including those referencing delinquent behavior and criminal justice contact, directly into a laptop computer. That means the interviewer would not have known the precise details of the respondents' actions or experiences, and there were no other avenues through which the interviewer might know about a respondent's prior criminal justice contact unless he volunteered it during the interview" (pp. 117-118). Saperstein, Penner, and Kizer (2014) suggest that the findings might reflect interviewers "responding to subtle, nonconscious changes in the respondents' demeanor" (p. 118). They do not mention the possibility that the results could be driven by unmeasured factors producing race-related selection bias (e.g., it is not that arrest makes people more likely to look unambiguously black, but that people with an unambiguously black appearance are more likely to be arrested).
16 We do not mean to imply that fixed-effects regression offers a complete solution. Instead, we mean to suggest that accounting for unmeasured, time-invariant individual characteristics is a needed first step.

Missing Data
Skipped or refused to answer any 1979 racial self-identity questions    3    0.13%

Our conclusion: The NLSY respondents that changed from a non-Black to Black self-identity from 1979 to 2002 constitute a very small group that is not representative of the general population. There were no cases of individuals switching from an Italian, Irish, Scottish, Welsh, German, Polish, Russian, Greek, Chinese, Japanese, Korean, or Vietnamese ethnic origin to Black. Curiously, the largest category of identity shifters involved youth answering exclusively "English" in 1979. It is possible that the intended meaning of the ethnic origin question was not clearly communicated to some young respondents and they simply assumed that the question was referring to first language.

Table A2. Among the 38 youth that exclusively self-identified as "English" in 1979 and then switched to Black in 2002, how did the respondent's legal guardian initially describe the youth's ethnicity in 1978?
Parent or Guardian's View of Respondent Race in 1978              N    Percentage
% Identifying Respondent as "English, Scottish, or Welsh"         0     0.00%
% Identifying Respondent as "Another group, not listed"           2     5.26%
% Identifying Respondent as "Black, Negro, African American"     36    94.74%

Our conclusion: One characteristic of those that changed categories between 1979 and 2002 that makes them distinctly unrepresentative of the general population is that they overwhelmingly started off with inconsistent data (e.g., a mismatch between the youth's answer to the ethnic origin question and how a legal guardian or interviewer described the respondent). In line with the notion that one should expect a certain amount of error in the survey process, the overwhelming majority of youth that were counted as newly Black in 2002 would have been categorized as already Black using alternative initial measures of race/ethnicity.

Notes: ** and *** indicate p<.01 and p<.001 respectively. Standard errors are in parentheses.
Our conclusion: While the effect of incarceration on racial self-identification in 2002 is still significant when controlling for racial self-identification in 1979, the residual incarceration effect is driven by the inclusion of a small number of unrepresentative cases where the respondent's self-identification in 1979 was inconsistent with the interviewer's observation in that same year.

Our conclusion: The strategy of controlling for racial categorization in an earlier time period is unhelpful as a means for uncovering whether incarceration changes a person's race. Finding a robust incarceration estimate does not mean that an incarceration record systematically drives racial change. An equivalent estimate for incarceration can emerge even when racial change is entirely driven by random error.
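The point about random error can be illustrated with a small simulation. The sketch below is not the analysis reported here or in Saperstein and Penner's work; it is a minimal illustration under stated assumptions, and every parameter in it (the share with a stable Black classification, the incarceration rates, the 5% recording-error rate) is hypothetical. Each simulated person has a fixed underlying classification, incarceration is more common in one group (selection) but has no effect on how anyone is classified, and each wave's recorded classification is flipped at random with a small probability. Conditioning on the earlier recorded classification, the incarcerated still appear more likely to be "newly" classified as Black.

```python
import random


def simulate(n=200_000, error_rate=0.05, seed=0):
    """Return the share recorded as Black at wave 2 among people recorded as
    non-Black at wave 1, separately for (incarcerated, never incarcerated).

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # incarcerated -> [people recorded non-Black at wave 1,
    #                  of those, people recorded Black at wave 2]
    counts = {True: [0, 0], False: [0, 0]}
    for _ in range(n):
        # Fixed underlying classification; it never changes in this simulation.
        stable_black = rng.random() < 0.25
        # Selection only: incarceration is more common in one group,
        # but it has no effect on how anyone is classified.
        incarcerated = rng.random() < (0.10 if stable_black else 0.03)
        # Recorded classification = stable value, flipped by random error.
        black_t1 = stable_black != (rng.random() < error_rate)
        black_t2 = stable_black != (rng.random() < error_rate)
        if not black_t1:  # "controlling for" prior classification
            counts[incarcerated][0] += 1
            counts[incarcerated][1] += black_t2
    rate_incarcerated = counts[True][1] / counts[True][0]
    rate_never = counts[False][1] / counts[False][0]
    return rate_incarcerated, rate_never


if __name__ == "__main__":
    inc, never = simulate()
    print(f"recorded Black at wave 2, incarcerated: {inc:.3f}")
    print(f"recorded Black at wave 2, never incarcerated: {never:.3f}")
```

In this setup the gap arises because people whose wave 1 record was a random error are over-represented among the incarcerated, so the earlier measure does not fully absorb the stable difference between groups.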

Figure 6. S&P's Own Results Demonstrate That The Method of Controlling For Prior Black Classification Cannot Reliably Distinguish Status-Driven Fluidity From Measurement Error
Our conclusion: Saperstein and Penner noted that their "definitive response" to our SocSci replication study would appear in AJS. In their now-published response, Saperstein and Penner (2016) conduct simulation analyses aimed at showing that their modeling tactic of controlling for prior racial category is not as problematic as we claimed. Despite implementing a simulation procedure that is strongly biased toward supporting their conclusion, Saperstein and Penner still find (but do not discuss) average random-fluctuation-only coefficients for Black classification that are larger than the effects presented as substantively meaningful and highly statistically significant just a few pages later. Thus, Saperstein and Penner's own simulation analyses (in Table 2) undermine the trustworthiness of the results from their primary modeling strategy (in Table 3).

(1992, 1994, and 1998). The models incorporate respondent and year fixed effects. To account for nonessential ill conditioning in product-term analysis, the crack use and confidential reporting variables are centered. Never used crack is the omitted category in Panel B.
Our conclusion: Saperstein and Penner (2016:280) admit that their results could be driven by race-related selection biases, as respondents did not appear to be randomly assigned to the confidential survey mode and confidentiality may matter more for some populations than others in terms of willingness to report drug use. Saperstein and Penner correctly note that their chosen modeling strategy of simply controlling for previous race cannot account for these spurious influences. However, a fixed-effects approach is well suited for mitigating such concerns. Applying a fixed-effects model to the data reveals that Saperstein and Penner's reported findings were indeed a spurious byproduct of selection biases. Our results in Panel A indicate that whether or not interviewers heard about the respondent's crack use is irrelevant for their racial classification of that respondent. The results in Panel B suggest the possibility of a significant interaction between survey mode and reported frequent crack use, but, as in Panel A, the sign of the product term coefficient (positive) is contrary to Saperstein and Penner's hypothesis. While there is much in Saperstein and Penner's work that should inspire future research, the approach of simply controlling for prior racial category should be abandoned.

Our conclusion: Like the selected racial classification case histories for the ever-incarcerated measure in Saperstein and Penner's (2010) earlier study, the presented cases for the ever-unemployed measure are outliers. The average pre-post difference is essentially zero (see the figures below).
1 Similar to case 1738/1728 in Saperstein and Penner's (2010) incarceration-focused study, case 9282 and case 9969 do not match up to the NLSY data in terms of the year that the respondent experienced long-term unemployment.
Note: Average differences in pre/post classification proportions are statistically insignificant (P>.10).

Our conclusion: Saperstein and Penner's ever-unemployed measure has the same problems that we identified with the ever-incarcerated variable (especially in regard to the treatment of "ever-missing"). Similar to their earlier study (2010), Saperstein and Penner (2012:692) note that, "Supplemental analyses (not reported) estimating classification models with all current effects and identification models with all lasting effects provide similar results, indicating that our substantive conclusions about racial fluidity and inequality are not affected by these coding decisions." However, as shown above, the specific fixed-effects results for current versus ever-unemployed status are not similar and would lead to different substantive conclusions.
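The logic behind relying on fixed-effects estimates in these comparisons can be sketched in a few lines. The example below is an illustration under stated assumptions, not the models estimated in this paper: it uses a simple linear within-person (demeaned) estimator rather than the specifications with respondent and year fixed effects reported in the tables, and all parameter values are hypothetical. A status such as unemployment is more common among people with a stable tendency to be classified as Black, but the status itself never affects classification. The pooled comparison is strongly positive, while the within-person estimate is close to zero.

```python
import random


def simulate_panel(n_people=5000, n_waves=8, seed=1):
    """Build a hypothetical panel of (status, classified_black) pairs per person.

    Status is more common for people with a stable tendency to be classified
    as Black (selection), but it has no effect on classification in any wave.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_people):
        stable_black = rng.random() < 0.30
        p_class = 0.90 if stable_black else 0.05   # time-invariant trait
        p_status = 0.25 if stable_black else 0.08  # selection into status
        waves = [
            (int(rng.random() < p_status), int(rng.random() < p_class))
            for _ in range(n_waves)
        ]
        panel.append(waves)
    return panel


def slope(pairs):
    """Bivariate least-squares slope of y on x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx


def pooled_and_within(panel):
    """Return (pooled slope, within-person slope) for the panel."""
    pooled = slope([obs for person in panel for obs in person])
    demeaned = []
    for person in panel:
        mean_x = sum(x for x, _ in person) / len(person)
        mean_y = sum(y for _, y in person) / len(person)
        # Subtracting person means removes every time-invariant trait.
        demeaned.extend((x - mean_x, y - mean_y) for x, y in person)
    return pooled, slope(demeaned)


if __name__ == "__main__":
    pooled, within = pooled_and_within(simulate_panel())
    print(f"pooled slope: {pooled:.3f}")
    print(f"within-person slope: {within:.3f}")
```

Because demeaning removes only traits that do not change over time, this sketch says nothing about time-varying confounders, which is consistent with treating fixed effects as a first step rather than a complete solution.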