Using Ordinary Least Squares in Higher Education Research: A Primer

Hu, Xiaodan

doi:10.1007/978-3-031-38077-8_13

Part of the book series: Higher Education: Handbook of Theory and Research (HATR, volume 39)

Abstract

This chapter serves as a primer on using ordinary least squares (OLS) in higher education research. It provides an overview of this commonly used quantitative approach, which often includes simple linear regression models and multiple linear regression models. The first section reviews current literature to identify the goals of OLS and differentiate them from basic descriptive analyses and bivariate analyses, and then discusses the types of research questions that OLS can answer. The second section walks readers through an example application of OLS using a real-world dataset, reviewing the definitions, key components, and analytic steps involved. The third section addresses important considerations in testing statistical assumptions before applying OLS and the consequences of violating them. The fourth section discusses the significance of considering heterogeneous effects in contemporary higher education. The chapter closes with topics related to interpreting findings and the broader application of OLS in higher education research contexts.

Nicholas Hillman was the Associate Editor for this chapter.

Notes

1.
The statistical property of unbiasedness describes “an estimator whose expected value of its sampling distribution equals the true value of the population parameter” (Ezell & Land, 2005, p. 943). An estimator is unbiased when, on average across repeated samples, it neither underestimates nor overestimates the unknown population parameter.
2.
The public-use data are available at https://nces.ed.gov/datalab/. One major difference between the public-use and restricted-use data concerns how analytic weights are provided under this complex survey design. For example, variance estimation is provided through both Balanced Repeated Replication (BRR) and Taylor series linearization, but only the BRR variance estimation method is supported for users of public-use data (Duprey et al., 2020).
3.
The dependent variable indicates the total tuition and fees charged at the primary institution during the first academic year in postsecondary education after high school completion or exit. This value accounts for attendance status, reflecting the number of months a student was enrolled full- or part-time in a given academic year. The primary institution is identified from transcript records as the one with the earliest start date, excluding summer enrollments immediately following high school completion or exit.
4.
Note that the hierarchical procedure, sometimes called the block-wise entry procedure, is used to add or remove variables from a regression model in multiple steps. This is different from hierarchical linear modeling, which is used when data have a nested structure (e.g., class sections, departments, colleges).
5.
Note that percent refers to the rate of change, which is different from a change in percentage points. For example, if 30% of undergraduate enrollment is Pell-eligible at baseline and that share increases by 10 percent in a given year, the new Pell-eligible share is 30% + (30% × 10%) = 33%. If it instead increases by 10 percentage points in a given year, the new Pell-eligible share is 30% + 10% = 40%.
6.
IPEDS defines race/ethnicity based on categories developed in 1997 by the Office of Management and Budget (OMB). These categories “describe groups to which individuals belong, identify with, or belong in the eyes of the community” (NCES, n.d.). In particular, individuals indicating their ethnicity as Hispanic/Latino are defined as “a person of Cuban, Mexican, Puerto Rican, South or Central American, or other Spanish culture or origin, regardless of race” (NCES, n.d.).
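
The arithmetic in Note 5 can be sketched in a few lines of Python. This is only an illustration: the function names `increase_by_percent` and `increase_by_percentage_points` are hypothetical and do not come from the chapter, and the 30% baseline and 10-unit change are the figures used in the note.

```python
def increase_by_percent(baseline_pct: float, percent: float) -> float:
    """Relative change: grow the baseline rate by `percent` percent of itself."""
    return baseline_pct * (1 + percent / 100)


def increase_by_percentage_points(baseline_pct: float, points: float) -> float:
    """Absolute change: add `points` percentage points to the baseline rate."""
    return baseline_pct + points


baseline = 30.0  # Pell-eligible share of undergraduates, in percent (from Note 5)
relative = increase_by_percent(baseline, 10)            # 30% + (30% x 10%)
absolute = increase_by_percentage_points(baseline, 10)  # 30% + 10 points

print(f"10 percent increase: {relative:.1f}%")
print(f"10 percentage point increase: {absolute:.1f}%")
```

Keeping the two operations in separate functions makes it harder to report a coefficient on a percentage-scaled variable as a "percent" change when a percentage point change is meant.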
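
The idea of unbiasedness in Note 1 can be illustrated with a small simulation. The sketch below is not from the chapter: the data-generating process (intercept 1, slope 2, normally distributed noise), the sample size, the number of replications, and the helper names `ols_slope` and `simulate_slopes` are all assumptions chosen for illustration. It draws many samples, estimates the OLS slope in each, and compares the average estimate with the true slope.

```python
import random
import statistics


def ols_slope(x, y):
    """Closed-form OLS slope for a simple linear regression of y on x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(true_slope=2.0, n=50, reps=2000, seed=0):
    """Estimate the slope in `reps` samples drawn from y = 1 + true_slope * x + noise."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    slopes = []
    for _ in range(reps):
        x = [rng.uniform(0, 10) for _ in range(n)]
        y = [1.0 + true_slope * xi + rng.gauss(0, 3) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


slopes = simulate_slopes()
mean_slope = statistics.fmean(slopes)
print(f"mean of estimated slopes: {mean_slope:.3f} (true slope: 2.0)")
```

Individual estimates vary from sample to sample, but under these assumptions their average should sit close to the true slope, which is what the definition in Note 1 describes.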

References

Abdallah, W., Goergen, M., & O’Sullivan, N. (2015). Endogeneity: How failure to correct for it can cause wrong inferences and some remedies. British Journal of Management, 26(4), 791–804. https://doi.org/10.1111/1467-8551.12113
Aiken, L. S., West, S. G., & Reno, R. R. (1991). Multiple regression: Testing and interpreting interactions. Sage.
Allison, P. D. (2001). Missing data. Sage.
Allison, P. D. (2009). Fixed effects regression models. Sage.
An, B. P. (2013a). The influence of dual enrollment on academic performance and college readiness: Differences by socioeconomic status. Research in Higher Education, 54(4), 407–432. https://doi.org/10.1007/s11162-012-9278-z
An, B. P. (2013b). The impact of dual enrollment on college degree attainment: Do low-SES students benefit? Educational Evaluation and Policy Analysis, 35(1), 57–75. https://doi.org/10.3102/0162373712461933
An, B. P. (2015). The role of academic motivation and engagement on the relationship between dual enrollment and academic performance. The Journal of Higher Education, 86(1), 98–126. https://doi.org/10.1353/jhe.2015.0005
An, B. P., & Taylor, J. L. (2019). A review of empirical studies on dual enrollment: Assessing educational outcomes. In M. B. Paulsen & L. W. Perna (Eds.), Higher education: Handbook of theory and research (Vol. 34, pp. 99–151). Springer. https://doi.org/10.1007/978-3-030-03457-3_3
Archibald, R. B., & Feldman, D. (2006). State higher education spending and the tax revolt. The Journal of Higher Education, 77(4), 618–644. https://doi.org/10.1353/jhe.2006.0029
Attewell, P., Lavin, D., Domina, T., & Levey, T. (2006). New evidence on college remediation. The Journal of Higher Education, 77(5), 886–924.
Attewell, P., Monaghan, D., & Kwong, D. (2015). Data mining for the social sciences: An introduction. University of California Press.
Avery, C., Howell, J. S., & Page, L. (2014). A review of the role of college applications on students’ post-secondary outcomes. College Board. https://eric.ed.gov/?id=ED556466
Babyak, M. A. (2004). What you see may not be what you get: A brief, nontechnical introduction to overfitting in regression-type models. Psychosomatic Medicine, 66(3), 411–421.
Bahr, P. R. (2019). The labor market returns to a community college education for noncompleting students. The Journal of Higher Education, 90(2), 210–243. https://doi.org/10.1080/00221546.2018.1486656
Bailey, M., & Dynarski, S. (2011). Inequality in post-secondary education. In G. J. Duncan & R. J. Murnane (Eds.), Whither opportunity? Rising inequality, schools and children’s life chances (pp. 117–132). Sage.
Bailey, T., Calcagno, J. C., Jenkins, D., Leinbach, T., & Kienzl, G. (2006). Is student-right-to-know all you should know? An analysis of community college graduation rates. Research in Higher Education, 47, 491–519. https://doi.org/10.1007/s11162-005-9005-0
Baker, D. J., & Doyle, W. R. (2017). Impact of community college student debt levels on credit accumulation. The Annals of the American Academy of Political and Social Science, 671(1), 132–153. https://doi.org/10.1177/0002716217703043
Baker, T. L., & Vélez, W. (1996). Access to and opportunity in postsecondary education in the United States: A review. Sociology of Education, 69, 82–101.
Balfanz, R., DePaoli, J. L., Ingram, E. S., Bridgeland, J. M., & Fox, J. H. (2016). Closing the college gap: A roadmap to post-secondary readiness and attainment. Civic Enterprises. https://eric.ed.gov/?id=ED572785
Barnett, E. (2018). Differentiated dual enrollment and other collegiate experiences: Lessons from the STEM early college expansion partnership. Community College Research Center, Teachers College, Columbia University. http://www.jff.org/publications/differentiated-dual-enrollment-and-other-collegiate-experiences
Berger, A., Turk-Bicakci, L., Garet, M., Song, M., Knudson, J., Haxton, C., Zeiser, K., Hoshen, G., Ford, J., Stephan, J., Keating, K., & Cassidy, L. (2013). Early college, early success: Early college high school initiative impact study. American Institutes for Research. https://files.eric.ed.gov/fulltext/ED577243.pdf
Bielby, R. M., House, E., Flaster, A., & DesJardins, S. L. (2013). Instrumental variables: Conceptual issues and an application considering high school course taking. In M. B. Paulsen (Ed.), Higher education: Handbook of theory and research (Vol. 28, pp. 263–321). Springer. https://doi.org/10.1007/978-94-007-5836-0_6
Biswas, A., Das, S., & Das, S. (2019). OLS: Is that so useless for regression with categorical data? In A. K. Laha (Ed.), Advances in analytics and applications (pp. 227–242). Springer. https://doi.org/10.1007/978-981-13-1208-3_18
Boatman, A., Evans, B. J., & Soliz, A. (2017). Understanding loan aversion in education: Evidence from high school seniors, community college students, and adults. AERA Open, 3(1), 1–16. https://doi.org/10.1177/2332858416683649
Bohrnstedt, G. W., & Carter, T. M. (1971). Robustness in regression analysis. Sociological Methodology, 3, 118–146.
Burnham, K. P., Anderson, D. R., & Huyvaert, K. P. (2011). AIC model selection and multimodel inference in behavioral ecology: Some background, observations, and comparisons. Behavioral Ecology and Sociobiology, 65, 23–35. https://doi.org/10.1007/s00265-010-1029-6
Cabrera, A. F., & La Nasa, S. M. (2000). Understanding the college-choice process. New Directions for Institutional Research, 2000(107), 5–22.
Card, D., & Payne, A. A. (2021). High school choices and the gender gap in STEM. Economic Inquiry, 59(1), 9–28. https://doi.org/10.1111/ecin.12934
Castleman, B. L., & Page, L. C. (2017). Parental influences on postsecondary decision making: Evidence from a text messaging experiment. Educational Evaluation and Policy Analysis, 39(2), 361–377. https://doi.org/10.3102/0162373716687393
Cellini, S. R. (2009). Crowded colleges and college crowd-out: The impact of public subsidies on the two-year college market. American Economic Journal: Economic Policy, 1(2), 1–30. https://doi.org/10.1257/pol.1.2.1
Cellini, S. R., & Chaudhary, L. (2014). The labor market returns to a for-profit college education. Economics of Education Review, 43, 125–140. https://doi.org/10.1016/j.econedurev.2014.10.001
Cellini, S. R., & Goldin, C. (2014). Does federal student aid raise tuition? New evidence on for-profit colleges. American Economic Journal: Economic Policy, 6(4), 174–206. https://doi.org/10.1257/pol.6.4.174
Cheslock, J. J., & Gianneschi, M. (2008). Replacing state appropriations with alternative revenue sources: The case of voluntary support. The Journal of Higher Education, 79(2), 208–229. https://doi.org/10.1353/jhe.2008.0012
Cheslock, J. J., & Rios-Aguilar, C. (2011). Multilevel analysis in higher education research: A multidisciplinary approach. In J. C. Smart & M. B. Paulsen (Eds.), Higher education: Handbook of theory and research (Vol. 26, pp. 85–123). Springer. https://doi.org/10.1007/978-94-007-0702-3_3
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997–1003. https://doi.org/10.1037/0003-066X.49.12.997
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Erlbaum.
Conway, K. M. (2009). Exploring persistence of immigrant and native students in an urban community college. The Review of Higher Education, 32(3), 321–352. https://doi.org/10.1353/rhe.0.0059
Coughlin, C., & Castilla, C. (2014). The effect of private high school education on the college trajectory. Economics Letters, 125(2), 200–203. https://doi.org/10.1016/j.econlet.2014.09.002
Cowan, J., & Goldhaber, D. (2015). How much of a “running start” do dual enrollment programs provide students? The Review of Higher Education, 38(3), 425–460. https://doi.org/10.1353/rhe.2015.0018
Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25(1), 7–29.
D’Amico, M. M., Dika, S. L., Elling, T. W., Algozzine, B., & Ginn, D. J. (2014). Early integration and other outcomes for community college transfer students. Research in Higher Education, 55, 370–399. https://doi.org/10.1007/s11162-013-9316-5
Dannenberg, M., & Hyslop, A. (2019). Building a fast track to college: An executive summary. Alliance for Excellent Education. http://edreformnow.org/wp-content/uploads/2019/02/ERN-AEE-Fast-Track-FINAL.pdf
DeAngelo, L., & Franke, R. (2016). Social mobility and reproduction for whom? College readiness and first-year retention. American Educational Research Journal, 53(6), 1588–1625. https://doi.org/10.3102/0002831216674805
DesJardins, S. L., & Toutkoushian, R. K. (2005). Are students really rational? The development of rational thought and its application to student choice. In J. C. Smart (Ed.), Higher education: Handbook of theory and research (pp. 191–240). Springer.
Dowd, A. C., Cheslock, J. J., & Melguizo, T. (2008). Transfer access from community colleges and the distribution of elite higher education. The Journal of Higher Education, 79(4), 442–472. https://doi.org/10.1353/jhe.0.0010
Doyle, W. R. (2012). The politics of public college tuition and state financial aid. The Journal of Higher Education, 83(5), 617–647.
Doyle, W. R., Dziesinski, A. B., & Delaney, J. A. (2021). Modeling volatility in public funding for higher education. Journal of Education Finance, 46(4), 563–591.
Duprey, M. A., Pratt, D. J., Wilson, D. H., Jewell, D. M., Brown, D. S., Caves, L. R., Kinney, S. K., Mattox, T. L., Ritchie, N. S., Rogers, J. E., Spagnardi, C. M., Wescott, J. D., & Christopher, E. M. (2020). High school longitudinal study of 2009 (HSLS:09) postsecondary education transcript study and student financial aid records collection: Data file documentation. U.S. Department of Education. https://nces.ed.gov/pubs2020/2020004.pdf
Evans, B. J. (2019). How college students use advanced placement credit. American Educational Research Journal, 56(3), 925–954. https://doi.org/10.3102/0002831218807428
Evans, B. J. (2021). Understanding the complexities of experimental analysis in the context of higher education. In L. W. Perna (Ed.), Higher education: Handbook of theory and research (Vol. 36, pp. 611–661). Springer. https://doi.org/10.1007/978-3-030-44007-7_12
Evans, B. J., Boatman, A., & Soliz, A. (2019). Framing and labeling effects in preferences for borrowing for college: An experimental analysis. Research in Higher Education, 60(4), 438–457. https://doi.org/10.1007/s11162-018-9518-y
Ezell, M. E., & Land, K. C. (2005). Ordinary least squares (OLS). In K. Kempf-Leonard (Ed.), Encyclopedia of social measurement (pp. 943–950). Elsevier. https://doi.org/10.1016/B0-12-369398-5/00171-7
Faircloth, S. C., Alcantar, C. M., & Stage, F. K. (2015). Use of large-scale data sets to study educational pathways of American Indian and Alaska Native students. New Directions for Institutional Research, 2014(163), 5–24. https://doi.org/10.1002/ir.20083
Fanelli, D. (2012). Negative results are disappearing from most disciplines and countries. Scientometrics, 90(3), 891–904. https://doi.org/10.1007/s11192-011-0494-7
Ferguson, C. J., & Heene, M. (2012). A vast graveyard of undead theories: Publication bias and psychological science’s aversion to the null. Perspectives on Psychological Science, 7(6), 555–561. https://doi.org/10.1177/1745691612459059
Fernandez, F., Fu, Y.-C., Hu, X., & Moradel, J. J. (2023). Examining the influence of Texas’ strategic plan for increasing university research: Loose coupling and research production at regional public universities. The Journal of Higher Education, advance online publication. https://doi.org/10.1080/00221546.2023.2192161
Fowles, J. (2014). Funding and focus: Resource dependence in public higher education. Research in Higher Education, 55, 272–287. https://doi.org/10.1007/s11162-013-9311-x
Frederick, A. B., Schmidt, S. J., & Davis, L. S. (2012). Federal policies, state responses, and community college outcomes: Testing an augmented Bennett hypothesis. Economics of Education Review, 31(6), 908–917. https://doi.org/10.1016/j.econedurev.2012.05.009
Fryer, R. G., & Greenstone, M. (2010). The changing consequences of attending historically black colleges and universities. American Economic Journal: Applied Economics, 2(1), 116–148.
Furquim, F., Corral, D., & Hillman, N. (2020). A primer for interpreting and designing difference-in-differences studies in higher education research. In L. W. Perna (Ed.), Higher education: Handbook of theory and research (Vol. 35, pp. 1–58). Springer. https://doi.org/10.1007/978-3-030-31365-4_5
Gansemer-Topf, A. M., & Schuh, J. H. (2006). Institutional selectivity and institutional expenditures: Examining organizational factors that contribute to retention and graduation. Research in Higher Education, 47(6), 613–642. https://doi.org/10.1007/s11162-006-9009-4
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.
Giani, M. (2019). The correlates of credit loss: How demographics, pre-transfer academics, and institutions relate to the loss of credits for vertical transfer students. Research in Higher Education, 60(8), 1113–1141. https://doi.org/10.1007/s11162-019-09548-w
Giani, M., Alexander, C., & Reyes, P. (2014). Exploring variation in the impact of dual-credit coursework on postsecondary outcomes: A quasi-experimental analysis of Texas students. The High School Journal, 97(4), 200–218.
Goldrick-Rab, S., Kelchen, R., Harris, D. N., & Benson, J. (2016). Reducing income inequality in educational attainment: Experimental evidence on the impact of financial aid on college completion. American Journal of Sociology, 121(6), 1762–1817. https://doi.org/10.1086/685442
Gottfried, M. A., & Plasman, J. S. (2018). Linking the timing of career and technical education coursetaking with high school dropout and college-going behavior. American Educational Research Journal, 55(2), 325–361. https://doi.org/10.3102/0002831217734805
Grubb, J. M., Scott, P. H., & Good, D. W. (2017). The answer is yes: Dual enrollment benefits students at the community college. Community College Review, 45(2), 79–98. https://doi.org/10.1177/0091552116682590
Guo, S., & Fraser, M. (2015). Propensity score analysis (2nd ed.). Sage.
Gurantz, O. (2015). Who loses out? Registration order, course availability, and student behaviors in community college. The Journal of Higher Education, 86(4), 524–563. https://doi.org/10.1353/jhe.2015.0021
Harper, S. R., Carini, R. M., Bridges, B. K., & Hayek, J. C. (2004). Gender differences in student engagement among African American undergraduates at historically black colleges and universities. Journal of College Student Development, 45(3), 271–284. https://doi.org/10.1353/csd.2004.0035
Harrell, F. E. (2001). Regression modeling strategies: With applications to linear models, logistic regression, and survival analysis. Springer.
Hearn, J. C. (1984). The relative roles of academic, ascribed, and socioeconomic characteristics in college destinations. Sociology of Education, 57(1), 22–30.
Hearn, J. C. (1988). Attendance at higher-cost colleges: Ascribed, socioeconomic, and academic influences on student enrollment patterns. Economics of Education Review, 7(1), 65–76.
Hearn, J. C. (1991). Academic and nonacademic influences on the college destinations of 1980 high school graduates. Sociology of Education, 64(3), 158–171.
Hemelt, S. W., & Marcotte, D. E. (2011). The impact of tuition increases on enrollment at public colleges and universities. Educational Evaluation and Policy Analysis, 33(4), 435–457. https://doi.org/10.3102/0162373711415261
Hemelt, S. W., & Swiderski, T. (2022). College comes to high school: Participation and performance in Tennessee’s innovative wave of dual-credit courses. Educational Evaluation and Policy Analysis, 44(2), 313–341. https://doi.org/10.3102/01623737211052310
Hillman, N. W. (2012). Tuition discounting for revenue management. Research in Higher Education, 53(3), 263–281. https://doi.org/10.1007/s11162-011-9233-4
Hillman, N., & Weichman, T. (2016). Education deserts: The continued significance of “place” in the twenty-first century. American Council on Education Center for Policy Research and Strategy.
Hoffmann, J. P. (2004). Generalized linear models: An applied approach. Pearson.
Howell, J. S., & Pender, M. (2016). The costs and benefits of enrolling in an academically matched college. Economics of Education Review, 51, 152–168. https://doi.org/10.1016/j.econedurev.2015.06.008
Hu, S., & Hossler, D. (2000). Willingness to pay and preference for private institutions. Research in Higher Education, 41(6), 685–701.
Hu, X., & Ortagus, J. C. (2023). National evidence of the relationship between dual enrollment and student loan debt. Educational Policy, 37(5), 1241–1276. https://doi.org/10.1177/08959048221087204
Hu, X., Fernandez, F., & Gándara, D. (2021). Are donations bigger in Texas? Analyzing the impact of a policy to match donations to Texas’ emerging research universities. American Educational Research Journal, 58(4), 850–882. https://doi.org/10.3102/0002831220968947
Huitema, B. E., McKean, J. W., & McKnight, S. (1999). Autocorrelation effects on least-squares intervention analysis of short time series. Educational and Psychological Measurement, 59(5), 767–786.
Hunt, J. M., Tandberg, D. A., & Park, T. J. (2019). Presidential compensation and institutional revenues: Testing the return on investment for public university presidents. The Review of Higher Education, 42(2), 619–640. https://doi.org/10.1353/rhe.2019.0009
Ishitani, T. T., & McKitrick, S. A. (2016). Are student loan default rates linked to institutional capacity? Journal of Student Financial Aid, 46(1), 17–37.
Jaccard, J., & Turrisi, R. (2003). Interaction effects in multiple regression. Sage.
Jacoby, D. (2006). Effects of part-time faculty employment on community college graduation rates. The Journal of Higher Education, 77(6), 1081–1103. https://doi.org/10.1353/jhe.2006.0050
Jaquette, O., Curs, B. R., & Posselt, J. R. (2016). Tuition rich, mission poor: Nonresident enrollment growth and the socioeconomic and racial composition of public research universities. The Journal of Higher Education, 87(5), 635–673. https://doi.org/10.1353/jhe.2016.0025
Kane, T. J., & Rouse, C. E. (1995). Labor-market returns to two- and four-year college. The American Economic Review, 85(3), 600–614.
Kanny, M. A. (2015). Dual enrollment participation from the student perspective. New Directions for Community Colleges, 2015, 59–70. https://doi.org/10.1002/cc.20133
Kilgo, C. A., Ezell Sheets, J. K., & Pascarella, E. T. (2015). The link between high-impact practices and student learning: Some longitudinal evidence. Higher Education, 69, 509–525. https://doi.org/10.1007/s10734-014-9788-z
Killgore, L. (2009). Merit and competition in selective college admissions. The Review of Higher Education, 32(4), 469–488.
Kim, J., & Shim, W. (2019). What do rankings measure? The U.S. News rankings and student experience at liberal arts colleges. The Review of Higher Education, 42(3), 933–964. https://doi.org/10.1353/rhe.2019.0025
Kim, J., DesJardins, S. L., & McCall, B. P. (2009). Exploring the effects of student expectations about financial aid on postsecondary choice: A focus on income and racial/ethnic differences. Research in Higher Education, 50, 741–774. https://doi.org/10.1007/s11162-009-9143-x
Kim, J., Kim, J., DesJardins, S. L., & McCall, B. P. (2015). Completing algebra II in high school: Does it increase college access and success? The Journal of Higher Education, 86(4), 628–662. https://doi.org/10.1353/jhe.2015.0018
Klasik, D., & Zahran, W. (2022). The art of sophisticated quantitative description in higher education research. In L. W. Perna (Ed.), Higher education: Handbook of theory and research (Vol. 37). Springer. https://doi.org/10.1007/978-3-030-76660-3_12
Labaree, D. F. (1997). Public goods, private goods: The American struggle over educational goals. American Educational Research Journal, 34(1), 39–81. https://doi.org/10.3102/00028312034001039
Leigh, D. E., & Gill, A. M. (1997). Labor market returns to community colleges: Evidence for returning adults. Journal of Human Resources, 32(2), 334–353.
Lever, J., Krzywinski, M., & Altman, N. (2016). Model selection and overfitting. Nature Methods, 13, 703–704. https://doi.org/10.1038/nmeth.3968
Li, A. Y., & Gándara, D. (2020). The promise of “free” tuition and program design features: Impacts on first-time college enrollment. In L. W. Perna & E. J. Smith (Eds.), Improving research-based knowledge of college promise programs (pp. 219–240). American Educational Research Association.
Li, A. Y., & Kelchen, R. (2021). Institutional and state-level factors related to paying back student loan debt among public, private, and for-profit colleges. Journal of Student Financial Aid, 50(2), 1–19. https://doi.org/10.55504/0884-9153.1686
Lin, C.-H., Borden, V. M. H., & Chen, J.-H. (2020). A study on effects of financial aid on student persistence in dual enrollment and advanced placement participation. Journal of College Student Retention: Research, Theory & Practice, 22(3), 378–401. https://doi.org/10.1177/1521025117753732
Linsenmeier, D. M., Rosen, H. S., & Rouse, C. E. (2006). Financial aid packages and college enrollment decisions: An econometric case study. Review of Economics and Statistics, 88(1), 126–145.
Liu, X., & Borden, V. (2019). Addressing self-selection and endogeneity in higher education research. In J. Huisman & M. Tight (Eds.), Theory and method in higher education research (Vol. 5, pp. 129–151). Emerald. https://doi.org/10.1108/S2056-375220190000005009
Long, J. S., & Freese, J. (2014). Regression models for categorical dependent variables using Stata. Stata Press.
López, N., Erwin, C., Binder, M., & Chavez, M. J. (2018). Making the invisible visible: Advancing quantitative methods in higher education using critical race theory and intersectionality. Race Ethnicity and Education, 21(2), 180–207. https://doi.org/10.1080/13613324.2017.1375185
MacKinnon, J. G. (2013). Thirty years of heteroskedasticity-robust inference. In X. Chen & N. Swanson (Eds.), Recent advances and future directions in causality, prediction, and specification analysis (pp. 437–461). Springer. https://doi.org/10.1007/978-1-4614-1653-1_17
Malcom-Piqueux, L. (2015). Application of person-centered approaches to critical quantitative research: Exploring inequities in college financing strategies. New Directions for Institutional Research, 2014(163), 59–73. https://doi.org/10.1002/ir.20086
Marken, S., Gray, L., & Lewis, L. (2013). Dual enrollment programs and courses for high school students at postsecondary institutions: 2010–11 (NCES 2013–002). U.S. Department of Education. National Center for Education Statistics. https://nces.ed.gov/pubs2013/2013002.pdf
McCall, B. P., & Bielby, R. M. (2012). Regression discontinuity design: Recent developments and a guide to practice for researchers in higher education. In J. Smart & M. Paulsen (Eds.), Higher education: Handbook of theory and research (Vol. 27, pp. 249–290). Springer. https://doi.org/10.1007/978-94-007-2950-6_5
McCambly, H., Aguilar-Smith, S., Felix, E., Hu, X., & Baber, L. (2023). Community colleges as racialized organizations: Outlining opportunities for equity. Community College Review, advance online publication. https://doi.org/10.1177/00915521231182121
McClure, K. R., & Titus, M. (2018). Spending up the ranks? The relationship between striving for prestige and administrative expenditure at U.S. public research universities. The Journal of Higher Education, 89(6), 961–987. https://doi.org/10.1080/00221546.2018.1449079
McLendon, M. K., Hearn, J. C., & Mokher, C. G. (2009). Partisans, professionals, and power: The role of political factors in state higher education funding. The Journal of Higher Education, 80(6), 686–713.
McNeish, D. M. (2014). Analyzing clustered data with OLS regression: The effect of a hierarchical data structure. Multiple Linear Regression Viewpoints, 40(1), 11–16.
Mehl, G., Wyner, J., Barnett, E., Fink, J., & Jenkins, D. (2020). The dual enrollment playbook: A guide to equitable acceleration for students. Aspen Institute. https://ccrc.tc.columbia.edu/publications/dual-enrollment-playbook-equitable-acceleration.html
Miles, J. (2005). Tolerance and variance inflation factor. In B. S. Everitt & D. C. Howell (Eds.), Encyclopedia of statistics in behavioral science. https://doi.org/10.1002/0470013192.bsa683
Minaya, V. (2021). Can dual enrollment algebra reduce racial/ethnic gaps in early STEM outcomes? Evidence from Florida. Community College Research Center. https://ccrc.tc.columbia.edu/publications/dual-enrollment-algebra-stem-outcomes.html
Minicozzi, A. (2005). The short term effect of educational debt on job decisions. Economics of Education Review, 24(4), 417–430. https://doi.org/10.1016/j.econedurev.2004.05.008
Monks, J. (2014). The role of institutional and state aid policies in average student debt. The Annals of the American Academy of Political and Social Science, 655(1), 123–142. https://doi.org/10.1177/0002716214539093
Moretti, E. (2004). Estimating the social return to higher education: Evidence from longitudinal and repeated cross-sectional data. Journal of Econometrics, 121(1–2), 175–212. https://doi.org/10.1016/j.jeconom.2003.10.015
Museus, S. D. (2023). An evolving QuantCrit: The quantitative research complex and a theory of racialized quantitative systems. In L. W. Perna (Ed.), Higher education: Handbook of theory and research (Vol. 38, pp. 631–664). Springer. https://doi.org/10.1007/978-3-031-06696-2_5
Museus, S. D., Lutovsky, B. R., & Colbeck, C. L. (2007). Access and equity in dual enrollment programs: Implications for policy formation. Higher Education in Review, 4, 1–19.
National Center for Education Statistics. (n.d.). IPEDS 2022–23 data collection system. View Glossary. https://surveys.nces.ed.gov/ipeds/public/glossary
Olitsky, N. H. (2014). How do academic achievement and gender affect the earnings of STEM majors? A propensity score matching approach. Research in Higher Education, 55(3), 245–271. https://doi.org/10.1007/s11162-013-9310-y
Page, L. C., & Scott-Clayton, J. (2016). Improving college access in the United States: Barriers and policy responses. Economics of Education Review, 51, 4–22. https://doi.org/10.1016/j.econedurev.2016.02.009
Park, J. J., & Kim, S. (2020). Harvard’s personal rating: The impact of private high school attendance. Asian American Policy Review, 30, 79–80.
Park, T. J., Flores, S. M., & Ryan, C. J. (2018). Labor market returns for graduates of Hispanic-serving institutions. Research in Higher Education, 59(1), 29–53. https://doi.org/10.1007/s11162-017-9457-z
Perna, L. W. (2003). The private benefits of higher education: An examination of the earnings premium. Research in Higher Education, 44(4), 461–472.
Pigott, T. D., & Polanin, J. R. (2020). Methodological guidance paper: High-quality meta-analysis in a systematic review. Review of Educational Research, 90(1), 24–46. https://doi.org/10.3102/0034654319877153
Pike, G. R. (1991). Using structural equation models with latent variables to study student growth and development. Research in Higher Education, 32(5), 499–524. https://doi.org/10.1007/BF00992625
Pompelia, S. (2020). Dual enrollment access. Education Commission of the States. https://files.eric.ed.gov/fulltext/ED602439.pdf
Porter, S. R. (2015). Quantile regression: Analyzing changes in distributions instead of means. In M. B. Paulsen (Ed.), Higher education: Handbook of theory and research (Vol. 30, pp. 335–382). Springer. https://doi.org/10.1007/978-3-319-12835-1_8
Pretlow, J., & Wathington, H. D. (2014). Expanding dual enrollment: Increasing postsecondary access for all? Community College Review, 42(1), 41–54. https://doi.org/10.1177/0091552113509664
Rask, K. (2010). Attrition in STEM fields at a liberal arts college: The importance of grades and pre-collegiate preferences. Economics of Education Review, 29(6), 892–900. https://doi.org/10.1016/j.econedurev.2010.06.013
Reynolds, C. L., & DesJardins, S. L. (2009). The use of matching methods in higher education research: Answering whether attendance at a 2-year institution results in differences in educational attainment. In J. C. Smart (Ed.), Higher education: Handbook of theory and research (Vol. 24, pp. 47–97). Springer. https://doi.org/10.1007/978-1-4020-9628-0_2
Ridgeway, G., Kovalchik, S. A., Griffin, B. A., & Kabeto, M. U. (2015). Propensity score analysis with survey weighted data. Journal of Causal Inference, 3(2), 237–249. https://doi.org/10.1515/jci-2014-0039
Ro, H. K., Fernandez, F., & Kim, S. (2023). Understanding political efficacy among Asian American undergraduates at research universities. Journal of Student Affairs Research and Practice, 60(2), 150–163. https://doi.org/10.1080/19496591.2021.1994409
Rodriguez, A., Furquim, F., & DesJardins, S. L. (2018). Categorical and limited dependent variable modeling in higher education. In M. B. Paulsen (Ed.), Higher education: Handbook of theory and research (Vol. 33, pp. 295–370). Springer. https://doi.org/10.1007/978-3-319-72490-4_7
Roksa, J., & Velez, M. (2012). A late start: Delayed entry, life course transitions and bachelor’s degree completion. Social Forces, 90(3), 769–794. https://doi.org/10.1093/sf/sor018
Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86(3), 638–641. https://doi.org/10.1037/0033-2909.86.3.638
Royall, R. M. (1986). The effect of sample size on the meaning of significance tests. The American Statistician, 40(4), 313–315.
Salkind, N. J., & Frey, B. B. (2021). Statistics for people who (think they) hate statistics. Sage.
Santiago, D. A. (2007). Choosing Hispanic-serving institutions (HSIs): A closer look at Latino students’ college choices. Excelencia in Education. https://eric.ed.gov/?id=ED506053
Schudde, L. (2018). Heterogeneous effects in education: The promise and challenge of incorporating intersectionality into quantitative methodological approaches. Review of Research in Education, 42(1), 72–92. https://doi.org/10.3102/0091732X18759040
Scott-Clayton, J., & Minaya, V. (2016). Should student employment be subsidized? Conditional counterfactuals and the outcomes of work-study participation. Economics of Education Review, 52, 1–18. https://doi.org/10.1016/j.econedurev.2015.06.006
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632
Sjoquist, D. L., & Winters, J. V. (2015). State merit-based financial aid programs and college attainment. Journal of Regional Science, 55(3), 364–390. https://doi.org/10.1111/jors.12161
Slavin, R., & Smith, D. (2009). The relationship between sample sizes and effect sizes in systematic reviews in education. Educational Evaluation and Policy Analysis, 31(4), 500–506. https://doi.org/10.3102/0162373709352369
Smith, K., Jagesic, S., Wyatt, J., & Ewing, M. (2018). AP STEM participation and postsecondary STEM outcomes: Focus on underrepresented minority, first-generation, and female students. College Board. https://eric.ed.gov/?id=ED581514
Sterling, T. D. (1959). Publication decisions and their possible effects on inferences drawn from tests of significance – Or vice versa. Journal of the American Statistical Association, 54(285), 30–34.
Stewart, D.-L. (2020). Twisted at the roots: The intransigence of inequality in U.S. higher education. Change, 52(2), 13–16.
Stratton, L. S. (2014). College enrollment: An economic analysis. In M. B. Paulsen (Ed.), Higher education: Handbook of theory and research (Vol. 29, pp. 327–384). Springer.
Strauss, L. C., & Volkwein, J. F. (2002). Comparing student performance and growth in 2- and 4-year institutions. Research in Higher Education, 43(2), 133–161.
Tabron, L. A., & Thomas, A. K. (2023). Deeper than wordplay: A systematic review of critical quantitative approaches in education research (2007–2021). Review of Educational Research, advance online publication. https://doi.org/10.3102/00346543221130017
Taylor, J. L. (2015). Accelerating pathways to college: The (in)equitable effects of community college dual credit. Community College Review, 43, 355–379. https://doi.org/10.1177/0091552115594880
Taylor, J. L., Allen, T. O., An, B. P., Denecker, C., Edmunds, J. A., Fink, J., Giani, M. S., Hodara, M., Hu, X., Tobolowsky, B. F., & Chen, W. (2022). Research priorities for advancing equitable dual enrollment policy and practice. University of Utah. https://cherp.utah.edu/_resources/documents/publications/research_priorities_for_advancing_equitable_dual_enrollment_policy_and_practice.pdf
Thomas, N., Marken, S., Gray, L., & Lewis, L. (2013). Dual credit and exam-based courses in US public high schools: 2010-11 (NCES 2013-001). U.S. Department of Education, National Center for Education Statistics. https://nces.ed.gov/pubs2013/2013001.pdf
Titus, M. A. (2009). The production of bachelor’s degrees and financial aspects of state higher education policy: A dynamic analysis. The Journal of Higher Education, 80(4), 439–468. https://doi.org/10.1353/jhe.0.0055
Turner, N. (2012). Who benefits from student aid? The economic incidence of tax-based federal student aid. Economics of Education Review, 31(4), 463–481. https://doi.org/10.1016/j.econedurev.2011.12.008
Tyler, J. H., Murnane, R. J., & Willett, J. B. (2003). Who benefits from a GED? Evidence for females from High School and Beyond. Economics of Education Review, 22(3), 237–247. https://doi.org/10.1016/S0272-7757(02)00054-7
Waddell, G. R., & Singell, L. D. (2011). Do no-loan policies change the matriculation patterns of low-income students? Economics of Education Review, 30(2), 203–214. https://doi.org/10.1016/j.econedurev.2010.10.004
Webber, D. A. (2016). Are college costs worth it? How ability, major, and debt affect the returns to schooling. Economics of Education Review, 53, 296–310. https://doi.org/10.1016/j.econedurev.2016.04.007
Webber, D. A., & Ehrenberg, R. G. (2010). Do expenditures other than instructional expenditures affect graduation and persistence rates in American higher education? Economics of Education Review, 29(6), 947–958. https://doi.org/10.1016/j.econedurev.2010.04.006
Weisberg, S. (2014). Applied linear regression (4th ed.). Wiley.
Wells, R. S., & Stage, F. K. (2015). Past, present, and future of critical quantitative research in higher education. New Directions for Institutional Research, 2014(163), 103–112. https://doi.org/10.1002/ir.20089
What Works Clearinghouse. (2022). What Works Clearinghouse procedures and standards handbook, version 5.0. U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance (NCEE). https://ies.ed.gov/ncee/wwc/Handbooks
Whatley, M. (2022). Introduction to quantitative analysis for international educators. Springer.
Wilkinson, L., & APA Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54(8), 594–604. https://doi.org/10.1037/0003-066X.54.8.594
Wolniak, G. C., & Engberg, M. E. (2019). Do “high-impact” college experiences affect early career outcomes? The Review of Higher Education, 42(3), 825–858. https://doi.org/10.1353/rhe.2019.0021
Wolniak, G. C., & Pascarella, E. T. (2007). Initial evidence on the long-term impacts of work colleges. Research in Higher Education, 48(1), 39–71. https://doi.org/10.1007/s11162-006-9023-6
Yeung, R., Gigliotti, P., & Nguyen-Hoang, P. (2019). The impact of U.S. news college rankings on the compensation of college and university presidents. Research in Higher Education, 60(1), 1–17. https://doi.org/10.1007/s11162-018-9501-7
Zhang, L. (2010). The use of panel data models in higher education policy studies. In J. C. Smart (Ed.), Higher education: Handbook of theory and research (Vol. 25, pp. 307–350). Springer. https://doi.org/10.1007/978-90-481-8598-6_8
Zinth, J. (2018). STEM dual enrollment: Model policy components. Education Commission of the States. https://www.ecs.org/wp-content/uploads/STEM-Dual-Enrollment-Model-Policy-Components.pdf
Zinth, J., & Barnett, E. (2018). Rethinking dual enrollment to reach more students. Promising practices. Education Commission of the States. https://eric.ed.gov/?id=ED582909

Author information

Authors and Affiliations

Higher Education and Student Affairs, College of Education, Northern Illinois University, DeKalb, IL, USA
Xiaodan Hu

Corresponding author

Correspondence to Xiaodan Hu.

Editor information

Editors and Affiliations

University of Pennsylvania, Philadelphia, PA, USA
Laura W. Perna

Section Editor information

University of Wisconsin-Madison, Madison, WI, USA
Nicholas Hillman

Appendices

Appendix A: Data Preparation for Illustration Replication

The public-use HSLS:09 data can be found at https://nces.ed.gov/datalab/. Once the student-level dataset is downloaded and opened in Stata, the following commands generate the "OLS example data.dta" file. Observation counts reported by Stata are kept as comments.

*** Sample selection ***
* Generate sample with known primary first-year institution identified by 2017
keep if X5PFYEAR > 0
* (14,815 observations deleted)
* Exclude students who did not have any dual credit indicated in their postsecondary transcript
drop if X5HSCRDERN <= 0
* (6,954 observations deleted)
* Exclude students with no valid values of the tuition level of a primary first-year institution
drop if X5PFYTUITION < 0
* (1 observation deleted)

*** Variable selection ***
** Sociodemographic characteristics in 11th grade **
* Sex
recode X2SEX (1=0) (2=1) (-9=.)
label define X2SEXF 0 "Male" 1 "Female", replace
label values X2SEX X2SEXF
* Race/ethnicity
recode X2RACE (-9=.)
* Family income
recode X2FAMINCOME (-8=.)
* Parental education
recode X2PAREDU (-8=.)
* Dependent indicator
gen dependent = 0
replace dependent = 1 if P2DEPENDNUM>0

** Students' expectations of college and costs in 11th grade **
* Educational expectations
replace X2STUEDEXPCT=. if X2STUEDEXPCT<0
* Financial aid expectation
gen finaidexp = 0
replace finaidexp=1 if S2QUALNEED==1 | S2QUALACHIEVE==1
* The importance of cost of attendance when choosing a college
recode S2COSTATTEND (-9=.)(-8=.)(-7=.)

** Academic performance **
* High school overall GPA (honors-weighted)
replace X3TGPAWGT=. if X3TGPAWGT < 0
* The number of AP/IB credits earned
recode X3TCREDAPIB (-9=.)(-8=.)(-7=.)(-6=.)(-1=.)

** High school characteristics **
* High school location
recode X3LOCALE (-9=.)(-8=.)(-7=.)(-1=.)
* High school control
recode X3CONTROL (-9=.)(-8=.)(-7=.)(-1=.) (3=2)

** Enrollment intensity **
recode X5PFYENRLSTAT (-9=.)

** Create a hypothetical unit variable for independence testing because the public-use data suppressed school ID **
egen SimuID = group(X3REGION X3CONTROL X3LOCALE X2SCHOOLCLI)

** Fill missing values with the variable's median value **
* fillmissing is a user-written command, not part of official Stata (typically installed with: ssc install fillmissing)
fillmissing X2FAMINCOME X2PAREDU X3TGPAWGT X2STUEDEXPCT X3TCREDAPIB X3LOCALE X3CONTROL S2COSTATTEND X5PFYENRLSTAT SimuID, with(median)

** Keep only relevant variables for demonstration purposes **
* Note: firstgen and lowinc are not created by the commands shown above; keep will fail unless they already exist in the data
keep X5PFYTUITION X5HSCRDERN X2SEX X2RACE X2FAMINCOME X2PAREDU dependent S2COSTATTEND finaidexp X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB X3LOCALE X3CONTROL X5PFYENRLSTAT firstgen lowinc SimuID STU_ID

* Save file
save "OLS example data.dta", replace

Appendix B: HSLS Variables Used for the Illustrated Example

Role	Category	Variable	Coding	HSLS variable name
Dependent variable		Tuition and fees charged at primary first-year institution (logged)	Continuous	X5PFYTUITION
Independent variable		The number of known dual credits earned	Discrete	X5HSCRDERN
Control variables	Sociodemographic characteristics in 11th grade	Sex	0 = Male; 1 = Female	X2SEX
Control variables	Sociodemographic characteristics in 11th grade	Race/ethnicity	1 = White; 2 = Black/African American, non-Hispanic; 3 = Hispanic; 4 = Asian or Native Hawaiian/Pacific Islander, non-Hispanic; 5 = American Indian or Alaska Native, non-Hispanic; 6 = More than one race, non-Hispanic	X2RACE
Control variables	Sociodemographic characteristics in 11th grade	Total family income from all sources	1 = less than or equal to $15,000; 2 = > $15,000 and <= $35,000; 3 = > $35,000 and <= $55,000; 4 = > $55,000 and <= $75,000; 5 = > $75,000 and <= $95,000; 6 = > $95,000 and <= $115,000; 7 = > $115,000 and <= $135,000; 8 = > $135,000 and <= $155,000; 9 = > $155,000 and <= $175,000; 10 = > $175,000 and <= $195,000; 11 = > $195,000 and <= $215,000; 12 = > $215,000 and <= $235,000; 13 = above $235,000	X2FAMINCOME
Control variables	Sociodemographic characteristics in 11th grade	Parents'/guardians' highest level of education	1 = No postsecondary degree; 2 = Associate's degree; 3 = Bachelor's degree; 4 = Master's degree; 5 = Ph.D./M.D./law/other high-level professional degree	X2PAREDU
Control variables	Sociodemographic characteristics in 11th grade	Number of dependents on respondent's parents	Discrete	P2DEPENDNUM
Control variables	Students' expectations of college and associated costs in 11th grade	How far in school the sample member thinks they will get	1 = No postsecondary degree or don't know; 2 = Associate's degree attempt or attainment; 3 = Bachelor's degree attempt or attainment; 4 = Master's degree attempt or attainment; 5 = Ph.D./M.D./law/other high-level professional degree attempt or attainment	X2STUEDEXPCT
Control variables	Students' expectations of college and associated costs in 11th grade	Whether students expect to qualify for financial aid based on financial need or academic achievement	0 = No; 1 = Yes	S2QUALNEED, S2QUALACHIEVE
Control variables	Students' expectations of college and associated costs in 11th grade	Importance of cost of attendance when choosing college/school	1 = Very important; 2 = Somewhat important; 3 = Not at all important	S2COSTATTEND
Control variables	Academic performance in 12th grade	Overall GPA, honors-weighted	Discrete	X3TGPAWGT
Control variables	Academic performance in 12th grade	Credits earned in AP/IB combined	Discrete	X3TCREDAPIB
Control variables	High school characteristics	Location	1 = Urban; 2 = Suburban; 3 = Town; 4 = Rural	X3LOCALE
Control variables	High school characteristics	Control	1 = Public; 2 = Catholic or other private	X3CONTROL
Control variables	Enrollment status in first year of college	Students' enrollment intensity status	1 = Exclusively full time; 2 = Exclusively part time; 3 = Mixed full time and part time	X5PFYENRLSTAT

Appendix C: Diagnosis and Data Recoding to Address Assumption Violations

use "OLS example data.dta", clear

** Stata code used in Section 2 **
* Generate descriptive summary of the dependent and focal independent variables only (Table 1)
sum X5PFYTUITION X5HSCRDERN, detail
* Run simple regression (Table 2 & Table 3)
regress X5PFYTUITION X5HSCRDERN
* Generate scatterplot with CI (Figure 1a)
twoway (scatter X5PFYTUITION X5HSCRDERN) (lfitci X5PFYTUITION X5HSCRDERN, lcolor(black) color(%50)), legend(order(2 "95% CI" 3 "Fitted values")) scheme(s1mono) xtitle("The number of dual credits earned in high school") ytitle("Tuition and Fees Charged", size(small))
* Use hierarchical procedures to select control variables (Table 4)
nestreg: regress X5PFYTUITION X5HSCRDERN (i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent) (i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT) (X3TGPAWGT X3TCREDAPIB) (i.X3LOCALE i.X3CONTROL) (i.X5PFYENRLSTAT)
* Test omitted variable
regress X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.X5PFYENRLSTAT
linktest
estat ovtest
* Test overfitting
overfit: regress X5PFYTUITION
overfit: regress X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.X5PFYENRLSTAT
* Generate descriptive summary of all variables
sum X5PFYTUITION X5HSCRDERN X2SEX X2RACE X2FAMINCOME X2PAREDU dependent S2COSTATTEND finaidexp X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB X3LOCALE X3CONTROL X5PFYENRLSTAT, detail
* Generate scatter plot matrix (results omitted)
graph matrix X5PFYTUITION X5HSCRDERN X2SEX X2RACE X2FAMINCOME X2PAREDU dependent S2COSTATTEND finaidexp X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB X3LOCALE X3CONTROL X5PFYENRLSTAT
* Calculate the correlation matrix [Option 1]
correlate X5PFYTUITION X5HSCRDERN X2SEX X2RACE X2FAMINCOME X2PAREDU dependent S2COSTATTEND finaidexp X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB X3LOCALE X3CONTROL X5PFYENRLSTAT
* Calculate the correlation matrix [Option 2] (Table 5)
pwcorr X5PFYTUITION X5HSCRDERN X2SEX X2RACE X2FAMINCOME X2PAREDU dependent S2COSTATTEND finaidexp X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB X3LOCALE X3CONTROL X5PFYENRLSTAT, star(.05) bonferroni

** Stata code used in Section 3 **
** 1. Multicollinearity **
* Perform multiple linear regression model
regress X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent X3TGPAWGT i.X2STUEDEXPCT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.S2COSTATTEND i.finaidexp i.X5PFYENRLSTAT
* Calculate VIF for each independent variable
vif
* Recode race
recode X2RACE (8=1)(3=2)(4=3)(5=3)(2=4)(7=4)(1=5)(6=6)
label define X2RACEF 1 "White, non-Hispanic" 2 "Black, non-Hispanic" 3 "Hispanic" 4 "AAPI, non-Hispanic" 5 "Amer. Indian/Alaska Native, non-Hispanic" 6 "More than one race, non-Hispanic", replace
label values X2RACE X2RACEF
* Recode parent education level
recode X2PAREDU (2=1)(3=1)(4=2)(5=3)(6=4)(7=5)
label define X2PAREDUF 1 "No postsecondary degree" 2 "AA" 3 "BA" 4 "Master" 5 "Doctoral or Prof", replace
label values X2PAREDU X2PAREDUF
* Recode education aspiration
recode X2STUEDEXPCT (2=1)(3=1)(4=1)(5=2)(6=2)(7=3)(8=3)(9=4)(10=4)(11=5)(12=5)(13=1)
label define X2EDEXPF 1 "No postsecondary degree or don't know" 2 "AA attempt or attainment" 3 "BA attempt or attainment" 4 "Master attempt or attainment" 5 "Doctoral or Prof attempt or attainment", replace
label values X2STUEDEXPCT X2EDEXPF
* Rerun regression model and calculate VIF (code omitted)

** 2. Linearity **
* Perform revised multiple linear regression model
regress X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent X3TGPAWGT i.X2STUEDEXPCT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.S2COSTATTEND i.finaidexp i.X5PFYENRLSTAT
* Generate residuals (the resid option returns raw residuals; rstandard would return standardized residuals)
predict r, resid
* Plot residuals against the focal predictor variable
scatter r X5HSCRDERN, scheme(s1mono)
* Plot augmented partial residuals with lowess smoothed line (Figure 3a)
acprplot X5HSCRDERN, scheme(s1mono) lowess
* Plot the residuals against the fitted (predicted) values with a reference line at y=0
rvfplot, yline(0) scheme(s1mono)
* Conduct numerical test: Breusch-Pagan/Cook-Weisberg test for heteroskedasticity
estat hettest
* Drop residuals for subsequent analyses
drop r
* Transform dependent variable with natural log
gen log_X5PFYTUITION = ln(X5PFYTUITION)
* Perform revised multiple linear regression model
regress log_X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent X3TGPAWGT i.X2STUEDEXPCT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.S2COSTATTEND i.finaidexp i.X5PFYENRLSTAT
* Generate residuals
predict r, resid
* Plot residuals against the focal predictor variable
scatter r X5HSCRDERN, scheme(s1mono)
* Plot augmented partial residuals with lowess smoothed line (Figure 3b)
acprplot X5HSCRDERN, scheme(s1mono) lowess
* Plot augmented partial residuals with lowess smoothed line for other continuous/discrete variables
scatter r X3TGPAWGT, scheme(s1mono)
acprplot X3TGPAWGT, scheme(s1mono) lowess
scatter r X3TCREDAPIB, scheme(s1mono)
acprplot X3TCREDAPIB, scheme(s1mono) lowess

** 3. Normality **
* Plot a kernel density plot to be overlaid on a normal density plot (Figure 4)
kdensity r, normal scheme(s1mono)
* Plot a standardized normal probability (p-p) plot (Figure 5)
pnorm r, scheme(s1mono)
* Plot the quantiles of a variable against the quantiles of a normal distribution (Figure 6a)
qnorm r, scheme(s1mono)
* Conduct numerical test: Shapiro-Wilk W test for normality with 4<=n<=2000 observations
swilk r
* Conduct numerical tests for inter-quartile range and symmetric distribution
iqr r
* Drop residuals for subsequent analyses
drop r

** Address influential observations **
* Perform revised multiple linear regression model
regress log_X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent X3TGPAWGT i.X2STUEDEXPCT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.S2COSTATTEND i.finaidexp i.X5PFYENRLSTAT
* Predict the studentized residual
predict r, rstudent
* Display stem-and-leaf plots
stem r
* Drop influential observations
drop if abs(r) > 3
* Re-plot the quantiles of a variable against the quantiles of a normal distribution after excluding influential observations (Figure 6b)
qnorm r, scheme(s1mono)
* Drop studentized residuals for subsequent analyses
drop r

** 4. Equal variance (homoscedasticity) **
* Perform revised multiple linear regression model
regress log_X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent X3TGPAWGT i.X2STUEDEXPCT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.S2COSTATTEND i.finaidexp i.X5PFYENRLSTAT
* Plot the residuals against the fitted (predicted) values with a reference line at y=0 (Figure 7b)
rvfplot, yline(0) scheme(s1mono)
* Conduct numerical test: Breusch-Pagan/Cook-Weisberg test for heteroskedasticity
estat hettest

** 5. Independence **
* Generate residuals
predict r, resid
* Plot the residuals against the unit variable (Figure 8)
scatter r SimuID, scheme(s1mono) xtitle(SimuID)

* Save file
save "OLS example data final.dta", replace
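
The `vif` step in the multicollinearity check reports one variance inflation factor per predictor. As a reminder of what that number measures, the standard textbook definition is sketched below. Here R_j^2 denotes the R-squared from regressing predictor j on all the other predictors, and the value 0.80 is a hypothetical illustration rather than a result from the HSLS:09 example.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\mathrm{Tolerance}_j = 1 - R_j^2 = \frac{1}{\mathrm{VIF}_j}

% Hypothetical illustration: if R_j^2 = 0.80, then
\mathrm{VIF}_j = \frac{1}{1 - 0.80} = 5,
\qquad
\mathrm{Tolerance}_j = 0.20
```

Read this way, a larger VIF means that more of a predictor's variation is already explained by the other predictors, which is why recoding sparse categories is followed by rerunning the model and recalculating VIF.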

Appendix D: Full Model Specification for Results Interpretation

use "OLS example data final.dta", clear

** Test interaction effect between the focal independent variable and institutional control **
regress log_X5PFYTUITION c.X5HSCRDERN##i.X3CONTROL i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent X3TGPAWGT i.X2STUEDEXPCT X3TCREDAPIB i.X3LOCALE i.S2COSTATTEND i.finaidexp i.X5PFYENRLSTAT
* Plot interaction effect of c.X5HSCRDERN*i.X3CONTROL (Figure 9a)
margins X3CONTROL, at(X5HSCRDERN = (1(1)21))
marginsplot, scheme(s1mono) yscale(range(8.4 9.2))

** Test interaction effect between the focal independent variable and Hispanic race **
gen Hispanic=0
replace Hispanic=1 if X2RACE==3
regress log_X5PFYTUITION c.X5HSCRDERN i.X2RACE c.X5HSCRDERN#i.Hispanic i.X2SEX i.X2FAMINCOME i.X2PAREDU i.dependent X3TGPAWGT i.X2STUEDEXPCT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.S2COSTATTEND i.finaidexp i.X5PFYENRLSTAT
* Plot interaction effect of c.X5HSCRDERN*i.Hispanic (Figure 9b)
margins Hispanic, at(X5HSCRDERN = (1(1)21))
marginsplot, scheme(s1mono) yscale(range(8.4 9.2))
* Generate coefficients for Hispanic and non-Hispanic group
margins, dydx(X5HSCRDERN) at(Hispanic=(0 1))

*** Subgroup analysis ***
* Generate a scatterplot with 95% CI for non-Hispanic students subgroup (Figure 10a)
twoway (lfitci log_X5PFYTUITION X5HSCRDERN if Hispanic==0) (scatter log_X5PFYTUITION X5HSCRDERN if Hispanic==0, msymbol(o)), legend(order(1 "95% CI" 2 "Fitted values" 3 "non-Hispanic Students")) xtitle("The number of dual credits earned in high school") ytitle("Tuition and Fees Charged", size(small)) graphregion(color(white)) scheme(s1mono)
* Generate a scatterplot with 95% CI for Hispanic students subgroup (Figure 10b)
twoway (lfitci log_X5PFYTUITION X5HSCRDERN if Hispanic==1) (scatter log_X5PFYTUITION X5HSCRDERN if Hispanic==1, msymbol(o)), legend(order(1 "95% CI" 2 "Fitted values" 3 "Hispanic Students")) xtitle("The number of dual credits earned in high school") ytitle("Tuition and Fees Charged", size(small)) graphregion(color(white)) scheme(s1mono)
* Run multiple linear regression for the non-Hispanic student subgroup
regress log_X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.X5PFYENRLSTAT if Hispanic==0
est store nonHispanicStudents
* Run multiple linear regression for the Hispanic student subgroup
regress log_X5PFYTUITION X5HSCRDERN i.X2SEX i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.X5PFYENRLSTAT if Hispanic==1
est store HispanicStudents
* Compare regression coefficients across groups
suest nonHispanicStudents HispanicStudents
test [nonHispanicStudents_mean]X5HSCRDERN = [HispanicStudents_mean]X5HSCRDERN
* Generate output tables with user-written command outreg2 (Table 6)
outreg2 [nonHispanicStudents HispanicStudents] using regsub_output, stats(coef se) alpha(0.001, 0.01, 0.05) asterisk(coef) dec(3) adjr2 replace
seeout

** Final Model Specifications **
* Include the dependent variable and the focal independent variable
regress log_X5PFYTUITION X5HSCRDERN
est store Model1
* Add the block of sociodemographic variables
regress log_X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent
est store Model2
* Add the block of cost expectation variables
regress log_X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT
est store Model3
* Add the block of academic performance variables
regress log_X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB
est store Model4
* Add the block of high school characteristics variables
regress log_X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB i.X3LOCALE i.X3CONTROL
est store Model5
* Add the block of enrollment intensity variables
regress log_X5PFYTUITION X5HSCRDERN i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.X5PFYENRLSTAT
est store Model6
* Add the interaction term (Table 7)
regress log_X5PFYTUITION X5HSCRDERN c.X5HSCRDERN#i.Hispanic i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.X5PFYENRLSTAT
est store Model7
* Calculate robust standard errors
regress log_X5PFYTUITION X5HSCRDERN c.X5HSCRDERN#i.Hispanic i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.X5PFYENRLSTAT, vce(robust)
est store Model8
* Generate output tables with user-written command outreg2 (Table 8)
outreg2 [Model1 Model2 Model3 Model4 Model5 Model6 Model7 Model8] using reg_output, stats(coef se) alpha(0.001, 0.01, 0.05) asterisk(coef) dec(3) adjr2 replace
seeout
* Generate standardized beta
regress log_X5PFYTUITION X5HSCRDERN c.X5HSCRDERN#i.Hispanic i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.X5PFYENRLSTAT, beta
* Generate predicted values holding covariates at their means
regress log_X5PFYTUITION X5HSCRDERN c.X5HSCRDERN#i.Hispanic i.X2SEX i.X2RACE i.X2FAMINCOME i.X2PAREDU i.dependent i.S2COSTATTEND i.finaidexp i.X2STUEDEXPCT X3TGPAWGT X3TCREDAPIB i.X3LOCALE i.X3CONTROL i.X5PFYENRLSTAT
margins, at(X5HSCRDERN =(3 6 9 12 15 18 30)) atmeans
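
Because the final models use the natural log of tuition as the outcome and interact dual credits with the Hispanic indicator, the coefficients read as approximate proportional changes, and the slope differs by group. The sketch below writes this out under assumed notation that is not taken from the chapter: T is tuition and fees, DC is the number of dual credits (X5HSCRDERN), H is the Hispanic indicator, x collects the remaining covariates, and non-Hispanic students are assumed to be the reference group (which interaction level Stata omits should be confirmed in the regression output). The coefficient value 0.01 is purely hypothetical.

```latex
\ln(T_i) = \beta_0 + \beta_1 DC_i + \beta_2 (DC_i \times H_i)
           + \mathbf{x}_i'\boldsymbol{\gamma} + \varepsilon_i

% Slope of log tuition with respect to dual credits, by group
\frac{\partial \ln(T_i)}{\partial DC_i} = \beta_1 + \beta_2 H_i

% Exact percent change in tuition for one additional dual credit
\%\Delta T = 100\left[\exp(\beta_1 + \beta_2 H_i) - 1\right]

% Hypothetical illustration: a group slope of 0.01 implies
100\left[\exp(0.01) - 1\right] \approx 1.005\%
```

For small coefficients the exact percent change is close to 100 times the coefficient, which is why log-outcome coefficients are often read directly as percentage changes. The `margins, dydx(X5HSCRDERN) at(Hispanic=(0 1))` command listed above reports the two group-specific slopes on the log scale.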

Cite this entry

Hu, X. (2024). Using Ordinary Least Squares in Higher Education Research: A Primer. In: Perna, L.W. (eds) Higher Education: Handbook of Theory and Research. Higher Education: Handbook of Theory and Research, vol 39. Springer, Cham. https://doi.org/10.1007/978-3-031-38077-8_13

DOI: https://doi.org/10.1007/978-3-031-38077-8_13
Published: 01 February 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-38076-1
Online ISBN: 978-3-031-38077-8
eBook Packages: Education; Reference Module Humanities and Social Sciences; Reference Module Education
