Research Article
DOI: 10.1145/3593013.3594100

The Misuse of AUC: What High Impact Risk Assessment Gets Wrong

Published: 12 June 2023

ABSTRACT

When determining which machine learning model best performs some high-impact risk assessment task, practitioners commonly use the Area Under the Curve (AUC) to defend and validate their model choices. In this paper, we argue that the way AUC is currently used and understood as a model performance metric departs from how the metric was originally intended to be used. To this end, we characterize the misuse of AUC and illustrate how this misuse manifests harmfully in the real world across several risk assessment domains. We locate this disconnect in the way the original interpretation of AUC has shifted over time, to the point where issues pertaining to decision thresholds, class balance, statistical uncertainty, and protected groups remain unaddressed by AUC-based model comparisons, and where model choices that should be the purview of policymakers are hidden behind a veil of mathematical rigor. We conclude that current model validation practices involving AUC are not robust and are often invalid.
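To make the threshold problem concrete, the following is a minimal sketch (not taken from the paper; the data, model names, and threshold are hypothetical). Because AUC is computed from the ranking of scores alone, any strictly monotone transform of a model's scores leaves its AUC unchanged while changing which cases cross a fixed decision threshold, so two "equivalent" models by AUC can flag very different people in deployment.

```python
# Illustrative sketch only: hypothetical data and models, not the paper's code.
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical imbalanced outcome (~10% positive), typical of risk assessment.
y = (rng.random(n) < 0.10).astype(int)

# "Model A": noisy scores loosely correlated with the outcome.
scores_a = 0.3 * y + 0.5 * rng.random(n)

# "Model B": a strictly monotone transform of Model A's scores.
# The ranking of cases is unchanged, so AUC is identical; calibration is not.
scores_b = scores_a ** 3

print("AUC A:", roc_auc_score(y, scores_a))  # identical ...
print("AUC B:", roc_auc_score(y, scores_b))  # ... to AUC A

# At a fixed deployment threshold, the two AUC-equivalent models make
# very different numbers of errors of each kind.
for name, s in [("A", scores_a), ("B", scores_b)]:
    tn, fp, fn, tp = confusion_matrix(y, (s >= 0.5).astype(int)).ravel()
    print(f"Model {name}: flagged={fp + tp}, false pos={fp}, false neg={fn}")
```

Under this construction, Model B's scores almost never reach the 0.5 threshold, so it flags nearly no one and misses nearly every positive case, despite reporting exactly the same AUC as Model A. Which tradeoff is acceptable is a policy question that the AUC comparison alone cannot answer.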


Published in

FAccT '23: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency
June 2023, 1929 pages
ISBN: 9798400701924
DOI: 10.1145/3593013
Publisher: Association for Computing Machinery, New York, NY, United States

Copyright © 2023 Owner/Author. This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.