research-article

The Misuse of AUC: What High Impact Risk Assessment Gets Wrong

Authors:
Kweku Kwegyir-Aggrey (Brown University, USA; ORCID 0000-0002-6971-7355)
Marissa Gerchick (American Civil Liberties Union, USA; ORCID 0009-0007-3831-8961)
Malika Mohan (American Civil Liberties Union, USA; ORCID 0009-0005-7553-4116)
Aaron Horowitz (American Civil Liberties Union, USA; ORCID 0000-0001-7931-8756)
Suresh Venkatasubramanian (Brown University, USA; ORCID 0000-0001-7679-7130)

FAccT '23: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, June 2023, Pages 1570–1583. https://doi.org/10.1145/3593013.3594100

Published: 12 June 2023

ABSTRACT

When determining which machine learning model best performs some high impact risk assessment task, practitioners commonly use the Area under the Curve (AUC) to defend and validate their model choices. In this paper, we argue that the current use and understanding of AUC as a model performance metric misunderstands the way the metric was intended to be used. To this end, we characterize the misuse of AUC and illustrate how this misuse negatively manifests in the real world across several risk assessment domains. We locate this disconnect in the way the original interpretation of AUC has shifted over time to the point where issues pertaining to decision thresholds, class balance, statistical uncertainty, and protected groups remain unaddressed by AUC-based model comparisons, and where model choices that should be the purview of policymakers are hidden behind the veil of mathematical rigor. We conclude that current model validation practices involving AUC are not robust, and often invalid.
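The abstract's point about decision thresholds can be illustrated with a small sketch. This is not the authors' analysis: the scores, labels, and the 0.5 cutoff below are invented for illustration, and the helper names `auc` and `rates_at_threshold` are hypothetical. The sketch computes AUC as the fraction of (positive, negative) pairs that a model ranks correctly, then shows two score lists with identical AUC that behave differently once a fixed threshold is applied.

```python
def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking (the usual rank-based convention).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def rates_at_threshold(scores, labels, threshold):
    """Return (true positive rate, false positive rate) when flagging score >= threshold."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / n_pos, fp / n_neg


# Invented toy data: four negatives followed by two positives.
labels = [0, 0, 0, 0, 1, 1]
model_a = [0.1, 0.2, 0.3, 0.6, 0.4, 0.9]
# Model B halves every score: the ranking, and therefore the AUC, is unchanged.
model_b = [s * 0.5 for s in model_a]

threshold = 0.5  # assumed policy cutoff, chosen only for this illustration

print("AUC A:", auc(model_a, labels), "AUC B:", auc(model_b, labels))
print("A (TPR, FPR):", rates_at_threshold(model_a, labels, threshold))
print("B (TPR, FPR):", rates_at_threshold(model_b, labels, threshold))
```

In this toy case both score lists have an AUC of 0.875, yet at the 0.5 cutoff model A flags half of the positives and a quarter of the negatives while model B flags no one. An AUC comparison alone would treat the two as equivalent, which is the kind of gap around thresholds that the abstract describes.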

Published in

FAccT '23: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency
June 2023
1929 pages
ISBN: 9798400701924
DOI: 10.1145/3593013

Copyright © 2023 Owner/Author
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- research-article
- Research
- Refereed limited