“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI

Authors:
Nithya Sambasivan, Google Research, United States
Shivani Kapania, Google Research India, India
Hannah Highfill, Google Inc., United States
Diana Akrong, Google Research Accra, Ghana
Praveen Paritosh, Google, United States
Lora M Aroyo, Google, United States

CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, May 2021, Article No.: 39, Pages 1–15, https://doi.org/10.1145/3411764.3445518

Published: 07 May 2021


ABSTRACT

AI models are increasingly applied in high-stakes domains like health and conservation. Data quality carries elevated significance in high-stakes AI because of its heightened downstream impact on predictions such as cancer detection, wildlife poaching, and loan allocations. Paradoxically, data is the most undervalued and de-glamorised aspect of AI. In this paper, we report on data practices in high-stakes AI, drawing on interviews with 53 AI practitioners in India, East and West African countries, and the USA. We define, identify, and present empirical evidence on Data Cascades (compounding events causing negative, downstream effects from data issues), which are triggered by conventional AI/ML practices that undervalue data quality. Data cascades are pervasive (92% prevalence), invisible, delayed, but often avoidable. We discuss HCI opportunities in designing and incentivizing data excellence as a first-class citizen of AI, resulting in safer and more robust systems for all.


Index Terms

1. Applied computing
2. Human-centered computing

Index terms have been assigned to the content through auto-classification.

Published in
CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
May 2021
10862 pages
ISBN: 9781450380966
DOI: 10.1145/3411764
General Chairs:
Yoshifumi Kitamura, Tohoku University, Japan
Aaron Quigley, University of New South Wales, Australia
Program Chairs:
Katherine Isbister, University of California Santa Cruz, USA
Takeo Igarashi, The University of Tokyo, Japan
Publications Chairs:
Pernille Bjørn, University of Copenhagen, Denmark
Steven Drucker, Microsoft Research, USA
Copyright © 2021 Owner/Author
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Best Paper
Author Tags
AI
Data
Ghana
India
Kenya
ML
Nigeria
USA
Uganda
application-domain experts
data cascades
data collectors
data politics
data quality
developers
high-stakes AI
raters
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%