DOI: 10.1145/3411764.3445518 · ACM Conferences · CHI · Conference Proceedings
research-article
Best Paper

“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI

Published: 07 May 2021

ABSTRACT

AI models are increasingly applied in high-stakes domains like health and conservation. Data quality carries an elevated significance in high-stakes AI due to its heightened downstream impact, impacting predictions like cancer detection, wildlife poaching, and loan allocations. Paradoxically, data is the most under-valued and de-glamorised aspect of AI. In this paper, we report on data practices in high-stakes AI, from interviews with 53 AI practitioners in India, East and West African countries, and USA. We define, identify, and present empirical evidence on Data Cascades—compounding events causing negative, downstream effects from data issues—triggered by conventional AI/ML practices that undervalue data quality. Data cascades are pervasive (92% prevalence), invisible, delayed, but often avoidable. We discuss HCI opportunities in designing and incentivizing data excellence as a first-class citizen of AI, resulting in safer and more robust systems for all.
