ABSTRACT
AI models are increasingly applied in high-stakes domains like health and conservation. Data quality carries an elevated significance in high-stakes AI due to its heightened downstream impact, impacting predictions like cancer detection, wildlife poaching, and loan allocations. Paradoxically, data is the most under-valued and de-glamorised aspect of AI. In this paper, we report on data practices in high-stakes AI, from interviews with 53 AI practitioners in India, East and West African countries, and USA. We define, identify, and present empirical evidence on Data Cascades—compounding events causing negative, downstream effects from data issues—triggered by conventional AI/ML practices that undervalue data quality. Data cascades are pervasive (92% prevalence), invisible, delayed, but often avoidable. We discuss HCI opportunities in designing and incentivizing data excellence as a first-class citizen of AI, resulting in safer and more robust systems for all.
- [n.d.]. 2019 Kaggle ML & DS Survey | Kaggle. https://www.kaggle.com/c/kaggle-survey-2019. (Accessed on 08/29/2020).Google Scholar
- [n.d.]. AI Readiness Index 2019 | AI4D | IAPD. https://ai4d.ai/index2019/. (Accessed on 09/14/2020).Google Scholar
- [n.d.]. Landscape of AI-ML Research in India. http://www.itihaasa.com/pdf/Report_Final_ES.pdf. (Accessed on 09/15/2020).Google Scholar
- [n.d.]. UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/index.php. (Accessed on 09/15/2020).Google Scholar
- [n.d.]. A Vision of AI for Joyful Education - Scientific American Blog Network. https://blogs.scientificamerican.com/observations/a-vision-of-ai-for-joyful-education/. (Accessed on 09/14/2020).Google Scholar
- Saleema Amershi, Andrew Begel, Christian Bird, Robert DeLine, Harald Gall, Ece Kamar, Nachiappan Nagappan, Besmira Nushi, and Thomas Zimmermann. 2019. Software engineering for machine learning: A case study. In 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP). IEEE, 291–300.Google ScholarDigital Library
- Saleema Amershi, Maya Cakmak, William Bradley Knox, and Todd Kulesza. 2014. Power to the people: The role of humans in interactive machine learning. Ai Magazine 35, 4 (2014), 105–120.Google ScholarDigital Library
- Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. 2016. Concrete problems in AI safety. (2016). arXiv:1606.06565Google Scholar
- Appen. 2020. The 2020 Machine Learning Report and State of AI. https://appen.com/whitepapers/the-state-of-ai-and-machine-learning-report/. (Accessed on 09/16/2020).Google Scholar
- Lora Aroyo, Lucas Dixon, Nithum Thain, Olivia Redfield, and Rachel Rosen. 2019. Crowdsourcing subjective tasks: the case study of understanding toxicity in online discussions. In Companion Proceedings of The 2019 World Wide Web Conference. 1100–1105.Google ScholarDigital Library
- [11] Lora Aroyo, Anca Dumitrache, Jennimaria Palomaki, Praveen Paritosh, Alex Quinn, Olivia Rhinehart, Mike Schaekermann, Michael Tseng, and Chris Welty.[n.d.]. https://sadworkshop.wordpress.com/Google Scholar
- Lora Aroyo and Chris Welty. 2014. The Three Sides of CrowdTruth. Human Computation 1, 1 (Sep. 2014). https://doi.org/10.15346/hc.v1i1.34Google ScholarCross Ref
- Lora Aroyo and Chris Welty. 2015. Truth Is a Lie: Crowd Truth and the Seven Myths of Human Annotation. AI Magazine 36, 1 (Mar. 2015), 15–24. https://doi.org/10.1609/aimag.v36i1.2564Google ScholarDigital Library
- Jonathan Bailey. 2019. Why Siraj Raval’s Plagiarism is the Future of Plagiarism - Plagiarism Today. https://www.plagiarismtoday.com/2019/10/16/why-siraj-ravals-plagiarism-is-the-future-of-plagiarism/. (Accessed on 09/15/2020).Google Scholar
- Gagan Bansal, Besmira Nushi, Ece Kamar, Walter S Lasecki, Daniel S Weld, and Eric Horvitz. 2019. Beyond accuracy: The role of mental models in human-AI team performance. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, Vol. 7. 2–11.Google ScholarCross Ref
- Anja Bechmann and Geoffrey C Bowker. 2019. Unsupervised by any other name: Hidden layers of knowledge production in artificial intelligence on social media. Big Data & Society 6, 1 (2019), 2053951718819569.Google ScholarCross Ref
- Emma Beede, Elizabeth Baylor, Fred Hersch, Anna Iurchenko, Lauren Wilcox, Paisan Ruamviboonsuk, and Laura M Vardoulakis. 2020. A Human-Centered Evaluation of a Deep Learning System Deployed in Clinics for the Detection of Diabetic Retinopathy. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–12.Google ScholarDigital Library
- Yoshua Bengio. 2020. Time to rethink the publication process in machine learning - Yoshua Bengio. https://yoshuabengio.org/2020/02/26/time-to-rethink-the-publication-process-in-machine-learning/. (Accessed on 08/18/2020).Google Scholar
- Anant Bhardwaj, Souvik Bhattacherjee, Amit Chavan, Amol Deshpande, Aaron J Elmore, Samuel Madden, and Aditya G Parameswaran. 2014. Datahub: Collaborative data science & dataset version management at scale. (2014). arXiv:1409.0798Google Scholar
- Joshua Blumenstock. 2018. Don’t forget people in the use of big data for development.Google Scholar
- Eric Breck, Neoklis Polyzotis, Sudip Roy, Steven Euijong Whang, and Martin Zinkevich. 2019. Data validation for machine learning. In Conference on Systems and Machine Learning (SysML). https://www. sysml. cc/doc/2019/167. pdf.Google Scholar
- Waylon Brunette, Clarice Larson, Shourya Jain, Aeron Langford, Yin Yin Low, Andrew Siew, and Richard Anderson. 2020. Global goods software for the immunization cold chain. In Proceedings of the 3rd ACM SIGCAS Conference on Computing and Sustainable Societies. 208–218.Google ScholarDigital Library
- Peter Buneman, Sanjeev Khanna, and Tan Wang-Chiew. 2001. Why and where: A characterization of data provenance. In International conference on database theory. Springer, 316–330.Google ScholarCross Ref
- Andrew Burt and Patrick Hall. 2020. What to Do When AI Fails – O’Reilly. https://www.oreilly.com/radar/what-to-do-when-ai-fails/. (Accessed on 09/16/2020).Google Scholar
- Joseph Chee Chang, Saleema Amershi, and Ece Kamar. 2017. Revolt: Collaborative crowdsourcing for labeling machine learning datasets. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. 2334–2346.Google ScholarDigital Library
- Kuang Chen, Joseph M Hellerstein, and Tapan S Parikh. 2011. Data in the First Mile.. In CIDR. Citeseer, 203–206.Google Scholar
- Xu Chu, Ihab F Ilyas, and Paolo Papotti. 2013. Holistic data cleaning: Putting violations into context. In 2013 IEEE 29th International Conference on Data Engineering (ICDE). IEEE, 458–469.Google ScholarDigital Library
- Josh Cowls, Thomas King, Mariarosaria Taddeo, and Luciano Floridi. 2019. Designing AI for social good: Seven essential factors. Available at SSRN 3388669(2019).Google Scholar
- Ward Cunningham. 1992. The WyCash portfolio management system. ACM SIGPLAN OOPS Messenger 4, 2 (1992), 29–30.Google ScholarDigital Library
- Florian Daniel, Pavel Kucherbaev, Cinzia Cappiello, Boualem Benatallah, and Mohammad Allahbakhsh. 2018. Quality control in crowdsourcing: A survey of quality attributes, assessment techniques, and assurance actions. ACM Computing Surveys (CSUR) 51, 1 (2018), 1–40.Google ScholarDigital Library
- Maria De-Arteaga, William Herlands, Daniel B Neill, and Artur Dubrawski. 2018. Machine learning for the developing world. ACM Transactions on Management Information Systems (TMIS) 9, 2(2018), 1–14.Google ScholarDigital Library
- Alan Dix, Alan John Dix, Janet Finlay, Gregory D Abowd, and Russell Beale. 2003. Human-computer interaction. Pearson Education.Google Scholar
- Farzana Dudhwala and Lotta Björklund Larsen. 2019. Recalibration in counting and accounting practices: Dealing with algorithmic output in public and private. Big Data & Society 6, 2 (2019), 2053951719858751.Google ScholarCross Ref
- Hamid Ekbia and Bonnie Nardi. 2014. Heteromation and its (dis) contents: The invisible division of labor between humans and machines. First Monday (2014).Google Scholar
- Melanie Feinberg. 2017. A design perspective on data. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. 2952–2963.Google ScholarDigital Library
- Kathleen Fisher and Robert Gruber. 2005. PADS: a domain-specific language for processing ad hoc data. ACM Sigplan Notices 40, 6 (2005), 295–304.Google ScholarDigital Library
- Luciano Floridi, Josh Cowls, Monica Beltrametti, Raja Chatila, Patrice Chazerand, Virginia Dignum, Christoph Luetge, Robert Madelin, Ugo Pagallo, Francesca Rossi, 2018. AI4People—an ethical framework for a good AI society: opportunities, risks, principles, and recommendations. Minds and Machines 28, 4 (2018), 689–707.Google ScholarDigital Library
- Andrew Forward and Timothy C Lethbridge. 2002. The relevance of software documentation, tools and technologies: a survey. In Proceedings of the 2002 ACM symposium on Document engineering. 26–33.Google ScholarDigital Library
- Martin Fowler. 2019. TechnicalDebt. https://martinfowler.com/bliki/TechnicalDebt.html. (Accessed on 09/16/2020).Google Scholar
- Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. 2018. Datasheets for datasets. (2018). arXiv:1803.09010Google Scholar
- Lisa Gitelman. 2013. Raw data is an oxymoron. MIT press.Google Scholar
- Ian Goodfellow, Yoshua Bengio, Aaron Courville, and Yoshua Bengio. 2016. Deep learning. Vol. 1. MIT press Cambridge.Google Scholar
- Laura M Haas, Mauricio A Hernández, Howard Ho, Lucian Popa, and Mary Roth. 2005. Clio grows up: from research prototype to industrial tool. In Proceedings of the 2005 ACM SIGMOD international conference on Management of data. 805–810.Google ScholarDigital Library
- Alon Halevy, Peter Norvig, and Fernando Pereira. 2009. The unreasonable effectiveness of data. IEEE Intelligent Systems 24, 2 (2009), 8–12.Google ScholarDigital Library
- Kim Hazelwood, Sarah Bird, David Brooks, Soumith Chintala, Utku Diril, Dmytro Dzhulgakov, Mohamed Fawzy, Bill Jia, Yangqing Jia, Aditya Kalro, 2018. Applied machine learning at facebook: A datacenter infrastructure perspective. In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 620–629.Google ScholarCross Ref
- Benjamin Heinzerling. 2020. NLP’s Clever Hans Moment has Arrived. https://thegradient.pub/nlps-clever-hans-moment-has-arrived/Google Scholar
- Keith Hiatt, Michael Kleinman, and Mark Latonero. [n.d.]. Tech folk: ’Move fast and break things’ doesn’t work when lives are at stake | The Guardian. https://www.theguardian.com/global-development-professionals-network/2017/feb/02/technology-human-rights. (Accessed on 08/25/2020).Google Scholar
- Charles Hill, Rachel Bellamy, Thomas Erickson, and Margaret Burnett. 2016. Trials and tribulations of developers of intelligent systems: A field study. In 2016 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC). IEEE, 162–170.Google ScholarCross Ref
- J Hirschberg. 1998. Every time I fire a linguist, my performance goes up, and other myths of the statistical natural language processing revolution. Invited talk. In Fifteenth National Conference on Artificial Intelligence (AAAI-98).Google Scholar
- Chien-Ju Ho, Aleksandrs Slivkins, Siddharth Suri, and Jennifer Wortman Vaughan. 2015. Incentivizing high quality crowdwork. In Proceedings of the 24th International Conference on World Wide Web. 419–429.Google ScholarDigital Library
- Victoria Hodge and Jim Austin. 2004. A survey of outlier detection methodologies. Artificial intelligence review 22, 2 (2004), 85–126.Google Scholar
- Fred Hohman, Kanit Wongsuphasawat, Mary Beth Kery, and Kayur Patel. 2020. Understanding and Visualizing Data Iteration in Machine Learning. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–13.Google ScholarDigital Library
- Ben Hutchinson, Andrew Smart, Alex Hanna, Emily Denton, Christina Greer, Oddur Kjartansson, Parker Barnes, and Margaret Mitchell. 2020. Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure. (2020). arXiv:2010.13561Google Scholar
- Nick Hynes, D Sculley, and Michael Terry. 2017. The data linter: Lightweight, automated sanity checking for ml data sets. In NIPS MLSys Workshop.Google Scholar
- John PA Ioannidis, Sander Greenland, Mark A Hlatky, Muin J Khoury, Malcolm R Macleod, David Moher, Kenneth F Schulz, and Robert Tibshirani. 2014. Increasing value and reducing waste in research design, conduct, and analysis. The Lancet 383, 9912 (2014), 166–175.Google Scholar
- Lilly Irani. 2015. The cultural work of microwork. New Media & Society 17, 5 (2015), 720–739.Google ScholarCross Ref
- Lilly C Irani and M Six Silberman. 2013. Turkopticon: Interrupting worker invisibility in amazon mechanical turk. In Proceedings of the SIGCHI conference on human factors in computing systems. 611–620.Google ScholarDigital Library
- Azra Ismail and Neha Kumar. 2018. Engaging solidarity in data collection practices for community health. Proceedings of the ACM on Human-Computer Interaction 2, CSCW(2018), 1–24.Google ScholarDigital Library
- Ayush Jain, Akash Das Sarma, Aditya Parameswaran, and Jennifer Widom. 2017. Understanding workers, developing effective tasks, and enhancing marketplace dynamics: a study of a large crowdsourcing marketplace. (2017). arXiv:1701.06207Google Scholar
- Kaggle. 2019. 2019 Kaggle ML & DS Survey. https://www.kaggle.com/c/kaggle-survey-2019. (Accessed on 08/27/2020).Google Scholar
- Sean Kandel, Andreas Paepcke, Joseph Hellerstein, and Jeffrey Heer. 2011. Wrangler: Interactive visual specification of data transformation scripts. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 3363–3372.Google ScholarDigital Library
- Sean Kandel, Andreas Paepcke, Joseph M Hellerstein, and Jeffrey Heer. 2012. Enterprise data analysis and visualization: An interview study. IEEE Transactions on Visualization and Computer Graphics 18, 12(2012), 2917–2926.Google ScholarDigital Library
- Sasikiran Kandula and Jeffrey Shaman. 2019. Reappraising the utility of Google Flu Trends. PLoS computational biology 15, 8 (2019), e1007258.Google Scholar
- Hannah Kerner. [n.d.]. Too many AI researchers think real-world problems are not relevant | MIT Technology Review. https://www.technologyreview.com/2020/08/18/1007196/ai-research-machine-learning-applications-problems-opinion/. (Accessed on 08/18/2020).Google Scholar
- Mary Beth Kery, Amber Horvath, and Brad A Myers. 2017. Variolite: Supporting Exploratory Programming by Data Scientists.. In CHI, Vol. 10. 3025453–3025626.Google ScholarDigital Library
- Miryung Kim, Thomas Zimmermann, Robert DeLine, and Andrew Begel. 2017. Data scientists in software teams: State of the art and challenges. IEEE Transactions on Software Engineering 44, 11 (2017), 1024–1038.Google ScholarCross Ref
- Ákos Kiss and Tamás Szirányi. 2013. Evaluation of manually created ground truth for multi-view people localization. In Proceedings of the International Workshop on Video and Image Ground Truth in Computer Vision Applications. 1–6.Google ScholarDigital Library
- Laura Koesten, Kathleen Gregory, Paul Groth, and Elena Simperl. 2019. Talking datasets: Understanding data sensemaking behaviours. (2019). arXiv:1911.09041Google Scholar
- Laura Koesten, Emilia Kacprzak, Jeni Tennison, and Elena Simperl. 2019. Collaborative Practices with Structured Data: Do Tools Support What Users Need?. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–14.Google ScholarDigital Library
- Sanjay Krishnan, Michael J Franklin, Ken Goldberg, and Eugene Wu. 2017. Boostclean: Automated error detection and repair for machine learning. (2017). arXiv:1711.01299Google Scholar
- Sanjay Krishnan, Daniel Haas, Michael J Franklin, and Eugene Wu. 2016. Towards reliable interactive data cleaning: A user survey and recommendations. In Proceedings of the Workshop on Human-In-the-Loop Data Analytics. 1–5.Google ScholarDigital Library
- Sanjay Krishnan, Jiannan Wang, Eugene Wu, Michael J Franklin, and Ken Goldberg. 2016. Activeclean: Interactive data cleaning for statistical modeling. Proceedings of the VLDB Endowment 9, 12 (2016), 948–959.Google ScholarDigital Library
- David Lazer and Ryan Kennedy. 2015. What We Can Learn From the Epic Failure of Google Flu Trends | WIRED. https://www.wired.com/2015/10/can-learn-epic-failure-google-flu-trends/. (Accessed on 08/27/2020).Google Scholar
- Zachary C Lipton and Jacob Steinhardt. 2018. Troubling trends in machine learning scholarship. (2018). arXiv:1807.03341Google Scholar
- Maria Littmann, Katharina Selig, Liel Cohen-Lavi, Yotam Frank, Peter Hönigschmid, Evans Kataka, Anja Mösch, Kun Qian, Avihai Ron, Sebastian Schmid, 2020. Validity of machine learning in biology and medicine increased through collaborations across fields of expertise. Nature Machine Intelligence(2020), 1–7.Google Scholar
- Raoni Lourenço, Juliana Freire, and Dennis Shasha. 2019. Debugging machine learning pipelines. In Proceedings of the 3rd International Workshop on Data Management for End-to-End Machine Learning. 1–10.Google ScholarDigital Library
- Yaoli Mao, Dakuo Wang, Michael Muller, Kush R Varshney, Ioana Baldini, Casey Dugan, and Aleksandra Mojsilović. 2019. How Data Scientists Work Together With Domain Experts in Scientific Collaborations: To Find The Right Answer Or To Ask The Right Question?Proceedings of the ACM on Human-Computer Interaction 3, GROUP(2019), 1–23.Google Scholar
- Gary Marcus. 2018. Deep learning: A critical appraisal. (2018). arXiv:1801.00631Google Scholar
- David Martin, Benjamin V Hanrahan, Jacki O’Neill, and Neha Gupta. 2014. Being a turker. In Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing. 224–235.Google ScholarDigital Library
- Nora McDonald, Sarita Schoenebeck, and Andrea Forte. 2019. Reliability and inter-rater reliability in qualitative research: Norms and guidelines for CSCW and HCI practice. Proceedings of the ACM on Human-Computer Interaction 3, CSCW(2019), 1–23.Google ScholarDigital Library
- Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. 2019. A survey on bias and fairness in machine learning. (2019). arXiv:1908.09635Google Scholar
- Bjoern H Menze, Andras Jakab, Stefan Bauer, Jayashree Kalpathy-Cramer, Keyvan Farahani, Justin Kirby, Yuliya Burren, Nicole Porz, Johannes Slotboom, Roland Wiest, 2014. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE transactions on medical imaging 34, 10 (2014), 1993–2024.Google Scholar
- Tim Menzies. 2019. The five laws of SE for AI. IEEE Software 37, 1 (2019), 81–85.Google ScholarDigital Library
- Hannah Miller and Richard Stirling. 2019. Government AI Readiness Index 2019 — Oxford Insights — Oxford Insights. https://www.oxfordinsights.com/ai-readiness2019. (Accessed on 09/14/2020).Google Scholar
- Naja Holten Møller, Claus Bossen, Kathleen H Pine, Trine Rask Nielsen, and Gina Neff. 2020. Who does the work of data?Interactions 27, 3 (2020), 52–55.Google Scholar
- Michael Muller, Ingrid Lange, Dakuo Wang, David Piorkowski, Jason Tsay, Q Vera Liao, Casey Dugan, and Thomas Erickson. 2019. How data science workers work with data: Discovery, capture, curation, design, creation. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–15.Google ScholarDigital Library
- Tadhg Nagle, C. Thomas Redman, and David Sammon. 2017. Only 3% of Companies’ Data Meets Basic Quality Standards. https://hbr.org/2017/09/only-3-of-companies-data-meets-basic-quality-standards. (Accessed on 08/27/2020).Google Scholar
- Safiya Umoja Noble. 2018. Algorithms of oppression: How search engines reinforce racism. NYU Press.Google Scholar
- Lawrence A Palinkas, Sarah M Horwitz, Carla A Green, Jennifer P Wisdom, Naihua Duan, and Kimberly Hoagwood. 2015. Purposeful sampling for qualitative data collection and analysis in mixed method implementation research. Administration and policy in mental health and mental health services research 42, 5 (2015), 533–544.Google Scholar
- Praveen Paritosh. 2018. The missing science of knowledge curation: improving incentives for large-scale knowledge curation. In Companion Proceedings of the The Web Conference 2018. 1105–1106.Google ScholarDigital Library
- Praveen Paritosh, Kurt Bollacker, Maria Stone, Lora Aroyo, and Sarah Luger. 2020. Evaluating Evaluation of AI Systems (Meta-Eval 2020). http://eval.how/aaai-2020/. (Accessed on 09/16/2020).Google Scholar
- Praveen Paritosh, Matt Lease, Mike Schaekermann, and Lora Aroyo. 2020. First workshop on Data Excellence (DEW 2020). http://eval.how/dew2020/. (Accessed on 09/16/2020).Google Scholar
- Samir Passi and Steven Jackson. 2017. Data vision: Learning to see through algorithmic abstraction. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 2436–2447.Google ScholarDigital Library
- Samir Passi and Steven J Jackson. 2018. Trust in data science: collaboration, translation, and accountability in corporate data science projects. Proceedings of the ACM on Human-Computer Interaction 2, CSCW(2018), 1–28.Google ScholarDigital Library
- Samir Passi and Phoebe Sengers. 2020. Making data science systems work. Big Data & Society 7, 2 (2020), 2053951720939605.Google ScholarCross Ref
- Kayur Patel, James Fogarty, James A Landay, and Beverly Harrison. 2008. Investigating statistical machine learning as a tool for software development. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 667–676.Google ScholarDigital Library
- James W Pennebaker. 2011. The secret life of pronouns. New Scientist 211, 2828 (2011), 42–45.Google Scholar
- Fahad Pervaiz, Aditya Vashistha, and Richard Anderson. 2019. Examining the challenges in development data pipeline. In Proceedings of the 2nd ACM SIGCAS Conference on Computing and Sustainable Societies. 13–21.Google ScholarDigital Library
- Kathleen H Pine and Max Liboiron. 2015. The politics of measurement and action. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. 3147–3156.Google ScholarDigital Library
- Neoklis Polyzotis, Sudip Roy, Steven Euijong Whang, and Martin Zinkevich. 2017. Data Management Challenges in Production Machine Learning. In Proceedings of the 2017 ACM International Conference on Management of Data (Chicago, Illinois, USA) (SIGMOD ’17). Association for Computing Machinery, New York, NY, USA, 1723–1726. https://doi.org/10.1145/3035918.3054782Google ScholarDigital Library
- Neoklis Polyzotis, Sudip Roy, Steven Euijong Whang, and Martin Zinkevich. 2018. Data lifecycle challenges in production machine learning: a survey. ACM SIGMOD Record 47, 2 (2018), 17–28.Google ScholarDigital Library
- Vijayshankar Raman and Joseph M Hellerstein. 2001. Potter’s wheel: An interactive data cleaning system. In VLDB, Vol. 1. 381–390.Google Scholar
- Thomas C. Redman. 2018. If Your Data Is Bad, Your Machine Learning Tools Are Useless. https://hbr.org/2018/04/if-your-data-is-bad-your-machine-learning-tools-are-uselessGoogle Scholar
- Rashida Richardson, Jason M Schultz, and Kate Crawford. 2019. Dirty data, bad predictions: How civil rights violations impact police data, predictive policing systems, and justice. NYUL Rev. Online 94(2019), 15.Google Scholar
- Jeffrey Saltz, Michael Skirpan, Casey Fiesler, Micha Gorelick, Tom Yeh, Robert Heckman, Neil Dewar, and Nathan Beard. 2019. Integrating ethics within machine learning courses. ACM Transactions on Computing Education (TOCE) 19, 4 (2019), 1–26.Google ScholarDigital Library
- Jeffrey S Saltz and Nancy W Grady. 2017. The ambiguity of data science team roles and the need for a data science workforce framework. In 2017 IEEE International Conference on Big Data (Big Data). IEEE, 2355–2361.Google ScholarCross Ref
- Nithya Sambasivan, Erin Arnesen, Ben Hutchinson, Tulsee Doshi, and Vinodkumar Prabhakaran. 2021. Re-imagining Algorithmic Fairness in India and Beyond. In ACM FaccT.Google Scholar
- Nithya Sambasivan, Garen Checkley, Amna Batool, Nova Ahmed, David Nemer, Laura Sanely Gaytán-Lugo, Tara Matthews, Sunny Consolvo, and Elizabeth Churchill. 2018. ” Privacy is not for me, it’s for those rich women”: Performative Privacy Practices on Mobile Phones by Women in South Asia. In Fourteenth Symposium on Usable Privacy and Security ({SOUPS} 2018). 127–142.Google Scholar
- Nithya Sambasivan and Jess Holbrook. 2018. Toward responsible AI for the next billion users. interactions 26, 1 (2018), 68–71.Google Scholar
- Morgan Klaus Scheuerman, Jacob M Paul, and Jed R Brubaker. 2019. How computers see gender: An evaluation of gender classification in commercial facial analysis services. Proceedings of the ACM on Human-Computer Interaction 3, CSCW(2019), 1–33.Google ScholarDigital Library
- David Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, and Dan Dennison. 2015. Hidden technical debt in machine learning systems. In Advances in neural information processing systems. 2503–2511.Google Scholar
- David Sculley, Jasper Snoek, Alex Wiltschko, and Ali Rahimi. 2018. Winner’s curse? On pace, progress, and empirical rigor. (2018).Google Scholar
- Zheyuan Ryan Shi, Claire Wang, and Fei Fang. 2020. Artificial Intelligence for Social Good: A Survey. arxiv:2001.01818 [cs.CY]Google Scholar
- David Soergel, Adam Saunders, and Andrew McCallum. 2013. Open Scholarship and Peer Review: a Time for Experimentation. (2013).Google Scholar
- Eliza Strickland. 2019. IBM Watson, heal thyself: How IBM overpromised and underdelivered on AI health care. IEEE Spectrum 56, 4 (2019), 24–31.Google ScholarCross Ref
- Iryna Susha, Åke Grönlund, and Rob Van Tulder. 2019. Data driven social partnerships: Exploring an emergent trend in search of research challenges and questions. Government Information Quarterly 36, 1 (2019), 112–128.Google ScholarCross Ref
- Astra Taylor. 2018. The Automation Charade. https://logicmag.io/failure/the-automation-charade/.Google Scholar
- Alex S. Taylor, Siân Lindley, Tim Regan, David Sweeney, Vasillis Vlachokyriakos, Lillie Grainger, and Jessica Lingel. 2015. Data-in-Place: Thinking through the Relations Between Data and Community(CHI ’15). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/2702123.2702558Google ScholarDigital Library
- David R Thomas. 2006. A general inductive approach for analyzing qualitative evaluation data. American journal of evaluation 27, 2 (2006), 237–246.Google Scholar
- Rachel Thomas and David Uminsky. 2020. The Problem with Metrics is a Fundamental Problem for AI. (2020). arXiv:2002.08512Google Scholar
- Nenad Tomašev, Julien Cornebise, Frank Hutter, Shakir Mohamed, Angela Picciariello, Bec Connelly, Danielle CM Belgrave, Daphne Ezer, Fanny Cachat van der Haert, Frank Mugisha, 2020. AI for social good: unlocking the opportunity for positive impact. Nature Communications 11, 1 (2020), 1–6.Google ScholarCross Ref
- Jennifer Wortman Vaughan. 2017. Making better use of the crowd: How crowdsourcing can advance machine learning research. The Journal of Machine Learning Research 18, 1 (2017), 7026–7071.Google ScholarDigital Library
- Janet Vertesi and Paul Dourish. 2011. The value of data: considering the context of production in data economies. In Proceedings of the ACM 2011 conference on Computer supported cooperative work. 533–542.Google ScholarDigital Library
- Bret Victor. 2013. Media for Thinking the Unthinkable. http://worrydream.com/MediaForThinkingTheUnthinkable/. (Accessed on 09/15/2020).Google Scholar
- Kiri Wagstaff. 2012. Machine learning that matters. (2012). arXiv:1206.4656Google Scholar
- Sarah Myers West, Meredith Whittaker, and Kate Crawford. 2019. Discriminating systems: Gender, race and power in AI. AI Now Institute (2019), 1–33.Google Scholar
- Amy X Zhang, Michael Muller, and Dakuo Wang. 2020. How do data science workers collaborate? roles, workflows, and tools. Proceedings of the ACM on Human-Computer Interaction 4, CSCW1(2020), 1–23.Google ScholarDigital Library
- Jing Zhang, Xindong Wu, and Victor S Sheng. 2016. Learning from crowdsourced labeled data: a survey. Artificial Intelligence Review 46, 4 (2016), 543–576.Google ScholarDigital Library
- Jie M Zhang, Mark Harman, Lei Ma, and Yang Liu. 2020. Machine learning testing: Survey, landscapes and horizons. IEEE Transactions on Software Engineering(2020).Google ScholarDigital Library
Index Terms
- “Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI
Recommendations
Open data and information sharing in developing nations
dg.o '14: Proceedings of the 15th Annual International Conference on Digital Government ResearchThe potential positive impacts of open data and information sharing initiatives, along with their attendant challenges, have captured the attention and imagination of governments around the world, in both industrialized and developing countries. ...
Public Health Calls for/with AI: An Ethnographic Perspective
CSCWArtificial Intelligence (AI) based technologies are increasingly being integrated into public sector programs to help with decision-support and effective distribution of constrained resources. The field of Computer Supported Cooperative Work (CSCW) has ...
Artificially Intelligent Technology for the Margins: A Multidisciplinary Design Agenda
CHI EA '21: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing SystemsThere has been increasing interest in socially just use of Artificial Intelligence (AI) and Machine Learning (ML) in the development of technology that may be extended to marginalized people. However, the exploration of such technologies entails the ...
Comments