Abstract
Supervised learning, especially supervised deep learning, requires large amounts of labeled data. One way to collect such data is through a crowdsourcing platform, where numerous workers perform the annotation tasks. However, the resulting labels often contain noise, since annotation skill varies across crowd workers. Learning from Crowds is a framework that trains models directly on the noisy labels provided by crowd workers. In this study, we propose a novel Learning from Crowds model inspired by SelectiveNet, which was originally proposed for the selective prediction problem. The proposed method, called the Label Selection Layer, trains a prediction model while automatically deciding, via a selector network, whether each worker's label should be used for training. A major advantage of the proposed method is that it applies to almost all variants of supervised learning: one simply adds a selector network to an existing model and changes its objective function, without explicitly assuming a noise model for the crowd annotations. Experimental results show that the proposed method performs comparably to or better than the Crowd Layer, one of the state-of-the-art methods for Deep Learning from Crowds, except in the regression case.
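The abstract's core idea, weighting each worker label's loss by a selector output under a SelectiveNet-style objective, can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function name, the coverage target, and the penalty weight `lam` are all hypothetical, and the loss form follows the SelectiveNet objective that the paper cites as inspiration.

```python
def selective_loss(losses, selector_probs, target_coverage=0.8, lam=32.0):
    """SelectiveNet-style objective over one batch of worker labels.

    losses         -- per-example losses under each worker's (noisy) label
    selector_probs -- selector network outputs in [0, 1]: how much to trust
                      each worker label for training
    The selected average loss is normalized by empirical coverage, and a
    quadratic penalty keeps coverage near the target so the selector cannot
    trivially reject every label. Hyperparameter values are illustrative.
    """
    n = len(losses)
    coverage = sum(selector_probs) / n
    weighted = sum(s * l for s, l in zip(selector_probs, losses)) / n
    penalty = lam * max(0.0, target_coverage - coverage) ** 2
    return weighted / coverage + penalty
```

For example, if the selector trusts both labels fully (`selector_probs = [1.0, 1.0]`) the objective reduces to the plain average loss, while low selector outputs trade a smaller selected loss against the coverage penalty.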
Notes
- 1.
Note that the experimental setup in the previous study [24] differs from our split setup because they split the data only into training and test sets.
References
Albarqouni, S., Baur, C., Achilles, F., Belagiannis, V., Demirci, S., Navab, N.: AggNet: deep learning from crowds for mitosis detection in breast cancer histology images. IEEE Trans. Med. Imaging 35, 1313–1321 (2016)
Baba, Y., Kashima, H.: Statistical quality estimation for general crowdsourcing tasks. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2013)
Braylan, A., Lease, M.: Modeling and aggregation of complex annotations via annotation distances. In: Proceedings of The Web Conference 2020 (2020)
Braylan, A., Lease, M.: Aggregating complex annotations via merging and matching. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2021)
Chen, P., Sun, H., Yang, Y., Chen, Z.: Adversarial learning from crowds. In: Proceedings of the AAAI Conference on Artificial Intelligence (2022)
Chen, Z., et al.: Structured probabilistic end-to-end learning from crowds. In: Proceedings of the 29th International Joint Conference on Artificial Intelligence (2020)
Chow, C.: On optimum recognition error and reject tradeoff. IEEE Trans. Inf. Theory 16, 41–46 (1970)
Chu, Z., Ma, J., Wang, H.: Learning from crowds by modeling common confusions. In: Proceedings of the AAAI Conference on Artificial Intelligence (2021)
Dawid, A.P., Skene, A.M.: Maximum likelihood estimation of observer error-rates using the EM algorithm. J. Roy. Stat. Soc. Series C (Appl. Stat.) 28, 20–28 (1979)
Deng, J., et al.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition (2009)
Falcon, W., The PyTorch Lightning team: PyTorch Lightning (2019)
Geifman, Y., El-Yaniv, R.: Selective classification for deep neural networks. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (2017)
Geifman, Y., El-Yaniv, R.: SelectiveNet: a deep neural network with an integrated reject option. In: Proceedings of the 36th International Conference on Machine Learning (2019)
Kajino, H., Tsuboi, Y., Kashima, H.: A convex formulation for learning from crowds. In: Proceedings of the AAAI Conference on Artificial Intelligence (2012)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: 3rd International Conference on Learning Representations (2015)
Mozannar, H., Sontag, D.: Consistent estimators for learning to defer to an expert. In: Proceedings of the 37th International Conference on Machine Learning (2020)
Oyama, S., Baba, Y., Sakurai, Y., Kashima, H.: Accurate integration of crowdsourced labels using workers’ self-reported confidence scores. In: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (2013)
Pang, B., Lee, L.: Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (2005)
Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
Pennington, J., Socher, R., Manning, C.: GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (2014)
Raykar, V.C., et al.: Learning from crowds. J. Mach. Learn. Res. 11, 1297–1322 (2010)
Rodrigues, F., Lourenço, M., Ribeiro, B., Pereira, F.C.: Learning supervised topic models for classification and regression from crowds. IEEE Trans. Pattern Anal. Mach. Intell. 39, 2409–2422 (2017)
Rodrigues, F., Pereira, F., Ribeiro, B.: Sequence labeling with multiple annotators. Mach. Learn. 95, 165–181 (2014)
Rodrigues, F., Pereira, F.C.: Deep learning from crowds. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence (2018)
Russell, B.C., Torralba, A., Murphy, K.P., Freeman, W.T.: LabelMe: a database and web-based tool for image annotation. Int. J. Comput. Vis. 77, 157–173 (2008)
Sabetpour, N., Kulkarni, A., Xie, S., Li, Q.: Truth discovery in sequence labels from crowds. In: 2021 IEEE International Conference on Data Mining (2021)
Tjong Kim Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: language-independent named entity recognition. In: Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 (2003)
Sheng, V.S., Provost, F., Ipeirotis, P.G.: Get another label? Improving data quality and data mining using multiple, noisy labelers. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2008)
Takeoka, K., Dong, Y., Oyamada, M.: Learning with unsure responses. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence (2020)
Wang, S., Dang, D.: A generative answer aggregation model for sentence-level crowdsourcing task. IEEE Trans. Knowl. Data Eng. 34, 3299–3312 (2022)
Welinder, P., Branson, S., Perona, P., Belongie, S.: The multidimensional wisdom of crowds. In: Advances in Neural Information Processing Systems (2010)
Whitehill, J., Wu, T.F., Bergsma, J., Movellan, J., Ruvolo, P.: Whose vote should count more: optimal integration of labels from labelers of unknown expertise. In: Advances in Neural Information Processing Systems (2009)
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yoshimura, K., Kashima, H. (2024). Label Selection Approach to Learning from Crowds. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1962. Springer, Singapore. https://doi.org/10.1007/978-981-99-8132-8_16
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8131-1
Online ISBN: 978-981-99-8132-8
eBook Packages: Computer Science (R0)