Tag Me If You Can: Insights into the Challenges of Supporting Unrestricted P2P News Tagging

  • Conference paper
Information and Software Technologies (ICIST 2020)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1283))

Abstract

Peer-to-Peer news portals allow Internet users to write news articles and make them available online to interested readers. Although authors are free in their choice of topics, an article must meet a number of quality criteria before it is published. In addition to meaningful titles, comprehensibly written texts and meaningful images, relevant tags are an important criterion for the quality of such news. In this case study, we discuss the challenges that Peer-to-Peer reporters face and the common mistakes they make when tagging news, and how incorrect information can be corrected through the orchestration of existing Natural Language Processing services. Lastly, we use this illustrative example to give insight into the challenges of dealing with bottom-up taxonomies.

Notes

  1. ShortNews discontinued operations in 2018.

  2. Based on the data available to us. It is very likely that new tags have been added this year, but our dataset does not represent this.

Acknowledgements

This work was partially supported by the German Research Foundation (DFG) within the Collaborative Research Centre On-The-Fly Computing (SFB 901).

Author information

Corresponding author

Correspondence to Frederik S. Bäumer.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Bäumer, F.S., Kersting, J., Buff, B., Geierhos, M. (2020). Tag Me If You Can: Insights into the Challenges of Supporting Unrestricted P2P News Tagging. In: Lopata, A., Butkienė, R., Gudonienė, D., Sukackė, V. (eds) Information and Software Technologies. ICIST 2020. Communications in Computer and Information Science, vol 1283. Springer, Cham. https://doi.org/10.1007/978-3-030-59506-7_30

  • DOI: https://doi.org/10.1007/978-3-030-59506-7_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59505-0

  • Online ISBN: 978-3-030-59506-7

  • eBook Packages: Computer Science, Computer Science (R0)
