Information and Relation Extraction for Semantic Annotation of eBook Texts

  • Conference paper
Recent Advances in Intelligent Informatics

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 235)

Abstract

This paper presents our algorithmic approach to information and relation extraction from unstructured texts (such as eBook sections or webpages), to performing other useful analytics on the text, and to automatically generating a semantically meaningful structure (RDF schema). Our algorithmic formulation parses the unstructured text of an eBook and identifies the key concepts it describes, along with the relationships between those concepts. The extracted information is then used for four purposes: (a) generating computed metadata about the text source (such as the readability of an eBook), (b) generating a concept profile for each distinct part of the text, (c) identifying and plotting relationships between the key concepts described in the text, and (d) generating an RDF representation of the text source. We ran our experiments on eBook texts from the Computer Science domain; however, the approach can be applied to other forms of text in other domains as well. The results are useful not only for concept-based tagging and navigation of unstructured text documents (such as eBooks) but also for designing a comprehensive and sophisticated learning recommendation system.
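As a rough illustration of purposes (a) and (d), the sketch below computes one possible readability score and serializes already-extracted relation triples as RDF. It is not the authors' implementation: the abstract does not say which readability measure or RDF vocabulary the paper uses, so the Flesch Reading Ease formula, the vowel-group syllable estimate, the `http://example.org/ebook/` namespace, and all function names are assumptions chosen for illustration. The concept and relation extraction step itself is assumed to have happened upstream.

```python
import re

# Hypothetical namespace for illustration; the paper's RDF vocabulary is not given here.
BASE = "http://example.org/ebook/"


def slug(term: str) -> str:
    """Turn a concept or relation phrase into a URI-safe local name."""
    return re.sub(r"[^A-Za-z0-9]+", "_", term.strip()).strip("_")


def to_ntriples(triples):
    """Serialize (subject, relation, object) tuples as N-Triples lines."""
    lines = []
    for subj, rel, obj in triples:
        lines.append(
            f"<{BASE}{slug(subj)}> <{BASE}{slug(rel)}> <{BASE}{slug(obj)}> ."
        )
    return "\n".join(lines)


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease, assumed here as one common readability measure."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


if __name__ == "__main__":
    # Made-up triples standing in for the output of the extraction step.
    extracted = [("binary tree", "is a", "data structure")]
    print(to_ntriples(extracted))
    print(flesch_reading_ease("A binary tree is a data structure."))
```

Emitting N-Triples keeps the sketch free of third-party dependencies; a real pipeline would more likely build the graph with an RDF library and a published vocabulary.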

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Uddin, A., Piryani, R., Singh, V.K. (2014). Information and Relation Extraction for Semantic Annotation of eBook Texts. In: Thampi, S., Abraham, A., Pal, S., Rodriguez, J. (eds) Recent Advances in Intelligent Informatics. Advances in Intelligent Systems and Computing, vol 235. Springer, Cham. https://doi.org/10.1007/978-3-319-01778-5_22

  • DOI: https://doi.org/10.1007/978-3-319-01778-5_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-01777-8

  • Online ISBN: 978-3-319-01778-5

  • eBook Packages: Engineering, Engineering (R0)
