ABSTRACT
The ARCOMEM project addresses memory institutions such as archives, museums, and libraries in the age of the Social Web. Social media are becoming increasingly pervasive in all areas of life. ARCOMEM aims to help transform archives into collective memories that are more tightly integrated with their communities of users, and to exploit Web 2.0 and the wisdom of crowds to make Web archiving a more selective and meaning-based process. ARCOMEM (FP7-IST-270239) is an Integrating Project in the FP7 program of the European Commission, involving twelve partners from academia, industry, and the public sector. The project will run from January 1, 2011 to December 31, 2013.
Index Terms
- ARCOMEM: from collect-all ARchives to COmmunity MEMories