ABSTRACT
The web is a universal repository of information, and it offers an excellent opportunity to integrate online biological resources for knowledge discovery. A major challenge is to support the effective flow of information among the sources and services on the web, and their interconnection with legacy systems designed to operate with traditional relational databases. One possible strategy is to combine information from disparate data sources and present it to the user in a single integrated framework, without having to populate local databases. This is called online or on-the-fly data integration. BioXBase is a user-centric biological query system that extracts the information requested in a user query over the Internet from multiple biological sources and, after the data is cleaned, processed and integrated, organizes this wide variety of information into a homogeneous, unified view for the user. BioXBase improved the retrieved results by approximately 30% compared to a system that has only a local database. Combining the results of BioMap (a local database) and BioXBase (the on-the-fly system) improved the results by a further 20%, making them more significant in the biological domain. The results were validated using statistical measures such as precision, recall and power-law degree distribution analysis.
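The idea of on-the-fly integration and the precision/recall validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the BioXBase implementation: the source names, record fields, identifiers and the "first source wins" merge rule are all hypothetical, and the in-memory lists stand in for results that a real system would fetch from remote sources at query time.

```python
from collections import defaultdict


def integrate(sources):
    """Merge per-source records into one unified view keyed by entity ID.

    `sources` maps a (hypothetical) source name to a list of record dicts.
    Nothing is stored in a local database; the view is built per query.
    """
    unified = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            # Cleaning step (assumed): normalise the identifier.
            entity_id = record["id"].strip().upper()
            for field, value in record.items():
                if field == "id" or value in (None, ""):
                    continue  # skip the key itself and empty values
                # Assumed conflict rule: the first source to supply a field wins.
                unified[entity_id].setdefault(field, value)
            # Track provenance so the user can see where the data came from.
            unified[entity_id].setdefault("_sources", []).append(source_name)
    return dict(unified)


def precision_recall(retrieved, relevant):
    """Return (precision, recall) of retrieved items against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query results from two sources (illustrative data only).
sources = {
    "source_a": [
        {"id": "p01", "name": "Protein A", "pathway": ""},
        {"id": "P02", "name": "Protein B"},
    ],
    "source_b": [
        {"id": "P01 ", "pathway": "pathway-x"},
        {"id": "P03", "name": "Protein C"},
    ],
}

unified = integrate(sources)

# Hypothetical gold standard of entities that are truly relevant to the query.
relevant = {"P01", "P02", "P04"}
precision, recall = precision_recall(unified.keys(), relevant)
```

Here the two partial records for `P01` are merged into a single entry, and precision and recall are computed over the entity identifiers in the unified view. A real system would also need schema mapping and source-specific wrappers, which this sketch leaves out.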
Index Terms
- On-the-fly data integration models for biological databases