DOI: 10.3115/990403.990454

A search algorithm and data structure for an efficient information system

Published: 1 September 1969

ABSTRACT

This paper describes a system for information storage, retrieval, and updating, with special attention to the search algorithm and data structure needed for maximum program efficiency. Program efficiency is especially important when a natural language or a symbolic language is involved in the searching process.

The system is a basic framework for an efficient information system. It can be implemented for text processing and document retrieval; numerical data retrieval; and the handling of large files such as dictionaries, catalogs, and personnel records, as well as graphic information. Currently, eight commands are implemented and operational in batch mode on a CDC 3600: STORE, RETRIEVE, ADD, DELETE, REPLACE, PRINT, COMPRESS, and LIST. Further development will address the use of a teletype console, CRT terminal, and plotter in a time-sharing environment to produce immediate responses.

Maximum program efficiency is obtained through a unique search algorithm and data structure. Instead of examining the recall ratio and the precision ratio at a higher level, efficiency is measured in the most basic terms: the "average number of searches" required to look up an item. To identify an item, at least one search is necessary even if it is found on the first try. However, through the use of the hash address of a key or keyword, in conjunction with an indirect-chaining list-structured table and a large available-space list, the average number of searches required to retrieve an item is 1.25 regardless of the size of the file. This compares with 15.6 searches for the binary search technique in a 50,000-item file, and 5.8 searches for the letter-table method regardless of file size.
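The search-count comparison in the abstract can be illustrated with a small Python sketch. It is not the paper's implementation: it uses plain separate chaining with nodes drawn from a free list as a stand-in for the indirect-chaining table and available-space list, and the names `ChainedTable`, `store`, `retrieve`, and `average_searches`, the modulo hash on integer keys, and the load factor of 0.5 are all assumptions made for illustration. Under the textbook estimate for chaining, a successful lookup costs about 1 + α/2 comparisons at load factor α, so a half-full table would be expected to land near 1.25; the abstract does not say how its figure was obtained.

```python
import math
import random


class ChainedTable:
    """Hash table whose buckets point into a shared pool of list nodes."""

    def __init__(self, n_buckets, pool_size):
        # pool_size is assumed to be at least 1 in this sketch.
        self.heads = [-1] * n_buckets      # bucket -> first node index, -1 if empty
        self.keys = [None] * pool_size
        self.values = [None] * pool_size
        # Initially every node is free and linked to the next one.
        self.next = list(range(1, pool_size)) + [-1]
        self.free = 0                      # head of the available-space list

    def _bucket(self, key):
        return key % len(self.heads)       # toy hash address for integer keys

    def store(self, key, value):
        if self.free == -1:
            raise MemoryError("available-space list is exhausted")
        node = self.free
        self.free = self.next[node]        # unlink the node from the free list
        bucket = self._bucket(key)
        self.keys[node], self.values[node] = key, value
        self.next[node] = self.heads[bucket]   # push onto the front of the chain
        self.heads[bucket] = node

    def retrieve(self, key):
        """Return (value, searches); searches counts key comparisons."""
        node = self.heads[self._bucket(key)]
        searches = 0
        while node != -1:
            searches += 1
            if self.keys[node] == key:
                return self.values[node], searches
            node = self.next[node]
        return None, searches


def average_searches(n_items, n_buckets, seed=0):
    """Average comparisons per successful lookup over all stored keys."""
    rng = random.Random(seed)
    keys = rng.sample(range(10**9), n_items)   # distinct pseudo-random keys
    table = ChainedTable(n_buckets, n_items)
    for k in keys:
        table.store(k, k)
    return sum(table.retrieve(k)[1] for k in keys) / n_items


if __name__ == "__main__":
    print(f"chained hash, load 0.5: {average_searches(5_000, 10_000):.2f}")
    print(f"log2(50,000):           {math.log2(50_000):.1f}")
```

The second printed value is log2(50,000), about 15.6, which matches the binary-search figure quoted for a 50,000-item file. The chained-table average depends on the load factor rather than on the number of items, which is consistent with the abstract's "regardless of the size of the file" claim as long as the table grows with the file.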


Published in

    COLING '69: Proceedings of the 1969 conference on Computational linguistics
    September 1969
    1907 pages

    Publisher: Association for Computational Linguistics, United States

    Qualifiers: Article

    Overall acceptance rate: 1,537 of 1,537 submissions, 100%
