ABSTRACT
This paper describes a system for information storage, retrieval, and updating, with special attention to the search algorithm and data structure required for maximum program efficiency. Such efficiency is especially important when a natural language or a symbolic language is involved in the searching process.

The system is a basic framework for an efficient information system. It can be implemented for text processing and document retrieval; for numerical data retrieval; and for handling large files such as dictionaries, catalogs, and personnel records, as well as graphic information. Currently, eight commands are implemented and operational in batch mode on a CDC 3600: STORE, RETRIEVE, ADD, DELETE, REPLACE, PRINT, COMPRESS, and LIST. Further development will focus on the use of a teletype console, CRT terminal, and plotter in a time-sharing environment to produce immediate responses.

The maximum program efficiency is obtained through a unique search algorithm and data structure. Rather than being assessed at a higher level through the recall ratio and the precision ratio, this efficiency is measured in the most basic terms: the "average number of searches" required to look up an item. To identify an item, at least one search is necessary, even if the item is found on the first attempt. However, by using the hash address of a key or keyword together with an indirect-chaining, list-structured table and a large available-space list, the average number of searches required to retrieve an item is 1.25, regardless of the size of the file in question. By comparison, the binary search technique requires 15.6 searches in a 50,000-item file, and the letter-table method requires 5.8 searches irrespective of file size.
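The "average number of searches" measure can be illustrated with a small sketch. It is not the paper's CDC 3600 implementation: it substitutes plain separate chaining for the indirect-chaining table, and the names `ChainedTable` and `average_searches`, the random integer keys, and the choice of 100,000 buckets for 50,000 items are all assumptions made for illustration. Under uniform hashing, the textbook estimate for a successful lookup in a chained table is roughly 1 + (load factor)/2, which gives about 1.25 at a load factor of 0.5; the abstract does not say whether its figure was derived this way.

```python
import math
import random


class ChainedTable:
    """Hash table with separate chaining that counts key comparisons.

    Illustrative stand-in only; the paper's indirect-chaining table
    and available-space list are not reproduced here.
    """

    def __init__(self, n_buckets):
        self.buckets = [[] for _ in range(n_buckets)]

    def _slot(self, key):
        # Hash address of the key: reduce the hash to a bucket index.
        return hash(key) % len(self.buckets)

    def store(self, key, value):
        chain = self.buckets[self._slot(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # overwrite an existing entry
                return
        chain.append((key, value))

    def retrieve(self, key):
        """Return (value, searches); value is None if the key is absent."""
        searches = 0
        for k, v in self.buckets[self._slot(key)]:
            searches += 1
            if k == key:
                return v, searches
        # Even a miss on an empty chain costs one look at the table.
        return None, max(searches, 1)


def average_searches(n_items, n_buckets, seed=0):
    """Average comparisons per successful lookup over all stored keys."""
    rng = random.Random(seed)
    keys = rng.sample(range(10**9), n_items)  # hypothetical distinct keys
    table = ChainedTable(n_buckets)
    for k in keys:
        table.store(k, None)
    total = sum(table.retrieve(k)[1] for k in keys)
    return total / n_items


if __name__ == "__main__":
    # Load factor 0.5 (assumed): expect a value near 1 + 0.5 / 2.
    print("chained hash table:", round(average_searches(50_000, 100_000), 2))
    # log2(N) comparisons, the figure quoted for binary search.
    print("binary search     :", round(math.log2(50_000), 1))
```

In this sketch the hashing cost depends only on the ratio of items to buckets, so it stays flat as the file grows provided the table grows with it, whereas the binary search figure increases with log2 of the file size.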