- 1. A. Borg, R. E. Kessler, and D. W. Wall. Generation and analysis of very long address traces. In Proc. of the 17th Annual Int. Symp. on Computer Architecture, pages 270-281, May 1990.
- 2. E. Gornish, E. Granston, and A. Veidenbaum. Compiler-directed data prefetching in multiprocessors with memory hierarchies. In Proc. 1990 Int. Conf. on Supercomputing, pages 354-368, 1990.
- 3. J. L. Hennessy and D. A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann, San Mateo, CA, 1990.
- 4. M. D. Hill. Aspects of Cache Memory and Instruction Buffer Performance. PhD thesis, University of California, Berkeley, 1987.
- 5. N. P. Jouppi. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. In Proc. of the 17th Annual Int. Symp. on Computer Architecture, pages 364-373, May 1990.
- 6. D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proc. of the 8th Annual Int. Symp. on Computer Architecture, pages 81-87, 1981.
- 7. J. K. F. Lee and A. J. Smith. Branch prediction strategies and branch target buffer design. Computer, pages 6-22, January 1984.
- 8. R. L. Lee, P.-C. Yew, and D. H. Lawrie. Data prefetching in shared memory multiprocessors. In Proc. of the Int. Conf. on Parallel Processing, pages 28-31, 1987.
- 9. C. H. Perleberg and A. J. Smith. Branch target buffer design and optimization. Technical Report UCB/CSD 89/552, University of California, Berkeley, December 1989.
- 10. A. K. Porterfield. Software methods for improvement of cache performance on supercomputer applications. Technical Report COMP TR 89-93, Rice University, May 1989.
- 11. S. Przybylski. The performance impact of block sizes and fetch strategies. In Proc. of the 17th Annual Int. Symp. on Computer Architecture, pages 160-169, May 1990.
- 12. A. J. Smith. Cache memories. ACM Computing Surveys, 14(3):473-530, September 1982.
- 13. J. E. Smith. Decoupled access/execute computer architecture. In Proc. of the 9th Annual Int. Symp. on Computer Architecture, pages 112-119, 1982.
- 14. The Perfect Club, et al. The Perfect Club benchmarks: Effective performance evaluation of supercomputers. Int. J. of Supercomputer Applications, 23(3):5-40, Fall 1989.