
Exploring Data Locality for Clustered Enterprise Applications

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8055)

Abstract

Exploring data locality is crucial to achieving good performance on a distributed system. For many complex, constantly evolving applications, relying on programmers to write their code so as to explore data locality often results in sub-par performance. We propose an automatic approach for dealing with this problem. Instead of expecting programmers to identify data locality, the solution developed here relies on a stochastic analysis of the data-access patterns exhibited by the application at run-time. The analysis makes it possible to correlate not only domain data but application functionality as well. This information is used to explore data locality in clustered enterprise applications by combining two orthogonal and complementary approaches. The first approach reduces the memory footprint by using a more compact in-memory representation for the application’s domain classes and, furthermore, by delaying the loading of less frequently accessed data. The second approach generates a new request distribution policy. It employs the Latent Dirichlet Allocation partitioning algorithm to generate subsets of highly correlated application functionality. Every cluster node is responsible for processing requests belonging to a single subset. The combination of these approaches allows cluster nodes to make better use of their memory, thereby increasing the computational efficiency of the system. The work has been validated on the TPC-W benchmark, demonstrating significant performance improvements.
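The second approach can be illustrated with a minimal sketch of how a request distribution policy might be derived once functionality has been partitioned. It is not the authors' implementation. It assumes the Latent Dirichlet Allocation step has already run and produced, for each request type, a weight per latent topic. The request-type names, the weights, the node names, and the helper functions `build_policy` and `route` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical LDA output: for each request type, its weight in each of
# three latent topics. The values are invented for illustration; they are
# not results from the paper.
TOPIC_WEIGHTS: Dict[str, List[float]] = {
    "home":           [0.80, 0.15, 0.05],
    "search":         [0.70, 0.20, 0.10],
    "product_detail": [0.75, 0.05, 0.20],
    "shopping_cart":  [0.10, 0.85, 0.05],
    "buy_confirm":    [0.05, 0.60, 0.35],
    "order_inquiry":  [0.05, 0.15, 0.80],
}

# Hypothetical cluster: one node per subset of functionality.
NODES: List[str] = ["node-a", "node-b", "node-c"]


def build_policy(topic_weights: Dict[str, List[float]]) -> Dict[str, int]:
    """Assign each request type to the topic in which it weighs most."""
    return {
        request_type: max(range(len(weights)), key=weights.__getitem__)
        for request_type, weights in topic_weights.items()
    }


def route(request_type: str, policy: Dict[str, int],
          nodes: List[str], fallback: int = 0) -> str:
    """Return the node serving this request type.

    Request types that were never profiled go to a fallback subset
    (an assumption of this sketch, not something the abstract specifies).
    """
    return nodes[policy.get(request_type, fallback)]


POLICY = build_policy(TOPIC_WEIGHTS)

if __name__ == "__main__":
    for request_type in TOPIC_WEIGHTS:
        print(request_type, "->", route(request_type, POLICY, NODES))
```

Because each request type is pinned to exactly one subset, a node only needs to keep in memory the domain data that its own subset touches, which is the locality effect the abstract describes. Choosing the single highest-weight topic is one simple way to turn a topic mixture into a hard partition; other assignment rules are possible.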




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Garbatov, S., Cachopo, J. (2013). Exploring Data Locality for Clustered Enterprise Applications. In: Decker, H., Lhotská, L., Link, S., Basl, J., Tjoa, A.M. (eds) Database and Expert Systems Applications. DEXA 2013. Lecture Notes in Computer Science, vol 8055. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40285-2_20


  • DOI: https://doi.org/10.1007/978-3-642-40285-2_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40284-5

  • Online ISBN: 978-3-642-40285-2

  • eBook Packages: Computer Science, Computer Science (R0)
