DOI: 10.1145/1996092.1996096
research-article

Static type checking of Hadoop MapReduce programs

Published: 08 June 2011

ABSTRACT

MapReduce is a programming model for the development of Web-scale programs. It is based on concepts from functional programming, namely higher-order functions, which can be strongly typed using parametric polymorphism. Yet this connection is tenuous. For example, in Hadoop, the connection between the two phases of a MapReduce computation is unsafe: there is no static type check of the generic type parameters involved. We provide a static check for Hadoop programs without asking the user to write any additional code. To this end, we use strongly typed higher-order functions that are checked by the standard Java 5 type checker together with the Hadoop program. We also automatically generate the code needed to execute this program with a standard Hadoop implementation.
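To illustrate the idea of a strongly typed higher-order function, the following is a minimal sketch, not the authors' tool and not the Hadoop API. All names (`TypedMapReduce`, `MapFn`, `ReduceFn`, `mapReduce`, `wordCount`) are hypothetical, and the sketch assumes a recent Java version with lambdas for brevity, whereas the paper targets Java 5. The point is that the type parameters `K2` and `V2` appear in both the map and the reduce argument, so the compiler rejects any pairing in which the map output types differ from the reduce input types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

public class TypedMapReduce {

    /** Map phase: turns one (K1, V1) input pair into any number of (K2, V2) pairs. */
    public interface MapFn<K1, V1, K2, V2> {
        void map(K1 key, V1 value, BiConsumer<K2, V2> emit);
    }

    /** Reduce phase: folds all V2 values grouped under one K2 key into a V3 result. */
    public interface ReduceFn<K2, V2, V3> {
        V3 reduce(K2 key, List<V2> values);
    }

    /**
     * Hypothetical in-memory driver. K2 and V2 are shared by both function
     * arguments, which is what makes the phase connection statically checked.
     */
    public static <K1, V1, K2 extends Comparable<K2>, V2, V3> Map<K2, V3> mapReduce(
            Map<K1, V1> input,
            MapFn<K1, V1, K2, V2> mapper,
            ReduceFn<K2, V2, V3> reducer) {
        // Group every emitted value under its intermediate key (the "shuffle").
        Map<K2, List<V2>> groups = new TreeMap<>();
        for (Map.Entry<K1, V1> e : input.entrySet()) {
            mapper.map(e.getKey(), e.getValue(),
                (k, v) -> groups.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }
        // Apply the reducer once per intermediate key.
        Map<K2, V3> out = new TreeMap<>();
        for (Map.Entry<K2, List<V2>> g : groups.entrySet()) {
            out.put(g.getKey(), reducer.reduce(g.getKey(), g.getValue()));
        }
        return out;
    }

    /** Word count expressed with the typed driver above. */
    public static Map<String, Integer> wordCount(Map<Integer, String> lines) {
        MapFn<Integer, String, String, Integer> tokenize = (lineNo, line, emit) -> {
            for (String w : line.split("\\s+")) {
                if (!w.isEmpty()) {
                    emit.accept(w, 1);
                }
            }
        };
        ReduceFn<String, Integer, Integer> sum = (word, ones) -> {
            int total = 0;
            for (int n : ones) {
                total += n;
            }
            return total;
        };
        // Passing a ReduceFn<String, String, Integer> here instead would be a
        // compile-time error, because V2 is already fixed to Integer by `tokenize`.
        return mapReduce(lines, tokenize, sum);
    }

    public static void main(String[] args) {
        Map<Integer, String> lines = new TreeMap<>();
        lines.put(0, "a b a");
        lines.put(1, "b c");
        System.out.println(wordCount(lines));
    }
}
```

By contrast, a Hadoop job is typically wired up by handing class objects to the job configuration, so a mismatch between the mapper's output types and the reducer's input types is not caught by the Java compiler.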


Published in

MapReduce '11: Proceedings of the second international workshop on MapReduce and its applications
June 2011
82 pages
ISBN: 9781450307000
DOI: 10.1145/1996092

Copyright © 2011 ACM


Publisher

Association for Computing Machinery, New York, NY, United States
