MASSIVE SIMULATIONS USING MAPREDUCE MODEL

In the last few years, cloud computing has been growing into a dominant solution for large-scale numerical problems. It is based on the MapReduce programming model, which provides high scalability and flexibility while also optimizing the cost of computing infrastructure. This paper studies the feasibility of the MapReduce model for scientific problems consisting of many independent simulations. An experiment based on variability analysis of a simple electromagnetic problem with over 10,000 scenarios shows that the platform has nearly linear scalability and reaches about 80% of the theoretical maximum performance.


Introduction
Distributed processing has a long history in computer science. From its very beginning it has taken advantage of multiple processors available at the same time. The recently popular term "cloud computing" is not scientifically precise, but it is generally understood to imply a flexible architecture, on-demand calculations, and lower costs.
Despite objections related to privacy, cloud computing is one of the fastest-growing elements of today's technology [3]. It provides both flexibility and performance [6]. Top IT enterprises (e.g. Amazon, Microsoft, Google) are developing datacenters with hundreds of thousands of computing nodes to process huge datasets. Academic users are also intensively investigating how the new architecture could be applied to scientific problems [1].
It cannot be denied that cloud computing has commercial roots. It is worth mentioning, however, that similar systems can be created in educational or community-based environments. Volunteer computing, for example, is based on thousands of private machines working together to solve important, large-scale problems [5].
In this paper we verify the usability and overall performance of the cloud computing scheme on a real engineering problem. Variability analysis of a simple electromagnetic model is used as an exemplary problem for a MapReduce-based massive simulation platform. The results are compared with a classical single-processor algorithm.

MapReduce processing
Cloud computing is closely connected with the MapReduce distributed processing model. The idea behind MapReduce is to automatically split processing into containers of datasets, which can be distributed over many independent computing nodes. Each node is a computational unit: it can be a high-power rack server, a standard computer, or even a single virtual machine.
During a MapReduce operation (as shown in Fig. 1), the master node first splits InputData into datasets and distributes them among all working nodes. This stage also makes copies of all data to provide redundancy.
The second step runs the application that transforms the split InputData parts into intermediate key-value data for further processing. This process is called Mapping. A Mapper task splits the data further and works on the smaller datasets in its own allocated memory. The output of the Mapper stage is a set of unsorted data. Because Mapping processes all data and each Mapper works independently, we do not know when and where a specific InputData dataset is processed. The next phase can start only when all Mappers have finished processing their assigned parts of the data. Once all data are processed, the Mappers are terminated to release memory.
The third step of the MapReduce algorithm is called "shuffle and sort". During this phase, Reducers (also launched on working nodes) start to work and fetch data from the partitioned Mapper outputs. This resembles looking up values (small results) by their indexes and arranging them into a numerically ordered list.
When all Mapper outputs are indexed and sorted into blocks of acceptable size, they are ready for the final merge. This last step is called "reduce". During this stage, Reducers use the ordered list and save all data into a single final output file. Such a scheme is universal and allows a wide range of problems to be solved efficiently.
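The four phases described above (split, map, shuffle and sort, reduce) can be sketched in a few lines of Python. This is a minimal single-process illustration, not Hadoop code: the function names (`split_input`, `run_mapreduce`, `word_mapper`, `count_reducer`) and the word-count task are assumptions chosen only for the example, and the redundancy copies made by the master node are not modelled.

```python
from collections import defaultdict


def split_input(records, n_splits):
    """Phase 1: split the input into roughly equal chunks, one per mapper."""
    return [records[i::n_splits] for i in range(n_splits)]


def run_mapreduce(records, mapper, reducer, n_splits=4):
    """Run the four MapReduce phases sequentially in a single process."""
    splits = split_input(list(records), n_splits)

    # Phase 2 (map): every chunk is processed independently and
    # emits unsorted (key, value) pairs.
    intermediate = []
    for chunk in splits:
        for record in chunk:
            intermediate.extend(mapper(record))

    # Phase 3 (shuffle and sort): group the values by key.
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Phase 4 (reduce): aggregate each group, visiting keys in sorted order,
    # and merge everything into one final output.
    return {key: reducer(key, groups[key]) for key in sorted(groups)}


def word_mapper(line):
    """Hypothetical mapper: emit (word, 1) for every word in a line."""
    for word in line.split():
        yield word.lower(), 1


def count_reducer(key, values):
    """Hypothetical reducer: sum the counts collected for one key."""
    return sum(values)


result = run_mapreduce(["a b a", "b c", "a"], word_mapper, count_reducer, n_splits=2)
print(result)
```

In a real cluster the chunks in phase 2 would be processed on different nodes at the same time; the sequential loop here only shows the data flow.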
There are a number of numerical algorithms in which a simulator is run repeatedly for many different parameters. Stochastic optimization, evolutionary algorithms, Monte Carlo methods, and variability analysis are only the most popular examples [2]. As described above, the MapReduce model consists of two main stages: mapper and reducer. The first of them is directly connected with the input data, which makes it suitable for processing many similar simulations.
A straightforward approach to using MapReduce for variability analysis is to replace the mapper with the problem simulator and the input data with the problem parameters [4]. The MapReduce framework then spreads the calculations over all available computing nodes. Proper use of the reducer stage, whose role is to aggregate the mapper results, is the key to an optimal implementation.
Depending on the type of algorithm, the reducer can be used to choose the optimal solution, find the best individual in a population, calculate a histogram of solutions, or perform other tasks.
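The mapping of a parameter study onto the two stages can be illustrated as follows. This is only a sketch under stated assumptions: `simulate` is a hypothetical closed-form stand-in for the problem simulator (it is not an FEM solver and has no physical meaning), the input format of one "x y" pair per line is assumed, and the code runs in plain Python without any MapReduce framework.

```python
import math
from collections import Counter


def simulate(x, y):
    """Hypothetical stand-in for the simulator (NOT a real FEM solution)."""
    return 1.0 + 0.5 * math.exp(-((x - 0.5) ** 2 + (y - 0.5) ** 2))


def mapper(line):
    """One input line is one scenario 'x y'; emit (parameters, result)."""
    x, y = map(float, line.split())
    return (x, y), simulate(x, y)


def histogram_reducer(pairs, bin_width=0.2):
    """Aggregate the mapper results into a histogram {bin index: count}."""
    counts = Counter(math.floor(value / bin_width) for _, value in pairs)
    return dict(sorted(counts.items()))


def best_reducer(pairs):
    """Alternative reducer: pick the scenario with the smallest result."""
    return min(pairs, key=lambda pair: pair[1])


scenarios = ["0.5 0.5", "0.0 0.0", "1.0 1.0", "0.5 0.4"]
mapped = [mapper(line) for line in scenarios]  # distributed over nodes in practice
histogram = histogram_reducer(mapped)
best = best_reducer(mapped)
print(histogram, best)
```

Swapping `histogram_reducer` for `best_reducer` changes the kind of analysis without touching the mapper, which is why the choice of reducer matters for the implementation.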

Hadoop basics
The MapReduce model has many implementations, such as those in MongoDB, Riak, and DiscoProject. The most popular one, however, is the Hadoop project [8], developed as open source by the Apache Software Foundation.
The Hadoop project is an advanced, modular environment based on the Java Virtual Machine (JVM), which provides many benefits: multiplatform support, support for scripts in multiple languages, and protection of code (tasks) during execution. The Hadoop platform also offers hardware utilization optimization and redundancy management, which are important because cloud clusters are usually built from low-cost, failure-prone machines.

Numerical experiment
To test the platform efficiency we constructed a variability analysis of a simple electromagnetic problem. As shown in Fig. 2, it is based on a square conductor in which a circular hole is made. The location of the hole obviously influences the resistance of the conductor, but the relation is nonlinear and nontrivial. We used the MapReduce model to conduct the analysis of the problem.
First, we created a file with 100×100 scenarios, one for each position (x, y) of the hole center, for which the resistance is calculated. The points were generated with a uniform method, which provides an even distribution of the parameters inside the given range. Having the solution for the scalar potential, the resistance R can then be computed (Eq. 2).
In our test problem the reducer is not intended to provide a statistical function; it is a simple input-output rewriter. For the efficiency investigation, the final result is the computation time needed to process all scenarios.
A single simulation (the solution of one FEM problem) takes about 4 seconds in our experiment on a generic, modern computer. To complete the whole analysis, a total of 101×101 = 10201 scenarios has to be solved. This means that simple serial processing takes 40804 seconds (11 hours, 20 minutes and 4 seconds). Reducing this time is the main reason for applying the Hadoop platform.
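The serial processing time quoted above follows directly from the scenario count and the per-simulation time, as the short check below shows. The variable names are chosen for the example; the 4-second figure is the approximate value given in the text.

```python
points_per_axis = 101          # grid size quoted in the text
seconds_per_simulation = 4     # approximate time of one FEM solution

n_scenarios = points_per_axis ** 2
serial_seconds = n_scenarios * seconds_per_simulation

# Convert the total into hours, minutes and seconds.
hours, remainder = divmod(serial_seconds, 3600)
minutes, seconds = divmod(remainder, 60)

print(f"{n_scenarios} scenarios -> {serial_seconds} s "
      f"= {hours} h {minutes} min {seconds} s")
# prints: 10201 scenarios -> 40804 s = 11 h 20 min 4 s
```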

Hadoop platform configuration
The test platform is a cluster of four physical servers. Each has two quad-core Intel Xeon E5620 CPUs running at 1.6 GHz and a 256 GB SSD drive. The total memory available for Hadoop was 24 GB. The nodes were connected by gigabit Ethernet and ran 64-bit GNU/Linux. Hadoop 2.6.0, the newest release at the time, was installed.

Fig. 3. Optimization of number of vCores based on computation time
One of the servers is devoted to administration of the cluster, so the total number of available computational cores is 24. Hadoop can be configured to use any number of vCores. Results for different settings are presented in Fig. 3. The best results are obtained for 22 cores. The general rule is that, for the sake of platform stability and the ApplicationMaster process, two cores have to be dedicated to internal Hadoop management.

Hadoop performance
The next performance experiment is designed to determine the scalability of the platform. A series of simulations for different numbers of scenarios was run, and the total computation time was measured. Results are presented in Fig. 4a for very small cases (<25 scenarios) and in Fig. 4b for larger problems. For comparison, results from a single computer (PC) are also plotted. As seen in Fig. 4, the Hadoop startup time is about 20 seconds, so for fewer than 6 scenarios (4 seconds each) the single computer is faster and there is no need to use a cloud-based computing environment. For larger problems the cloud platform shows its advantage. The scaling plot is very close to a linear function, whose slope is controlled by the number of processing cores attached to Hadoop.
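The break-even point can be reproduced with a very simple timing model. The model is an assumption made for illustration, not the measured behaviour: every scenario is taken to cost exactly 4 s, Hadoop adds a fixed 20 s startup, and scenarios run in waves of 22 on the available cores, with all other overhead ignored.

```python
import math

SIM_SECONDS = 4.0       # approximate time of one simulation (from the text)
STARTUP_SECONDS = 20.0  # approximate Hadoop startup time (from the text)
CORES = 22              # cores used for computation


def pc_time(n):
    """Idealized serial time for n scenarios on a single computer."""
    return n * SIM_SECONDS


def hadoop_time(n, cores=CORES):
    """Idealized Hadoop time: fixed startup plus waves of `cores` scenarios."""
    return STARTUP_SECONDS + math.ceil(n / cores) * SIM_SECONDS


# Smallest number of scenarios for which Hadoop is no slower than the PC.
break_even = next(n for n in range(1, 1000) if hadoop_time(n) <= pc_time(n))
print(break_even)
```

Within this model the single computer wins for fewer than 6 scenarios, which agrees with the observation above. For larger n the slope of `hadoop_time` is roughly `SIM_SECONDS / cores`, which is why the number of cores controls the slope of the scaling plot.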
The most spectacular speed-ups are observed for large numbers of scenarios. For instance, for 1066 scenarios the single-PC simulation time is about 4300 seconds, whereas Hadoop in the 22-core configuration needs only 275 seconds. This gives an impressive performance gain of more than 15 times. On the other hand, it is still below the theoretical maximum speed-up, which is 22.
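The speed-up and its relation to the theoretical maximum can be checked with the standard definitions of parallel speed-up and efficiency. The helper names below are chosen for the example; the timings are the approximate values quoted above.

```python
def speedup(t_serial, t_parallel):
    """Speed-up: serial time divided by parallel time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, cores):
    """Fraction of the theoretical maximum speed-up (equal to `cores`)."""
    return speedup(t_serial, t_parallel) / cores


t_pc = 4300.0      # single PC, 1066 scenarios (approximate, from the text)
t_hadoop = 275.0   # Hadoop with 22 cores (from the text)
cores = 22

s = speedup(t_pc, t_hadoop)
e = efficiency(t_pc, t_hadoop, cores)
overhead = 1.0 - e  # share of the theoretical performance that is lost

print(f"speed-up: {s:.1f}x, efficiency: {e:.0%}, overhead: {overhead:.0%}")
# prints: speed-up: 15.6x, efficiency: 71%, overhead: 29%
```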
Hadoop is a complicated computing engine providing flexibility, fault control, and other features for managing very large problems. It is expected to introduce some overhead, but one can ask how large this overhead is and how it scales. To address this issue we ran a set of larger problems, reaching up to 10,000 scenarios. As one can see, as the number of scenarios grows, the overhead goes down. This is a very desirable effect, which allows us to use nearly 80% of the theoretical platform performance for 10,000 scenarios. At the same time, we can predict that even for 1 million scenarios the overhead will remain above 10%. This level of overhead cannot be neglected, but it can be explained as a necessary cost of cloud computing flexibility.

Conclusions
The presented experiments have shown that cloud computing is an attractive solution for selected scientific analyses. The MapReduce scheme can easily be used to efficiently manage massive numbers of independent simulations.
The Apache Hadoop platform connected with an electromagnetic solver implemented with FEM showed nearly perfect scalability. On the other hand, we observed overhead at the level of 20%. It can be justified by the unique features of Hadoop, but we believe that more effort at the configuration level should reduce this value. Performance and overhead issues are very problem-specific. In our variability analysis an external application is called intensively, which is not typical for MapReduce, so the results should not be projected onto different types of problems.
Cloud computing was inspired by the economic aspects of calculations, so the financial costs of the presented solution should also be discussed. Cloud service prices are steadily decreasing, and even now they are below 0.05 USD per server-hour (see Table 1). This means that the total cost of our 10,000-scenario experiment would be about 5 USD. This amount is practically negligible compared to the investment in a classical cluster. We can therefore conclude that migrating scientific problems to cloud technologies can save both time and money.