3. Materials and Methods
The concept underlying the DTAM algorithm proposed in this paper is shown in Figure 3. The system periodically analyses information about the user input data, such as the sizes and numbers of files. The IDPCP algorithm uses the change in the intermediate-data volume assigned to each slave node to predict the numbers of Map and Reduce tasks needed in the future and to dynamically adjust the schedule. The master node analyses the digestibility of the user input data, and as each slave node periodically returns information on its Map and Reduce operations, the algorithm predicts the numbers of Map and Reduce tasks each slave node will require in the future. The task numbers are then adjusted dynamically within the limit on the number of tasks set by the system administrator. The DTAM algorithm is based on the assumption that the sum of the upper limits on the numbers of Map and Reduce tasks remains unchanged.
In the example shown in Figure 3, the DTAM algorithm adjusts the upper limits on the numbers of Map and Reduce tasks for Slave Node 1 to five and seven, respectively, while for Slave Node 2, these values are adjusted to seven and five. The maximum numbers of Map and Reduce tasks for Slave Node 3 are adjusted to seven and five, respectively, and the values for Slave Node 4 are adjusted to three and nine. By dynamically adjusting the numbers of Map and Reduce tasks, the DTAM algorithm can allocate cloud computing resources where they are required without increasing the overall processing load, and can mitigate the problem of intermediate-data skew in the operation of the cloud MapReduce framework. In this example, the DTAM adjusts the upper limit on the number of Reduce tasks for Slave Node 4 to nine. Slave Node 4 is executing six Reduce tasks, more than any other slave node; hence, to prevent Slave Node 4 from becoming a laggard in the future, which would prolong the completion time of small-scale cloud application tasks, the DTAM raises its Reduce task limit to speed up completion and lowers its Map task limit to lessen its processing load. The DTAM also raises the Map task limits of the other slave nodes so that the overall input-data digestibility does not fall.
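A quick check of the constant-sum assumption using the limits quoted above: Slave Node 1 has 5 + 7 = 12, Slave Node 2 has 7 + 5 = 12, Slave Node 3 has 7 + 5 = 12 and Slave Node 4 has 3 + 9 = 12, so every node keeps the same total task limit of twelve while only the split between Map and Reduce tasks changes.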
The IDPCP algorithm calculates the wave number, Wn, for the volume of user input data that will be digested in the future, based on the quantity of user input data and the upper limit on the number of Map tasks for each slave node. The volume of intermediate data that the Map tasks on each slave node will generate in the future is given by Equation (1), where Fn is the total number of files inputted by the user; Sn is the total number of slave nodes; Mdn is the upper limit on the number of Map tasks for each slave node; and n denotes a specific cloud application process. The IDPCP algorithm periodically predicts the wave number for input-data processing in order to adjust the numbers of Map and Reduce tasks appropriately. The algorithm uses an exponential smoothing method to predict the intermediate-data volume for each slave node, as shown in Equation (2): the actual intermediate-data volume, It, generated by each slave node in the current cycle is weighted by a smoothing coefficient α, and the intermediate-data volume predicted for the current cycle, Ip, is weighted by (1 − α), giving the predicted intermediate-data volume, Ip+1, for the next cycle. To prevent a fixed smoothing coefficient from slowing convergence or causing drastic changes, the IDPCP algorithm uses Equation (3) to limit the dynamic adjustment range of α.
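Since Equations (1)–(3) are not reproduced in the text, the following is only a minimal sketch of the behaviour they describe: an exponentially smoothed prediction of each slave node's intermediate-data volume, a clamp on the adjustment range of α, and one plausible reading of the wave-number estimate. The clamp bounds, the wave-number formula and all names are illustrative assumptions, not the paper's definitions.

```python
import math

def predict_intermediate_volume(i_actual, i_predicted_prev, alpha):
    """Exponential smoothing of the per-node intermediate-data volume
    (the update described for Equation (2)): the actual volume It of the
    current cycle is weighted by alpha, the previous prediction Ip by (1 - alpha)."""
    return alpha * i_actual + (1 - alpha) * i_predicted_prev

def clamp_alpha(alpha, lower=0.2, upper=0.8):
    """Keep the smoothing coefficient inside a fixed window so a single cycle
    cannot cause drastic changes (the role of Equation (3)); the bounds
    0.2 and 0.8 are illustrative assumptions, not the paper's values."""
    return max(lower, min(upper, alpha))

def estimate_wave_count(total_files, slave_nodes, map_limit):
    """One plausible reading of the wave number Wn: how many rounds of Map
    tasks are needed for Sn slave nodes, each running at most Mdn Map tasks,
    to digest Fn input files (assumed form, not quoted from Equation (1))."""
    return math.ceil(total_files / (slave_nodes * map_limit))

# Example: 240 input files, 4 slave nodes, at most 6 Map tasks per node.
waves = estimate_wave_count(240, 4, 6)             # -> 10 waves
alpha = clamp_alpha(0.9)                            # -> clipped to 0.8
next_volume = predict_intermediate_volume(512.0, 480.0, alpha)  # -> 505.6
```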
The DTAM predicts the intermediate-data volumes and uses a deviation formula to calculate the upper limits on the numbers of Map and Reduce tasks required. The algorithm calculates the average intermediate-data volume that each slave node in the small-scale MapReduce cloud architecture must process and obtains the standard deviation, Iσ, of the intermediate-data volume, as shown in Equation (4), where xi is the predicted intermediate-data volume for each slave node, μ is the average intermediate-data volume that all slave nodes must process, and N is the total number of slave nodes. Based on this standard deviation, the DTAM algorithm finds the deviation percentage, Pi, of the intermediate-data volume that each slave node needs to process in the next cycle, as shown in Equation (5). The method used in the DTAM to calculate the adjustment to the number of Reduce tasks is shown in Equation (6). The maximum number of Map tasks, Mdn, and the maximum number of Reduce tasks, Rdn, for each slave node are added together to give a total limit. The algorithm then calculates the ratio between this sum and the intermediate-data deviation obtained from Equation (5) to give the adjusted value. Since each slave node handles different input data, we also introduce a buffer parameter: for the system-default input-data size of each slave node, Fds, and the default input-data size of the MapReduce application, Fapp, the ratio is integrated with the difference between the estimated number of Reduce tasks, Rvn, and the system-default number of Reduce tasks, Rdn. In this way, we obtain the upper limit, Rvnp, on the number of Reduce tasks for each slave node in the next cycle. The corresponding adjustment to the number of Map tasks is shown in Equation (7); this gives Mvnp, the upper limit on the number of Map tasks for each slave node in the next cycle, which is used to dynamically adjust the numbers of Map and Reduce program tasks and alleviate the performance degradation caused by laggard nodes. The DTAM proposed in this paper performs this prediction for all n slave nodes, estimating the numbers of Map and Reduce tasks for each node individually; the time complexity of the dynamic task-volume adjustment mechanism is therefore O(n).
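Equations (4)–(7) are likewise not reproduced here, so the sketch below only illustrates the general mechanism they describe: compute the spread of the predicted per-node intermediate-data volumes, derive a normalised deviation for each node (standing in for Pi), and shift the split between the Map and Reduce limits while keeping Mdn + Rdn constant. The scaling factor, the rounding and the omission of the Fds/Fapp buffer parameter are simplifying assumptions, not the paper's formulas.

```python
import statistics

def deviation_values(predicted_volumes):
    """In the spirit of Equations (4)-(5): standard deviation of the predicted
    per-node intermediate-data volumes, then each node's deviation from the
    mean normalised by that standard deviation (a stand-in for Pi)."""
    mu = statistics.fmean(predicted_volumes)
    sigma = statistics.pstdev(predicted_volumes)  # population std dev over the N nodes
    if sigma == 0:
        return [0.0] * len(predicted_volumes)
    return [(x - mu) / sigma for x in predicted_volumes]

def adjust_limits(map_limit, reduce_limit, p_i, scale=2):
    """A simplified stand-in for Equations (6)-(7): nodes predicted to receive
    more intermediate data than average get a larger Reduce limit and a smaller
    Map limit, while the total Mdn + Rdn stays unchanged. The 'scale' factor
    and the rounding are illustrative assumptions."""
    total = map_limit + reduce_limit
    shift = round(p_i * scale)
    reduce_next = min(total, max(0, reduce_limit + shift))
    return total - reduce_next, reduce_next  # (Mvnp, Rvnp)

# Example: four slave nodes with default limits Mdn = Rdn = 6.
volumes = [400.0, 520.0, 610.0, 870.0]
limits = [adjust_limits(6, 6, p) for p in deviation_values(volumes)]
# -> [(8, 4), (7, 5), (6, 6), (3, 9)]: the heavily loaded node gets more Reduce tasks.
```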
4. Implementation of the DTAM Algorithm
In this work, we systematically implemented a traditional MapReduce software architecture for small-scale cloud applications based on the recommendations in the literature [16,17,18,19]. With reference to the Hadoop MapReduce framework, the DTAM was implemented for this small-scale cloud architecture in the PHP and C programming languages. In the small-scale cloud system used in this paper, a PHP program was used to realise the deployment task module, and a C program was used as the runtime system to transmit data and execute the cloud application tasks. The same system was installed on each slave node, and the same environmental parameters were used. The implementation of the system is illustrated in Figure 4.
Before the system can execute a small-scale cloud application using the MapReduce framework, it must determine the relevant parameter settings and the content of the configuration file, which contains settings such as the location of the input-data source, the number of operating slave nodes, the maximum numbers of Map and Reduce tasks for each slave node, the application to be executed, the number of intermediate data and the key value used to partition the intermediate data for starting the Reduce tasks. The master module on the master node reads the configuration file to determine the initial settings; it then communicates with the runtime system of each slave node to confirm that the application program can operate normally and, at the same time, clears the input data of each slave node to ensure that no unprocessed input data remain. The master module confirms the name of the input file and receives periodic information from each slave node indicating the number of available Map tasks. Based on the IP address of each slave node, the master module sends commands to its Map function to perform the Map tasks. Since each slave node periodically returns its number of available Map tasks to the master module, the runtime system is used as a bridge to share the IP address of each node with the others. Using this IP address information from the runtime system, the input files held by the master module are distributed to the Map tasks of each slave node as input data. After receiving these input data, the Map task on each slave node executes the Map function application and generates intermediate data.
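The paper does not give the format of this configuration file; the following is a purely hypothetical sketch that collects the settings named above into a plain key–value file (all field names and values are illustrative assumptions).

```
# Hypothetical configuration file; field names are illustrative only.
input_source      = /data/wordcount/input/   # location of the input-data source
slave_nodes       = 4                        # number of operating slave nodes
map_task_limit    = 6                        # Mdn: maximum Map tasks per slave node
reduce_task_limit = 6                        # Rdn: maximum Reduce tasks per slave node
application       = word_count               # cloud application to be executed
intermediate_num  = 24                       # number of intermediate data
partition_key     = hash                     # key value used to partition intermediate data
```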
After each slave node has processed its Map task, it notifies the master module that the task is complete. Based on the K value (key) of the intermediate data, the master module then assigns the available Reduce tasks to specific slave nodes for subsequent processing, and the data are allocated to each slave node as the intermediate data of its Reduce task so that the Reduce function program can be executed. When each slave node has processed its Reduce task, a result file is generated and sent to the master module for integration. On completion of the Reduce tasks, the master module notifies each slave node to return its result file, which is used to generate the output file. We implemented our algorithm, called DTAM with IDPCP, based on this traditional MapReduce architecture for small-scale cloud applications, as shown in Figure 5. The system uses the IDPCP algorithm to obtain information on the input files and combines it with the maximum number of Map tasks for each slave node to calculate the wave number for the input files to be digested in the future. The intermediate-data information generated by the runtime system of each slave node for its Map task is recorded by the master module (‘Record Intermediate Data’ in Figure 5). The implementation uses the size of the occupied memory space as the accumulated record of intermediate data and predicts the volume of intermediate data to be generated by each slave node in the next cycle. The master module uses this information to set the upper limits on the numbers of Map and Reduce tasks for each slave node, in order to improve processing efficiency and avoid the performance degradation caused by laggard nodes.
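The paper only states that Reduce tasks are assigned on the basis of the K value of the intermediate data; the minimal sketch below assumes a simple hash partition over the slave-node IP addresses purely to make that idea concrete (the actual partitioning rule is not specified in the text).

```python
def assign_reduce_node(key, slave_node_ips):
    """Assign an intermediate-data key to a slave node for the Reduce phase.
    A plain hash partition is assumed here for illustration; the paper only
    says assignment is based on the K value of the intermediate data."""
    return slave_node_ips[hash(key) % len(slave_node_ips)]

# Example: route intermediate records produced by the Map tasks to Reduce nodes.
nodes = ["192.168.0.11", "192.168.0.12", "192.168.0.13", "192.168.0.14"]
intermediate = [("alpha", 1), ("beta", 1), ("alpha", 1), ("gamma", 1)]
routing = {}
for key, value in intermediate:
    routing.setdefault(assign_reduce_node(key, nodes), []).append((key, value))
```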
5. Results and Discussion
The experimental environment was based on nine identical computers with the following specifications: AMD Athlon II X6 1055T CPU, 4 GB memory, 500 GB hard disk and an Ethernet card. Because the CPU had six cores, we assigned a value of six to both Mdn and Rdn in all experiments. The cloud applications used in the experiments were Word Count, All Unique Combinations, Inverted Index, Radix Sort and Session Mean Value calculations. The input-data types of the Word Count and All Unique Combinations applications were numbered from 001 to 008; the numbers of data files were 96, 144, 192 and 240; and files of sizes 4, 8, 12 and 16 MB were used. We analysed the performance of our system and compared it with a simulation of a traditional Hadoop (Google Cloud platform) MapReduce framework. The Word Count application counts the occurrences of all words in the input-data files. The Map task program generates intermediate data (such as Key = a, Value = 1) and records this information, and the Reduce task program processes all the intermediate-data files from each slave node: the summation calculation generates partial output files, and, finally, the master node unifies all of the partial output files from the slave nodes to obtain the final result file.
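As a concrete illustration of the Map and Reduce functions just described, the following is a minimal sketch of the Word Count benchmark; it is not the paper's C implementation, and the function names and structure are illustrative only.

```python
from collections import defaultdict

def word_count_map(file_contents):
    """Map function: emit an intermediate (Key = word, Value = 1) pair
    for every word in the input-data file."""
    return [(word, 1) for word in file_contents.split()]

def word_count_reduce(intermediate_pairs):
    """Reduce function: sum the values of identical keys from the
    intermediate-data files to produce a partial output file."""
    counts = defaultdict(int)
    for word, value in intermediate_pairs:
        counts[word] += value
    return dict(counts)

# The master node would finally merge the partial outputs of all slave nodes.
partial = word_count_reduce(word_count_map("a b a c a"))  # {'a': 3, 'b': 1, 'c': 1}
```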
As shown in Figure 6, for files of size 4 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by 9.54%, 6.58%, 4.46% and 10.73%, respectively, and the overall average was better than that of the traditional method by 7.83%. For files of size 8 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by 3.34%, 4.48%, 6.08% and 7.93%, respectively, and the overall average was better than that of the traditional method by 5.48%. For files of size 12 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by 4.81%, 5.50%, 3.50% and 15.53%, respectively, and the overall average was better than that of the traditional method by 10.33%. For files of size 16 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by −7.01%, 4.20%, 17.08% and 12.61%, respectively, and the overall average was better than that of the traditional method by 6.72%. It can be seen from these experimental results that the proposed DTAM algorithm can reduce the problem of end-time delay by dynamically adjusting the numbers of Map and Reduce tasks to disperse the workload of a busy slave node.
As shown in Figure 7, for files of size 4 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by −7.09%, 19.63%, 18.85% and 22.51%, respectively, and the overall average was better than that of the traditional method by 13.48%. For files of size 8 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by −4.06%, 18.49%, 21.15% and 23.37%, respectively, and the overall average was better than that of the traditional method by 14.74%. For files of size 12 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by −5.49%, 13.74%, 25.99% and 25.76%, respectively, and the overall average was better than that of the traditional method by 15.00%. For files of size 16 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by 2.83%, 7.52%, 17.87% and 22.36%, respectively, and the overall average was better than that of the traditional method by 12.65%. From these results, it can be observed that the DTAM can increase the workload of idle slave nodes by dynamically adjusting the numbers of Map and Reduce program tasks to reduce the problem of end-time delay.
The All Unique Combinations application obtains all non-repeating combinations of the input-data numbers. The Map program generates intermediate data (such as Key = 1, Value = 1) and records this information. In the Reduce program, all non-repeating values from the intermediate-data files of the slave nodes are combined to generate all permutations and combinations and to create partial output files. Finally, the master node unifies all the partial output files from the slave nodes and deletes identical entries to obtain the final result file.
As shown in Figure 8, for files of size 4 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by 5.86%, 3.23%, 9.92% and 8.88%, respectively, and the overall average was better than that of the traditional method by 6.97%. For files of size 8 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by 11.54%, 17.51%, 12.37% and 15.40%, respectively, and the overall average was better than that of the traditional method by 14.20%. For files of size 12 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by 5.87%, 14.95%, 8.97% and 7.11%, respectively, and the overall average was better than that of the traditional method by 9.23%. For files of size 16 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by 7.30%, 10.56%, 6.53% and 15.27%, respectively, and the overall average was better than that of the traditional method by 9.92%.
As shown in Figure 9, for files of size 4 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by −3.73%, 20.97%, 18.99% and 16.92%, respectively, and the overall average was better than that of the traditional method by 13.29%. For files of size 8 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by −1.80%, 28.80%, 32.37% and 27.63%, respectively, and the overall average was better than that of the traditional method by 21.75%. For files of size 12 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by 18.61%, 10.30%, 30.41% and 34.61%, respectively, and the overall average was better than that of the traditional method by 23.48%. For files of size 16 MB, the performance of the DTAM on 96, 144, 192 and 240 files was better than that of Hadoop by −1.98%, 20.85%, 16.51% and 20.97%, respectively, and the overall average was better than that of the traditional method by 14.09%. It can be seen from these experimental results that the proposed DTAM algorithm can reduce the overall end-time delay by dynamically adjusting the numbers of Map and Reduce program tasks to evenly distribute the workloads of the slave nodes.
The Inverted Index application finds the locations of specific strings in the input files. In the Map task, the words of the input data and their position information are processed, and an intermediate-data file is generated. In the Reduce task program, the string positions are recorded in a partial output file. Finally, the master node unifies all partial output files from the slave nodes to obtain the final search result file. As shown in Figure 10, the DTAM performed better than Hadoop by 8.70%, 9.17%, 7.76% and 8.03% for files of sizes 0.5, 1, 1.5 and 2 MB, respectively. As shown in Figure 11, the DTAM performed better than Hadoop by 7.13%, 9.89%, 8.75% and 11.56% for files of sizes 0.5, 1, 1.5 and 2 MB, respectively. These results show that the proposed DTAM algorithm can mitigate the performance degradation of small-scale cloud applications caused by the phenomenon of laggards by dynamically adjusting the numbers of Map and Reduce program tasks. The Radix Sort application sorts the input-data content: the Map program segments the input-data content values and generates intermediate-data files, and when the Reduce program has completed the sorting job for each content value, the master node unifies all partial output files from the slave nodes to give the final result file.
As shown in Figure 12, the DTAM performed better than Hadoop by 5.72%, 6.90%, 6.69% and 7.36% for files of sizes 0.5, 1, 1.5 and 2 MB, respectively. As shown in Figure 13, the DTAM performed better than Hadoop by 14.82%, 14.75%, 13.04% and 14.58% for files of sizes 0.5, 1, 1.5 and 2 MB, respectively. These experimental results indicate that our DTAM with IDPCP algorithm successfully adjusts the numbers of Map and Reduce tasks to mitigate the performance degradation of small-scale cloud applications caused by the phenomenon of laggards. The Session Mean Value application averages the input data: the Map task sums identical values and counts them, while in the Reduce task, the results of all the intermediate-data files from the slave nodes are summed directly. Finally, the files are unified and averaged by the master node to give the final result. As shown in Figure 14, the DTAM performed better than Hadoop by 13.92%, 13.92%, 12.27% and 15.06% for files of sizes 0.5, 1, 1.5 and 2 MB, respectively. As shown in Figure 15, the DTAM performed better than Hadoop by 4.47%, 6.00%, 4.21% and 5.96% for files of sizes 0.5, 1, 1.5 and 2 MB, respectively. Our experimental results show that the DTAM with IDPCP algorithm can provide an improvement while remaining within the maximum numbers of tasks set for each slave node, thereby limiting the laggard-node phenomenon that degrades the performance of small-scale cloud applications.