ABSTRACT
We expect the size and complexity of future supercomputers to increase on their path to exascale systems and beyond. System software therefore has to adapt to this complexity in order to simplify the development of scalable applications. In this paper, we present a unikernel operating system design for HPC. It extends the multi-kernel approach while providing better programmability and scalability for hierarchical systems, such as HLRS' Hazel Hen, which are based on multiple cluster-on-a-chip processors. We demonstrate the scalability of the design with microbenchmarks, using HermitCore, our prototype implementation of the new design, as an example.
- HermitCore: A Unikernel for Extreme Scale Computing