Abstract
Data-driven application systems often depend on complex, data-intensive programs that operate on distributed datasets originating from a variety of scientific instruments and repositories. These programs provide time-critical responses to observed phenomena in areas of science such as weather warning systems, seismology, and ocean sciences. A major challenge for these observational science application systems is integrating data into the scientist’s workflow and enabling these workflows to leverage advanced networking and distributed computational capabilities to analyze real-time data streams. Such capabilities become even more imperative for dynamic data-driven applications systems (DDDAS). In this chapter, we present DyNamo, a network-centric platform that addresses some of the critical challenges faced by dynamic data-driven workflows. DyNamo enables high-performance, adaptive, performance-isolated data flows across distributed cloud computing resources and community data repositories for analyzing data in observational science applications. It can dynamically provision appropriate computing, networking, and storage resources from diverse, national-scale cyberinfrastructures (CI). Through easy-to-use interfaces and integration with the Pegasus Workflow Management System, DyNamo automates the orchestration of data-driven science workflows on the provisioned infrastructures, offering capabilities that are crucial for supporting DDDAS environments.
Notes
- 1. A metric to express the intensity of precipitation.
- 2. Also known as the data link layer, which is the second level in the seven-layer open systems interconnection (OSI) reference model for network protocol design.
- 3. OpenFlow is a communications protocol that gives access to the forwarding plane of a network switch or router over the network.
Acknowledgements
This work is funded by NSF award #1826997. We thank Mert Cevik (RENCI) and the engineers from UNT and LEARN for the UNT stitchport setup. Results in this book chapter were obtained using the Chameleon and ExoGENI testbeds, both supported by NSF.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Papadimitriou, G. et al. (2023). Dynamic Network-Centric Multi-cloud Platform for Real-Time and Data-Intensive Science Workflows. In: Darema, F., Blasch, E.P., Ravela, S., Aved, A.J. (eds) Handbook of Dynamic Data Driven Applications Systems. Springer, Cham. https://doi.org/10.1007/978-3-031-27986-7_32
DOI: https://doi.org/10.1007/978-3-031-27986-7_32
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-27985-0
Online ISBN: 978-3-031-27986-7
eBook Packages: Computer Science, Computer Science (R0)