DOI: 10.1145/3383583.3398589

Streaming Analytics and Workflow Automation for DFS

Published: 01 August 2020

ABSTRACT

Researchers reuse data from past studies to avoid costly re-collection of experimental data. However, large-scale data reuse is challenging due to a lack of consensus on metadata representations among research groups and disciplines. The Dataset File System (DFS) is a semi-structured data description format that promotes such consensus by standardizing the semantics of data description, storage, and retrieval. In this paper, we present analytic-streams, a specification for streaming data analytics with DFS, and streaming-hub, a visual programming toolkit built on DFS to simplify data analysis workflows. Analytic-streams facilitate higher-order data analysis with less computational overhead, while streaming-hub enables storage, retrieval, manipulation, and visualization of data and analytics. We discuss how they simplify data pre-processing, aggregation, and visualization, and their implications for data analysis workflows.
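
To make the idea of higher-order streaming analytics concrete, here is a minimal sketch in Python. It is not the analytic-streams specification or any DFS API; the function names `sliding_mean` and `threshold_events` and the sample values are hypothetical, chosen only for illustration. It assumes that an analytic can be modeled as a generator that consumes one stream and lazily emits another, so a derived analytic can be stacked on top without storing or recomputing the full dataset.

```python
from collections import deque
from typing import Iterable, Iterator


def sliding_mean(samples: Iterable[float], window: int) -> Iterator[float]:
    """Yield the mean of the most recent `window` samples (hypothetical analytic).

    The running total is updated incrementally, so each new sample costs
    O(1) work instead of re-summing the whole window.
    """
    buf: deque = deque()
    total = 0.0
    for x in samples:
        buf.append(x)
        total += x
        if len(buf) > window:
            total -= buf.popleft()
        yield total / len(buf)


def threshold_events(stream: Iterable[float], limit: float) -> Iterator[bool]:
    """Derive a second stream from an existing analytic stream (hypothetical)."""
    for value in stream:
        yield value > limit


# Illustrative input only; not data from the paper.
raw = [1.0, 2.0, 3.0, 4.0, 5.0]
means = list(sliding_mean(raw, window=2))
flags = list(threshold_events(sliding_mean(raw, window=2), limit=3.0))
```

Because both functions are generators, chaining them processes one sample at a time; this is one plausible way a stream of analytics could feed further analytics with little extra overhead.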


Published in

JCDL '20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020
August 2020
611 pages
ISBN: 9781450375856
DOI: 10.1145/3383583

Copyright © 2020 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

poster

Acceptance Rates

Overall Acceptance Rate: 415 of 1,482 submissions, 28%
