ABSTRACT
The volume of time-sensitive heterogeneous data in industry-grade datalakes is growing at an unprecedented pace. In response, we propose Synerise Monad, a prototype of a real-time behavioral modeling platform for event-based data streams. It automates representation learning and model training on massive data sources with arbitrary data structures. With Monad we showcase how to automatically process various data modalities, such as temporal, graph, categorical, decimal, and textual data, in a time-sensitive way that allows for real-time feature creation and prediction. Monad's distributed and scalable architecture, coupled with efficient award-winning algorithms developed at Synerise (Cleora and EMDE), makes it possible to process real-life datasets composed of billions of events in record time. The Monad ecosystem showcases a viable path towards real-time event-based AutoML.
- Michał Daniluk, Jacek Dąbrowski, Barbara Rychalska, and Konrad Gołuchowski. 2021. Synerise at KDD Cup 2021: Node Classification in Massive Heterogeneous Graphs. Association for Computing Machinery's Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD) KDD Cup Open Graph Benchmark (OGB) Challenge (2021). https://ogb.stanford.edu/paper/kddcup2021/mag240m_SyneriseAI.pdf
- Michał Daniluk, Barbara Rychalska, Konrad Gołuchowski, and Jacek Dąbrowski. 2021. Modeling Multi-Destination Trips with Sketch-Based Model. 14th ACM International Web Search and Data Mining Conference (WSDM) WebTour Workshop on Web Tourism (2021). http://ceur-ws.org/Vol-2855/challenge_short_3.pdf
- Jacek Dąbrowski, Barbara Rychalska, Michał Daniluk, Dominika Basaj, Konrad Gołuchowski, Piotr Bąbel, Andrzej Michałowski, and Adam Jakubowski. 2021. An Efficient Manifold Density Estimator for All Recommendation Systems. In Neural Information Processing: 28th International Conference, ICONIP 2021.
- Barbara Rychalska, Piotr Bąbel, Konrad Gołuchowski, Andrzej Michałowski, Jacek Dąbrowski, and Przemysław Biecek. 2021. Cleora: A Simple, Strong and Scalable Graph Embedding Scheme. In Neural Information Processing: 28th International Conference, ICONIP 2021.
- Barbara Rychalska and Jacek Dabrowski. 2020. Synerise at SIGIR Rakuten Data Challenge 2020: Efficient Manifold Density Estimator for Cross-Modal Retrieval. The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) eCom Workshop Challenge (2020). https://sigirecom.github.io/ecom20DCPapers/SIGIR_eCom20_DC_paper_1.pdf
Index Terms
- Synerise Monad - Real-Time Multimodal Behavioral Modeling