A multimedia dataset for object-centric business process mining in IT asset management

This manuscript introduces a multimedia business process dataset provided by a German research institute. The dataset was systematically collected in a laboratory environment that reflects the workspace of IT staff managing IT asset management (ITAM) processes. It encompasses data from 121 process instances across six basic processes, captured using 37 video recordings from two camera perspectives, motion tracking, environmental sensors, an ITAM system trace, and event log data from user interactions. The data is made available in both raw and processed form. The object-centric event log format (OCEL) provides discrete business process events from system activities. Event data capturing real-world activity is supplied as raw video files and logs from environmental sensors. The video files were also manually labelled with identifiable business process activities and their associated entities. This multimedia dataset has been designed as a resource for developing, training, and evaluating process mining techniques based on unstructured data. Consequently, the dataset design emphasizes the traceability of activities and entities across the multimedia data sources.


Value of the Data
• Based on a systematic data collection setup, this dataset provides coherent data from executing 121 process instances in IT asset management.
• By integrating data from various sources, including IT system logs, video cameras, and ambient sensors, the dataset provides a multimedia view of process execution. This fusion of media streams allows researchers to explore correlations between system-generated events and real-world activities.
• The alignment of system-generated event data with observational data from workplace environments bridges the gap between IT system behaviour and physical actions. Researchers can investigate how system events manifest in practical scenarios.
• The dataset offers both raw and processed data. System event logs follow the OCEL 1.0 format, while real-life event data includes video recordings and sensor logs. Video data was manually labelled with segments of business process activities.
• Fellow researchers are encouraged to leverage this dataset for developing, training, and evaluating process mining techniques. The data in structured and unstructured formats allows researchers to move beyond the labelled ground truth and to analyze the data according to their specific focus.
• Questions and problems for future research that might benefit from the dataset include the development of abstraction techniques to identify process activities from low-level events, the consideration of uncertainty in events or data, and techniques that gradually enrich process data from unstructured data sources.

Background
Process mining encompasses a range of individual techniques that facilitate various analyses, including process discovery, monitoring, and process improvement based on historical process execution data [1]. These techniques rely on traces of process event data, primarily extracted from information systems such as ERP systems. While these systems capture many aspects of critical business processes, there remain significant blind spots in process activities. A growing research stream aims to address these blind spots by developing approaches illuminating previously unexplored areas, specifically by leveraging distributed, unstructured data sources. These alternative event data sources include audio, image/video, text, sensor data, and raw system logs [2][3][4][5]. To advance these approaches, researchers seek well-documented, annotated raw data that provides insights into process behaviour, the entities involved, and the relationships across different data sources. An ideal domain for studying business processes involving both in-system and out-of-system data is IT asset management (ITAM). In ITAM, staff oversee the entire lifecycle of assets within IT systems while managing physical objects in a controlled and straightforward environment.

Data Description
The Business Processes in IT Asset Management Multimedia Event Log dataset [6] comprises 121 pre-scripted business process instances of six baseline processes in ITAM. The process instances were recorded in 36 scenes in a controlled laboratory environment for data collection. Each scene contains multiple completed process instances that may partially or fully overlap within one scene. The process environment simulates that of a small IT department overseeing ITAM processes. Within this context, IT staff diligently manage IT assets while clients interact with them, whether by collecting assets or returning them for repair. The dataset includes various IT assets such as laptops, monitors, and keyboards. Fig. 1 gives an overview, and the following descriptions outline the process types, recorded scenes, and corresponding process instances, as well as detail the dataset files. For further exploration, an interactive dataset documentation is accessible online and linked to the data repository.

ITAM Process Overview
1. New asset inventory. IT staff integrate new assets into the existing inventory framework. Assets must be unpacked, tested, inventoried in the ITAM system, installed, and stored in the warehouse.
2. Asset disbursement to clients. IT staff manage inventory issuance using the ITAM system, recording transactions and client acceptances. In the warehouse, assets are retrieved, inspected for quality, and issued to clients.
3. Asset repair. IT staff identify and diagnose asset issues at the repair desk. The asset is tested after thorough quality checks and necessary repairs to ensure full functionality. The repaired asset is then updated in the ITAM system and returned to the warehouse.
4. Defective asset return for repair. IT staff process the return of defective assets that need repair. The process starts at the IT working desk, where IT staff document the transaction, perform a thorough quality assessment, and check the asset into the ITAM system to ensure it is logged as awaiting repair. The asset is then moved to the repair desk for further handling.
5. Asset return. The asset return process involves logging returned assets into the ITAM system and conducting physical inspections to ensure quality. After verification checks, assets are stored appropriately in the warehouse.
6. Self-service asset check-out. This automated process allows clients to independently check out rental assets from the IT inventory via the ITAM system, enhancing convenience and efficiency. Clients access the ITAM system to verify asset availability and reserve their choice, preventing double bookings. After reservation, clients collect their assets from the self-service storage and complete the transaction without needing direct assistance from IT staff.

Environment and objects
The recorded data was collected within a controlled environment, specifically an office room divided into distinct areas (see Fig. 3). To facilitate cross-data-source correlation, all human actors and objects featured in the scenes were assigned types and often artificial IDs. Table 2 provides details of the environments and objects utilized throughout the dataset.

Scenes and process instances
The dataset is organized at a scene-by-scene level. Table 3 presents the 36 distinct scenes in the overview, each paired with its corresponding set of process instances. Notably, all scenes were recorded within a single day, on 25.03.2024.
The dataset's organization follows a folder-per-scene structure, where each folder (e.g., 'scene01/') mirrors the layout depicted in Table 4. In addition to the scene-level files, the dataset folder 'global/' houses the event log file named 'itam_ocel.json'. This file captures the events tracked within the ITAM system, formatted in OCEL 1.0 [7]. Next to this file, 'global_ocel.json' is the merged log of 'itam_ocel.json' and the individual '*_ocel.json' files. This file provides the most holistic picture of the dataset. Moreover, beyond the static files in the data repository, the project website features a per-scene metadata-enhanced video player [8] with the annotations from '*_vid.json', as illustrated in Fig. 2.
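As an illustration of how a merged log such as 'global_ocel.json' could be assembled from several JSON-OCEL 1.0 files, the following sketch combines the 'ocel:events' and 'ocel:objects' maps of multiple logs. The helper name and the minimal example logs are ours, not part of the dataset tooling:

```python
def merge_ocel(logs):
    """Merge several JSON-OCEL 1.0 dictionaries into one log.

    Events and objects are keyed by their identifiers, so merging
    amounts to combining the dictionaries; later logs overwrite
    duplicate identifiers."""
    merged = {"ocel:events": {}, "ocel:objects": {}}
    for log in logs:
        merged["ocel:events"].update(log.get("ocel:events", {}))
        merged["ocel:objects"].update(log.get("ocel:objects", {}))
    # Keep events in chronological order for downstream tools
    # (ISO 8601 timestamps sort correctly as strings).
    merged["ocel:events"] = dict(
        sorted(merged["ocel:events"].items(),
               key=lambda kv: kv[1]["ocel:timestamp"])
    )
    return merged

# Two minimal example logs, mirroring the basic JSON-OCEL layout.
itam = {"ocel:events": {"e1": {"ocel:activity": "checkout asset",
                               "ocel:timestamp": "2024-03-25T10:00:00",
                               "ocel:omap": ["L1"], "ocel:vmap": {}}},
        "ocel:objects": {"L1": {"ocel:type": "laptop"}}}
scene = {"ocel:events": {"e2": {"ocel:activity": "pick up asset",
                                "ocel:timestamp": "2024-03-25T10:01:30",
                                "ocel:omap": ["L1"], "ocel:vmap": {}}},
         "ocel:objects": {"A1": {"ocel:type": "it_staff"}}}
merged = merge_ocel([itam, scene])
```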

Experimental Design, Materials and Methods
Data were recorded following a script on a scene level, prescribing the involved process variants, IT assets, actors, and planned process deviations (e.g., theft, ignored activities). In planning the data recording schedule, the frequency of non-deviant process instances was prioritized over process instances that demonstrate deviances. The laboratory environment consisted of a single room equipped with sensors. IT assets and human actors were labelled for visual recognition in the scenes.

IT asset selection and labeling
The processes selected for inclusion in this dataset represent typical IT asset management activities encountered in corporate settings.These processes encompass the entire lifecycle of IT assets, from acquisition and deployment to maintenance and eventual decommissioning.The selection of IT assets such as laptops, monitors, mice, headsets, keyboards, and web cameras was strategic, designed to reflect a range of typical interactions that IT staff and clients encounter in a realistic IT asset management setting.
Data labelling in this dataset was executed with high precision to facilitate detailed process analysis. Each asset and human actor involved in the dataset was assigned an identifier, ensuring traceability and consistency across all data types, from video recordings to sensor data and system logs. The labelling was structured as follows:
• IT assets: Every asset, including laptops, monitors, and peripherals, was labelled with a unique identifier. This labelling allows for precise tracking of each asset as it moves through various processes and interactions within the IT environment.
• Human actors: Each participant in the dataset, representing IT staff and clients, was assigned a code (e.g., A1, C1) to anonymize and track their interactions with the IT assets and systems.
• Environmental setup: The laboratory environment was divided into specific areas (e.g., warehouse area, working area) and specific locations within these areas (e.g., repair desk, laptop shelf, door), each labelled to associate the actors' and assets' movements and activities with particular locations.

Data collection and preprocessing
The data collection and preprocessing phase was designed to capture a detailed and comprehensive dataset using cameras and a variety of sensors strategically placed to monitor key environmental and interaction metrics within the designated locations. The room measures 5.50 m by 7.50 m. Fig. 3 gives an overview of the placement of each camera and sensor. These are positioned to optimize data collection for process mining analysis. The zero position of the room, used as a reference for all measurements, is defined as the bottom-left corner behind the entrance door.

Ambient sensors and video cameras
To facilitate process mining analysis on video and sensor data, several cameras and ambient sensors were installed in the room to provide video data and both discrete and continuous sensor data. Table 5 gives an overview of the camera and sensor positions. The two cameras were positioned in opposite corners. Each sensor was connected to a separate Raspberry Pi 3B+. The detailed Python scripts used for collecting the sensor data can be found on our Solve4X project website. We used the following ambient sensors:
• Reed switches were installed on the room door and on cupboard doors to monitor when they were opened and closed. This setup helps track access to stored IT assets. An event was recorded every time the status of the switch changed from closed to open or vice versa. To ensure consistent measuring, the sensor setup was thoroughly tested before each recording, i.e., doors were opened and closed manually and the recorded data was analyzed before starting the live data collection.
• An ultrasonic distance sensor (Model: HC-SR04) was mounted inside a shelf above a laptop stack. This sensor measured the stack height, providing real-time data on the inventory level. The Python script on the microcontroller was designed to record a distance once per second; however, the time between two consecutive measurements varies in reality. Here again, the setup was tested thoroughly before each recording by manually adjusting the stack height to create a predefined distance to the sensor and by analyzing the resulting measurements. For technical reasons and due to the production process, the sensor measures accurately to fractions of a centimeter, which is sufficient for our use case.
• A temperature and humidity sensor (Model: DHT11) was located at a higher position near the window to measure environmental conditions, which could affect both device performance and optimal storage conditions. The Python script for this sensor was designed to record a temperature and humidity pair every five seconds; however, the time between two consecutive measurements also varies in reality. The sensor was again tested thoroughly before each recording by analyzing measured temperature and humidity values before the live data collection.
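For the HC-SR04, the distance follows from the echo pulse duration and the speed of sound. The following is a minimal sketch of this conversion and of deriving the stack height from it; the mounting height of 40 cm is a placeholder assumption, not a value from the actual setup:

```python
SPEED_OF_SOUND_CM_S = 34300  # speed of sound in air at roughly 20 deg C

def echo_to_distance_cm(pulse_seconds):
    """Convert an HC-SR04 echo pulse duration to a distance in cm.

    The echo pulse covers the round trip to the obstacle and back,
    hence the division by two."""
    return pulse_seconds * SPEED_OF_SOUND_CM_S / 2

def stack_height_cm(distance_cm, sensor_mount_cm):
    """Infer the laptop stack height from the measured distance,
    given the sensor's mounting height above the shelf surface
    (an assumed value; the paper does not state it)."""
    return max(0.0, sensor_mount_cm - distance_cm)

# A 1 ms echo corresponds to 17.15 cm of free space below the sensor.
d = echo_to_distance_cm(0.001)
h = stack_height_cm(d, sensor_mount_cm=40.0)
```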

IT asset management system
ITAM activity recording. The open-source ITAM system Snipe-IT was installed and configured for the experiment. Assets and client accounts were created upfront. The system features all the functionality needed to manage IT assets for use in the planned ITAM processes in a web interface accessible to IT staff and clients. Both user roles can log into the system and perform process activities in sync with those visible on the video. All transactional activities involving IT asset status changes were recorded in the ITAM system's database during the data acquisition experiment. Fig. 4 depicts a screenshot of the ITAM system.
Data Processing. After the data recording, the transactional events were exported and transformed into an OCEL 1.0 event log file [7] using a custom library. The ITAM system keeps track of all relevant activities and involved objects by design (see Fig. 5).
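The core of such a transformation, mapping one transactional record to an object-centric event, can be sketched as follows. The input field names ('action', 'asset_id', 'user_id') are illustrative assumptions; the actual Snipe-IT export schema and the custom library differ:

```python
def transaction_to_ocel_event(row):
    """Map one ITAM transaction row to a JSON-OCEL 1.0 event dict.

    The event references both the asset and the user as related
    objects via 'ocel:omap', which is what makes the log
    object-centric rather than single-case."""
    return {
        "ocel:activity": row["action"],
        "ocel:timestamp": row["timestamp"],
        "ocel:omap": [row["asset_id"], row["user_id"]],  # related objects
        "ocel:vmap": {},  # event attributes, left empty in this sketch
    }

# Hypothetical transaction row as it might come from a database export.
row = {"action": "checkout", "timestamp": "2024-03-25T09:15:00",
       "asset_id": "L1", "user_id": "C1"}
events = {"e1": transaction_to_ocel_event(row)}
```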

Motion tracking
Motion recording. Human actor movement data was recorded using technology offered by MotionMiners, a technology provider for analyzing manual work processes. The planning, placement of measurement beacons, and the definition of the areas in the room influence the measurement accuracy. Areas must be planned without overlap. The beacons were mounted to fixed points at 80-90 cm height. For calibration, we followed instructions provided by the tool vendor: using a mobile phone and equipped with a set of measurement devices, an author walks each area, performing various gestures and positions that may be relevant for later recording. This procedure is repeated three times. Raw data from motion tracking was recorded for scenes 1-15 only. Each human actor was equipped with three sensors, attached to the wrists and the belt, continuously recording movement data and position information after the room was equipped and calibrated with wireless beacons.
Data Processing. After the data recording, motion tracking data was exported from the individual wearable sensors and consolidated per set (i.e., two wrist sensors and one belt sensor). The raw data is stored at a fine granularity (20 events per second). In a processing step, the events were aggregated to one-second intervals. In an additional processing step, the technical set IDs were mapped to the human actor IDs used as references throughout the dataset. This mapping was semi-automatically performed using the scene script containing the information about which human actor is wearing a particular motion tracking sensor set in each scene.
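The aggregation from the raw 20 Hz stream to one-second intervals can be sketched as below; the (timestamp, x, y) sample layout is our assumption about the raw export, not the vendor's exact format:

```python
from collections import defaultdict
from statistics import mean

def aggregate_per_second(samples):
    """Aggregate high-frequency motion samples (~20 per second)
    into one averaged position per whole second.

    'samples' is a list of (timestamp_seconds, x, y) tuples."""
    buckets = defaultdict(list)
    for ts, x, y in samples:
        buckets[int(ts)].append((x, y))  # group by whole second
    return {
        second: (mean(p[0] for p in pts), mean(p[1] for p in pts))
        for second, pts in sorted(buckets.items())
    }

# Three raw samples: two within second 10, one within second 11.
raw = [(10.00, 1.0, 2.0), (10.05, 1.2, 2.2), (11.00, 3.0, 4.0)]
per_second = aggregate_per_second(raw)
```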

Web task mining
Web task recording. User interaction logs help analyze business processes on a task basis (the individual building blocks of a process activity) [3]. The IT staff laptop was equipped with a custom browser plugin to capture interactions within the web browser. The plugin records detailed actions such as mouse clicks and keystrokes.
Data Processing. After the data recording, the user interaction log was split into individual files per scene.
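A minimal sketch of such a per-scene split, assuming the interaction events carry a timestamp and the scene cut points are known from the video segmentation (field names and interval values are illustrative):

```python
def split_by_scene(events, scene_intervals):
    """Partition a flat interaction log into per-scene logs.

    'events' is a list of dicts with a 'timestamp' in seconds;
    'scene_intervals' maps a scene name to its (start, end)
    half-open interval taken from the video cut points."""
    per_scene = {name: [] for name in scene_intervals}
    for event in events:
        ts = event["timestamp"]
        for name, (start, end) in scene_intervals.items():
            if start <= ts < end:
                per_scene[name].append(event)
                break  # scenes do not overlap, so one match suffices
    return per_scene

# Hypothetical log with two events and two adjacent scene intervals.
log = [{"timestamp": 5, "type": "click"},
       {"timestamp": 65, "type": "keystroke"}]
scenes = {"scene01": (0, 60), "scene02": (60, 120)}
split = split_by_scene(log, scenes)
```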

Video data collection and processing
Video Recording. The project involved recording video data from two opposite camera angles to fully cover all activities. The recordings were synchronized and segmented into individual scenes. To achieve a comparable appearance of the videos and avoid disturbances, each camera's autofocus was disabled and artificial lighting was used to maintain brightness. Videos were processed by cutting the raw material into 2 × 36 individual, pairwise frame-exact scene files. The processing was done using LosslessCut (v3.59.1.0), a graphical user interface for the FFmpeg library for lossless operations on media files. Also, pairs of videos for the same scene were vertically stacked using FFmpeg. These stacked video files are deposited in the dataset.
Fig. 6. Video annotation with process instances and segments representing process activities.
Video Annotation. After processing the raw video, the author team manually annotated business process activities using a custom tool designed and built for the project [9] (see Fig. 6). The web application is initialized with a configuration file per scene to identify the video file location and the predetermined scenes as well as involved actors to prefill selection menus for faster editing. Users can then choose a scene for annotation and play the video on the right side of the application. On the left side, all segments within a scene are listed for a quick overview. New segments can be added by clicking a button, and the current position in the video is used as the start point of the activity. The end timestamp can be selected by seeking to the end of the segment in the video player. Users select the activity, objects involved, actors, and location from the menus and save the segment. After aligning on a coding style within the author team (i.e., what activities should be coded, and when objects, human actors, and locations are coded), the tool was helpful in quickly and accurately annotating the videos in parallel. The annotation files could be downloaded for further processing and imported again for corrections. With the annotations, we provide ground truth for activity detection and process mining analysis (Table 1).
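Because annotated process instances may partially or fully overlap within a scene, a simple check over the segment annotations can reveal which instances run concurrently. The sketch below assumes a simplified view of the per-instance segment data, not the exact '*_vid.json' schema:

```python
def overlapping_instances(segments):
    """Return pairs of process-instance ids whose annotated activity
    segments overlap in time within one scene.

    'segments' maps an instance id to a list of (start, end)
    activity segments in seconds."""
    # Collapse each instance's segments to its overall time span.
    spans = {
        pid: (min(s for s, _ in segs), max(e for _, e in segs))
        for pid, segs in segments.items()
    }
    pids = sorted(spans)
    # Two spans overlap iff each starts before the other ends.
    return [
        (a, b)
        for i, a in enumerate(pids)
        for b in pids[i + 1:]
        if spans[a][0] < spans[b][1] and spans[b][0] < spans[a][1]
    ]

# Hypothetical scene: p1 and p2 overlap, p3 starts after both end.
segs = {"p1": [(0, 30), (40, 60)], "p2": [(50, 90)], "p3": [(100, 120)]}
pairs = overlapping_instances(segs)
```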

Limitations
Motion tracking data is only available for Scenes 01-15. A short setup period was required for most scenes to ensure equipment was in place and human actors could change roles. Hence, all data except for data from the ITAM system were cut to match the exact intervals recorded on video. Therefore, concatenation of data does not lead to a continuous picture. While every effort was made to ensure continuity regarding the allocation and state of IT assets, ambient sensors might expose notable jumps in data from scene to scene (e.g., a physical laptop was placed on the laptop shelf during a setup period to enable issuing said laptop in the following scene, leading to an unexplained jump in the distance sensor data). Further, the ITAM lifecycle and processes were simplified to generate many process instances (e.g., individuals would not normally pick up four laptops and monitors daily and return them for repair).

Ethics Statement
We obtained informed consent from all participants for their participation in this dataset collection and publication of anonymous data.

Fig. 2. Visualization of a scene with information from '*_vid.json' and an interactive video player on the dataset website.

Fig. 3. Room plan with areas and sensor positions.

Fig. 4. Screenshot of the ITAM system showing an accessory page from which users can check out available webcams.

Fig. 5. Screenshot of the ITAM system showing the overall activity report.

Table 1
General dataset properties.

Table 2
Listing of environment locations and objects referenced across the dataset.

Table 3
Scenes and processes.

Table 4
Structure of files and their respective fields and features per scene.
location — List of location identifiers (see Table 2)
*_ocel.json — Joint video and sensor OCEL

Table 5
Sensor overview and placement.