Social Media Metadata Forensic Ontology Model



INTRODUCTION
The use of the Internet and social media is increasing year by year. According to data from We Are Social on digital resource usage in Indonesia in February 2022, as cited from datareportal.com, the total population of Indonesia is 277.7 million, with 73.7% (204.7 million) Internet users and 68.9% (191.4 million) social media users, which is a 12.6% increase from January 2021. The most widely used social media platforms among individuals aged 16 to 64 are WhatsApp at 88.7%, Instagram at 84.8%, Facebook at 81.3%, TikTok at 63.1%, Telegram at 62.8%, and Twitter at 58.3% [1].
This proliferation has given rise to various social media phenomena and behaviors over time, affecting personal lives, communication dynamics, and even criminal activities. These include trends or phenomena such as self-disclosure by sharing information about their activities and
In the realm of digital forensics, ensuring the admissibility of evidence in court involves several stages [10]. These stages can vary in number and order depending on the case and on differing opinions. However, four of them are particularly important: acquisition, research, analysis, and presentation [11]. In the context of social media, two primary sources provide digital evidence: the devices owned by victims or suspects (clients) and the service providers (servers). These sources are critical during the acquisition stage, which serves as the foundation for subsequent investigation and analysis.
Researchers often turn to ontological models to establish knowledge bases that support the analysis of digital evidence. Notably, David Christopher Harrill and Richard P. Mislan introduced the Small-Scale Digital Device Forensics (SSDDF) ontology, which was later incorporated into the Device Forensic sub-ontology by Nickson M. Karie and Hein S. Venter [12], [13]. These ontological models play a crucial role in structuring and organizing the digital evidence landscape.
While some ontologies have been developed to address digital forensics in the context of social media, a notable gap remains. This gap becomes evident when considering the work of Edlira Kalemi, Sule Yildirim-Yayilgan, Elton Domnori, and Ogerta Elezaj, who developed the SMoNt ontology specifically for this topic. Their perspective focuses on the digital evidence found in social media metadata, which can be considered valid evidence in court. Despite the wealth of information provided by social media metadata forensics, including diverse entities such as user profiles, messages, status posts, photos, friends, groups, and more [14], [15], these studies do not address the connection between digital evidence and the electronic devices used. Few ontologies associate social media with electronic devices, even though acquisition is one of the crucial stages of a digital forensic investigation [11]. This stage involves collecting electronic devices from suspects and/or victims as evidence, from which digital evidence is then acquired.
While the existing research offers separate ontologies for digital devices and for social media, the integration of these two aspects is still lacking in the field of digital forensics. This gap hinders the holistic and efficient examination of digital evidence in cases that involve both small-scale electronic devices and social media platforms. Bridging it could give investigators and law enforcement agencies a comprehensive tool for navigating the interplay between devices and social media data. By developing a unified ontology, this research seeks to address this gap and contribute to the advancement of digital forensics, ultimately improving the ability to investigate criminal activities in the digital age.
Therefore, this research aims to develop a new ontology model that can map both sides of forensics: the small-scale electronic device aspect, using the SSDDF subclass, which defines classes for device types (cell phones, smartphones, tablet computers, notebook computers, and others), and the social media data aspect, within the social media forensics subclass. Combining these two subclasses yields a unified ontology. The outcome of this research is expected to map the relationship between mobile devices and social media metadata in the hierarchy of ontology classes and objects.

METHODS
In this research, we applied an experimental method to map the ontology. A case study was then conducted involving the interaction between the suspect's account (Garry Swihart), who uses a Samsung Galaxy Mega 2 Android device, and the victim's account (Norah Nolan), who uses a Samsung Galaxy J1 Ace Android device.

Figure 1. Case study implementation scheme
As mentioned before and depicted in Figure 1:
1. Devices and Accounts
a. Account A (Victim, fictitious): Norah Nolan, a fictitious individual, portrayed as a regular user of social media platforms, using a Samsung Galaxy J1 Ace Android device, which serves as her primary means of accessing and engaging with online content.
b. Account B (Suspect, fictitious): Garry Swihart, a fictitious character, depicted as an individual with potentially suspicious activities on social media, using a Samsung Galaxy Mega 2 Android device as his main tool for interacting with others on digital platforms.

Interaction within the Facebook Platform:
The initial contact occurred when the fictitious suspect (Garry Swihart) sent a friend request to the fictitious victim (Norah Nolan) on the Facebook platform. Subsequently, he invited her to join a group titled "Branded Bags & Accessories," of which he was the group administrator. This initial interaction marked the start of their communication on the platform.

Case:
Within the Facebook group "Branded Bags & Accessories," the fictitious suspect (Garry Swihart) strategically posted content aimed at capturing the attention of the fictitious victim (Norah Nolan). These posts were designed to pique her interest, and as a result, she engaged by commenting on several of them. This initial interaction within the group led to further communication through private messages.

Investigation:
In response to the escalating interaction between the fictitious victim (Norah Nolan) and the fictitious suspect (Garry Swihart) through private messages, investigators initiated the process of gathering evidence. This involved meticulous collection and preservation of all pertinent digital communications, including text messages, multimedia files, and associated timestamps. The investigation's objective was to construct a comprehensive timeline of the interactions, identify potential evidence of any illicit activities, and scrutinize the intentions and actions of both fictitious parties involved.

Device Conditions and Limitations
This research involved the use of two Samsung Android smartphones: the Samsung Galaxy Mega 2 and the Samsung Galaxy J1 Ace. These devices differ in their software specifications. The Samsung Galaxy Mega 2 has the latest ROM (Baseband G750HXXU1ANI5) and runs Android version 4.4.4 (KitKat). The Samsung Galaxy J1 Ace has the latest ROM (Baseband J11FXXU0AQE1) and runs Android version 5.1.1 (Lollipop).
These devices were chosen to maximize the completeness of data acquisition by applying rooting methods on the Android KitKat and Lollipop platforms. For more detailed information, please refer to Table 1, which provides an overview of these specifications. In the case of the two devices, the following condition can be observed: on the Android Lollipop version, the FB application is still available in the Play Store. Hence, no obstacles were encountered, unlike in previous Android versions.

Data Acquisition
After implementing the case study, we proceeded with data acquisition from both devices using the following methods:
1) Rooting both devices using Magisk. To ensure comprehensive access to each device's data and applications, we used the Magisk rooting method. Magisk is a suite of open-source software for customizing almost all versions of Android, with more than 260 contributors to the project [16], [17]; we used it specifically for its root feature. Rooting grants elevated privileges, enabling the extraction of meaningful data that might otherwise be inaccessible [18].
2) Acquiring both devices using the dd method. We employed this method to create a bit-by-bit copy of each device's storage. This approach captures more data [19], including text messages, images, application data, and system files.
3) Transferring the acquired data from the Android Debug Bridge (ADB) shell to the host computer using the nc (Netcat) command installed by Busybox. ADB is an open-source tool for running command-line operations, such as installing and debugging apps, on an Android device [20]. It can also access the dd command on the device, and it is commonly used in similar studies [18], [19], [21]. Busybox is a set of tiny UNIX programs for small or embedded systems [22]; we used it to install and run the nc command on both devices to perform networking operations. Once acquisition was complete, the acquired data, in the form of '.dd' files, was transferred from the ADB shell to the host computer with the nc command. This step is essential for further analysis and examination of the collected digital evidence.
4) Saving the transferred acquisition results with the .dd file extension. The acquired data from both devices was saved in the '.dd' file format, a raw format well known for disk images in digital forensics.
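The acquisition steps above can be sketched as a small helper that assembles the commands an examiner might run. This is an illustrative sketch only, not the exact procedure used in the case study: the TCP port (5555), the image file name, the /dev/block/ path prefix, the use of adb forward, and the listen-mode nc invocation are all assumptions. Only dd, nc, Busybox, ADB, root access, and the mmcblk0 block device come from the text. The helper builds command strings and does not execute anything.

```python
def build_acquisition_commands(block_device="/dev/block/mmcblk0",
                               port=5555,
                               image_name="device.dd"):
    """Return the shell commands for a dd-over-netcat acquisition (not executed).

    All default values are assumptions chosen for illustration.
    """
    # Forward a TCP port on the host to the same port on the device.
    forward = f"adb forward tcp:{port} tcp:{port}"
    # On the rooted device: read the block device and serve it with Busybox nc.
    device_side = (
        f"adb shell \"su -c 'dd if={block_device} | busybox nc -l -p {port}'\""
    )
    # On the host: connect to the forwarded port and write the raw .dd image.
    host_side = f"nc 127.0.0.1 {port} > {image_name}"
    return [forward, device_side, host_side]


if __name__ == "__main__":
    # Hypothetical image name for the Samsung Galaxy Mega 2.
    for command in build_acquisition_commands(image_name="galaxy_mega2.dd"):
        print(command)
```

Keeping the commands as data rather than running them directly makes it easy to log exactly what was issued, which may help when documenting the acquisition stage.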

Challenges and Limitations
In the case study, we encountered several challenges and limitations:
1) Data limitations for FB and FBL. One notable challenge was the limited scope of the data acquired from Facebook (FB) and Facebook Lite (FBL). While FB provided contact and local media data that could be obtained as evidence, FBL offered more restricted data access, yielding only contact data.
2) Inclusion of Messenger (FBM). Another challenge involved integrating an additional application, Messenger (com.facebook.orca, or FBM). While Messenger offers specific functions and features for sending messages between FB users, extracting data on private messages or conversations between the two accounts presented its own set of challenges.
These challenges and limitations influenced the data collection process and should be considered when interpreting the findings of this research. They underscore the complexities and nuances involved in digital forensics and the acquisition of data from social media platforms.

Practical Application of SSDDF Ontology
Figure 2. SSDDF ontology
In Figure 2, we present the Small-Scale Digital Device Forensic (SSDDF) Ontology, a crucial component of our research that plays a central role in categorizing and organizing digital evidence from small-scale digital devices. This ontology serves as the foundation for our exploration of digital forensic investigations on devices such as smartphones, memory cards, and embedded systems.
In this section, we illustrate the practical utility of the SSDDF ontology in real-world digital forensic investigations. The SSDDF ontology serves as a valuable tool for law enforcement agencies and forensic experts, enhancing their capabilities in the following ways:
1) Streamlined Data Analysis: The SSDDF ontology provides a structured framework for organizing and categorizing digital evidence obtained from small-scale digital devices. By leveraging predefined classes and relationships, investigators can efficiently analyze data, leading to quicker insights.
2) Cross-Device Correlation: In multi-device investigations, the SSDDF ontology allows investigators to correlate data from various sources. For example, it enables linking evidence from smartphones, memory cards, and embedded devices to reconstruct a comprehensive timeline of events.
3) Enhanced Data Retrieval: With well-defined classes and properties, the ontology simplifies data retrieval. Investigators can quickly locate relevant information, such as chat histories, media files, or user profiles, leading to more effective case resolutions.
4) Integration with Existing Tools: The SSDDF ontology can be integrated with existing digital forensic tools and software, making it accessible and user-friendly for forensic experts. This integration streamlines the investigative process without requiring extensive retraining.

Development of Small-Scale Digital Device Forensic Ontology
The Small-Scale Digital Device Forensic (SSDDF) ontology can be developed further based on the results of data acquisition. In the Android system, there are several storage blocks, namely mmcblk0, representing internal storage, and mmcblk1, representing external storage (if available), as shown in Figure 3. In the internal storage, we can explore the userdata partition along with its contents ("data", "media", and "system") to search for findings related to the Facebook applications (FB, FBL, and FBM), as shown in Figure 4. The "data" folder holds all data associated with installed applications. Each Facebook application has a consistent package name (folder) starting with com.facebook, followed by the name of the respective application. The folder structure and the files found reveal a large amount of data, including log files and databases that represent the activities of both accounts in the case study. Some of these findings can be observed in Table 2. Among them is the system-level accounts database, which holds information about user accounts on the device.
Based on the findings above, it can be concluded that the SSDDF ontology needs several additional structured classes to map the findings into the ontology. Essentially, we raise the competency question "How to map the Small-Scale Digital Device Forensic ontology?" This is an initial assumption, considering the complexity of the existing data structure within the devices.
The presence of the storage blocks mmcblk0 (internal) and mmcblk1 (external) suggests that the primary focus of the exploration process should be on internal storage. External storage, on the other hand, is optional, as it involves removable media. Furthermore, within the internal storage, the presence of the data, media, and system folders raises the hypothesis that separate classes are needed to map these directories for clearer identification and categorization. As a result, the competency question can be further divided and refined as follows:
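The package naming pattern described above lends itself to a simple lookup. The following is a minimal sketch, assuming the userdata partition of an acquired image has already been mounted or extracted to a local folder; the function name and the mount location are hypothetical, while the "data" folder and the com.facebook prefix come from the text.

```python
from pathlib import Path


def find_facebook_packages(userdata_root):
    """List application folders under <userdata>/data that start with com.facebook."""
    data_dir = Path(userdata_root) / "data"
    if not data_dir.is_dir():
        # Nothing to report if the partition was not mounted or extracted here.
        return []
    return sorted(
        entry.name
        for entry in data_dir.iterdir()
        if entry.is_dir() and entry.name.startswith("com.facebook")
    )


if __name__ == "__main__":
    # Hypothetical path to an extracted userdata partition.
    print(find_facebook_packages("./userdata"))
```

On a device with FB, FBL, and FBM installed, such a listing would be expected to include com.facebook.katana, com.facebook.lite, and com.facebook.orca.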

1) How to map based on the nature of Internal and External storage?
Justification: The inclusion of classes to distinguish between internal and external storage elements (e.g., Internal and External classes) is crucial for precise data categorization. Digital forensic investigations often involve differentiating between data stored within the device's internal memory and data on removable external storage (e.g., memory cards). This distinction aids investigators in focusing their analysis on relevant data sources and ensures that the ontology accurately reflects the nature of the storage medium.

2) How to map the important structures present in internal storage?
Justification: Internal storage within small-scale digital devices contains various critical structures (e.g., data, media, system) that require separate mapping. These structures are key to organizing and categorizing digital evidence effectively. By including classes such as data, media, and system, we ensure that investigators can precisely identify and access these essential components during the forensic analysis. This granularity enhances the ontology's utility in reconstructing digital timelines and extracting pertinent information.

3) How to map the structures present in external storage?
Justification: While the primary focus lies on internal storage, external storage (e.g., memory cards) remains a potential source of digital evidence. Including classes to map external storage ensures that investigators can account for and analyze data stored on removable media when relevant. This flexibility accommodates diverse scenarios in digital forensic investigations, where external storage may contain valuable information related to the case.
By providing these justifications, we aim to clarify the rationale behind the selection and inclusion of specific classes and properties within the SSDDF ontology. These additions are designed to align with the practical requirements of digital forensic analysis, facilitating more efficient and effective investigations.
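The class additions motivated by the three competency questions can be pictured as a small hierarchy. The sketch below is illustrative only: the class names Internal, External, data, media, and system come from the text, but the root name "SSDDF", the exact placement of these classes in the published ontology, and the helper function are assumptions. The classes under External are left empty because the text does not name them.

```python
# Hypothetical parent -> children map. Class names follow the text;
# the root name and the nesting are assumptions for illustration.
SSDDF_HIERARCHY = {
    "SSDDF": ["Internal", "External"],        # competency question 1
    "Internal": ["data", "media", "system"],  # competency question 2
    "External": [],                           # competency question 3 (classes not named in the text)
}


def path_to_class(target, root="SSDDF", hierarchy=SSDDF_HIERARCHY):
    """Return the chain of classes from the root down to target, or None if absent."""
    if root == target:
        return [root]
    for child in hierarchy.get(root, []):
        sub_path = path_to_class(target, child, hierarchy)
        if sub_path is not None:
            return [root] + sub_path
    return None


if __name__ == "__main__":
    print(path_to_class("media"))
```

A lookup such as path_to_class("media") walks from the root through Internal to media, mirroring how an investigator would narrow the search from the device to a specific directory class.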

Implementation of Competency Questions
Competency questions play a pivotal role in defining the scope and structure of the SSDDF ontology. These questions guide the ontology's development and help ensure it effectively addresses the needs of digital device forensic analysis.
Initially, the SSDDF ontology's original class structure serves as a foundation. Subsequently, based on the competency questions, additional class structures are created to enhance the ontology's relevance and utility.
For instance, consider the first competency question: "How to map based on the nature of Internal and External storage?" This question prompts the creation of classes dedicated to distinguishing between internal and external storage elements, ensuring precise data categorization and retrieval.
The resulting ontology, as depicted in Figure 5, shows how these classes are integrated to address this specific competency question. For the second competency question, the classes that can be mapped are shown in Figure 6. For the third competency question, the classes that can be mapped are depicted in Figure 7. From the implementation of the three competency questions above, the complete SSDDF can be visualized in Figure 8.
Integration with existing ontologies, such as the Social Media Evidence ontology, offers a holistic approach to digital device forensic analysis. In this research, the SSDDF ontology [12], [13] is combined with the established Social Media Evidence ontology [14], [15], creating a unified and comprehensive knowledge framework.
The motivation behind this integration is to leverage the strengths of both ontologies. While SSDDF excels in structuring digital device forensic data, the Social Media Evidence ontology specializes in capturing metadata and contextual information from social media platforms. The amalgamation of these domains enriches our ability to derive meaningful insights from digital evidence.
Methodologically, this integration involves mapping relevant classes and properties from each ontology to ensure compatibility and data interoperability. By doing so, we can seamlessly correlate digital device data with social media activity, enhancing the depth and breadth of forensic analysis.
However, it's essential to acknowledge that this integration is not without its challenges. These include reconciling differences in class definitions, handling overlapping properties, and maintaining ontology consistency. Nevertheless, the benefits far outweigh these challenges.
In conclusion, Figure 9 provides an overview of the resulting integrated ontology. This collaborative effort between SSDDF and the Social Media Evidence ontology opens new avenues for advanced digital device forensic investigations.

Class Structure
There are two major classes in the ontology, DeviceForensic and SocialMediaEvidence, each with its own sub-classes. The DeviceForensic ontology and the SocialMediaEvidence ontology offer distinct yet interconnected frameworks for digital forensic investigations. The DeviceForensic ontology primarily focuses on organizing and categorizing data acquired from small-scale digital devices, encompassing information related to storage, files, and device characteristics. Conversely, the SocialMediaEvidence ontology specializes in handling evidence originating from social media platforms, including user profiles, messages, connections, and online activities.
In many digital forensic cases, investigators encounter scenarios where evidence retrieved from small-scale digital devices, such as smartphones or memory cards, intersects with social media activity. For instance, a suspect's smartphone may contain chat histories, multimedia files, or location data relevant to a social media-related investigation. By employing both the DeviceForensic and SocialMediaEvidence ontologies in tandem, investigators gain the ability to seamlessly correlate and analyze evidence originating from these disparate yet interconnected sources.
The integration of device data with social media evidence facilitates a comprehensive and enriched contextual analysis of digital evidence. This synergy allows investigators to delve deeper into the circumstances surrounding a case. For instance, device data might reveal the timestamp of a photo, while social media data can provide insights into the user who shared it on a social platform. This combined context is invaluable for building a complete and accurate narrative of events, potentially uncovering critical details that might be missed when analyzing each type of evidence in isolation.
By forging a link between the DeviceForensic and SocialMediaEvidence ontologies, investigators can adopt a unified approach to digital forensic investigations. This unified framework empowers them to organize, analyze, and draw connections between evidence, regardless of whether it originates from a digital device or a social media platform. This simplifies the investigative process and enhances efficiency, ultimately leading to more effective and insightful results.
In essence, the integration of these two ontologies offers a cohesive and comprehensive solution for digital forensic experts and law enforcement agencies. It allows them to tackle the complexities of modern investigations that involve both digital devices and social media platforms, facilitating a more holistic and thorough examination of digital evidence. This approach opens up new avenues for advanced digital device forensic investigations, where evidence from various sources can be interconnected and analyzed within a unified framework.
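The photo example above can be sketched as a simple join between device-side and social-media-side records. This is a hypothetical illustration: the record fields (file_name, timestamp, media_file, user), the sample values, and the idea of matching on a shared file name are assumptions made for the sketch, not part of either ontology as described in the text.

```python
def correlate_media(device_files, social_posts):
    """Pair device file records with social media posts that reference the same file name."""
    # Index the social media posts by the media file they reference.
    posts_by_file = {}
    for post in social_posts:
        posts_by_file.setdefault(post["media_file"], []).append(post)

    matches = []
    for record in device_files:
        for post in posts_by_file.get(record["file_name"], []):
            matches.append({
                "file_name": record["file_name"],
                "device_timestamp": record["timestamp"],  # from the device side
                "shared_by": post["user"],                # from the social media side
            })
    return matches


if __name__ == "__main__":
    # Hypothetical sample records using the fictitious accounts of the case study.
    device_files = [{"file_name": "bag_01.jpg", "timestamp": "2022-02-01T10:00:00"}]
    social_posts = [{"media_file": "bag_01.jpg", "user": "Garry Swihart"}]
    print(correlate_media(device_files, social_posts))
```

In practice the matching key would depend on what the acquired data actually exposes; a file name is used here only to keep the sketch short.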

Data Properties Structure
In the data properties structure, we added several properties, such as directory, has_app, and parent_directory. These three properties are useful when mapping individuals related to DeviceForensic.
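To make the role of these properties concrete, the sketch below models a DeviceForensic individual as a plain record. The three property names come from the text; the value types, the meaning assigned to has_app (a package name), the class and function names, and the sample paths relative to the userdata partition are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DeviceForensicIndividual:
    """One ontology individual carrying the three data properties named in the text."""
    name: str
    directory: str                          # path of the folder or file itself
    parent_directory: Optional[str] = None  # path of the containing folder
    has_app: Optional[str] = None           # assumed: package name of the related app


def individuals_for_app(individuals: List[DeviceForensicIndividual], package: str):
    """Return the individuals whose has_app value equals the given package name."""
    return [item for item in individuals if item.has_app == package]


if __name__ == "__main__":
    # Hypothetical individuals based on the userdata layout described earlier.
    items = [
        DeviceForensicIndividual("fb_folder", "data/com.facebook.katana",
                                 "data", "com.facebook.katana"),
        DeviceForensicIndividual("media_folder", "media"),
    ]
    print(individuals_for_app(items, "com.facebook.katana"))
```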

CONCLUSIONS
In this research, we presented the evolution of pre-existing ontologies as an aid to systematically categorize digital evidence located on both server and client endpoints. In this manner, the ontological framework can function as a procedural methodology for the rigorous scrutiny of case files pertaining to social media incidents.
Our contributions in this study are twofold. First, we used the Social Media Digital Evidence Ontology to organize data derived from the server-side service provider. Second, recognizing the need for more detailed and specific mapping, we extended and deployed the Small-Scale Digital Device Forensic (SSDDF) ontology, primarily to delineate the storage architecture within Android smartphones more explicitly and to organize data originating from mobile devices, referred to here as clients.
In practice, the extended SSDDF gives forensic investigators and law enforcement agencies a structured framework for organizing and categorizing digital evidence extracted from small-scale digital devices. This can support more efficient data analysis, more effective cross-device correlation, and enhanced data retrieval. Moreover, SSDDF can be integrated with existing digital forensic tools, which may reduce the need for extensive retraining and streamline the investigative process.
Looking ahead, additional research is needed to map storage systems across a range of mobile device platforms, from iOS-based smartphones and smartwatches to Unmanned Aerial Vehicles (UAVs) and beyond. Such work would enrich the existing ontological infrastructure. Furthermore, a concerted effort to expand

Figure 4. User data partition structure and Facebook application package list.

Figure 5. Addition of a class for the first competency question in the SSDDF ontology.

Figure 6. Class addition for the second competency question in the SSDDF ontology.

Figure 7. Class addition for the third competency question in the SSDDF ontology.

Table 1. Map of the device in the case study

1) For the Android KitKat version (used by the victim): The Facebook application (com.facebook.katana, or FB) is not available in the Play Store. However, as an alternative, Facebook Lite (com.facebook.lite, or FBL) is available as a lightweight version of the regular Facebook application. Therefore, Facebook Lite has become a convenient choice for installation.
2) For the Android Lollipop version (used by the suspect)

Table 2. Findings of digital evidence