Artificial Intelligence of Things for Smarter Healthcare: A Survey of Advancements, Challenges, and Opportunities

Healthcare systems are under increasing strain due to a myriad of factors, from a steadily ageing global population to the current COVID-19 pandemic. In a world where we have needed to be connected but apart, the need for enhanced remote and at-home healthcare has become clear. The Internet of Things (IoT) offers a promising solution. The IoT has created a highly connected world, with billions of devices collecting and communicating data from a range of applications, including healthcare. Due to these high volumes of data, a natural synergy with Artificial Intelligence (AI) has become apparent - big data both enables and requires AI to interpret, understand, and make decisions that provide optimal outcomes. In this extensive survey, we thoroughly explore this synergy through an examination of the field of the Artificial Intelligence of Things (AIoT) for healthcare. This work begins by briefly establishing a unified architecture of AIoT in a healthcare context, including sensors and devices, novel communication technologies, and cross-layer AI. We then examine recent research pertaining to each component of the AIoT architecture from several key perspectives, identifying promising technologies, challenges, and opportunities that are unique to healthcare. Several examples of real-world AIoT healthcare use cases are then presented to illustrate the potential of these technologies. Lastly, this work outlines promising directions for future research in AIoT for healthcare.


I. INTRODUCTION
Healthcare systems have long been strained by a globally ageing population and a rise in chronic illness. This has been increasingly apparent since the outbreak of the COVID-19 pandemic, which pushed many healthcare centres to breaking point: during significant outbreaks, many patients suffering from COVID-19 and other unrelated illnesses were cared for in makeshift facilities [1] and via telehealth technologies [2]. The Internet of Things (IoT) offers promising solutions in providing improved standards of healthcare both in and out of clinical settings. The IoT is broadly defined as a network of interconnected devices that are able to gather, exchange, store, and process data using an Internet backbone. IoT as it pertains to health applications is often referred to as Healthcare IoT (H-IoT) or Internet of Medical Things (IoMT).
IoMT is being increasingly utilized around the world. The global value of this field was estimated to exceed $158 billion USD in 2022, with approximately one-third of this value attributed directly to connected medical devices [3]. Significant applications for the IoMT lie in the diagnosis, monitoring and management of chronic conditions such as diabetes [4], dementia [5], Parkinson's disease [6], epilepsy and other seizure disorders [7], and sleep disorders [8]. Other applications include rehabilitation after medical events [9], medication adherence [10], assisted living [11], digital twin development [12], and the treatment of patients and management of outbreaks in pandemic events such as COVID-19 [13].
Existing communications technologies including narrowband IoT (NB-IoT) and 5G have facilitated increased connection of medical devices to the IoT, and this will continue to increase as emerging communications technologies such as 5G New Radio Reduced Capability (RedCap), 6G, and IoT-over-satellite are implemented. As more medical devices are connected to the IoT, a key challenge that has arisen is managing and utilizing the quantity of data that is generated by these devices [14]. This volume of data cannot realistically be processed by individuals, and thus a need for artificial intelligence (AI) to make sense of it becomes apparent. The natural synergy between AI and IoT has become apparent in recent years; AI needs to learn from large amounts of data to make successful discoveries, and IoT needs assistance in making meaningful discoveries from the vast data that it generates. This synergy has led to the emergence of the Artificial Intelligence of Things (AIoT) [15], a new era of IoT systems empowered by AI that is driving the paradigm shift from Healthcare 4.0 to Healthcare 5.0 [16]. The era of Healthcare 4.0 saw the introduction of highly connected, patient-centric care highly dependent on IoT; however, these systems often do not provide smart health management, or emotive and personalised care. Healthcare 5.0 builds on the strengths of Healthcare 4.0 while simultaneously seeking to overcome its weaknesses. In particular, Healthcare 5.0 leverages advanced technologies in an AIoT architecture to provide pervasive care that includes intelligent remote monitoring, smart self-management, personalized medicine, and emotive telemedicine. This paradigm shift will see high-quality healthcare available to more people, regardless of location and other limiting factors.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

A. Contributions of This Survey
The literature contains several excellent reviews offering a variety of perspectives on AI and IoT in the healthcare field. These are discussed and compared in detail in the following subsection. In this work, we fill a key gap in the literature by focusing on the synergy between AI and IoT that has led to the emergence of AIoMT. The key contributions of this work are as follows:
1) We fill a gap in the literature through conducting a thorough review of the synergy between AI and IoT as it pertains to healthcare. This is in contrast to previous works, which have considered AI and IoT as separate entities.
2) We present a unified architecture for the Artificial Intelligence of Medical Things (AIoMT) identifying key levels of computing and communications. Novel concepts including 5G New Radio Reduced Capability (RedCap), embedded AI, swarm learning, and explainability are considered. We further propose a new cross-layer AI paradigm for the unified AIoMT architecture.
3) We explore the use of AIoMT for the entire pipeline of healthcare, including monitoring, diagnosis, prognosis, treatment, and disease discovery.
4) We present several use cases highlighting how AIoMT can be utilised to support care for prevalent diseases and conditions, including dementia, stroke, breast cancer, and COVID-19.
5) We conclude our work by summarising the lessons learned, highlighting the existing challenges and opportunities in the literature, and providing suggestions for future research.
This thorough review critically examines current literature on healthcare AIoT (AIoMT). To guide further research in the AIoMT domain, we also identify the open challenges and recommend future directions for research in this important field.

B. Related Work
Artificial intelligence and the Internet of Things have both been active fields of healthcare research in recent years, and thus many survey papers have previously considered these two topics separately. However, the fusion of AI with IoT into a unified Artificial Intelligence of Things has not been broadly considered in previous surveys. Additionally, previous review papers have not considered emerging technologies in the AIoT domain, including explainability, embedded AI, and swarm learning.
Current trends and future directions for IoT in healthcare were examined by two recent surveys [17], [18]. Both reviews focused on developing an IoT architecture for healthcare, along with identifying key requirements. The key topic of AI is briefly mentioned, without in-depth exploration or discussion of the fusion between AI and IoT.
In contrast, a survey by Qadri et al. [21] comprehensively explored recent advancements in healthcare IoT across a more complete architecture. Their examination of the literature included AI; however, discussion is limited to several case studies without in-depth focus on algorithms and fusion with IoT. Edge computing is discussed, but embedded AI is not explored. Additionally, this review omits the crucial topic of explainability.
Explainability is considered in the context of healthcare by one recent review by Markus et al. [19]. However, their work does not consider IoT. Thus, key topics such as embedded AI are overlooked, and the synergy between AI and IoT is not explored.
Another review [20] considers embedded AI and explainability. However, the broader IoT architecture is not considered. Healthcare is mentioned as an application, but use cases are not explored.
The literature on artificial intelligence and IoT for healthcare is rich; however, no survey to date has comprehensively reviewed all of the key technologies that comprise a unified AIoT healthcare system. In this review, we investigate how AI and IoT can be used together to create robust, accurate, and explainable AIoT systems for healthcare.

C. Organization
The remainder of this paper is structured as follows: Section II constructs a unified architecture for AIoMT, informed by previous research in the AIoT and IoMT domains. It further analyses key technologies in AIoMT, including sensors, communication technologies, and computing resources. Section III then broadly categorises several domains of AIoMT research across the healthcare pipeline, including monitoring, diagnosis, prognosis, and explainability. Section IV then explores several timely use cases for AIoMT, examining how AIoMT can be leveraged across the entire healthcare pipeline for specific applications including dementia care, stroke and stroke recovery, and pandemic management. In Section V, we summarise our findings and recommendations, highlighting the key challenges and opportunities that remain in the AIoMT field. Finally, Section VI concludes this work.
The key content of this paper is illustrated in Figure 1 for convenience.

A. AIoMT Architecture
This subsection discusses the architecture for healthcare AIoT applications. In this work, we adapt a three-layer computational hierarchy as shown in Fig. 2. We also consider the application layer, which includes a variety of healthcare domains that will utilize AIoT.
As shown in Fig. 2, the device computing layer is comprised of devices that interact directly with the user. Perception devices include wearable devices that measure health parameters from the user, as well as devices such as phones and tablets that are often equipped with simple sensors for fundamental health monitoring. Devices for monitoring the environment and a patient's compliance with treatment plans also fall within perception devices. On the other hand, actuating devices are those that act to assist a person, for example, a robot that aids a person in rising from bed, or a smartphone application that alerts a diabetic person to low blood-glucose levels. Actuators would typically exist in systems with multiple perception devices. As most devices have very little computational power, only extremely lightweight AI algorithms can typically be deployed at this level; more computationally intensive work needs to be offloaded to higher levels at the cost of latency.
At the edge computing layer, an increase in computational power is provided by higher-power devices such as phones, tablets, computers, and gateways. The term 'edge' is used with some inconsistency in the literature, in some instances referring to peripheral sensing devices and in other cases referring to a device (such as a phone or gateway) that is connected to said peripherals. For clarity, we distinguish between 'embedded' and 'edge' layer devices in this work. However, it is worth noting that certain devices such as phones can exist in either the edge or device layer; some are able to directly sense health data from a user, while others can only provide supporting computational power for peripheral devices.
The edge layer is not essential for all applications; in many IoT networks, embedded layer devices will communicate directly with cloud layer devices using long-range communications such as NB-IoT or LoRa. The cloud computing layer provides significant computational storage and processing power to support information sharing and applications with higher data loads.
AI algorithms can be implemented on one or more layers of the model for healthcare AIoT shown in Fig. 2. Traditional IoT network structures focused on conducting machine learning at the cloud computing layer, with lower-level devices focusing on gathering data and applying simple pre-processing techniques. However, much recent research has focused on developing lightweight edge and embedded AI algorithms [22], [23] for direct implementation on lower-powered devices, as well as investigating AI implementations on enabling hardware technologies such as field-programmable gate array (FPGA) and complementary metal oxide semiconductor (CMOS) devices [24], [25].
Another approach that has been the subject of research interest is cross-layer AI, where lightweight AI algorithms are employed on the device or edge layer for tasks such as data quality assessment, preprocessing, and intelligent offloading scheduling [23], [26]. More traditional AI algorithms can then be implemented at the cloud computing layer. This approach can have several advantages, including decreased latency and improved handling of data fusion from multiple devices or sensors, whilst also maintaining good accuracy [27].
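As a rough illustration of the cross-layer idea described above, the following Python sketch shows a device-layer routine that performs a cheap data quality check and then decides where a window of samples should be processed. It is a minimal sketch only: the function names, thresholds (`min_std`, `max_abs`), and the `edge_budget` policy are hypothetical and are not taken from the surveyed works.

```python
from statistics import pstdev


def signal_quality_ok(window, min_std=0.01, max_abs=5.0):
    """Cheap device-layer check: reject empty, flat-lined, or out-of-range windows."""
    if not window:
        return False
    if pstdev(window) < min_std:               # flat line, e.g. sensor detached
        return False
    if max(abs(x) for x in window) > max_abs:  # implausible amplitude, e.g. motion artefact
        return False
    return True


def choose_layer(window, edge_budget=256):
    """Hypothetical offloading policy based on data quality and window size."""
    if not signal_quality_ok(window):
        return "discard"   # avoid spending energy transmitting unusable data
    if len(window) <= edge_budget:
        return "edge"      # small enough for a lightweight local model
    return "cloud"         # offload heavier work, accepting extra latency


print(choose_layer([0.1, -0.2, 0.3, -0.1] * 10))   # short, clean window
print(choose_layer([0.1, -0.2, 0.3, -0.1] * 100))  # long, clean window
print(choose_layer([0.0] * 100))                   # flat-lined window
```

In a practical system the quality check and the offloading policy would likely be learned or tuned per sensor; the fixed thresholds here only stand in for that logic.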
The remainder of this section investigates recent advancements in the technologies that form the healthcare AIoT architecture, namely sensors and devices, communications, computing resources, and artificial intelligence.

B. Sensors and Devices
Sensors and devices for healthcare applications are many and varied, and thus specific healthcare sensor applications and types have been the topic of entire reviews [28], [29], [30]. In this work, we briefly overview commonly used healthcare sensors and devices in AIoT research, as well as identifying emerging technologies that offer significant potential to the future of the AIoT field.
1) Sensors: Wearable sensors remain the key enabling technology for AIoT research due to a myriad of advantages including low form factor, user comfort, and ease of deployment. Two common sensors for cardio-respiratory health monitoring are photoplethysmogram (PPG) and electrocardiogram (ECG) sensors. PPG sensors are routinely used to measure the vital signs of heart rate and blood-oxygen saturation. PPG involves directing light into an artery, where some of the light is absorbed by the blood. The remaining light is reflected back or passed through the artery, with a photodiode or similar sensor used to measure this non-absorbed light and thus capture heart activity waveforms and vital sign information. Meanwhile, ECG uses one or more electrodes to record the electrical activity of the heart, with the recorded signals commonly used to assess cardiac health in detail. Both PPG and ECG are available as wearable sensors, and thus recent research has investigated their use for a myriad of tasks including blood pressure monitoring [31], [32], respiratory rate measurement [33], cardiac abnormality identification [34], cardiac event prediction [35], and respiratory illness detection. ECG and PPG waveforms have also been used to assess stress, fatigue, and depression. Both PPG and ECG sensors are also readily available within or as add-ons for commercial research and fitness devices, including Empatica EmbracePlus [36] and Apple Watches [37].
These factors have led to the use of ECG and PPG for general and specific health monitoring remaining an active field of research, and AI is increasingly prevalent in this domain.
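To make the link between a PPG waveform and heart rate concrete, the sketch below estimates heart rate from peak-to-peak intervals. It is a simplified illustration under strong assumptions: the input is a clean synthetic sine wave rather than real PPG data, the sampling rate and the half-of-maximum threshold are arbitrary choices, and real pipelines would require filtering and motion artefact handling that are omitted here.

```python
import math


def detect_peaks(signal, threshold):
    """Return indices of local maxima that exceed the threshold."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]


def heart_rate_bpm(signal, fs):
    """Estimate heart rate from the mean interval between successive peaks."""
    peaks = detect_peaks(signal, threshold=0.5 * max(signal))
    if len(peaks) < 2:
        return None  # not enough beats in the window to estimate a rate
    intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    return 60.0 / (sum(intervals) / len(intervals))


# Synthetic stand-in for a PPG trace: 1.2 Hz pulse (72 beats per minute),
# sampled at an assumed 50 Hz for 10 seconds.
fs = 50
ppg = [math.sin(2 * math.pi * 1.2 * n / fs) for n in range(10 * fs)]
print(round(heart_rate_bpm(ppg, fs), 1))
```

The estimate lands close to, but not exactly on, 72 beats per minute because peak locations are quantised to the sampling grid.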
Aside from ECG and PPG, there are several other sensors in the literature for the measurement of cardiovascular health parameters. Numerous studies have investigated the use of electromechanical sensors, such as high-sensitivity pressure and strain sensors, to measure the key parameters of heart rate [38], [39] and blood pressure [38], [39], [40]. Such sensors are commonly based on materials that change in electrical resistance or capacitance in response to applied pressure or strain. Further works have also explored the use of electroacoustic sensors for collecting audio-based heart activity waveforms that can be utilised to calculate heart rate and related parameters [41]. The key focus in recent literature has been making these sensors more wearable through use of novel materials and manufacturing techniques, so substantial opportunity remains in developing machine learning techniques for interpreting the data produced by these sensors.
Recent studies have also explored many methods for respiratory monitoring outside of ECG and PPG. Devices such as strain sensors and inertial measurement units (IMUs) have been used to quantify changes in respiratory pressure during breathing in several studies [42], [43]. Strain sensors capture movement of the chest wall via changes in resistance or capacitance of the sensing material during inhalation and exhalation cycles, while IMUs use their inbuilt accelerometers and gyroscopes to gather rich information about chest wall movement which can be used to extract respiratory information. Mechano-acoustic sensors have also been increasingly explored [44], [45] to capture respiratory information from respiration sounds. The use of electrodes affixed to the chest to measure changes in thoracic impedance during the respiratory cycle has also shown promise [46]. There is significant potential for artificial intelligence to be utilized with each of these sensor types for tasks such as measurement of respiratory rate, monitoring of respiratory symptoms, and diagnosis and management of respiratory ailments.
The sensors discussed so far provide monitoring for four of the five vital signs: heart rate, blood-oxygen saturation, blood pressure, and respiratory rate. The fifth vital sign is temperature. Generally speaking, temperature sensing is a largely solved problem; however, challenges remain in sensing core body temperature. The key challenge is ensuring reliable contact between sensor and skin to enable accurate readings. This has been the subject of several recent studies on body temperature sensing, which have focused on improving the degree and consistency of contact between sensor and body through developing temperature-sensitive fibres and films [47], [48], [49], [50], [51] that could be implemented into textiles and patch-based devices.
Aside from wearable sensors for monitoring vital signs, there has also been an increase in research into non-contact solutions. One method that has been gaining interest for measuring cardiorespiratory vital signs is remote PPG (rPPG), also known as non-contact PPG (ncPPG) or imaging PPG (iPPG). Standard PPG involves directing a light source into an artery, and measuring the amount of light that is reflected back. A similar principle is used by rPPG, which employs image processing techniques to record light reflections in spectra of interest from video footage of a patient or user, with this then converted to heart activity waveforms. Signals obtained from rPPG can thus be used for many of the same applications as standard PPG, and recent research has explored its use for estimating heart rate [52], blood pressure [53], respiratory rate [54], and blood-oxygen saturation [55]. There is strong potential for computer vision artificial intelligence techniques to be utilized in this domain to better extract rPPG waveforms from video and subsequently utilize the waveforms to make measurements and predictions.
Measurement of vital signs is just one domain for wearable and non-contact healthcare sensors. Much research is also being conducted on the development and usage of sensors for monitoring of specific health parameters relevant to various conditions. One prevalent topic is the measurement of blood glucose levels with non-invasive sensors based on sweat [56] and interstitial fluid [57], [58], which would offer significant advantages to people living with diabetes and related conditions. These sensors are electrochemical sensors that typically utilize an enzyme as a receptor for the molecule or compound of interest. The chemical reaction is converted into a small electrical current, which can then be measured and utilised to quantify the presence and concentration of the target molecule or compound. While much research to date has focused on diabetes management, sweat sensors have also recently shown promise for monitoring of parameters such as lactate levels during exercise [59], blood-alcohol content [60], and electrolyte levels [61].
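The step from measured current to analyte concentration can be sketched as a simple calibration fit. The example below assumes, purely for illustration, that the sensor response is linear over the range of interest; the calibration points are made-up values and do not come from any cited sensor, and real devices may need non-linear or temperature-compensated calibration.

```python
def fit_linear_calibration(currents_na, concentrations_mm):
    """Least-squares fit of concentration = slope * current + intercept."""
    n = len(currents_na)
    mean_x = sum(currents_na) / n
    mean_y = sum(concentrations_mm) / n
    sxx = sum((x - mean_x) ** 2 for x in currents_na)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(currents_na, concentrations_mm))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_concentration(current_na, slope, intercept):
    """Convert a measured current (nA) to an estimated concentration (mM)."""
    return slope * current_na + intercept


# Hypothetical calibration points: sensor current (nA) at known concentrations (mM).
cal_currents = [10.0, 20.0, 30.0, 40.0]
cal_concentrations = [2.0, 4.0, 6.0, 8.0]

slope, intercept = fit_linear_calibration(cal_currents, cal_concentrations)
print(estimate_concentration(25.0, slope, intercept))
```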
Another prevalent topic in the literature is the use of electroencephalogram (EEG) data to detect and diagnose various neurological conditions, as well as identifying and predicting neurological events. EEG sensors utilise electrodes placed across the scalp to monitor the electrical activity of the brain. There are many EEG sensors commercially available, including Emotiv EpocX [62] and NeuroSky devices [63]. EEGs are commonly used in clinical settings; however, the use of many electrodes placed across the scalp does not enable long-term wear. Some research has thus aimed to improve the wearability of EEG. Several studies have utilized single-lead EEGs for diagnosing and monitoring conditions such as sleep apnea [64], [65], epilepsy [66], mild cognitive impairment (an early sign of dementia) [67], insomnia [68] and more.
There is also research interest in non-EEG solutions for the detection and prediction of seizure events. Several works have identified PPG [69], [70] and surface electromyography (sEMG) [71], [72] as candidates for seizure identification. The use of accelerometers and IMUs [73], [74] is also prevalent in the literature, where researchers aim to use information about movements to identify, classify, or predict seizure events. The complexity of using sensor data describing movement to identify seizure events has led recent researchers to utilize machine learning to ensure accurate results [73], [74].
Accelerometers and IMUs are also widely used in other healthcare monitoring applications. A prevalent topic in the literature is the use of accelerometers for fall detection [75], [76], as falls are a significant risk for older persons and those living with motor and mobility disorders. Accelerometers have also been explored for identification of freezing-of-gait (FoG) events in Parkinson's disease and related disorders [77], as well as general gait pattern recognition for purposes such as rehabilitation [78], identification of multiple sclerosis [79], and general fitness management [80]. Accelerometers have also been used in monitoring disease progression and symptoms for those with Parkinson's disease [26] and multiple sclerosis [81]. In many accelerometer-based studies, artificial intelligence has been used to process data and make meaningful predictions [76], [77], [78], [80].
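As a point of reference for accelerometer-based fall detection, the sketch below implements a simple rule-based baseline: an impact spike in acceleration magnitude followed by a period of near-stillness. This is not the method of any cited study, which typically rely on machine learning; the thresholds (`impact_g`, `still_tol_g`, `still_seconds`) and the synthetic traces are illustrative assumptions.

```python
import math

G = 9.81  # standard gravity in m/s^2


def magnitude(sample):
    """Euclidean norm of a tri-axial accelerometer sample (m/s^2)."""
    ax, ay, az = sample
    return math.sqrt(ax * ax + ay * ay + az * az)


def detect_fall(samples, fs, impact_g=2.5, still_tol_g=0.2, still_seconds=1.0):
    """Flag an impact spike that is followed by a full period of near-stillness."""
    still_len = int(still_seconds * fs)
    mags = [magnitude(s) / G for s in samples]
    for i, m in enumerate(mags):
        if m >= impact_g:
            after = mags[i + 1:i + 1 + still_len]
            # Stillness means the magnitude stays close to 1 g (gravity only).
            if len(after) == still_len and all(
                abs(a - 1.0) <= still_tol_g for a in after
            ):
                return True
    return False


# Synthetic traces at an assumed 50 Hz sampling rate.
standing = [(0.0, 0.0, 9.81)] * 50
fall = standing + [(0.0, 0.0, 30.0)] + [(9.81, 0.0, 0.0)] * 60
print(detect_fall(fall, fs=50), detect_fall(standing * 3, fs=50))
```

Rules of this kind are easy to deploy on low-power devices but tend to confuse falls with vigorous activity, which is one motivation for the learned models discussed above.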
Virtual reality (VR) and extended reality (ER) devices have also been shown to offer much promise in the AIoMT domain. Several studies have shown that VR can reduce anxiety and stress during medical treatment, from routine dental care [82] to cancer treatment [83]. VR has similarly been shown to reduce anxiety in persons living with dementia in aged care facilities [22], and there is some evidence that it may be useful in treatment of medical phobias [84] and phantom limb pain [85]. Aside from treatment uses, there is also evidence that VR approaches can be combined with various sensors to aid in the diagnosis of developmental disorders, including attention deficit hyperactivity disorder (ADHD) [86] and autism spectrum disorder (ASD) [87], as well as neurological disorders such as dementia [88].
Overall, the literature is rich with sensors and devices suitable for monitoring and assessing a wide range of health parameters and conditions. As this topic is extremely broad, this section serves to overview the key wearable and non-contact sensing technologies that are foundational to AIoT healthcare systems.

C. Communications
Communications technologies for IoT applications are many and varied, and have thus been the topic of many high-quality and focused reviews [89], [90]. In this section, we provide an introductory overview of prevalent and upcoming communications technologies in the unlicensed and licensed bands, as well as briefly introducing IoT-over-satellite, which can operate in both types of bands. We categorise current and emerging communications technologies for IoT-based healthcare broadly into three categories: unlicensed-band terrestrial, licensed-band terrestrial, and satellite-based communications.
1) Unlicensed-Band: Many communications technologies operate in the industrial, scientific and medical (ISM) radio frequencies. In terms of long-range communications, the most prevalent unlicensed-band standards are LoRaWAN (Long Range Wide Area Network) [91] and Sigfox [92].
LoRaWAN is a medium access control (MAC) protocol built upon the Long Range (LoRa) physical layer protocol, which uses chirp spread spectrum (CSS) modulation over a bandwidth of at least 125 kHz to minimise the impacts of interference due to high traffic in the unlicensed bands. Multiple access is managed using an ALOHA type of protocol [91]. LoRaWAN operates in a star-of-stars topology [91], with each gateway having a range of approximately 5 km in urban areas [93]. A single gateway can support approximately 40,000 nodes, with each node assigned a unique 64-bit Extended Unique Identifier (EUI-64) key for addressing [94]. It operates in the unlicensed bands of 868 MHz in Europe and 915 MHz in the U.S., with a high network capacity, data rates of 0.25 to 5.5 kbps [94], and a maximum payload size of 243 bytes [93]. LoRaWAN can be accessed in several ways; users can pay for access to networks owned and operated by LoRaWAN Public Network Operators, or can develop and maintain their own LoRaWAN Private Network [95]. The flexibility and accessibility of LoRaWAN networks have made it a popular choice in recent research in the healthcare IoT domain [96], [97].
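The quoted data rate range can be related to the LoRa physical layer through the commonly used nominal bit rate expression Rb = SF x (BW / 2^SF) x CR, where SF is the spreading factor, BW the bandwidth, and CR the coding rate. The sketch below evaluates it for a 125 kHz channel. The choice of spreading factors 7 to 12 and a coding rate of 4/5 is an assumption made here for illustration and is not stated in the text above.

```python
def lora_bit_rate(sf, bw_hz=125_000, coding_rate=4 / 5):
    """Nominal LoRa bit rate in bit/s: SF * (BW / 2**SF) * CR."""
    return sf * (bw_hz / 2 ** sf) * coding_rate


# Assumed spreading factors 7..12 on a 125 kHz channel with coding rate 4/5.
for sf in range(7, 13):
    print(f"SF{sf}: {lora_bit_rate(sf) / 1000:.2f} kbps")
```

Under these assumptions the nominal rate spans roughly 0.29 to 5.47 kbps, which is of the same order as the range cited above; higher spreading factors trade data rate for range and robustness.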
Sigfox also operates in the 868 MHz and 915 MHz bands. It operates in star topology, and achieves a range of approximately 10 km in urban areas [93]. Sigfox nodes are typically designed to be uplink only, limited to 140 messages per day with a maximum payload size of 12 bytes. Downlink messages can be requested 4 times per day [92]. Sigfox has good coverage in much of western Europe, with limited coverage available in many other countries [92]. The Sigfox MAC layer relies on random frequency time division multiple access (RFTDMA), with addressing performed using a unique 32-bit device ID [98]. Sigfox uses ultra narrow band modulation using differential binary phase shift keying (D-BPSK) to support power efficiency and reduce device cost [98]. Despite having several compatible healthcare devices listed on their website [99], recent research has not focused on Sigfox as an enabling technology for healthcare. Additionally, Sigfox faced significant financial difficulties in 2022 and filed for bankruptcy before being acquired by UnaBiz [100]. For these reasons, it is suggested that LoRaWAN is currently a more suitable open-standard long-range communications technology for the healthcare space.
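To give a sense of what the 12-byte payload and 140-message daily limit mean for a health monitoring device, the sketch below packs a set of vital sign readings into exactly 12 bytes using Python's standard `struct` module and computes the resulting reporting budget. The field layout is entirely hypothetical and is not a Sigfox or vendor-defined format.

```python
import struct

SIGFOX_MAX_PAYLOAD = 12   # bytes per uplink message, as quoted above
SIGFOX_MAX_UPLINKS = 140  # uplink messages per day, as quoted above

# Hypothetical big-endian layout: timestamp (uint32), step count (uint16),
# heart rate in bpm (uint8), SpO2 in % (uint8),
# temperature in 0.01 degC (int16), battery in mV (uint16).
FORMAT = ">IHBBhH"


def pack_vitals(timestamp, steps, hr_bpm, spo2_pct, temp_c, battery_mv):
    """Pack one set of readings into a single uplink payload."""
    payload = struct.pack(FORMAT, timestamp, steps, hr_bpm, spo2_pct,
                          round(temp_c * 100), battery_mv)
    if len(payload) > SIGFOX_MAX_PAYLOAD:
        raise ValueError("payload exceeds the 12-byte limit")
    return payload


def unpack_vitals(payload):
    """Recover the readings, converting temperature back to degC."""
    ts, steps, hr, spo2, temp_centi, batt = struct.unpack(FORMAT, payload)
    return ts, steps, hr, spo2, temp_centi / 100, batt


payload = pack_vitals(1_700_000_000, 4200, 72, 98, 36.75, 3700)
print(len(payload), unpack_vitals(payload))
print("daily budget (bytes):", SIGFOX_MAX_PAYLOAD * SIGFOX_MAX_UPLINKS)
print("min mean interval (s):", round(24 * 3600 / SIGFOX_MAX_UPLINKS, 1))
```

With these limits, a device can report at most about once every ten minutes on average, carrying 1680 bytes per day in total, which suits periodic summaries rather than raw waveforms.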
In terms of short-range communications, Bluetooth [101] and Wi-Fi [102] continue to dominate the healthcare IoT space. Bluetooth Low Energy (BLE) operates on the ISM 2.4 GHz band using frequency hopping spread spectrum (FHSS) to minimise interference with other standards on this band. Modulation is performed using Gaussian frequency shift keying (GFSK) [101]. BLE networks can be configured in either star or mesh topology, and range is typically short, from 25 to 125 m [103]. Typical data rates range from 800 to 1400 kbps [104]. BLE devices are identified using a 48-bit unique address, with security supported by the four available pairing modes and encryption [101], [104]. These features have made it attractive for short-range healthcare applications in research, including contact tracing [105] and emergency room triage [106].
Wi-Fi has several standards with potential to be leveraged in healthcare IoT. In terms of classic Wi-Fi, the latest generation in use at time of writing is Wi-Fi 6E (802.11ax), with Wi-Fi 7 (802.11be) currently in development. Each of these standards is designed for operation in the 2.4, 5, and 6 GHz open bands. Wi-Fi 6 can achieve data rates of up to 9.6 Gbps [102], with Wi-Fi 7 expected to exceed 40 Gbps [107]. The Wi-Fi 6 standard utilizes 1024 quadrature amplitude modulation (1024-QAM) [102], with Wi-Fi 7 expected to utilise 4096-QAM (also called 4K-QAM) [107]. Wi-Fi 6 utilises orthogonal frequency division multiple access (OFDMA) to manage multiple devices, and MAC addressing follows the EUI-48 format [102]. The most prevalent use for these classic Wi-Fi standards is in connecting devices to a local network for delivery to the cloud or a health service directly [108]. Another standard of interest is Wi-Fi HaLow (802.11ah), which operates between 750 MHz and 928 MHz and offers a longer range of up to 1 km. It supports modulation techniques including BPSK, quadrature phase shift keying (QPSK), and up to 256-QAM [109]. Although HaLow has been designed for IoT, there has not been widespread adoption or real-life validation. Thus, Wi-Fi HaLow has potential suitability for healthcare, but further validation is required.
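The modulation orders named above can be compared by the number of bits each symbol carries, which is log2(M) for an M-point constellation. The short sketch below computes this for the schemes mentioned in this paragraph. It deliberately ignores coding rate, channel width, and spatial streams, so it explains only one part of the difference in headline data rates.

```python
def bits_per_symbol(order):
    """Bits carried by one symbol of an M-ary constellation (M a power of two)."""
    if order < 2 or order & (order - 1):
        raise ValueError("constellation order must be a power of two >= 2")
    return order.bit_length() - 1  # equals log2(order) for powers of two


# Constellation sizes for the schemes named in the text.
schemes = {"BPSK": 2, "QPSK": 4, "256-QAM": 256, "1024-QAM": 1024, "4096-QAM": 4096}
for name, order in schemes.items():
    print(f"{name}: {bits_per_symbol(order)} bits/symbol")

# Relative gain per symbol from moving from 1024-QAM to 4096-QAM.
gain = bits_per_symbol(4096) / bits_per_symbol(1024)
print(f"4096-QAM vs 1024-QAM: {gain:.1f}x bits per symbol")
```

The step from 1024-QAM to 4096-QAM therefore adds 20% more bits per symbol, at the cost of requiring a higher signal-to-noise ratio to decode reliably.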
A final open-band technology worth noting is radio frequency identification (RFID) technology, particularly the ultra-short range near field communication (NFC) standard. NFC is a device-to-device communication standard that operates on the 13.56 MHz band using amplitude shift keying (ASK) modulation. It typically achieves a range of less than 2 cm and a data rate of up to 424 kbps [110]. NFC enables lower-powered device design, as NFC readers provide the power needed for an NFC tag to respond. This means that NFC-enabled health monitoring devices only need to have sufficient power for any on-board sensors, enabling small stick-on wearables [111] and implantables [112]. As NFC is built into many smartphones, apps can be readily designed to interface with NFC-enabled devices and objects, supporting at-home healthcare applications [113].
2) Licensed-Band: Licensed-band communications technologies are primarily cellular standards, with networks operated by telecommunications companies. Key communications technologies of interest are narrowband IoT (NB-IoT), 5G, 5G New Radio Reduced Capability (RedCap) and the upcoming 6G. In the context of healthcare, 5G offers attractive features such as high security, high data rates, strong reliability, and low latency [89]. 5G operates in various closed frequency bands, with the majority of commercial operators using spectrum in the range of 3.3-4.2 GHz. However, sub-GHz and millimetre wave frequencies of 26 GHz and 40 GHz are also available [114]. Modulation and access control vary between 5G networks; however, QAM and orthogonal frequency division multiplexing (OFDM) schemes are common. Large-scale multiple access typically uses OFDMA; however, non-orthogonal multiple access (NOMA) schemes have been proposed to further increase network capacity for 5G and future 6G networks [89], [115]. Common uses for 5G-IoT in healthcare include serving as a backbone for connectivity to cloud or health facilities [108] and telehealth supported by wearable devices and video services [116].
The upcoming 6G standard promises to reduce latency and increase reliability further, as well as supporting a greater number of connections. 6G will also feature increased data rates to enable faster communication of larger data formats, such as livestream video [115]. The benefits of these improvements in the healthcare IoT domain are clear, as high reliability and low latency are vitally important for time-critical health emergencies. In terms of spectrum, 6G is expected to operate in the same closed bands as 5G, while also expanding into terahertz (THz) frequency bands including optical [115]. Operation in the THz bands will enable higher data rates at short range, which would have significant benefits for applications such as telesurgery and health robotics. Other applications for 6G-IoT in healthcare include rapid-response unmanned aerial vehicles (UAV), remote diagnosis and treatment of illness, and enhanced connectivity between first responders and hospitals during emergencies [115].
NB-IoT is another key cellular technology. It differs from 5G and 6G in that it has been designed specifically for IoT applications, prioritising range, signal penetration, and low power usage [117]. It serves as a competitor to open standards like LoRaWAN and Sigfox. NB-IoT operates in the licensed Long-Term Evolution (LTE) and Global System for Mobile Communications (GSM) bands, meaning that it can coexist with widely deployed 3G and 4G technologies [114]. This is a significant advantage in terms of coverage, as 3G in particular is widely deployed. NB-IoT features a data rate of 26 kbps in downlink and 66 kbps in uplink, with a comparatively high latency of 1 s [118]. NB-IoT utilises QPSK or 16-QAM modulation, with multiple access managed using OFDMA for downlink and single carrier frequency division multiple access (SC-FDMA) for uplink [90]. Given its low data rates, NB-IoT is best suited to smaller data packets from wearable healthcare devices, as opposed to data types such as imagery or video.
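To make the packet-size argument concrete, the following back-of-the-envelope sketch estimates the ideal uplink transfer time at the 66 kbps rate quoted above. The two payload sizes (a 200-byte vital-sign packet and a 2 MB image) are illustrative assumptions rather than figures from the cited works, and protocol overhead, retransmissions, and the roughly 1 s latency are ignored, so real transfers would be slower.

```python
UPLINK_RATE_BPS = 66_000  # NB-IoT uplink rate quoted in the text (66 kbps)


def ideal_transfer_time_s(payload_bytes: int, rate_bps: float = UPLINK_RATE_BPS) -> float:
    """Lower bound on transfer time: payload bits divided by the link rate."""
    return payload_bytes * 8 / rate_bps


# Hypothetical payload sizes, chosen only for illustration.
vital_sign_packet = 200       # bytes
medical_image = 2_000_000     # bytes (2 MB)

packet_time = ideal_transfer_time_s(vital_sign_packet)
image_time = ideal_transfer_time_s(medical_image)

print(f"vital-sign packet: {packet_time:.3f} s")
print(f"2 MB image:        {image_time:.1f} s")
```

Under these assumptions the small packet needs a few hundredths of a second of airtime, whereas the image needs several minutes, which is consistent with reserving NB-IoT for compact sensor readings.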
An upcoming technology for small-to-medium data packets is RedCap, which has been primarily developed for low-power devices such as wearables. It offers lower latency and higher reliability than NB-IoT [119], and is capable of delivering larger packets such as low-resolution video. RedCap devices will be required to support 64-QAM and will utilise existing 5G access methods [120]. As deployment of RedCap begins, the key limitation will be a lack of coverage; 4G remains much more broadly deployed than 5G. However, for those with access to it, RedCap-enabled devices will offer significant advantages over their NB-IoT predecessors.
3) Satellite: Another communications technology of interest in modern healthcare IoT applications is satellite. Satellite communications operate in a wide spectrum of frequencies, from 1-40 GHz [121]. This spectrum overlaps with several open-band frequencies, but predominantly includes licensed bands. The key advantage of satellite communications in healthcare is simple: it enables connectivity in rural areas that are not covered by cellular or other communications standards, thus supporting health monitoring of persons in hard-to-reach locations [122].
A high-coverage constellation of low-Earth orbit (LEO) satellites is ideal for health applications, as these constellations provide the lowest latency of all satellite types [123]. However, LEO satellite communications still have high latency and power consumption compared to cellular and open-band standards; thus, this technology is only recommended for use in geographical areas that have no alternative coverage, or for applications that are not time-critical.

D. Machine Learning Algorithms
The final building block of the AIoMT architecture is artificial intelligence. Machine learning (ML) algorithms are many and varied, and many recent works hybridize or slightly modify prominent algorithms. In this section, we broadly overview ML algorithms that have been utilized in recent healthcare AI and AIoT literature, before briefly overviewing key metrics for assessing and comparing AI models in different contexts.
1) Convolutional Neural Networks: Convolutional neural networks (CNNs) are a group of machine learning algorithms that are particularly prominent in image recognition, including in the medical domain [124]. They are prevalent in healthcare due to their strong ability to identify features, regardless of the location or orientation of the objects or features of interest within an image or waveform. CNNs achieve this by stepping through the image piece by piece, inspecting one small segment at a time and using convolutional operations to extract feature representations. Deeper CNN structures effectively perform feature extraction on the lower-level feature representations, enabling them to learn increasingly complex features.
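The stepping operation described above can be sketched for a one-dimensional waveform as follows. This is a minimal illustration rather than a trained CNN: the kernel values and the toy waveforms are hand-picked assumptions, and only a single filter with no pooling or non-linearity is shown. It demonstrates the location property only (the same feature yields the same response wherever it occurs); orientation is not addressed here.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` one step at a time ('valid' mode, stride 1).

    As in most deep learning libraries, this is cross-correlation: the kernel
    is not flipped before the element-wise multiply-and-sum.
    """
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Hypothetical hand-picked kernel that responds to a sharp peak.
peak_kernel = [-1.0, 2.0, -1.0]

# Two toy waveforms containing the same peak at different positions.
early_peak = [0, 0, 1, 0, 0, 0, 0, 0]
late_peak = [0, 0, 0, 0, 0, 1, 0, 0]

early_map = conv1d_valid(early_peak, peak_kernel)
late_map = conv1d_valid(late_peak, peak_kernel)

# The maximum response is identical; only its position in the feature map moves.
print(max(early_map), early_map.index(max(early_map)))
print(max(late_map), late_map.index(max(late_map)))
```

In a real CNN the kernel values are learned during training, and many such kernels are stacked in layers so that later layers operate on the feature maps of earlier ones.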
Due to their ability to identify features in images, they have commonly been used in medical imaging tasks such as neurodevelopment prediction [125], brain tumour identification [126], and COVID detection [127]. However, CNNs and hybrid CNN models have also found use in processing medical waveforms such as PPG and ECG [31]. Hybrid CNN models have been found to be more successful than pure CNN models where there is a temporal dependency; for example, pure CNN has been found to underperform compared to models where CNN is hybridised with long short-term memory (LSTM) models in predicting blood pressure from time-series PPG waveforms [31], [128].
2) Recurrent Neural Networks: Recurrent neural networks (RNNs) are supervised learning algorithms that are commonly used to interpret sequential data, such as language or time-series waveforms. They have the ability to 'remember' what they have seen in the past by passing information from a previous time step forward to the next time step. However, classic RNNs still struggle to understand dependencies between features that are a long way apart, leading them to be largely replaced by long short-term memory (LSTM) networks.
An LSTM network is an advanced RNN that sweeps through a sequence of data one item at a time, using several gates to determine which information to remember and what to forget between steps. As they are designed for sequential data analysis, LSTM models have been used to interpret time-series and other data in many healthcare applications including emotion recognition [129], cardiac arrhythmia detection [130], and epilepsy detection [131]. In one study, both RNN and LSTM elements are used to predict depression risk from ECG waveforms [132]. Interestingly, LSTMs have been found to underperform in scenarios where raw time-series data is converted into other formats. One example is seen in [133], where EEG signals were represented in other formats prior to being analysed by LSTM and SVM models; in this scenario, SVM achieved close to 100% accuracy while LSTM never exceeded 80% accuracy. Similarly, in a recent study, an LSTM applied to time-series vital sign measurements (as opposed to raw sensor data streams) was shown to underperform in predicting the onset of COVID-19 [134].
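The gating behaviour described at the start of this subsection can be illustrated with a single scalar LSTM cell. The sketch below uses the standard textbook gate formulation; the parameter names and values are hypothetical and hand-set (not learned) so that the forget gate stays close to 1 and the input gate close to 0, which should make the cell retain its stored value regardless of the incoming samples.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell (1-D input, 1-D hidden and cell state)."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # keep part of the old memory, add part of the new
    h = o * math.tanh(c)     # expose a gated view of the memory
    return h, c


# Hypothetical hand-set parameters: forget gate near 1, input gate near 0.
params = {
    "wf": 0.0, "uf": 0.0, "bf": 10.0,
    "wi": 0.0, "ui": 0.0, "bi": -10.0,
    "wo": 0.0, "uo": 0.0, "bo": 10.0,
    "wg": 1.0, "ug": 0.0, "bg": 0.0,
}

h, c = 0.0, 1.0  # start with a stored memory value of 1.0
for x in [0.5, -2.0, 3.0, 0.1]:
    h, c = lstm_step(x, h, c, params)

print(round(c, 3))
```

In a trained network the gate parameters depend on the data, so the cell learns when to retain and when to overwrite its memory rather than always retaining it as in this toy configuration.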
Another common variation on classic RNN and LSTM networks involves passing the data through forwards and backwards, known as bidirectional usage. This allows the network to learn from past and future items in the sequence, which has been shown to improve performance when interpreting healthcare signals such as PPG and ECG [33], [135].
3) Support Vector Machines: Support vector machines (SVMs) are supervised machine learning models that are relatively low in complexity. The simplest SVM seeks to generate a line that can be drawn to separate data points belonging to two different classes to enable binary classification; the separating line is called a hyperplane. Where a 2-dimensional line is insufficient to divide the data accurately, it is possible to use three- or higher-dimensional SVMs, which seek to find a plane that can separate the two classes of data. SVMs can also be used for multiclass classification by finding hyperplanes between every one-versus-all combination of classes, or can be used for regression by treating the hyperplane as effectively a line of best fit. The ease with which SVMs can be implemented and understood has seen them widely used for healthcare applications, particularly where the input features are relatively simple. Recent research has explored the use of SVMs and their variants for a wide range of problems, including fatigue identification from heart rate variability parameters [136], COVID-19 diagnosis from laboratory test results [137] and Parkinson's disease diagnosis from voice features [138]. SVMs have also been utilised for human activity recognition [139], [140], where data from a variety of motion and activity sensors are fused to predict the onset of dementia; however, in both works SVM was found to severely underperform compared to other simple algorithms such as random forest. This suggests that SVM performs better on preprocessed features, and is less successful in sensor fusion tasks.
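The hyperplane decision rule and its one-versus-all extension can be sketched as follows. The weights and biases here are hypothetical, hand-picked values over two made-up features; a trained SVM would instead learn them by maximising the margin between classes, and that training step is not shown.

```python
def decision_value(x, w, b):
    """Signed score of point x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify_binary(x, w, b):
    """Binary rule: which side of the hyperplane the point falls on."""
    return 1 if decision_value(x, w, b) >= 0 else -1


def classify_one_vs_all(x, hyperplanes):
    """Multiclass rule: pick the class whose hyperplane gives the largest score."""
    return max(hyperplanes, key=lambda label: decision_value(x, *hyperplanes[label]))


# Hypothetical (weights, bias) pairs, one hyperplane per class.
hyperplanes = {
    "class_a": ([1.0, 0.0], 0.0),
    "class_b": ([-1.0, 1.0], 0.0),
    "class_c": ([-1.0, -1.0], 0.0),
}

print(classify_binary([2.0, 0.0], *hyperplanes["class_a"]))
print(classify_one_vs_all([2.0, 0.0], hyperplanes))
print(classify_one_vs_all([-1.0, 2.0], hyperplanes))
print(classify_one_vs_all([-1.0, -2.0], hyperplanes))
```

Each class has its own hyperplane separating it from all other classes, and the predicted label is the one whose hyperplane the point lies furthest on the positive side of.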

4) Random Forest Models:
Another common supervised learning algorithm in the healthcare domain is random forest (RF), most commonly used for classification problems. RF is an ensemble method, with RF models comprised of a large number of decision trees with different configurations, which are then trained to predict the desired output. Each individual decision tree makes a number of sequential decisions based on conditions applied to the features, before generating a final output. The overall output of the RF model is typically determined by majority vote: whichever output is selected by the majority of trees becomes the output of the whole forest. RF is therefore highly suited to applications such as choosing the correct diagnosis and identifying risk factors. Its simplicity to implement and interpret is its greatest advantage in the healthcare domain, and has seen RF used in many recent research papers seeking to differentiate between mental health conditions [141], diagnose breast cancer [142], and predict mortality in acute kidney injury patients [143]. While RF can perform well in many scenarios, it has also been found to severely underperform in interpreting sensor data for depression and anxiety assessment, achieving an accuracy of just 65.3% [144]. Similarly, RF showed underwhelming performance in predicting respiratory rate using complex features extracted from Wi-Fi channel state information, achieving an accuracy of just 79% [145]. This suggests that RF performs at its best where relatively simple data is used, and shows underwhelming performance when faced with more complex features.
5) Autoencoder Models: Autoencoders are a group of self-supervised machine learning models that have recently garnered much attention, primarily for reconstructing image data. They comprise two stages: encoder and decoder. The encoder stage compresses the input data into a lower-dimensionality representation of the original data, while the decoder aims to reconstruct the compressed data back to its original state. Autoencoders are typically trained on high-resolution and clean data, and the trained model can then be used to enhance data that is low-resolution, noisy, or in greyscale. Additionally, they can be used for abnormality detection, as the reconstructed output can be compared to the true input to identify any areas of difference. For this reason, autoencoders have been increasingly researched in the medical imaging space, in applications including denoising of retinal tomography images [146], resolution enhancement for MRI [147], and localisation of anomalies with regional enhancement for COVID-19 identification in chest X-rays [148]. However, at least one study has identified that novel transformer models outperform autoencoder-based models in diagnosing COVID-19 from chest X-rays [149], suggesting that autoencoders may not be the most suitable model for healthcare image processing in all situations.
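The abnormality-detection use of autoencoders, in which the reconstruction is compared to the true input, can be sketched as below. Only the comparison step is shown: the `reconstructed` values stand in for the output of a trained autoencoder, which is not implemented here, and both the toy 3x3 image and the threshold are hypothetical values chosen for illustration.

```python
def anomaly_mask(original, reconstructed, threshold):
    """Flag pixels whose absolute reconstruction error exceeds `threshold`."""
    return [
        [abs(o - r) > threshold for o, r in zip(o_row, r_row)]
        for o_row, r_row in zip(original, reconstructed)
    ]


# Toy input with one bright pixel that a model trained on normal data
# would be expected to reconstruct poorly.
original = [
    [0.10, 0.10, 0.10],
    [0.10, 0.90, 0.10],
    [0.10, 0.10, 0.10],
]

# Stand-in for the decoder output of a trained autoencoder (assumed values).
reconstructed = [
    [0.10, 0.12, 0.10],
    [0.10, 0.15, 0.10],
    [0.09, 0.10, 0.10],
]

mask = anomaly_mask(original, reconstructed, threshold=0.3)
flagged = [(r, c) for r, row in enumerate(mask) for c, hit in enumerate(row) if hit]
print(flagged)
```

Because the model is assumed to have learned only normal appearance, regions it cannot reproduce show a large residual, and the flagged coordinates localise the candidate anomaly.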

6) Attention Models:
A key weakness of the classic machine learning models discussed so far is that they struggle to interpret the broader context of an image or sequence; even LSTMs begin to 'forget' what they have seen in longer sequences. The concept of attention was introduced to address this problem for sequence-to-sequence (seq2seq) models, such as those translating one written language to another. Attention mechanisms generate encoding vectors that reflect how important other components of a sequence or image are with respect to a particular point in that sequence or image. Attention mechanisms therefore enable the machine learning model to better understand the context of an input by considering all or a subset of the inputs when tuning the weights for the output, known as global and local attention, respectively.
Attention is a mechanism rather than a standalone model, and thus it can be applied to classic models such as CNNs and LSTMs to improve their context awareness and thus overall performance. This has been demonstrated in healthcare applications such as dementia detection from magnetic resonance spectroscopy [150], abnormal ECG classification [151], and video-based heart rate extraction [152]. However, a recent study investigated the performance of attention mechanisms in the prediction of diabetes based on discrete health features, and found that the attention-based models did not align with general statistical analysis, and thus cautioned that these models had limitations in certain clinical settings [153].
Generally, attention is implemented as an additional encoder-decoder or encoder-only layer within such models. One disadvantage of classic attention models is that training and testing take longer than they would without attention. Another key disadvantage is that the inputs are only considered with respect to the output, rather than an input being able to interact with all others and thus self-determine which other inputs are important to it. Both of these problems have been addressed through the introduction of transformer and self-attention models, discussed in the following subsection.
7) Transformer Models: Transformer models are a recent advancement in attention-based models, introducing the novel self-attention mechanism which enables direct interaction between inputs, such that the model can learn what attention should be given to each input by all other inputs and itself. This is done via a series of encoding vectors that represent the importances of inputs to one another. These are ultimately aggregated and fed through a decoder to produce an output. Transformer models can replace RNNs and CNNs entirely, unlike classic attention mechanisms. Their shallower architecture leads to significantly faster training times compared to classic attention models, a key advantage in domains such as natural language processing (NLP) and computer vision. Transformers are also typically trained using self-supervised learning followed by supervised fine-tuning; this is a significant advantage, as less manually labelled data is required for them to succeed.
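The interaction between inputs that self-attention provides can be sketched with scaled dot-product attention. This is a deliberately simplified illustration: the learned query, key, and value projections, multiple heads, and positional encodings used in full transformer models are omitted (each input acts as its own query, key, and value), and the three two-dimensional tokens are made-up values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(inputs):
    """Scaled dot-product self-attention with identity projections (Q = K = V)."""
    d = len(inputs[0])
    weights = []
    for q in inputs:
        # Similarity of this input to every input, including itself.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in inputs]
        weights.append(softmax(scores))
    # Each output is an importance-weighted mixture of all inputs.
    outputs = [
        [sum(w * v[j] for w, v in zip(row, inputs)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical token embeddings: the first two are similar, the third differs.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(tokens)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` describes how strongly one input attends to every input; in this toy case the first token gives more weight to the similar second token than to the dissimilar third.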
Transformers were originally developed for seq2seq problems such as label generation and report summarization, and in healthcare this has been leveraged for applications such as ECG abnormality identification and labelling [154] and automatic disease coding based on descriptive diagnostic text [155]. One limitation of classic transformers, however, is that the decoder produces one token at a time; it effectively is producing text from left-to-right and is unable to see items from the future of the output sequence.
This limitation has led to the development of the Bidirectional Encoder Representations from Transformers (BERT) pre-trained model [156], which utilises only the encoder component of the transformer architecture. The encoder is capable of seeing tokens from both past and future, thus enabling the model to learn from a more complete context. The BERT model is pre-trained for masked language model and next sentence prediction tasks using a document-level language corpus comprised of over 3 billion words. BERT has been used in recent literature to develop models for depression risk identification [157], epidemic surveillance [158] and interpretation of radiology reports [159].
The success of transformers for language models has led to an interest in these algorithms for other applications, particularly computer vision. Vision transformers treat images as a sequence of 'patches'. Recent works have explored the use of vision transformers in healthcare for radiology applications such as COVID-19 lung screening [160] and breast tumour malignancy assessment [161], with vision transformers found to routinely outperform CNNs where large datasets are available. Transformers also have the advantage of being readily explainable in the spatial domain, as the patch-based approach can be used to visually highlight the areas that contributed to the prediction [160].
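The patch-based view of an image can be sketched as follows. This shows only the first step of a vision transformer, namely cutting an image into non-overlapping patches and flattening each into a vector; the learned patch embedding, positional encoding, and attention layers that follow are not shown, and the 4x4 image and patch size are illustrative assumptions.

```python
def image_to_patches(image, patch_size):
    """Split a 2-D image (list of rows) into flattened, non-overlapping square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Toy 4x4 "image" whose pixel values are simply their row-major index.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = image_to_patches(image, patch_size=2)

for patch in patches:
    print(patch)
```

The resulting list of patch vectors plays the same role as a sequence of word tokens in a language transformer, which is also why attention weights can be mapped back onto image regions for explanation.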
Several recent works have also explored vision-language transformers, which use vision transformer encoders and language transformer decoders to generate text reports based on an image. In healthcare, vision-language transformers have primarily been explored for generating radiology reports [162] and describing pathology images [163]. Transformers are generally well regarded in the literature for image processing and are currently a hot topic, but they are not a guaranteed solution. At least one recent study has shown that transformer-based models can still underperform with image-based data, with a transformer composite model achieving only 60% accuracy in identifying COVID-19 in lung ultrasound images [164]. This indicates that the quality and quantity of the data is still essential in developing high-performing transformer models.
Transformer models are an interesting development for the AIoT. They require large quantities of data to train successfully, meaning they are naturally synergistic with IoT technology. Due to their strong performance with analysing sequential data, it is highly likely that transformers could be broadly used to interpret time-series data from healthcare IoT devices; however, this has not been widely investigated. The ability to train on unlabelled data is also highly advantageous in the AIoT context, as manually labelling such a significant volume of data is infeasible. Despite these key advantages, exploration of transformers for healthcare applications is in its infancy, and thus is a suggested direction for future research.

8) Key Metrics for AI Algorithms:
In subsequent sections, we discuss the performance of AI algorithms in the literature based on a range of key metrics. For binary and multiclass classification, these metrics include fundamental statistical measures of accuracy, specificity, sensitivity, precision, and F1 score. Ability to discriminate between two classes is commonly quantified in medical AI using area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC), as these metrics are less affected by imbalanced data. For regression problems, statistical measures of mean absolute error (MAE), root mean squared error (RMSE), standard deviation (SD), and Pearson's correlation coefficient (PCC) are commonly used to understand error.
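For reference, the threshold-based classification metrics and several of the regression metrics named above can be computed as in the following sketch, using their standard definitions. The confusion-matrix counts and the measurement values are hypothetical and chosen only to illustrate how accuracy can look strong on imbalanced data while precision remains poor; AUROC, AUPRC, and SD are not computed here.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Binary metrics from confusion-matrix counts (assumes non-zero denominators)."""
    sensitivity = tp / (tp + fn)   # recall, true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def regression_metrics(y_true, y_pred):
    """MAE, RMSE, and Pearson's correlation coefficient for paired values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_t, mean_p = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    pcc = cov / math.sqrt(var_t * var_p)
    return {"mae": mae, "rmse": rmse, "pcc": pcc}


# Hypothetical imbalanced screening result: 10 true cases among 1000 people.
cls = classification_metrics(tp=8, fp=40, tn=950, fn=2)

# Hypothetical reference vs. predicted systolic blood pressure (mmHg).
reg = regression_metrics([120.0, 130.0, 140.0], [118.0, 133.0, 141.0])

print({k: round(v, 3) for k, v in cls.items()})
print({k: round(v, 3) for k, v in reg.items()})
```

In this made-up example the accuracy exceeds 95% even though fewer than one in five positive predictions is correct, which illustrates why accuracy alone can mislead on imbalanced medical data.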

E. Layers of Learning and Cross-Layer AI
A key requirement of many AIoT systems is the implementation of AI on devices with varying computational capabilities. An AIoMT system will typically include embedded, edge, and cloud implementations of AI algorithms. In this section, we introduce the three layers of learning and current advancements in each layer.
1) Embedded AI: Embedded AI is an emerging field of research focused on developing lightweight AI algorithms for implementation on low-powered devices, such as smartwatches and other wearables.
In the healthcare context, researchers have primarily focused on training low-complexity machine learning algorithms such as RF [165], SVM [166], and shallow CNNs [167] for implementing embedded AI. Other shallow NN structures, including recurrent and hybrid structures, would also be strong candidates for embedded AI applications.
Aside from designing lightweight AI architectures, research has also explored implementing embedded AI on efficient hardware such as FPGA [24] and CMOS [25].
2) Edge AI: Edge AI is often used synonymously with embedded AI in the literature; however, in this work we consider these as two separate domains. Here we define edge AI algorithms as those operating one step away from the end device, such as the smartphones and computers illustrated in Fig. 2.
The computational capabilities of edge devices vary greatly, ranging from relatively low-powered smartphones through to high-end computers with powerful graphics cards. As such, a broad range of AI algorithms have been implemented at this layer in healthcare applications.
In lower-powered devices, many of the same algorithms utilised in embedded AI also find use in edge AI, including RF [168], [169], SVM [169], and CNN [170], [171]. Additional algorithms including LSTM have also been utilised on the edge [172].
3) Cloud AI: Any AI algorithm that can be implemented in edge or embedded devices can also be implemented in the cloud; however, cloud AI is most commonly used where there are extremely large data sets or collaborative data sets. Cloud computing resources are also typically needed to support complex models such as transformer models, which generally need significant quantities of data and are computationally expensive to train. In the context of healthcare, cloud AI is commonly used for medical imaging applications due to the size of this data [173], [174]. Cloud AI is also commonly utilized for pre-training of algorithms before they are deployed to edge AI devices such as smartphones [169].
4) Cross-Layer AI: Different layers of learning have differing advantages. Edge and embedded AI offer low latency and high preservation of privacy, while cloud computing offers significantly higher computational power. Cross-layer AI offers a solution that leverages the advantages of multiple layers of learning. Several recent studies have utilized cross-layer AI to conduct initial processing on the edge before offloading to the cloud for higher-level processing [170], [175], decreasing latency whilst simultaneously leveraging high-power computing resources where required.
Another application of cross-layer learning is in federated learning, a distributed learning architecture highly suited to the healthcare context. To create machine learning algorithms that generalize to a global population, large and diverse databases are required. However, patient privacy is paramount and thus centralized databases cannot be created. Federated learning addresses this problem through training of machine learning models on decentralized edge devices, with the resulting model then shared via cloud computing resources. Models from all participating edge devices are fused, and the resulting model can be returned to the participating edge devices for further use [176]. Preliminary studies have illustrated the benefits of federated learning for applications such as functional MRI analysis [177] and classification of clinically significant prostate cancer using apparent diffusion coefficient imagery [178].
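The fusion step of federated learning can be sketched as a weighted average of locally trained parameters. This is one common fusion rule (often referred to as federated averaging), assumed here for illustration; the cited works may fuse models differently. The parameter vectors and dataset sizes are hypothetical, and local training, communication, and security are not shown.

```python
def federated_average(client_weights, client_sizes):
    """Fuse client models by averaging each parameter, weighted by local sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]


# Hypothetical locally trained parameter vectors from three participating sites.
client_weights = [
    [0.2, 1.0],
    [0.4, 2.0],
    [0.6, 3.0],
]
# Local dataset sizes; only these counts and the weights leave each site.
client_sizes = [100, 100, 200]

global_weights = federated_average(client_weights, client_sizes)
print(global_weights)
```

Only model parameters and sample counts are exchanged, so the raw patient data never leaves the participating device, and the fused model can then be returned to every participant.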
While federated learning is privacy preserving, the dependency on a central server for learning creates a single point of failure; if the central server is compromised, then participating nodes all suffer. To address this issue while still preserving privacy, the concept of swarm learning was recently proposed for medical contexts [179]. Swarm learning is similar to federated learning in that all nodes conduct local learning on embedded or edge devices and only share the results of that learning; however, it differs in that the nodes aggregate the local learnings using a peer-to-peer structure with blockchain technology. The use of blockchain ensures that only legitimate nodes can join the network, and the peer-to-peer structure ensures that the system is resilient against failures of a single node. Swarm learning has been tested for medical applications including tuberculosis detection and COVID-19 diagnosis, with promising results [179].

F. Lessons Learned
In this section, it was found that many of the physical building blocks required for an AIoMT system are now readily available. Many health sensors such as PPG, ECG, and EEG are well-established and widely available, providing researchers with much opportunity to collect data for AIoMT studies and applications. In the materials science space, novel sensors such as sweat and interstitial fluid sensors remain an active field of research for detecting and managing both short- and long-term conditions, from lactic acid build-up to diabetes. Non-contact sensing methods are also a prevalent topic in recent literature, with techniques such as rPPG and channel state analysis increasingly explored for the measurement of various health indicators, from vital signs to physical activity.
Additionally, recent advancements in communications technology have provided several highly suitable standards and approaches for AIoMT systems, with upcoming technologies promising further benefits. 6G is expected to deliver extremely fast communication and significant network capacity, and thus will drive advanced telehealth applications, from rapid-response medical UAVs to telesurgery. Meanwhile, RedCap is a clear successor to NB-IoT, offering lower latency and higher reliability due to its 5G backbone. This will primarily support lower-powered devices, but will be capable of sending larger quantities of data than its NB-IoT predecessor. Lastly, satellite-based IoT communications will be critical for healthcare to be truly pervasive; dense LEO constellations can provide coverage in regions where cellular networks are unavailable.
In terms of machine learning, it was interesting to note that many of the studies identified in this literature review used fundamental ML models such as SVM and RF. These approaches typically involved extensive feature extraction from image or time-series data, which has the potential to introduce human bias. Relatively few papers considered the application of ML models directly to raw data, despite the clear suitability of LSTM and transformer models for time-series and other sequential data types. Furthermore, very few papers investigated the application of emerging ML models such as transformers and autoencoders to healthcare problems, and thus there is much opportunity remaining in applying advanced models to such problems.
Our review also identified many novel technologies that have changed the shape of AIoT for healthcare applications in recent years. The gradual shift of AI towards the edge and embedded layers has significantly altered the traditional model of simply offloading to the cloud. Continued development of embedded and edge AI algorithms will lead to healthcare systems with lower latency, enhanced privacy, and higher fault tolerance. Cross-layer AI has also emerged as a solution to problems with higher computational needs, enabling some processing to occur at the embedded or edge layer before offloading the remainder of the work to higher-powered edge or cloud devices.
In terms of preserving privacy where cloud AI is utilised, the techniques of federated and swarm learning both offer significant promise. Both techniques involve the training and subsequent sharing of local models, without sharing the data that enabled that training. It is suggested that swarm learning offers stronger benefits than federated learning, due to the robust and decentralized peer-to-peer structure of the approach.

III. AIOT FOR HEALTH DOMAIN CHALLENGES
With the components of the AIoMT now established, this section explores the key domains of healthcare where AIoMT can be utilized to improve outcomes for clinicians and patients alike. This section begins with the fundamental domain of health monitoring, before exploring how this can be further applied for diagnosis and prognosis. Lastly, we examine the literature on using explainability techniques within AIoMT to improve clinician trust and general understanding of diseases and outcomes.

A. Health Monitoring
The most fundamental application of AIoT is monitoring of health parameters, both for general health and management of specific conditions. A typical AIoT health monitoring system utilizes on-body devices and/or environmental sensors to acquire data from the patient, as shown in Fig. 3. Machine learning techniques are then used to extract meaningful information from this data.
1) Wearable Monitoring: In one recent study, an AIoMT system was developed for monitoring workers in hot environments for signs of heat stroke [180]. The developed wearable device incorporated humidity, temperature, and multiple PPG sensors, along with a 3-axis accelerometer. Metrics including heart rate, activity, and a personalised heat stress temperature were derived and used as inputs to several machine learning models. It was found that a simple k-nearest neighbours model was stronger for this task than more complex algorithms such as RF and SVM. The complete system was validated in a high-temperature work environment, and was found to identify 96.7% of heat stroke cases.
Mental fatigue is another key health and safety risk in any workplace. One recent study [136] proposed an AIoMT system that linked heart rate variability (HRV) parameters with mental fatigue. HRV statistics were extracted from ECG signals obtained via a chest-worn device. Several machine learning models were trained to conduct binary classification (fatigued or not fatigued), with SVM found to outperform k-nearest neighbours (kNN) and linear regression (LR) approaches.
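As an illustration of the kind of HRV statistics that can be derived from ECG-based beat-to-beat (RR) intervals, the sketch below computes two widely used time-domain measures, SDNN and RMSSD, together with mean heart rate. Which HRV parameters were actually used in [136] is not specified here, so the choice of measures is an assumption, and the RR intervals are made-up values.

```python
import math


def sdnn(rr_ms):
    """Sample standard deviation of RR intervals, in milliseconds."""
    mean = sum(rr_ms) / len(rr_ms)
    return math.sqrt(sum((x - mean) ** 2 for x in rr_ms) / (len(rr_ms) - 1))


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def mean_heart_rate_bpm(rr_ms):
    """Mean heart rate implied by the average RR interval."""
    return 60_000.0 / (sum(rr_ms) / len(rr_ms))


# Hypothetical RR intervals (ms) extracted from a short ECG segment.
rr_intervals = [800, 810, 790, 805, 795]

features = {
    "sdnn_ms": sdnn(rr_intervals),
    "rmssd_ms": rmssd(rr_intervals),
    "mean_hr_bpm": mean_heart_rate_bpm(rr_intervals),
}
print({k: round(v, 2) for k, v in features.items()})
```

A feature vector of this kind, computed over successive windows, is the sort of input that a classifier such as an SVM could then label as fatigued or not fatigued.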
Cardiac health monitoring was considered in a critical care setting by another recent study [181]. Information about existing health status gathered from patient interviews was fused with features of the ECG signal, blood oxygen saturation, and body temperature measurements obtained via wearable devices. RF, SVM, and shallow fully-connected neural networks (FCNNs) were trialled for identifying a patient's cardiac health status as healthy or unhealthy on a continuous basis. The RF model achieved the highest accuracy of 80% on a dataset of 12 patients.
There have also been several studies seeking to monitor chronic health conditions with AIoMT systems. In one study [182], such a system was developed for the monitoring and classification of seizures in patients with epilepsy. A device that can be worn on the wrist or ankle was used to measure electrodermal activity via a sweat sensor, accelerometry, and blood pulse volume via PPG. These parameters were then processed by a shallow CNN with the aim of identifying a seizure and classifying the type. The best results for seizure detection were seen using only accelerometer and PPG data, achieving an AUROC of 0.752. Interestingly, the model utilizing only accelerometer data as an input performed more strongly in classifying the type of seizure, achieving the highest AUROC for classifying 5 of the 9 considered seizure types.
Glucose and glycated haemoglobin are key parameters for monitoring the health of people with diabetes; however, current methods for obtaining these parameters are somewhat invasive. One recent pilot study [183] developed an AIoMT system that leverages non-invasive wrist-worn sensors to measure heart rate via PPG, body temperature, electrodermal activity, and accelerometry. Random forest models are then used to assess 27 glucose variability metrics and glycated haemoglobin levels. It was found that 11 of the 27 glucose variability parameters could be predicted with <10% error, as could glycated haemoglobin. The study also reported the importance of each input parameter in successfully predicting each output parameter. The results indicate that input feature importance varied greatly depending on the parameter of interest; however, in all cases it was clear that data from all four sources contributed strongly to successful measurement of glucose variability and glycated haemoglobin parameters.
2) Non-Contact Monitoring: Incorporating environmental monitoring devices, such as camera or radar, can also be useful in many health monitoring applications. In one recent study [133], an AIoMT system for monitoring fatigue was developed, fusing data from forehead-wearable EEG sensors and video-derived eyelid features for classification by SVM and LSTM models. It was found that LSTM can identify fatigue with up to 75.71% accuracy without any calibration to an individual user. It was further found that SVM can identify fatigue with up to 99.64% accuracy when fine-tuned to an individual. Interestingly, models trained on fused data only performed slightly better than those trained on eyelid features alone but significantly better than those trained on EEG data alone.
Some AIoMT studies move away from wearables entirely, using only environmental sensing devices. In one such study, non-contact monitoring of HR and RR is conducted in a neonatal cohort using image data gathered from a camera [184]. A CNN model is used to identify the region of interest (in this case, the infant's face) before cardiorespiratory activity data is extracted from colour and motion artefacts. These are then used to calculate RR and HR.
Another study developed a non-contact AIoMT system for measurement of RR using channel state information, with a Wi-Fi router used as the transmitter [145]. A respiration signal is obtained based on reflectance of the signal from the patient, with feature extraction then conducted. Four machine learning models were trialled for the measurement of RR based on the obtained signal. The strongest model was a KNN approach at 83.33% accuracy, with RF and SVM close behind at just over 79% accuracy. This approach has the advantage of being non-contact and privacy-preserving, as no image data is recorded.
Another privacy-preserving AIoMT health monitoring scheme using solely environmental sensing is presented in [185], where channel state information is used for activity classification. Radar spectrograms are gathered from interference with Wi-Fi signals, and several machine learning models are trialled for accurate classification of activities such as walking, sitting, and falling. The strongest performing model was a CNN, achieving an accuracy of 95.30%.
Overall, monitoring is the most fundamental domain for AIoMT systems. Without monitoring, the diagnosis, prognosis, and explanations discussed in the following subsections would not be possible. This section has identified that AIoMT is highly suitable for enhanced monitoring in non-invasive and wearable ways, which has the potential to greatly improve remote health services, support independent living for at-risk individuals, and enhance other forms of care.
Fig. 4. Health conditions for which AIoMT-based diagnosis has been explored in the literature.

B. Diagnosis
Monitoring of health status is important for providing instantaneous snapshots of a person's health and how it changes over time. However, this alone does not enable diagnosis of health conditions that may affect a person. The domain of AIoMT-enabled diagnosis focuses on leveraging modern technology to better identify a wide range of mental and physical health conditions, as well as developmental disorders, as shown in Fig. 4. In this subsection, we explore AIoMT-based diagnosis in these three areas.
1) Physical Health: Accurate and early diagnosis of COVID-19 has understandably been a prevalent topic of research since the beginning of the pandemic, and this includes research in the AIoMT field. One study [186] developed an AIoMT system for early detection of COVID-19 infection. ECG signals were obtained via Apple Watches, then features describing the heart rate variability (HRV) and resting heart rate (RHR) were extracted. Several machine learning techniques were trialled for detecting COVID-19 infection from these parameters, with a gradient boosting decision tree approach achieving the strongest performance. After fine-tuning, the model achieved 77% accuracy and 76.8% sensitivity.
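Accuracy and sensitivity are reported throughout this subsection, so it may help to recall how both are derived from a confusion matrix. The following is a minimal sketch; the function name and the counts are hypothetical and are not taken from [186].

```python
def accuracy_and_sensitivity(tp, fp, tn, fn):
    """Return (accuracy, sensitivity) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total      # fraction of all cases classified correctly
    sensitivity = tp / (tp + fn)      # fraction of true cases that were detected
    return accuracy, sensitivity


# Hypothetical counts chosen only for illustration.
acc, sens = accuracy_and_sensitivity(tp=40, fp=10, tn=40, fn=10)
print(f"accuracy={acc:.2f}, sensitivity={sens:.2f}")
```

Sensitivity is often the more important figure in screening applications, since a missed infection (false negative) is typically costlier than a false alarm.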
In another recent study, laboratory results from EHRs were used to identify patients with COVID-19 using various machine learning models [137]. It was found that an SVM model was the best performer in diagnosing COVID-19 from 15 clinical variables, achieving accuracy of 93.33% and AUROC of 0.88.
Another serious short-term illness prevalent in several regions of the world is malaria, a mosquito-borne disease caused by a parasite. Malaria is endemic in regions of Africa, and rapid diagnosis is required for saving lives and preventing outbreaks. One recent study developed a prototype low-cost AIoMT system for field diagnosis of malaria [187]. Microscopy is used to collect images of slides containing blood samples taken from participants, with a CNN-based model then used to identify the presence of parasites. The diagnostic sensitivity was 91.1%, with species of parasites also accurately identified in 92-93% of cases. Additionally, the model attempted to identify parasite density within the slides. However, this was less successful; the model was within ±25% of the reference microscope count in only 23% of slides.
There are also many long-term illnesses where outcomes can be improved through early diagnosis. One clear example is chronic kidney disease (CKD), which ultimately causes renal failure. If caught in the early stages, treatment is substantially more effective. To address this, one recent study [188] developed a straightforward AIoMT tool for diagnosing early-stage CKD at low cost, using simple vital sign measurements along with results of urine and blood tests. They trained multiple machine learning models for this diagnostic task and found RF to be the strongest performer, with a diagnosis accuracy of 99.50%.
Outcomes for heart disease can also be improved through early diagnosis. One recent study used a novel digital twin approach based on AIoMT to accelerate identification of heart issues [12]. ECG sensors are used to gather cardiac activity signals, with LSTM then used to identify the presence of several different types of heart arrhythmia. Digital twins of each patient are created, where both raw sensor data and AI arrhythmia assessments are stored for clinician access. This approach showed high accuracy in assessing most types of arrhythmia, and the digital twin approach allows for clinicians to receive timely updates and make prompt decisions; it also supports comparison between multiple patients' digital twins to enable identification of similar cases and thus further support treatment planning.
Cancer is another longer-term illness where early diagnosis is critical. Many late-stage cancers are difficult to treat and result in high mortality across the world; however, early diagnosis can greatly improve outcomes for many types of cancer. Skin cancer is one prevalent cancer where early detection is critical. In one recent study, cancerous skin lesions were identified using image data [189]. Images were segmented to identify potential lesions, with various features then extracted. These features were then processed by an ensemble SVM and RF classification approach, which achieved 85.31% diagnostic accuracy. This strategy is promising, as many people have access to smartphone cameras; thus, regularly taking photos of suspicious lesions could lead to significantly earlier detection of skin cancer.
Prostate cancer is another prevalent form of cancer where early intervention can greatly improve outcomes. In one recent study, an AIoMT system utilised a photoacoustic probe to identify potential prostate cancers [190]. Features were extracted from the photoacoustic spectrum, with linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) used to process the data and identify cancers. The QDA model performed slightly more strongly, with a diagnostic accuracy of 81.7% and a sensitivity of 78.2%. The authors suggest that these results indicate stronger performance than MRI whilst being less invasive than biopsy.
Early diagnosis is also critical for breast cancer. Mammography is a common imaging technique for detecting breast cancer, and is reasonably accurate. However, it is less accurate in younger people and requires specialist analysis. Automatic diagnosis has the potential to address these issues, and thus was considered by a recent study which used an extreme learning FCNN approach for identifying potential cancers in mammograms [191]. On one database, their model achieved 91.13% diagnostic accuracy and 0.95 AUROC; on another database, it achieved diagnostic accuracy of 100% and AUROC of 1. Their results also indicated a strong ability to classify tumours as malignant or benign, achieving 100% accuracy on the small test set. Further validation on a larger set would be required to confirm the strong performance of this model.
Diagnosis of diseases that are chronic or degenerative has also been widely considered in the literature. In one study, an AIoMT system was developed for the diagnosis of diabetes [192]. Data from sensors including photoplethysmogram, body temperature, glucometer, and blood pressure sensors were obtained via an Arduino. Several demographic features were also gathered by means of a questionnaire. All data was processed by an ensemble learning method including logistic regression, kNN, and SVM models; a voting approach was then used to summarise the outputs, which in turn were passed to an RF model for final diagnosis prediction. This method achieved a diagnostic accuracy of 98.4% and an AUROC of 0.984 on the collected dataset.
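The voting step of such an ensemble can be sketched as follows. This is an illustrative simplification only: the three base classifiers are hypothetical threshold rules standing in for trained logistic regression, kNN, and SVM models, the feature names and thresholds are assumptions, and the final RF stage described in [192] is not reproduced.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among the base-model predictions."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical stand-ins for trained base models; thresholds are illustrative only.
def base_model_a(sample):
    return 1 if sample["glucose_mg_dl"] >= 126 else 0


def base_model_b(sample):
    return 1 if sample["systolic_mm_hg"] >= 140 else 0


def base_model_c(sample):
    return 1 if sample["age_years"] >= 45 else 0


sample = {"glucose_mg_dl": 150, "systolic_mm_hg": 120, "age_years": 50}
votes = [model(sample) for model in (base_model_a, base_model_b, base_model_c)]
ensemble_label = majority_vote(votes)
print(votes, "->", ensemble_label)
```

Using an odd number of base models avoids ties in a binary vote; in a stacked design, the vote (or the individual outputs) would become an input feature to the final model.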
Early diagnosis of Parkinson's disease can improve the efficacy of several treatments, and thus has been a topic of interest in the literature. One recent study [138] developed an AIoMT tool that utilises SVM to process features of vocal patterns, namely vowel phonations, in order to diagnose Parkinson's disease. This model achieved a diagnostic accuracy of 92.21%. Voice signals were also considered in another recent work [193], where machine learning methods including kNN, RF, and Naive Bayes were applied to features extracted from open-access voice recordings. RF performed strongly, achieving 95.58% diagnostic accuracy.
Alzheimer's disease is another common degenerative disease; it causes the brain to atrophy over time. It is the most common cause of dementia, leading to cognitive and physical decline as the condition progresses. Early diagnosis enables management of symptoms to support a high quality of life for a longer period. In one recent study, data gathered from environmental motion sensors was used to predict the early onset of dementia based on the participants' activity [140]. Several machine learning models were trialled, including FCNN, RF, and SVM. The strongest model was a boosted decision tree model, which achieved an accuracy of 92.59%. Brain imagery has also been used in recent studies seeking to diagnose Alzheimer's disease; one such study recently developed a hybrid CNN and SVM model for processing MRI images [194], achieving an accuracy of 94.8% and AUROC of 0.997.
2) Mental Health: Recent literature has sought to diagnose a broad range of depression and anxiety disorders. In one recent study [144], a low-cost AIoMT system was proposed for diagnosing major depressive disorder using EEG, eye tracking, and galvanic skin response sensors. Several machine learning models were trialled for processing the input data, including RF, logistic regression, and SVM. The logistic regression model exhibited the highest performance, achieving an accuracy of 79.63% in diagnosing major depressive disorder.
Depression has also been diagnosed using electronic health record (EHR) data and medical imaging along with machine learning. In one study, clinical, laboratory, and demographic data were collected from EHRs [195]. After training several machine learning models, it was identified that RF was the most suitable algorithm, achieving an accuracy of 89% and AUROC of 0.87. In another study [196], functional connectivity brain maps were processed using an SVM classifier, achieving an AUROC of 0.99 for distinguishing persons with and without depression in a small cohort. Bipolar disorder is another common depressive disorder, characterised by significant mood swings between depression and mania; however, it is frequently misdiagnosed. One recent study [197] aimed to address this issue through the development of an AIoMT system that utilises EEG signals to identify the condition. After trialling multiple machine learning methods, it was identified that a gradient boosting approach could accurately diagnose bipolar disorder in 94% of cases using features extracted from the EEG signals. SVM and decision tree also performed quite strongly, achieving an accuracy of over 87.48%. It is possible that an RF approach would prove a strong candidate for this application given that a single decision tree was able to achieve such strong performance.
Diagnosis of anxiety disorders has also been considered in the literature. In one recent study, motion data from smartphones and IMUs was used for the identification of anxious behaviours including hair pulling and hand tapping [198]. Multiple machine learning methods were trialled for classifying anxious behaviours based on the acquired motion data, with CNN and LSTM found to be 92% accurate. This has significant diagnostic potential; smartphones are widespread, and thus a tool such as this could be broadly deployed to identify warning signs of anxiety disorder.
Panic disorder is a common anxiety disorder that can coexist with other anxiety diagnoses. However, it is critical to identify where panic disorder is present so that treatment can be tailored to reduce the effects of panic on the patient. This issue was considered by a recent study that sought to develop an AIoMT system to distinguish panic disorder from other anxiety disorders using heart rate variability (HRV) metrics [199]. The HRV metrics are extracted from an ECG device, and several machine learning models were trialled for using these metrics to diagnose panic disorder. Simple logistic regression was found to be the strongest model, achieving an accuracy of 78.4%. This is significant as HRV parameters can be extracted from ECG or PPG in readily-available wearable devices and smartphones.
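To make the notion of HRV metrics concrete, the sketch below computes two widely used time-domain measures, SDNN and RMSSD, from a short series of RR intervals. Which metrics were actually used in [199] is not stated here, so the choice of SDNN and RMSSD, the function name, and the example intervals are all assumptions.

```python
import math


def hrv_metrics(rr_ms):
    """Compute mean heart rate, SDNN and RMSSD from RR intervals given in ms."""
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # RMSSD: root mean square of successive RR differences.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"mean_hr_bpm": 60000.0 / mean_rr, "sdnn_ms": sdnn, "rmssd_ms": rmssd}


# Hypothetical RR intervals (ms) between successive heartbeats.
metrics = hrv_metrics([800, 810, 790, 800])
print(metrics)
```

Metrics of this kind form a compact feature vector that a simple classifier such as logistic regression can consume directly.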
Post-traumatic stress disorder (PTSD) is an anxiety disorder caused by a stressful event or prolonged traumatic experience in the patient's life. Symptoms overlap with other anxiety disorders, as well as major depressive disorder. As such, accurate identification of PTSD is critical for ensuring that patients receive the correct treatment. In one recent study, SVM was used to process P300 wave data from EEG sensors to diagnose PTSD, both on its own and in the presence of comorbidities [200]. SVM was able to distinguish between healthy controls and PTSD sufferers in 82.09% and 82.56% of cases for mono and comorbid PTSD, respectively. Additionally, SVM could distinguish between PTSD and major depressive disorder in 70.34% of cases.
3) Developmental Disorders: Accurate diagnosis of developmental and behavioural disorders, particularly in young children, is critical for early and appropriate intervention to ensure that the right support can be provided. As such, this topic has attracted increasing interest in recent literature.
Autism Spectrum Disorder (ASD) is one prevalent developmental disorder, and is often challenging to diagnose. Early diagnosis can aid in the development of individualised support plans that improve outcomes for persons with ASD. For this reason, several recent studies have sought to use AIoMT technology to improve diagnosis of ASD. In one recent study [201], a system was developed wherein machine learning is used to process home videos to identify characteristic ASD behaviours. Features were extracted from the uploaded home videos, with machine learning models including SVM, RF, and logistic regression trialled for identifying children with ASD. SVM showed the highest performance, achieving an accuracy of 91.79% and AUROC of 0.946.
In another recent study, demographic information from EHRs as well as responses to survey questions were processed by various machine learning methods to diagnose ASD in toddlers, children, adolescents, and adults [202]. The SVM model showed the best performance, achieving accuracies of 97.82% for the toddler subset, 99.61% for the child subset, 95.87% for the adolescent subset, and 96.82% for the adult subset. Another study used EHR information to identify ASD in persons where overlapping diagnoses were present, namely anxiety disorders and conduct disorder [203]. It was found that RF was suitable for this task, with results showing sensitivity of 0.89-0.94 across all RF models trialled.
Another common developmental disorder that primarily affects learning is dyslexia, which is characterised mainly by difficulty in reading. Children with dyslexia can also experience speech impairments and difficulties in retaining new information. With tailored support, many people with dyslexia can succeed in school and work environments; thus, early diagnosis is critical for improving outcomes. In one recent study [204], eye movements were recorded while participants completed several reading and other visual tasks. SVM and linear regression models were then trialled for processing the eye movement features, with linear regression found to accurately identify dyslexia in 81.25% of cases using either reading tests or LED-based saccade eye movement tests.
Behavioural disorders are another group of developmental disorders, and one of the most prevalent is attention-deficit/hyperactivity disorder (ADHD). Despite being well-known, ADHD is commonly under-diagnosed, particularly in girls and women [205]. Persons with ADHD can experience symptoms including learning difficulty, impulsive risk-taking behaviour, and aggression; if left untreated, ADHD has also been associated with the development of depression and anxiety. As such, accurate diagnosis is critical so that individuals with ADHD can develop strategies for managing their symptoms. To this end, one recent study [206] developed an AIoMT system for diagnosing ADHD using features extracted from EEG signals. The strongest model was SVM, achieving an accuracy of 94.2% and AUROC of 0.964 in distinguishing ADHD and non-ADHD individuals. A similar approach was used in another recent study [207], where raw EEG signal segments were used as inputs to several machine learning models. LSTM was shown to be the strongest performer for this task, achieving an accuracy of 90.50% in identifying individuals with and without ADHD.
Overall, diagnosis is a key area for AIoMT in healthcare. Many conditions, both physical and mental, are difficult to diagnose and distinguish from one another, particularly in low-resource settings without access to specialist equipment and clinicians. AIoMT-enabled diagnostics offers significant potential to rapidly identify potential conditions, streamlining the process of receiving a correct diagnosis and suitable treatment.

C. Prognosis Assessment
Diagnosis of health conditions is critical to enable early intervention and a higher level of care. However, it is also valuable to know the severity and likely outcome of a condition; this helps clinicians to make more informed decisions about treatment paths, and ultimately can improve outcomes for a patient. This is the domain of prognosis assessment. As illustrated in Fig. 5, the goal of prognosis is to predict outcomes and their severity in patients who are hospitalized or undergoing continuing treatment for an injury or health condition. Prognosis is another area where the AIoMT offers significant advantages.
1) Short-Term Outcomes: One critical application for AIoMT in prognosis assessment is in predicting serious outcomes in critical care units. Mortality is clearly the most serious outcome in this setting, and early prediction of mortality risk can aid in decision making regarding treatment paths. One recent study [208] monitored variations in vital signs over a 24-hour period, using a hybrid CNN-LSTM machine learning model to classify patients as high- or low-risk of mortality within several time windows. The strongest performance was an AUROC of 0.884 when predicting mortality risk within a 3-day window, which provides a significant time window for clinicians to make decisions that may change the outcome.
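Since AUROC is the headline metric for most of the prognosis studies discussed here, a brief sketch of its meaning may be useful: it equals the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case. The function below computes this directly by pairwise comparison; the labels and scores are hypothetical and do not come from [208].

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = died within the window) and model risk scores.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
area = auroc(labels, scores)
print(round(area, 3))
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context to the values between 0.70 and 0.91 reported in this subsection.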
In another study, mortality of patients diagnosed with sepsis in low-resource critical care settings is predicted using only heart rate variability information [209]. A wearable patch obtains an ECG signal over a 24-hour period, with various heart rate variability parameters then extracted. An LSTM model is then used to interpret these parameters and identify risk of mortality, with the final model achieving an AUROC of 0.70.
The strong link between sepsis and mortality has also led researchers to develop AIoMT systems for predicting onset of sepsis early. In one novel sepsis prediction study [210], a flexible ECG sensor is used to obtain 30-second signals. Fourteen time-series parameters are extracted from the signal before an embedded fully-connected NN algorithm calculates the risk of sepsis within several time windows. The embedded AI algorithm achieved 95% accuracy in predicting sepsis onset within 1 hour; however, performance decreased significantly for increasing time windows, dropping to 77.5% accuracy for 6-hour onset. To address this, the AIoMT system was expanded through offloading the embedded AI's prediction to the cloud for fusion with EHR information for the patient in question. This led to a significant increase in sepsis onset prediction performance for time windows as large as 6 hours. Where only demographics were known, fully-connected NN performed strongly; where both demographics and comorbidities were known, linear and logistic regression showed the highest performance.
Aside from sepsis and mortality, there are several other serious outcomes in critical care units that must be considered by clinicians. One recent study proposed an AIoMT system for predicting serious outcomes [211], including adverse neurological, respiratory, circulatory, and infection outcomes. RR and HR were continuously acquired from ECG signals, while PPG was used to obtain HR and blood oxygen saturation. Additionally, systolic blood pressure (SBP) and diastolic blood pressure (DBP) were intermittently measured using an ambulatory monitoring device. An AUROC of 0.91 was achieved using a boosted ensemble model, significantly outperforming simpler models including kNN and RF.
In the pediatric cohort, acute kidney injury (AKI) follows kidney impairment and is currently treated only through supportive care. A recent study aimed to predict whether patients were likely to suffer AKI so that injury could be prevented before it occurred [212]. Vital signs, laboratory results, and other clinical information were utilized to train an age-dependent ensemble learning model to predict AKI up to 48 hours before its onset. An AUROC of 0.89 was achieved, indicating strong prediction performance. Additionally, the developed model provided actionable feedback to clinicians based on current treatments, enabling them to intervene early and potentially prevent AKI in some patients.
2) Long-Term Outcomes: Long-term outcomes are also important for many cohorts. In babies born preterm, there is an increased risk of adverse neurodevelopmental outcomes. Several recent studies [213], [214] developed AIoMT models for predicting Bayley-III scores [215] at age 18-24 months, with these predictions indicating the likely neurodevelopmental trajectory of the infant and thus allowing for early intervention where necessary. Both studies used CNNs to predict neurodevelopment based on structural and functional brain maps that were derived from magnetic resonance imaging (MRI) and diffusion tensor imaging (DTI) [213], [214], while one work [214] also measured and included clinical variables. Across these two studies, mean absolute error (MAE) ranged from 10.7 to 11.7, representing a relatively low error margin of approximately 10%.
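The relationship between an MAE of roughly 11 points and an error margin of roughly 10% can be illustrated with a short calculation. The sketch below assumes the percentage is expressed relative to the Bayley-III composite normative mean of 100, which is an interpretation rather than something stated in [213], [214]; the scores used are hypothetical.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted composite scores for four infants.
observed = [100, 85, 110, 95]
predicted = [90, 95, 100, 105]

mae = mean_absolute_error(observed, predicted)
NORMATIVE_MEAN = 100  # assumption: composite scores are scaled to a mean of 100
relative_error = mae / NORMATIVE_MEAN
print(f"MAE = {mae:.1f} points, about {relative_error:.0%} of the normative mean")
```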
Predicting recurrence of cancers has also been considered in the literature, including cancers affecting the breast [216] and bladder [217]. One recent study investigated the prediction of breast cancer recurrence [216], using various machine learning models including random forest and logistic regression. They used data from EHRs including free-text histopathology reports as the inputs, achieving strong sensitivities exceeding 90% for most models. Meanwhile, another study investigated the prediction of bladder tumor recurrence using features extracted from pathological images [217]. SVM and RF were trialled, with the RF successfully predicting 86.7% of cases of recurrence within a 2-year period. In both cases, these tools help to determine which patients are at the highest risk of cancer recurrence, supporting clinicians in planning post-cancer care.
Overall, AIoMT systems have been applied to a broad range of prognosis challenges. Promising results have been seen in identifying both short- and long-term prognosis for a range of illnesses, and similar techniques could be applied for understanding prognosis in many more.

D. Explainability
The ability to diagnose disease and predict outcomes is a powerful tool provided by AIoMT. However, another advantage of modern AIoMT systems is their ability to explain how those predictions were made to clinicians, improving trust in the tool and enabling more informed decision making. The advantages of explainability do not end there; explainable AIoMT systems can also be used to better understand the risk factors, biomarkers, and other features associated with a particular condition or outcome. This in turn can lead to targeted medical research to support better treatment and prevention strategies for a wide range of conditions and outcomes. Explainability is also critical for the development of socially responsible AI [218], as it can be used to improve trustworthiness and dependability of a model. It can also unveil hidden biases or reliability issues, allowing the approach to be revised until a truly responsible model is achieved.
1) Medical Images: Improving clinician trust in artificial intelligence tools is a key objective of explainability techniques in healthcare. Explanation of diagnosis decisions, particularly in medical images, is a key application of explainability in the AIoMT domain. A common approach in the literature is the development of heat-mapping techniques that visually highlight the areas a model used to make its decision, as illustrated in Fig. 6, thus improving clinician trust. In one recent study [219], a novel heat-mapping approach dubbed High-resolution Activation Mapping (HAM) was applied to an attention-based model for the diagnosis and subsequent explanation of Alzheimer's disease in MRI images. Evaluation of the explanations found that white matter was generally identified correctly, outperforming several previous approaches to heat-mapping.
In another study [220], brain tumours were identified in MRI images using a lightweight CNN model. Brain tumours were further classified by malignancy and type, with a heat-mapping approach known as class-activation mapping (CAM) then applied to localise the tumour. The explanations made by the model were evaluated by a small group of 10 clinicians, who were provided with a series of survey questions regarding the usability and trustworthiness of the system. The survey results suggested that trust in the AI model's decision making was improved by the provided explanations; however, a larger sample size of clinicians would be required to validate this claim.
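The core computation behind CAM is a weighted sum of the final convolutional feature maps, using the classification-layer weights of the class of interest. The following is a minimal sketch of that idea using nested lists; the feature maps and class weights are toy values, and the function names are hypothetical rather than drawn from [220].

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps: M(r, c) = sum_k w_k * f_k(r, c)."""
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * cols for _ in range(rows)]
    for fmap, weight in zip(feature_maps, class_weights):
        for r in range(rows):
            for c in range(cols):
                cam[r][c] += weight * fmap[r][c]
    return cam


def normalise(cam):
    """Rescale the map to [0, 1] so it can be rendered as a heat map."""
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant map
    return [[(v - lo) / span for v in row] for row in cam]


# Two toy 2x2 feature maps and the (hypothetical) weights linking them to one class.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
class_weights = [2.0, 0.5]
heat = normalise(class_activation_map(feature_maps, class_weights))
print(heat)
```

In practice the resulting low-resolution map is upsampled to the input image size and overlaid on the scan, which is what allows a tumour to be localised visually.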
Brain tumour identification and classification was also considered in [221], where the explainability techniques LIME (Local Interpretable Model-agnostic Explanations) [222] and SHAP (SHapley Additive exPlanations) [223] were applied to explain predictions made by a CNN model. SHAP acts by highlighting areas that the model associated with the outcome in red, and areas that went against the outcome in blue. This helps clinicians to better identify cases where the model was unsure, thus reducing the risk that trust is lost with a single incorrect prediction. Meanwhile, LIME was used in this context to segment the regions of the brain that the model considered relevant; this performed less well than SHAP.
Another recent study [224] compared multiple explainability techniques, including LIME [222], CAM approaches, and RISE (Randomized Input Sampling for Explanation of black-box models) [225], for explaining COVID-19 diagnoses made by a deep CNN model based on CT images. It was found that the areas highlighted by RISE most closely matched human annotation by experts. Fifty clinicians were then asked to evaluate the explanations made by the various models using a survey-based approach, with understanding of the model's decisions found to generally increase as a result of explanation. It was suggested that trust also improved with explanation; however, feedback from clinicians indicated that domain expertise remained necessary to validate the model's predictions, as they were not always accurate or precise. This indicates that there remains room for improvement in explaining CT images.
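The principle behind RISE can be sketched briefly: the input is repeatedly occluded with random binary masks, the black-box model is queried on each masked input, and each pixel's saliency is the score-weighted average of the masks that kept it. The sketch below is a simplified illustration under stated assumptions: it samples masks per pixel rather than upsampling smoothed low-resolution masks as the original method does, and the toy model, image, and parameter values are hypothetical.

```python
import random


def rise_saliency(image, model, n_masks=200, keep_prob=0.5, seed=0):
    """Estimate per-pixel saliency by averaging random masks weighted by model score."""
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for _ in range(n_masks):
        # Each pixel is kept with probability keep_prob, otherwise zeroed out.
        mask = [[1 if rng.random() < keep_prob else 0 for _ in range(cols)]
                for _ in range(rows)]
        masked = [[image[r][c] * mask[r][c] for c in range(cols)]
                  for r in range(rows)]
        score = model(masked)  # black-box query; no access to model internals
        for r in range(rows):
            for c in range(cols):
                saliency[r][c] += score * mask[r][c]
    scale = 1.0 / (n_masks * keep_prob)
    return [[v * scale for v in row] for row in saliency]


def toy_model(img):
    """Hypothetical black box whose score depends only on the centre pixel."""
    return img[1][1]


image = [[1.0] * 3 for _ in range(3)]
saliency = rise_saliency(image, toy_model)
pixels = [(r, c) for r in range(3) for c in range(3)]
most_salient = max(pixels, key=lambda rc: saliency[rc[0]][rc[1]])
print(most_salient)
```

Because the toy score depends only on the centre pixel, that pixel is expected to receive the highest saliency. The appeal of the approach is that it needs only model outputs, which makes it applicable to any classifier.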
Explanation techniques can also be used to identify areas within medical images that are associated with a particular diagnosis or outcome. Despite medical advances, the brain remains an enigmatic organ; there is still relatively little known about it. To uncover some of the mysteries of the brain, several recent studies have turned to explainability techniques. In one such study [125], a CAM-based approach was utilised to identify regions of the brain associated with atypical motor development in infants, with predictions made by a CNN using MRI images. Their heat-mapping approach indicated several regions of the brain were strongly correlated with motor development outcomes, including the motor cortex, somatosensory regions, cerebellum, and occipital and frontal lobes. In another study [214], atypical neurodevelopment in infants born preterm was predicted using a CNN model to process MRI and clinical features. Predictions were explained through the calculation of feature importances using a partial derivative approach relating each feature of interest to the predicted outcome. Using this approach, several functional and structural white matter connections were identified as strongly predictive of adverse neurodevelopmental outcomes. Results of these and similar studies can be leveraged by future researchers aiming to better understand and ultimately improve neurodevelopmental outcomes by providing potential regions of the brain to target.
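The idea of scoring feature importance through partial derivatives can be sketched with a numerical approximation: each input feature is perturbed slightly and the resulting change in the prediction is measured. The example below uses central finite differences in place of the analytic gradients a neural network framework would provide, which is a substitution made for illustration; the toy model and its coefficients are hypothetical and unrelated to [214].

```python
def gradient_importance(model, x, eps=1e-4):
    """Approximate d(model)/d(x_i) for each feature using central differences."""
    grads = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grads.append((model(up) - model(down)) / (2 * eps))
    return grads


def toy_model(x):
    """Hypothetical predictor: feature 0 raises the output, feature 1 lowers it."""
    return 3.0 * x[0] - 2.0 * x[1] + 0.0 * x[2]


grads = gradient_importance(toy_model, [1.0, 1.0, 1.0])
ranking = sorted(range(len(grads)), key=lambda i: abs(grads[i]), reverse=True)
print(grads, ranking)
```

Features with large absolute derivatives are those to which the prediction is most sensitive, which is the sense in which specific brain connections can be flagged as strongly predictive.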
2) Other AIoMT Data: Aside from medical imaging, much AIoMT data is in the form of discrete or time-series values. It is often useful for explainability techniques to be applied to AIoMT systems operating with such data, as this can again aid in improving clinician trust in systems, while also contributing to biomarker discovery.
Mortality prediction is one area where explainable AIoMT systems are essential. Treatment decisions made based on mortality prediction by an AIoMT system could be the difference between life and death. As such, clinicians need to understand how a system has made the prediction. In one recent study [226], an explainable AIoMT system was proposed wherein mortality in a neonatal cohort was predicted from variation in vital sign data using a shallow CNN-LSTM structure. Predictions were then explained using SHAP scores, which revealed parameters including gestational age and median respiratory rate in a twelve-hour period to be strongly predictive of mortality. This in itself helps to build clinician trust, as the relationship between gestational age and mortality is well established in clinical practice. SHAP was then further used to assess individual predictions, producing force plots that indicated which parameters contributed towards the prediction (and any that went against it). The results of SHAP force plotting give clinicians a snapshot of how the model made an individual decision, as well as how confident it was in that decision, thus supporting interpretation of the prediction. The findings also highlight which vital sign parameters are most critical to monitor in this cohort.
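SHAP scores are grounded in Shapley values, which attribute a prediction to each feature by averaging that feature's marginal contribution over all orders in which features could be revealed. The sketch below computes exact Shapley values by enumerating permutations, which is feasible only for a handful of features; practical SHAP implementations rely on approximations. The toy risk function and feature values are hypothetical and do not represent the model in [226].

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)
    totals = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]          # reveal feature i
            updated = model(current)
            totals[i] += updated - previous  # marginal contribution of feature i
            previous = updated
    return [t / len(orders) for t in totals]


def toy_risk(features):
    """Hypothetical risk score with an interaction between its two features."""
    a, b = features
    return 2.0 * a + 3.0 * b + 1.0 * a * b


phi = shapley_values(toy_risk, x=[1.0, 1.0], baseline=[0.0, 0.0])
print(phi, sum(phi))
```

The attributions sum to the difference between the prediction and the baseline output, which is the property that force plots visualise: positive contributions push the prediction up and negative ones pull it down.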
In another study focusing on an adult cohort of septic patients, mortality was predicted using an RF and then explained with both SHAP and LIME scores [227]. SHAP scoring was applied in a global context to identify features which are predictive of mortality in this cohort, ultimately identifying parameters including Glasgow Coma Score (GCS), first-day urine output, and blood urea nitrogen to be critical. LIME scoring was then applied in a local context to generate force plots highlighting which features contributed most strongly to individual mortality predictions. The global-level results from SHAP highlight the critical biomarkers of mortality in this cohort, which may help to guide overall improvements in treatment and practice into the future; meanwhile, the local results from LIME serve to support clinician trust in individual model predictions.
Identification of key biomarkers for illnesses has also been considered by several works. SHAP scoring has recently been applied to explain predictions of chronic kidney illness made by an RF model based on a large number of laboratory values [188]. A global-level view was taken, with parameters including haemoglobin concentration, packed cell volume, and serum creatinine identified as key biomarkers of the condition. These results provide meaningful information that both validates that the model is making sensible predictions and highlights the relative importance of various parameters to support future treatment development.
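A global-level view of this kind is commonly obtained by averaging the magnitude of local attributions over a cohort. The sketch below illustrates the idea under a simplifying assumption: a linear model is swapped in for the RF of [188], because for a linear model with features treated as independent the SHAP value has the closed form w_i (x_i - mean(x_i)). The weights and laboratory values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def linear_shap(weights, row, feature_means):
    """SHAP values of a linear model, assuming independent features."""
    return [w * (v - m) for w, v, m in zip(weights, row, feature_means)]


def global_importance(weights, rows):
    """Mean absolute SHAP value per feature across the cohort."""
    n_features = len(weights)
    feature_means = [mean(r[i] for r in rows) for i in range(n_features)]
    per_row = [linear_shap(weights, r, feature_means) for r in rows]
    return [mean(abs(p[i]) for p in per_row) for i in range(n_features)]


FEATURES = ["haemoglobin", "packed_cell_volume", "serum_creatinine"]
WEIGHTS = [-0.30, -0.05, 0.80]   # hypothetical model coefficients
COHORT = [                        # hypothetical laboratory values
    [13.5, 42.0, 0.9],
    [9.0, 30.0, 4.1],
    [15.0, 45.0, 1.0],
    [10.5, 33.0, 2.8],
]

importance = global_importance(WEIGHTS, COHORT)
ranking = sorted(zip(FEATURES, importance), key=lambda kv: kv[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The ranking depends on both the coefficient and the spread of each feature in the cohort, which is why a feature with a small coefficient can still rank highly if it varies widely.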
Understanding regions of the brain associated with mental health conditions has also been considered [200], where an SVM model was used to classify PTSD and depression from time-series EEG data. A feature selection approach was implemented to identify the EEG features that best discriminate between the two conditions, and between each condition and a healthy control. Through their study, several characteristics of the P300 wave derived from EEG were identified as predictive of PTSD, including reduced amplitudes and prolonged latency. The results provide insight into signal features that may be helpful for accurately diagnosing PTSD into the future.
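As an illustration of the two P300 characteristics mentioned above, the following sketch extracts peak amplitude and latency from an averaged event-related potential. The 250-500 ms search window, the sampling rate, and the synthetic Gaussian-shaped waveforms are assumptions made for this example; they are not taken from [200], whose feature selection pipeline is not reproduced here.

```python
from math import exp


def p300_features(erp_microvolts, sample_rate_hz, window_ms=(250, 500)):
    """Return peak amplitude and latency within an assumed P300 search window."""
    start = int(window_ms[0] * sample_rate_hz / 1000)
    stop = int(window_ms[1] * sample_rate_hz / 1000)
    segment = erp_microvolts[start:stop + 1]
    peak_offset = max(range(len(segment)), key=segment.__getitem__)
    peak_index = start + peak_offset
    return {
        "amplitude_uv": segment[peak_offset],
        "latency_ms": 1000 * peak_index / sample_rate_hz,
    }


def synthetic_erp(peak_ms, amplitude_uv, sample_rate_hz=100, n_samples=80, width_ms=50):
    """Hypothetical ERP: a single Gaussian-shaped bump (illustration only)."""
    return [
        amplitude_uv * exp(-(((1000 * i / sample_rate_hz) - peak_ms) / width_ms) ** 2)
        for i in range(n_samples)
    ]


control = p300_features(synthetic_erp(peak_ms=300, amplitude_uv=10.0), 100)
reduced = p300_features(synthetic_erp(peak_ms=400, amplitude_uv=6.0), 100)
print(control, reduced)
```

The second synthetic waveform has a smaller and later peak, mimicking the reduced amplitude and prolonged latency pattern described above.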
Explainability techniques have also been applied to wearable sensor networks to identify sensor readings predictive of different kinds of human activity [228]. In this recent study, an LSTM-based model was used to classify 12 different human activities based on time-series data from gyroscopes, accelerometers, magnetometers, and ECG sensors placed across the body. Global-view LIME explanations were then applied to the model to reveal which sensor values were most strongly linked with each type of movement. The results of this analysis are valuable for designing more reliable systems for the interpretation of human activity, which is essential for robust detection of falls and unusual behaviours in persons who are elderly or unwell.
Heat-mapping has also been applied to explain predictions from time-series data. A modified CAM approach was used in one recent study to explain arrhythmia diagnoses made by an autoencoder model from ECG waveforms. The CAM-based approach used colour to highlight the parts of the waveform that were most predictive of arrhythmia, commonly highlighting local peaks and spikes. This technique serves to flag key areas to expert clinicians, who can then evaluate the model's prediction to finalise a diagnosis. It may also serve to highlight novel regions of interest within ECG waveforms associated with arrhythmia for future research and enhanced diagnosis.
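The core of a class activation map for a one-dimensional signal can be sketched as a weighted sum of the final convolutional feature maps, rescaled to the range 0 to 1 so that it can be rendered as colour along the waveform. This is the standard CAM formulation rather than the modified variant used in the study above, and the feature maps and class weights shown are hypothetical values rather than outputs of a trained model.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps over time, min-max scaled to [0, 1]."""
    length = len(feature_maps[0])
    cam = [
        sum(w * fmap[t] for w, fmap in zip(class_weights, feature_maps))
        for t in range(length)
    ]
    lo, hi = min(cam), max(cam)
    if hi == lo:
        return [0.0] * length  # flat map: nothing to highlight
    return [(v - lo) / (hi - lo) for v in cam]


# Hypothetical activations of two channels over six time steps of an ECG segment.
feature_maps = [
    [0.0, 0.1, 0.9, 0.2, 0.0, 0.0],  # channel that responds to a sharp spike
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # channel that responds to the baseline
]
arrhythmia_weights = [2.0, -0.5]     # hypothetical weights for the arrhythmia class

heatmap = class_activation_map(feature_maps, arrhythmia_weights)
hottest = heatmap.index(max(heatmap))
print(heatmap, hottest)
```

In practice the heat map is computed at the temporal resolution of the last convolutional layer and then upsampled to the length of the original waveform before overlay.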
Overall, explainability techniques offer significant potential for developing socially responsible AIoMT systems that clinicians can place their trust in. Explanations can also lead to the identification of novel biomarkers for various illnesses and conditions, which in turn supports research into improved, targeted treatments. Many different types of data can be explained using a range of techniques, from discrete laboratory values to complex medical images.

E. Lessons Learned
In terms of health monitoring with AIoMT, it was found that well-established wearable technologies including PPG and ECG are still widely used in the literature, with novel methods for extracting healthcare metrics from these non-invasive wearables continuing to be found. Furthermore, several pilot studies have utilised sensor fusion to synthesize data from multiple fundamental devices including PPG, accelerometers, and sweat sensors for enhanced health monitoring. Sensor fusion approaches have shown promising results in monitoring complex conditions such as heat stress, epilepsy, and diabetes with non-invasive and wearable healthcare devices, albeit on small cohorts. Promising results have also been seen where sensor fusion is used to combine information from wearable and non-contact sensors for fatigue monitoring. These results suggest that there is much information that can be extracted from relatively simple physiological signals where the correct AI model is applied to the task.
An active topic of research in health monitoring is the use of entirely non-contact monitoring systems. In our review, it was identified that imagery and channel state information are the two key data types utilized in non-contact monitoring. Imagery is particularly useful for obtaining rPPG to measure cardiorespiratory health parameters, while channel state information approaches can be utilized to assess activity in a privacy-preserving manner.
Advancements have also been made in the areas of diagnosis and prognosis using AIoMT. It was learned that many recent studies achieved strong results using data obtained from wearable devices, often in conjunction with other patient information such as demographic or clinical variables. In terms of diagnosis specifically, studies that sought to diagnose physical health conditions generally showed stronger performance than those seeking to identify mental health conditions or developmental disorders, likely due to there being greater existing knowledge in the assessment and diagnosis of physical health. The majority of diagnostic studies treated the task as a binary classification problem (i.e., whether the disease was present or not) rather than using multiclass approaches to identify severity. For diseases such as cancer, multiclass classification to grade and stage cancer severity is an important direction for ongoing research.
For prognosis, the majority of studies again sought to classify outcomes in a binary manner (for example, mortality versus non-mortality, or cancer recurrence versus non-recurrence). Several studies focusing on neurodevelopment in babies born prematurely aimed to predict outcomes more specifically, by developing regression models that provided a developmental score on a continuous scale. This approach provides significantly more information to the clinician about the infant's likely outcomes than a binary classification approach.
Several explainability techniques were found to be prevalent in the literature, particularly heat-mapping such as CAM and feature importance identification approaches such as SHAP and LIME. CAM and SHAP were shown to be particularly useful for image and waveform data, capable of clearly highlighting regions of an image used to make a decision. For discrete features, both SHAP and LIME were commonly used to understand feature importances in local and global decision making. In all cases, these methods clarify how AIoMT models make decisions; however, most studies do not consider whether clinician trust is genuinely improved through the use of explainable tools. One area where explainability has been shown to offer potential is in identification of biomarkers; the illustration of features considered important by an AIoMT system can aid researchers in determining potential characteristics of a disease to target with treatment, new metrics for monitoring the progression of a condition, improved methods for diagnosis, and much more. There remains much opportunity in applying explainability to medical datasets to identify potentially novel markers of disease.

IV. USE CASES FOR AIOMT
We have now explored the overarching domains of healthcare where AIoMT can be utilized to improve outcomes for patients and carers. In this section, we examine several specific use cases for AIoMT technology, and explore how the AIoMT techniques and technologies identified in previous sections can be applied to provide comprehensive care throughout the entire healthcare pipeline. In particular, we highlight several recent studies applicable to the selected use cases and provide recommendations on how these systems could be enhanced to further improve outcomes and quality of life.

A. Dementia Care
Dementia is a collection of cognitive and behavioural symptoms caused by various neurological conditions, including Alzheimer's disease and Parkinson's disease. As dementia progresses, individuals can experience changes in behaviour including aggression and wandering, as well as cognitive changes such as memory loss and difficulty expressing their thoughts and feelings. There is currently no cure for dementia, but early identification of dementia can help affected individuals and their support networks to develop appropriate care plans.
AIoMT can provide valuable tools for all stages of dementia care, starting with early identification of dementia. Several studies have used an activity recognition approach, using machine learning to process data from wearable [140] or non-contact [140] motion sensors to determine whether a person is experiencing mobility symptoms caused by dementia. Other studies have used machine learning to identify dementia from data including EEG signals [229], MRI images [194], and a combination of keystroke data and basic activity information [230]. Virtual reality was employed in another recent study aimed at early diagnosis of dementia [231]. Participants were given navigation tasks in a virtual 3D environment, with metrics about performance then analysed using an RF model to identify dementia with high accuracy.
After a dementia diagnosis, many people will continue to live independently for some time. As dementia progresses, the risk of falls and other injuries gradually increases. AIoMT can support dementia patients and their families by monitoring daily activity and identifying abnormal behaviour; such information can be used to alert caregivers when their assistance is needed, as illustrated in Fig. 7. One novel study was able to detect abnormalities in behaviour by using machine learning to analyse smart meter recordings that captured a person's interactions with electronic devices in their home [232]. This approach has the advantages of being non-invasive and privacy-preserving. Other AIoMT approaches that offer similar advantages, such as those which use channel state information for activity classification [145], [185], could be applied to dementia care in the future.
Caregivers for people with dementia can also be supported by AIoMT technologies. Aggression and agitation are common symptoms of dementia, and can put caregivers at risk of harm. To address this, one pilot study [233] developed an AIoMT system where wearable devices and environmental monitoring were utilized to identify distressed behaviour in dementia patients. Motion and physiological parameters were obtained from accelerometer, photoplethysmogram, sweat, and skin temperature sensors in a wrist-worn device, and further activity data was gathered via multiple cameras. Machine learning was used to process this data and predict distress and agitation. In a subsequent study, the same research group identified that the wearable device alone provided strong accuracy in identifying agitation [234].
Expressing emotions to caregivers can also be challenging for people with dementia, and can itself lead to frustration and aggression. To address this, one recent study [235] proposed an AIoMT system that utilises machine learning to classify the emotions of the dementia patient based on data from wearable EEG sensors. Novel machine learning models, particularly transformer-based models, have also been developed to identify emotion from speech [236] and image-based [237] data; such technology may also be suitable for dementia cohorts. Such technologies could greatly aid caregivers in understanding what their patient is feeling and what care they might require at that point in time.
A digital twin framework for supporting carers and healthcare providers for people with dementia has also been recently proposed [238]. In the proposed framework, AIoMT systems such as those for agitation and emotion assessment would provide real-time feedback to carers while also being utilised to build a digital twin of the patient. When new patients are assessed, AI or other algorithms would be used to find the most similar digital twins from previous patients, before fusing these to create a template digital twin for the new patient. The digital twin profiling can also be used by clinicians to identify signs of deterioration or other relevant trends in a patient's condition.
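One simple way to picture the matching and fusion step is a nearest-neighbour search followed by feature-wise averaging. The sketch below is an assumption about how such a step could be realised and is not the method of [238]; the feature names (normalised agitation frequency, sleep disruption, and mobility score), the twin identifiers, and the choice of Euclidean distance with k = 2 are all hypothetical.

```python
from math import dist


def build_template_twin(new_patient, previous_twins, k=2):
    """Pick the k most similar stored twins and average them into a template.

    Features are assumed to be normalised to a common scale beforehand,
    otherwise the Euclidean distance is dominated by large-valued features.
    """
    nearest = sorted(
        previous_twins, key=lambda twin: dist(twin["features"], new_patient)
    )[:k]
    n_features = len(new_patient)
    template = [
        sum(t["features"][i] for t in nearest) / len(nearest)
        for i in range(n_features)
    ]
    return [t["id"] for t in nearest], template


# Hypothetical stored twins: [agitation frequency, sleep disruption, mobility score]
previous_twins = [
    {"id": "twin_a", "features": [0.2, 0.1, 0.3]},
    {"id": "twin_b", "features": [0.8, 0.9, 0.7]},
    {"id": "twin_c", "features": [0.3, 0.2, 0.2]},
]
new_patient = [0.25, 0.15, 0.25]

matched_ids, template_twin = build_template_twin(new_patient, previous_twins)
print(matched_ids, template_twin)
```

A learned similarity measure or a weighted fusion could replace either step without changing the overall structure.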
Dementia can cause much distress to patients and caregivers alike. Unfortunately, there is still much that remains unknown about dementia, and no cure exists for any dementia-causing condition. Several recent studies have therefore sought to identify factors associated with dementia and dementia-causing conditions to aid future research and care decisions. In one study, demographic and health risk factors for future dementia diagnosis were identified using SHAP scores [239], providing insight into areas where risk could be reduced. In another study, biomarkers were identified from a range of cognitive tests and medical imaging by assessing the relationship between each individual parameter and cognitive decline outcomes [240]. Such results would aid clinicians in identifying at-risk persons.
Other studies have also sought to understand how dementia appears in the brain. One study [241] used LIME scoring to understand which features of MRI imagery and which gene expressions indicate Alzheimer's disease, offering future researchers valuable information regarding areas to target for diagnosis and treatment. Meanwhile, in another study, an explainable machine learning model with a heat-mapping approach was developed to identify regions of the brain associated with Alzheimer's disease [219]. Such a tool could be used to improve clinician trust and enhance our understanding of the condition.
Overall, dementia is a collection of symptoms which can have devastating effects on patients and carers alike. However, AIoMT offers many promising solutions for improving diagnosis and care for people with dementia. It has also shown promise in identifying risks and biomarkers for dementia that will certainly guide future research; it is quite likely that discoveries made by AIoMT will contribute to the discovery of new and improved treatment options for dementia into the future.

B. Stroke and Stroke Recovery
Stroke, also known as cerebrovascular accident (CVA), occurs when there is a disruption of blood flow to the brain. Rapid diagnosis and treatment can minimise damage to the brain; however, long-term impacts affecting speech and mobility are common in stroke survivors. AIoMT has the potential to improve the entire stroke care pipeline. A comprehensive AIoMT system would begin with identification of at-risk persons from routine healthcare information. Several recent studies have demonstrated that machine learning can identify individuals who are at risk of future stroke using a combination of clinical and demographic variables obtained through EHRs and health monitoring. In one study [242], several approaches were used to predict ischaemic and hemorrhagic stroke. An AUROC of 0.974 was achieved using a hybrid CNN to predict ischaemic stroke; meanwhile, an SVM approach achieved the highest AUROC of 0.970 in predicting hemorrhagic stroke. In another study [243], an ensemble approach was used to predict stroke from fundamental demographic and health information, achieving an AUROC of 0.989. The risk prediction step is crucial because, where risk is known, it can be lowered through a range of preventative strategies.
Even where risk factors are reduced, stroke risk cannot be completely eliminated. Where stroke does occur, it is crucial to identify it as quickly as possible to enable early treatment and minimise brain damage. It has previously been shown that stroke can be diagnosed using CNNs to process computed tomography (CT) images of the brain [244]. Some research has also indicated that stroke may be able to be identified prior to hospital admission using RF to process EEG features [245] or SVM to process paramedics' text reports [246], as illustrated in Fig. 8. Another recent study used a biomarker discovery approach incorporating FCNN, RF, and SVM to identify gene expressions that are linked with the occurrence of a stroke, ultimately finding several microRNA molecules as candidates; this knowledge could be useful to accelerate diagnosis of stroke in inpatients. Overall, utilising AIoMT for pre-hospital and in-hospital assessment of the patient could greatly improve speed of diagnosis, and thus support faster treatment of this critical condition.
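Since AUROC is the headline metric for most of the studies discussed in this section, a short sketch of how it is computed may be useful. AUROC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and risk scores below are invented for illustration and do not come from any cited study.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical cohort: 1 = stroke occurred, 0 = no stroke.
labels = [1, 1, 1, 0, 0, 0, 0]
risk_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

print(round(auroc(labels, risk_scores), 4))
```

Here 11 of the 12 positive-negative pairs are ordered correctly, giving an AUROC of 11/12. The pairwise form is quadratic in cohort size; rank-based implementations give the same value more efficiently on large datasets.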
While treatment can minimise damage, post-stroke impacts including speech and mobility difficulties are common. Assessing the likely long-term outcomes for a stroke survivor can improve treatment decision making. It has been shown that this can be achieved using FCNN to process clinical, medication, and demographic information obtained via monitoring and clinician reports [247], [248]. One study [248] also identified parameters associated with adverse outcomes, which included Glasgow Coma Score, atrial fibrillation information, type of stroke, and age.
Where long-term outcomes are non-ideal, interventions such as stroke rehabilitation are commonly used to support recovery. However, adherence to rehabilitation exercises is a challenge after patients are discharged from hospital. Recently, an AIoMT system was developed to identify when participants were performing the prescribed exercises using accelerometer data. Furthermore, the system tracked changes in how the exercises were being completed to identify how the patient's condition changed over time [249]. While recovery is ongoing, falls are a common source of further injury; a recent study [250] has shown that this can be mitigated through AIoMT systems implementing ML-enabled fall detection based on IMU data, with wearable airbags then deployed to minimise impact of a fall.
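To give a sense of the signal such a fall detector works with, the sketch below flags a fall when a near free-fall reading from the accelerometer is followed shortly by a large impact. This fixed-threshold rule is a deliberately simple stand-in for the ML-enabled detection of [250], not a reproduction of it; the threshold values, the gap length, and the sample data are hypothetical.

```python
from math import sqrt


def detect_fall(samples_g, free_fall_g=0.4, impact_g=2.5, max_gap=10):
    """Flag a fall: low acceleration magnitude followed by an impact spike.

    samples_g: sequence of (x, y, z) accelerometer readings in units of g.
    """
    magnitudes = [sqrt(x * x + y * y + z * z) for x, y, z in samples_g]
    for i, magnitude in enumerate(magnitudes):
        if magnitude < free_fall_g:
            # Look for an impact within the next max_gap samples.
            following = magnitudes[i + 1:i + 1 + max_gap]
            if any(v > impact_g for v in following):
                return True
    return False


# Hypothetical readings: steady walking stays close to 1 g.
walking = [(0.0, 0.1, 1.0), (0.1, 0.0, 0.9), (0.0, 0.0, 1.1)]
# Hypothetical fall: brief free fall, then a sharp impact.
fall = [(0.0, 0.0, 1.0), (0.0, 0.1, 0.2), (0.0, 0.0, 0.1),
        (1.5, 2.0, 2.0), (0.0, 0.0, 1.0)]

print(detect_fall(walking), detect_fall(fall))
```

A learned classifier replaces the fixed thresholds with a decision boundary fitted to labelled IMU recordings, which matters when the detector must fire before impact in order to deploy protective equipment.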
Overall, stroke care is a clear use case for AIoMT technology. Early identification of risk using AIoMT can enable patients to reduce their risk where possible. In the event where stroke still does occur, AIoMT can be utilized to support rapid diagnosis and post-stroke care. Finally, explainable techniques can be used to identify biomarkers and risk factors associated with stroke, enabling further enhancements to diagnostic systems and treatment planning, as well as offering avenues for future research into stroke prevention.

C. Breast Cancer
Cancer is a leading cause of death worldwide. Types of cancer are many and varied, and AIoMT has the potential to aid in many stages of diagnosis, treatment, and recovery. In this case study, we focus on the most commonly occurring cancer globally: breast cancer. Due to its prevalence, breast cancer is one of the leading causes of cancer-related mortality. However, if diagnosed and treated at an early stage, survival rates are between 95% and 100% in Australia [251].
AIoMT can aid in improving the breast cancer care pipeline, through improving diagnostics, assessing prognostics, assessing suitable treatment paths, and identifying patients at risk of recurrence. Each of these tasks greatly aids in detecting new or recurring cancer early, maximising the chance of patient survival.
Early diagnosis is currently dependent on access to healthcare and specialist doctors. However, recent research has suggested that AIoMT may enable at-home screening for breast cancer. In one novel study [252], infrared images of the breast taken via a smartphone are processed by several CNN-based models to assess whether the participant has breast cancer. Many of their experiments showed accuracy exceeding 95%, and importantly sensitivity to breast cancer was also high. Performance was highest where coloured infrared images were available, although modest accuracy was also achieved with greyscale images. One limitation of this work is that the database only contained images of breast cancer versus healthy persons; no images of persons with non-cancerous breast lumps or tumours were included. Another study [253] aimed to address this issue through using a deep CNN model to distinguish between benign and malignant breast lumps in mammography images, achieving a significant accuracy of 99.12%.
After diagnosis of breast cancer, it is also useful to identify the severity. As such, several recent studies have investigated AIoMT approaches for grading and staging breast cancers. Histological grading of tumours indicates how rapidly a tumour will grow and spread, and thus is useful when developing treatment plans. One novel study utilised gene expression information from health records along with breast tumour samples to assign a histological grade to the tumour using a gradient-boosted decision tree approach [254]. The results were promising, with an accuracy of 90% and an AUROC of 0.88 achieved. In another work [255], microscopy images of breast cancer tumours are processed using a CNN model to identify the stage of cancer; stage indicates how far the cancer has spread. The accuracy achieved for breast cancer staging was 97.81%, and accuracies exceeding 98% were also recorded for identifying the type of breast cancer. Histological grading and staging both provide important information about the severity of the cancer. Improved assessment of cancer characteristics can minimise the misdiagnosis or underdiagnosis of high-risk cancer, whilst also reducing over-treatment of low-risk cancer.
In addition to grading and staging, several studies have also investigated AIoMT techniques for predicting response to treatment. Neoadjuvant chemotherapy (NAC) is used to reduce the size of the cancer prior to surgical intervention or radiotherapy; however, standard usage is only effective in roughly 70% of patients [256]. Adjustments to treatment can be made to improve outcomes for patients who do not respond to NAC, and thus early identification of non-responsiveness could aid in accelerating the treatment course. In one study [256], features extracted from CT images were processed by boosted decision trees to identify whether tumours would shrink by at least 30% in response to treatment. An accuracy of 88% was achieved; however, AUROC was only 0.632. In another study [257], features extracted from MRI images were fused with clinical variables to predict responsiveness to NAC. Using an FCNN network structure, an AUROC of 0.975 and an accuracy of 91.2% were achieved.
In some cases, NAC can result in a pathologic complete response (pCR), defined as a lack of cancer tissue identified in biopsy samples. This is a positive scenario as the patient is then considered to be in remission, with no further treatment required unless the cancer recurs. Identifying candidates for whom pCR may occur following NAC has the potential to greatly aid in developing treatment plans. As such, one study sought to predict the occurrence of pCR in patients receiving NAC using demographic and clinical parameters from EHRs. After trialling several ML models, it was found that a boosted decision tree approach was the strongest for this task, achieving an AUROC of 0.810.
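High accuracy alongside a modest AUROC is worth pausing on, because the two metrics can diverge sharply. One common way such a gap can arise is class imbalance: a model that mostly predicts the majority class scores well on accuracy while separating the classes poorly. The sketch below illustrates this with an invented cohort; the figures are not from [256], and whether imbalance explains the results of that study is not something the reported numbers establish.

```python
def confusion_metrics(labels, predictions):
    """Accuracy, sensitivity, and specificity for binary labels (1 = positive)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical cohort: 88 responders (1) and 12 non-responders (0).
labels = [1] * 88 + [0] * 12
# A degenerate model that labels every patient as a responder.
always_responder = [1] * 100

metrics = confusion_metrics(labels, always_responder)
balanced_accuracy = (metrics["sensitivity"] + metrics["specificity"]) / 2
print(metrics, balanced_accuracy)
```

The degenerate model reaches 88% accuracy yet identifies none of the non-responders, which is the group the clinical use case most needs to find. Reporting sensitivity, specificity, or AUROC alongside accuracy exposes this failure mode.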
After breast cancer has been successfully treated, there remains a significant risk of recurrence; an outcome linked with mortality. Identification of patients at high risk of recurrence could aid in earlier detection of recurring cancer, thus improving outcomes for the patient. Additionally, recurrence can be minimised through additional treatment. Due to these clinical benefits, one recent AIoMT study sought to predict the risk of recurrence from features of histopathological microscopy images [258]. Using a boosted decision tree approach, an AUROC of 0.72 was achieved. Additionally, a feature importance ranking approach identified several potential predictors of recurrence from among the features. One limitation of this work is that no distinction was made between regional recurrence and distant recurrence (also known as metastasis), with the latter known to be strongly correlated with mortality; a similar approach that made this distinction would likely be more useful in guiding treatment planning.
Fig. 9. Example of explainable AIoMT system for diagnosing, staging, and predicting treatment outcomes from breast cancer imagery.
While the AIoMT approaches explored in this case study have the potential to improve cancer diagnosis and treatment planning, the unfortunate fact remains that not all cancers will respond to existing treatments. Due to this, breast cancer remains a significant cause of mortality globally. Improving upon existing treatments and identifying new ones is therefore critical to reduce mortality rates in breast and other cancers, and is of constant interest to cancer researchers. Treatment development is another area that AIoMT, and particularly explainable AI, has the potential to assist with. In one recent study [259], ribonucleic acid (RNA) sequencing of immune cells in the tumour microenvironment (the healthy cells immediately surrounding the cancerous ones) was conducted, and SHAP scores were calculated to understand which microenvironment features were related to positive outcomes. Their work identified that B cells, CD8+ T cells, M0 macrophages, and NK T cells are critical microenvironment features associated with ≥5 year survival rates. These results can aid future researchers in targeting treatments towards altering tumour microenvironments to be more hostile to the tumour, thus improving prognosis. In another study [260], a heat-mapping approach was used to highlight morphological features of histopathology images that are linked with molecular features such as gene expressions, and thereafter linked with prognosis. This identification of molecular features associated with prognosis can greatly aid future cancer researchers in developing targeted treatments and applying precision medicine.
Overall, this case study highlights that AIoMT technologies offer great value to breast cancer diagnosis and treatment, as well as improving our understanding of breast cancers. As illustrated in Fig. 9, the diagnosis, grading and staging, and predicted treatment outcomes of breast cancer could all be combined into a single system, offering great benefit to low-resource settings. Many of the AIoMT strategies explored here can be easily adapted to other cancers, and thus AIoMT has the potential to have a significant positive impact on a much broader cohort of cancer patients and their loved ones.

D. COVID-19 Management
The SARS-CoV-2 virus was first identified in late 2019, and it rapidly began to spread. COVID-19, the disease caused by the virus, has since led to millions of deaths around the world. Despite improvements in treatment and increasing rates of vaccination, the highly virulent SARS-CoV-2 continues to spread and cause a significant number of deaths.
Due to the significant global impact of COVID-19, the development of AIoMT techniques and systems to diagnose, treat, and manage the spread of the disease has been an extremely active field of research. In this case study, we highlight how AIoMT can be used in the ongoing fight against COVID-19.
Early detection of SARS-CoV-2 infection is critical to minimise the spread of the disease. To assist with this, several recent studies have investigated the use of AIoMT techniques based on common devices to detect the onset of COVID-19 in early stages. In one study [261], an AIoMT framework was proposed wherein symptoms are collected via a smartphone using a combination of sensing and survey-based approaches. Several machine learning models were trialled for identifying COVID-19 from such data, with an FCNN approach achieving the strongest performance of 0.955 AUROC and 92.89% accuracy.
Another recent study [134] sought to identify the onset of COVID-19 prior to the development of symptoms. Time-series vital sign information was obtained using a commercially available wrist-worn device, and an LSTM model was developed to process the data. An AUROC of 0.68 and a sensitivity of 0.73 were achieved. While these numbers leave room for improvement, detection of any cases prior to symptom onset is a significant outcome for reducing the spread of SARS-CoV-2, as quarantine can begin prior to the more infectious symptomatic period.
Another approach for COVID-19 diagnosis is medical imagery AIoMT systems, which can be helpful for processing many patients quickly in lower-resource settings. One study [262] trialled various CNN structures for distinguishing COVID-19 from healthy and pneumonia-affected lungs in CT images, achieving an accuracy of 99.51% and an AUROC of 0.994. Another study [263] trialled a broad range of ML models for distinguishing between COVID-19 affected and healthy persons based on chest x-ray images, achieving an accuracy of 94.7% with both a ResNet and an SVM structure. Lung ultrasound images have also been considered as a data source, with one pilot study [164] applying a vision transformer approach and achieving a sensitivity of 60%; a result that could likely be improved upon with further research. Of these approaches, an x-ray based method is likely the most useful. X-ray images are generally cheaper and faster to gather than CT scans, and strong accuracy has been achieved in research to date.
After COVID-19 diagnosis, many high-risk patients require ongoing monitoring so that deterioration can be identified. In Australia, many high-risk patients are admitted to virtual care [264], where clinical staff remotely monitor symptoms and request hospitalization in cases where the patient's condition worsens. However, the monitoring is predominantly based on self-reporting and clinicians do not always check in daily. A better solution would be to remotely monitor health parameters automatically and continuously, so that potential issues can be flagged more quickly. This was the focus of one recent study [265], where a wearable sensor mesh built into a vest was developed containing PPG, ECG, electromyography, acoustic cardiography, and acoustic myography sensors. Cough and breathing sounds were classified into COVID-19 and non-COVID-19 categories using CNNs in a generative adversarial network approach. Preliminary results showed that this enabled identification of COVID-19 in 80% of cases, but the authors indicate that the intention of this system is to eventually monitor the condition and recovery of patients in a telehealth context. In a similar way, the diagnostic study presented in [134] would be extendable to remotely monitor recovery of virtual ward patients. Such monitoring could be fused with existing systems for self-reporting symptoms to further improve assessment of patient condition, as illustrated in Fig. 10.
In terms of determining which patients should be admitted to virtual or physical wards, AIoMT-based models for identifying severity of illness and prognosis offer much potential. One recent study used RF models to classify COVID-19 severity based on features extracted from CT scans along with clinical variables [266]. Severity was classed as moderate, severe, or critical based on clinical staging. Distinction between moderate versus severe/critical was achieved with an AUROC of 0.927, and subsequent distinction between severe and critical was achieved with an AUROC of 0.929. Furthermore, partial vs prolonged recovery was classified with an AUROC of 0.960; complete recovery was not included as insufficient complete recovery cases were available. Lastly, RF regressors were developed to predict length of treatment parameters including duration of hospitalization, duration of intensive care stay, and duration of oxygen inhalation. Root mean square errors (RMSEs) of 0.88, 0.69, and 0.92 weeks were recorded for these three parameters, respectively. Each of these findings provides valuable information for clinicans making triage decisions, particularly in hospitals that are at or over capacity due to COVID-19 outbreaks.
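For readers less familiar with regression metrics, RMSE is the square root of the mean squared difference between predicted and observed values, so it is expressed in the same unit as the target (weeks, in this case). The following sketch computes it for a handful of invented hospitalization durations; the values are hypothetical and unrelated to the data of [266].

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical durations of hospitalization, in weeks.
actual_weeks = [2.0, 3.0, 1.0, 4.0]
predicted_weeks = [3.0, 3.0, 2.0, 3.0]

print(round(rmse(actual_weeks, predicted_weeks), 3))
```

Because errors are squared before averaging, a few large misses raise RMSE more than many small ones, which is relevant when a small number of patients have unusually long stays.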
The metrics assessed in [266] are all very useful for triaging patients; however, they do not include the prediction of mortality risk. Mortality prediction can further aid in triage by identifying the most critically ill patients, allowing for prioritisation of resources. As such, several studies have investigated mortality prediction for COVID-19. In one study [267], variables including demographics, clinical and laboratory results, and CT image features were processed using gradient boosted RF models, achieving an AUROC of 0.9521 in distinguishing mortality from non-mortality. In another study [268], clinical variables extracted from EHRs were utilised to predict mortality, achieving an AUROC of 0.941 with an RF approach. Using univariate analysis, it was further identified that individual features could predict mortality with reasonably high AUROC; leukomonocyte percentage, urea, age, and blood oxygen saturation on their own yielded AUROCs of 0.917, 0.867, 0.826, and 0.704, respectively. Another study also sought to understand the clinical factors associated with COVID-19 mortality, first developing an RF model that achieved 83.4% accuracy in mortality prediction before applying principal component analysis. Age, impaired renal function, and elevated C-reactive protein (an indicator of acute inflammation) were the three factors related most strongly to mortality.
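To make the univariate analysis concrete, the following minimal sketch computes the AUROC of a single feature used directly as a risk score. It relies on the standard interpretation of AUROC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as half. The function name and the toy values are hypothetical illustrations and are not data or code from [268].

```python
from typing import Sequence


def univariate_auroc(values: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC of one feature used directly as a risk score.

    Computed as the fraction of (positive, negative) pairs in which the
    positive case has the higher value; ties count as half a pair.
    """
    positives = [v for v, y in zip(values, labels) if y == 1]
    negatives = [v for v, y in zip(values, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: patient age and outcome (1 = mortality).
toy_age = [34, 47, 52, 61, 66, 70, 78, 83]
toy_outcome = [0, 0, 0, 1, 0, 1, 1, 1]
toy_auc = univariate_auroc(toy_age, toy_outcome)
```

A feature for which lower values indicate higher risk, such as blood oxygen saturation, would be negated before being passed to this function; otherwise the reported value falls below 0.5.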
In patients who survive COVID-19, particularly after severe illness, there is a prevalent risk of developing so-called 'long COVID', characterised by prolonged post-COVID symptoms that include fatigue, respiratory symptoms, and cognitive difficulties and may last for several months. Relatively few studies have investigated the prediction of long COVID over a long period. In one early study [269], demographic and clinical variables obtained during initial SARS-CoV-2 infection were processed using various ML techniques to predict indicators of long COVID six months after infection. An ensemble learning approach that combined models including RF, SVM, and FCNN achieved strong results in predicting the presence of long-COVID indicators after 6 months. In particular, AUROC values of 0.81, 0.75, 0.72, and 0.6 were achieved for predicting the presence of any CT abnormality, severe CT abnormality, lung function impairment, and symptoms, respectively.
The use cases of AIoMT in management of COVID-19 are not limited to treatment of individuals. AIoMT approaches have also been investigated on a population-wide level to model the spread of COVID-19, enabling governments and healthcare providers to better prepare for outbreaks. In one study [270], forecasting of next-day recovered and new confirmed cases based on case data from the previous 5 days was conducted using a variety of models. A CNN-LSTM model was the strongest performer, predicting next-day cases with a mean absolute percentage error (MAPE) of 0.628-6.021% across a range of different countries. The CNN-LSTM model was also the strongest at predicting recovered cases, with MAPE values of 1.180-5.395% in most countries of interest; the exception was India, where a MAPE of 16.113% was recorded. It is possible that this is due to how recoveries are recorded, rather than a fault of the model. In another study [271], confirmed case numbers were predicted one week in advance using data about current case numbers and enforced COVID-19 policies such as mask wearing and capacity limits on gatherings. Several models were trialled, with LSTM showing the strongest performance. In most tests, a MAPE of ≤10% was achieved. However, MAPE exceeded 100% in two scenarios where the model was applied to data from Brazil. It was hypothesized that this was due to reporting issues. Overall, methods for forecasting upcoming case numbers offer valuable information that can support hospitals and health centres in preparing for case spikes, which in turn supports better outcomes for patients.
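The forecasting setup and error metric described above can be sketched as follows. This is only an illustration of how a 5-day lookback window is paired with a next-day target and how MAPE is calculated; the function names, the toy case counts, and the naive "repeat the last day" forecaster are assumptions introduced here and do not reproduce the CNN-LSTM or LSTM models of [270], [271].

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], lookback: int = 5
) -> List[Tuple[List[float], float]]:
    """Pair each run of `lookback` daily counts with the next day's count."""
    return [
        (list(series[i:i + lookback]), series[i + lookback])
        for i in range(len(series) - lookback)
    ]


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical daily confirmed-case counts.
toy_cases = [100, 110, 120, 130, 140, 150, 160, 170]
windows = make_windows(toy_cases)

# Naive stand-in forecaster: predict that tomorrow equals the last observed day.
naive_pred = [history[-1] for history, _ in windows]
targets = [target for _, target in windows]
naive_mape = mape(targets, naive_pred)
```

Because the actual value is the denominator, very small or under-reported actual counts inflate MAPE sharply, which is consistent with the reporting-related explanations offered for the outlying results.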
AIoMT techniques have also been utilised to identify potential medications that may aid with improved treatment of COVID-19. In one novel study, an FCNN was used to identify compounds that interacted with various COVID-19 proteins of interest, in an effort to identify candidate drugs for COVID-19 treatment. Using an explainable AI technique of leave-one-out random sampling, several candidate medications were identified, including medications already approved for use against other conditions such as hepatitis C. These results are significant and may aid in accelerating the development and implementation of novel and effective treatments for COVID-19, which in turn would greatly improve patient outcomes.
Overall, the use of AIoMT systems is critical in the ongoing fight against COVID-19. From this case study, it is clear that AIoMT can be used to support rapid diagnosis, triage patients based on their risk, and identify patients at risk of long COVID. On a population scale, AIoMT can also be used to model the spread of COVID-19, enabling health systems to better prepare for outbreaks. Lastly, treatment discovery can be supported by AIoMT; this may also be applied to development of enhanced and longer-lasting vaccinations as new variants of SARS-CoV-2 continue to emerge.

E. Lessons Learned
Through our exploration of multiple case studies, we have identified that AIoMT systems offer significant benefits to a broad range of healthcare problems, from the patient level through to the community level. Our case study analysis has also highlighted how AIoMT can be used throughout the entire care pipeline for a range of illnesses; from diagnosis, through to monitoring and management, and in some cases to treatment and recovery.
The use cases examined here focus on a select few prevalent health conditions; however, it is clear that many of the techniques applied could be transferred to other widespread health conditions. Many of the techniques developed for dementia care, including activity detection and emotion communication, can be adapted for use in other forms of assisted living, and for other persons with high care needs. Research on stroke recovery has identified the potential of AIoMT for use in rehabilitation; this could be explored for many other forms of rehabilitation from injury and illness. The AIoMT techniques utilised for breast cancer diagnosis, management, and treatment have clear applications to other forms of cancer. Finally, many of the technologies utilized for COVID-19 management could be adapted for use in the management of influenza and other epidemics.
One limitation that was clear across all use cases examined is that most works to date have applied AI retrospectively to data gathered by IoT; very few works have conducted real-time trials of end-to-end AIoMT systems. As such, many systems require testing in real healthcare settings to confirm their performance and assess their impact before they can be deployed at scale. This should be a key goal for future research, and will require collaboration between technology and health experts.

V. FUTURE RESEARCH DIRECTIONS
Our review has identified that AIoMT is already making an impact on health monitoring and management. However, there are many areas where improvements can be made. This section highlights key challenges and opportunities that future researchers should consider.
A similar limitation is evident in the AIoMT studies that seek to process image-based data [133], [184], [185], which primarily use CNN or SVM. Incorporating advanced computer vision models such as autoencoders and vision transformers would likely lead to significant improvement in performance. As such, there are substantial research opportunities which remain in the application of advanced ML models to raw image or time-series data.
2) Embedded and Edge AI: Most AIoMT studies to date have primarily focused on one level of AI computing, generally edge computing performed on desktop machines or cloud computing. Significant research opportunity remains in developing and implementing lightweight AI algorithms that can be moved to the embedded level or lower-powered edge devices. In scenarios where more computational power is required, it would be beneficial to utilise intelligent offloading and cross-layer AI approaches to optimise the use of available resources. This is therefore suggested as a direction for future research in this area, as cross-layer approaches would reduce latency, increase system robustness, and support patient privacy.
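As a rough illustration of what an offloading decision might look like, the sketch below picks the lowest computing layer that can meet a latency deadline while respecting a privacy constraint. Everything in it is a hypothetical assumption made for illustration: the `Tier` fields, the simple latency model (round trip plus transfer plus compute time), and the example throughput figures are not drawn from any study reviewed here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tier:
    name: str
    ops_per_second: float       # sustained compute throughput of this layer
    uplink_bytes_per_s: float   # link speed from the sensor (inf if local)
    round_trip_s: float         # fixed network latency to reach this layer
    leaves_premises: bool       # True if raw data leaves the home or clinic


def estimated_latency(tier: Tier, model_ops: float, payload_bytes: float) -> float:
    """Crude latency estimate: network round trip + transfer + compute."""
    transfer = payload_bytes / tier.uplink_bytes_per_s
    return tier.round_trip_s + transfer + model_ops / tier.ops_per_second


def choose_tier(
    tiers: Sequence[Tier],
    model_ops: float,
    payload_bytes: float,
    deadline_s: float,
    allow_offsite: bool,
) -> Optional[Tier]:
    """Return the first (lowest) tier that meets the deadline, else None."""
    for tier in tiers:
        if tier.leaves_premises and not allow_offsite:
            continue  # privacy constraint: keep raw data on the premises
        if estimated_latency(tier, model_ops, payload_bytes) <= deadline_s:
            return tier
    return None


# Hypothetical layers, ordered from closest to the patient to furthest away.
TIERS = [
    Tier("embedded", 1e7, float("inf"), 0.0, False),
    Tier("edge", 1e9, 1e6, 0.01, False),
    Tier("cloud", 1e11, 1e5, 0.1, True),
]

small_choice = choose_tier(TIERS, 1e6, 1e4, 0.5, allow_offsite=True)
medium_choice = choose_tier(TIERS, 1e9, 1e4, 2.0, allow_offsite=True)
large_choice = choose_tier(TIERS, 5e10, 1e4, 2.0, allow_offsite=True)
private_choice = choose_tier(TIERS, 5e10, 1e4, 2.0, allow_offsite=False)
```

Listing the tiers from embedded to cloud means the rule prefers local processing whenever it is fast enough, which reflects the latency and privacy motivations given above. A practical system would also need to account for energy use and link availability.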
3) Data Fusion: Through our exploration of diagnosis systems in Section III-B, it was identified that many studies used different types of inputs to achieve diagnostic goals. For example, the studies which focused on diagnosing depression [144], [195], [196] each used different types of inputs. Two studies used image-based data, while another used clinical and demographic information. Each of the studies showed promising results, and thus it is likely that performance could be improved further by fusing multiple data sources together; for example, a combination of EEG, other clinical variables, demographic parameters, and medical imagery would likely provide higher diagnostic performance than any single data source can provide alone. As such, data fusion for diagnostics and prognostics remains an open research opportunity and is recommended as a future direction.
4) Reducing Dependency on Clinical Parameters: Another limiting factor with many recent studies is a dependence on clinical and laboratory variables. While these may be readily available for hospitalised patients, they are unavailable for persons with limited access to healthcare or in telehealth applications; thus, such systems offer little value outside of clinical settings. Future AIoMT studies would benefit from investigating strategies to reduce dependence on parameters obtained in clinical settings, and instead develop systems that can monitor and diagnose based solely on data from wearables and environmental sensors.

5) Descriptive Illness Classification: In terms of diagnosis and prognosis, a key limitation of many studies is that outcomes are classified in a binary manner where further classification would be more meaningful. In one study [211], 'serious outcomes' were classified in a binary manner; however, it would be more meaningful to further classify these outcomes into categories such as neurological, respiratory, and infection. Additionally, categorization of AKI severity would further improve the work presented in [212], as would identification of when cancer recurrence is most likely to occur in [216], [217]. This would better enable clinicians to devise treatment plans and thus further improve outcomes for these patients. A similar approach would also be helpful in diagnostics, where classification of cancer stage or identification of anxiety type would greatly aid in treatment decision making for clinicians. As such, the development of more descriptive prognostic and diagnostic tools is a significant opportunity for future researchers.
6) Explainability: Explainability tools offer much potential, and a significant area for future research is simply applying these tools to complex diseases. This can aid in identifying biomarkers and characteristic features of diseases, consequently guiding development of enhanced diagnostic practices, novel medications, and other improvements to patient care. Application of established explainability techniques such as SHAP and LIME offers significant research opportunity in this domain. The development of enhanced explainability tools is also recommended as a future direction for research; studies in this domain should use SHAP and LIME explainability as a benchmark for comparison.
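For readers unfamiliar with how such attributions are derived, the sketch below computes exact Shapley values, the game-theoretic quantity on which SHAP is built, for a tiny toy model. It is not the SHAP library's implementation: it enumerates every feature subset, which is only feasible for a handful of features, and it assumes that an "absent" feature is represented by substituting a baseline value. The function name, toy model, and inputs are hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def exact_shapley(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley attribution of model(x) - model(baseline) per feature."""
    n = len(x)

    def value(present: frozenset) -> float:
        # Features outside `present` are replaced by their baseline value.
        mixed = [x[i] if i in present else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(f: Sequence[float]) -> float:
    """Hypothetical risk score with one interaction between features 0 and 2."""
    return 0.5 * f[0] + 2.0 * f[1] + 1.0 * f[0] * f[2]


toy_phi = exact_shapley(toy_model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
```

In this toy case the interaction term is split evenly between the two features involved, and the attributions sum to the difference between the prediction and the baseline prediction. This additivity is part of what makes such attributions straightforward to present as feature importance plots.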
The use of explainability tools for communication of individual decisions to clinicians is also a promising domain. Clinician trust is critical for adoption of AIoMT systems in the healthcare space. Heat-mapping and feature importance plotting techniques can clarify how a model makes a decision; however, relatively few studies have sought to confirm whether these explanations actually improve clinician trust and understanding. It is thus essential that future research seeks to validate developed explainability tools through surveying expert clinician cohorts, as this will help to ensure that developed AIoMT systems are socially responsible and genuinely useful to healthcare providers. The development of a standardized framework for validating and comparing explanation tools would have a significant impact in this domain.
Overall, significant advancements have been made in the AIoMT domain in recent years. However, many studies still focus on either IoT or AI. It is critical that future research considers the synergy between these two technologies, as they offer more benefit to healthcare systems when used together rather than apart.

VI. CONCLUSION
In this work, we have conducted a scoping review of state-of-the-art literature in the AIoMT domain, highlighting the strong synergy between AI and IoT technologies. Our review begins with an exploration of the key building blocks of AIoMT, first examining prevalent and emerging sensors and devices in the literature, with a particular focus on noninvasive and privacy-preserving approaches. Communications in the licensed and unlicensed bands are also considered, along with IoT over satellite. Upcoming communications standards including RedCap and 6G-IoT are included in our exploration. Machine learning algorithms for healthcare applications are then explored, covering a broad range of well-established and novel algorithms that have been implemented for a wide variety of healthcare problems in recent research. We conclude our exploration of the AIoMT architecture by examining the layers of learning, that is, the computing resources on which AIoMT depends. Embedded, edge, and cloud AI resources are all considered, as are the novel methodologies of federated and swarm learning for privacy-preserving machine learning that have been proven suitable for healthcare settings.
With the architecture of AIoMT established and thoroughly explored, our review then continues to uncover novel research conducted in key health domains. Wearable and non-contact health monitoring solutions were explored, with many found to show strong performance in monitoring general health. AIoMT systems for diagnosing conditions were also thoroughly explored, spanning systems for diagnosing physical, mental, and developmental conditions. Many systems in the literature performed strongly in diagnosing the condition of interest; however, much room for improvement remains, particularly in distinguishing between conditions with overlapping symptoms. Prognosis was also found to be an active field in the literature, with a strong body of work seeking to identify short-term and long-term patient outcomes in various conditions and healthcare settings, a critical task for triaging resources and determining treatment paths.
The novel area of explainable AI was also explored in the context of AIoMT. Heat-mapping and feature importance plotting approaches in the literature have both shown potential for improving clinician interpretation of decisions made by AIoMT systems; however, further research is needed to validate whether this genuinely improves clinician trust. The second area where explainability shows significant potential is in understanding the biomarkers associated with diseases, conditions, and outcomes. Global-level feature importance examination can aid in the development of improved medications, tailored treatment, and rapid diagnostics. The use of these methods is relatively new in the literature, and thus significant opportunity remains in applying these strategies to improve understanding of novel and challenging health conditions.
To illustrate the importance of AIoMT in practical settings, several use cases are presented. We first identify pioneering works seeking to develop AIoMT systems and compatible tools for supporting persons with dementia and their carers, both in independent and dependent living environments. We then explore AIoMT techniques for the stroke care pipeline, from diagnosis to rehabilitation. Next, we investigate works that have sought to improve breast cancer diagnosis, treatment and recovery using novel AIoMT approaches. Lastly, COVID-19 management is considered, both for providing individual care and for monitoring and minimising the spread on a population-level scale.
Based on our thorough analysis of trailblazing studies in the AIoMT literature, we then present a synthesis of lessons learned and identify several key areas for future research. Embedded computing and the implementation of advanced AI algorithms were identified as critical for practical AIoMT systems in many domains. Additionally, improvement and validation of explainability tools offers clear and significant opportunity to future researchers. Overall, the domain of AIoMT offers many exciting opportunities for researchers seeking to make a significant impact as we move towards Healthcare 5.0.