Can Dialysis Patients Identify and Diagnose Pulmonary Congestion Using Self-Lung Ultrasound?

Background: With the recent developments in automated tools, smaller and cheaper machines for lung ultrasound (LUS) are leading us toward the potential to conduct POCUS tele-guidance for the early detection of pulmonary congestion. This study aims to evaluate the feasibility and accuracy of a self-lung ultrasound study conducted by hemodialysis (HD) patients to detect pulmonary congestion, with and without artificial intelligence (AI)-based automatic tools. Methods: This prospective pilot study was conducted between November 2020 and September 2021. Nineteen chronic HD patients were enrolled in the Soroka University Medical Center (SUMC) Dialysis Clinic. First, we examined the patient’s ability to obtain a self-lung US. Then, we used interrater reliability (IRR) to compare the self-detection results reported by the patients to the observation of POCUS experts and an ultrasound (US) machine with an AI-based automatic B-line counting tool. All the videos were reviewed by a specialist blinded to the performer. We examined their agreement degree using the weighted Cohen’s kappa (Kw) index. Results: A total of 19 patients were included in our analysis. We found moderate to substantial agreement between the POCUS expert review and the automatic counting both when the patient performed the LUS (Kw = 0.49 [95% CI: 0.05–0.93]) and when the researcher performed it (Kw = 0.67 [95% CI: 0.67–0.67]). Patients were able to place the probe in the correct position and present a lung image well even weeks from the teaching session, but did not show good abilities in correctly saving or counting B-lines compared to an expert or an automatic counting tool. Conclusions: Our results suggest that LUS self-monitoring for pulmonary congestion can be a reliable option if the patient’s count is combined with an AI application for the B-line count. 
This study provides insight into the possibility of utilizing home US devices to detect pulmonary congestion, enabling patients to have a more active role in their health care.


Introduction
POCUS has evolved over the last few decades, transforming from a large, cumbersome machine with limited diagnostic capabilities to a modern lightweight device capable of providing a wealth of diagnostic information [1,2].
Initially, POCUS was limited to static visual inspection and anatomical measurements. Now, US machines offer displays in real time, enabling functional assessments [2,3]. We hope that this research can provide a foundation for future studies that will investigate the potential of LUS in supporting the remote evaluation of patients.

Participants and Setting
This was a prospective pilot single-center study conducted at the nephrology ward at SUMC, Be'er Sheva, Israel, from November 2020 to September 2021.
The patients were recruited from the dialysis clinic using a convenience sampling method, as this study was conducted as a pilot study (Table 1). Subjects were included in the study if they received chronic dialysis, were physically capable of performing US on themselves, had a smartphone, used a messaging application (this criterion was designed to ensure that participants had basic technological capabilities), agreed to participate, were cognitively intact (as judged by the treating nephrologist), and could sign an informed consent form. Subjects were excluded from the study if they were morbidly obese (BMI > 35) or had a history of lung resection or lung transplant. Patients who fulfilled all the inclusion criteria and none of the exclusion criteria were approached by a member of the research team and invited to take part in the study (Figure 1). The purpose, procedures, and study objectives were explained to each potential participant. The patients were asked to sign an informed consent form upon agreeing to participate in the study. Subsequently, they were taught to self-scan the lungs before and after dialysis treatment, as detailed below. The hospital's ethical committee approved the study (0218-20-SOR).

Study Flow
US examinations were conducted using the Venue Go TM, a GE Healthcare US device with the cardiac 3Sc-RS probe (3 MHz), which is suitable for the lung scan and identification of B-lines when using the appropriate preset settings designated for lung scanning (FPS 31, F 3.6, P 0 dB, Dynamic range 60 dB, Depth 16 cm, and Gain 0 dB); this US device has auto-B-lines detection capabilities that were validated in a previous study [6]. The study plan involved meeting each patient for three dialysis treatments, during which a scan in each treatment would be performed before and after the dialysis. The patient would perform scans in two zones, while the researcher would perform in four zones. In the first session, participants were taught to use the US by holding the probe on the midclavicular line at zone 1. This zone was selected to simplify the procedure and because the anterior zones have high sensitivity and specificity for detecting B-lines [51,52]. They also received a brochure explaining B-lines and how to identify and count them using examples with images. The brochure was translated into all relevant languages and was provided to each patient to review at home and before scanning during each dialysis visit (Appendix A).
At the first visit, after the teaching session, the patients were asked to identify the number of B-lines on four separate printed images. This was conducted to ensure patients understood B-lines and how to identify them. The scan was performed on both anterior chest sides unless there was a subclavian dialysis catheter on one side; in this case, the scan was not performed on that side. In the following sessions, the participants performed the scan independently and counted the number of B-lines without any reminder or instruction from the study member. At each visit, the research team saved LUS clips from the participants' scans once the patient was satisfied with the images. The researcher performed scans in the same locations as the patient and in the mid-axillary region on both sides (Figure 2). To avoid bias, the researchers' scans were acquired only after the patients' scans. Images from researchers' clips were also recorded and stored on the US machine. Before the clips were saved, the number of B-lines from all scans was counted by the automatic tool and kept in the device memory.

Once all the scans were collected, an experienced physician operator (LF, 10 years of experience), blinded to the performer (researcher or patient), reviewed all the clips. The reviewer assessed and counted B-lines independently, blinded to the automatic tool count and patient count. Additionally, the reviewer estimated the performer's ability to correctly capture the lung ultrasonographic anatomy: the identification of a clear, centered pleural line surrounded by two acoustic shadows. This evaluation was conducted on a scale of 0 to 2 for both researchers and patients.
The lung scans were recorded and ranked in the following categories: 0 (poor): the anatomical capture is not clear enough for B-line detection; 1 (good): the anatomical capture is not perfect, but the reviewer can assess B-lines with moderate confidence; and 2 (excellent): the ideal anatomical markers of a narrow pleural line with two acoustic shadows on either side (rib space) can be visualized, and the reviewer can assess B-lines with confidence.
Data collection for this study included demographic factors such as age, gender, ethnicity, years of education, and native language, as well as medical history information, including medications and chronic diseases. Clinical data included dates of admission, treatment details such as dialysis duration (years) and the number of treatments per week, vital signs, and dry weight. Finally, US results were collected, including B-lines (BLS) detected in scans acquired by the patient from two zones and by the researchers from four zones. B-line counts were collected from four different sources: the patient, the researcher, the expert, and the AI automatic tool.

Statistical Analysis
Since our reported observations had three ordinal levels, we used weighted Cohen's kappa to assess the observers' agreement and to weigh the disagreements differently. We report Kw statistics and 95% confidence intervals, and compare them in an inter-observer kappa agreement matrix. The expert's count on clips captured by the researcher was defined as the "gold standard". Kw statistics were interpreted as follows: Kw = 0 (as good as a guess), Kw = 0.01-0.2 (slight agreement), Kw = 0.21-0.4 (fair agreement), Kw = 0.41-0.6 (moderate agreement), Kw = 0.61-0.80 (substantial agreement), Kw = 0.81-0.99 (near-perfect agreement), and Kw = 1.00 (perfect agreement).
The analysis was conducted using the R statistical programming language, version 4.1.2.
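To make the agreement statistic concrete, the following is a minimal Python sketch of a weighted Cohen's kappa for two raters on three ordinal levels. It is an illustration only and not the authors' R code: the linear weighting scheme, the coding of the levels as 0, 1, and 2, the function name `weighted_kappa`, and the example ratings are all assumptions, since the text does not state which weights were used or how the levels were coded.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_levels=3):
    """Linearly weighted Cohen's kappa for ordinal levels 0..n_levels-1.

    Assumption: linear disagreement weights; the paper does not say
    whether linear or quadratic weights were used.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    joint = Counter(zip(rater_a, rater_b))  # observed pairs of ratings
    marg_a = Counter(rater_a)               # rater A's level frequencies
    marg_b = Counter(rater_b)               # rater B's level frequencies
    observed = 0.0
    expected = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            # Weight is 0 for agreement and 1 for the widest disagreement.
            w = abs(i - j) / (n_levels - 1)
            observed += w * joint[(i, j)] / n
            expected += w * (marg_a[i] / n) * (marg_b[j] / n)
    if expected == 0:
        # Both raters used one identical level; kappa is formally
        # undefined, and this sketch treats it as perfect agreement.
        return 1.0
    return 1.0 - observed / expected


# Hypothetical ratings for six clips (not study data).
expert_levels = [0, 0, 1, 1, 2, 2]
auto_levels = [0, 1, 1, 1, 2, 1]
kw = weighted_kappa(expert_levels, auto_levels)
print(f"Kw = {kw:.2f}")
```

With these made-up ratings, two of six clips differ by one level, so the weighted disagreement is small relative to what chance alone would produce. A confidence interval would need an additional step (for example, a bootstrap), which this sketch omits.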


General Information
Twenty-one patients were recruited for the study. Two were excluded because of severe vision problems or refusal to participate (Figure 1). A total of 283 clips were reviewed: 137 saved by the patients and 146 saved by the researchers. Of the 19 patients enrolled in the study, 12 (63%) were men, and the average age was 65.7 years. Most patients received three dialysis treatments per week, and a few underwent four. Most patients were Jewish (17, 89%) and native Hebrew speakers (16, 84%). In total, 7 (37%) had diabetes, 16 (84%) had high blood pressure, 7 (37%) had ischemic heart disease, and 2 (10%) had previously undergone a kidney transplant (Table 1).

B-Line Count Agreement
There was a moderate agreement between the gold standard (expert count on researcher's clip) and the auto-B-line count on the patients' scans (N = 91, Kw = 0.49 [95% CI: 0.05-0.93]). There was a substantial agreement between the gold standard and the auto-B-line count from the researcher's scan (N = 241, Kw = 0.67 [95% CI: 0.67-0.67]). There was a slight, non-significant agreement between the patients' count on self-clips and the gold standard (N = 129, Kw = 0.2 [95% CI: −0.89-1.0]; Table 2). There was a fair agreement between the expert's count on the patients' clips and the gold standard, but without statistical significance (N = 69, Kw = 0.3 [95% CI: −0.76-1.0]) (Table 2).
Note to Table 2: Weighted kappa values (Kw) were used for the estimation of agreement between observers. In the study, clips were saved by the patients (patients' clip) and by the researchers (researcher's clip). All the clips of the patients and of the researchers were counted for B-lines by an expert doctor (expert's count) and by the auto tool (auto-B-line count). The expert's count on researchers' clips was considered the "gold standard". Please note that the calculation of BLS by the AI is made in real time, independent of the clip chosen by the patients to save. The * marks the results that were statistically significant with p-value < 0.05.
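As a reading aid, the sketch below maps each reported Kw onto the agreement bands defined in the Statistical Analysis section and checks whether its 95% confidence interval excludes zero. This is an illustrative Python helper, not part of the study's analysis: the function names are hypothetical, treating Kw at or below zero as "no better than a guess" is an assumption, and using the confidence interval as a stand-in for the p-value test marked in Table 2 is also an assumption.

```python
def interpret_kw(kw):
    """Return the agreement label for a weighted kappa value.

    Band edges follow the paper's interpretation scale; values at or
    below zero are lumped together here (an assumption).
    """
    if kw >= 1.0:
        return "perfect"
    if kw > 0.80:
        return "near-perfect"
    if kw > 0.60:
        return "substantial"
    if kw > 0.40:
        return "moderate"
    if kw > 0.20:
        return "fair"
    if kw > 0.0:
        return "slight"
    return "no better than a guess"


def ci_excludes_zero(lower, upper):
    """True when the interval lies entirely on one side of zero."""
    return lower > 0.0 or upper < 0.0


# Values as reported in the text: (comparison, Kw, CI lower, CI upper).
reported = [
    ("gold standard vs. auto count on patients' scans", 0.49, 0.05, 0.93),
    ("gold standard vs. auto count on researcher's scans", 0.67, 0.67, 0.67),
    ("gold standard vs. patients' own count", 0.20, -0.89, 1.0),
    ("gold standard vs. expert count on patients' clips", 0.30, -0.76, 1.0),
]

for name, kw, lower, upper in reported:
    flag = "CI excludes 0" if ci_excludes_zero(lower, upper) else "CI includes 0"
    print(f"{name}: Kw = {kw:.2f} ({interpret_kw(kw)}; {flag})")
```

The two comparisons involving the automatic tool have intervals above zero, while the two comparisons that rely on patient-saved clips or patient counts have intervals spanning zero, which matches how the text describes them.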

Correct Lung Image
To test the patients' ability to place the probe in the correct position and capture a reliable image with a good ultrasonographic anatomy view of the lung, the expert rated each patient's video according to the three categories described above. The quality results of the clips in the first session were as follows (Table 3): 12 (21.8%) poor, 12 (21.8%) good, and 31 (56.4%) excellent, with an average score of 1.35 (on a scale of 0 to 2). In the second session, 18 (38.3%) were poor, 16 (34%) were good, and 13 (27.7%) were excellent; the average score was 0.89. In the third session, 15 (42.9%) videos were poor, 8 (22.9%) were good, and 12 (34.3%) were excellent. The average score in the third session was 0.91. In all three sessions, more than 50% of the patients' clips were rated good or excellent. This was achieved without training beyond the first session.
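The percentages and average scores above follow directly from the clip counts per category. The short Python sketch below reproduces that arithmetic from the counts reported in the text; the function name `summarize_session` and the dictionary layout are illustrative assumptions, not part of the study's analysis code.

```python
def summarize_session(poor, good, excellent):
    """Summarize quality ratings scored 0 (poor), 1 (good), 2 (excellent)."""
    total = poor + good + excellent
    return {
        "total_clips": total,
        "pct_poor": 100 * poor / total,
        "pct_good": 100 * good / total,
        "pct_excellent": 100 * excellent / total,
        # Share of clips with an adequate window (good or excellent).
        "pct_good_or_excellent": 100 * (good + excellent) / total,
        # Mean of the 0-2 scores, weighted by the number of clips.
        "mean_score": (0 * poor + 1 * good + 2 * excellent) / total,
    }


# Clip counts per session as reported: (poor, good, excellent).
session_counts = {1: (12, 12, 31), 2: (18, 16, 13), 3: (15, 8, 12)}
summary = {s: summarize_session(*c) for s, c in session_counts.items()}

for session, stats in summary.items():
    print(
        f"Session {session}: n = {stats['total_clips']}, "
        f"mean score = {stats['mean_score']:.2f}, "
        f"poor = {stats['pct_poor']:.1f}%, "
        f"excellent = {stats['pct_excellent']:.1f}%"
    )
```

For the first session, for example, the mean is (12 × 0 + 12 × 1 + 31 × 2) / 55 = 74 / 55, which rounds to the reported 1.35.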

Schedules
The mean time interval between enrollment and the second session was 16.44 days (SD 12.9). This is very similar to the interval between the second session and the third session, which was 16.4 days (SD 10.3). However, the maximum interval between enrollment and the second session was 40 days with a minimum of 2 days compared to a maximum interval of 32 days and a minimum of 3 days between the second and third sessions (Table 4).

Discussion
The primary objective of this study was to determine whether self-performed LUS for B-lines could produce reliable results useful for clinical evaluation. By comparing BLS counted by the patients to BLS counted by an expert on researcher clips (gold standard), our results indicate that with the assistance of a live automatic BLS counting tool, patients are able to report the number of B-lines with moderate accuracy (Kw = 0.49, Table 2). Regarding the quality of the scans, most patients achieved a sufficient lung anatomical ultrasonographic window, defined as a good or excellent anatomical capture, even weeks after their primary and only learning session. In the first session, the success rate for at least good self-lung US imaging was 78% for patients versus 95% for researchers; in the second session, 62% versus 94%; and in the third session, 58% versus 99%, respectively. This reduction in patients' competency is understandable as no further training was conducted throughout the trial. We believe that extra training should be performed in future studies for the consolidation of self-lung ultrasound scan competency. A plausible solution is to integrate an instructional video to guide the patient in performing a self-lung US scan in real time, as shown to be successful in a study that used a portable self-US device in pregnancy [14]. Additionally, a short reminder practice at the beginning of the teaching process during the first few outpatient clinic follow-ups could improve patients' scanning ability until the patient performs sufficient scans in consecutive visits.
Our study strengthens the new concept that scans for B-lines can be achieved by patients, and we believe they can be conducted even at home in the future [45]. Results indicate a fair but non-significant agreement of 0.3 between the expert count on patient-acquired clips and the expert count on researcher-acquired clips (gold standard, Table 2). As described in Section 2, the calculation of BLS by the AI is made in real time, independent of the clip chosen by the patients to save. This explains the discrepancy between the moderate agreement (Kw = 0.49) of the gold standard with the patients' auto-B-line count and the only fair agreement (Kw = 0.3) of the gold standard with the expert count on the patient's scan. Although patients performed the test with reasonable quality, they needed to be more adept at identifying and saving the clip when the image most closely reflected the correct number of B-lines. Therefore, there was inconsistency between the quality of the LUS clips acquired by the patients (which was graded as at least moderate) and the number of B-lines counted from the patients' images compared to the number counted from a subsequent LUS clip acquired by the researcher. The automatic B-line tool can offer a solution by allowing more time for the LUS recording and by recognizing and counting B-lines more accurately in real time. Indeed, when the automatic tool was activated on patients' scans, we found a moderate agreement, 0.49, between the AI BLS count on patients' scans and the expert count on the researcher's clips (gold standard). These results imply that the AI count is pivotal for implementing patient self-testing of LUS for the detection of pulmonary congestion. Future studies should give more attention to the recording technique itself. This can be achieved by planning clips with longer duration or capturing additional clips in the same area.
Furthermore, relying on the AI B-line count can serve as a checkpoint for appropriate recording timing or as an alternative for a post-hoc interpretation of recorded clips by the expert.
Our results demonstrate a slight and insignificant agreement, 0.08, between the patient and expert B-line count on patients' scans. This suggests suboptimal counting ability by patients. At the same time, there is much better agreement between AI and the expert count on clips recorded by the patient and researcher, respectively, supporting the reliability of an AI-based B-line count. A recent study from our group validated the accuracy of the automatic B-line count tool used in this study [6]. This tool provides a more straightforward process for patients and clinicians. Patients can focus exclusively on obtaining an appropriate position for scanning without the effort of saving an image at the correct time or focusing on counting B-lines. AI can count the most identified B-lines tracked in a clip in real time and send the results automatically to a health care professional.
Incorporating practical usage of cellular-based, cheap US devices has several beneficial implications with the potential to guide earlier treatment, prevent exacerbations, and reduce emergency room visits and hospital admissions [12]. Our study aligns with the results of previous experimental studies that have investigated self-conducted US scans. For example, in two recent surveys, CHF patients and healthy participants were trained to perform a lung US self-exam successfully, with adequate interpretability of scans and high patient-reported self-efficacy and competence [15,47]. Additionally, a study using a portable self-US device in pregnancy concluded that it is an efficient, practical method for remote sonographic fetal assessment [12]. This progress has significant implications for enhancing patient care, especially considering the challenge of treating patients at home for evolving lung illness post-admission throughout the COVID-19 pandemic. LUS played a substantial role in the assessment of such patients in the community [53,54], in the acute setting [55], and in the post-illness follow-up [56].
Several studies have shown the benefits of involving patients in their screening and follow-up process or transferring some responsibility for self-follow-up to patients [57]. For example, a review that combined conclusions from several studies on patients with diabetes found that those more actively involved in managing their condition had better blood sugar control and a lower risk of complications [58]. Another review of pediatric asthma found that those more involved in their care had better symptom control and a lower risk of hospitalization [59]. Additionally, involving patients in their care can lead to greater patient satisfaction and empowerment, as patients feel more in control of their health. This is especially important for patients with chronic conditions, who may be dealing with their disease over an extended period.
It should be emphasized that no correlation was observed between participants' years of education or profession and the degree of success in performing LUS or counting the BLS. Therefore, learning the scan appears suitable for a wide range of patients, and further inclusion or exclusion criteria are not required. Training in how to operate such modalities can extend to family members or hired caregivers.
There is potential for using cheap, portable devices connected to cell phones with automatic tools to improve the self-management of patients with chronic heart failure (CHF) and patients on dialysis. A meta-analysis in which remote monitoring of heart failure patients was assessed showed that it enabled faster intervention and was associated with lower mortality [60]. By conducting continuous, early home monitoring, patients might be able to detect life-threatening conditions such as PC earlier, potentially reducing the number of hospitalizations and improving outcomes. These devices could be handy for patients who live in rural or underserved areas where access to health care may be limited [61].
This research has several limitations. First, recruitment was limited because this was a single-center, non-sponsored, prospective study. Accordingly, it was difficult to recruit a large number of patients, and our sample size is therefore small. As a result, in some statistical analyses, we encountered difficulty obtaining significant results and identifying factors that influence success or failure in learning to identify B-lines. Therefore, the statistical power is restricted, as is the external validity, and the hypothesis requires further investigation in larger future studies.
The study sample only reflects part of the dialysis patient population. Because most of the patients in the region are treated in community dialysis institutes, many of the study participants treated in the medical center's dialysis unit are relatively older and more medically complex. On the other hand, treatment in the hospital's nephrology center is probably more extensive, so patients are better balanced and less likely to exhibit B-lines on scans. The generalizability of our findings is also limited because only patients using a messaging app on their smartphones were recruited.
In some of the patients' recordings, a still picture was mistakenly saved instead of a video clip. This technical error limited our ability to assess image quality, count the number of B-lines, and perform the sub-analysis, thus reducing our sample size and impacting the significance of our findings. We believe that a designated application on a private cellular phone that automatically starts the scan when the US cradle is placed over the chest, with a designated automatic tool for the real-time B-line count, can overcome this limitation of correct clip capture.
Finally, there were long pauses in the recruitment phase of the study due to the COVID-19 pandemic, which might have influenced patients' memory and performance, although the median time was 15.5 days between the first and second encounters and 13 days between the second and third encounters.
Our study also has strengths. Only one other published study has investigated patients' ability to perform a self-LUS scan [47]. Our study is the first to measure patient performance in a self-LUS scan beyond one encounter and over a substantial period of time. Finally, it is the first study to combine and validate an AI-based automatic B-line count tool with patient self-lung scans for B-lines and to demonstrate the advantage of such a tool.
Involving patients in self-assessment using an imaging tool operated at home is a novel concept that enables patients to take on a more active and engaging role in their health care. Our findings are the substrate for future pilot studies where patients can receive cellular-based US machines delivered to their homes to test self-US lung scans away from the research team. This may progress POCUS from being solely performed by physicians in clinical and research settings to patients using US at home, participating in their clinical assessment and care [14].

Conclusions
Dialysis patients can self-scan their lungs with good anatomical precision, and when an AI-based counting application is used, the resulting B-line counts are comparable to an expert's count on clips taken by researchers. We believe this study sheds light on the potential for using home US devices to detect PC, allowing patients to take a more active role in their medical care.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in this study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Ultrasound examination creates an image of the body's organs using sound waves. The test is safe, does not cause damage, and therefore can be performed without limitation.
The test is performed using a transmitter-receiver called a transducer. The ultrasound device processes the sound waves received by the transducer into a live image displayed on the screen in real time. Since the presence of air interferes with the passage of sound waves, a gel is applied to the transducer, thus creating "direct contact" between the transducer and the skin.
The test: During the test, you will be asked to take a short video of the lungs. You will hold the transducer in one hand; after a little gel has been applied to it, you will be asked to place it on one of the points marked in the figure. You can move the transducer slightly up or down until you obtain an image where you can see the space between the ribs and the lungs behind them.
Upon obtaining the image, you will be asked to count the number of B-lines in the image; an explanation is below.
After that, we will perform the test on the other side of the chest as well. In addition, we will repeat the test once more after the dialysis treatment.
B-lines: White or light gray lines starting at the top of the screen and reaching the bottom. They look similar to a curtain fluttering in the wind and can move with the movement of the device or during breathing. The B-lines can indicate the amount of fluid in the lungs.
Figure A1. Patients' scan locations, as illustrated in the explanatory brochure they received.
Examples of possible images: Figure A2. Examples of possible B-line images and appropriate interpretations, as illustrated in the patients' explanatory brochure.