Visual Cascaded-Progressive Convolutional Neural Network (C-PCNN) for Diagnosis of Meniscus Injury

Objective: The objective of this study is to develop a novel automatic convolutional neural network (CNN) that aids in the diagnosis of meniscus injury, while enabling the visualization of lesion characteristics. This will improve the accuracy and reduce diagnosis times. Methods: We presented a cascaded-progressive convolutional neural network (C-PCNN) method for diagnosing meniscus injuries using magnetic resonance imaging (MRI). A total of 1396 images collected in the hospital were used for training and testing. The method used for training and testing was 5-fold cross validation. Using intraoperative arthroscopic diagnosis and MRI diagnosis as criteria, the C-PCNN was evaluated based on accuracy, sensitivity, specificity, receiver operating characteristic (ROC), and evaluation performance. At the same time, the diagnostic accuracy of doctors with the assistance of cascade- progressive convolutional neural networks was evaluated. The diagnostic accuracy of a C-PCNN assistant with an attending doctor and chief doctor was compared to evaluate the clinical significance. Results: C-PCNN showed 85.6% accuracy in diagnosing and identifying anterior horn injury, and 92% accuracy in diagnosing and identifying posterior horn injury. The average accuracy of C-PCNN was 89.8%, AUC = 0.86. The diagnosis accuracy of the attending physician with the aid of the C-PCNN was comparable to that of the chief physician. Conclusion: The C-PCNN-based MRI technique for diagnosing knee meniscus injuries has significant practical value in clinical practice. With a high rate of accuracy, clinical auxiliary physicians can increase the speed and accuracy of diagnosis and decrease the number of incorrect diagnoses.


Introduction
Meniscal injuries are a common result of everyday exercise, and the majority of meniscal surgeries are still partial meniscectomy [1,2]. The diagnostic process prior to performing these procedures, however, is difficult, particularly the interpretation of MRI images of the meniscus [3][4][5][6]. Some injuries may be visible on MRI scans but difficult to see during arthroscopy. Surgery has been shown in several studies to decrease the likelihood of future knee osteoarthritis (KOA), enhance knee mobility, and enhance patients' quality of life and long-term satisfaction [7][8][9][10][11]. Therefore, the early and accurate diagnosis of meniscus injury, along with timely treatment, is crucial.
Knee magnetic resonance imaging (MRI) is a valuable imaging diagnostic tool for knee disease examination. MRI has been repeatedly shown to detect pathology in the meniscus and cruciate ligament. This diagnosis method has a high level of accuracy and is frequently used to determine which patients require surgery. The meniscus have a low moisture and fat content, and MRI measures low signal intensity on T1, T2 sequences in the knee joint. In the coronal view, sequential imaging of the medial and lateral meniscus reveals a hypointense, triangular structure with a sharp tip at the apex. The anterior and

The Experiment Design
In this study, we constructed a cascade-progressive convolutional neural network (C-PCNN). The research was based on knee arthroscopy reports and MRI reports. The objective was to compute the diagnostic performance of the physician with C-PCNN diagnostic judgment and the results of C-PCNN. Because T2 MRI of the knee provides more information to detect meniscus injury, sagittal T2 knee images were employed in this investigation. The knee T2 MRI pictures contained fat suppression images [12]. Concurrently, five sports medicine professionals used C-PCNN to retrospectively evaluate the knee MRIs of clinical patients and they used C-PCNN to assess meniscus injury. We then compared the performance of physicians using convolutional neural networks ( Figure 1). All meniscus-tear patients underwent arthroscopic knee surgery.
The primary outputs of C-PCNN comprise the diagnosis of meniscus injury, the lesion area, and the visualization of the lesion area. Although there are numerous classifications of the tearing direction of meniscus injury, horizontal meniscal tears are the most common. The availability of other directions of meniscal tears for training was minimal; therefore, the experiment did not define the classification of meniscus injuries. In addition, the direction of meniscus tears can be determined based on the specific visualization data.

Data Collection
Patients who underwent surgery in our hospital and healthy patients who underwent MRI screening at our hospital were included. As there was no standard format for all MRI patient data, we selected the knee MRIs of 2000 patients from 2015 to 2021, excluding pixels that were too blurry to detect and patients with knee ligament injuries or other disorders. A total of 1396 knee MRI pictures were acquired, 716 of which were normal ( Figure 2). For the remaining 680 pieces depicting meniscus injury, the age range was between 17 and 62 years Diagnostics 2023, 13,2049 3 of 14 (17-62 years). For all patients with meniscus injury in our hospital, the chief physician performed arthroscopic surgery, and the intraoperative diagnosis was the same as the MRI diagnosis of the radiologists. If a problem of knee joint meniscus and clean-up resection occurred, meniscus stitches were used. Figure 3 shows the intraoperative arthroscopic images of meniscus injury ( Figure 3).

Data Collection
Patients who underwent surgery in our hospital and healthy patients who underwent MRI screening at our hospital were included. As there was no standard format for all MRI patient data, we selected the knee MRIs of 2000 patients from 2015 to 2021, excluding pixels that were too blurry to detect and patients with knee ligament injuries or other disorders. A total of 1396 knee MRI pictures were acquired, 716 of which were normal ( Figure 2). For the remaining 680 pieces depicting meniscus injury, the age range was between 17 and 62 years (17-62 years). For all patients with meniscus injury in our hospital, the chief physician performed arthroscopic surgery, and the intraoperative diagnosis was the same as the MRI diagnosis of the radiologists. If a problem of knee joint meniscus and clean-up resection occurred, meniscus stitches were used. Figure 3 shows the intraoperative arthroscopic images of meniscus injury ( Figure 3).  Knee MRI data collection process. A total of 2000 knee MRI images were collected. Some images were blurred and indistinct, or multiple knee injuries were excluded. A total of 716 normal images and 680 images of meniscus damage were collected. Figure 3. The intraoperative arthroscopic images of meniscus injury. Intraoperative diagnosis was made according to the arthroscopy of knee joint. The arrows point to the meniscus injury.

Meniscus Anatomy and Image Preprocessing
Performing MRI of the knee joint is an efficient approach for diagnosing the meniscus. The knee menisci are two crescent-shaped discs of fibro cartilage that are located between the surfaces of the femur and tibia in the medial and lateral compartments of the joint. At the cross section, the normal meniscus is wedge-shaped, with a flat surface facing the tibia and a concave surface facing the femur [4,15]. The meniscus' fibrocartilage contains little water and fat. On a coronal picture, the anterior and posterior corners of the meniscus exhibit a low-intensity curve structure that is related to the meniscus root bone and the joint capsule perimeter. The anterior and posterior corners of the meniscus are triangular in a sagittal view. At the level closer to the middle of the knee joint, the anterior and posterior corners are shaped like a bow tie. When a meniscus injury occurs, MRI of the knee will reveal a high-signal, white and thin shadow in the meniscus region, proving that meniscus damage has occurred. To identify the tear of the meniscus, the location of the meniscus injury is first detected, followed by the extraction of the picture of the meniscus injury for feature extraction. Visual output was conducted. The knee MRI data we collected included images in different resolutions. There are background areas around the knee that do not contribute to the diagnosis. Therefore, the MRI images were clipped to be 640 × 320, 1280 × 640, and 2560 × 1280 ppi.

Meniscus Anatomy and Image Preprocessing
Performing MRI of the knee joint is an efficient approach for diagnosing the meniscus. The knee menisci are two crescent-shaped discs of fibro cartilage that are located between the surfaces of the femur and tibia in the medial and lateral compartments of the joint. At the cross section, the normal meniscus is wedge-shaped, with a flat surface facing the tibia and a concave surface facing the femur [4,15]. The meniscus' fibrocartilage contains little water and fat. On a coronal picture, the anterior and posterior corners of the meniscus exhibit a low-intensity curve structure that is related to the meniscus root bone and the joint capsule perimeter. The anterior and posterior corners of the meniscus are triangular in a sagittal view. At the level closer to the middle of the knee joint, the anterior and posterior corners are shaped like a bow tie. When a meniscus injury occurs, MRI of the knee will reveal a high-signal, white and thin shadow in the meniscus region, proving that meniscus damage has occurred. To identify the tear of the meniscus, the location of the meniscus injury is first detected, followed by the extraction of the picture of the meniscus injury for feature extraction. Visual output was conducted. The knee MRI data we collected included images in different resolutions. There are background areas around the knee that do not contribute to the diagnosis. Therefore, the MRI images were clipped to be 640 × 320, 1280 × 640, and 2560 × 1280 ppi.

C-PCNN Architecture
To diagnose lesions with high accuracy, we developed a cascade-progressive convolutional neural network (C-PCNN) based on the pyramid network [22,23]. The cascadedprogressive convolutional neural network consists of three components: the primary network, the secondary network, and the tertiary network, which correspond to three resolutions (640 × 320, 1280 × 640, and 2560 × 1280, respectively). Using Grad-CAM, the primary network developed the meniscus damage localization map. The lesion attention module (LAM) combined the features of the low-resolution image, the meniscus injury mapping generated by the primary network, and the high-resolution characteristics. The cascade-progressive output branches of the secondary and tertiary networks were used to obtain the final diagnosis results. The output branch of the high-resolution network is used to obtain the final diagnosis result. The output branches of the primary network were used to obtain the meniscus damage localization map and the lesion's location under the supervision of image annotation. The final diagnosis results were obtained by cascade-progressive output branches of the secondary and the tertiary networks. In the tertiary networks, the focal point is on the lesion features of the meniscus injury site, and the categorization and diagnosis of meniscus injury is eventually achieved. Consequently, the meniscus injury localization map generated by the primary network is used to weight the features in the tertiary network, and the specially designed attention module combines the low-resolution features generated by the primary network, the high-resolution features generated by the secondary and tertiary networks, and the meniscus injury localization map ( Figure 4).

Meniscus Localization
To find the meniscus, a meniscus injury localization map was generated in the pri mary network using weakly supervised localization. We selected the improved Grad CAM for weakly supervised localization, which was formed by the sequential inclusion of a convolutional layer, global average pooling layer, and Softmax prediction layer at the end of the network. Although it can accentuate the most distinguishing features of an C-PCNN retrieves features from images with varying resolutions. In this study, the three network layers are referred to as the primary network, the secondary network, and the tertiary network. Each of the three networks is constructed using ResNet50 modules. ResNet50 is a ResNet family network with a number of network layers and model sizes ideal for medical imaging activities. Table 1 depicts the modules used in the planned network. Specifically, the primary net is configured as {conv_1, block_1, block_2, block_3, block_4, fc}, the secondary net is configured as {conv_1, block_1, block_2}, and the tertiary network is configured as {conv_1, conv_2, block_1, block_2, block_3, block_4, fc}. C-PCNN performs features cascading and progressive propagation from the primary network to the advanced network. Different resolutions of images are directly added into the network. To accommodate the different-sized feature maps, additional independent convolutional layers or pooling layers must be added. After block_1, the primary and secondary networks' characteristics are combined. After block_2, the secondary network and tertiary network's features are merged. The advanced network receives the conv_2 module. The size of the lesion feature map was halved to ensure that fusion features had the same size.

Meniscus Localization
To find the meniscus, a meniscus injury localization map was generated in the primary network using weakly supervised localization. We selected the improved Grad-CAM for weakly supervised localization, which was formed by the sequential inclusion of a convolutional layer, global average pooling layer, and Softmax prediction layer at the end of the network. Although it can accentuate the most distinguishing features of an object, it is not the only method for achieving this effect. However, the CAM approach modifies the structure of CNN, whereas the upgraded Grad-CAM can generate visual data from any CNN-based network structure without modifying the network architecture. The meniscus damage localization map in the principal network was generated with the Grad-CAM approach. Thus, the location of each meniscus was determined. In order to give intuitive categorization results, the suggested method can not only determine whether the meniscus is damaged or not, but also determine the location of the injury.
Firstly, the primary network inputs the 640 × 320 image I 320 , and the l-th layer feature map f l i can be obtained by: f l i ] = Prinet(I 320 ) We focus on the lesion area, where l represents the third convolutional layer in block_3, and i is the i-th channel of the feature map. Then, the gradient of the feature map is calculated, and by going through the global average pooling (GAP) layer, the weight of feature map is obtained: The final step is a linear mapping of the weights and the feature graph, which is obtained using the ReLU activation function to obtain the heat map.

Weighting and Visualization of Features
This section explains how to visualize and concentrate on the characteristics of meniscus injury. The central concept of this diagnostic method is the cascade of primary, secondary, and tertiary networks based on the location of the meniscus injury. Grad-CAM obtained the lesion activation map, and the next step was to enhance the lesion feature set and lesion features. We created a LAM with two primary functions. The feature attention map between high-level and low-level network features is constructed initially. The weighting of advanced features based on meniscal injury mapping is a second function. Consequently, the LAM input consists of three components: the meniscal injury localization map, low-level network features, and high-level network features. LAM produces the weighted advanced network feature image, the meniscus injury location map generated by Grad-CAM for the low-level network, and the high-resolution network feature map. After linear transformation, new features are obtained, which are then combined with features from different levels. A 1 × 1 convolutional layer and a softmax function were used to compute the fused features for the feature weighted attention map. For the meniscus injury localization map weighting, the lesion area should be given equal weight. The meniscus injury localization map had pixels measuring 640 × 320. It was then multiplied to produce the positioning weighted attention map. Multiplying the linearly transformed feature of the feature map generated by the advanced network with the weighted attention map of the location map yields the weighted high-resolution feature. Figure 5 shows an example of a meniscus injury localization heat map in the data set. The original meniscus injury images, lesion localization images and visual outputs are shown from top to bottom. (Figure 5).

5-Fold Cross Validation Method
We divided the images of all 1396 patients into five parts labeled as K1, K2, K3, K4 and K5. Firstly, K2, K3, K4, K5 are used as the training set to train, and the mode logist_model_Flod1 is obtained. K1 is used as the test set to test, and the accuracy logis_precise_Flod1 is obtained. The model logist_model_Flod2 is obtained by training K1, K3, K4, K5 as the training set, and the accuracy logis_precise_Flod2 is obtained by testing K2 as the test set. The model logist_model_Flod3 is obtained by training with K1 K2, K4, K5 as the training set, and the accuracy logis_precise_Flod3 is obtained by testing with K3 as the test set. In this way, the five parts are used as test sets for verification, and the average value of the five evaluation results is taken (Figure 6).
The advantage of this method is that randomly produced subsamples are frequently

5-Fold Cross Validation Method
We divided the images of all 1396 patients into five parts labeled as K1, K2, K3, K4, and K5. Firstly, K2, K3, K4, K5 are used as the training set to train, and the model logist_model_Flod1 is obtained. K1 is used as the test set to test, and the accuracy lo-gis_precise_Flod1 is obtained. The model logist_model_Flod2 is obtained by training K1, K3, K4, K5 as the training set, and the accuracy logis_precise_Flod2 is obtained by testing K2 as the test set. The model logist_model_Flod3 is obtained by training with K1, K2, K4, K5 as the training set, and the accuracy logis_precise_Flod3 is obtained by testing with K3 as the test set. In this way, the five parts are used as test sets for verification, and the average value of the five evaluation results is taken ( Figure 6).
testing K2 as the test set. The model logist_model_Flod3 is obtained by training with K2, K4, K5 as the training set, and the accuracy logis_precise_Flod3 is obtained by test with K3 as the test set. In this way, the five parts are used as test sets for verification, the average value of the five evaluation results is taken ( Figure 6).
The advantage of this method is that randomly produced subsamples are frequen utilized for training and verification simultaneously, and the results are only confirm once each time, which improves the accuracy. Cross-validation may utilize limited d effectively, and assessment results can be as close as feasible to the model's performa on the test set, which can be used for model optimization. The advantage of this method is that randomly produced subsamples are frequently utilized for training and verification simultaneously, and the results are only confirmed once each time, which improves the accuracy. Cross-validation may utilize limited data effectively, and assessment results can be as close as feasible to the model's performance on the test set, which can be used for model optimization.

Comparison of Different Networks
We conducted a comparison of the performance of single networks, C-PCNN, and different underlying networks. The single network selects the primary network, the secondary network, and the tertiary network in the C-PCNN. Additional convolution neural networks include EfficientnetB0, EfficientnetB1, MobileNet, ResNet34, ResNet50 and VGG, as shown in Table 2 and Figures 7 and 8. ATM-L, M, and H, respectively, represent the primary network, the secondary network, and the tertiary network. By performing a comprehensive comparison of Figures 7 and 8, we observed that, excluding C-PCNN, the primary network performs best. According to the results of Figure 8, the single network with high resolution has the lowest AUC. We believe that with the improvement of image resolution, network performance will gradually decline. The network can achieve satisfactory results under low resolution conditions, but its performance will decline rapidly as the image resolution gradually increases. We analyzed this outcome, suggesting that with the increased image resolution, the image resolution of the non-diseased area also increases, which leads to the increase in the interference of the non-diseased area of the network, and results in the decreased network performance. Among the different baseline networks, ResNet50 demonstrated the best performance.

The Diagnose of Physicians
We searched for three orthopedic attending physicians with nine, eight and eleven years of experience, and two chief orthopedic physicians, although none of them were the chief physicians of knee MRI meniscus injury or had ever viewed the 1396 images. We divided the 1396 images into seven groups, each containing 200 images, and randomly

The Diagnose of Physicians
We searched for three orthopedic attending physicians with nine, eight and eleven years of experience, and two chief orthopedic physicians, although none of them were the chief physicians of knee MRI meniscus injury or had ever viewed the 1396 images. We divided the 1396 images into seven groups, each containing 200 images, and randomly assigned the images to five individuals, one group per individual. The diagnostic results of MRI diagnosis and intraoperative diagnosis were used as reference standards to cal- Figure 8. The ROC curve and AUC value for three networks. ATM-H represents the tertiary network of the high-resolution network, ATM-M represents the secondary network of the medium resolution network, and ATM-L represents the primary network of the low resolution network.

The Diagnose of Physicians
We searched for three orthopedic attending physicians with nine, eight and eleven years of experience, and two chief orthopedic physicians, although none of them were the chief physicians of knee MRI meniscus injury or had ever viewed the 1396 images. We divided the 1396 images into seven groups, each containing 200 images, and randomly assigned the images to five individuals, one group per individual. The diagnostic results of MRI diagnosis and intraoperative diagnosis were used as reference standards to calculate the accuracy rate of five doctors in diagnosing knee meniscus injury. With the aid of the C-PCNN, three attending physicians re-diagnosed 200 films previously diagnosed by the chief physician, and graded and labeled them accurately. Finally, we found that the speed and accuracy of knee MRI image recognition by three orthopedic attending physicians improved.

Statistical Methods
Prism version 9.0 was used to conduct statistical analyses. General descriptive statistics were used to compare the sensitivity, specificity, and accuracy of the C-PCNN with the gold standard of intraoperative arthroscopic diagnosis and the clinician's diagnosis. The probability of using C-PCNN to diagnose a meniscal tear was analyzed using a receiver operating characteristic (ROC) curve, and the area under the ROC curve (AUC) was calculated. Simultaneously, the accuracy of the coronal diagnosis of anterior and posterior meniscus injury horns was determined. Using C-PCNN, before and after using convolutional neural networks, a subgroup analysis was conducted to compare the performance of various physicians and to determine the clinical significance of convolutional neural networks.

Result
We used intraoperative arthroscopic diagnosis and MRI diagnosis as reference to validate the accuracy of convolutional neural networks in the diagnosis of meniscus injury, and compared the performance of a single network, a hierarchical network, and the C-PCNN (Table 2, Figures 7 and 8). All experimental outcomes are displayed in the table and figure. The network performance deteriorates as image resolution increases. In the cases of low resolution, the network achieves satisfactory results, indicating that although the diseased area is more distinct in the high-resolution image, the interference of the non-diseased area is also more severe, resulting in poor results for the convolutional neural network. However, when the C-PCNN is applied, the diagnostic accuracy approaches 89.8%, AUC = 0.86, demonstrating that image feature extraction and fusion with several resolutions can significantly enhance diagnostic accuracy. The accuracy of the C-PCNN in recognizing the anterior root injury was 85.6%, while the accuracy of the diagnosis and recognition of the posterior root injury was 92%. Our subgroup analysis compared the accuracy of attending physicians and senior chief physicians using the C-PCNN, and it was found that attending physicians achieved an accuracy comparable to that of senior chief physicians with the aid of this neural network (Table 3). During the test, we analyzed the time required for diagnosing MRI images. The three attending physicians had an average diagnosis time of 80 min, while the two chief physicians had an average diagnosis time of 65 min. The test was conducted on a server equipped with two Intel E5 2678 CPUs and four NVIDIA GTX 1080Ti GPUs. The time taken to process a single image during the test was 1.5 s. However, with the assistance of C-PCNN (presumably a computer-aided diagnosis system), the diagnosis time was significantly reduced. The average diagnosis time for the three attending physicians decreased to 60 min, and for the chief physicians, it decreased to 55 min.

Discussion
In this study, we created a CNN model to identify the presence and type of meniscal tears using MRI images as input data. Our developed C-PCNN performed well in determining whether meniscal tears existed or not. Although our algorithm was trained on 1396 images, larger data sets help the algorithm to perform better. A substantial amount of annotated image data, which continually improve during training and testing, is needed for real clinical application. One approach may be to mine disease course and image reports through big data because a significant amount of data annotation and accurate diagnosis is one of the limitations of obtaining large data sets. By establishing and continually growing the database, an algorithm that is more accurate and efficient can be constructed.
Of the 1396 images, 716 displayed normal knee anatomy, while 680 depicted meniscus injury. Overall, 298 of the 680 MRI knee meniscus injury images depicted an anterior horn injury. C-PCNN had an 85.6% accuracy rate in identifying anterior horn injury. In 382 instances of posterior horn injury, the accuracy of C-PCNN diagnosis and recognition was 92%. We believe that the difference in MRI sensitivity between anterior and posterior angles meniscus tears may be the primary reason for the difference in the accuracy of the anterior and posterior angles of the sagittal meniscus. Because the angle after injury is greater on the inside, the sample selection supports this. After injury of the meniscus, anterior horn injury and a higher incidence of the lateral meniscus injury occur. In the sample selection, anterior corner injury of the lateral meniscus is more prevalent; the sensitivity and specificity of MRI for medial meniscus injury were 92% and 90%; for lateral meniscus tear, the sensitivity and specificity of MRI were 80% and 95% [15,24]. Although MRI has excellent specificity and sensitivity for medial meniscal tears, there is lower sensitivity in the detection of lateral meniscal tears [14]. Therefore, in the recognition process, both readers and the convolutional neural network may be incorrect, and the diagnosis error may be resultant of the low sensitivity of the MRI to the lateral meniscus.
The error is large when relying only on MRI of the local lesion to determine the torn meniscus injury direction, because the MRI images are continuous, and the level of meniscus injuries is judged in succession, while coronal and sagittal sequences are simultaneously utilized to determine the specific location of the meniscus injury and the classification of the meniscus injury. Due to the fact that the direction of the torn meniscus injury is multifaceted, there are nine types of classification of damage tear, and the classification of different incidences makes sample collection challenging; thus, the convolution neural network cannot currently guarantee accuracy of the nine types of meniscus injury classification. Therefore, we chose convolutional neural networks to assist doctors for accurate diagnosis. Our study found that the combination of AI and human experts is more accurate at diagnosing meniscal tears than either human experts or AI alone. After the convolutional neural network determines whether or not there is an injury, the doctor determines the precise classification. The localization map of meniscus injury can be used to improve the characteristics of lesions, the direction of meniscus injury can be determined by doctors, and the rate of doctor's diagnosis can be added to increase the precision of diagnosis. This parallels the argument made by Kyle N. Kunze et al. [17]. In other words, it may be possible in the future to achieve rapid initial screening using convolutional neural networks and to strengthen the characteristics of lesions through the visualization of lesion sites in order for professional sports medicine physicians to classify meniscus injuries in greater detail.
We also compared our work with that of other groups that have conducted similar studies. V. Roblot et al. [25] utilized mask-RCNN to diagnose meniscus damage with AUC Position = 0.92 and AUC Orientation = 0.83. The meniscus was first located. The photographs were then split into four groups (background, meniscus untorn, meniscus horizontal torn, meniscus vertical torn). The AUC and accuracy drop when the direction of the meniscus tear is separated into two categories. If additional tear directions of the meniscus are categorized, the accuracy may decrease further. Hyunkwang Shin et al. [26] evaluated the meniscus injury using CNN, first localizing the injury and then classifying it as a horizontal tear (normal/horizontal), compound tear (normal/complex), radial tear (normal/radial), longitudinal tear (normal/longitudinal tear), or lateral tear (normal/longitudinal tear). The AUCs for detecting tears in the medial meniscal and lateral meniscal were 0.888 and 0.817. Regarding the ability to differentiate the type of meniscal tear, the AUCs for horizontal, complicated, radial, and longitudinal rips were 0.761, 0.850, 0.601, and 0.858. The classification performance is better than the convolutional neural network model of V. Roblot et al. [25], which we believe may be related to the accuracy of MRI images and the selection of MRI images. Benjamin Fritz et al. [27] studied a small sample size of 100 instances to compare DCNN with doctors and intraoperative observations. The sensitivity for medial meniscus injuries was considerably different between readers and the DCNN (p = 0.039), although all other comparisons exhibited no significant differences (p > 0.092). Inter-reader agreement was excellent for the medial (kappa = 0.876) and adequate for the lateral (kappa = 0.741) meniscus. It is thought that there is no substantial distinction between DCNN and its readers.
Compared to the three aforementioned studies, we have an advantage in this test as we did not impose restrictions on the MRI scan specifications, allowing for inconsistent and heterogeneous data, which we believe is more representative of clinical practice. Previous studies have indicated that the accuracy of our C-PCNN could be improved if all MRI images are obtained from the same machine or institution. In such cases, our C-PCNN achieved an accuracy of 89.8% and an AUC value of 0.86 in diagnosing meniscus injuries. The accuracy in terms of diagnosis and localization did not significantly differ from the studies conducted by Benjamin Fritz et al. and Hyunkwang Shin et al.
With the inclusion of lesion features, the heat map generated by our C-PCNN facilitated an easier determination of the direction of meniscus tears. Unlike the aforementioned studies, our C-PCNN primarily serves as an assistive tool for doctors, aiming to improve their accuracy, reduce diagnosis time, and expedite the learning curve for MRI diagnosis of meniscus injuries. Consequently, our C-PCNN is better suited for clinical use.
In contrast to radiology, the advantage of our C-PCNN lies in its utilization of T2 lipostatic images of the knee instead of requiring full-layer sequences of knee MRI. Additionally, the diagnosis time of our C-PCNN is shorter. The C-PCNN reduces the reliance on manual rules and feature engineering. This makes our method more flexible and adaptable to different types and degrees of injuries. In contrast, traditional radiological methods often rely on human expertise and professional knowledge for interpretation and diagnosis, which can be influenced by subjective factors. While MRI exhibits high sensitivity for tears, it may not be as effective in detecting fragments and is better at diagnosing medial meniscal lesions compared to lateral ones. However, by combining arthroscopic diagnosis with an improved and trained C-PCNN model, it is possible to identify more detailed information and reduce the probability of false-negative diagnoses in meniscus injuries.
The recognition of knee meniscus injuries by convolutional neural networks (CNNs) and the application of MRI images in clinical practice require the realization of multi-source and multi-layer image fusion. The diagnostic capabilities of CNNs also need to evolve from single image diagnosis to provide a comprehensive evaluation of continuous MRI outputs. This will enable artificial intelligence to simulate the diagnostic process of doctors, while observing and assessing meniscus injuries at a three-dimensional level.
Moreover, CNNs can be utilized to integrate clinical signs, symptoms, surgical methods, and complications. This integration will contribute to more accurate classifications. Additionally, the development of clinical prediction models can aid in determining the necessity of surgery and predicting the postoperative prognosis for patients. This direction of research represents our future focus, so that CNN can be better applied in the clinic, not only in disease judgment, but also in prediction, selection of surgery, prediction of postoperative complications and other directions.

Conclusions
This paper demonstrates the effectiveness of fusing a high-level network and lowlevel network to improve diagnosis performance. CNN is capable of learning multi-level characteristics from knee MRI images by detecting low-level network lesions and extracting complex high-level characteristics from high-level networks.
The detection of a torn meniscus can be performed automatically. The precision of our C-PCNN is comparable to that of musculoskeletal specialists. In addition, with the aid of cascades and progressive convolutional neural networks, the diagnostic rate and precision of physicians will be enhanced.