Nomogram prediction model for renal anaemia in IgA nephropathy patients

Abstract In this study, we focused on the influencing factors of renal anaemia in patients with IgA nephropathy and constructed a nomogram model. We divided 462 patients with IgA nephropathy diagnosed by renal biopsy into anaemic and non-anaemic groups. Then, the influencing factors of renal anaemia in patients with IgA nephropathy were analysed by least absolute shrinkage and selection operator (LASSO) regression and multivariable logistic regression, and a nomogram model for predicting renal anaemia was established. Eventually, nine variables were obtained, which are easy to apply clinically. The areas under the receiver operating characteristic (ROC) curve and precision-recall (PR) curve reached 0.835 and 0.676, respectively, and the C-index reached 0.848. The calibration plot showed that the model had good discrimination, accuracy, and diagnostic efficacy. In addition, the C-index of the model following internal validation reached 0.823. Decision curve analysis suggested that the model had a certain degree of clinical significance. This new nomogram model of renal anaemia combines the basic information, laboratory findings, and renal biopsy results of patients with IgA nephropathy, providing important guidance for predicting and clinically intervening in renal anaemia.


Introduction
IgA nephropathy is currently the most common glomerular disease worldwide, with a high incidence in the Asia-Pacific region [1], and is the most common cause of chronic kidney disease (CKD) in China [2]. Renal anaemia is a common complication in patients with CKD, in whom it can occur at the early stage (stages 2 and 3 according to the KDIGO guidelines). The level of haemoglobin decreases when the estimated glomerular filtration rate (eGFR) is approximately 70 mL/min/1.73 m² in males and 50 mL/min/1.73 m² in females; however, most commonly, renal anaemia appears in stage 4 CKD and worsens as the disease progresses. Among patients with more advanced disease and those on dialysis, approximately 90% have anaemia [3,4].
The pathogenesis of renal anaemia is complex; the primary mechanism is the reduced production of erythropoietin (EPO) by the paratubular apparatus. In addition, excessive expression of hepcidin, a persistent microinflammatory state, and uraemic toxins contribute to anaemia [5-8]. Anaemia due to CKD not only affects the quality of life of patients but also increases the incidence of and mortality from cardiovascular complications, which may lead to further deterioration of renal function and form a vicious cycle called "cardiac anaemia syndrome" [9]. In addition, renal anaemia has been identified as an independent risk factor for heart failure in CKD patients [10].
Therefore, the early diagnosis of and intervention for renal anaemia are necessary, but the individualized prediction of IgA nephropathy complicated by renal anaemia has rarely been reported and remains an urgent problem. The aim of this study was to establish the first nomogram model for the personalized prediction of the risk of IgA nephropathy complicated by renal anaemia, to guide clinical screening of high-risk groups and support more targeted clinical decisions.

Study sample collection and management
A total of 658 patients with IgA nephropathy diagnosed by renal biopsy were enrolled from January 2015 to January 2016. The method of inclusion and exclusion was based on a previous study [11]. Finally, 462 patients were included, comprising 132 patients with anaemia and 330 patients without anaemia. All patients who underwent renal biopsy signed a study protocol informed consent form for the renal clinical database at the time of admission, agreeing that their data would be used for the clinical study. The study protocol was approved by the hospital ethics committee (EC No.: S2019-309-01).

Definition of clinical measures
According to WHO recommendations, anaemia can be diagnosed in males aged ≥15 years with haemoglobin <130 g/L or in adult non-pregnant females with haemoglobin <120 g/L in regions at sea level [12].
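The grouping rule above can be written as a small function. This is a minimal sketch of the stated thresholds only; the function name and argument names are hypothetical, and the text gives no threshold for pregnant women, so that case is rejected rather than guessed.

```python
def is_anaemic(haemoglobin_g_per_l, sex, pregnant=False):
    """Apply the WHO sea-level thresholds quoted in the text.

    sex is "male" or "female" (hypothetical encoding). Males aged >= 15 years
    are anaemic below 130 g/L, adult non-pregnant females below 120 g/L.
    """
    if sex == "male":
        return haemoglobin_g_per_l < 130
    if sex == "female":
        if pregnant:
            # The text states no threshold for pregnancy, so none is assumed.
            raise ValueError("no threshold for pregnant women is given in the text")
        return haemoglobin_g_per_l < 120
    raise ValueError("sex must be 'male' or 'female'")


# A haemoglobin of 125 g/L is below the male threshold but above the female one.
print(is_anaemic(125, "male"), is_anaemic(125, "female"))  # True False
```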

Statistical analysis
The study design and statistical analysis were carried out in accordance with the TRIPOD statement for prediction models [16]. All data analysis was performed using R software (version 3.6.3; https://www.R-project.org). All clinically collected data (basic patient information, laboratory tests, pathological stage for renal biopsy) were categorical data and are expressed as frequency (%). Least absolute shrinkage and selection operator (LASSO) regression was performed with the "glmnet" package in R software, the "rms" package was used to draw the nomogram and calibration curve, the "pROC" package was used to plot the receiver operating characteristic (ROC) curve, the coords function was used to return the values of the variables used in the calculation of the ROC curve, the "modEvA" package was used for plotting the precision-recall (PR) curve, and the "rmda" package was used to draw the decision curve analysis (DCA) curve. LASSO regression is a shrinkage estimation method. Its greatest advantage is that, by penalizing all variable coefficients, the coefficients of the least informative independent variables become exactly zero, so that these variables are excluded from the model; this improves modelling stability and alleviates the problem of highly correlated variables in the traditional model [17]. Therefore, LASSO regression was used to screen for possible factors of renal anaemia in IgA nephropathy patients.
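The way an L1 penalty drives coefficients to exactly zero can be illustrated with the soft-thresholding operator, which is the per-coefficient update used by coordinate-descent LASSO solvers. The sketch below is illustrative only and is not the study's analysis: the function name, the penalty value, and the input values are assumptions.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coefficient estimates for four standardized predictors.
unpenalized = [1.2, 0.3, -0.8, -0.1]
penalty = 0.5  # assumed value of the lambda tuning parameter

shrunk = [soft_threshold(z, penalty) for z in unpenalized]
selected = [i for i, b in enumerate(shrunk) if b != 0.0]
print(shrunk)    # weak predictors are set to exactly zero
print(selected)  # indices of the predictors that remain in the model
```

A larger penalty removes more predictors, which is why the penalty is usually chosen by cross-validation rather than fixed in advance.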
Multivariable logistic regression analysis was performed by entering the variables screened by the LASSO regression, and the β regression coefficient, 95% confidence interval, odds ratio (OR), and P-value were calculated (all tests were two-sided). A nomogram prediction model was developed based on the results of logistic regression analysis [18], and all potential predictors were used in the development of the model [19]. A calibration curve was drawn to assess the calibration of the renal anaemia prediction model, where a fit closer to the ideal line indicated a better prediction. The predictive accuracy and diagnostic performance of the model were quantified using the areas under the ROC and PR curves (AUCs) and Harrell's concordance index (C-index) [20]. The DCA curve was used to quantify the net benefit at different threshold probabilities in the IgA nephropathy cohort and to determine the clinical role of the nomogram in avoiding false positives and false negatives [21]. The nomogram model was internally validated by bootstrap resampling with 1,000 repetitions [22], and the corrected C-index was calculated. For all statistical tests, P < 0.05 was considered statistically significant. As there were a few missing values for some variables in the data, we assumed that the data were missing at random and performed multiple imputation of the missing values by chained equations in R software [23].
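For a binary outcome, the C-index is the proportion of (anaemic, non-anaemic) patient pairs in which the anaemic patient receives the higher predicted risk, with ties counted as one half. The following is a minimal pure-Python sketch of that definition; the function name and the toy data are hypothetical and unrelated to the study cohort.

```python
def c_index_binary(predicted, outcome):
    """Concordance index for a binary outcome (1 = event, 0 = no event)."""
    concordant = 0.0
    pairs = 0
    for p_event, y_event in zip(predicted, outcome):
        if y_event != 1:
            continue
        for p_non, y_non in zip(predicted, outcome):
            if y_non != 0:
                continue
            pairs += 1
            if p_event > p_non:
                concordant += 1.0   # event ranked above non-event
            elif p_event == p_non:
                concordant += 0.5   # tie counts as half
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / pairs


# Toy example: one of the four event/non-event pairs is ranked the wrong way.
risks = [0.9, 0.3, 0.35, 0.2]
labels = [1, 1, 0, 0]
print(c_index_binary(risks, labels))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups.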

General information
A total of 462 patients with IgA nephropathy were divided into an anaemia group (132 cases) and a non-anaemia group (330 cases). Table 1 presents all data of the patients in both groups, including demographic information and clinical laboratory tests and pathological examination results.

Screening for factors associated with renal anaemia in patients with IgA nephropathy
A total of 17 potential risk factors for renal anaemia were included in this study. LASSO regression was used to reduce the dimensionality of these 17 variables and select representative risk factors for renal anaemia. The lambda value with the smallest 10-fold cross-validation error was used as the optimal value for the model, and the number of variables with a non-zero regression coefficient was counted (Figure 1a and b). The LASSO regression identified the following nine variables as risk factors for the development of renal anaemia in patients with IgA nephropathy: age, sex, diastolic blood pressure (DBP), serum albumin (ALB), cholesterol (CHOL), triglyceride (TG), CKD stage, mesangial hypercellularity (M), and tubular atrophy/interstitial fibrosis (T).

Logistic regression analysis
The results of logistic regression analysis using age, sex, DBP, ALB, CHOL, TG, CKD stage, M, and T are shown in Table 2.

Construction of the nomogram prediction model and judgement of diagnostic efficacy
A model containing the above independent predictors was developed and presented as a nomogram (Figure 2).
To apply the nomogram model, the scores of different variables are first obtained on the vertical line on the nomogram. Then, the scores of all variables are added to obtain the total score, which finally allows determination of the corresponding predicted risk value by connecting the prediction line to the total score line at the bottom of the nomogram. The calibration curve for the nomogram used to predict the development of renal anaemia in patients with IgA nephropathy showed good consistency in this cohort (Figure 3a). The C-index of the cohort was 0.848 (95% confidence interval: 0.811-0.885), while the area under the ROC curve was 0.835 (Figure 3b), the sensitivity was 0.72, the specificity was 0.79, the positive predictive value (PPV) was 0.58, the negative predictive value (NPV) was 0.88, and the area under the PR curve was 0.676 (Figure 4a), suggesting that the model has good discrimination and diagnostic efficacy. An internal validation set was evaluated by bootstrap resampling, and the calculated corrected C-index reached 0.823, suggesting that the good predictive accuracy of the model was preserved. For this nomogram, visualization of the prediction model can indicate a better predictive value.
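The reported predictive values can be checked against the reported sensitivity, specificity, and group sizes (132 anaemic and 330 non-anaemic patients). The sketch below is a consistency check under the assumption that all four metrics were computed at the same cut-off on the full cohort; the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, n_positive, n_negative):
    """Derive PPV and NPV from sensitivity, specificity, and group sizes."""
    true_pos = sensitivity * n_positive
    false_neg = n_positive - true_pos
    true_neg = specificity * n_negative
    false_pos = n_negative - true_neg
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


ppv, npv = predictive_values(0.72, 0.79, 132, 330)
print(round(ppv, 2), round(npv, 2))  # 0.58 0.88
```

The derived values match the reported PPV of 0.58 and NPV of 0.88.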

Clinical application
The DCA of the renal anaemia nomogram is shown in Figure 4b. DCA showed that the net benefit rate was >0 at a high-risk threshold between 1 and 84%, which was clinically significant, and that the smaller the threshold was between 1 and 84%, the higher the net benefit rate.
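Net benefit in DCA is commonly defined as the true-positive rate minus the false-positive rate weighted by the odds of the threshold probability. The sketch below uses that standard definition; the function names are hypothetical, and only the cohort size and number of anaemic patients are taken from the text.

```python
def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit = TP/n - (FP/n) * threshold / (1 - threshold)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(events, n, threshold):
    """Reference strategy: every patient is classified as high risk."""
    return net_benefit(events, n - events, n, threshold)


# With 132 anaemic patients among 462, the treat-all strategy has a net benefit
# of zero when the threshold equals the prevalence (132/462, about 0.29).
prevalence = 132 / 462
print(abs(net_benefit_treat_all(132, 462, prevalence)) < 1e-9)  # True
```

The treat-none strategy has a net benefit of zero at every threshold, so a model is useful at a given threshold only if its net benefit exceeds both reference strategies.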

Discussion
With the change in the medical model from evidence-based medicine to precision medicine, the latter has rapidly become the focus of attention of the global medical community. The era of big data provides unlimited possibilities for the realization of individualized medicine, in which treatment plans can be tailored according to the individual characteristics of each patient. Clinical prediction models are increasingly widely used in clinical diagnosis, treatment decisions, and patient prognosis management through comprehensive statistical analyses of various clinical data and have become increasingly important as the value of clinical risk prediction and benefit assessment has increased [24]. The nomogram is a graph with high- and low-score lines based on multiple clinical indicators; it is based on multivariable regression analysis, can be used to predict a certain clinical outcome or adverse event rate, and is one of the most widely used statistical methods in clinical research [25]. Moreover, it has visual and mathematical advantages and facilitates the probability calculation of risk factors or other predictor variables in clinical practice [26]. This study is the first to apply a nomogram to the risk study of IgA nephropathy complicated by renal anaemia. We obtained nine variables that are easy to apply clinically and used them to develop and validate a new tool to predict the risk of IgA nephropathy complicated by renal anaemia. Risk factors extracted from among demographic characteristics, laboratory findings, and pathological findings were included in the nomogram for the individualized prediction of disease occurrence. Internal validation of the cohort data showed that the model had good discrimination and calibration ability, especially via its high C-index, suggesting that the model may be applicable to larger clinical samples.
These nine variables were age, sex, DBP, ALB, CHOL, TG, CKD stage, M, and T, which were associated with IgA nephropathy complicated by renal anaemia. The nomogram suggests that age ≥60 years (score = 43), female sex (score = 31), DBP ≤69 mm Hg (score = 36), ALB ≤30 g/L (score = 43), serum CHOL <5.72 mmol/L (score = 23), serum TGs <1.7 mmol/L (score = 23), CKD stage 3 or higher (score = 30 for CKD stage 3 and score = 100 for CKD stages 4-5), mesangial hypercellularity (M1, score = 7), and tubular atrophy/interstitial fibrosis (score = 5 for T1 and score = 38 for T2) may be key factors in determining renal anaemia in patients with IgA nephropathy.
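The point values listed above can be summed for an individual patient as follows. This is a sketch under stated assumptions: the dictionary keys are hypothetical labels, only the categories reported in the text are included, any unlisted category is assumed to contribute zero points (the published nomogram may assign other values), and the conversion from total points to predicted risk requires the nomogram scale in Figure 2 and is not reproduced here.

```python
# Points quoted in the text; the keys are hypothetical labels.
POINTS = {
    "age_ge_60": 43,
    "female": 31,
    "dbp_le_69": 36,
    "alb_le_30": 43,
    "chol_lt_5_72": 23,
    "tg_lt_1_7": 23,
    "ckd_stage_3": 30,
    "ckd_stage_4_5": 100,
    "m1": 7,
    "t1": 5,
    "t2": 38,
}

# Categories of the same variable cannot be present together.
EXCLUSIVE = [("ckd_stage_3", "ckd_stage_4_5"), ("t1", "t2")]


def total_points(features):
    """Sum nomogram points for the set of risk categories present in a patient."""
    features = set(features)
    unknown = features - POINTS.keys()
    if unknown:
        raise KeyError(f"unknown categories: {sorted(unknown)}")
    for a, b in EXCLUSIVE:
        if a in features and b in features:
            raise ValueError(f"{a} and {b} are mutually exclusive")
    return sum(POINTS[f] for f in features)


# A woman aged >= 60 years with CKD stage 4-5 and T2 lesions.
print(total_points({"age_ge_60", "female", "ckd_stage_4_5", "t2"}))  # 212
```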
In this study, we found that age ≥60 years was an independent predictor of renal anaemia in patients with IgA nephropathy, similar to the study by Melissa E. Stauffer, in which the incidence of anaemia tended to increase with increasing age in elderly CKD patients [4]. The reason for this is two-fold. First, with age, the kidney changes morphologically and functionally. Renal ageing is morphologically characterized by the gradual loss of nephrons, glomerulosclerosis, tubular atrophy, renal interstitial fibrosis, and arteriosclerosis, while functional changes mainly consist of reduced renal effective plasma flow and decreased eGFR [27,28]. Second, elderly patients have notably poor prognostic factors, such as decreased eGFR, massive proteinuria, a high number of comorbidities, and relatively severe renal chronicity [29]. Another independent risk factor we identified was female sex. Oh et al. found that in IgA nephropathy, female sex was strongly associated with decreased haemoglobin [30]. Poudel et al. also found a high incidence of anaemia in women with CKD [31]. The specific reasons for the difference in anaemia between sexes are still unclear; one possibility is that hepcidin, which is increased in postmenopausal women and constant with age in men, aggravates chronic disease-related anaemia at high levels [32]. Therefore, doctors should pay special attention to anaemia in female patients in clinical work, and sex should also be considered in the clinical observation and intervention of patients with renal anaemia. In terms of blood pressure, we found that a lower DBP was associated with anaemia, which is speculated to be related to the frequent occurrence of increased pulse pressure in patients with vascular calcification in CKD. Vascular calcification is another common complication of CKD [33]. Liu et al. also found that orthostatic hypotension in CKD is closely related to haemoglobin reduction [34].
Lower ALB, serum CHOL, and TGs are key factors in anaemia and may represent a state of malnutrition in the body, which is also closely related to anaemia caused by chronic inflammation in CKD [7,35]. Investigation of the interaction between nutritional markers and inflammatory cytokines or adipokines is necessary to understand the development of anaemia in CKD. In the nomogram model, the other two factors that carried higher weight scores and were independent predictors of concurrent renal anaemia in patients with IgA nephropathy were CKD stages 3-5 and pathological findings of more severe tubular atrophy/interstitial fibrosis (T2). We demonstrated that the prevalence of anaemia increases with CKD stage, similar to previous studies [11]. We also found that patients with stage 3 CKD need to be monitored for renal anaemia and possibly undergo intervention. This is in line with the results of the study by Jha et al., who found that patients with stage 3 CKD complicated by anaemia (<13 g/dL in men or <11 g/dL in women) had an equivalent risk of progression to end-stage renal disease to non-anaemic patients with stage 4 CKD, further indicating the need for early intervention in patients with CKD complicated by anaemia [36]. The reason for this may be that tubular atrophy/interstitial fibrosis injury can exceed 50% (T2) in patients with IgA nephropathy, which is the link between stages 3-5 CKD and renal anaemia. On further analysis, an important cause of renal anaemia is the insufficient production of EPO by renal erythropoietin-producing cells (REPs). Numerous studies have shown that REPs are intrinsically renal interstitial fibroblasts [37-39], and hypoxia is a switch for their activation, regulating EPO production through the PHD2-HIF2α-EPO signalling pathway [5].
More interestingly, it has been found that REPs transform into myofibroblasts upon kidney injury, leading to the development of renal interstitial fibrosis [40]. Renal interstitial fibrosis in CKD is closely related to renal anaemia, which inspires us to slow the further progression of CKD by targeting and regulating the cellular characteristics of REPs while preventing renal anaemia.
In addition, we analysed ROC and PR curves to validate the nomogram model and evaluated the clinical usability and benefits of the prediction tool through DCA, which suggested that this model can be applied to larger clinical samples. However, this study has some limitations. First, our study is a single-centre study, and the model may require further external validation through a multicentre sample study. Second, the risk factors did not include all potential factors affecting renal anaemia, and thus the model may be somewhat biased.
Figure 4: (b) DCA curve for the renal anaemia nomogram. The y-axis measures the net benefit. The blue solid line represents the renal anaemia nomogram. The solid grey line represents the assumption that all patients with IgA nephropathy developed renal anaemia. The solid black line represents the assumption that none of the patients with IgA nephropathy developed renal anaemia. The decision analysis curve showed that the net benefit rate was >0 at the high-risk threshold of 1-84%, which was clinically significant. Abbreviations: PR curve, precision-recall curve; AUC, area under the curve.
In conclusion, we constructed a nomogram model for predicting the risk of renal anaemia in patients with IgA nephropathy. The model achieved a reasonable accuracy, discrimination, and predictive ability, indicating its potential usefulness for the clinical screening of high-risk patients and the development of more targeted intervention strategies.
Funding information: This research was funded by the National Natural Science Foundation of China (2018YFC1704203).

Conflict of interest: The authors declare no conflict of interest.
Data availability statement: The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.