Exploring the potential of Physics-Informed Neural Networks to extract vascularization data from DCE-MRI in the presence of diffusion

Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is widely used to assess tissue vascularization, particularly in oncological applications. However, the most widely used pharmacokinetic (PK) models do not account for contrast agent (CA) diffusion between neighboring voxels, which can limit the accuracy of the results, especially in cases of heterogeneous tumors. To address this issue, previous works have proposed algorithms that incorporate diffusion phenomena into the formulation. However, these algorithms often face convergence problems due to the ill-posed nature of the problem. In this work, we present a new approach to fitting DCE-MRI data that incorporates CA diffusion by using Physics-Informed Neural Networks (PINNs). PINNs can be trained to fit measured data obtained from DCE-MRI while enforcing the mass conservation equation from the PK model. We compare the performance of PINNs to previous algorithms on different 1D cases inspired by previous works from the literature. Results show that PINNs retrieve vascularization parameters more accurately from diffusion-corrected tracer-kinetic models. Furthermore, we demonstrate the robustness of PINNs compared to other traditional algorithms when faced with noisy or incomplete data. Overall, our results suggest that PINNs can be a valuable tool for improving the accuracy of DCE-MRI data analysis, particularly in scenarios where diffusion effects are significant.


Introduction
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is a widely used imaging technique that provides information on the perfusion and permeability of different tissues. By analyzing the uptake and washout of a contrast agent (CA) in a tissue, DCE-MRI can help diagnose and monitor various pathologies, particularly tumors [2,47,26]. To extract quantitative information from these tissues, the voxel-wise CA concentration versus time curves need to be fitted to one of the available pharmacokinetic (PK) models [17]. Although there are multiple PK models, two of the most widely employed are the standard Tofts model (STM) and the extended Tofts model (ETM) [39,38]. Both are compartmental models that incorporate two distinct compartments (the intravascular space and the extravascular-extracellular space, EES) and account for the exchange of CA between them. Unlike the STM, the ETM considers that a fraction of the voxel is occupied by blood vessels. Therefore, the ETM incorporates an additional term that includes the contribution of the intravascular CA to the total CA concentration in the voxel.
Thanks to the great advances in Machine Learning (ML), and more precisely in Deep Learning (DL), several authors have in recent years developed DL-based methods to retrieve PK model parameters from DCE-MRI data. Such works mainly focus on the use of convolutional neural networks (CNN) [7,41,40] and recurrent neural networks (RNN), such as long short-term memory (LSTM) networks [48]. The main drawback of these architectures is that both are purely data-driven approaches. This means that these networks need to be trained on a sufficiently large dataset that includes different patients from distinct clinical centers where diverse protocols may have been applied. Besides the raw clinical data, they also need the "ground truth" for the parameters of the model being fitted. Although in the case of the STM or the ETM these parameter values can be extracted using a fast algorithm such as non-linear least-squares (NLLS) fitting [37], including the diffusive term requires other optimization algorithms, such as those proposed in [4,28,8,33,31]. This poses a major challenge: the algorithms that are fast enough for this task make assumptions that limit their applicability to certain tissues [4,8], while those that can be widely applied to different tissues are too computationally expensive to fit such a large dataset and struggle to converge to the exact solution in some situations [28,33,31]. Thus, the problem we are addressing requires that the known physical laws governing CA transport be included in the network architecture, steering away from purely data-driven approaches.
Recently, Physics-Informed Neural Networks (PINNs) [29,16,3] have emerged as a promising alternative for solving PDEs and other inverse problems. PINNs combine the flexibility and scalability of neural networks with the physical constraints imposed by the underlying equations, allowing for efficient and accurate solutions even for highly nonlinear and ill-posed problems. Other authors have used this type of neural network (NN) to fit tracer-kinetic models to DCE-MRI data [27,12], outperforming the current NLLS method. Zapf et al. [46] used PINNs to estimate the diffusion coefficient governing the long-term spread of molecules in the human brain from diffusion tensor imaging (DTI) MR data, showing their potential for this task. The results obtained in these works show that PINNs can successfully retrieve PK parameters and solve inverse problems of CA diffusion in biological tissues.
In this paper, we investigate the potential of using PINNs to fit diffusion-corrected pharmacokinetic models to synthetic DCE-MRI data, with the aim of establishing a robust framework for future analysis.To facilitate our exploration, we focus on 1D spatial domains while highlighting the broader implications of this approach for advancing DCE-MRI data analysis.
With this physics-driven NN architecture we aim to overcome the limitations of traditional solvers and achieve more robust parameter estimation for DCE-MRI analysis, showing the potential of PINNs to extract more accurate vascularization data from this type of MR sequences, even when faced with noisy and incomplete data.

Methods
This section begins with the introduction of the D-ETM formulation. Then, the fundamentals of PINNs are explained, finishing with a description of the PINN implementation chosen for the D-ETM.

D-ETM formulation
The ETM is a compartmental model that, as stated previously, considers two different compartments (the intravascular and EES compartments) and the exchange of CA between them. This model assumes the hypothesis of well-mixed compartments presented by Tofts [38], which states that no CA concentration gradients exist within the respective compartments, leading to Eq. (1):

$$C_t(\mathbf{x}, t) = v_e(\mathbf{x})\, C_e(\mathbf{x}, t) + v_p(\mathbf{x})\, C_p(t) \tag{1}$$

where $C_t(\mathbf{x}, t)$ is the CA concentration in the voxel; $C_e(\mathbf{x}, t)$ and $C_p(t)$ are the CA concentrations in the EES and intravascular compartments, respectively; $v_e(\mathbf{x})$ and $v_p(\mathbf{x})$ are the volume fractions of each of these compartments with respect to the voxel; and $\mathbf{x}$ is the spatial coordinates vector. The D-ETM formulation, as defined in [31], adds a diffusive term to the differential formulation of the ETM, obtaining Eq. (2):

$$\frac{\partial \left( v_e(\mathbf{x})\, C_e(\mathbf{x}, t) \right)}{\partial t} = \nabla \cdot \left( D_{\mathrm{eff}}(\mathbf{x})\, \nabla C_e(\mathbf{x}, t) \right) + K^{\mathrm{Trans}}(\mathbf{x}) \left( C_p(t) - C_e(\mathbf{x}, t) \right) \tag{2}$$

where $K^{\mathrm{Trans}}(\mathbf{x})$ is the extravasation rate between the intravascular and the EES compartments. This formulation is based on the concept of effective diffusivity applied to biological tissues. Essentially, we assume that the transport of particles through any biological tissue can be viewed as transport through a porous medium [24,25,23]. Given the similarity between porosity and $v_e$, as both measure the volume fraction of "empty" space [36], the effective diffusivity coefficient can be defined as Eq. (3):

$$D_{\mathrm{eff}}(\mathbf{x}) = D\, v_e(\mathbf{x}) \tag{3}$$

where $D$ is the diffusivity coefficient of the CA in free medium.
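As a concrete illustration of Eqs. (1) and (3), the following minimal NumPy sketch evaluates the total voxel concentration and the effective diffusivity; all numerical values are hypothetical and chosen only for demonstration:

```python
import numpy as np

def etm_concentration(v_e, C_e, v_p, C_p):
    """Eq. (1): total voxel CA concentration from the two compartments."""
    return v_e * C_e + v_p * C_p

def effective_diffusivity(D, v_e):
    """Eq. (3): effective diffusivity from the porosity analogy."""
    return D * v_e

# Illustrative values (hypothetical, for demonstration only)
v_e, v_p = 0.5, 0.1      # volume fractions (dimensionless)
C_e, C_p = 0.2, 0.8      # compartment CA concentrations (mM)
D = 2.6e-4               # free-medium diffusivity (mm^2/s)

C_t = etm_concentration(v_e, C_e, v_p, C_p)   # 0.5*0.2 + 0.1*0.8 = 0.18
D_eff = effective_diffusivity(D, v_e)         # 1.3e-4 mm^2/s
```

Because both functions are element-wise, they apply unchanged to per-node NumPy arrays of the spatial parameter distributions.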

Physics-Informed Neural Networks
In recent years, the field of biomedical engineering has seen a significant increase in the use of deep learning techniques for a wide range of applications, from medical image analysis [32] to drug discovery [5]. Deep learning algorithms have shown great promise in improving the accuracy and efficiency of tasks such as disease diagnosis, prognostication, and treatment planning [22]. However, many of these approaches rely on large amounts of labeled data, which can be challenging to obtain in biomedical settings [21]. This is where Physics-Informed Neural Networks (PINNs) have emerged as a promising alternative, leveraging the underlying physics of the problem to reduce the reliance on labeled data and improve model generalization [6].
PINNs incorporate prior physical knowledge of the problem into the neural network architecture, making them more efficient and accurate than traditional data-driven DL approaches. PINNs can include partial differential equations (PDEs) to encode the governing physics of the problem, and then use neural networks to approximate the solution to the PDEs. This combination of physics-based constraints and data-driven learning makes PINNs particularly effective for problems with limited data and complex physical phenomena. In the following sections, we will explain the basic concepts of PINNs and how we used them to solve our specific problem.

Fundamentals
PINNs are based on two main concepts: the Universal Approximation Theorem [13] and Automatic Differentiation (AD) [11].
The Universal Approximation Theorem states that any continuous function, no matter its complexity, can be approximated to arbitrary accuracy by a NN with only one hidden layer and a finite number of neurons.
Automatic differentiation is a technique for efficiently computing the derivatives of a function specified by a computer program. It works by recursively applying the chain rule of calculus to elementary operations such as addition and multiplication, and to elementary functions such as exponentials and trigonometric functions. This allows us to compute exact derivatives to machine precision without the need for symbolic manipulation or numerical approximations. In other words, we can obtain the derivatives of a function with the same precision as the function itself.
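The chain-rule mechanics behind AD can be illustrated with a minimal forward-mode implementation based on dual numbers. This is a didactic sketch only; TensorFlow 2, used later in this work, implements reverse-mode AD, but the chain-rule principle is the same:

```python
import math

class Dual:
    """Minimal forward-mode AD: each number carries a value and a derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # product rule: (fg)' = f'g + fg'
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def sin(x):
    # chain rule for an elementary function: (sin f)' = cos(f) * f'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

# d/dx [x * sin(x)] at x = 2 is sin(2) + 2*cos(2)
x = Dual(2.0, 1.0)          # seed derivative dx/dx = 1
y = x * sin(x)
exact = math.sin(2.0) + 2.0 * math.cos(2.0)
# y.dot matches the analytic derivative to machine precision
```

The derivative is obtained with the same floating-point precision as the function value itself, which is exactly the property PINNs exploit when computing PDE residuals.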
D. Sainz-DeMena, M.A. Pérez and J.M. García-Aznar

Therefore, one can train a NN to express solutions of time-dependent linear and non-linear PDEs from a set of inputs. Given a PDE of the form:

$$\frac{\partial u}{\partial t} + \mathcal{N}[u; \boldsymbol{\lambda}] = 0 \tag{4}$$

where $\mathcal{N}[\,\cdot\,; \boldsymbol{\lambda}]$ is a differential operator with parameters $\boldsymbol{\lambda}$ acting on the hidden solution $u(\mathbf{x}, t)$, we can approximate the solution of the PDE with a NN such that $u(\mathbf{x}, t) \approx \hat{u}(\mathbf{x}, t; \boldsymbol{\theta})$, where $\boldsymbol{\theta}$ are the NN parameters and $(\mathbf{x}, t)$ are the input variables.
The process of NN training comprises a set of training data, a loss function $\mathcal{L}$ that measures the fitness of the NN with respect to the objective, and an optimizer that adjusts the NN parameters to minimize $\mathcal{L}$. In the case of PINNs, we incorporate the PDE as a constraint in $\mathcal{L}$ to ensure that the solution obtained by the NN satisfies the physical laws described by the PDE on a certain set of collocation points (Eq. (5)):

$$\mathcal{L} = w_{data}\,\mathcal{L}_{data} + w_{PDE}\,\mathcal{L}_{PDE} \tag{5}$$

where:

$$\mathcal{L}_{data} = \frac{1}{N_d}\sum_{i=1}^{N_d} \left| \hat{u}(\mathbf{x}_i, t_i; \boldsymbol{\theta}) - u_i \right|^2 \tag{6}$$

$$\mathcal{L}_{PDE} = \frac{1}{N_c}\sum_{j=1}^{N_c} \left| \frac{\partial \hat{u}}{\partial t}(\mathbf{x}_j, t_j; \boldsymbol{\theta}) + \mathcal{N}[\hat{u}; \boldsymbol{\lambda}](\mathbf{x}_j, t_j) \right|^2 \tag{7}$$

and $N_d$ and $N_c$ are the number of training points and collocation points, respectively. The first term of the loss function (Eq. (6)) is the mean square error (MSE) between the predicted solution and the ground-truth solution at the sampled training points. The second term (Eq. (7)) is the MSE of the PDE residual at the collocation points. To compute this term, the differential operators included in the PDE are evaluated using automatic differentiation. Both terms are scaled by weighting factors, $w_{data}$ and $w_{PDE}$, that control the relative importance of the data and PDE losses. Additionally, one may add other terms to the loss function related to boundary or initial conditions with their corresponding weighting factors, although they are not mandatory for inverse problems [30].
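The composite loss of Eqs. (5) to (7) can be sketched numerically as follows. This is a minimal NumPy illustration: the `pde_residual` argument stands in for the AD-computed residual of Eq. (7), and all numbers are toy values:

```python
import numpy as np

def pinn_loss(u_pred_data, u_true, pde_residual, w_data=1.0, w_pde=1.0):
    """Eqs. (5)-(7): weighted sum of the data MSE and the PDE-residual MSE.
    In a real PINN, `pde_residual` is du/dt + N[u; lambda] evaluated at the
    collocation points via automatic differentiation."""
    L_data = np.mean((u_pred_data - u_true) ** 2)   # Eq. (6)
    L_pde = np.mean(pde_residual ** 2)              # Eq. (7)
    return w_data * L_data + w_pde * L_pde          # Eq. (5)

# Toy values: three training points, two collocation points
u_pred = np.array([0.1, 0.2, 0.3])
u_true = np.array([0.1, 0.1, 0.3])
res = np.array([0.01, -0.02])
L = pinn_loss(u_pred, u_true, res)
```

Separating the two terms this way is what lets the weighting factors trade off data fidelity against physical consistency during training.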
By minimizing this loss function, the NN learns to approximate the solution of the PDE at the training points while satisfying the physical laws described by the PDE at the collocation points. For inverse problems, this minimization process not only updates the NN weights and biases but also retrieves the unknown PDE parameters $\boldsymbol{P}$.

Fitting the D-ETM using PINNs
After presenting the general concepts of PINNs, we now shift our focus to the specific implementation used to fit the D-ETM to synthetic DCE-MRI data.
Our implementation is based on previous works [29,14,12] that solved similar problems. Therefore, we opted for a densely connected forward neural network (FNN) architecture consisting of 8 hidden layers with 100 neurons per layer and the hyperbolic tangent (tanh) as activation function. A normalization layer was included before the hidden layers to map the spatial and temporal coordinates to the [-1, 1] range, as this is recognized as a safeguard against vanishing or exploding gradients, as well as a stabilizing factor for the training procedure [29]. The network parameters $\boldsymbol{\theta}$ were initialized using Glorot initialization [9], while the D-ETM parameters $\boldsymbol{P}$ were given random values within the physiological range ($K^{\mathrm{Trans}}$ between 0.05 and 0.5 min$^{-1}$; $v_e$ between 0.3 and 1.0; and $v_p$ between 0.01 and 0.3) [31,28]. The original formulation of the loss function presented in Eq. (5) is modified to include two additional terms:

$$\mathcal{L} = w_{data}\,\mathcal{L}_{data} + w_{PDE}\,\mathcal{L}_{PDE} + w_{IC}\,\mathcal{L}_{IC} + w_{P}\,\mathcal{L}_{P} \tag{8}$$

where $\mathcal{L}_{IC}$ represents the initial conditions, $\mathcal{L}_{P}$ is a soft constraint for the PDE parameters, and $w_{IC}$ and $w_{P}$ are their respective weighting factors. These additional terms are defined as:

$$\mathcal{L}_{IC} = \frac{1}{N_{IC}}\sum_{i=1}^{N_{IC}} \left| \hat{C}_t(\mathbf{x}_i, 0; \boldsymbol{\theta}) \right|^2 \tag{9}$$

$$\mathcal{L}_{P} = \frac{1}{N_{P}}\sum_{i=1}^{N_{P}} \varphi(\mathbf{x}_i) \tag{10}$$

where $N_{IC}$ is the number of points used to evaluate the initial condition ($C_t(\mathbf{x}, 0) = 0$) and $N_P$ is the number of points where the PDE parameters are evaluated. Given that the three D-ETM parameters ($K^{\mathrm{Trans}}$, $v_e$ and $v_p$) are spatial distributions, $N_P$ should be equal to or greater than the spatial discretization of the data points to achieve sufficient precision. The soft constraint $\varphi$ presented in Eq. (10) is defined as:

$$\varphi(\mathbf{x}) = \max\!\left(0, -K^{\mathrm{Trans}}(\mathbf{x})\right)^2 + \max\!\left(0, -v_e(\mathbf{x})\right)^2 + \max\!\left(0, -v_p(\mathbf{x})\right)^2 + \max\!\left(0, v_e(\mathbf{x}) + v_p(\mathbf{x}) - 1\right)^2 \tag{11}$$

Eq. (11) sets the lower bound for all three PDE parameters to 0, while the upper bound for the sum $v_e + v_p$ is set to 1. This constraint ensures that the sum of the intravascular space and the EES does not exceed the total voxel volume, so the parameters obtained are physically plausible. All weighting factors were set to 1 except one of them, which was given a value of 1000; based on the different tests conducted, this combination of weights led to the best results. The loss function defined in Eq. (8) is minimized using the Adam optimizer [18] with a constant learning rate of 0.001 for 60,000 epochs. A schematic overview of this PINN implementation is presented in Fig. 1.
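The two additional loss terms, Eq. (9) and the soft parameter constraint of Eqs. (10) and (11), can be sketched as follows; this NumPy illustration assumes a quadratic `max(0, .)^2` penalty for bound violations, and the parameter values are hypothetical:

```python
import numpy as np

def loss_ic(C_t_pred_t0):
    """Eq. (9): penalize deviation from the initial condition C_t(x, 0) = 0."""
    return np.mean(C_t_pred_t0 ** 2)

def loss_params(ktrans, v_e, v_p):
    """Eqs. (10)-(11): soft bounds -- all parameters >= 0 and v_e + v_p <= 1."""
    relu = lambda z: np.maximum(0.0, z)
    return np.mean(relu(-ktrans) ** 2 + relu(-v_e) ** 2
                   + relu(-v_p) ** 2 + relu(v_e + v_p - 1.0) ** 2)

# A feasible parameter set incurs no penalty...
ok = loss_params(np.array([0.3]), np.array([0.5]), np.array([0.1]))   # 0.0
# ...while violations (negative K^Trans, v_e + v_p > 1) are penalized quadratically
bad = loss_params(np.array([-0.1]), np.array([0.8]), np.array([0.4]))
```

Being smooth almost everywhere, this penalty keeps the parameter estimates inside the physically plausible region without hard clipping, so gradient-based optimizers such as Adam can still update them freely.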
The number of training points was determined by the resolution of the synthetic data, with 360 points in the temporal domain (1 s resolution) and 60 points in the spatial domain (0.1 mm resolution) [31], resulting in a total of 21,600 training points. $N_{IC}$ and $N_P$ were set to 100 and 120, respectively. We found empirically that refining the spatial discretization of the PDE parameters ($N_P$) by a factor of 2 with respect to the data resolution increased the accuracy. Finally, 10,000 collocation points were distributed over the whole domain using the Latin Hypercube Sampling (LHS) method [35].
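Latin Hypercube Sampling stratifies each input dimension so that every stratum receives exactly one sample. A minimal NumPy implementation is sketched below; the 6 mm by 360 s domain extents are an assumption inferred from the stated spatial and temporal resolutions:

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """Latin Hypercube Sampling in [0, 1)^d: one sample per stratum per dimension."""
    # Place one point in each of the n strata of every dimension...
    samples = (rng.random((n, d)) + np.arange(n)[:, None]) / n
    # ...then shuffle each column independently to decouple the dimensions.
    for j in range(d):
        samples[:, j] = rng.permutation(samples[:, j])
    return samples

rng = np.random.default_rng(0)
# 10,000 collocation points over an assumed (x, t) domain of 6 mm x 360 s
pts = latin_hypercube(10_000, 2, rng) * np.array([6.0, 360.0])
```

Compared with plain uniform sampling, LHS guarantees coverage of the whole space-time domain even with a fixed budget of collocation points.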
The implementation shown so far includes most of the features described in the original work that laid the foundations for PINNs [29]. However, the results obtained with this implementation showed that, although the network was able to fit the data curves accurately, the error in the PDE parameters, particularly $v_e$, was too high (Appendix A).
To retrieve more accurate PDE parameters, we introduced the residual-based adaptive refinement (RAR) method [20]. This method aims to improve the distribution of collocation points during training. After a certain number of epochs ($N_{epochs}$), the PDE residual is evaluated at a new set of randomly sampled collocation points. These points are then ranked by their mean PDE residual value. Finally, the top $k$ points are added to the initial list of collocation points. This technique helps the NN focus on the regions where the PDE residual is higher, enhancing the gradients corresponding to the PDE parameters in those regions. Through successive tests, the RAR parameters were set to $N_{epochs}$ = 500 and $k$ = 500.
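One RAR refinement step can be sketched as below. The residual function here is a toy stand-in for the trained network's PDE residual, and the candidate-pool size `n_candidates` is a hypothetical choice (not stated in the text):

```python
import numpy as np

def rar_update(colloc, residual_fn, rng, n_candidates=1000, k=500):
    """One residual-based adaptive refinement step: sample candidate points,
    rank them by PDE-residual magnitude, and append the top-k worst offenders
    to the existing collocation set."""
    candidates = rng.random((n_candidates, colloc.shape[1]))
    res = np.abs(residual_fn(candidates))            # |PDE residual| per point
    top = candidates[np.argsort(res)[::-1][:k]]      # k points with largest residual
    return np.vstack([colloc, top])

# Toy residual, largest near the origin (mimics a region the PINN fits poorly)
residual = lambda p: np.exp(-10.0 * np.sum(p ** 2, axis=1))
rng = np.random.default_rng(0)
colloc = rng.random((100, 2))
colloc = rar_update(colloc, residual, rng, n_candidates=200, k=50)  # 150 points now
```

Repeating this step every $N_{epochs}$ epochs concentrates collocation points where the physics is worst satisfied, which is what strengthens the parameter gradients in, for example, necrotic regions.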
The deep learning library TensorFlow 2 [1] is used to implement all the methods and optimizations described earlier.
It is important to note that all these hyperparameters were manually calibrated until satisfactory results were obtained. Moreover, it should be emphasized that before applying our methodology to more complex 2D cases, a proper tuning of the hyperparameters is necessary to understand and measure how each hyperparameter affects the results.

Examples of application
To assess the effectiveness of PINNs in accurately determining the D-ETM parameters in comparison to previous optimization algorithms [31], two distinct in silico cases were devised. Both cases were based on a 1D spatial domain that corresponds to a cross-section of a circular tumor with two different regions: a necrotic core and a highly vascularized rim, similar to the benchmark case used in previous studies [28,8,31]. The objective of this benchmark case was to highlight the effect of diffusion in CA transport, thereby pointing out the limitations of the ETM. The different sets of synthetic CA concentration time courses were generated using the forward implementation of the D-ETM in ANSYS (Ansys Inc., TX, USA), as described in [31]. In all cases the diffusion coefficient for the CA in free medium, $D$, was set to $2.6 \times 10^{-4}$ mm$^2$/s [19,10], while the arterial input function (AIF) was the same as the one used in our previous work [31].
In the first case, the distributions of $K^{\mathrm{Trans}}$ and $v_p$ are homogeneous within each region, taking values of 0.3 min$^{-1}$ and 0.1 in the vascularized rim and 0.05 min$^{-1}$ and 0.01 in the necrotic core, respectively. The second case, on the other hand, is based on a heterogeneous distribution of these two vascularization parameters: $K^{\mathrm{Trans}}$ and $v_p$ defined in the ranges [0.2, 0.3] min$^{-1}$ and [0.07, 0.13], respectively, along the vascularized rim; and $K^{\mathrm{Trans}}$ taking values between 0.02 and 0.07 min$^{-1}$ and $v_p$ between 0.0 and 0.05 in the necrotic core. In both cases, $v_e$ was set to 0.5 throughout the whole domain, replicating the configuration presented in previous works [28,8,31]. We employed the absolute relative difference (ARD) metric to quantify the error in the fitted parameters for each model.
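The ARD metric used in the results can be sketched as follows; the per-node values below are hypothetical and only illustrate how the "fraction of nodes below the 20% threshold" summary is computed:

```python
import numpy as np

def ard(fitted, reference):
    """Absolute relative difference, in %, between fitted and reference values."""
    return 100.0 * np.abs(fitted - reference) / np.abs(reference)

# Hypothetical per-node K^Trans values in the vascularized rim (min^-1)
k_ref = np.full(10, 0.30)
k_fit = k_ref + np.linspace(-0.05, 0.05, 10)   # toy fitting deviations

errors = ard(k_fit, k_ref)            # per-node ARD in %
frac_ok = np.mean(errors < 20.0)      # fraction of nodes below the 20% threshold
median_ard = np.median(errors)        # the other summary statistic reported
```

Here the largest deviation (0.05 on a reference of 0.30) gives an ARD of about 16.7%, so every toy node falls below the 20% threshold.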
Additionally, each case was fitted with the two original models: the ETM [38] and the D-ETM [31]. The latter was fitted using two different optimization methods: the FE-based algorithm presented in [31] (D-ETM FE) and the PINN approach presented in this work (D-ETM PINN). As stated before, in the D-ETM PINN the D-ETM parameters were initialized with random values within their physiological ranges. In the case of the D-ETM FE, the output of the ETM fitting was used as the initial seed. This was done to reduce the complexity of the minimization process and avoid local minima.

Homogeneous distribution of parameters
Looking at the results obtained for the homogeneous case (Fig. 2 and Table 1), it is clear that this initial seed was not sufficient to prevent the D-ETM FE from converging to a local minimum. In fact, it shows a greater error than the ETM for the $K^{\mathrm{Trans}}$ distribution (87% of the nodes fitted by the ETM have an ARD lower than 20%, while the D-ETM FE has only 73% below that threshold). This outcome was expected, since the limitations of the D-ETM FE when dealing with homogeneous distributions were first reported in [31].
In comparison, this test proved the robustness of the D-ETM PINN when facing ill-posed problems, with 99% of the nodes below the ARD threshold and a median ARD of only 0.68%, three times lower than that of the ETM. The interquartile range (IQR) also highlights the increased dispersion in the ARD distribution of the D-ETM FE and, to a lesser extent, of the ETM.
A key point of these results is the ability of PINNs to accurately adjust $v_e$ values in necrotic zones, where $K^{\mathrm{Trans}}$ takes values close to zero.
Previous models and algorithms [28,31] failed to accurately retrieve the distribution of $v_e$ in those regions. This was caused by the vanishing effect observed in the gradient of $C_t$ with respect to $v_e$, since this gradient depends on the $K^{\mathrm{Trans}}$ value. Therefore, when $K^{\mathrm{Trans}}$ tends to zero, the gradient does the same, causing the algorithm to converge to a local minimum. Thanks to the RAR method, PINNs are able to reduce this vanishing effect, overcoming the convergence issues and achieving high accuracy for $v_e$ in necrotic regions.

Heterogeneous distribution of parameters
The second set of simulations corresponded to the same 1D domain of a circular tumor, but with heterogeneous distributions of parameters. This case aims to resemble more closely a real tumor, where some degree of heterogeneity is present.
The results obtained are consistent with previous works [28,8,31]: the ETM tends to average the parameter distributions, failing to capture the heterogeneity shown in Fig. 3, while the two implementations of the D-ETM (FE and PINN) accurately depict this heterogeneity (Table 2). This effect is particularly significant in the case of $v_p$, where only 46% of the nodes fitted by the ETM show an ARD lower than 20%; this metric rises to 84% and 91% for the PINN and FE implementations of the D-ETM, respectively. While the PINN method shows a slightly greater error for $v_p$, it outperforms the D-ETM FE in the case of $K^{\mathrm{Trans}}$, with 94% of nodes below the ARD threshold versus 78% for the D-ETM FE. The averaging effect shown by the ETM also has a clear impact on the $K^{\mathrm{Trans}}$ error metrics, with only 52% of nodes having an ARD below 20%.
Regarding $v_e$, results are similar to the homogeneous case: both the ETM and the D-ETM FE fail to retrieve the $v_e$ distribution. Although around three quarters of the nodes are below the ARD threshold, Fig. 3 shows that in some nodes the ARD value is close to 100%. This can be explained by the vanishing effect described previously, which prevents the D-ETM FE from converging to the solution. Again, the D-ETM PINN overcomes this issue and gets more than 98% of values below the ARD threshold, keeping the maximum ARD below 25%.

Table 1
Error metrics comparison between the D-ETM (FE and PINN methods) and the ETM for the homogeneous case. The metrics computed are the median and interquartile range (IQR) of the ARD distribution and the fraction of nodes whose ARD is below the defined threshold of 20%.

Table 2
Error metrics comparison between the D-ETM (FE and PINN methods) and the ETM for the heterogeneous case. The metrics computed are the median (m) and IQR of the ARD distribution and the fraction of nodes whose ARD is below the defined threshold of 20%. These metrics were computed from a set of 10 simulations with different heterogeneous distributions of parameters.

It is worth noting that, despite these good error metrics, the D-ETM PINN shows some averaging patterns in certain regions of the spatial domain, especially for $v_e$. This is probably due to the different effect these parameters have on the cost function depending on the algorithm. In the case of the D-ETM FE, the $C_t$ curves depend on the parameter values through a forward FE simulation (see [31] for further details), increasing the impact of the PDE parameters (mainly $v_e$) on the cost function. In the case of the D-ETM PINN, the PDE parameters only impact part of the loss function ($\mathcal{L}_{PDE}$ and $\mathcal{L}_{P}$). Therefore, there can be small errors in the PDE parameter distributions while the total loss $\mathcal{L}$ is minimized, since the data loss ($\mathcal{L}_{data}$) is being reduced by updating the NN parameters ($\boldsymbol{\theta}$).

Testing the robustness of the PINN approach against noisy and incomplete data
After demonstrating the increased accuracy of the D-ETM PINN with respect to the D-ETM FE, we test its robustness when faced with noise and incomplete temporal data.

Influence of noise
Initially, a set of 1D heterogeneous distributions of parameters similar to those presented in Fig. 3 was generated. Next, experimental noise was added to the generated $C_t$ data curves using a Gaussian distribution with a standard deviation (SD) equal to a fraction (2.5% and 5%) of the highest concentration value reached in the whole domain, similar to previous works [28,31].
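The noise-corruption step can be sketched as follows; the concentration curve below is a hypothetical toy uptake/washout shape, not the actual D-ETM output:

```python
import numpy as np

def add_noise(C_t, level, rng):
    """Add Gaussian noise whose SD is `level` times the peak concentration
    reached over the whole domain, mirroring the protocol described above."""
    sd = level * np.max(C_t)
    return C_t + rng.normal(0.0, sd, size=C_t.shape)

rng = np.random.default_rng(42)
t = np.arange(360.0)                              # 360 s at 1 s resolution
C_t = 0.5 * (t / 60.0) * np.exp(1.0 - t / 60.0)   # toy curve, peak 0.5 at t = 60 s
noisy = add_noise(C_t, 0.05, rng)                 # 5% noise level
```

Tying the SD to the global peak (rather than to each voxel's own peak) means that low-enhancing regions such as the necrotic core receive proportionally more corruption, which is what makes this test demanding.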
The ARD distributions for each of the parameters and each model are shown in Fig. 4. These results show that the D-ETM PINN is much more robust to noise than the D-ETM FE, even in the case of $v_e$. When faced with a medium noise level (2.5%), both methods show a similar ARD for this variable (Table 3). However, when the noise level reaches 5%, the D-ETM PINN is more accurate than the D-ETM FE (Table 4).
The results for the other two variables follow the same trend: the influence of noise is much lower for the D-ETM PINN than for the D-ETM FE. The ETM, however, seems to be unaffected by noise, reaching an accuracy similar to that of the D-ETM FE at high noise levels (5%).

Table 3
Error metrics comparison between the D-ETM (FE and PINN methods) and the ETM for the heterogeneous case corresponding to the 2.5% noise level. The metrics computed are the median (m) and IQR of the ARD distribution and the fraction of nodes whose ARD is below the defined threshold of 20%. These metrics were computed from a set of 10 simulations with different heterogeneous distributions of parameters.

Table 4
Error metrics comparison between the D-ETM (FE and PINN methods) and the ETM for the heterogeneous case corresponding to the 5% noise level. The metrics computed are the median and IQR of the ARD distribution and the fraction of nodes whose ARD is below the defined threshold of 20%. These metrics were computed from a set of 10 simulations with different heterogeneous distributions of parameters.

Temporal undersampling
In this final subsection, we investigated the impact of incomplete data on the accuracy of the ETM, the D-ETM FE, and the D-ETM PINN. To achieve this, we conducted a temporal undersampling analysis on both the concentration data curves and the arterial input function (AIF). Specifically, we considered two scenarios in which the temporal resolution was reduced to 5 s and 10 s, respectively. The aim of this analysis was to simulate realistic situations in which the temporal resolution deviates from the ideal resolution of 1 s used in our previous experiments.
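The undersampling itself amounts to keeping every n-th time point of the curves and the AIF. A minimal NumPy sketch (array shapes follow the 60-node, 360-s setup described earlier; the zero-filled arrays are placeholders for the synthetic data):

```python
import numpy as np

def undersample(t, C_t, aif, step):
    """Keep every `step`-th time point of the time axis, the per-node
    concentration curves, and the AIF, emulating a coarser temporal resolution."""
    return t[::step], C_t[..., ::step], aif[::step]

t = np.arange(360.0)          # 1 s resolution
C_t = np.zeros((60, 360))     # 60 spatial nodes x 360 time points (placeholder)
aif = np.zeros(360)           # placeholder AIF

t5, C5, a5 = undersample(t, C_t, aif, 5)      # 5 s resolution -> 72 time points
t10, C10, a10 = undersample(t, C_t, aif, 10)  # 10 s resolution -> 36 time points
```

Note that the PINN itself needs no special handling for this: the undersampled points simply become a smaller set of training points, while the collocation points can still fill the continuous domain.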
Our results, presented in Fig. 5, reveal that the ETM was almost unaffected by the undersampling, except for the $v_e$ variable, which was more sensitive to it. Conversely, the D-ETM FE performed well at a temporal resolution of 5 s, but its performance degraded when the resolution was reduced to 10 s, yielding higher errors than the ETM.

Table 5
Error metrics comparison between the D-ETM (FE and PINN methods) and the ETM for the case with a temporal resolution of 5 s. The metrics computed are the median (m) and IQR of the ARD distribution and the fraction of nodes whose ARD is below the defined threshold of 20%.

Interestingly, the D-ETM PINN demonstrated superior performance, even under the worst-case scenario of a 10 s temporal resolution. As shown in Table 5 and Table 6, the D-ETM PINN outperformed both the D-ETM FE and the ETM. With a 10 s resolution, 93% and 89% of the nodes show an ARD lower than the 20% threshold for $K^{\mathrm{Trans}}$ and $v_p$, respectively. This was almost twice the proportion observed for the ETM and the D-ETM FE. Even for the sensitive $v_e$ variable, the D-ETM PINN maintained a relatively high proportion (67%) of values below the ARD threshold, compared to the D-ETM FE (18%) and the ETM (15%).
In summary, our findings suggest that the D-ETM PINN is more robust to incomplete data and performs better than both the ETM and the D-ETM FE under these conditions. The ETM performs reasonably well except for the $v_e$ variable, while the D-ETM FE shows good performance at a 5 s temporal resolution but struggles with further undersampling.

Discussion
DCE-MRI is a powerful imaging technique widely used in clinical practice, particularly in oncology, to assess the vascular properties of tissues. The ability to obtain functional information about tumors using DCE-MRI is useful for diagnosing, staging, and monitoring tumors' response to antiangiogenic therapies [2,47,26]. However, accurately retrieving physiological parameters from DCE-MRI is a challenging task due to the complexity of the underlying pharmacokinetic models. Traditional models, such as the standard and extended Tofts models, are widely used to estimate these parameters but are known to produce inaccurate results in regions where there is significant passive delivery of CA [34,15,19,4,28,8,33,31]. Other authors have proposed approaches that include the diffusion of CA and have developed several methods to fit their models to DCE-MRI data. Nevertheless, these approaches either have limited applicability due to the hypotheses considered, or suffer from convergence issues and a high computational cost. Therefore, there is a critical need for new methods that can retrieve more accurate parameters from DCE-MRI data.
The main objective of this study was to explore the use of PINNs as an alternative to other traditional algorithms to fit one of the models that include the diffusion term: the D-ETM.To do so, we tested the performance of this PINN approach versus the FE-based optimization algorithm presented in [31].Both methods were compared to the ETM to highlight the importance of diffusion in CA delivery.
We tested these approaches on a 1D domain resembling a slice of a circular tumor with a highly vascularized rim and a necrotic core.This highlights CA diffusion effects, creating large CA concentration gradients between regions.
Our results indicate that PINNs are a promising tool for solving the ill-posed inverse problems associated with fitting the D-ETM to DCE-MRI data. The PINN-based approach kept almost all nodes in the domain below the acceptable error threshold. Previous algorithms, such as the FE-based one, failed to retrieve the $v_e$ distribution in necrotic regions, due to the low influence of this parameter on the global solution. However, the use of PINNs along with the RAR method overcomes this limitation, outperforming traditional algorithms. Even in the homogeneous case, where the FE algorithm converged to a local minimum, the PINN approach accurately depicted the distribution of all the parameters.
To further demonstrate the robustness of PINNs, we tested their performance in the presence of noisy and incomplete data. The results obtained show that the PINN was affected to a much lesser extent than the FE algorithm, retrieving very accurate distributions of the D-ETM parameters. Looking at the error distribution for each of the three approaches tested, we conclude that PINNs combine (and even improve upon) the precision of the FE algorithm with the robustness of the ETM against noisy and incomplete data.
Despite these promising results, there is still room for improvement. First, in this study we did not perform a comprehensive hyperparameter tuning, which may have resulted in suboptimal performance. Future studies should focus on optimizing the PINN hyperparameters to further improve performance [43]. Second, there are additional features of PINNs described in the literature that were not included in our study, such as gradient-enhanced PINNs [20], annealing algorithms to update each loss weight ($w_i$), or new NN architectures optimized for PINNs [42]. Incorporating these features in future studies may enhance the accuracy and robustness of the PINN approach while reducing the training time.
The main limitation of our methodology is the lengthy training time required for the PINN.Our experiments were conducted on a PC equipped with a NVIDIA RTX 3070 GPU, 32 GB RAM, and an Intel i7-11700K CPU, with an average training time of around 30 minutes.This is even slower than the current FE algorithm, which took an average of 20 minutes on the same PC.Meanwhile, the NLLS algorithm used for the ETM required only a few seconds to fit all nodes in the domain, so it cannot be directly compared to either of our methods.To address this limitation, proper calibration of the PINN may help to reduce the training time.Additionally, transfer-learning techniques [45,44] can be applied to further lower the computational cost.
Our study demonstrates the capability of PINNs to overcome convergence issues when fitting the D-ETM to DCE-MRI data, outperforming previous algorithms.Currently, this 1D implementation has a limited applicability on in vivo data as living tissues rarely have axisymmetric properties.Therefore, this work lays the foundation for further research that improves our implementation and optimizes it for its application to 2D cases, the first and necessary step before applying this methodology to in vivo cases.

Fig. 1 .
Fig. 1.Schematic representation of the PINN implementation developed to fit the D-ETM.PINNs take advantage of the computational efficiency of automatic differentiation to get the derivatives of   needed for the computation of   .As in any other inverse problem, the optimizer not only updates the NN parameters ( * ) but also the PDE parameters ( * ).

Fig. 2 .
Fig. 2. Homogeneous case. Reference values and results of the D-ETM (with both methods, FE and PINN) and the ETM fitting. Results highlight the limitation of the ETM when faced with significant diffusion gradients, tending to average the $K^{\mathrm{Trans}}$ along that region. They also show the improved accuracy of the D-ETM PINN with respect to the D-ETM FE: while the latter converges to a local minimum, as in [31], the former accurately retrieves the distribution of parameters.

Fig. 3 .
Fig. 3. Heterogeneous case. Reference values and results of the D-ETM (with both methods, FE and PINN) and the ETM fitting.

Fig. 4 .
Fig. 4. Comparison of the ARD probability density function (PDF) of each parameter for each model and three different levels of noise: 0%, 2.5% and 5%. Vertical lines correspond to the medians of the ARD distributions. The D-ETM PINN is much more robust to noise than the D-ETM FE, while the ETM does not seem affected by noise.

Fig. 5 .
Fig. 5. Comparison of the ARD probability density function (PDF) of each parameter for each model over three different temporal resolutions: 1 s (no undersampling), 5 s (1/5) and 10 s (1/10). Vertical lines correspond to the medians of the ARD distributions. The D-ETM PINN outperforms the other two approaches, showing great robustness even in the worst scenario.
In the case of inverse problems, $u(\mathbf{x}, t)$ is known on some set of training points $\{\mathbf{x}_i, t_i, u_i\}_{i=1}^{N_d}$.

Table 6
Error metrics comparison between the D-ETM (FE and PINN methods) and the ETM for the case with a temporal resolution of 10 s. The metrics computed are the median (m) and IQR of the ARD distribution and the fraction of nodes whose ARD is below the defined threshold of 20%.