Cochlear pharmacokinetics - Micro-computed tomography and learning-prediction modeling for transport parameter determination

Inner ear disorders such as sensorineural deafness and genetic diseases may one day be treated with local drug delivery to the inner ear. Current pharmacokinetic models have been based on invasive methods to measure drug concentrations, limiting them in spatial resolution, and restricting the research to larger rodents. We developed an intracochlear pharmacokinetic model based on an imaging, learning-prediction (LP) paradigm for learning transport parameters in the murine cochlea. This was achieved using noninvasive micro-computed tomography imaging of the cochlea during in vivo infusion of a contrast agent at the basal end of scala tympani through a cochleostomy. Each scan was registered in 3-D to a cochlear atlas to segment the cochlear regions with high accuracy, enabling concentrations to be extracted along the length of each scala. These spatio-temporal concentration profiles were used to learn a concentration dependent diffusion coefficient, and transport parameters between the major scalae and to clearance. The LP model results are comparable to the current state of the art model, and can simulate concentrations for cases involving different infusion molecules and different drug delivery protocols. Forward simulation results with pulsatile delivery suggest the pharmacokinetic model can be used to optimize drug delivery protocols to reduce total drug delivered and the potential for toxic side effects. While developed in the challenging murine cochlea, the processes are scalable to larger animals and different drug infusion paradigms.


Introduction
Current and future treatments of inner ear disorders such as sensorineural hearing loss (Schreiber et al., 2010), M eni ere's disease (Minor et al., 2004) and genetic diseases require local delivery of restorative and protective compounds to the inner ear. Novel inner ear therapeutics and delivery systems have been a major thrust of pharmacokinetics research in the past decade Peppi et al., 2018). While significant work has been done to improve drug delivery methods (Plontke et al., 2017), challenges remain with accurately determining pharmacokinetics of drugs delivered inside the inner ear. Direct methods for measuring drug concentrations are based on measuring a specific ion (e.g.: trimethylphenylammonium) using ion-selective microelectrodes (Salt and Ma, 2001), or by drawing minute perilymph samples from the basal (Salt et al., 2003;Laurell et al., 2002) or apical turns Mynatt et al., 2006) of the cochlea or from the vestibular canals (Salt et al., 2012;Arnold et al., 2005). These methods are limited in their spatio-temporal resolution and are difficult to employ in animals smaller than the guinea pig. Perilymph sampling also results in cerebral spinal fluid (CSF) contamination in the samples (Salt et al., 2003, complicating interpretation of the concentration results. There remains a need for advanced methods to monitor drug concentrations within the three cochlear scalae over time that provide enhanced spatial resolution and are suitable for a broad range of infused compounds and animal species, including the genetically-mapped murine model system. Indirect measurement methods can preserve the structural integrity of the inner ear and help obtain spatio-temporal concentration profiles within the cochlea without the issues faced with invasive measurements and fluid sampling. For example, functional assessment methods like distortion product otoacoustic emissions (DPOAE) (Borkholder et al., 2010;Frisina et al., 2018) and auditory brainstem response (ABR) (Conn, 2008;Borkholder et al., 2014) estimate concentration within the cochlea based on the physiological response of cochlear hair cells or spiral ganglion neurons to a delivered ototoxic agent (Müller et al., 2005). In this approach, a physiological place-frequency map of the cochlea is used to correlate auditory responses with cochleotopic position (Borkholder et al., 2010). However, these methods only provide an indirect estimate of concentration, and accuracy may be impacted by non-linear sensitivity effects. Magnetic resonance imaging (MRI) has been used for qualitative (Naganawa et al., 2011) and semiquantitative (King et al., 2011;Zou et al., 2012;Yamazaki et al., 2012) assessment of gadolinium distribution within the cochlea following transtympanic injection in humans and application to the round window membrane niche in guinea pigs. However, resolution limitations preclude use of MRI for high fidelity pharmacokinetic analysis, especially in smaller rodents such as the mouse with total cochlear length less than 7 mm.
In order to enable intracochlear pharmacokinetic analysis in mice via high resolution imagery, the use of contrast-enhanced micro computed tomography (mCT) was first introduced by Haghpanahi and colleagues (Haghpanahi et al., 2013). CT works on the basic principle of combination of X-ray attenuation measurements (attenuation coefficients) taken from multiple angles, usually in a cylinder, sphere, or helix around the object, to obtain a volume consisting of slices (tomograms) in each of the three orthogonal dimensions (Kak et al., 2002). mCT is similar to the CT scanners used in medical facilities, but the scale of objects imaged is smaller than 125e165 mm Ritman (2004); Inveon Platform, 2008, and the resolution of the scans can be as small as 100 nm (Ritman, 2004). Delivery of a contrast agent with a high attenuation coefficient relative to the tissues surrounding it enables direct visualization of the contrast agent concentration within the imaged 3-D space. Haghpanahi infused an organoiodine radiographic contrast agent in vivo into the mouse scala tympani (ST) via cochleostomy during live animal mCT scanning (Haghpanahi et al., 2013). By establishing a calibrated correlation between image intensity and contrast agent concentration, the intracochlear concentrations can be noninvasively quantified in the major compartments of the cochlea. To identify fine membrane features that delineate the cochlear compartments, Haghpanahi performed a two-dimensional (2-D) image registration of selected slices to a murine cochlear atlas (Santi et al., 2008), and passed segmentation or labeling information from the atlas to the mCT scan. Concentrations were extracted at 5 locations along ST and scala vestibuli (SV) at four time points spread across a 90-min infusion (Haghpanahi et al., 2013) demonstrating the potential for direct, non-invasive assessment of intracochlear concentrations of a delivered drug.
In the present work, we extend this high-resolution technique with 3-D image registration and segmentation enabling high spatio-temporal resolution interrogation of all three major cochlear compartments in the mouse: ST, SV, and scala media (SM). The main challenge of this approach is to accurately identify the 3-D regions in the mCT scans that correspond to the cochlear structures of interest. Manual segmentation of each structure in each 3-D scan can be accurate, but the amount of effort required for manual segmentation does not enable scaling this approach to handle multiple scans of several mice. Instead, we focus on the automatic technique of atlas-based registration (Santi et al., 2008;Yang et al., 2011;Thorne et al., 1999). Atlas-based registration involves geometrically aligning an atlas (a 3-D image for which the regions in question have been pre-segmented and labeled) to the target image (a mCT scan of the current mouse). Once the atlas has been registered to the target, the locations and labels of the presegmented regions are transferred into the geometry of the target image. In this work, we use the mouse cochlea database of Santi et al. (Santi et al., 2008) as the atlas. The mouse cochlea database is a complete histological database with 17 manually segmented and labeled sections of mouse inner ear, including ST, SM, SV and the spiral ligament (SL). By capturing a series of six mCT scans over 170 min before and during infusion of a contrast agent into the murine cochlea, we measured the spatial concentration profiles through each of the scalae in each scan, and then assembled these profiles to generate spatio-temporal concentration profiles. These extracted concentration profiles are used in a learning-prediciton (LP) model of inner ear pharmacokinetics to learn transport parameters along scala, between scalae, and to clearance. This approach has the potential to enhance existing models of inner ear drug delivery by permitting extraction of spatially dependent diffusion and clearance parameters in the mouse and other rodent model systems.

Surgery and imaging
Our experiments were motivated by the study described in (Haghpanahi et al., 2013); we describe here similar details of our surgical and imaging setup.

Animals and surgical procedures
Young adult CBA/CaJ mice (age: 3e6 months) were used in all our experiments. All animal procedures were approved by the University of South Florida Institutional Animal Care and Use Committee and were performed using National Institutes of Health and veterinary standards of care. Following the procedures originally detailed in Borkholder et al. (Borkholder et al., 2010), animals were anesthetized (ketamine/xylazine) and the left ventral surface of the neck was cleaned to prepare for surgery. Surgery was performed on the left ear of the mice. A cochleostomy was drilled at the base of ST near the round window. Infusion tubing was inserted into the cochleostomy using a micromanipulator (MM3-3, World Precision Instruments, Sarasota, FL) and the cochleostomy site was sealed permanently using dental cement (3M ESPE Durelon, St. Paul, MN) and super glue. The surgery site was loosely sutured closed to provide strain relief for the infusion tubing. The mouse was placed in a 3-D printed custom holder to help maintain a steady head position during transport to the scanner and during imaging, and to provide precise placement of the isoflurane (anesthesia) delivery system during the entire scanning process. Proper positioning of the head during scanning fixes the orientation of the cochlea in the scanner, and minimizes motion artifacts that can cause errors in subsequent image processing steps. The animal's heart rate and respiration were monitored throughout the experiment to maintain the proper levels of general anesthesia done via continuous flow of 1.5% isoflurane during scanning. ISOVUE ® -370 containing 755 mg/ml of Iopamidol which equates to 370 mg/ml organically bound iodine (Bracco Diagnostics) was diluted to 240 mg/ml organically bound iodine with artificial perilymph (AP). A 30e40 cm length of US Pharmacopoeia Class VI polyimide tubing (044-I; ID 110 mm; OD, 139 mm; Microlumen, Tampa, FL) was connected to a 25 ml Hamilton syringe (1702 RNR 22S/2") that was mounted to the pump (Standard Infuse/Withdraw PHD ULTRA™ Syringe Pumps (Cat No.: 70e3005); Harvard Apparatus, Holliston, Massachusetts) allowing accurate infusion rate control. The syringe was carefully filled with 1000 nl AP, 5000 nl of 240 mg/ml Iopamidol solution, and 1000 nl AP, each separated by a 10 nl air bubble to avoid diffusion between fluids as shown in Fig. 1. The syringe pump was powered from an inverter running on a 12 V battery to permit continuous infusion during surgery and transport to the mCT scanner. Flow was set to 16 nl/min to provide continuous AP flow during surgery, transport to the scanner, and the initial baseline scan (no contrast agent). Following the baseline scan, the flow rate was changed to 32 nl/min and the second scan was started to coincide with initiation of delivery of contrast agent. Scans continued serially for 140 additional minutes with continuous infusion of contrast agent to the basal turn of ST via cochleostomy.

Infusion and flow characteristics
In this work, a total of 4800 nl of the contrast agent is infused into the cochlea during each experiment at a rate of 32 nl/min for 140 min. While the total perilymph volume of ST and SV of the mouse is only about 620 nl (Thorne et al., 1999), the patent cochlear aqueduct (CA) provides a fluidic exit where infused agent exits ST and flows into the large cranial subarachnoid space. As shown in Fig. 2, the CA is positioned basal to the cochleostomy (CO), providing a region of high concentration of infused agent at the base, with diffusion mediated transport to more apical regions and to other scalae and clearance. Within the scalae, diffusion is characterized by a concentration dependent diffusion coefficient (D), whereas inter-scalae transport and clearance are defined by transport parameters (K). These parameters are explained in detail in the pharmacokinetics modeling section.

Micro computed tomography (mCT) imaging
Imaging was performed on a Siemens Inveon PET/CT scanner. Whole head images were 1024 Â1024 Â 512 voxels with an isometric resolution of 16:38 Â16:38 Â 16:38 mm. The mouse holder was placed inside the mCT scanner and a low resolution scout view was obtained to accurately position the scan window in the region of the ipsilateral cochlea. Scans were obtained continuously for $170 min with infusion of AP for the first $30 min or until the 1000 nl AP was delivered. Contrast agent was delivered for the remaining 140 min. Each experiment consisted of 1 baseline scan and 5 contrast agent infusion scans (5 time points) with each scan captured over 20 min with a wait time of 10 min. The baseline scan is imaged with only the AP infused and no contrast agent to enable robust image registration, and to provide a zero concentration baseline throughout the cochlear scalae for quantitative assessment of contrast agent intensity. The contrast agent volume was recorded from the syringe pump and was visually verified by movement of the 10 nl air bubble between the contrast agent and trailing AP solution.
The cochleostomy experiments were carried out on a total of 17 animals, and were used to test the 3-D image registration algorithms. Of the 17 animals, three were used for final contrast agent infusion for purposes of quantitative pharmacokinetic model development and learning of diffusion and transport parameters. These specific experiments achieved minimal or no leakage at the cochleostomy site with expected rapid increases in basal contrast agent concentration. As described below, the model first learns a time dependent leakage rate based on the time-rate of change of concentration at the infusion site in ST. The model then iterates to learn the diffusion and transport parameters which are then used in forward simulations of cochlear pharmacokinetics.

Processing and extraction
After the scans are obtained post-surgery, we proceed to convert the scans to the required format, register the scans, and extract the concentration from the three main scalae as explained below.

Mouse cochlear atlas
Estimating pharmacokinetic model parameters requires Iopamidol prevent their mixing and also provide a visual reference to ensure the rate of flow through the tube. The holder, functionally the same as shown above, has three points of contact with the mouse to ensure the head stays in the required position throughout the experiment. Pressure driven flow is depicted with solid lines while diffusion is depicted with dotted lines. The infusion results in advective flow through the cochlear aqueduct (CA) and diffusion-mediated transport through the cochlear spaces. Transport between scalae and to clearance that are modeled for the murine system are depicted as K values.
obtaining spatio-temporal profiles of the contrast agent concentrations in the three main scalae (ST, SM and SV). However, the scalae are not easily identified in the mCT scans of the cochlea since fine membranes provide insufficient image contrast for visualization. In order to differentiate them, a template (atlas) of the mouse cochlea with various pre-identified regions published by Santi et al. (Santi et al., 2008) was used. The atlas was obtained from OPFOS (Orthogonal-Plane Fluorescence Optical Sectioning Microscopy) images of the decalcified cochlea of a CBA mouse. The individual region boundaries including ST, SM, SV, CA, SL, basilar membrane (BM), and organ of Corti (OC) are segmented in these images by an expert human observer based on the definition and visual recognition of the tissue borders. The image stacks are provided with metadata in TIFF format as part of the atlas, along with outlines of all the segmented regions. The color-coded segmented regions (colored outlines) of the main cochlear compartments and structures are extracted and converted into individual 2-D label maps. The label maps and the image stacks are converted to NIFTI (Neuroimaging Informatics Technology Initiative (NIFTI)) format to form individual 3-D label maps as shown in Fig. 3, and a 3-D atlas, each having 5:435 Â5:435 Â 20 mm non-isotropic resolution.

mCT calibration for contrast agent concentration
The values associated with voxels in a mCT scan are in Hounsfield units (HU), which is a linearly transformed version of linear attenuation coefficients, shifted and scaled so that the radiodensities of distilled water and air at standard temperature and pressure are defined to be 0 HU and À1000 HU, respectively. To transform from HU to units of concentration (mg/ml), we carried out a calibration procedure similar to that defined in (Haghpanahi et al., 2013). We placed seven pipet tips filled with contrast agents at 50, 100, 150, 200, 240, 370 (mg/ml) Iopamidol in a custom 3-D printed holder along with a pipet tip filled with saline (reference solution for 0 mg/ml). These are then imaged in the same mCT scanner using settings identical to those used during the animal experiments. After the images are obtained, the procedure to fit the HU values to the concentration in mg/ml, and finding sensitivity cutoff is adopted from our previous work in (Haghpanahi et al., 2013). The average HU value for the entire volume of each pipet are plotted against the Iopamidol concentrations, and a linear curve is fit to the data points. An R 2 value of 0.9968 (coefficient of determination) is obtained, indicating a strong linear fit between HU values and concentration, with the conversion defined by C ¼ ðHU þ 53:1569Þ=29:4133 where C is the contrast agent concentration in mg/ml and HU are Hounsfield units from the mCT scan. A sensitivity cutoff of 2.655 mg/ml was obtained using the standard deviation of HU values of each pipet using the methods described in (Haghpanahi et al., 2013). The sensitivity cutoff is the value below which Iopamidol concentrations cannot be resolved above the noise. Average of concentrations taken in a circular window in the calibration scans with diameters ranging from 1 to 15 showed that the variation in concentration decreased below the sensitivity cutoff value when a 7 diameter or above circular window was chosen. To prevent the averaging window from including bone at the scala walls, we chose to use the minimum 7 voxel diameter circular window as ROI for concentration extraction.

Image pre-processing
Images of the entire head of each mouse are natively obtained in DICOM format (Digital Imaging and Communications in Medicine (NEMA PS3/ISO 12052)), with each slice being an individual image. Since the atlas and the scans are in different formats, the conversion to NIFTI achieves a uniform format for further processing steps. To identify the cochlear region within the whole head scan, we visually identify the basal and apical turns of the cochlea, localizing a region of interest (ROI) with 15e25 voxel buffer space around the ipsilateral cochlea to accommodate for motion caused by breathing of the animal. However, for scans with significant motion artifacts, the ROI needs to be selected individually for each of the 6 scans. This is done using ITK-SNAP software (Yushkevich et al., 2006), which enables manual selection of the starting points and size of the cuboid ROI. Even though the ROI selection process potentially can be automated by looking for the geometric features of the cochlea, we noticed that some features become ambiguous after infusion of contrast agent. Manual extraction in such cases is more time efficient and also provides accurate ROI adjustment for motion artifacts caused by animal's respiration. In order to match the resolution of the atlas with that of the mCT scans, we resample the atlas and the label maps to have isotropic 16:38 Â16:38 Â 16:38 mm resolution. The atlas image is resampled by linear interpolation, and the label maps are resampled by nearest-neighbor interpolation. The atlas is chosen over the mCT scans for resampling due to its higher resolution, and to avoid introducing artifacts into the mCT scans that can alter the contrast agent concentrations to be extracted.

3-D registration and segmentation
To solve the problem of segmenting the three major scalae in the cochlea, an atlas based registration approach is implemented in Cþþ using the Insight Toolkit library (Yoo et al., 2002). In this approach, the cochlear atlas is geometrically transformed using a cascade of transformations so that it optimally aligns with the ROI extracted from the current mCT baseline scan. In general, registration algorithms find a transformation t, such that the transformed moving image I m (the atlas) is spatially similar to the fixed image I f (mCT). These algorithms require defining (i) a class T of valid geometric transformations from which t can be drawn, (ii) a cost function F ðI m ðtÞ; I f Þ that measures the dissimilarity between the transformed moving image I m ðtÞ and the fixed image I f , and (iii) an optimization procedure for minimizing the cost function over the class of valid geometric transformations. Given these components as seen in Fig. 4, registration can be thought of as solving the following minimization problem: We consider three different classes of geometric transformations: similarity transformations, affine transformations, and B-spline free-form deformations. Similarity transformations (Fitzpatrick et al., 2000) allow arbitrary translations, rotations, and isotropic scaling of the moving image. Affine transformations (Maintz and Viergever, 1998) allow translations, rotations, different scaling along each dimension, and shear factors between pairs of dimensions. B-spline free-form deformations (Rueckert et al., 1999) enable smooth local changes in the moving image and are parameterized by a grid of control points that are allowed to undergo independent displacements. These transformations are successive subsets of one another, and they have increasing complexity. In three dimensions, similarity transformations have seven parameters, affine transformations have twelve, and B-spline free-form deformations have M 3 parameters, where M is the number of control points (uniform grid). Since the solutions to registration using simpler transformations can be used to provide initial estimates for registration using more complicated transformations, we progressively perform registration using each of these three geometric transformations, as shown in Fig. 5. Combining the results of all the three stages allows us to write the final transformation b t as the composition of the transformations found from similarity, affine, and B-spline based registration: For the cost function, we employ the negative correlation coefficient between the image data; i.e.

F
I m ðtÞ; where I fi is the i th voxel of fixed image, I mi is the i th voxel of moving image, and N is the number of voxels in the region of overlap between and I m and I f . Simpler cost functions for single modality registration, such as the sum of squared differences or sum of absolute differences, may have difficulties in regions of high concentration of the contrast agent. More complicated cost functions for multiple modality registration, such as mutual information or normalized mutual information, do not explicitly account for the correlations in the moving and fixed images that occur in this application.
For the optimization procedure, we use Regular Step Gradient Descent (RSGD) (Shor, 1970) for minimizing the cost function with respect to each class of geometric transformation. RSGD is based on gradient descent but incorporates an extra initial hyperparameter known as the relaxation factor. The relaxation factor decreases the step lengths of the transformation parameters (the amount of change in each iteration) every time a gradient change occurs in the direction of the minima. We define the criterion for terminating the optimization procedure to be when the step length between two consequent iterations is less than or equal to 10 À6 .
Using this final transformation b t, the label maps are transferred into the geometry of the mCT scans. Although we restrict our analysis to the three main scalae and SL, b t can be applied to any number of regions that exist in the atlas. Finally, to account for minor motion artifacts in subsequent scans, we apply a transformation to the B-spline transformation result, identical to similarity transformation, but restrict it to only rigid translations, simulating movements that might be caused by animal respiration.

Concentration extraction
Given the segmented volumetric representations of the scalae (label maps), the medial axis for each scala is extracted using morphological thinning (Lee et al., 1994;Wagermaier et al., 2017) followed by pruning (removing unwanted branches of the medial axis) (Wilson, 2018). Morphological thinning relies on the concept of connected components, where every voxel in the label map is considered to be connected to every other voxels in 3-D by 26 connections (3 Â3 Â 3 grid). When the voxel is at the surface of the label map, it will have fewer than 26 connections. Such surface Fig. 4. Illustration of the registration process which takes a fixed (mCT) and moving (atlas) image as input, and calculates a metric (cost function) between them to optimize the transformation. The optimizer stores the final transformation parameters that are used by the moving interpolator to fill in values at non-voxel locations. Fixed transform, fixed interpolator and virtual image are temporary intermediate stages created to help with optional pre-transformation, interpolation necessary for pre-transformation, and to carry out computations in a physical grid apart from the fixed image respectively. voxels are removed iteratively towards the center of the label map, retaining the end point voxels. This is repeated till all the surface points are removed, thinning it in the process until only a 1e2 voxel thick axis is left. Due to the uneven nature of anatomical volumes, the thinning might assume multiple end points and create erroneous random branches from the central axis that need to be pruned as shown in Fig. 6. This pruning is done by manually fixing the start and end point of the medial axis, and iteratively eliminating any terminating point which is not part of the medial axis. The start point (zero point) is visually selected to be the bifurcation point of the round window membrane concave region in ST. For SV, a bifurcation is seen in the medial axis at the region where the vestibule starts, which is picked as the start point. A plane normal to the tangent at the start point for SV, cuts the SM medial axis, which is picked as the zero point in SM. The ending points of all three scalae are chosen to be the final point at the apex in the medial axis that can be visually discerned.
The final medial axis is defined in terms of a discrete set of voxels. In order to extract a robust estimate of the concentration from the scalae, we define the concentration at each medial axis location as the average of concentrations in a plane through that medial axis location and normal to the medial axis. Normal planes are obtained by defining a differentiable curve through the discrete set of voxels on the medial axis and by identifying the normal and binormal vectors to the curve. Cubic splines provide a mechanism for defining such a differentiable curve; a series of three cubic splines are fit to the three components of the medial axis, and then the cubic splines are reparametrized in terms of arc length. This reparameterization serves two purposes: the continuous curve can be used to divide the scala equally, and tangent, normal, and binormal vectors can be obtained by first order differentials at evenly-spaced points along the medial axis. Planes spanning the normal tangent and bi-normal vectors are obtained by interpolating intensity values from the scan along these vectors. Average voxel intensities are extracted in a 7-voxel diameter circle about each re-parameterized axis point, followed by conversion of average intensity to concentrations in mg/ml using the linear conversion defined in Section 3.2. The concentration profile extracted from the baseline scan is subtracted from infusion scans concentration profiles to account for any values not representing Iopamidol concentrations. This procedure is repeated for the three main scalae to obtain the concentration profiles required for modeling pharmacokinetics.

Pharmacokinetic modeling
The spatio-temporal concentration profiles that are extracted for the ST, SV, and SM can be used to estimate diffusion and transport parameters of a pharmacokinetic model describing transport and clearance of drugs in the murine cochlea. The model is based on the reduced 1-D partial differential equations in Fig. 5. Overview of the preprocessing and registration stages -Given the original DICOM images, the region of interest (ROI) for the infused cochlea is extracted from each scan and stored in NIFTI format. A Similarity and Affine registration is done against the resampled Atlas images (Schreiber et al., 2010), followed by a 3-D deformable B-spline registration. The affine registration ensures that we have a good atlas starting point for the B-spline registration. The segments are then separated into ST, SV, SM and SL, and because the SL is only 2 to 8 voxel thick, it is omitted from further processing. Given the segmented regions, the medial axis is calculated and re-parameterized in terms of arclength. Intensities (Hounsfield units) from 7 voxel diameter discs are averaged at every 1% point along the medical axis for each scalae in each scan. Intensities are converted to concentration (mg/ml) based on a pre-determined calibration curve. Fig. 6. Example of medial axis extraction from an ST label map. A is the label map obtained from the registration and segmentation process. B is the label map after morphological thinning is applied. The thinned label map still contains stray branches (red arrow) off of the medial axis that are removed in the subsequent pruning process, which yields the right-most label map. It can be observed that the round window membrane (RWM) branches are missing from the final medial axis owing to the fact that we consider zero-point of ST for concentration extraction after the round window branch, near the cochleostomy site. C represents the resulting medial axis. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.) (Plontke et al., 2007), with significant modifications to the equations and the parameters used. The main differences to our model are the inclusion of concentration based diffusion coefficient, considering advective flow from cochleostomy site to the cochlear aqueduct, calculating leakage out of cochleostomy site as a function of time, transport only between ST-SM and SM-SV, and with clearance directly out of SM. Fig. 2 provides a graphical view of the kinetics considered by our model within the cochlea. In this section, we describe the pharmacokinetics of drug diffusion and transport within and across the ST, SM, and SV with the help of a compartment model, and then we show how the spatio-temporal concentration profiles extracted from the mCT scans can be used to solve the inverse problem of estimating diffusion and transport parameters.

Drug transport and diffusion
As represented in Fig. 2, our model relies on three key assumptions: there is an advective flow component from the cochleostomy site towards the cochlear aqueduct, diffusion mediated transport is responsible for transport from base to apex, and there is transport between scalae and to clearance. SL primarily overlaps with SM in the mouse cochlea, so clearance in the model is restricted to transport out of SM considering SL as a membrane. The model can be represented as compartments as shown in Fig. 7 and the equations can be generalized based on diffusion coefficient within the compartment, and transport coefficients between the compartments.
Consider three compartments representing ST, SM, and SV with concentrations C ST , C SM and C SV that vary along the length of the scalae. The advective flow from cochleostomy site to CA in ST is defined by time dependent flow rate F. The transport between ST-SM and SM-SV are defined by K STÀSM and K SMÀSV , and by K SMÀclearance between SM and clearance. The concentration within SL, representing clearance, is held at 0. The detailed diffusion equation as shown in supplementary material (eq. (B.2)) can be simplified to explain the kinetics in each compartment as shown below.
ST : CO to CA : ST : CO to apex : The diffusion coefficient D is assumed to be the same for all the three compartments. For a solute molecule with molecular weight MW and viscosity of the solution h, the theoretical diffusion coefficient is defined as: where, k B ; T; r H ; r and N 0 denote Boltzmann constant, temperature in Kelvin, hydrodynamic radius, density of solute molecule, and Avogadro's number. The hydrodynamic radius in particular is important here as it can be used as a first order scaling of transfer coefficients to any new molecule as shown later in section 6. Viscosity was observed to change with concentration of the Iopamidol solution used in our experiments, consistent with empirical measurements in other pharmacokinetics work (Bowen and Williams, 2001;Li et al., 2014). However, there is no closed form equation between viscosity and concentration that can be integrated into eq. (8). The relation for concentration dependence was provided by (Fujita, 1952), wherein Fick's second law is modified using the Einstein-Stokes equation (Edward, 1970). This relation to calculate the concentration dependent diffusion coefficient is shown below.

Dðx; tÞ
where D 0 is the theoretical diffusion coefficient (eq. (8)) at zero concentration and with h presented as the zero concentration solution viscosity. g is a learned parameter characterizing the concentration dependent decay in the diffusion coefficient. This decline in diffusion coefficient with increasing concentration has been empirically measured for Iopamidol solutions from 0.08M to 1M by Fontanive (2011). For implementation, K's and gamma are initialized to a random value between 0 and 1, and concentrations along length in all three scalae are initialized to 0 at time zero. The concentration at every time step can now be calculated using these initial estimates. These calculated concentrations are later used to optimize the randomly initialized parameters as shown in section 4.2. Each scan is an aggregate of 20 min of scan time, making the concentrations profiles obtained from each scan a temporal average of 20 min. For practical implementation, the time step was defined to be Dt ¼ 0:1sec, making it possible to simulate a concentration profile for every Dt and average those for six 20 min intervals to replicate the scanning process. Therefore, the equation to average concentrations over I total time steps is shown below.
where, CðnÞ is the estimated concentration averaged over 20 min at n th point on the scala, so that it can be compared to the corresponding concentration extracted from the scans.

Estimating diffusion and transport parameters
To estimate contact area between scalae, morphological dilation is used to find the region of intersection between different compartments. Morphological dilation expands the label maps until the region of intersection is obtained between them. This region of intersection is treated as the area of contact. To this end, the label map of the semipermeable basilar membrane is used for contact area between ST-SM. In the absence of Reissner's membrane from the atlas, we make use of the SM and SV label maps to obtain contact area between them. For the transport from SM-clearance, the intersection are between SM-SL is calculated similarly using morphological dilation. Eqs. (5)e(7), take the area of contact AðxÞ, volume VðxÞ and cross sectional area SðxÞ as fixed quantities derived from the native atlas as shown in eq. (B.2). However, the transport coefficients K, concentration dependent diffusion parameter g, and D 0 are initialized to random values between 0 and 1, and iteratively learnt using optimization.
Using the molecular weight of Iopamidol (MW ¼ 777.08 g/mol), and density r of the solute (2.204 kg/l), D 0 can be theoretically calculated from eq. (8). However, the viscosity h of solution that Iopamidol would be diffusing into, which in our experiments is perilymph, is unknown and is substituted with that of water (0.68 cP at 311K). This brings the calculated value of D 0 ¼ 6:7 Â 10 À4 mm 2 =s. However, in order to be precise with respect to perilymph in the mouse cochlea, D 0 is also learnt with the remaining transport parameters and is comparable to the calculated value. Gradient descent is chosen as the optimizer for the learning process. It uses the gradient of the cost function (first order difference between current and previous loss values), to update the initialized parameters. The update is controlled by a hyperparameter known as the learning rate (a) that can speed up or slow down the update over iterations. As described in the flowchart of Fig. 8, the model takes in the initialized parameters and calculates the concentration profiles as output for every time step using equation (eq. (B.8)) in supplementary material. These profiles are then averaged for 20 min time steps using eq. (11). These are compared with their respective time points' extracted concentrations and the loss is calculated using mean squared loss cost function.
where, C sim ðnÞ is the averaged simulated concentration, C scan ðnÞ is the extracted concentration from the respective time point and corresponding position. The maximum intensity value at the tip of the inserted cannula at the cochleostomy site (C maximum ) is used to normalize the cost. The gradient (current iteration loss: L , previous iteration loss: L À1 ) considering D 0 as an example, is given by: As shown in Fig. 8, the gradient for each parameter is multiplied by their respective learning rate, and subtracted from the previous parameter value to get the new parameter value for simulation in the forward model. The learning rate is initialized to low values in the range of 10 À2 to 10 À3 and they remain unchanged till the end of optimization. The optimization continues until the change in loss is less than 0.01%.
The transport parameters are initially learnt by keeping a constant flow rate profile with respect to time F in eq. (4). Since the estimated ST volume between the cochleostomy site and the cochlear aqueduct is only 35 nl in the mouse, this region should be saturated almost immediately given the 32 nl/min flow rate. We observed that in some experiments, the peak concentration was reached in 70e100 min (3rd to 4th time point after Iopamidol infusion), suggesting a leak through the cochleostomy site that slowly sealed during the course of the experiment. To estimate the flow rate including leakage over time, we simulated the concentration exclusively at the infusion point, with the flow rate initialized to 32 nl/min. The gradient of the error was calculated by comparing these simulated concentrations with the extracted concentrations at the infusion point. The flow rate at each time point was optimized using the gradient, and a final flow rate profile was obtained by interpolating between the time points. The transport parameters are then re-learned using the new flow rate profile that accounts for time-varying leakage at the cochleostomy site.

Results
This work involved 17 animal experiments, out of which all the experiments were used to test and refine the registration and segmentation algorithms. Iopamidol infusions from three experiments were used for determination of transport parameters, and for development and testing of the pharmacokinetics model. Fig. 9 shows the results of registration and segmentation of the cochlea. Results are displayed as the difference between stages of registration in three orthogonal views. For visual assistance, the difference has been scaled between 0 (100% overlap) and 1 (0% overlap). The intensity values inside the scala are different for the atlas and the scan owing to their different imaging modalities. Registration result are quantified using the Jaccard Similarity Index (Jaccard, 1901) which calculates the ratio of intersection of segmented region with ground truth to their union. We take the Jaccard similarity index of selected manually labeled label map slices with the equivalent slices of registered label map. The overlap after the affine stage was calculated to be 58.28%, which improved to 79.61% after B-spline stage as seen in Fig. 9. The region outside the cochlea is empty in the atlas, whereas it is filled with various tissues in the scans resulting in lower Jaccard index. Moreover, due to variation in cochlea of animal used in atlas compared to animals used in our experiment, there will be inaccuracies in the registration result. Table 1 details the transport parameters extracted for each of the three different experiments. The concentration dependent diffusion coefficient as shown in Fig. 10 has been normalized by the extracted diffusion coefficient at zero concentration (D 0 ¼ 6:226 Â 10 À4 mm 2 =s). The form of this decay is similar to that empirically measured by Fontanive (2011), however their more rapid decay was attributed to aggregation of Iopamidol molecules at high concentrations, with an estimated average of 2.3 molecules per aggregate at a concentration of 0.8M. Although the Iopamidol molecules in our solution do not aggregate, the change in diffusion coefficient can be credited to increased interaction between Iopamidol molecules at higher concentrations. The black line shows the maximum value of concentration delivered in our experiments, so all the values of the diffusion coefficients used by the model lie to the left of this line. Fig. 11 shows the extracted and simulated concentrations, and the error between them (extracted -simulated concentration). The concentrations are normalized to the tube tip concentration and displayed as a percentage of the length of the scala. The mean value of the learnt parameters over three experiments is used in the forward simulations. Concentration profiles are labeled 10, 40, 70, 100, 130 min, to approximate the temporal mid-point of each scan. Simulated results are the average results over 20 min windows centered on the labeled time to approximate the experimental paradigm. The concentration peaks do not align across scala in the plots of Fig. 11 primarily due to length normalization, with a smaller contribution due to zero-point selection. Visualization of the location of peak concentration within each scala in the 3D cochlear model revealed the peaks do indeed align. The Euclidean distance between medial axis peak locations were: 0.45 mm (SV-SM), 0.47 mm (ST-SM), 0.75 mm (ST-SV).

Comparison to FluidSim
FluidSim is the gold standard in inner ear pharmacokinetics forward simulation software published by Alec N. Salt (2002). The   software considers a single fixed diffusion coefficient calculated using molecular weight of the solute molecule. It uses half-times (t 12 ) to represent inter-scala transport instead of transport coefficients. In order to convert our transfer coefficients (K) to halftimes, a relation provided in (Salt et al., 1991a(Salt et al., , 1991b is used. Given a small cross section of volume V at the base of scala, with corresponding area of contact with the adjacent scala A, then the relation between t 1=2 and K is given by: The cochlear aqueduct location is aligned in both models and presented as the zero length point in ST. The zero points of SM and SV in FluidSim were determined qualitatively by calculating the distance in mm from apex to zero point in LP model, and zero point in FluidSim was fixed at the same distance. We used the average values of the learned transport parameters for conversion to half times to use in FluidSim. Diffusion coefficient D, in FluidSim was set to 5:9 Â 10 À4 mm 2 =s based on molecular weight of Iopamidol. Flu-idSim does not allow transport between ST and SM directly, rather it incorporates transport from ST to organ of Corti (OC), and then to SM. To account for these differences in the model structures, the OC-ST half time is set to 0.5 min so that transport between ST to OC is fast. The FluidSim OC-SM half time is equated to our ST-SM transport coefficient to approximate equivalent transport. For clearance, SL-SM half time was equated to our SM-clearance transport, and clearance half time out of SL was also set to 0.5 min. Within FluidSim, all half times are provided for the 0% point of the scala, with the half times scaled along the length based on the cross-sectional area of the scala from which transport occurs. The simulated concentration are shown in Fig. 12. The values used for the simulation are presented in Table C1 in the supplementary material.

Pulsatile delivery
Another application of the LP model is exploration of different drug delivery paradigms to optimize concentrations within the cochlea. This is important to maintain drugs within a therapeutic window avoiding high concentrations that may be toxic, while maintaining concentrations that provide a therapeutic benefit. The LP model is structured to provide temporal control of flow rate for a delivered compound. To explore the impact of pulsatile flow on concentration profiles and total drug delivered, a range of on-off times from Ton ¼ 1 mine10 min and Toff ¼ 1 mine19 min at a fixed flow rate of 8 nl/min were evaluated in forward simulations for a basal turn ST cochleostomy infusion. A notional therapeutic window was arbitrarily defined between 20% and 80% of the infused drug concentration for illustration purposes. One result is presented in Fig. 13 where the infusion was pulsed on for 1 min followed by an off period with zero flow for 3 min. The results are compared to continuous flow at 8 nl/min demonstrating the ability to significantly reduced the high basal concentrations associated with continuous delivery while minimally impacting apical concentrations. This approach enabled a 75% reduction in total drug delivered while keeping concentrations within the notional therapeutic window.

Discussion
This method of using mCT imaging to measure concentration within the cochlea is effective in providing reliable spatio-temporal data of the major scalae. This is a significant advantage over invasive methods such as ion selective electrodes and direct sampling, and enables quantification in the challenging murine system where cochlear dimension preclude use of established techniques practiced in larger rodents. This technique enables simultaneous measurement of the concentration evolution over time, addressing a primary limitation of sampling methods. These concentrations can be determined within all major scala, including scala media which is a major advancement over current state of the art. Infusion of a hypertonic solution creates the potential for osmotic water entry into the cochlea at the cochleostomy site. However, any osmotic entry is not significant in this preparation due to the positive pressure within the mouse cochlea coupled with the pressure of advective flow from the cochleostomy site to the cochlear aqueduct. While our model does include a time-dependent leakage rate out of the cochleostomy site, we had several experiments with essentially zero leakage where the concentration at the basal turn of ST rapidly achieved the infused concentration. This suggests that osmotic water entry with dilution of the Iopamidol at the injection site was not a factor in these experiments. Additionally, a concentration dependent diffusion parameter was determined which accounts for interaction between solute molecules at high concentrations that are present near the infusion site. The forward model incorporates this parameter resulting in reduced diffusion along the length of the scala for regions of high concentration. This effect although used in many physical applications, has not previously been evaluated within the context of inner ear drug delivery.
The 3-D image registration enabled high spatial resolution extraction of concentration which is a significant improvement over past 2-D work. While we do not have intrusion of cochlear bone in our segmented label maps, registration errors can occur due to inter-animal anatomic variability in the cochlea shape. These errors can be mitigated by creating statistical shape models to account for the variations, or by using atlases derived from multiple specimens. Statistical shape models (SSM) take advantage of the statistical differences in the shape of scalae derived from the segmented label maps. One approach to creating SSMs would involve initializing a shape model using principal component analysis (PCA) on a small percentage (10e15%) of label maps from baseline scans, and iteratively update it using the label maps from the rest of the baseline scans. This iterative method allows for incremental addition of shapes from new experiments, and can be scaled to multiple compartments of the cochlea.
The pharmacokinetic model is based only on the three main scalae ST, SM and SV. This was owing to the fact that extraction of concentration requires a circular averaging window 7 voxels diameter (minimum) as discussed in section 3.5, and small cavities like spiral ligament or organ of Corti of the mouse do not have enough voxels to fit this criterion. SL in particular was only 2 to 8 voxels wide, inhibiting continuous concentration extraction. Moreover, due to inaccuracies in the deformable registration step, registered label map of SL might include bone at the boundary towards the cochlear wall. Using a larger animal model should mitigate this problem, providing additional regions that can be included in the pharmacokinetic model.
The learnt parameters vary slightly in different experiments owing to experimental variations, and dissimilarity in anatomy of different animals. However, the mean parameters obtained from the three experiments provide sufficiently accurate simulations of concentrations as seen by the error plots in Fig. 11. The concentration profiles obtained with the cochleostomy infusions exhibit basal concentrations close to the infused concentration, with a significant drop in concentration from base to apex. Even with longer infusion times, higher concentration at the apical end will be limited by clearance mechanisms. This is a well-known limitation of inner ear drug delivery with either cochleostomy or round window delivery. Forward model simulation results suggest pulsatile delivery approaches may reduce peak basal concentrations with minimal impact to apical concentrations with the potential to enhance therapeutic benefit of delivered agents. This pulsatile delivery also dramatically reduced the total volume of drug delivered, with implications to required reservoir size, total power required for implanted system, and potential toxicity as excess drug is cleared to blood and the subarachnoid space.
Comparison of our simulations to that from FluidSim utilized the extracted transport parameters from our imaging experiments, converted to half-times for entry into the software package. The FluidSim model has a more complex anatomical description of the cochlea, requiring some half time parameters to be minimized to enable comparison between models. The simulated concentration profile differences between the two models were within 16%, with the LP model providing a steeper decrease in concentration in ST from base to apex, and a higher peak concentration in SM. One key difference between the two models is scaling of transport parameters based on anatomical dimensions of the cochlear spaces. In FluidSim, half times are scaled based on the cross-sectional area of the scala from which the solute is diffusing. In the LP model, transport coefficients (Ks) are scaled based on this same crosssection, but also the contact area between the scalae. For example, the contact area between ST and SM increases from base to apex as the Basilar membrane widens. This will increase transport out of ST into SM and may account for the lower apical ST concentrations observed in the LP model as compared to FluidSim. It is also possible the more complex anatomical representation of FluidSim contributed to differences between model results. However, the match between models provides validation of inner ear pharmacokinetics developed using very different approaches to measurement of intracochlear concentrations and determination of transport parameters.
Direct quantification of agents infused into the murine cochlea has only been accomplished using mCT, first with 2D analysis at 5 points along the length of ST and SV (Haghpanahi et al., 2013), and here with 3D analysis and 100 points along the length of ST, SV, and SM. In both cases, significant basal to apical concentration gradients were measured, with the highest concentrations in the basal turn of ST. These gradients persist even after two hours of infusion. Forward simulations to steady state reveal that such gradients are inherent to this method of intracochlear drug delivery with diffusion mediated transport from base to apex, and interscala transport and clearance limiting apical drug concentrations. These results are also consistent with indirect measures of concentration based on auditory measures with cochleostomy based infusion of ototoxic agents to the murine cochlea where the greatest auditory shifts are observed for high frequencies, close to the infusion site (Borkholder et al., 2010) (Chen et al., 2006). Similar results have been achieved in larger rodents with cochleostomy based infusion of ototoxic agents to the guinea pig cochlea (Chen et al., 2005). Interestingly, responses were measured out to 8 kHz, with robust responses at 16 kHz and 32 kHz which are approximately 4.2 mm and 1 mm from the base of ST based on a tonotopic map (Robertson and Johnstone, 1979). This guinea pig 16 kHz response is at a distance equivalent to the most apical point in the mouse ST (Thorne et al., 1999). This could be due to different pharmacokinetic transport within the guinea pig cochlea, or it could be differences in diffusion coefficient, interscala transport, and clearance between the Iopamidol used in the present work, and DNQX used with the guinea pig experiments (Chen et al., 2005). A key limitation of the indirect measures of intracochlear concentration is they provide only a semi-quantitative estimation of general concentration along the length of the cochlea, and provide no information on concentrations within each scala and differences between them. Perilymph sampling following injection of tracer agents provides a method of quantifying perilymph concentrations in both ST and SV. This technique has been used in the guinea pig with injection of fluorescein isothiocyanate (FITC)-dextran-containing solution at a basal turn ST cochleostomy, with subsequent sampling of perilymph from the lateral semicircular canal after a delay of 120 min (Salt et al., 2015). These experiments support lower concentrations in SV than in ST as revealed in the present work. Overall, concentrations gradients from base to apex are higher in ST than SV and SM, with the highest concentrations achieved at the basal turn of ST as expected from prior studies. For applications requiring infusion of molecules other than Iopamidol, D 0 and K can be scaled to the molecule under consideration. The theoretical diffusion coefficient D 02 can be calculated for a new solute molecule (2nd molecule) given the value of D 01 for Iopamidol (1st molecule), by using eqs. (8) where, molecular weight and density of the 1st molecule (MW 1 ;r 1 ), and the 2nd molecule (MW 2 ; r 2 ) are known quantities. Similarly, transport coefficients K are scaled using the same relationship as eq. (15) as shown below.
where K 2 , and K 1 represent the transport coefficients for the 2 nd and 1 st molecule respectively.
A key limitation of the presented work is that the molecules which can be quantified are limited to those exhibiting image contrast for mCT imaging. There are a wide range of triiodo compounds that can be imaged, however translation of pharmacokinetic parameters to other compounds with therapeutic benefit is currently limited to the proposed scaling of D 0 and K based on molecular weight and density of the solute as shown in eqs. (15) and (16). However, this scaling does not consider the impact of molecular properties on transport through biological membranes. Salt and Plontke (2018) proposed incorporation of lipophilicity and topological polar surface area (TPSA) into cochlear pharmacokinetics to enhance translation to therapeutic compounds. However, there are no known equations to enable direct incorporation into the K values determined via the LP model or sampling/computational techniques. One approach would be to leverage the mCT imaging techniques presented herein, with a range of triiodo compounds that span the lipophilicity/TPSA space. These parameters can be incorporated into the learning paradigm to provide advanced transport parameters for forward model simulations, where molecular characteristics are directly entered for an arbitrary compound. This remains a future direction of research to advance intracochlear pharmacokinetic models.

Conclusion
We have developed a novel pharmacokinetic model that learns the parameters describing transport of compounds within the mouse cochlea. A non-invasive mCT imaging technique in combination with contrast agent delivery via a basal turn ST cochleostomy was used to directly measure the spatio-temporal concentrations within all three major cochlear scalae: ST, SM, and SV. Techniques were developed to register a 3-D atlas to the mCT images to segment the main scalae and to obtain high spatial resolution concentration profiles. These profiles were used in a Learning-Prediction optimization paradigm to extract a concentration dependent diffusion coefficient and transport parameters between scalae and to capture clearance. Forward simulation results from our Learning-Prediction pharmacokinetic model indicate the efficacy of using mCT imaging techniques to characterize in vivo local drug delivery. Results compared favorably to FluidSim, validating both approaches for quantifying inner ear pharmacokinetics. Future work will extend the model to larger rodents and humans, improving the model to include other cochlear regions, and testing the model on various infusion compounds, to help facilitate novel inner ear drug delivery paradigms and systems for clinical trials.