Notch signaling and taxis mechanisms regulate early stage angiogenesis: A mathematical and computational model

During angiogenesis, new blood vessels sprout and grow from existing ones. This process plays a crucial role in organ development and repair, in wound healing and in numerous pathological processes such as cancer progression or diabetes. Here, we present a mathematical model of early stage angiogenesis that permits exploration of the relative importance of mechanical, chemical and cellular cues. Endothelial cells proliferate and move over an extracellular matrix by following external gradients of Vessel Endothelial Growth Factor, adhesion and stiffness, which are incorporated to a Cellular Potts model with a finite element description of elasticity. The dynamics of Notch signaling involving Delta-4 and Jagged-1 ligands determines tip cell selection and vessel branching. Through their production rates, competing Jagged-Notch and Delta-Notch dynamics determine the influence of lateral inhibition and lateral induction on the selection of cellular phenotypes, branching of blood vessels, anastomosis (fusion of blood vessels) and angiogenesis velocity. Anastomosis may be favored or impeded depending on the mechanical configuration of strain vectors in the ECM near tip cells. Numerical simulations demonstrate that increasing Jagged production results in pathological vasculatures with thinner and more abundant vessels, which can be compensated by augmenting the production of Delta ligands.

• functions_CUDA.cuh: declares all the functions used in the code executed in GPU (and shows in which .cpp,.cu or .cc le are dened).
• cpmfem.cpp: contains the main() function, which calls all other functions of the model.
• init.cpp: initializes pixels, nodes, cells and vessels structures at zero.
• init.cu: sets the initial distribution of cells in pixels, nodes, cells and vessels structures, impose external forces and constrains. This le also contains functions that copy from CPU to GPU or vice versa and then call functions in write.cpp to save output.
• read.cpp: loads input, for instance cell positions.
• write.cpp: saves output, such as cell positions and strains.
• FE_local.cpp: element stiness matrices (and using them to calculate stresses and strains).
• Fe_nodes2dofs.cu: some bookkeeping between the set of elasticity equations and nodal forces and displacements.
• branching.cu: performs the branching of a blood vessel.
• notchsignaling.cu: calculates number of proteins related to the signaling processes for each cell.
• proliferation.cu: splits in two the cells that meet some conditions.
• mt.cpp: contains the Mersenne twister algorithm for generation of pseudorandom numbers.
The structures and arrays used to store data of the model are the following: • VOX: structure with data related to pixels. Arrays contain (M − 1) 2 elements, where M is the number of nodes on a side of the square domain.
ctag: (int array) contains the id of occupying cell, 0 if no cell.
vtag (int array) contains the id of occupying vessel, 0 if no vessel.
• NOD: structure with data related to nodes. Arrays contain M 2 elements.
.fx, .fy: (oat arrays) contain x and y component of the force that it is exerted by cells in this node.
.ux, .uy: (oat arrays) contain x and y component of the displacement that it is suered by this node.
.restrictx, restricty: (boolean arrays) contain 1 if there is a nodal restrictions in x or y direction, 0 otherwise.
• CEL: structure with data related to cells. Arrays contain a maximum of 10 · (M − 1) elements, but it is only lled with non-zero values up to the number of cells at that moment.
.tip: (int array) position in the grid of the cell's pixel that is closer to the hypoxic area.
.tail: (int array) position in the grid of the cell's pixel that is further to the hypoxic area. .vess: (int array) id of the vessel that the cell belongs.
• NDJ: structure with the amount of cells proteins of signaling processes. Arrays contain a maximum of 10 · (M − 1) elements, but it is only lled with non-zero values up to the number of cells at that moment.
.N: (oat array) amount of Notch in the cell.
.D: (oat array) amount of Delta in the cell.
.J: (oat array) amount of Jagged in the cell.
.I: (oat array) amount of NICD in the cell.
.Vr: (oat array) amount of VEGFR2 in the cell.
.V: (oat array) amount of active VEGF in the cell.
• VES: structure with data related to vessels. Arrays contain a maximum of M − 1 elements, but it is only lled with non-zero values up to the number of vessels at that moment.
.tiptag: (int array) id of the tip cell of the vessel.
.tipvox: (int array) id of the pixel stored on .tip (CEL) of the tip cell.
.proltag: (int array) id of proliferating stalk cell of the vessel.
.branch: (int array) number of MCTS in which the new vessel has to maintain the direction or, after that, the id of the cell that could be a new tip cell of a new sprout.
.bx, by: (oat arrays) x and y coordinates of the branching direction during the number of MCTS that the vessel should follow it, 0.0 otherwise.
.ncell: (int array) number of cells in the vessel.
.parenttag: (int array) id of the vessel from which it branched, 0 if it is a initial sprout.
.ndescen: (int array) number of sprouts that branched o the vessel. These structures and arrays have a copy in the CPU and another copy in the GPU. The main modules of the code work with the copy in the GPU and the copy in the CPU is updated to generate the output les.

Initialization of random numbers and structures
The CUDA Random Number Generation library (cuRAND) and the Mersenne twister algorithm are used in this code. After the declaration of all the variables needed, the Mersenne twister algorithm is initialized.
After that, all structures dened above are initialize at zero in CPU and copied to GPU (functions in init.cpp le and init.cu). At this point, we have two options: , starts a new simulation. In function of the number of initial sprouts we have chosen, structures are modied to set new cells. Two kernels (kernel is function that runs on the GPU) are used to do that (kernels in init.cu). The rst one assign one thread to one pixel and modify CEL and VOX structures, placing equispaced cells of one pixel sized. The second one assign one thread to one initial sprout and modify VES structures, lling data of the vessel.
• init_MCTS = 0, i.e., continues other simulation. In this case, data from output les is loaded in the CPU copy of the structures(functions in read.cpp le and init.cu).

Output les with used parameters and initial structures
In order to have registered the used parameters and the initial state of the simulation, output les with .out extension are generated.
parameters.out is a le with the name, value and description of the parameters set in def.h.
The data of structures is stored in the following les, where X is the value of init_MCTS: • dvX.out: stores a matrix of size (number of initial sprouts) × 12. Each row is a vessel and each column is an item of the VES structure, ordered as follows: .tiptag, .tipvox, .proltag, .birth, .death, .isactive, .branch, .bx, .by, .ncell, .parenttag, .ndescen.

Arrangement of ECM strains and displacements
In this module, the necessary preparations are made to calculate ECM strains and displacements, given by elasticity problem (9) of our manuscript. The matrix K will remain xed throughout the MCTS, therefore it is calculated once before starting. K is a sparse matrix so the problem is needed to be solved with an appropriate method, like the Preconditioned Conjugate Gradient (PCG) method using ILU decomposition. The algorithm is provided by NVIDIA corporation and it is implemented on the GPU using CUBLAS and CUSPARSE libraries.
The following steps are performed in this module: 1. Set forces made by cells and forces in the boundary nodes. Set restrictions in the boundary nodes. These two actions are carried out in a single kernel that assign one thread to one node (kernel in init.cu).  [4]. However, we solved the system Ku = f using parallel computing, so that K matrix must be adapted. We have made this adaptation through a complex function that converts the output format of K from Van Oers et al. [4] code to CSR format. K is decomposed into three arrays:

Set element stiness matrices
• Kval is an array of size 10 · 2 · M · M . This array contains the non zero values of matrix K.
• Kcol is an array of size 10 · 2 · M · M . It contains the index of the column of corresponding nonzero value written in Kval.
• Krow is an array of size (e) Create the analysis info object for the K matrix.
(f) Perform the analysis for the Non-Transpose case.
(g) Copy K data to ILU0 vals as input.
(h) Generate the Incomplete LU factor H for the matrix K using cudsparseScsrilu0.
(i) Create info objects for the ILU0 preconditioner.

Arrangement of VEGF concentration
In this module, the necessary preparations are made to calculate the concentration of VEGF, given by initial-boundary value problem (5) -(7) of our manuscript. Firstly, we introduce the following scalings in order to nondimensionalize: and [G] are non-zero parameters. Substituting these new variables into the PDE, we obtain: Regarding to the cell binding term, [x] 2 leads to have two terms which are O(1), cell binding term and diusion term, and the PDE obtained is: The resulting factors are much less than one, so the solution of can be used to approximate the VEGF eld in our model. This equation can be solved with the Finite Dierence Method (FDM) using ve-point stencil. [2].
But, rst of all and in order to simplify the notation, dimensionless parameters and functions will be used without tildes. The notation ∇ 2 is more suitable for the laplacian operator; the symbol would lead to confusion in numerical work where x and y are used for grid spacing. To sum up, we obtain the following Poisson problem: . . . Let C ij represent an approximation to C(x i , y j , t), where (x i , y j ) have been described in domain section. Note that time variable is not been taken into account because the solution of the PDE of (0.1) on each MCS is not time-dependent. To discretize the PDE of (0.1), centered nite dierences have been used for x− and y−derivatives, which gives: • If i = 1 and: The above equations can be collected together into a matrix equation Ax = R. The coloured nodes of the gure 2 (b) are not considered since their values are already known. So, the equations to take into account are from (0.3) to (0.11). A is a matrix of size N 2 F D , with N F D = (M − 3) 2 that contains the factors that appear multiplying C ij terms, i.e. 0, 1 and −4. x is an array of size N F D that contains the values for C ij terms that satisfy the matrix form equation. The rst time we calculate x, it is initialized at zero, but in future Monte Carlo time steps we use the solution calculated in the previous MCTS. R is an array of size N F D and it is formed by the right -hand side (RHS) of the previous equations.
A is a sparse matrix so the system is needed to be solved with an appropriate method, like the Preconditioned Conjugate Gradient (PCG) method using ILU decomposition. The algorithm is provided by CUDA and uses a Compressed Sparse Row (CSR) format, so we have decomposed A into three arrays: • Aval is an array of size nz F D = 5N F D − 4 √ N F D . nz F D is the number of non zero values of A, i.e. ve -point nite dierence scheme minus number of frontier nodes per boundary times number of boundaries. This array contains the non zero values of matrix A.
• Acol is an array of size nz F D . It contains the index of the column of corresponding nonzero value writen in Aval.
• Arow is an array of size N F D + 1. It contains the cumulative number of non-zero values in rows of A, starting with 0 in the rst gap.
These three arrays, plus x and R, are needed to use the mentioned algorithm. After that, the x array is included in a bigger one V , which also contains the values of VEGF concentration at the boundaries.
The following steps are performed in this module: 6. Associate to each cell the value of VEGF concentration on the bottom left grid point of the pixel selected in .tip of CEL structure, i.e., update .vegf of CEL structure. This takes place in a kernel where is assigned one thread to one cell (kernel in chemotaxis.cu).

& 16. Output les with structures
These output les are used to save data about the simulation every certain MCTS and then to be able to visualize some data or make statistics. Files are generated with an .out extension; they store data of structures. The frequency at which these data are stored can be changed according to interest of the simulation, but normally it is every 50 MCTS. But before generating the les, the copy in the CPU is updated with the new data of the structures that are in the GPU.
The generated les in MCTS X are: 8. Cell source module The following steps are performed in this module (kernels in cellsource.cu): 1. For each initial tip, nd the pixel belonging to a cell that is closest to the point where the rst cell of this initial tip was place.
2. Kernel that places as many cells as they could be placed between at the left of the pixel found before and x = 0. The distance between this pixel and x = 0 must be, at least, one cell diameter (10 µm) to place one cell. This kernel assigns one thread to one pixel.
We only take into account this module in case (B) (cell elongation & cell overtaking) described in the manuscript. For more details about the this process, look up the main text.

Branching module
The following steps are performed in this module (kernels in branching. For more details about the branching process, look up the main text.

Cell proliferation module
The following steps are performed in this module (kernels in proliferation.cu): 1. Kernel that builds an array with the labels of the cell that can proliferate for each vessel. This kernel assigns one thread to one vessel.
2. Kernel that checks if cells in the built array meet certain requirements to proliferate. nprol is the number of cells that have met the requirements. This kernel assigns one thread to one vessel.
3. Kernel that prepares data arrays for K-means algorithm. This kernel runs nprol times.
4. K-means algorithm is used to form two groups of pixels of each cell. After that, every structure and contact_perimeter array are updated with the new data of these two cell that have formed through the groups of pixels. The two kernels involved run sequentially in a loop, nprol times, one for each cell.
For more details about the cells proliferation process, consult our manuscript.

VEGF concentration module
The following steps are performed in this module (kernels in chemotaxis.cu): 1. Kernel that calculates right-hand side R of the system Ax = R (see 6. Arrangement of VEGF concentration). This kernel assigns one thread to one node.
2. Solve Ax = R system using PCG method where x is the solution calculated in the last MCTS.
3. Kernel that includes the array x in V . This kernel assigns one thread to one node.
4. Kernel that associates to each cell the value of VEGF concentration on the bottom left grid point of the pixel selected in .tip of CEL structure, i.e., update .vegf of CEL structure. This kernel assigns one thread to one cell.

Signaling processes module
In this module, the nondimensionalized version of the system of Ordinary Dierential Equations (ODEs), Eqs. (14) -(19) of the main text, is numerically solved.
This system is solved with the explicit Euler method each 10 MCS, due to the diculty of parallelize this module. According to Boareto et al. [1], the role of tip and stalk cells could change every two hours, so that updating parameters values of the proteins involved each 0.44 hours is enough.
On each step of time, a system of six ODEs musts to be solved on each cell. This is done in a kernel that assigns one thread to one cell and calculates approximate solution with the explicit Euler method to the cell. Then, the approximate solution is incorporated to NS structure through a kernel that assign also one thread to one cell. The maximum number of steps has been chosen takes into account the convergence of the Euler method. Other numerical methods have been taken into account but have been discarded because they required more computational time and the error was approximately the same for the time step size we use.

ECM strains and displacements module
The following steps are performed in this module: 4. Kernel that updates types of cells and selects possible new tip cells of new sprouts. this kernel assigns one thread to one pixel.

Compiling and IDE
We have used the integrated development environment (IDE) Microsoft Visual Studio to edit the code, to compile multiple source les (including .cu les and CUDA libraries) and build the executable le.

Hardware and computation time
The computation time of each simulation in a computer with Intel(R) Core(TM) i7-7700K CPU @4.20 GHz processor, 64.0 GB RAM and NVIDIA GeForce GTX 1080 graphics card is about 4 hours.