Algorithmization and Simulation of the Chain-like Structures' Dynamics: Interrelations between Movement Characteristics

We describe a new approach to the algorithmization of the dynamics of chain-like structures. The main underlying idea of our approach is the sequentialization of the moves. The resulting algorithm enables efficient sampling of the vast state spaces related to the considered phenomena. In our simulation experiments the algorithm appeared to be much faster than other algorithms known from the literature. It therefore enables researchers to study complex models of such systems as biomolecules or artificial polymers. In this paper we perform a simulation study of the interrelations between various parameters specifying different movement characteristics. We also analyze the impact of these parameters on the time taken to cover a given distance along a given trajectory in the motion space.


INTRODUCTION
Chain-like structures (CLS) are common in nature, and thus they are widely encountered in biological, physical and technological processes. An important example of such processes is polymer transfer through a cell membrane, which is vital to virtually all living organisms [7]. From the physical point of view polymers are complex systems that can be studied by Monte Carlo simulation [11,9]. However, the complexity of a polymer's possible behaviour, along with the variety of features of the environment space, poses a real challenge for any algorithm used for this kind of study.
The principal aim of this presentation is to depict an innovative concept for the algorithmization of the CLS movement. The main underlying idea is to sequentialize each move of the CLS. This concept enables us to model, among other things, the propagation of the tension through the whole CLS structure. Our approach significantly reduces the search space consisting of the allowed (acceptable by nature) states. Thus, in our opinion, it is clear that the resulting algorithm is faster than those that have been presented so far in the literature [2,6,10,14]. Due to its increased efficiency, the new algorithm also enables the researcher to take into account various additional features of the CLS. In this paper we use the proposed algorithm in simulation studies of the interactions between a number of algorithm parameters which are used to model numerous movement characteristics of the CLS. These characteristics are primarily related to the elastic properties of the chain.
Below, in Section II, we define basic notions and state the principal assumptions which are necessary for the description of our algorithm. The algorithm itself is presented in Section III. Next, in Section IV, we describe a simulation framework that we use to study some important aspects of modeling the dynamics of CLSs. We focus on the relation between certain algorithm parameters and the speed of the CLS movement. Based on the results of the simulation experiments we build metamodels for this relationship. They are presented in Section V.

BASIC DEFINITIONS AND ASSUMPTIONS
In the formal description of the algorithm we use specific terminology, which we define below.
An abstract CLS position is a finite sequence $c = \{c_1, \dots, c_n\}$ of 2D points $c_i = \{x_i, y_i\}$ such that the distance $d$ between two successive elements is bounded by given limits $D_{min}$, $D_{max}$, i.e.
$$D_{min} \le d(c_i, c_{i+1}) \le D_{max}, \quad i = 1, \dots, n-1.$$
The elements of the sequence $c$ are called segments of the CLS.
The segment $c_n$ is called the head of the chain, while the element $c_1$ is called the tail. The number $n$ is called the length of the chain.
Assumption 1 (discretization of the motion space): The chain moves along the integer lattice nodes, i.e. the coordinates $\{x, y\}$ of the chain segments are integers.
This is a very common assumption in the literature [6,10,13]. To achieve a better approximation of the continuous motion space, one can assume that the abstract unit of length is equivalent to a given number of grid sides. The greater the number of grid sides per unit (GSPU), the better the approximation.
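The definition of a CLS position on the lattice can be illustrated with a minimal sketch. It assumes that the distance $d$ is Euclidean (the text does not fix the metric) and that the bounds are given in abstract units and converted to grid sides by multiplying by GSPU; the function names are hypothetical.

```python
import math

def bond_length(a, b):
    """Euclidean distance between two lattice nodes (assumed metric)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def is_valid_position(chain, d_min, d_max, gspu=1):
    """Check that every bond length lies in [d_min, d_max].

    chain: list of (x, y) integer lattice nodes, tail first, head last.
    d_min, d_max: bounds in abstract length units.
    gspu: grid sides per unit; the bounds are converted to grid sides.
    """
    lo, hi = d_min * gspu, d_max * gspu
    return all(lo <= bond_length(chain[i], chain[i + 1]) <= hi
               for i in range(len(chain) - 1))

# A chain of four segments on a lattice with GSPU = 2.
chain = [(0, 0), (2, 0), (4, 0), (6, 1)]
print(is_valid_position(chain, d_min=1, d_max=2, gspu=2))
```

With GSPU = 2, a bound of 1 unit corresponds to 2 grid sides, so a bond joining adjacent lattice nodes would be too short.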
Any pair of consecutive segments is called a bond. The structure of the CLS is defined by the mutual relations between all segments and bonds.
In the description of the algorithm we will use the notions of steps and moves. The former is connected with a single segment, whilst the latter is related to the whole CLS.
A step carries a given single segment from one lattice node to another. A single step randomly moves a given segment to one of its one-step reachable nodes (OSRN), i.e. nodes satisfying the condition: . A move of the CLS consists of steps made by consecutive segments as long as tension exists in the structure. Thus the move reflects the tension propagation. If the first step does not create any tension in the CLS structure, then the move may consist of a single step only.
Assumption 2 (sequentialization of the CLS movement): Every move of the CLS is initialized by only one segment. The tension propagation through the CLS can then be sequentialized into a sequence of steps.
The first-to-go segment (FTG) is the segment chosen by the algorithm to initialize each move of the CLS. The FTG is chosen randomly according to a given probability distribution defined on the CLS. This distribution will be denoted as FTGCD. The FTGCD may model various physical aspects of the CLS. Examples of problems where the FTGCD is not uniform are polyelectrolytes or, more generally, charged chains, where the Coulomb interaction as well as the influence of charge fluctuations lead to models with different choice probabilities reflecting the charge volume.
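The random choice of the FTG can be sketched as a weighted draw over segment indices. This is only an illustration: the function name and the end-biased weights are hypothetical, and the weights merely play the role of an (unnormalized) FTGCD.

```python
import random

def choose_ftg(weights, rng=random):
    """Draw the index of the first-to-go segment.

    weights: non-negative numbers, one per segment; they play the role
    of the FTGCD and need not be normalized.
    """
    indices = range(len(weights))
    return rng.choices(indices, weights=weights, k=1)[0]

# Uniform FTGCD on a chain of 5 segments.
uniform = [1.0] * 5
# Hypothetical non-uniform FTGCD: the end segments are chosen more often.
end_biased = [3.0, 1.0, 1.0, 1.0, 3.0]

rng = random.Random(0)
counts = [0] * 5
for _ in range(10_000):
    counts[choose_ftg(end_biased, rng)] += 1
print(counts)  # the first and last counts should dominate
```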
A movement trajectory is a sequence of consecutive CLS positions stored in a matrix $C$ whose $i$-th row is interpreted as the CLS position after $i-1$ moves (at moment $i$). Thus the symbol $c_{ij}$ denotes the position of segment $j$ in the CLS at moment $i$.
The final CLS position after the move is made may be influenced by various physical laws. In our algorithm these laws can be incorporated into the simulation model by a probability distribution defined on the OSRN set. This distribution defines the probabilities of different directions and/or lengths of each step. This step probability distribution may also depend on the segment's coordinates. It will be denoted by SPD.
Some features of the environment, as well as assumed properties of the CLS itself, may result in the existence of actually forbidden nodes (AFN). At any given moment and for each segment, the AFN form a subset of the OSRN which, obviously, may vary in time. For example, one such restriction may be a requirement that at most a given number of segments can be placed in a given node (e.g. the repton model [13], a self-avoidance restriction, etc.). Yet another example is the existence of other objects that occupy some nodes, e.g. a cellular membrane.
The set difference of OSRN and AFN will be called the set of actually accessible nodes (AAN): AAN = OSRN - AFN. The distribution SPD truncated to the set AAN will be called the actual step probability distribution and denoted as ASPD.
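The construction of the ASPD can be sketched as follows. The sketch assumes that "truncated" means restricting the SPD to the AAN and renormalizing the remaining probabilities so that they sum to one; the function name and the example SPD are hypothetical.

```python
def actual_step_distribution(spd, afn):
    """Truncate the SPD to the actually accessible nodes and renormalize.

    spd: dict mapping each OSRN node (x, y) to its step probability.
    afn: set of actually forbidden nodes.
    Returns a dict over AAN = OSRN - AFN whose values sum to 1,
    or an empty dict if no node is accessible.
    """
    aan = {node: p for node, p in spd.items() if node not in afn}
    total = sum(aan.values())
    if total == 0:
        return {}
    return {node: p / total for node, p in aan.items()}

# Hypothetical SPD on the four neighbours of (0, 0); the upward step
# is twice as likely as the downward one.
spd = {(0, 1): 0.4, (0, -1): 0.2, (1, 0): 0.2, (-1, 0): 0.2}
afn = {(1, 0)}  # e.g. a node occupied by an obstacle
aspd = actual_step_distribution(spd, afn)
print(aspd)
```

Renormalization preserves the ratios between the probabilities of the remaining nodes, so the upward step stays twice as likely as the downward one.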
In many real-world physical problems, such as polymer behaviour inside the tissue of a living organism, we should also take into account additional constraints related to the biochemical nature of the system under consideration. Thus we additionally define the cost connected with the polymer structure. The cost of the polymer structure and its location in the motion space is a function $F$ representing its fitness connected with its conformation and/or other external (e.g. environmental) properties. The lower the cost, the better the fitness of the polymer structure and position.
Assumption 3 (CLS position acceptance): The new position of the CLS is accepted with a probability depending on its cost.

THE ALGORITHM
The above assumptions and ideas are implemented in the algorithm for CLS movement simulation. Its block scheme is presented in Fig. 1.

Fig. 1 Block scheme of the algorithm for CLS movement simulation
In Step 0 (initialization) the space and movement parameters ($D_{min}$, $D_{max}$, $L_{max}$, GSPU) are set. At the same step the user also sets the initial chain position $c_{curr}$ and the program evaluates its current cost function value $F_C$. Depending on the simulation setup, the first position of the CLS may also be chosen at random. Next, in Step 1, the FTG is selected according to the given FTGCD. The chosen position is denoted $c_{curr,f}$.
The essence of our algorithm is Step 3: move completion. During this step successive steps of the remaining segments are made to obtain a new chain position $c_{new}$. The algorithm chooses sequentially segments $c_{new,i}$, $i = f-1, \dots, 1$, and randomly draws neighboring nodes for their next positions according to the OLPD and AFR. This process is terminated for the first $k$, $k = f-1, \dots, 1$, for which the following condition holds: $d(c_{curr,k}, c_{new,k+1}) < g$. If $k > 1$, then for $i = 1, \dots, k$ we assume $c_{new,i} = c_{curr,i}$. Next the algorithm chooses sequentially segments $c_i$, $i = f+1, \dots, n$, and draws at random their next positions according to the OLPD and AFR. This process is terminated for the first $k$, $k = f+1, \dots, n$, for which the following condition holds: $d(c_{curr,k}, c_{new,k-1}) < g$. If $k < n$, then for $i = k, \dots, n$ we assume $c_{new,i} = c_{curr,i}$.
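The move completion step can be sketched as below. This is an illustration under several assumptions that are not taken from the text: the distance is Euclidean, tension is taken to exist whenever a bond length falls outside $[D_{min}, D_{max}]$ (used here in place of the threshold $g$), $L_{max}$ is interpreted as the maximal length of a single step, and each step is drawn uniformly from the admissible nodes rather than from a specific step distribution. All function names are hypothetical.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two lattice nodes (assumed metric)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def candidates(node, anchor, d_min, d_max, l_max, forbidden):
    """Nodes within l_max of `node` whose bond to `anchor` is admissible."""
    r = int(l_max)
    out = []
    for dx in range(-r, r + 1):
        for dy in range(-r, r + 1):
            p = (node[0] + dx, node[1] + dy)
            if (dist(node, p) <= l_max
                    and d_min <= dist(p, anchor) <= d_max
                    and p not in forbidden):
                out.append(p)
    return out

def complete_move(curr, f, new_f, d_min, d_max, l_max,
                  forbidden=frozenset(), rng=random):
    """Propagate tension from segment f towards both ends of the chain.

    curr: current position (list of lattice nodes, 0-based indices).
    f: index of the FTG; new_f: the node it has stepped to.
    Returns the new position, or None if a segment has no admissible node.
    """
    new = list(curr)
    new[f] = new_f
    for direction in (-1, 1):          # first towards the tail, then the head
        k = f + direction
        while 0 <= k < len(curr):
            anchor = new[k - direction]  # neighbour that has already moved
            if d_min <= dist(curr[k], anchor) <= d_max:
                break                  # no tension: the rest stays in place
            options = candidates(curr[k], anchor, d_min, d_max,
                                 l_max, forbidden)
            if not options:
                return None
            new[k] = rng.choice(options)  # uniform draw (assumption)
            k += direction
    return new

curr = [(0, 0), (2, 0), (4, 0), (6, 0)]
print(complete_move(curr, f=3, new_f=(8, 0), d_min=2, d_max=3, l_max=3,
                    rng=random.Random(0)))
```

If the starting position satisfies the bond-length bounds, every position returned by this sketch satisfies them as well, because each moved segment is placed at an admissible distance from its already updated neighbour.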
In contrast to all algorithms previously reported in the literature, the proposed method of CLS position generation assures that the new position satisfies the most fundamental chain properties. Consequently, the cost function that is usually adopted to verify these properties would not reject any position generated by our approach. However, in many problems there are also additional properties and/or requirements that should be satisfied by the CLS position and/or structure. Thus, whether or not the new position can finally be accepted is verified in Step 4. In this step the cost $F_N$ of the new CLS position and the difference $\delta = F_N - F_C$ are computed. If $\delta < 0$ then $c_{new}$ is accepted.
Otherwise, the new position $c_{new}$ is accepted only if a random variable $U$ having the uniform probability distribution on the interval [0,1] satisfies $U \le A(\delta)$. If $c_{new}$ is accepted then $c_{curr}$ is replaced by $c_{new}$; otherwise $c_{curr}$ remains as it is. After the generation of the new CLS position the termination condition is verified. In a given simulation experiment this condition is closely related to the investigated problem, but usually it involves some requirements that must be satisfied by the CLS position, e.g. whether the CLS has reached a given set of states or whether it has passed through a pore in a membrane. If the termination condition is met the simulation is terminated; otherwise the algorithm returns to Step 1. After the simulation is terminated, the algorithm returns the output which, depending on the purpose of the simulation, may consist of the movement trajectory and its various statistical characteristics.
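The acceptance rule of Step 4 can be sketched as follows. The form of the acceptance function $A(\delta)$ is not specified in the text; the sketch assumes the Metropolis choice $A(\delta) = \exp(-\delta / T)$, where the temperature-like parameter $T$ is hypothetical.

```python
import math
import random

def accept(delta, temperature=1.0, rng=random):
    """Decide whether to accept a new position with cost change delta.

    A cost decrease is always accepted; otherwise the position is
    accepted when U <= A(delta), with A(delta) = exp(-delta / temperature)
    (an assumed, Metropolis-type acceptance function).
    """
    if delta < 0:
        return True
    u = rng.random()  # U ~ Uniform[0, 1)
    return u <= math.exp(-delta / temperature)

rng = random.Random(0)
accepted = sum(accept(1.0, rng=rng) for _ in range(10_000))
print(accepted / 10_000)  # expected to be close to exp(-1)
```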

SIMULATION STUDIES AND THE RESULTS
In this section we present a simulation framework that we use for the analysis of CLS dynamics. It is assumed that the CLS moves upwards towards a boundary (which may reflect e.g. the presence of a cell membrane). We want to investigate the influence of the algorithm parameters introduced above on the number of steps which the CLS needs to make in order to reach the boundary; let us denote this quantity as NSB. This problem is similar to the one investigated in [1] with the help of another algorithm, where the probability distribution of the NSB was investigated. It is assumed that the probability that a step is made upwards is twice as big as the probability of a step in the opposite direction. In each case, at the initial position the CLS is placed parallel to the boundary and below it at a distance equal to 30 units. The initial position of the CLS along with an example of its simulated final position is presented in Fig. 2.
Fig. 2 An initial position of the CLS along with an example of its simulated final position; the CLS length equals 10. The boundary is presented as a dashed line.
The movement parameters that reflect the CLS elastic properties are primarily $D_{max}$ and $L_{max}$. In our introductory studies we examine the influence of these parameters on the movement "time" measured in terms of the NSB. We assume the following values for the other parameters used in the simulations: GSPU = 2, $D_{min}$ equals 1 unit (2 grid sides). In our simulation framework the parameter $D_{max}$ takes on the values 2, 4 or 6 units, while the parameter $L_{max}$ takes on the values 3, 6 or 9 units. The NSB obviously also depends on the CLS length. In our simulation we study three different lengths of the CLS: 10, 20 and 40. The results of our simulation experiment are presented in Tables 1, 2 and 3.
Based on the presented results, the obvious conclusion is that the longer the CLS, the more "time" it needs to cover the distance to the boundary: for a greater length the NSB is greater for every pair of analogous values of the two parameters under study. However, studying the results presented in the above tables one may also notice some more interesting features of the examined relationship. The most important one is that the impacts of the two parameters ($D_{max}$ and $L_{max}$) interact. We see that the greater $L_{max}$ is, the faster the movement. However, the acceleration is stronger when $D_{max}$ is less than $L_{max}$, which is precisely the necessary condition for the "spring effect". Consequently, the results indicate that the impact of the spring effect is well reflected by our algorithm.
To describe these important relations in a more general form we build proper metamodels.

METAMODELS FOR THE RELATION BETWEEN NSB, D MAX AND L MAX
In a number of real-world problems, as well as in various stochastic simulation analyses, the data gathered through the observation process possess a black-box structure. In the analysis of the input-output relation under the black-box representation, a special role is played by so-called metamodels [8]. A very general definition of a metamodel is the following: a metamodel $f$ is an approximation of the input/output function $\varphi$ that is defined by the underlying simulation model. Metamodels are built on the basis of the simulation data with the help of various statistical techniques, first of all regression analysis. In this section we present metamodels relating the algorithm parameters $N$, $D_{max}$, $L_{max}$ and the movement speed characteristic NSB. The assumed form of the metamodel is the following:
$$\mathrm{NSB} = f(N, D_{max}, L_{max}) + Z \qquad (1)$$
where $Z$ is a random variable (disturbance) with $E(Z) = 0$ and finite variance.
In our simulation experiment the parameters $N$, $D_{max}$, $L_{max}$ were chosen from the following ranges of integers:

It appears that the relation given by $f$ in (1) has a nonlinear character. What is especially interesting, the parameters $N$, $D_{max}$ and $L_{max}$ interact: the impact of one of these parameters depends on the values of the others. Based on the preliminary data analysis, we consider the following shape of the regression function $f$:

The estimates $b_i$ of the regression coefficients $\beta_i$, $i = 0, \dots, 7$, are as follows:

It turns out that the model has very good statistical characteristics. For example, its coefficient of determination equals $R^2 = 0.9800$ and all indicated explanatory variables have significance levels below $p = 10^{-17}$ (usually $p = 0.05$ is considered good enough).
The metamodel (2) describes well the character of the interrelations between the algorithm parameters and their mutual influence on the simulated CLS movement speed.It allows a proper "calibration" of the algorithm parameters in order to improve the quality and adequacy of the final model in its given real-world application.
It is also very important that the simulation results confirm our intuitions. Such confirmation of the intuitively expected features of the investigated relationship supports the adequacy of the presented approach for modeling CLS movement. Although in the physics literature, particularly in polymer studies, the problem of the elasticity of the bonds is treated as a very important one, such experiments have not yet been reported in the stochastic simulation literature. The presented algorithm enables us to study various aspects of the CLS movement when the whole structure demonstrates spring properties and the tension propagation should be taken into account.

FINAL REMARKS
Based on the simulation experiment described in this paper we see that the concept of sequentialization of the CLS movement leads to fast and effective algorithms that make it possible to model CLS dynamics efficiently. Although the CLS model investigated here was intentionally chosen in its simple version (e.g. the cost of the current CLS position was not taken into account), the results obtained in our simulations are very important. They show that even purely geometric algorithm parameters such as $D_{max}$ and $L_{max}$ can be used to model some important physical features of the CLS, e.g. its elastic properties. The mutual interrelation between the model parameters and their impact on the movement speed can be studied with the help of the metamodel presented in Section V. Such studies allow the experimenter to better model real-world phenomena involving numerous variations of chain-like-body transportation processes.
The algorithm can also be easily applied to more sophisticated physical and biological problems. Eminent examples come from the so-called DNA origami technique, which allows engineers to precisely locate the ends of a DNA strand on a substrate and, in consequence, to use the DNA as a piece of scaffolding to produce few-nanometer-resolution patterns for computer chips [5,12].
Specifically, in the spirit of the presented algorithm, it is worth mentioning that one can easily take into account the interactions among the segments (e.g. the so-called spring cost or other possible costs connected with the CLS structure alone, which are important in a number of specific, biologically oriented problems). Also, the possible interactions between the segments (e.g. monomers) and the environment determining the movement space (e.g. solvent molecules) can be implemented in the simulation experiment with the help of the approach presented here. Some applications of the presented algorithm to other realistic polymer models were presented in [4]. Other polymer studies, as well as some further modifications of the algorithm itself, are in progress.

Kamila Bartłomiejczyk graduated from the Faculty of Mechanical Engineering and Computer Science at the Czestochowa University of Technology in 2011 with the degree of MSc in informatics. Her thesis title was "Epromotion. How to utilize the Internet network for the promotion?" In 2012 she received the BSc degree in mathematics from the same department; the thesis title was "Polynominal interpolation of real functions". Now she is a PhD student in the field of informatics at the same institution. Her scientific research focuses on the analysis of the translocation of chain-like structures.

[Fig. 1 block labels: Step 2: Step choice for FTG; Step 4: Acceptance of new CLS position; Step 5: Verification of the termination condition (satisfied / not satisfied); Step 6: Return the output]

ISSN 1335-8243 (print), ISSN 1338-3957 (online) © 2013 FEI TUKE, www.aei.tuke.sk

Estimates of the regression coefficients: $b_0$ = ….83752, $b_1 = 24.7051$, $b_2 = -33.9772$, $b_3 = -1.59403$, $b_4 = 2.36439$, $b_5 = -12.159$, $b_6 = 281.96$, $b_7 = -0.0496536$

Received November 8, 2013, accepted December 19, 2013

BIOGRAPHIES

Andrzej Z. Grzybowski received his PhD degree in the field of mathematics from the Institute of Mathematics, Wroclaw University of Technology, Poland, in 1991. His thesis title was "Minimax decisions in some problems of control of stochastic systems". In 2013 he received the D.Sc. degree from the Institute of System Engineering and Informatics, University of Pardubice, Czech Republic. His habilitation thesis was entitled "Modeling and simulation in decision making under uncertainty". Since 1983 he has worked at Czestochowa University of Technology, Poland, where he is now an associate professor at the Institute of Mathematics, Faculty of Mechanical Engineering and Computer Science. His scientific research focuses on artificial intelligence as well as on formal and Monte Carlo analysis of various problems arising in statistical decision theory.

Zbigniew Domański received the Ph.D. and D.Sc. degrees in solid state physics from the Institute of Low Temperature and Structure Research, Polish Academy of Sciences, Wroclaw, Poland, in 1987 and 1997, respectively. From 1988 to 2000 he was a senior researcher with the Institute of Theoretical Physics, University of Lausanne, Switzerland. Now he is a professor with the Faculty of Mechanical Engineering and Computer Science, Czestochowa University of Technology (CUT), Czestochowa, Poland. His research interests range from applied mathematics and statistical mechanics to transport phenomena in nanotechnological systems.

Table 1 Mean values of NSB in relation to the parameters $D_{max}$ and $L_{max}$; simulation results obtained for CLS length N = 10

Table 2 Mean values of NSB in relation to the parameters $D_{max}$ and $L_{max}$; simulation results obtained for CLS length N = 20

Table 3 Mean values of NSB in relation to the parameters $D_{max}$ and $L_{max}$; simulation results obtained for CLS length N = 40