Particle swarm optimisation driven low cost single event transient fault secured design during architectural synthesis

Abstract: Owing to aggressive scaling at the nanometre scale and ever faster devices, a particle strike that manifests as a transient fault spanning multiple cycles and multiple units will be a central concern for application specific datapaths generated through high-level synthesis (HLS)/architectural synthesis. Addressing each of these two problems separately already leads to large area/delay overhead; tackling both concurrently therefore incurs an even larger overhead. To tackle this complex problem, this paper proposes a novel low cost particle swarm optimisation driven dual modular redundant (DMR) based HLS methodology for generating a design secured against both the temporal and spatial effects of a transient fault. The authors' approach provides a low cost optimised fault secured solution through a particle swarm optimisation exploration framework based on user area-delay constraints. Results indicate that the proposed approach obtains an area overhead reduction of 34.08% and a latency overhead reduction of 5.8% compared with a recent approach.


Introduction
Application specific computing hardware is becoming increasingly vulnerable to threats from highly energised alpha particle strikes [1-3]. Transient faults can be exploited to create security vulnerabilities, for example by corrupting an output computational value. More explicitly, the standard vulnerability arising from a transient fault is an erroneous output value caused by bit flips, which can have catastrophic consequences for the system [1,4-10].
Transient faults are temporary faults that can arise in the following environments: (a) space, due to radiation strikes [4]; (b) terrestrial regions. Transient faults in terrestrial regions are caused by ionising radiation from alpha/neutron particles emitted by impurities in the IC packages used in computing systems. These particles are known to be the major contributing factor to transients on the ground [4]. IC packages contain trace amounts of uranium and thorium impurities, which are responsible for emitting the alpha particles that cause transient faults. Transient faults can be single or multi-cycle in nature. The increase in device density per unit area is adversely impacting device and overall system reliability.
Moreover, the availability of faster devices is a characteristic of current technologies that raises key concerns for fault detection researchers. Particles with low linear energy transfer (LET) values can now yield a transient that extends beyond the anticipated cycle time of a circuit. Thus, both technology advancement and the LET of the particle play a chief role in producing a multi-cycle (k_c-cycle) transient fault in a device (extended transient intervals in the range of ∼0.01 ns to 1 ns are indicated in [8]). Furthermore, owing to aggressive scaling in nanometre technology, which shrinks the geometric dimensions of a device, the prospect of a multi-unit (k_m-unit) transient fault is greater now (the nanometre range affected by a particle strike is available in [7,11]). This phenomenon, in which a single radiation hit affects multiple units, is called a multi-unit transient fault (MTF). In other words, a particle strike can affect two neighbouring units bi-directionally in a similar fashion, causing common-mode transient faults (as evident from the literature [8,12,13]). An example in [12] shows a single strike producing a bi-directional error by upsetting two units in an identical manner. Present physical design algorithms do not take into account the spatial influence of a transient fault (i.e. its effect on multiple units simultaneously) when determining the placement of units, because they consider only area, delay and power (not fault security) while optimising placement. Thus a placement algorithm may place two identical units such that a single strike affects both equally, triggering the same erroneous effect (even a simple rotation of two identical units during placement is not effective) [1,2,4,14-17].
However, handling both multi-cycle transient and multi-unit transient faults simultaneously is an expensive affair, which makes a low cost approach difficult to develop. Handling these problems at a higher level of design abstraction, i.e. at the behavioural level [during high-level synthesis (HLS)], may incur lower design overhead and also ensures reliability awareness from the very first stage of the design flow. Considering the worst possible strength of a transient fault in the temporal domain (k_c) and the spatial domain (k_m) as a design specification/constraint during HLS (at the behavioural level), in conjunction with an advanced design space exploration (DSE) framework, enables generation of a low cost design solution that is concurrently multi-unit transient and multi-cycle transient fault secured. This paper presents a novel low cost security solution to this dual problem (multi-unit transient and multi-cycle transient faults) through HLS [12,18-23].

Motivation: low-cost optimised multi-cycle and multi-transient fault security aware HLS
The motivations are as follows: (i) First, reliability degradation due to a single radiation strike is a key concern, especially for multi-cycle transient and multi-unit transient faults [24,25]. Handling it at a higher abstraction level, i.e. during high level synthesis, appears beneficial because fewer design details are fixed at this level and there is a greater chance of minimising overhead through advanced optimisation heuristics. (ii) Second, owing to the paradigm shift in HLS from performance/area optimisation to fault security optimisation, automated techniques are required that can explore a low cost solution. (iii) Finally, handling both multi-cycle transient and multi-unit transient faults is expensive and time consuming, and requires imposing multi-layer security constraints that incur large area and delay overhead and may not abide by the user budget. Therefore, an automated low cost design solution that satisfies the user budget as well as minimises chip area/delay is proposed in this paper.
Contributions of the paper

Novel contributions of the paper
Following are the novelties of the present paper: (a) A novel low cost optimised multi-cycle and multi-unit transient fault secured design based on user area-delay constraints during fitness evaluation. (b) A novel particle swarm optimisation (PSO)-based exploration methodology for high level synthesis, with inclusive fitness assessment of a multi-unit and multi-cycle transient fault secured design solution. (c) The proposed approach obtains an area overhead reduction of 34.08% and, simultaneously, a latency overhead reduction of 5.8% compared with [26].
The differences between the proposed approach and [26] are as follows: (a) The proposed approach presents low cost optimised multi-cycle and multi-unit transient fault security during HLS, while [26] presents only a fault security methodology (with no optimisation in the secured design technique). (b) The proposed work integrates a PSO DSE framework with the fault security approach to yield an optimal design solution based on user constraints, whereas [26] does not consider user constraints while generating a fault secured design solution. This resulted in expensive designs where the user cost budget is not met. Further, the non-optimised designs generated by [26] incur large silicon overhead and delay expenditure. (c) A detailed sensitivity analysis of the effect of the PSO parameters on the quality of DSE results and exploration runtime is presented, which was not included in the earlier version [26].

Related prior work
Our work focuses on providing low cost optimised simultaneous fault security at the behavioural (architecture) level against the temporal and spatial effects of transient faults caused by a single particle strike, through a physically aware high level synthesis methodology. The only work on a similar topic is the authors' previous work [26] on non-optimised transient fault security during HLS. However, there is no effort in the prior art to provide a concurrent low cost optimal solution to multi-cycle transient and multi-unit transient fault security during HLS (note: the difference between the proposed work and [26] is discussed earlier).

Multi-cycle transient fault security
Multi-cycle transient fault security at the behavioural level was handled in [6,27,28] through a concurrent error detection scheme in which replication of control data flow graph (CDFG) operations was performed. Once the schedule of the dual modular redundant CDFG was generated, specific hardware allocation rules were imposed to ensure detectability. Sengupta and Sedaghat's approach [6] presents more comprehensive hardware allocation rules, as it also covers the condition of minimal hardware availability. In other words, the approach in [6] can provide multi-cycle fault security even when only a single hardware unit of a type is available, which is not possible in [28]. In [28], a minimum of two distinct hardware units is obligatory for allocation to sister operations of the original and duplicate units in DMR. The approach in [6] therefore provides more flexibility and robustness.

Multiple transient fault security
Owing to the rare occurrence of multiple transient faults in past technologies, they did not receive much attention in the literature. Further, besides the work presented in [26], there has been no effort on handling multi-unit transient faults during HLS. Rusu et al. [11] used a simulation-based technique to study multiple transient faults, assessing the magnitude of the transients resulting from a single radiation strike. Further, modelling of transient fault propagation for a fault occurring at a gate output within a logic circuit was performed in [5]. Thus the aforesaid techniques tackle multiple (or multi-unit) transient faults at a lower level of abstraction.

Background and motivation of PSO
PSO is an advanced optimisation technique widely deployed in solving complex optimisation problems across several domains [29-34]. In this paper we have integrated PSO based DSE with the HLS-based fault security methodology by mapping its generic procedure to the requirements of the current problem. Generic PSO is a population-based search technique where particles fly through a search space. In PSO, the position of the ith particle is updated by adding the velocity to the current position [32]:
$$x_i(t+1) = x_i(t) + v_i(t+1)$$
The velocity is updated using the following function:
$$v_i(t+1) = \omega\, v_i(t) + b_1 r_1 \left[x_{lb_i} - x_i(t)\right] + b_2 r_2 \left[x_{gb} - x_i(t)\right]$$
where $x_i(t+1)$ and $x_i(t)$ are the positions (represented through 'n' dimensions) of a particle at time 't+1' and 't', respectively; $v_i(t+1)$ is the velocity at time 't+1', updated per dimension (it reflects the step size in each dimension, i.e. the distance covered per unit time in each dimension); $\omega$ is called the inertia weight, $b_1$ is the cognitive learning factor, $b_2$ is the social learning factor, $r_1$, $r_2$ are random numbers in the range [0, 1], $x_{lb_i}$ is the best position of the ith particle with respect to the minimisation problem, and $x_{gb}$ is the global best position found so far.
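The generic update can be illustrated with a minimal sketch. The function name `pso_step`, the list-based representation and the choice to draw fresh random numbers for every dimension are assumptions made for illustration; they are not taken from the paper.

```python
import random


def pso_step(x, v, x_lb, x_gb, omega, b1, b2, rng=random):
    """One generic PSO update for a single particle.

    x, v, x_lb and x_gb are equal-length lists of floats. Drawing r1 and r2
    separately for each dimension is an assumption of this sketch.
    """
    new_x, new_v = [], []
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        # inertia term + cognitive (local best) pull + social (global best) pull
        vd = omega * v[d] + b1 * r1 * (x_lb[d] - x[d]) + b2 * r2 * (x_gb[d] - x[d])
        new_v.append(vd)
        new_x.append(x[d] + vd)  # position update: x(t+1) = x(t) + v(t+1)
    return new_x, new_v
```

When a particle already sits on both its local best and the global best, the two attraction terms vanish and only the inertia term moves it.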

Motivation of using PSO algorithm as DSE framework
PSO as an exploration backbone is considered more productive than co-existing evolutionary techniques such as the genetic algorithm (GA) and the bacterial foraging optimisation algorithm (BFOA) for the following reasons: (i) The literature [33] suggests that a carefully pre-tuned PSO framework yields better results and quicker convergence to the optimum during DSE compared with evolutionary algorithms such as GA.
(ii) PSO-based DSE ensures channelled searching, e.g. the search path is altered using the velocity vector when a particular direction is found unproductive. (iii) A control parameter, the inertia weight, maintains a balance between exploration and exploitation during the search by decreasing linearly from 0.9 to 0.1. (iv) Other control parameters, the acceleration coefficients, enable a balance between cognitive and social learning [33].
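The linearly decreasing inertia weight in point (iii) can be sketched as follows. The zero-based iteration index and the function name are assumptions; only the 0.9 to 0.1 range comes from the text.

```python
def inertia_weight(iteration, max_iterations, w_start=0.9, w_end=0.1):
    """Linearly decrease the inertia weight from w_start to w_end.

    `iteration` is assumed to be zero-based, so the first iteration uses
    w_start and the last one (max_iterations - 1) uses w_end.
    """
    if max_iterations <= 1:
        return w_start
    fraction = iteration / (max_iterations - 1)
    return w_start - (w_start - w_end) * fraction
```

A large weight early in the run favours exploration (large steps), while a small weight late in the run favours exploitation around the best positions found.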

Problem formulation
Given a CDFG, explore its design space and find a low cost optimal design solution (X_i) that is k_c-cycle and k_m-unit transient fault secured and whose fault secured chip area (A^DMR_MCT−MFT) and latency (L^DMR_MCT−MFT) satisfy the user constraints. Note: in the proposed approach, solutions that violate the user area and latency constraints are discarded during the iterative PSO driven exploration. Only the solutions that meet the user constraints are allowed to move forward into the next iteration and finally evolve into a low cost solution.

General description of proposed methodology
The proposed approach shown in Fig. 1 accepts a module library, the application in the form of a CDFG (available as standard benchmarks in [12,35]), resource constraints (X_i), user constraints on area/latency, and the characteristics of the transient fault in the temporal domain (indicated as 'k_c-cycle') and spatial domain (indicated as 'k_m-units') as design constraints. The fault security measures for nullifying the spatial and temporal impacts of a transient fault are handled independently of each other. However, the two algorithms are connected through physical design driven HLS: MCT is resolved during one of the high level synthesis steps, while MFT is resolved during one of the physical design steps. Inserting separate security constraints during physical design driven HLS incurs massive chip area and schedule delay overhead. Therefore, optimisation needs to be performed to mitigate this problem and yield a low cost design. Our work employs a PSO driven DSE framework to produce a low cost multi-cycle transient and multi-unit transient fault secured design.
The proposed approach comprises three major processing segments:
(a) Fault security segment: This segment is responsible for converting a normal design into a multi-cycle transient and multi-unit transient fault secured design. It contains two dependent blocks: (i) the k_c-cycle transient fault secured block and (ii) the k_m-unit transient fault secured block.
On the basis of the resource constraint (provided as the particle position during the PSO process), scheduling of a DMR design is performed. The DMR design is created by duplicating all the operations of the CDFG alongside the existing original unit. Duplicating the operations of the CDFG (i.e. creating the duplicate unit) is required to provide detection capability in case a transient fault occurs: adding a duplicate unit under the 'proposed hardware allocation security rules' ensures that a fault leads the duplicate unit to produce an output value distinct from that of its original counterpart. Subsequent comparison of the respective output values from the original and duplicate units then yields a difference in magnitude, thereby providing transient fault security (detection). Scheduling of the DMR design gives greater priority to original operations over duplicate ones in case of conflicts. Once the DMR schedule is generated, it is fed into the proposed k_c-cycle transient fault secured block.
(b) Design optimisation segment: The process of PSO driven exploration operates as follows. Firstly, the particles are initialised; after processing by the fault security segment, each is fed to the fitness evaluation block for updating the local and global best particle positions. Next, the velocity of each particle is determined, followed by clamping if necessary to prevent swarm explosion. This velocity is used to explore a new resource configuration, represented as a particle position, which is again fed to the fault security segment for processing. This process iterates until the terminating criterion is reached, finally yielding a low cost k_c-cycle transient and k_m-unit transient fault secured design. The details of the major PSO steps are described in Section 5.5.
(c) Fitness evaluation segment: This segment is responsible for evaluating the fitness of the k_c-cycle transient and k_m-unit transient fault secured design produced by the fault security segment. The latency of the DMR schedule (derived from the output of the k_c-cycle transient fault secured block) and the area of the enveloping rectangle of the chip floorplan (derived from the k_m-unit transient fault secured block) are fed as inputs into this fitness evaluation block. The output of the fitness evaluation block is fed into the design optimisation segment for ascertaining the updated local and global best particle positions. The fitness of a solution in the proposed approach is a function of the normalised fault secured latency (L^DMR_MCT−MFT) and chip area (A^DMR_MCT−MFT), both defined earlier in Section 5.1. The details of our fitness function are discussed in Section 5.6.
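The DMR construction and the scheduling priority described for the fault security segment can be sketched as below. This is only an illustration under stated assumptions: operations take one control step each, the CDFG is an acyclic dictionary of the hypothetical form `op -> (type, predecessors)`, duplicates are named by appending a prime, and ties are broken by name. None of these details are specified in the paper.

```python
def make_dmr(cdfg):
    """Duplicate every operation; the duplicate of op 'x' is named "x'"."""
    dmr = dict(cdfg)
    for op, (kind, preds) in cdfg.items():
        dmr[op + "'"] = (kind, [p + "'" for p in preds])
    return dmr


def list_schedule(dmr, resources):
    """Resource-constrained list scheduling with unit-latency operations.

    `resources` maps an operation type to the number of available units
    (the particle position). Original operations get priority over
    duplicates whenever they compete for the same unit.
    """
    for kind, _ in dmr.values():
        if resources.get(kind, 0) < 1:
            raise ValueError(f"no hardware unit available for {kind!r}")
    schedule, step = {}, 0
    while len(schedule) < len(dmr):
        step += 1
        free = dict(resources)
        # Ready = unscheduled operations whose predecessors finished earlier.
        ready = [op for op, (_, preds) in dmr.items()
                 if op not in schedule and all(p in schedule for p in preds)]
        ready.sort(key=lambda op: (op.endswith("'"), op))  # originals first
        for op in ready:
            kind = dmr[op][0]
            if free[kind] > 0:
                free[kind] -= 1
                schedule[op] = step
    return schedule


cdfg = {"1": ("mul", []), "2": ("mul", []), "3": ("add", ["1", "2"])}
schedule = list_schedule(make_dmr(cdfg), {"mul": 2, "add": 1})
print(schedule, "latency =", max(schedule.values()))
```

With two multipliers and one adder, the two original multiplications occupy the first control step, so the duplicate multiplications are deferred to the second step.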

Constructing a k c -cycle transient fault secured design
This sub-section explains the approach that transforms a DMR schedule into a 'k_c-cycle secured DMR schedule and allocation'. The primary inputs to the approach are X_i (the particle position representing the resource configuration), the CDFG, the temporal strength of the transient fault, and the library of available hardware. Generating a DMR schedule involves concurrently scheduling the original and duplicate units (U_OG and U_DP) under the resource constraint using the list scheduling algorithm. Once scheduling is complete, the hardware allocation step is executed for the operations of both the original and duplicate units. The output is a 'k_c-cycle secured DMR schedule and allocation', which is used for fitness assessment through the proposed cost function. The hardware allocation rules are as follows [6]:
(i) Assign opn (v) ∈ U_OG and opn (v′) ∈ U_DP to dissimilar resources (hardware units).
(ii) If unavailable, then: retain the same allocation for v′ (as v) in U_DP so that t(v′) − t(v) > k_c.
(iii) If (i) is false, then: push v′ (and its children) ∈ U_DP one CS below such that condition (i) becomes true.
Hazards occur between sister operations of the original and duplicate units when hardware allocation rule (i), (ii) or (iii) is violated for the duplicate unit. In other words, a transient fault hazard (TFH) occurs when sister operations v and v′ are allocated to the same hardware unit and t(v′) − t(v) ≤ k_c. Such hazards are resolved by shifting the affected operation of the duplicate unit (and its successors) to lower control steps (CSs) such that t(v′) − t(v) > k_c. Once the 'k_c-cycle secured DMR schedule and allocation' is generated, its outputs are fed into a multi-phase set-up that comprises comparators (C1, C2 and C3) in the first phase, followed by a voter in the second phase. Inspired by [36], this set-up is designed to safeguard against a possibly faulty comparator, since the comparator itself is susceptible to a particle strike.
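The hazard condition and its resolution can be sketched as follows. The sketch assumes that a hazard exists when sister operations share a hardware unit and are separated by at most k_c control steps, which is the complement of the resolution condition t(v′) − t(v) > k_c stated above. Names such as `find_hazards` and the `successors` map are hypothetical, and resource conflicts introduced by the shift are deliberately ignored.

```python
def find_hazards(schedule, allocation, k_c):
    """Return sister pairs (v, v') that share a unit and are too close in time."""
    hazards = []
    for v in schedule:
        if v.endswith("'"):
            continue
        dup = v + "'"
        if (dup in schedule and allocation[v] == allocation[dup]
                and schedule[dup] - schedule[v] <= k_c):
            hazards.append((v, dup))
    return hazards


def resolve_hazards(schedule, allocation, successors, k_c):
    """Push each hazardous duplicate (and its descendants) to later steps.

    Simplification: every descendant is shifted by the same amount and
    resource availability in the new control steps is not re-checked.
    """
    fixed = dict(schedule)
    for v, dup in find_hazards(schedule, allocation, k_c):
        shift = fixed[v] + k_c + 1 - fixed[dup]
        if shift <= 0:
            continue  # an earlier shift already separated this pair
        stack, seen = [dup], set()
        while stack:
            op = stack.pop()
            if op in seen:
                continue
            seen.add(op)
            fixed[op] += shift
            stack.extend(successors.get(op, []))
    return fixed
```

For example, with k_c = 2, a duplicate scheduled one step after its original on the same multiplier is moved so that it starts three steps after the original, and its dependent operations move with it.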

Constructing a k m -unit multiple transient fault secured design
As discussed earlier, once a 'k_c-cycle secured DMR schedule and allocation' is obtained, its latency is derived from the schedule for cost evaluation. Next, the list of hardware modules is extracted so that it can be fed as input into the k_m-unit transient fault secured block. The algorithm executed in this block converts a 'k_c-cycle secured DMR' into a 'k_m-unit and k_c-cycle secured DMR'. It produces a physical design floorplan in which the hardware modules (from the list L[k]) are placed without any k_m-unit constraint violation, i.e. hardware modules allocated to two sister operations of the original and duplicate units are always placed at least k_m units apart. We note here that k_m units is taken as the worst case spatial impact of a transient fault. Because such modules are never placed within k_m units of each other, two hardware modules executing sister operations never produce identical wrong outputs under a single transient fault. Let us consider that 1 unit = 768 nm (the nanometre range affected by a particle strike is available in [7,11]). Note: k_m = 4 is only an example value used for explaining the proposed algorithm; k_m can be any worst case value, supplied as input for transient fault security, based on the estimated spatial strength of the particle strike. The rules ensure that sister hardware modules are never placed within k_m units of each other in the floorplan. If both sister hardware modules were placed within k_m units, a radiation strike causing a multi-unit transient fault could affect both modules equally and produce equivalent incorrect outputs, so the error detection block (or comparator) would not be able to detect the fault. The proposed diagonal floorplan growth rules are highlighted in Fig. 3.
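A placement can be checked against the k_m-unit rule with a short sketch. The paper does not state how the separation between two modules is measured, so the edge-to-edge Euclidean gap between axis-aligned rectangles is an assumption here, as are the tuple layout `(x, y, width, height)` in nanometres and the function names.

```python
import math

UNIT_NM = 768  # 1 unit = 768 nm, as in the text


def gap_nm(a, b):
    """Edge-to-edge gap between two rectangles given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = max(0, bx - (ax + aw), ax - (bx + bw))
    dy = max(0, by - (ay + ah), ay - (by + bh))
    return math.hypot(dx, dy)


def km_violations(placement, conflicts, k_m):
    """List conflicting module pairs placed closer than k_m units."""
    required = k_m * UNIT_NM
    return [(p, q) for p, q in conflicts
            if gap_nm(placement[p], placement[q]) < required]
```

For instance, two 768 nm wide sister modules with a 1536 nm gap between them satisfy k_m = 2, whereas abutting modules are reported as a violation.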

Particle encoding:
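Particle position 'X_i' of the 'ith' particle for a given CDFG is represented as follows:
$$X_i = \{N(R_1), N(R_2), \ldots, N(R_d), \ldots, N(R_D)\}$$
where $N(R_d)$ = number of occurrences of the dth resource type and D is the number of resource types (dimensions).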

Calculation of new particle position:
The new particle position is calculated in this step as the iteration process begins. In each iteration of PSO-DSE, the new resource value of a particle X_i in the dth dimension is calculated as specified in the following equation [32]:
$$R^{+}_{d_i} = R_{d_i} + V^{+}_{d_i}$$
where $R^{+}_{d_i}$ = new resource value of particle X_i in the dth dimension and $R_{d_i}$ = previous resource value of particle X_i in the dth dimension; $V^{+}_{d_i}$ = new velocity of particle X_i in the dth dimension (i.e. the step length taken per unit time in the dth dimension), updated through the following equation:
$$V^{+}_{d_i} = \omega V_{d_i} + b_1 r_1 \left[R_{d_{lbi}} - R_{d_i}\right] + b_2 r_2 \left[R_{d_{gb}} - R_{d_i}\right]$$
where $R_{d_{lbi}}$ is the resource value of $X_{lbi}$ in the dth dimension and $R_{d_{gb}}$ is the resource value of $X_{gb}$ in the dth dimension. Note that $X_{lbi} = \{R_{1_{lbi}}, R_{2_{lbi}}, \ldots, R_{D_{lbi}}\}$ and $X_{gb} = \{R_{1_{gb}}, R_{2_{gb}}, \ldots, R_{D_{gb}}\}$.
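Because a resource value is a count of hardware units, the update has to end on a valid integer. The sketch below shows one way to do this for a single dimension. The symmetric velocity clamp `v_max`, the rounding to the nearest integer and the bounds `r_min`/`r_max` are assumptions for illustration; the paper only states that velocity is clamped when necessary.

```python
def update_dimension(r, v, r_lb, r_gb, omega, b1, b2, r1, r2,
                     r_min, r_max, v_max):
    """Update one resource dimension of a particle.

    r1 and r2 are passed in explicitly so that the step is reproducible.
    Returns the new (integer) resource value and the clamped velocity.
    """
    new_v = omega * v + b1 * r1 * (r_lb - r) + b2 * r2 * (r_gb - r)
    new_v = max(-v_max, min(v_max, new_v))   # velocity clamping
    new_r = int(round(r + new_v))            # resource counts are integers
    new_r = max(r_min, min(r_max, new_r))    # keep within the design space
    return new_r, new_v
```

With r = 2, v = 1.0, a local best of 4 and a global best of 3, the raw velocity of 3.0 is clamped to v_max = 2.0, so the particle moves to 4 units in that dimension.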

Determination of local and global best position:
In iteration 1, the current position (X_i) is taken as the local best position $X_{lbi}$ of the ith particle, because in the first iteration there is no previous local best position for that particle. In addition, (8) is used to determine the global best position ($X_{gb}$) of the population:
$$X_{gb} = X_{lb_j}, \qquad j = \arg\min_{i \in \{1, \ldots, n\}} C_f^{X_{lbi}} \qquad (8)$$
where $C_f^{X_{lbi}}$ = local best fitness of particle 'X_i' and 'X_gb' designates the global best particle position, with minimum cost among all particle positions (X_1, …, X_n) [32].
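The bookkeeping for local and global bests can be sketched as follows, assuming lower cost is better. The function name and the in-place list representation are hypothetical.

```python
def update_bests(positions, costs, local_best, local_best_cost):
    """Update per-particle local bests in place and return the global best.

    On the first call local_best[i] is None, so the current position of
    each particle becomes its local best.
    """
    for i, (x, c) in enumerate(zip(positions, costs)):
        if local_best[i] is None or c < local_best_cost[i]:
            local_best[i], local_best_cost[i] = x, c
    g = min(range(len(local_best)), key=lambda i: local_best_cost[i])
    return local_best[g], local_best_cost[g]
```

The global best is simply the local best with the minimum cost, so it can never get worse from one iteration to the next.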

Terminating criteria:
The proposed approach stops when any one of the following is true: the maximum iteration count has been exceeded (M = 100); S_1: no improvement is seen in the global best (R_gb) over '£' iterations (£ = 10); or S_2: the velocities of all particles become zero (V^+ = 0).
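These stopping rules can be expressed as a small predicate. The sketch assumes that `iteration` counts completed iterations and that `gb_costs` holds the global best cost recorded after each of them; both conventions and all names are assumptions.

```python
def should_stop(iteration, gb_costs, velocities, max_iter=100, stall=10):
    """Return True when any terminating criterion holds.

    gb_costs   : global best cost after each completed iteration
    velocities : one list of per-dimension velocities per particle
    """
    if iteration >= max_iter:                      # iteration budget M used up
        return True
    # S1: the global best has not improved over the last `stall` iterations.
    if len(gb_costs) > stall and gb_costs[-1] >= gb_costs[-1 - stall]:
        return True
    # S2: every particle has zero velocity in every dimension.
    return all(v == 0 for particle in velocities for v in particle)
```

Since the global best cost never increases, comparing the latest value with the one recorded `stall` iterations earlier is enough to detect stagnation.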

Cost evaluation:
Subsequent to generation of the multi-cycle and multi-unit transient fault secured design corresponding to a particle position, the DMR schedule latency and chip area are determined and passed into the cost evaluation block. The proposed fitness function C_f(X_i) combines the normalised latency and floorplan area, weighted by w_1 and w_2, respectively; these user specified weights are typically both maintained at 0.5.
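The exact cost equation is not reproduced in this text, so the sketch below uses an assumed form: each metric is measured relative to its user constraint and normalised by a maximum value, then weighted and summed. This form is chosen only because it yields negative costs for solutions inside both constraints, in line with the negative cost values reported in the results; the parameter names `l_cons`, `a_cons`, `l_max` and `a_max` are hypothetical.

```python
def cost(latency, area, l_cons, a_cons, l_max, a_max, w1=0.5, w2=0.5):
    """Assumed weighted cost of a fault secured design (lower is better).

    latency, area : values of the k_c/k_m secured DMR design
    l_cons, a_cons: user constraints on latency and area
    l_max, a_max  : normalisation values (e.g. maxima over the design space)
    """
    return w1 * (latency - l_cons) / l_max + w2 * (area - a_cons) / a_max


def is_feasible(latency, area, l_cons, a_cons):
    """Solutions violating either user constraint are discarded."""
    return latency <= l_cons and area <= a_cons
```

Under this assumed form, a design that exactly meets both constraints has a cost of zero, and the further it sits inside the constraints, the more negative its cost becomes.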

Motivational example
As shown in Fig. 5, a two-cycle transient fault is assumed as an example (corresponding to a transient pulse width of 200 ps for a particle LET extracted from [8], where one cycle or control step = 100 ps). Note that the values used here are only an example and any other value can be used (in the results in Section 6, a four-cycle transient fault strength is used). The transient pulse width can range from ∼0.01 ns to ∼1 ns depending on the particle LET [7,8]. By taking into account the worst case transient magnitude, the k_c-cycle value can be quantified as input. Subsequent to generation of the two-cycle transient fault secured schedule (DMR SDFG), a list L[k] containing the hardware modules present in the DMR SDFG is prepared, for example for the DMR SDFG of Fig. 5. Table 1 shows the conflicts between sister operations and their corresponding hardware allocation. Based on the identified conflicts, the k_m-unit secured design is generated. As discussed in Section 5.2, the proposed MTF security algorithm converts a k_c-cycle transient fault secured design into a k_c−k_m secured design by placing units such that k_m-unit security is achieved. The geometric dimensions of the units (based on the 15 nm NanGate technology library) are given in Table 2. Fig. 6 shows the two-unit (k_m = 2) transient fault secured valid floorplan generated using the rules stated in Section 5.4 (white space indicates dead space in the chip floorplan). For example, assume that the strength of the multiple transient fault is k_m = 2 units (i.e. the worst case impact is 1536 nm; in this example, since the size of the smallest functional module is 768 nm, the worst possible impact of an MTF is on consecutive neighbouring functional modules) and that M1 and M2 are one such conflicting hardware pair owing to allocation to sister operations 1 and 1' (shown in Fig. 4). Thus, M1 and M2 should be placed at least k_m = 2 units apart (i.e. at a gap of 1536 nm) to satisfy two-unit MTF security. Violating this would allow a strike to affect both neighbouring modules M1 and M2 equally, causing both 1 and 1' to produce identical wrong outputs. As seen, M1 and M2 have been placed four units apart. Similarly, M3 and M4, and A1 and A2, have conflicts owing to allocation to sister operations 3 and 3', and 5 and 5', respectively; hence M3 and M4, and A1 and A2, are also placed at least two units apart. However, since M1 and M3 are not conflicting hardware, they can be placed as neighbours in the floorplan. The corresponding k_m-unit multiple transient fault secured floorplan is shown in Fig. 6, which is checked for normalisation through a slicing floorplan tree and normalised Polish expression [37,38].
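The two constraint values used in this example follow from simple arithmetic, sketched below. Rounding the pulse width up to a whole number of control steps is an assumption made to stay on the worst case side; the function names are hypothetical.

```python
import math


def cycles_spanned(pulse_ps, cycle_ps):
    """Worst case number of control steps covered by a transient pulse."""
    return math.ceil(pulse_ps / cycle_ps)


def required_gap_nm(k_m, unit_nm=768):
    """Minimum separation between sister modules for k_m-unit security."""
    return k_m * unit_nm


print(cycles_spanned(200, 100))  # 200 ps pulse, 100 ps control step -> 2
print(required_gap_nm(2))        # k_m = 2 units of 768 nm -> 1536
```

The same helpers give k_c = 10 for the upper end of the quoted pulse range (about 1 ns) at a 100 ps control step.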

Experimental results
The proposed approach constructs, through high level synthesis, low cost designs that are concurrently multi-cycle (k_c) and multi-unit (k_m-unit) transient fault secured. The low cost fault secured designs are evaluated in the following subsections. The proposed approach has been implemented in Java and run on an Intel Core i5-3210M CPU at 2.5 GHz with 4 GB of DDR3 memory. Equal weightage was given to chip area and schedule delay (w_1 = w_2 = 0.5) during cost evaluation.

Proposed results with respect to user constraints
As evident in Tables 3 and 4, for numerous user resource constraints, low cost transient fault secured designs are explored with swarm size (S) = 7. The solutions explored not only satisfy the user constraints on chip area and latency but also minimise the hybrid cost value (discussed in Section 5.1). For example, for k_c = 2 cycles and k_m = 2 units, the ARF floorplan yields a concurrently multi-cycle transient and multi-unit transient fault secured design. Similar results are observed for all other benchmarks tested. (Note: imposing the security constraint for multiple transient faults is likely to add area overhead to the designs. However, the schedule delay does not change, as the MTF security constraint does not alter the scheduling of operations. In addition, designs that abide by the k_c-cycle transient fault security constraint incur a larger schedule delay.) Further, Tables 3 and 4 also show the final fault secured design solution explored through the proposed physically aware DMR driven high level synthesis. Therefore, the results in Tables 3 and 4 indicate that, based on the user constraints specified, the proposed approach is capable of exploring a low cost k_m-unit and k_c-cycle transient fault secured design solution.
The fitness (cost) values achieved for the final fault secured design solution with varying swarm sizes are shown in Table 5. As evident from the results, for swarm size = 7 the proposed approach explores the lowest cost (global best) design solution for almost all benchmarks. Therefore the quality of solution found is highest at S = 7. Only for FIR is the lowest cost solution explored at S = 5, as it is a medium size benchmark.

Proposed result: iteration convergence and execution runtime
The proposed methodology first creates a k_c-cycle transient fault secured DMR schedule based on the MCT fault security constraint rules. Subsequently, information from this design output is fed into the k_m-unit MTF secured block. The iteration of convergence at which the low cost k_c-cycle transient and k_m-unit transient fault secured design is generated is shown for varying swarm sizes in Table 6. As evident from Table 6, with an increase in swarm size the iteration of convergence either remains constant or decreases during exploration of a low cost optimal solution. Further, the average exploration/implementation runtime to find the final design solution through the proposed PSO driven exploration process during physically aware HLS is reported in Table 7. As seen in Table 7, the proposed approach produces low cost fault secured solutions for small, medium and large size benchmarks within acceptable runtime. In other words, the implementation runtime of the proposed approach is a few seconds irrespective of the size of the benchmark handled. Therefore our approach is scalable to large problem sizes.

Comparison with related work
To the best of the authors' knowledge, there is no low cost optimised framework that provides simultaneous security during high level synthesis against multi-cycle (k_c) and multi-unit (k_m) transient faults. We achieve this temporal and spatial security simultaneously through a novel DMR design driven physically aware high level synthesis. We therefore compared the proposed approach with [26], which does not minimise the overhead incurred by imposing multi-layered fault security rules and does not yield a low cost design solution. Note: the comparison with un-hardened designs for similar benchmarks has already been discussed in [26] and is thus not included here. Results of the comparison with [26] are presented for two types of user constraints, namely 'towards maximum value' and 'at mid-value', in Tables 8 and 9, respectively. 'Towards maximum value' indicates relaxed constraints, while 'mid-value' indicates tighter constraints. The results in Table 8 indicate that the proposed approach attains a lower cost final fault secured solution than [26] for almost all benchmarks, and never attains a higher cost solution than [26]. A similar trend is observed for the results reported in Table 9, i.e. the cost of the proposed solution is lower than that of [26] for almost all benchmarks. More specifically, for the FIR benchmark, approach [26] yields a fault secured solution with a final cost of −0.1646, while the proposed approach yields a solution with a final cost of −0.2227. Therefore the proposed low cost approach yields a better quality fault secured solution with significantly lower design cost than [26] (while remaining well within the user constraints specified). However, for the ARF benchmark, the quality of results remains unchanged, as similar solutions are found by both the proposed approach and [26].
Table 10 presents the comparison of design overhead (i.e. latency and chip area) of the proposed approach and [26] with respect to the baseline approach (non-fault secured design). As evident from Table 10, for almost all benchmarks, the design overhead of the proposed approach with respect to the baseline is lower than that of [26]. This is due to the lack of optimisation performed in [26] for fault secured designs, which results in expensive design solutions. Calculations from Table 10 show that the proposed approach obtains a reduction in area overhead of 34.08% compared with [26], while simultaneously attaining a latency overhead reduction of 5.8% compared with [26]. This analysis provides strong evidence that not only is a design simultaneously resilient against multi-cycle and multi-unit transient faults crucial for future generations of systems, but a low cost model that satisfies the user chip area and delay budget is equally critical from an economic standpoint.

Security evaluation
In a DMR system, the hardware assigned to every operation is vulnerable to a transient fault. The number of potential transient faults in a DMR system is directly proportional to the number of hardware assignments. For example, the DMR system of the DCT application (Fig. 5) has 56 hardware assignments to operations, resulting in 56 potential vulnerabilities. Similarly, for other benchmark applications, the total security vulnerabilities that are secured through the proposed approach are shown in Fig. 7.

Conclusion and future work
This paper presented the first approach for developing a low cost concurrent multi-cycle and multi-unit transient fault secured design during physically aware HLS driven by the DMR concept. In addition, it proposes the unification of high level synthesis and physical design floorplanning to improve the accuracy of design point evaluation based on the user specified chip area-delay constraints. Future work targets detailed placement and wirelength estimation, in addition to floorplanning, during physically aware high level synthesis. Further, we intend to refine floorplan quality by imposing internal optimisation prior to fitness evaluation.

Acknowledgment
This work acknowledges the financial support provided by Media Lab Asia, Ministry of Electronics and Information Technology, Government of India.