Error Correction For Soft Errors

Abstract
With the development of CMOS technology, most electronic products have become susceptible to radiation-induced soft errors. A particle striking an electronic device can produce a soft error, either a single event upset or a single event transient. Various techniques exist for soft error mitigation, such as FERST, BISER, TMR, DMR, DICE, SEC-DED, DEC-TED, EDAC, PARSHIELD, and STEM. However, these techniques do not provide self-checking capability, and they suffer from high area overhead and output corruption. The Soft Error and Timing error Tolerant Flip Flop (SETTOFF) is used to overcome these drawbacks, with self-checking provided in its transition detection part. SETTOFF is designed for normal operation and fault operation, but it has a higher area overhead than BISER. BISER with self-checking capability is therefore proposed to overcome this limitation by reducing the area. BISER with self-checking yields lower area overhead, power and delay than both the SETTOFF architecture and BISER. The architecture is implemented in SPICE and simulation waveforms are obtained.

Introduction
Electronic components and planetary mission spacecraft are exposed to elevated radiation levels in the space radiation environment. The irradiating particles in this environment consist of high-energy electrons, protons, alpha particles and cosmic rays. These can cause latch-up and transient upsets in memories. Integrated circuits such as memories can be radiation hardened against total dose effects and single event phenomena by using special processing and design techniques.
A soft error is a transient condition in an electronic device that alters stored data in an unintended way. A change in stored information is classed as either a soft error or a hard error, the latter resulting from hardware failure. If a proton strikes a memory cell, it can alter the contents of that cell. Cosmic rays also cause this phenomenon from time to time [1]. Without careful testing, a soft error caused by a persistent problem can be misjudged as a hard error.
The effects of soft errors on electronic equipment are shown in Figure 1. A soft error occurs when a signal or a piece of data is wrong although there is no design mistake and no physical breakage. It is caused by radiation-induced transient faults in electronic systems. Neutrons from cosmic rays and alpha particles from packaging and bonding materials are the main radiation sources, and they may cause transient faults in the form of either single event upsets or single event transients.
Most electronic products are susceptible to radiation-induced soft errors [2]. The main sources of radiation are:
» Neutrons from cosmic rays
» Alpha particles from packaging and bonding materials
Soft errors are classified as:
• Single Event Upset (SEU)
• Single Event Transient (SET)
SEUs are inverting errors that upset the states in memories or sequential logic. SETs are voltage pulses that occur in combinational logic. Increasing the operating speed also increases the probability that SETs are captured. An SET that propagates through a digital circuit and results in an incorrect value in sequential logic can be considered an SEU. Low operating voltages decrease the energy necessary to provoke soft errors.
Soft errors are non-permanent defects caused by radiation effects or power supply problems. The memory itself is not damaged, but the contents stored in the memory cells are corrupted. If these soft errors are not checked and corrected properly, they may lead to a hard error. Hard errors are permanent faults that arise from manufacturing defects and wear. In that case the component is dead or the memory is damaged and can no longer store data, so the component or memory must be replaced. Hard errors therefore require repair, whereas in some applications soft errors can be cleared by rebooting.
Safety-critical applications used in space are protected by Triple Modular Redundancy (TMR) [3]. In many cases, however, TMR is not a suitable solution for less critical applications, because its area and power overhead is about 200% and it is too expensive. Memory arrays can be protected by means of conventional error correction codes [4][5][6][7][8].
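The masking behaviour of TMR can be illustrated with a bitwise majority voter. The following is a minimal sketch and not the circuit of [3]; the function name and the 8-bit example values are assumptions made for illustration. Storing two extra copies of every bit is what gives rise to the roughly 200% overhead mentioned above.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three redundant copies of a word."""
    return (a & b) | (b & c) | (a & c)


# Hypothetical 8-bit register value held in three redundant copies.
golden = 0b10110010
upset = golden ^ 0b00001000  # one copy suffers a single-bit upset

voted = majority_vote(golden, upset, golden)
print(f"voted = {voted:08b}, matches golden: {voted == golden}")
```

A single upset copy is outvoted by the other two. If the same bit is upset in two copies, the voter returns the wrong value, so TMR masks only one faulty copy per bit.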
Multi Bit Upsets (MBUs) and SETs originating in combinational gates have become a major issue in current technologies [9]. Error correcting codes cannot address SETs and are costly for addressing MBUs, since additional redundant bits are needed.
A solution to this problem is the SETTOFF architecture [10], which can correct SEUs and detect captured SETs and timing errors in the logic stage. It can also tolerate soft errors in the system. The detected errors are recovered by a replay recovery mechanism. However, SETTOFF requires a larger area than Built-In Soft Error Resilience (BISER). The self-checking capability of SETTOFF is therefore combined with BISER to achieve a lower area overhead.
The remainder of this paper is organized as follows. Previous techniques are explained in section 2. The proposed work is presented in section 3. The results and discussion are given in section 4. The paper is concluded in section 5.

Previous Techniques
Soft error mitigation techniques and timing error mitigation techniques [11] are suitable only for register files and are not completely SEU tolerant. SETs and timing errors (TEs) occurring in the combinational gates are detected during the write operation and corrected by the recovery mechanism. If an SEU is detected during a hold cycle, the recovery mechanism cannot identify the last operation, so that operation cannot be re-executed to overwrite the SEU.
The FERST cell [11,12] protects registers by using three C-elements to mitigate both SEUs and SETs. The SEU tolerance of FERST latches is roughly the same as that of TMR [3]. Injected SETs can be masked by a FERST latch if the delay size is correctly selected. FERST can mask only SET and SEU faults.
FERST uses feedback lines and delay elements. The feedback lines mask SEU faults and the delay elements mask SET faults. The flip-flops are not protected, and the preceding state output is withheld during error correction.
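How a C-element combined with a delay element can mask an SET may be sketched behaviourally. The model below is an illustrative assumption and not the FERST netlist: a two-input C-element receives a signal and a delayed copy of it, and changes its output only when both agree. The names `CElement` and `filter_with_delay` and the discrete time steps are hypothetical.

```python
class CElement:
    """Two-input C-element: the output follows the inputs only when they
    agree, and otherwise holds its previous value."""

    def __init__(self, initial: int = 0) -> None:
        self.out = initial

    def update(self, a: int, b: int) -> int:
        if a == b:
            self.out = a
        return self.out


def filter_with_delay(signal, delay, initial=0):
    """Feed a signal and its delayed copy into a C-element, one sample per step."""
    c = CElement(initial)
    out = []
    for t, a in enumerate(signal):
        # Delayed copy; before any delayed sample exists, use the initial value.
        b = signal[t - delay] if t >= delay else initial
        out.append(c.update(a, b))
    return out


glitch = [0, 0, 1, 0, 0, 0]   # one-step transient pulse
level = [0, 0, 1, 1, 1, 1]    # genuine transition
print(filter_with_delay(glitch, delay=2))
print(filter_with_delay(level, delay=2))
```

In this model a pulse shorter than the delay never appears on both inputs at the same time and is therefore masked, while a genuine transition passes after the delay. A pulse longer than the delay is not masked, which is why the delay size has to be chosen correctly.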
Error correcting codes improve the reliability of the system and the data integrity of semiconductor memory devices. Because they deliver a high level of system reliability, they are a cost-effective approach. SEC-DED [13] is able to correct a single error and detect a double error in a code word. Similarly, DEC-TED [13] is able to correct a double error and detect a triple error in a code word. In both cases, a cell failure may also cause an error and result in chip failure.
In SEC-DED, multiple bit errors may be falsely detected or miscorrected, depending on the code word structure. Every single bit is significant, so a miscorrection results in corrupted data. Where double errors are corrected automatically, a high level of reliability is maintained, but this requires a large number of check bits, which makes the error detection and correction logic complex to implement.
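The SEC-DED behaviour described above can be sketched with an extended Hamming (8,4) code. This is a minimal illustration under assumed parameters (4 data bits, 3 Hamming check bits and one overall parity bit); it is not the specific code of [13], and the function names and bit layout are hypothetical.

```python
def encode(data):
    """Encode 4 data bits as [p0, p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    word = [0, p1, p2, d1, p4, d2, d3, d4]
    word[0] = sum(word) % 2    # overall parity: total number of ones is even
    return word


def decode(word):
    """Return (status, data); status is 'ok', 'corrected' or 'double'."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos    # XOR of the positions holding a one
    overall = sum(word) % 2
    if overall == 1:
        # Odd parity: treated as a single error at position `syndrome`
        # (position 0 is the overall parity bit itself).
        word[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even parity but non-zero syndrome: double error, not correctable.
        return "double", None
    else:
        status = "ok"
    return status, [word[3], word[5], word[6], word[7]]


data = [1, 0, 1, 1]
code = encode(data)

single = list(code)
single[5] ^= 1                 # one bit upset
double = list(code)
double[3] ^= 1
double[5] ^= 1                 # two bits upset
triple = list(code)
for pos in (1, 2, 4):
    triple[pos] ^= 1           # three bits upset

print(decode(code))
print(decode(single))
print(decode(double))
print(decode(triple))
```

The triple-error case is reported as "corrected" but returns wrong data, which illustrates the miscorrection of multiple bit errors noted above.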
Redundancy hardened designs provide soft error rates similar to those of non-hardened designs within a new technology generation. The impact of scaling on MBUs is investigated using mixed-mode simulations. These simulations take days for a single particle strike and need to be repeated for each new circuit technology. BISER [14] is similar to the dual interlock cell [15] design, in which several devices need to be upset to corrupt the stored data.
The dual interlock cell does not provide error masking; it waits for the error to settle and then captures it. The transistor size ratios and the parameter variation requirements are minimal. The latches and flip-flops have to be replaced to make them upset tolerant.
Some of these codes are capable only of detecting errors. Soft errors occurring in the original circuit can be tolerated, but the redundant circuits added for error tolerance [16] cannot be protected. These techniques are used mainly for the correction of soft errors in combinational circuits. Power, area and delay are higher in these techniques.
A self-checking mechanism for correcting soft errors is available only in SETTOFF. The error tolerance analysis identifies which errors can be tolerated and which corrupt the output. Since self-checking is available only in SETTOFF, the timing requirement is low.
SETs and SEUs that occur in latches and flip-flops are detected and corrected by the BISER technique with a low area overhead, but BISER does not provide self-checking capability.
From the above analysis, both the SETTOFF and BISER techniques are capable of detecting and correcting SEUs, SETs and timing errors, but self-checking is implemented only in the SETTOFF circuit and not in the BISER technique. The self-checking capability of SETTOFF is therefore to be built into the BISER circuit while meeting all the requirements.

Proposed Work
The proposed work is implemented by providing the self-checking mechanism within BISER, as shown in Figure 2. A particle that strikes the TD part induces an Error_SEU_bar signal, and this signal propagates through the correction XOR-gate and corrects the output of the flip-flop. Error_SEU_bar is set to logic one during a fault operation, indicating an error, so that the correction XOR-gate inverts N to produce Q. Error_SEU_bar is set to logic zero during normal operation, and the flip-flop then operates normally, so that the input propagates to the output. However, a strike on the TD-based part may upset its own output and hence induce an erroneous Error_SEU_bar signal, which can then propagate through the correction XOR-gate and corrupt the output of the flip-flop. This section therefore introduces a self-checking mechanism. The self-checker can itself be affected by radiation strikes: SEUs can arise at its state-holding nodes and generate a false signal. As with errors in the TD part, the false signal does not corrupt the register output. The self-checking mechanism provided by the transition detection part of SETTOFF is now built into the BISER circuit to provide BISER with self-checking.
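The role of the correction XOR-gate can be written as Q = N XOR Error_SEU_bar. The sketch below is a behavioural illustration only; it assumes that N is the internal node holding the possibly upset value and follows the polarity stated in the text (logic one indicates an error). The function name is hypothetical.

```python
def corrected_output(n: int, error_seu_bar: int) -> int:
    """Correction XOR-gate: passes N when the error flag is 0, inverts it when 1."""
    return n ^ error_seu_bar


stored = 1

# Normal operation: no upset, flag low, output equals the stored value.
print(corrected_output(stored, 0))

# Fault operation: an SEU inverts N and the detector raises the flag,
# so the XOR-gate inverts N back to the stored value.
upset_n = stored ^ 1
print(corrected_output(upset_n, 1))

# False flag: N is intact but the flag is raised, so the output is corrupted.
print(corrected_output(stored, 1))
```

The last case is the situation that the self-checking mechanism is intended to guard against: an erroneous error flag on an intact node would otherwise invert a correct output.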

BISER circuit
The BISER circuit is composed of two flip-flops, a scan flip-flop and a system flip-flop. It can detect soft errors and timing errors. The system flip-flop reuses the latch PH2 for each of the n bits, which keeps the area overhead low. The scan flip-flop latches LA and LB may or may not be reused, but they occupy chip area in either case. The system portion has a structure similar to that of the scan portion, and each portion is a master-slave flip-flop composed of two latches. The scan portion should meet the constraints of the system portion.
All the scan flip-flops are connected to form shift registers [14]. BISER has two operating modes: normal operating mode and fault operating mode. During normal mode, the scanning signals (SCA, SCB, CAPTURE and UPDATE) are held low. During fault mode, the scanning signals SCA and SCB shift the contents into latches LA and LB. The UPDATE signal moves the contents of LB to PH1. Finally, the clock signal CLK is applied and the CAPTURE signal moves the contents of PH1 to LA.

Normal mode
During normal system mode, the scanning signals SCA, SCB and CAPTURE of the scan flip-flop are held low and the UPDATE clock of the system flip-flop is high. When the clock signal CLK is low, the logic value at the input D is transferred into the latches LA and PH2, and during this phase the latches LB and PH1 are not susceptible to soft errors. Similarly, when the clock signal CLK is high, the logic values are transferred into the latches LB and PH1, and the latches LA and PH2 are not susceptible to soft errors. The error-free output is obtained from Q1 and Q2.
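The clock-phase behaviour in normal mode can be sketched with a simple behavioural model. It assumes that LA and PH2 act as master latches that sample D while CLK is low, that LB and PH1 act as slave latches that take over these values while CLK is high, and that Q1 and Q2 are read from PH1 and LB. The class name and the output assignment are assumptions for illustration; the actual connections are those of Figure 2.

```python
class MasterSlavePair:
    """Behavioural model of the scan (LA, LB) and system (PH2, PH1) portions."""

    def __init__(self) -> None:
        self.la = self.lb = self.ph2 = self.ph1 = 0

    def step(self, clk: int, d: int) -> tuple:
        if clk == 0:
            # CLK low: D is transferred into the master latches LA and PH2.
            self.la = d
            self.ph2 = d
        else:
            # CLK high: the slave latches LB and PH1 take the master values.
            self.lb = self.la
            self.ph1 = self.ph2
        # Assumed outputs: Q1 from PH1 (system), Q2 from LB (scan).
        return self.ph1, self.lb


ff = MasterSlavePair()
for clk, d in [(0, 1), (1, 1), (0, 0), (1, 0)]:
    q1, q2 = ff.step(clk, d)
    print(f"CLK={clk} D={d} -> Q1={q1} Q2={q2}")
```

In this model both portions hold the same value after every full clock cycle, so Q1 and Q2 agree in the absence of a fault.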

Fault mode
During fault mode, the scanning signals SCA and SCB of the scan flip-flop are applied alternately to shift data into the latches LA and LB. The contents of latch LB are moved to PH1 by the UPDATE signal of the system flip-flop. The functional clock signal CLK is applied to capture the response of the system, and the CAPTURE signal of the scan flip-flop then moves the contents of PH1 to LA. By applying the scanning signals SCA and SCB alternately, the responses of the system are scanned, and these are liable to soft errors and timing errors. The soft errors can affect the flip-flops only, or both the flip-flops and the memories. The soft errors are detected and the corrected outputs are obtained from Q1 and Q2.

BISER with Self-checking
The self-checking mechanism operates on both rising and falling transitions. The two branches d0, d2 and d1, d3 are both enabled to capture the pulses generated by the rising and falling transitions. Here d0 and d2 handle the rising transition, and d1 and d3 handle the falling transition. The cross-coupled inverter pairs prevent the transistor from discharging or charging due to leakage currents.
The C-XOR provides the correction of errors and the D-XOR provides the detection of errors. When the error signal is logic one, the latches LA and LB hold the complements of latches PH2 and PH1 respectively, indicating that a soft error has occurred in one of the registers. The error signal remains at logic one until another soft error affects one of the registers.
If the clock input is logic zero and a soft error occurs in any of the latches, the error will not appear at the output Q. Similar results are produced if a soft error occurs in latch PH2 or LB when the clock input is logic one, although this depends on the operating speed.

Normal mode
During normal operating mode, the latches LA and LB store redundant copies of the latches PH2 and PH1 respectively, which keeps the error signal low, indicating that no error is present. The error-free output is therefore obtained correctly from Q1 and Q2.

Fault mode
During fault operation, the error signal is high, which indicates the presence of an error. Once the error signal is 1, the values stored in latches LA and LB are the complements of those in latches PH2 and PH1 respectively. The error is thus trapped until another soft error affects one of the latches of this flip-flop. After a pre-specified number of clock cycles, at a recovery point, the system shifts out this trapped error signal using the existing scan path, which eliminates the error. The errors are now corrected and the corrected outputs are obtained from Q1 and Q2.
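The relation between the redundant copies and the error signal can be sketched as a comparison of each scan latch with its system latch. The function below is an assumed behavioural model of the detection XOR and is not taken from the netlist: the error signal is high whenever LA differs from PH2 or LB differs from PH1.

```python
def error_signal(la: int, lb: int, ph2: int, ph1: int) -> int:
    """High when a redundant copy disagrees with the system latch it mirrors."""
    return (la ^ ph2) | (lb ^ ph1)


# Normal mode: LA and LB hold copies of PH2 and PH1, so the signal is low.
print(error_signal(la=1, lb=1, ph2=1, ph1=1))

# Fault mode: an SEU inverts PH1, so LB is now its complement and the signal is high.
print(error_signal(la=1, lb=1, ph2=1, ph1=0))
```

In this model a single upset in any one of the four latches raises the error signal, while an identical upset in both members of a pair would go unnoticed.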
The redundant scan resources, such as LA and LB, are reused as scan flip-flops. Reusing these scan flip-flops in BISER with self-checking has the following advantages over other techniques: reusing resources that are already available requires minimal area and routing overhead, and the approach is applicable to any design.
Any soft error affecting a single latch or flip-flop inside a register is detected by the self-checking capability. This provides the self-checking capability of SETTOFF within BISER, with a reduced area overhead compared with SETTOFF.

Results and Discussion
The BISER with self-checking circuit has been implemented for soft error correction in normal operating mode and fault operating mode. The presence or absence of errors is indicated by the error signal. The error signal is low during normal operation, when no error is present, and high during fault operation, when an error is present. In this case the error is introduced at the input voltage. The circuit is simulated in OrCAD SPICE.

Normal mode
The input is applied to the D flip-flop. The scanning signals SCA and SCB are applied alternately to latches LA and LB, the UPDATE signal moves the contents of LB to PH1, and the CAPTURE signal moves the contents of PH1 to LA.
In the self-checking mechanism, the error signal indicates the presence of an error. During normal operation the error signal is low (0), indicating that there is no error. The outputs are obtained from Q1 and Q2. Since no error is present, no inversion of the data bits takes place. The simulated waveform is shown in Figure 3.

Fault mode
The input is applied to the D flip-flop. The scanning signals SCA and SCB are applied alternately to latches LA and LB, the UPDATE signal moves the contents of LB to PH1, and the CAPTURE signal moves the contents of PH1 to LA.
In the self-checking mechanism, the error signal indicates the presence of an error. During fault operation the error signal is high (1), indicating that an error is present. A fault can occur randomly at any node of the circuit and during any phase of the clock cycle. Here the fault is introduced at the voltage input. The corrected outputs are obtained from Q1 and Q2 by inverting the data bits in the fault-tolerant circuit.
The circuit switches back from fault operation to normal operation only when the soft error is overwritten by the next state input. The simulated waveform is shown in Figure 4.
The synthesis report analysis of the SETTOFF circuit, the BISER circuit and the BISER with self-checking circuit for a single bit is shown in Table 1. In the case of SETTOFF, the whole circuit has to be replicated for each input bit (i.e., an n-bit input needs n SETTOFF circuits), which requires a large area and high power consumption. In the case of BISER, only the scan portion is replicated for each input bit, so the area and power dissipation are lower than for SETTOFF. However, BISER does not provide self-checking as SETTOFF does, so BISER with self-checking is simulated.
Because the self-checking mechanism is provided within the BISER circuit, it requires less power dissipation and delay. Compared with the other techniques, BISER with self-checking has lower area overhead, power dissipation and delay.

Conclusion
This paper implements the design of BISER with self-checking for soft error correction. SETTOFF provides self-checking and operates under normal and fault operation: Error_SEU_bar is low (0) for normal operation (absence of error) and high (1) for fault operation (presence of error). The results obtained as part I and part II, for normal operation and fault operation, are analyzed. However, SETTOFF has a larger area overhead, power and delay than BISER. The BISER circuit was implemented for both normal and fault operation, but it does not provide self-checking. BISER with self-checking has therefore been designed and implemented for both normal and fault operation, and results have been obtained for both. The synthesis results obtained for SETTOFF and BISER have been compared with those for BISER with self-checking. The results show that the BISER with self-checking circuit has lower area overhead, power and delay than SETTOFF and the other techniques.