# Low Power Charge Recycling D-FF

Karol Niewiadomski, Dietmar Tutsch University of Wuppertal Chair of Automation and Computer Science Wuppertal, Germany Email: {niewiadomski, tutsch}@uni-wuppertal.de

Abstract—The rising number of mobile applications creates a need for powerful and energy-efficient designs. Field Programmable Gate Arrays (FPGAs) represent a suitable solution to this challenge. In recent years, different FPGA designs have been released, covering the range from low-cost demands up to high-end applications in different industries. The downside of the increasing number of electronic functions in, e.g., vehicles and smartphones is the limited capacity of the built-in batteries. To overcome these limitations, appropriate power reduction measures have to be implemented at the circuit and architectural level. The correct function of each FPGA relies on data flip-flops (D-FFs) as basic data storage elements. In this paper, a new D-FF cell design is introduced and implemented with a focus on substantial power savings for low-power applications and a higher resistance against differential power analysis (DPA), which is an inevitable step of side-channel attacks. This new D-FF design is compared to various existing implementations.

Keywords—FPGA; D-FF; charge recycling; leakage-current reduction; differential power analysis.

## I. INTRODUCTION

Mobile devices like notebooks, smartphones, tablets and wearables have changed usage behavior over the last years. Access to information is expected to be available everywhere and completely independent of classic computers. This trend can be clearly seen in the current digitalization of vehicles, which provide more and more features like driving assistance systems and interfaces for connecting smartphones in order to display installed apps on the embedded infotainment system. A modern, upper-class vehicle contains more than 70 electronic control units (ECUs) to provide all features desired by consumers these days [1]. Such applications rely on the provision of sufficient processing power, which in turn requires adequate energy resources. Both handheld computation units and vehicles have only limited battery capacities, so there is a need for power-optimized integrated electronics.

One approach to overcome these challenges is the use of FPGAs. These integrated circuits play a major role in the realization of adaptive and efficient systems, offering vast reconfiguration abilities [2] [3]. Reconfigurability relies on arrays of memory cells such as static random access memory (SRAM). In order to optimize an FPGA in terms of energy efficiency, these memory cells have to be extended with power reduction measures [4]. In addition, each FPGA works with flip-flops, which influence the overall speed of the design since they are driven by the system clock. Furthermore, approximately 30% to 70% of the total power in a clocked design is dissipated by the clocking network, which is absolutely crucial for the operation of these circuits [5]. Consequently, by carefully re-designing these commonly used D-FFs and applying static and dynamic power reduction measures, energy consumption can be decreased. Power constraints are one of the most important challenges in modern circuit design. In addition, cyber security has become a frequently discussed topic in recent years, due to many incidents and a rising awareness of data protection. Side-channel attacks, which are based on differential power analysis, show one way to reveal confidential data without physical access to critical devices [6]. Thus, dedicated modifications at circuit level should be used to counter potential threats.

In this paper, we investigate selected D-FF cell designs with regard to their low-power characteristics, which cannot be neglected in battery-powered systems. In Section II, we give an overview of related work and key aspects of the dependencies between performance and power consumption. In Section III, we investigate a selection of existing D-FF designs with regard to their assets and drawbacks and discuss the simulation results. In Section IV, we present our charge recycling (CR) D-FF and explain the implemented circuit improvement methods for static and dynamic power reduction. In Section V, we discuss simulation results of the D-FF and analyze the benefits of the power reduction measures based on these simulations. In Section VI, all previous discussions are summarized and concluded.

## II. RELATED WORK

D-FFs are the workhorse in different applications, such as storage registers, counters and frequency dividers. FPGAs rely on these circuits in each slice, which is a basic computational element, shown in Figure 1.



Fig. 1. Simplified SLICE structure of an FPGA

Each slice contains one D-FF for storage of computed values prior to forwarding them to the next configurable logic block (CLB). Since even a low-cost FPGA, e.g., Xilinx Spartan 3A, contains up to 8320 CLBs [7], one can see the strong impact on area and energy consumption of these clocked devices. The relation between consumed power and the supply voltage, load capacitance and system clock can be seen in (1):

$$P = \alpha C V^2 f_{Clk} \tag{1}$$

The activity factor $\alpha$ represents the rate of write requests. A reduction of $\alpha$ can be achieved by special memory cell designs [8] or alternatively with auxiliary comparator circuitry. Another efficient approach is reducing the operating voltage, which enters (1) quadratically. This can be achieved by techniques like dynamic voltage scaling (DVS), which was evaluated in various publications [9]. Power gating is certainly the strongest way to achieve a measurable reduction of energy consumption. However, it can only be applied if data retention is not required. A further possibility for raising the energy efficiency is lowering the clock frequency $f_{Clk}$. Circuitry that is not timing-critical can be clocked down to the minimum speed that still ensures reliable operation of the system. If certain circuit parts can be completely stopped while retaining stored logic values, full clock gating can be a feasible solution to save power [10]. Both methods can be combined on a coarse-grain or fine-grain level.
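The influence of the individual terms in (1) can be illustrated with a minimal Python sketch. The function name `dynamic_power` and all operating-point values except the 250 MHz clock (which matches the simulation setup used later in this paper) are assumptions chosen for illustration only; they are not simulation results.

```python
def dynamic_power(alpha, c_load, v_dd, f_clk):
    """Dynamic switching power according to Eq. (1): P = alpha * C * V^2 * f_Clk."""
    return alpha * c_load * v_dd ** 2 * f_clk


# Hypothetical operating point (illustrative values, not taken from the paper).
ALPHA = 0.5       # activity factor
C_LOAD = 10e-15   # load capacitance in farad (10 fF)
F_CLK = 250e6     # clock frequency in hertz (250 MHz)

p_nominal = dynamic_power(ALPHA, C_LOAD, 1.0, F_CLK)         # assumed 1.0 V supply
p_scaled = dynamic_power(ALPHA, C_LOAD, 0.8, F_CLK)          # supply lowered to 0.8 V
p_half_clock = dynamic_power(ALPHA, C_LOAD, 1.0, F_CLK / 2)  # clock frequency halved

print(f"voltage scaling keeps {p_scaled / p_nominal:.0%} of the nominal power")
print(f"halving the clock keeps {p_half_clock / p_nominal:.0%} of the nominal power")
```

Because the supply voltage enters quadratically, lowering it by 20% in this sketch has a stronger effect than the same relative reduction of $\alpha$ or $f_{Clk}$ would have.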

These techniques are only a selection from a larger set of methods for handling the challenges of demanding functions. A majority of these solutions require additional circuitry to be added and implemented at a higher architectural level. Our approach goes one step further and is based on direct circuit-level improvements to a D-FF: a reasonable selection of a suitable D-FF cell design and substantial modifications of the internal cell circuitry to achieve better efficiency. The improvements achieved on that level are essential for suppressing energy dissipation and are an inevitable optimization step to be combined with architectural amendments.

## III. D-FF CELL DESIGNS

Different concepts have been introduced in recent years. In general, we can distinguish between latches and flip-flops. Whilst latches are level-sensitive designs, flip-flops are edge-sensitive. Latches are transparent and therefore not suitable for timing-critical applications due to possible glitches in the signal path. To avoid glitches, and consequently timing problems in complex designs, many flip-flop designs apply the principle of cascading a master and a slave latch. This standard design is shown in Figure 2.

Both the master and the slave unit consist of a feedback loop of inverters and transmission gates (TGs). Once Clk is set to *HIGH*, the input data provided by D is latched in the master circuit. At this point, the transmission gate connecting the master and slave circuits is in cut-off mode, which avoids any glitches, e.g., direct throughput of D to Q. When Clk is set to *LOW*, the stored data at the output of the master circuit is latched by the subsequent slave unit and provided at the output node Q. Any changes of D will not influence the logic value stored at Q because both transistors of TG 1 are in cut-off mode. This legacy design was the starting point for numerous variations in the past.

All simulations have been performed with Cadence tools and a 90nm technology provided by TSMC at an ambient temperature of $27^{\circ}$C. The clock frequency was set to 250MHz.
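The master-slave principle described for the reference design can be captured in a purely behavioral Python sketch. It follows one reading of the description (master takes over D while Clk is HIGH, slave takes over the master value while Clk is LOW) and ignores all timing and analog effects; the class name `MasterSlaveDFF` and the stimulus are hypothetical and not part of the simulated circuits.

```python
class MasterSlaveDFF:
    """Behavioral sketch of a master-slave D-FF (no timing, no analog effects)."""

    def __init__(self):
        self.master = 0  # value held by the master latch
        self.q = 0       # value held by the slave latch (output Q)

    def step(self, d, clk):
        """Apply one (D, Clk) input pair and return the resulting Q."""
        if clk:
            # Clk HIGH: the master takes over D, the slave is isolated and holds Q.
            self.master = d
        else:
            # Clk LOW: the input is isolated, the slave takes over the master value.
            self.q = self.master
        return self.q


ff = MasterSlaveDFF()
stimulus = [(1, 1), (0, 0), (0, 1), (1, 0)]  # hypothetical (D, Clk) sequence
trace = [ff.step(d, clk) for d, clk in stimulus]
print(trace)
```

In this sketch, Q never follows D within the same step: a change of D while Clk is LOW has no effect on Q, which mirrors the glitch avoidance described above.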

1) SET D-FF: A simplified implementation is shown in Figure 3. Whilst the reference design of a D-FF uses 16 transistors in total, this design consists of 10 transistors only, leading to a higher chip density and reduced manufacturing costs [11].



Fig. 3. Single Edge Triggered D-FF

Instead of 4 TGs, this design works with 1 TG and achieves the same function by replacing the remaining TGs with nMOS transistors. This reduction of transistors also cuts down the number of slower and larger pMOS transistors. Furthermore, this implementation generates both Q and $\overline{Q}$. The functionality of the SET D-FF is similar to the reference design: glitching is avoided by complementary control of both pass-transistors M1 and M2. Latching and generation of the output values is done in the feedback loop after the activation of M2. Analogous to the previous standard design, this concept relies on complementary Clk signals, which require additional circuitry for signal generation.

2) Low-power D-FF: Another variation, which represents an attempt to optimize a D-FF with respect to power consumption, is shown in Figure 4. The key aspect of this design is the elimination of short-circuit power dissipation from the feedback path [12] by means of the tri-state inverter. Although it keeps the same number of transistors as the reference design, considerable power savings can be achieved. This will be discussed in the last section of this paper.



Fig. 4. Low-power modification of D-FF

3) PPI D-FF: In order to get a better performance of a conventional D-FF, the Push-Pull-Isolation (PPI) D-FF was presented in [12]. The main advantage of this implementation is the reduced clock-to-output delay from two gates in the reference design to one gate in the PPI D-FF, which is shown in Figure 5.



The insertion of an inverter and a TG between the output nodes of master and slave latches provides a push-pull effect at the slave latch. In consequence, the input and output of the inverter in the slave unit will be driven to opposite logic values during operation. This design is approximately 31% faster than the reference D-FF, but has a power overhead of 22%. To counter the increased power consumption 2 pMOS transistors, M1 and M2, are added to the feedback loops in the master and slave latches. In direct comparison with the conventional D-FF, the PPI D-FF improves speed by 56% at an expense of 6% of additional power dissipation.

For all cell designs introduced in this paper, the average power consumption and the maximum and minimum power consumption during simulation time were traced and summarized in Table I. These results show that the reference D-FF has the highest average power consumption at 1186nW, due to the lack of power saving measures. The maximum power dissipation confirms this result: it is higher by a factor of approximately 4 in direct comparison with the optimized low-power D-FF. This result was expected and highlights the improvements of the previously introduced designs.

TABLE I. SIMULATION RESULTS (PWR)

| D-FF Type | Average Power (nW) | Max. Power (uW) | Min. Power (fW) |
|-----------|------------------|---------------|---------------|
| Reference | 1186             | 233.3         | 51.47         |
| SETD      | 280.3            | 26.21         | 22.39         |
| Low-power | 272.7            | 61.55         | 19.92         |
| PPI       | 435.4            | 88.71         | 28.01         |

Similar results are obtained by measuring the leakage current of each design, shown in Table II. The reference D-FF exhibits the highest average leakage current $I_{leak}$ at 1262nA, which is approximately five times the average $I_{leak}$ of the low-power D-FF. Analogous to the average leakage current, the maximum leakage current also belongs to the reference design, which points out that all power-optimized variations perform better in terms of energy efficiency.

TABLE II. SIMULATION RESULTS  $I_{Leak}$ 

| D-FF Type | Avg. Current nA | Max. Current uW | Min. Current uW |
|-----------|-----------------|-----------------|-----------------|
| Reference | 1262            | 336.3           | 346             |
| SETD      | 265.7           | 48.94           | 50.41           |
| Low-power | 235.1           | 28.83           | 45.9            |
| PPI       | 403.7           | 39.4            | 56.35           |

The respective simulation results are shown in Figure 6, which illustrates the input signal D, the clock signal Clk and the respective power dissipation output profiles for the presented input sequence with an alternating  $0\rightarrow 1\rightarrow 0\rightarrow 1$  sequence.



Fig. 6. Comparison Results

All designs exhibit strongly varying power consumption for each transition on the input nodes during the rising edge of the Clk signal, which comes along with an exploitable vulnerability for side-channel attacks. Glitches can be identified during the falling edge of Clk, which indicates weaknesses in the latching mechanism of master and slave latch, therefore revealing undesired transparency. None of the previously presented designs is optimized in terms of static leakage current suppression or energy recovery during runtime, which will be key aspects of our presented design in the next section.

## IV. CR D-FF

Based on the analysis of drawbacks of existing D-FF designs, we present a new approach of a low-power, energy-efficient and glitch-free D-FF, which is suitable for security-relevant applications with limited energy resources. Referring to the standard design shown in Figure 2, our intention was to redesign a new flip-flop cell from scratch. Without any direct relation to the D-FFs presented in the previous section, we present our charge recycling (CR) D-FF, which is illustrated in Figure 7.



Fig. 7. CR D-FF

This design features a series of dedicated power savings mechanisms, which will be discussed in the following sections.

#### A. Charge recycling

Storing and processing logic values in flip-flops, registers and memories leads to charging and discharging of parasitic capacitances, which are an inherent part of each integrated circuit. Since the CR D-FF features dynamic logic, periodic charge and discharge cycles are an integral part of the intended function and require special attention during the design. This design works with 2 alternating phases during runtime, precharge and evaluate, which are both triggered by the Clk signal. With respect to power savings within an integrated circuit, the precharge phase is the more decisive one. When Clk turns to LOW, M5 is turned on, which in consequence also switches on the pMOS transistors M3 & M6. Because these transistors are then in a conducting state, the capacitances at the output nodes $Out \& \overline{Out}$ are shorted. Hence, the electrons that were not discharged at one of the complementary output nodes are used for charging the previously discharged output node. This effect equilibrates the charges and thus relieves the battery, since less energy is needed. It is a strong method for reducing dissipation during dynamic operation.
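The equalization of the two output nodes can be approximated by ideal charge sharing between two capacitors. The following Python sketch rests on that idealization (no losses, no leakage, lumped capacitances); the function name `equalized_voltage`, the 5 fF node capacitances and the 0.8 V start level are assumptions for illustration and are not extracted from the presented circuit.

```python
def equalized_voltage(c1, v1, c2, v2):
    """Common voltage after shorting two ideal capacitors (charge conservation)."""
    return (c1 * v1 + c2 * v2) / (c1 + c2)


# Hypothetical node values (not extracted from the presented circuit).
C_OUT = 5e-15      # capacitance at Out in farad
C_OUT_BAR = 5e-15  # capacitance at the complementary output in farad
V_HIGH = 0.8       # node that kept its charge during evaluation, in volt
V_LOW = 0.0        # node that was discharged during evaluation, in volt

v_eq = equalized_voltage(C_OUT, V_HIGH, C_OUT_BAR, V_LOW)
# Charge handed over to the discharged node by its complementary node.
q_recycled = C_OUT_BAR * (v_eq - V_LOW)

print(f"equalized voltage: {v_eq:.2f} V")
print(f"recycled charge:   {q_recycled * 1e15:.2f} fC")
```

With equal capacitances, both nodes settle halfway between the two start levels, so the previously discharged node begins the subsequent precharge from that level and not from ground.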

After Clk applies a logic HIGH at the gate of M4, this transistor is turned off whereas M11 is turned on and subsequently starting the evaluation phase in terms of sensing the difference between the complementary inputs  $D \& \overline{D}$ . One of the various benefits of sense amplifier based logic is that even a small  $\Delta$  voltage between both input signals will be sensed and evaluated, providing a higher speed of the D-FF.

#### B. Dual Threshold CMOS

Leakage currents $I_{leak}$ during standby contribute a significant share of the total dissipation loss. By adding dedicated countermeasures, appreciable power savings can be achieved without much implementation effort. This can be done by using transistors with a high threshold voltage $V_{th}$. Transistors with a high $V_{th}$ require a proportionally higher $V_{GS}$ voltage in order to be turned on, which mitigates leakage currents. This method can be combined with applying a negative $V_{GS}$, which leads transistors into a deep turn-off state and therefore supports the suppression of leakage currents. This technique should only be applied carefully, on circuit parts that are not timing-critical, since higher threshold voltages usually result in slower signal transitions. All transistors in our design are high $V_{th}$ transistors for the sake of the strongest suppression of $I_{leak}$.
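The effect of a higher $V_{th}$ and of a negative $V_{GS}$ can be estimated with the common first-order subthreshold model $I_{sub} \propto \exp((V_{GS} - V_{th}) / (n V_T))$. This textbook approximation is not taken from this paper; the slope factor `n = 1.5` and all voltages below are hypothetical, and only the temperature of 27 °C corresponds to the simulation setup.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant in J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge in C


def thermal_voltage(temp_kelvin):
    """Thermal voltage V_T = k * T / q in volt."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON


def relative_subthreshold_current(v_gs, v_th, temp_kelvin=300.15, n=1.5):
    """First-order model I_sub / I_0 = exp((V_GS - V_th) / (n * V_T)).

    The drain-voltage dependence is omitted; n is an assumed slope factor.
    """
    return math.exp((v_gs - v_th) / (n * thermal_voltage(temp_kelvin)))


# Hypothetical devices: V_th raised by 100 mV, then V_GS lowered by 100 mV.
low_vth = relative_subthreshold_current(v_gs=0.0, v_th=0.25)
high_vth = relative_subthreshold_current(v_gs=0.0, v_th=0.35)
super_cutoff = relative_subthreshold_current(v_gs=-0.1, v_th=0.35)

print(f"high V_th vs. low V_th:     {high_vth / low_vth:.3f}")
print(f"negative V_GS vs. V_GS = 0: {super_cutoff / high_vth:.3f}")
```

In this model, raising $V_{th}$ and lowering $V_{GS}$ by the same amount reduce the subthreshold current by the same factor, which is why both measures can be combined.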

#### C. Multi-oxide technology

Closely related to the previous section, static power dissipation can be further decreased by improving the tunneling-barrier for electrons. Undesired tunneling of electrons through the gate to bulk leads to current flows, which shall be eliminated. The relation between  $I_{leak}$  and the tunneling-barrier is shown in (2):

$$I_{leak} \propto A \left(\frac{V_{ox}}{T_{ox}}\right)^2 \tag{2}$$

Increasing the tunneling-barrier can be realized by increasing the gate oxide thickness $T_{ox}$. A higher oxide thickness leads immediately to a reduction of the tunneling current density $I_{leak}$, which serves the goal of extending the battery lifetime of mobile devices even in standby mode. The drawback of this technique is similar to the previous one: a penalty in circuit speed may occur if it is not applied carefully. For this reason, we decided to use high $T_{ox}$ transistors for M4, M5 and M11. None of these transistors is timing-critical, since M4 is used to activate a dedicated sleep mode and M5 for balancing the outputs. None of these functions slows down the circuit.
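As a rough worked example, taking (2) at face value and assuming that the area factor $A$ and the oxide voltage $V_{ox}$ stay unchanged, a hypothetical increase of the oxide thickness from $T_{ox,1}$ to $T_{ox,2} = 2\,T_{ox,1}$ would scale the tunneling current as follows. The factor 2 is chosen for illustration only and is not a parameter of the presented design.

$$\frac{I_{leak,2}}{I_{leak,1}} = \left(\frac{T_{ox,1}}{T_{ox,2}}\right)^2 = \left(\frac{1}{2}\right)^2 = 0.25$$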

#### D. Clk- and power-gating

For a further reduction of dynamic power dissipation, cutting off the Clk signal transfers the circuit to a hold state while maintaining the stored data inside the latches. Circuitry that is not executing operations continuously during runtime can be kept in a WAIT state, ready to continue calculation whenever the Clk signal is set to HIGH again. In the proposed design, M5 & M11 are used for stopping the D-FF from operating while still keeping the correct data at the outputs of the cross-coupled inverters. Of course, additional circuitry driving and distributing the Clk signal over a whole design is an indispensable requirement. This can be provided by digital clock managers (DCMs), which are not covered by this paper.

If data storage is not necessary, gating of the supply voltage is an effective method to save power in unused parts of a circuit. Power gating can be applied on different hierarchical levels. We decided to follow a fine-grain approach and equipped the proposed D-FF with a power gating transistor M4. If the *SLEEP* signal turns from 0 to 1, M4 is off and therefore disconnects the D-FF from $V_{dd}$. If this technique is applied together with clock gating, total rail-to-rail decoupling ($V_{dd} \& Gnd$) can be realized.

#### E. Stacking

Transistor stacking is a further, strong technique for subthreshold current reduction. Stacking transistors means increasing the source voltage $V_S$ while keeping the gate voltage $V_G$ at the same level. At a certain point, $V_{GS}$ becomes negative, which leads the transistor into super cut-off mode and turns it deeply off. The more transistors are stacked in series, the better the leakage current reduction will be. However, the most significant improvement is achieved by adding the second transistor in series, because the additional subthreshold current reduction diminishes with a rising number of transistors. Our proposed D-FF features stacking as a design principle, e.g., in the pull-down networks of the slave latch, realized by M16 & M17 and M20 & M21.

## V. SIMULATION RESULTS

The CR D-FF senses the inputs $D \& \overline{D}$ at the positive edge of Clk and stores these data independently of any changes at the input nodes of this circuit. Due to all implemented circuit improvements, an average static leakage current of 173nA is achieved, which is sufficiently low to be accepted. During the negative edge of Clk, the CR D-FF turns into the precharge phase, where all internal and external nodes are charged. The characteristic curves in Figure 8 show one beneficial feature of the CR D-FF over the other discussed designs. This can be seen in both output curves of $Q \& \overline{Q}$.

Since this design features charge recycling, the output nodes and all internal nodes are precharged to $V_{dd} - V_{th}$ only, which is beneficial for the energy balance of this circuit. The reason for this is that precharge is finished once the output voltage reaches a level one threshold voltage below $V_{dd}$. The less energy is required from the power supply for precharging the CR D-FF, the more suitable the circuit is for low-power applications. Based on the reduced voltage range at the outputs of the master latch, it is possible to permanently decrease the supply voltage $V_{dd,Slave}$. Hence, we choose a supply voltage of 800mV for the conventional slave circuit, which supports further power dissipation reduction. For a better comparison, we extend Table I with relevant simulation results of the CR D-FF, shown in Table III.

TABLE III. SIMULATION RESULTS (PWR)

| D-FF Type | Average PWR (nW) | Max. PWR (uW) | Min. PWR (fW) |
|-----------|------------------|---------------|---------------|
| Reference | 1186             | 233.3         | 51.47         |
| SETD      | 374.1            | 32.01         | 22.39         |
| Low-power | 275.7            | 73.89         | 19.92         |
| PPI       | 435.4            | 110.5         | 172.3         |
| CR        | 303.5            | 13.84         | 27.59         |

The results in Table III show that the introduced CR D-FF outperforms most of the previously analyzed designs in terms of average power consumption. It achieves the second-best performance for average power consumption (319.7nW) and the best result for maximum power dissipation (13.84uW). The minimum power consumption of 27.59fW can be neglected, since the influence of these contributions is not significant for the overall performance of all discussed designs. Even though the conventional low-power flip-flop achieves a slightly lower average power consumption than the CR D-FF, its peak power dissipation is approximately five times higher and it offers no resistance features against DPA. Figure 9 shows a comparison of the average power consumption.
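The comparisons drawn from Table III can be reproduced with a few lines of Python. The sketch below uses only the average and maximum power values as tabulated in Table III; the variable names are hypothetical.

```python
# (average power in nW, maximum power in uW) as tabulated in Table III
table_iii = {
    "Reference": (1186.0, 233.3),
    "SETD": (374.1, 32.01),
    "Low-power": (275.7, 73.89),
    "PPI": (435.4, 110.5),
    "CR": (303.5, 13.84),
}

# Designs ordered by ascending average power consumption.
ranking = sorted(table_iii, key=lambda name: table_iii[name][0])

avg_ref, _ = table_iii["Reference"]
avg_cr, max_cr = table_iii["CR"]
_, max_lp = table_iii["Low-power"]

avg_saving_pct = 100.0 * (1.0 - avg_cr / avg_ref)  # CR vs. reference design
peak_ratio = max_lp / max_cr                       # low-power D-FF vs. CR

print("ranking by average power:", ranking)
print(f"average power saving of CR vs. reference: {avg_saving_pct:.1f} %")
print(f"peak power of low-power D-FF relative to CR: {peak_ratio:.2f}x")
```

With the tabulated values, the CR D-FF ranks second for average power, and the peak power ratio between the low-power D-FF and the CR D-FF is slightly above five.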



It can be clearly seen in Table III that the CR D-FF provides the most constant power consumption among all considered designs, which makes it the best candidate for security-sensitive applications. The smaller the differences in energy consumption between the individual data transitions are, the more difficult a differential power analysis will be, which is always the starting point for a side-channel attack. Hence, the introduced CR D-FF provides both remarkable low-power characteristics for mobile, embedded circuitry and the robustness against intended attacks that such applications require. However, the benefits in energy efficiency and the noticeable robustness against differential power analysis come at the cost of a higher number of transistors, shown in Table IV.

TABLE IV. TRANSISTOR COUNT AND POWER VARIATION

| D-FF Type             | Reference | SETD | LP    | PPI   | CR  |
|-----------------------|-----------|------|-------|-------|-----|
| No. of transistors    | 16        | 10   | 16    | 18    | 21  |
| Max. PWR $\Delta$ (%) | 18.78     | 94.7 | 94.03 | 98.62 | 6.8 |

This fact usually leads to a penalty in required area for manufacturing, which is certainly an aspect to be considered. A CR D-FF consists of 21 transistors and requires complementary input signals, which depend on additional wiring and therefore lead to extra area on the chip. On the other hand, this implementation also provides 2 complementary outputs with no delay between both signals and without additional circuitry for their generation. Table IV also emphasizes the differences between the analyzed cells in switching behavior. Whilst the variation $\Delta$ of dissipated power of the CR D-FF never exceeds 6.8%, the results of the alternative designs show much larger differences. Although all designs have been analyzed without a strong focus on speed and timing aspects, further measurements on the maximum operating frequency have been done. For this purpose, the elapsed time for each switching transition was measured and compared. Figure 10 illustrates a direct comparison of the output Q of all considered circuits after being stimulated with an input signal D. Depending on the switching transition and the characteristics of the flip-flops, the expected differences in edge steepness can be identified.



Fig. 10. Comparison of Switching Transitions of All Designs

Based on these simulation results, the consumed time for a  $HIGH \rightarrow LOW$  and a  $LOW \rightarrow HIGH$  transition has been measured and summarized in Table V. The maximum achievable switching frequency  $f_{max}$  reveals the penalty in operating speed of the CR D-FF, due to the increased number of transistors. However, a maximum switching frequency of  $\approx 6.4GHz$  is still a notable result.

TABLE V. TIMING COMPARISON

| D-FF Type | T High-Low (ps) | T Low-High (ps) | Max. freq. (GHz) |
|-----------|---------------|---------------|----------------|
| Reference | 42.5          | 58.3          | 9.9            |
| SETD      | 422           | 101           | 1.9            |
| Low-power | 43.63         | 51.58         | 10             |
| PPI       | 60.48         | 79.16         | 7.1            |
| CR        | 41            | 114           | 6.4            |
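The maximum frequencies in Table V appear to be consistent, to within rounding, with the reciprocal of the summed transition times, $f_{max} \approx 1 / (T_{HL} + T_{LH})$. This relation is an assumption inferred from the tabulated numbers and is not stated explicitly in the text; the following Python sketch merely checks it against Table V.

```python
# (T High-Low in ps, T Low-High in ps, tabulated max. frequency in GHz) from Table V
table_v = {
    "Reference": (42.5, 58.3, 9.9),
    "SETD": (422.0, 101.0, 1.9),
    "Low-power": (43.63, 51.58, 10.0),
    "PPI": (60.48, 79.16, 7.1),
    "CR": (41.0, 114.0, 6.4),
}


def estimated_f_max_ghz(t_hl_ps, t_lh_ps):
    """Assumed relation f_max = 1 / (T_HL + T_LH); 1 / ps corresponds to 1000 GHz."""
    return 1e3 / (t_hl_ps + t_lh_ps)


f_max = {name: estimated_f_max_ghz(t_hl, t_lh) for name, (t_hl, t_lh, _) in table_v.items()}

for name, (_, _, tabulated) in table_v.items():
    print(f"{name:10s} estimated {f_max[name]:5.2f} GHz, tabulated {tabulated} GHz")
```

Under this assumption, the slow $LOW \rightarrow HIGH$ transition of 114ps dominates the cycle of the CR D-FF and explains its lower $f_{max}$ despite the fastest $HIGH \rightarrow LOW$ transition.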

## VI. CONCLUSION

We analyzed a selected number of existing flip-flop designs with regard to their characteristics and suitability for low-power applications. Besides that, we investigated the capability of each design to resist differential power analysis. Our goal was to design a D-FF that provides both a remarkable reduction of power consumption and robustness against side-channel attacks. Hence, we designed a charge recycling D-FF, which uses the electrons that were not discharged at one of the complementary output nodes to support the battery during the precharge phase. This benefit comes along with the fact that the outputs of the master latch are precharged to $V_{dd} - V_{th}$ only, providing the opportunity to power the slave latch with the same supply voltage ($\approx 800mV$). Furthermore, we applied additional power saving modifications and achieved remarkable improvements in power reduction and standby leakage suppression. Simulation results have shown that the CR D-FF offers the best overall performance, with an average power consumption that is about 75% lower than that of the reference design. Complementary generation of output signals with no requirement for delay correction is a further advantage of this circuit when compared to other designs, which do not feature parallel creation of complementary output signals. The variations of the measured power consumption do not exceed $\approx 7\%$ and remain constant independent of the switching event, which is sufficient to withstand differential power analysis and which is not achieved by the alternative flip-flops. These benefits come at the cost of a higher number of required transistors, and the layout after synthesis of a CR D-FF requires careful routing of all metal interconnections between these cells in order to keep the parasitic capacitances as equal as possible.

## ACKNOWLEDGMENT

The authors thank Pierre Mayr, from Ruhr University of Bochum, for his advice on verification strategies and procedures. We would like to give credit to Grant Martin, from Cadence Tensilica, for many interesting discussions about embedded devices and low-power technologies. We are grateful to Andreas Ullrich, from University of Wuppertal, for immediate PDK / tool support.

## REFERENCES

- [1] S. Fürst, "Challenges in the design of automotive software," in *Proceedings of the Conference on Design, Automation and Test in Europe*, ser. DATE '10. 3001 Leuven, Belgium: European Design and Automation Association, 2010, pp. 256–258. [Online]. Available: http://dl.acm.org/citation.cfm?id=1870926.1870987
- [2] M. Ullmann, M. Hübner, B. Grimm, and J. Becker, "An FPGA run-time system for dynamical on-demand reconfiguration," in *Parallel and Distributed Processing Symposium, 2004. Proceedings. 18th International*. IEEE, 2004, p. 135.
- [3] R. A. et al., "Towards a dynamically reconfigurable automotive control system architecture," in *Embedded System Design: Topics, Techniques and Trends*. Springer, 2007, pp. 71–84.
- [4] K. Niewiadomski, C. Gremzow, and D. Tutsch, "4T loadless SRAMs for low power FPGA LUT optimization," in *Proceedings of the 9th International Conference on Adaptive and Self-Adaptive Systems and Applications (ADAPTIVE 2017)*, February 2017, pp. 1–7.
- [5] V. Stojanovic and V. G. Oklobdzija, "Comparative analysis of master-slave latches and flip-flops for high-performance and low-power systems," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 4, pp. 536–548, Apr 1999.
- [6] K. Tiri and I. Verbauwhede, "A VLSI design flow for secure side-channel attack resistant ICs," in *Proceedings of the Conference on Design, Automation and Test in Europe - Volume 3*, ser. DATE '05. Washington, DC, USA: IEEE Computer Society, 2005, pp. 58–63. [Online]. Available: http://dx.doi.org/10.1109/DATE.2005.44
- [7] *XA Spartan-3A Automotive FPGA Family Data Sheet*, Xilinx, 04 2011, rev. 2.0.
- [8] R. E. Aly, M. I. Faisal, and M. A. Bayoumi, "Novel 7T SRAM cell for low power cache design," in *Proceedings 2005 IEEE International SOC Conference*, Sept 2005, pp. 171–174.
- [9] J. M. Rabaey, A. Chandrakasan, and B. Nikolic, *Digital Integrated Circuits: A Design Perspective*, 2nd ed. Prentice Hall, 2004.
- [10] C. Maxfield, *The Design Warrior's Guide to FPGAs: Devices, Tools and Flows*, 1st ed. Newton, MA, USA: Newnes, 2004.
- [11] M. Sharma, A. Noor, S. C. Tiwari, and K. Singh, "An area and power efficient design of single edge triggered D-flip flop," in *2009 International Conference on Advances in Recent Technologies in Communication and Computing*, Oct 2009, pp. 478–481.
- [12] U. Ko and P. T. Balsara, "High-performance energy-efficient D-flip-flop circuits," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 8, no. 1, pp. 94–98, Feb 2000.